hf_text-generation-inference/integration-tests/models
drbh 40213c957f
Pali gemma modeling (#1895)
This PR adds paligemma modeling code

Blog post: https://huggingface.co/blog/paligemma
Transformers PR: https://github.com/huggingface/transformers/pull/30814

Install the latest changes and run with:
```bash
# get the weights
# text-generation-server download-weights gv-hf/PaliGemma-base-224px-hf

# run TGI
text-generation-launcher --model-id gv-hf/PaliGemma-base-224px-hf
```
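The launcher serves TGI's REST API on port 3000 by default, so the same server can also be queried with the standard library alone; the prompt and parameters below are placeholders for illustration:

```python
import json
import urllib.request

# Build a request against TGI's /generate endpoint (default port 3000).
# The prompt and parameter values here are illustrative.
payload = {
    "inputs": "What is deep learning?",
    "parameters": {"max_new_tokens": 20, "do_sample": False},
}
req = urllib.request.Request(
    "http://127.0.0.1:3000/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# With the server above running, this returns {"generated_text": ...}:
# print(json.load(urllib.request.urlopen(req)))
```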


A basic example sending various requests:
```python
from huggingface_hub import InferenceClient

client = InferenceClient("http://127.0.0.1:3000")


images = [
    "https://huggingface.co/datasets/hf-internal-testing/fixtures-captioning/resolve/main/cow_beach_1.png",
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png",
]

prompts = [
    "What animal is in this image?",
    "Name three colors in this image.",
    "What are 10 colors in this image?",
    "Where is the cow standing?",
    "answer en Where is the cow standing?",
    "Is there a bird in the image?",
    "Is there a cow in the image?",
    "Is there a rabbit in the image?",
    "how many birds are in the image?",
    "how many rabbits are in the image?",
]

for img in images:
    print(f"\nImage: {img.split('/')[-1]}")
    for prompt in prompts:
        # TGI multimodal input: a markdown image link followed by the text prompt
        inputs = f"![]({img}){prompt}\n"
        generated_output = client.text_generation(
            inputs, max_new_tokens=30, do_sample=False, stream=False
        )
        print(f"{prompt}\n{generated_output}")

```
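The image-plus-prompt string built in the loop above follows TGI's markdown-style multimodal convention; it can be factored into a small helper (`build_multimodal_input` is a name chosen here for illustration):

```python
def build_multimodal_input(image_url: str, prompt: str) -> str:
    """Format an image URL and a text prompt as a single TGI input:
    a markdown image link immediately followed by the prompt text."""
    return f"![]({image_url}){prompt}\n"

# Example: the first image/prompt pair from the lists above.
inputs = build_multimodal_input(
    "https://huggingface.co/datasets/hf-internal-testing/fixtures-captioning/resolve/main/cow_beach_1.png",
    "What animal is in this image?",
)
```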

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2024-05-16 06:58:47 +02:00
| Name | Last commit message | Last commit date |
|------|---------------------|------------------|
| `__snapshots__` | Pali gemma modeling (#1895) | 2024-05-16 06:58:47 +02:00 |
| `test_bloom_560m.py` | feat(server): only compute prefill logprobs when asked (#406) | 2023-06-02 17:12:30 +02:00 |
| `test_bloom_560m_sharded.py` | feat(server): only compute prefill logprobs when asked (#406) | 2023-06-02 17:12:30 +02:00 |
| `test_chat_llama.py` | feat: improve tools to include name and add tests (#1693) | 2024-04-16 09:02:46 -04:00 |
| `test_completion_prompts.py` | feat: accept list as prompt and use first string (#1702) | 2024-04-17 10:41:12 +02:00 |
| `test_flash_awq.py` | fix(router): fix openapi and add jsonschema validation (#1578) | 2024-02-21 11:05:32 +01:00 |
| `test_flash_awq_sharded.py` | fix(router): fix openapi and add jsonschema validation (#1578) | 2024-02-21 11:05:32 +01:00 |
| `test_flash_falcon.py` | feat(server): only compute prefill logprobs when asked (#406) | 2023-06-02 17:12:30 +02:00 |
| `test_flash_gemma.py` | feat: add support for Gemma (#1583) | 2024-02-21 14:15:22 +01:00 |
| `test_flash_gpt2.py` | Add GPT-2 with flash attention (#1889) | 2024-05-15 13:31:22 +02:00 |
| `test_flash_grammar_llama.py` | fix: correctly index into mask when applying grammar (#1618) | 2024-03-01 18:22:01 +01:00 |
| `test_flash_llama.py` | feat(server): only compute prefill logprobs when asked (#406) | 2023-06-02 17:12:30 +02:00 |
| `test_flash_llama_gptq.py` | feat: add cuda memory fraction (#659) | 2023-07-24 11:43:58 +02:00 |
| `test_flash_medusa.py` | Revamp medusa implementation so that every model can benefit. (#1588) | 2024-02-26 19:49:28 +01:00 |
| `test_flash_mistral.py` | fix(router): fix openapi and add jsonschema validation (#1578) | 2024-02-21 11:05:32 +01:00 |
| `test_flash_neox.py` | feat(server): add paged attention to flash models (#516) | 2023-06-30 19:09:59 +02:00 |
| `test_flash_neox_sharded.py` | feat(server): only compute prefill logprobs when asked (#406) | 2023-06-02 17:12:30 +02:00 |
| `test_flash_pali_gemma.py` | Pali gemma modeling (#1895) | 2024-05-16 06:58:47 +02:00 |
| `test_flash_phi.py` | fix(router): fix openapi and add jsonschema validation (#1578) | 2024-02-21 11:05:32 +01:00 |
| `test_flash_qwen2.py` | feat: Qwen2 (#1608) | 2024-02-28 15:50:31 +01:00 |
| `test_flash_santacoder.py` | feat(server): only compute prefill logprobs when asked (#406) | 2023-06-02 17:12:30 +02:00 |
| `test_flash_starcoder.py` | feat(server): only compute prefill logprobs when asked (#406) | 2023-06-02 17:12:30 +02:00 |
| `test_flash_starcoder2.py` | feat: starcoder2 (#1605) | 2024-02-28 12:07:08 +01:00 |
| `test_flash_starcoder_gptq.py` | fix(router): fix openapi and add jsonschema validation (#1578) | 2024-02-21 11:05:32 +01:00 |
| `test_grammar_llama.py` | fix: correctly index into mask when applying grammar (#1618) | 2024-03-01 18:22:01 +01:00 |
| `test_idefics.py` | Adding Llava-Next (Llava 1.6) with full support. (#1709) | 2024-04-09 21:32:00 +02:00 |
| `test_idefics2.py` | Idefics2. (#1756) | 2024-04-23 23:04:44 +02:00 |
| `test_llava_next.py` | Adding Llava-Next (Llava 1.6) with full support. (#1709) | 2024-04-09 21:32:00 +02:00 |
| `test_mamba.py` | fix(router): fix openapi and add jsonschema validation (#1578) | 2024-02-21 11:05:32 +01:00 |
| `test_mpt.py` | feat(server): Add Non flash MPT. (#514) | 2023-07-03 13:01:46 +02:00 |
| `test_mt0_base.py` | Adding Llava-Next (Llava 1.6) with full support. (#1709) | 2024-04-09 21:32:00 +02:00 |
| `test_neox.py` | feat(server): Rework model loading (#344) | 2023-06-08 14:51:52 +02:00 |
| `test_neox_sharded.py` | feat(server): Rework model loading (#344) | 2023-06-08 14:51:52 +02:00 |
| `test_t5_sharded.py` | Improve the defaults for the launcher (#1727) | 2024-04-12 14:20:31 +02:00 |
| `test_tools_llama.py` | feat: improve tools to include name and add tests (#1693) | 2024-04-16 09:02:46 -04:00 |