hf_text-generation-inference/integration-tests/models/__snapshots__
drbh 40213c957f
Pali gemma modeling (#1895)
This PR adds PaliGemma modeling code.

Blog post: https://huggingface.co/blog/paligemma
Transformers PR: https://github.com/huggingface/transformers/pull/30814

Install the latest changes and run with:
```bash
# get the weights
# text-generation-server download-weights gv-hf/PaliGemma-base-224px-hf

# run TGI
text-generation-launcher --model-id gv-hf/PaliGemma-base-224px-hf
```


Basic example sending various requests:
```python
from huggingface_hub import InferenceClient

client = InferenceClient("http://127.0.0.1:3000")


images = [
    "https://huggingface.co/datasets/hf-internal-testing/fixtures-captioning/resolve/main/cow_beach_1.png",
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png",
]

prompts = [
    "What animal is in this image?",
    "Name three colors in this image.",
    "What are 10 colors in this image?",
    "Where is the cow standing?",
    "answer en Where is the cow standing?",
    "Is there a bird in the image?",
    "Is there a cow in the image?",
    "Is there a rabbit in the image?",
    "how many birds are in the image?",
    "how many rabbits are in the image?",
]

for img in images:
    print(f"\nImage: {img.split('/')[-1]}")
    for prompt in prompts:
        # Embed the image URL with markdown image syntax, followed by the text prompt.
        inputs = f"![]({img}){prompt}\n"
        # Send `inputs` (not the bare prompt) so the image is part of the request;
        # do_sample=False selects greedy decoding.
        generated_output = client.text_generation(
            inputs, max_new_tokens=30, do_sample=False, stream=False
        )
        print([f"{prompt}\n{generated_output}"])

```
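
The same request can also be sent as raw JSON instead of going through `InferenceClient`. The following is a minimal sketch, assuming the server started above is listening on `http://127.0.0.1:3000` and exposes TGI's `/generate` route returning a `generated_text` field. The helper names `build_payload` and `generate` are hypothetical and not part of this PR.

```python
import json
from urllib import request


def build_payload(image_url: str, prompt: str, max_new_tokens: int = 30) -> dict:
    """Build a request body with the image embedded as a markdown image link."""
    return {
        "inputs": f"![]({image_url}){prompt}\n",
        "parameters": {"max_new_tokens": max_new_tokens, "do_sample": False},
    }


def generate(base_url: str, payload: dict, timeout: float = 60.0) -> str:
    """POST the payload to `<base_url>/generate` (assumed route) and return the text."""
    req = request.Request(
        f"{base_url}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        # Assumes the response body is a JSON object with a "generated_text" key.
        return json.loads(resp.read())["generated_text"]
```

With a running server, `generate("http://127.0.0.1:3000", build_payload(images[0], prompts[0]))` would issue a single request for the first image and prompt.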

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2024-05-16 06:58:47 +02:00
| Name | Last commit | Date |
| --- | --- | --- |
| test_bloom_560m | feat(server): support vectorized warpers in flash causal lm (#317) | 2023-05-26 12:30:27 +02:00 |
| test_bloom_560m_sharded | feat(integration-tests): improve comparison and health checks (#336) | 2023-05-16 20:22:11 +02:00 |
| test_chat_llama | v2.0.1 | 2024-04-18 17:20:36 +02:00 |
| test_completion_prompts | v2.0.1 | 2024-04-18 17:20:36 +02:00 |
| test_flash_awq | Add AWQ quantization inference support (#1019) (#1054) | 2023-09-25 15:31:27 +02:00 |
| test_flash_awq_sharded | Add AWQ quantization inference support (#1019) (#1054) | 2023-09-25 15:31:27 +02:00 |
| test_flash_falcon | feat(server): add retry on download (#384) | 2023-05-31 10:57:53 +02:00 |
| test_flash_gemma | feat: add support for Gemma (#1583) | 2024-02-21 14:15:22 +01:00 |
| test_flash_gpt2 | Add GPT-2 with flash attention (#1889) | 2024-05-15 13:31:22 +02:00 |
| test_flash_grammar_llama | fix: correctly index into mask when applying grammar (#1618) | 2024-03-01 18:22:01 +01:00 |
| test_flash_llama | Remove the stripping of the prefix space (and any other mangling that tokenizers might do). (#1065) | 2023-09-27 12:13:45 +02:00 |
| test_flash_llama_gptq | ROCm AWQ support (#1514) | 2024-02-09 10:45:16 +01:00 |
| test_flash_medusa | Speculative (#1308) | 2023-12-11 12:46:30 +01:00 |
| test_flash_mistral | feat: add mistral model (#1071) | 2023-09-28 09:55:47 +02:00 |
| test_flash_neox | fix(server): fix init for flash causal lm (#352) | 2023-05-22 15:05:32 +02:00 |
| test_flash_neox_sharded | fix(server): fix init for flash causal lm (#352) | 2023-05-22 15:05:32 +02:00 |
| test_flash_pali_gemma | Pali gemma modeling (#1895) | 2024-05-16 06:58:47 +02:00 |
| test_flash_phi | feat: adds phi model (#1442) | 2024-01-25 15:37:53 +01:00 |
| test_flash_qwen2 | feat: Qwen2 (#1608) | 2024-02-28 15:50:31 +01:00 |
| test_flash_santacoder | feat(integration-tests): improve comparison and health checks (#336) | 2023-05-16 20:22:11 +02:00 |
| test_flash_starcoder | feat(server): Rework model loading (#344) | 2023-06-08 14:51:52 +02:00 |
| test_flash_starcoder2 | feat: starcoder2 (#1605) | 2024-02-28 12:07:08 +01:00 |
| test_flash_starcoder_gptq | ROCm AWQ support (#1514) | 2024-02-09 10:45:16 +01:00 |
| test_grammar_llama | fix: correctly index into mask when applying grammar (#1618) | 2024-03-01 18:22:01 +01:00 |
| test_idefics | Fixing non divisible embeddings. (#1476) | 2024-01-24 13:08:41 +01:00 |
| test_idefics2 | Idefics2. (#1756) | 2024-04-23 23:04:44 +02:00 |
| test_llava_next | Idefics2. (#1756) | 2024-04-23 23:04:44 +02:00 |
| test_mamba | Improving mamba runtime by using updates (#1552) | 2024-02-14 09:54:10 +01:00 |
| test_mpt | feat(server): Add Non flash MPT. (#514) | 2023-07-03 13:01:46 +02:00 |
| test_mt0_base | Adding Llava-Next (Llava 1.6) with full support. (#1709) | 2024-04-09 21:32:00 +02:00 |
| test_neox | feat(server): Rework model loading (#344) | 2023-06-08 14:51:52 +02:00 |
| test_neox_sharded | feat(server): Rework model loading (#344) | 2023-06-08 14:51:52 +02:00 |
| test_t5_sharded | feat(server): support fp16 for t5 (#360) | 2023-05-23 18:16:48 +02:00 |
| test_tools_llama | v2.0.1 | 2024-04-18 17:20:36 +02:00 |