hf_text-generation-inference

drbh 40213c957f Pali gemma modeling (#1895 )

This PR adds PaliGemma modeling code.

Blog post: https://huggingface.co/blog/paligemma
Transformers PR: https://github.com/huggingface/transformers/pull/30814

Install the latest changes and run with:

```bash
# get the weights
# text-generation-server download-weights gv-hf/PaliGemma-base-224px-hf

# run TGI
text-generation-launcher --model-id gv-hf/PaliGemma-base-224px-hf
```

Basic example sending various requests:

```python
from huggingface_hub import InferenceClient

client = InferenceClient("http://127.0.0.1:3000")

images = [
    "https://huggingface.co/datasets/hf-internal-testing/fixtures-captioning/resolve/main/cow_beach_1.png",
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/rabbit.png",
]

prompts = [
    "What animal is in this image?",
    "Name three colors in this image.",
    "What are 10 colors in this image?",
    "Where is the cow standing?",
    "answer en Where is the cow standing?",
    "Is there a bird in the image?",
    "Is there a cow in the image?",
    "Is there a rabbit in the image?",
    "how many birds are in the image?",
    "how many rabbits are in the image?",
]

for img in images:
    print(f"\nImage: {img.split('/')[-1]}")
    for prompt in prompts:
        # Prepend the image as a markdown image link so it is sent with the prompt.
        inputs = f"![]({img}){prompt}\n"
        # Send `inputs` (image + prompt), not the bare prompt, with greedy decoding.
        generated_output = client.text_generation(
            inputs, max_new_tokens=30, do_sample=False, stream=False
        )
        print([f"{prompt}\n{generated_output}"])
```

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>		2024-05-16 06:58:47 +02:00
__snapshots__	Pali gemma modeling (#1895 )	2024-05-16 06:58:47 +02:00
test_bloom_560m.py	…
test_bloom_560m_sharded.py	…
test_chat_llama.py	feat: improve tools to include name and add tests (#1693 )	2024-04-16 09:02:46 -04:00
test_completion_prompts.py	feat: accept list as prompt and use first string (#1702 )	2024-04-17 10:41:12 +02:00
test_flash_awq.py	…
test_flash_awq_sharded.py	…
test_flash_falcon.py	…
test_flash_gemma.py	…
test_flash_gpt2.py	Add GPT-2 with flash attention (#1889 )	2024-05-15 13:31:22 +02:00
test_flash_grammar_llama.py	fix: correctly index into mask when applying grammar (#1618 )	2024-03-01 18:22:01 +01:00
test_flash_llama.py	…
test_flash_llama_gptq.py	…
test_flash_medusa.py	…
test_flash_mistral.py	…
test_flash_neox.py	…
test_flash_neox_sharded.py	…
test_flash_pali_gemma.py	Pali gemma modeling (#1895 )	2024-05-16 06:58:47 +02:00
test_flash_phi.py	…
test_flash_qwen2.py	feat: Qwen2 (#1608 )	2024-02-28 15:50:31 +01:00
test_flash_santacoder.py	…
test_flash_starcoder.py	…
test_flash_starcoder2.py	feat: starcoder2 (#1605 )	2024-02-28 12:07:08 +01:00
test_flash_starcoder_gptq.py	…
test_grammar_llama.py	fix: correctly index into mask when applying grammar (#1618 )	2024-03-01 18:22:01 +01:00
test_idefics.py	Adding Llava-Next (Llava 1.6) with full support. (#1709 )	2024-04-09 21:32:00 +02:00
test_idefics2.py	Idefics2. (#1756 )	2024-04-23 23:04:44 +02:00
test_llava_next.py	Adding Llava-Next (Llava 1.6) with full support. (#1709 )	2024-04-09 21:32:00 +02:00
test_mamba.py	…
test_mpt.py	…
test_mt0_base.py	Adding Llava-Next (Llava 1.6) with full support. (#1709 )	2024-04-09 21:32:00 +02:00
test_neox.py	…
test_neox_sharded.py	…
test_t5_sharded.py	Improve the defaults for the launcher (#1727 )	2024-04-12 14:20:31 +02:00
test_tools_llama.py	feat: improve tools to include name and add tests (#1693 )	2024-04-16 09:02:46 -04:00