hf_text-generation-inference/server
drbh 30be188400
Fix: don't apply post layernorm in SiglipVisionTransformer (#2459)
* Fix: don't apply post layernorm in SiglipVisionTransformer

This fixes a bug with LLaVA Next when Siglip is used as the vision model. LLaVA Next expects the output of the vision model to be the encoder output before the post layernorm (see the original transformers implementation: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llava_next/modeling_llava_next.py#L813). A minimal sketch of the change follows the commit details below.

This also makes Siglip consistent with the existing Clip implementation:

https://github.com/huggingface/text-generation-inference/blob/main/server/text_generation_server/models/custom_modeling/clip.py#L613

* fix: adjust pali gemma for post layer norm and small refactors

---------

Co-authored-by: Travis Addair <tgaddair@gmail.com>
2024-08-26 17:04:46 -04:00
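
The gist of the change, as a minimal sketch rather than the exact TGI code: the Siglip vision tower returns the encoder output unchanged, and callers that need the post layernorm (such as PaliGemma, per the second commit bullet) apply it themselves. Class and argument names below are illustrative, not the repository's exact signatures.

```python
import torch
import torch.nn as nn


class SiglipVisionTransformerSketch(nn.Module):
    """Illustrative stand-in for SiglipVisionTransformer after the fix."""

    def __init__(self, embeddings: nn.Module, encoder: nn.Module, hidden_size: int):
        super().__init__()
        self.embeddings = embeddings
        self.encoder = encoder
        # Kept as a submodule so checkpoint weights still load; it is simply
        # no longer applied inside forward().
        self.post_layernorm = nn.LayerNorm(hidden_size)

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        hidden_states = self.embeddings(pixel_values)
        last_hidden_state = self.encoder(hidden_states)
        # The fix: return the pre-layernorm encoder output. LLaVA Next consumes
        # this directly, matching the existing Clip implementation; PaliGemma
        # applies self.post_layernorm itself where it needs the normalized output.
        return last_hidden_state
```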
| Name | Last commit message | Last commit date |
| --- | --- | --- |
| custom_kernels | All integration tests back everywhere (too many failed CI). (#2428) | 2024-08-16 21:19:46 +02:00 |
| exllama_kernels | MI300 compatibility (#1764) | 2024-05-17 15:30:47 +02:00 |
| exllamav2_kernels | chore: add pre-commit (#1569) | 2024-02-16 11:58:58 +01:00 |
| tests | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| text_generation_server | Fix: don't apply post layernorm in SiglipVisionTransformer (#2459) | 2024-08-26 17:04:46 -04:00 |
| .gitignore | Impl simple mamba model (#1480) | 2024-02-08 10:19:45 +01:00 |
| Makefile | Upgrading exl2. (#2415) | 2024-08-14 11:58:08 +02:00 |
| Makefile-awq | chore: add pre-commit (#1569) | 2024-02-16 11:58:58 +01:00 |
| Makefile-eetq | Upgrade EETQ (Fixes the cuda graphs). (#1729) | 2024-04-12 08:15:28 +02:00 |
| Makefile-exllamav2 | Upgrading exl2. (#2415) | 2024-08-14 11:58:08 +02:00 |
| Makefile-fbgemm | Upgrade fbgemm (#2398) | 2024-08-12 14:08:38 +02:00 |
| Makefile-flash-att | Hotfixing `make install`. (#2008) | 2024-06-04 23:34:03 +02:00 |
| Makefile-flash-att-v2 | Softcapping for gemma2. (#2273) | 2024-07-22 18:27:10 +02:00 |
| Makefile-lorax-punica | Enable multiple LoRa adapters (#2010) | 2024-06-25 14:46:27 -04:00 |
| Makefile-selective-scan | chore: add pre-commit (#1569) | 2024-02-16 11:58:58 +01:00 |
| Makefile-vllm | Add support for Deepseek V2 (#2224) | 2024-07-19 17:23:20 +02:00 |
| README.md | chore: add pre-commit (#1569) | 2024-02-16 11:58:58 +01:00 |
| poetry.lock | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |
| pyproject.toml | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |
| requirements_cuda.txt | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |
| requirements_intel.txt | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |
| requirements_rocm.txt | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |

README.md

# Text Generation Inference Python gRPC Server

A Python gRPC server for Text Generation Inference

## Install

    make install

## Run

    make run-dev
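
Once the server is running, other processes talk to it over gRPC. The sketch below is a hypothetical health-check client, assuming the generated protobuf stubs live under `text_generation_server.pb` (as produced during `make install`) and that shard 0 listens on the default unix socket `/tmp/text-generation-server-0`; module names and the socket path may differ in your setup.

```python
# Hypothetical health-check client; the module path, service name, and socket
# path are assumptions based on the generated stubs and default settings.
import grpc

from text_generation_server.pb import generate_pb2, generate_pb2_grpc


def check_health(uds_path: str = "/tmp/text-generation-server-0") -> None:
    # The shards listen on unix domain sockets rather than TCP ports.
    with grpc.insecure_channel(f"unix://{uds_path}") as channel:
        stub = generate_pb2_grpc.TextGenerationServiceStub(channel)
        # Health is a simple request/response RPC; it raises grpc.RpcError on failure.
        stub.Health(generate_pb2.HealthRequest())
        print("server is healthy")


if __name__ == "__main__":
    check_health()
```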