hf_text-generation-inference/server/text_generation_server
drbh 30be188400
Fix: don't apply post layernorm in SiglipVisionTransformer (#2459)
* Fix: don't apply post layernorm in SiglipVisionTransformer

This fixes a bug with LLaVA Next when using Siglip as the vision model. LLaVA Next expects the output of the vision model to be the encoder outputs before layernorm (see original transformers implementation here: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llava_next/modeling_llava_next.py#L813).

This also makes Siglip consistent with the existing Clip implementation:

https://github.com/huggingface/text-generation-inference/blob/main/server/text_generation_server/models/custom_modeling/clip.py#L613
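The ordering fix described above can be sketched as follows. This is a minimal, illustrative stand-in, not the actual TGI implementation: the class name echoes the real module, but the encoder and layernorm here are toy callables, and the real code operates on torch tensors with attention masks and position embeddings. The point is only the control flow: the vision transformer should hand back the encoder's hidden states *before* `post_layernorm`, leaving any normalization to the caller (here, LLaVA Next's feature selection).

```python
class SiglipVisionTransformerSketch:
    """Toy sketch of the forward-pass ordering; names are illustrative."""

    def __init__(self, encoder, post_layernorm):
        self.encoder = encoder
        self.post_layernorm = post_layernorm

    def forward(self, pixel_values):
        hidden_states = self.encoder(pixel_values)
        # Buggy behaviour returned self.post_layernorm(hidden_states).
        # Fixed behaviour: return the pre-layernorm encoder output, so a
        # caller like LLaVA Next gets the encoder outputs it expects.
        return hidden_states


# Toy stand-ins so the ordering is observable:
encoder = lambda xs: [v * 2.0 for v in xs]                 # fake encoder
post_ln = lambda xs: [v - sum(xs) / len(xs) for v in xs]   # fake layernorm

model = SiglipVisionTransformerSketch(encoder, post_ln)
out = model.forward([1.0, 2.0, 3.0])
print(out)  # the raw encoder output, untouched by post_layernorm
```

This also mirrors the existing CLIP implementation in the repo, which likewise returns the encoder output without applying the final layernorm.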

* fix: adjust pali gemma for post layer norm and small refactors

---------

Co-authored-by: Travis Addair <tgaddair@gmail.com>
2024-08-26 17:04:46 -04:00
adapters feat: add ruff and resolve issue (#2262) 2024-07-26 10:29:09 -04:00
layers Prefix caching (#2402) 2024-08-20 11:15:30 +02:00
models Fix: don't apply post layernorm in SiglipVisionTransformer (#2459) 2024-08-26 17:04:46 -04:00
pb chore: add pre-commit (#1569) 2024-02-16 11:58:58 +01:00
utils fix: allocate tmp based on sgmv kernel if available (#2345) 2024-08-12 17:24:32 +02:00
__init__.py feat(clients): Python client (#103) 2023-03-07 18:52:22 +01:00
cache.py fix(server): decrease memory fragmentation (#557) 2023-07-06 14:28:33 +02:00
cli.py feat: add ruff and resolve issue (#2262) 2024-07-26 10:29:09 -04:00
interceptor.py v2.0.0 (#1736) 2024-04-12 18:38:34 +02:00
server.py Upgrading exl2. (#2415) 2024-08-14 11:58:08 +02:00
tracing.py Add OTLP Service Name Environment Variable (#2076) 2024-06-25 09:33:01 +02:00