hf_text-generation-inference/server
drbh 30be188400
Fix: don't apply post layernorm in SiglipVisionTransformer (#2459)
* Fix: don't apply post layernorm in SiglipVisionTransformer

This fixes a bug with LLaVA Next when Siglip is used as the vision model. LLaVA Next expects the output of the vision model to be the encoder output before the post layernorm (see the original transformers implementation: https://github.com/huggingface/transformers/blob/main/src/transformers/models/llava_next/modeling_llava_next.py#L813). A minimal sketch of the change follows the commit details below.

This also makes Siglip consistent with the existing Clip implementation:

https://github.com/huggingface/text-generation-inference/blob/main/server/text_generation_server/models/custom_modeling/clip.py#L613

* fix: adjust pali gemma for post layer norm and small refactors

---------

Co-authored-by: Travis Addair <tgaddair@gmail.com>
2024-08-26 17:04:46 -04:00
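
The gist of the change, as a minimal sketch rather than the exact TGI code: the Siglip vision tower returns the encoder output unchanged, and callers that need the post layernorm (such as PaliGemma, per the second commit bullet) apply it themselves. Class and argument names below are illustrative, not the repository's exact signatures.

```python
import torch
import torch.nn as nn


class SiglipVisionTransformerSketch(nn.Module):
    """Illustrative stand-in for SiglipVisionTransformer after the fix."""

    def __init__(self, embeddings: nn.Module, encoder: nn.Module, hidden_size: int):
        super().__init__()
        self.embeddings = embeddings
        self.encoder = encoder
        # Kept as a submodule so checkpoint weights still load; it is simply
        # no longer applied inside forward().
        self.post_layernorm = nn.LayerNorm(hidden_size)

    def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
        hidden_states = self.embeddings(pixel_values)
        last_hidden_state = self.encoder(hidden_states)
        # The fix: return the pre-layernorm encoder output. LLaVA Next consumes
        # this directly, matching the existing Clip implementation; PaliGemma
        # applies self.post_layernorm itself where it needs the normalized output.
        return last_hidden_state
```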
| Name | Last commit message | Last commit date |
| --- | --- | --- |
| custom_kernels | All integration tests back everywhere (too many failed CI). (#2428) | 2024-08-16 21:19:46 +02:00 |
| exllama_kernels | MI300 compatibility (#1764) | 2024-05-17 15:30:47 +02:00 |
| exllamav2_kernels | chore: add pre-commit (#1569) | 2024-02-16 11:58:58 +01:00 |
| tests | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| text_generation_server | Fix: don't apply post layernorm in SiglipVisionTransformer (#2459) | 2024-08-26 17:04:46 -04:00 |
| .gitignore | Impl simple mamba model (#1480) | 2024-02-08 10:19:45 +01:00 |
| Makefile | Upgrading exl2. (#2415) | 2024-08-14 11:58:08 +02:00 |
| Makefile-awq | chore: add pre-commit (#1569) | 2024-02-16 11:58:58 +01:00 |
| Makefile-eetq | Upgrade EETQ (Fixes the cuda graphs). (#1729) | 2024-04-12 08:15:28 +02:00 |
| Makefile-exllamav2 | Upgrading exl2. (#2415) | 2024-08-14 11:58:08 +02:00 |
| Makefile-fbgemm | Upgrade fbgemm (#2398) | 2024-08-12 14:08:38 +02:00 |
| Makefile-flash-att | Hotfixing `make install`. (#2008) | 2024-06-04 23:34:03 +02:00 |
| Makefile-flash-att-v2 | Softcapping for gemma2. (#2273) | 2024-07-22 18:27:10 +02:00 |
| Makefile-lorax-punica | Enable multiple LoRa adapters (#2010) | 2024-06-25 14:46:27 -04:00 |
| Makefile-selective-scan | chore: add pre-commit (#1569) | 2024-02-16 11:58:58 +01:00 |
| Makefile-vllm | Add support for Deepseek V2 (#2224) | 2024-07-19 17:23:20 +02:00 |
| README.md | chore: add pre-commit (#1569) | 2024-02-16 11:58:58 +01:00 |
| poetry.lock | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |
| pyproject.toml | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |
| requirements_cuda.txt | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |
| requirements_intel.txt | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |
| requirements_rocm.txt | Fixing exl2 and other quantize tests again. (#2419) | 2024-08-15 11:12:51 +02:00 |

README.md

# Text Generation Inference Python gRPC Server

A Python gRPC server for Text Generation Inference

## Install

    make install

## Run

    make run-dev
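
Once the server is running, other processes talk to it over gRPC. The sketch below is a hypothetical health-check client, assuming the generated protobuf stubs live under `text_generation_server.pb` (as produced during `make install`) and that shard 0 listens on the default unix socket `/tmp/text-generation-server-0`; module names and the socket path may differ in your setup.

```python
# Hypothetical health-check client; the module path, service name, and socket
# path are assumptions based on the generated stubs and default settings.
import grpc

from text_generation_server.pb import generate_pb2, generate_pb2_grpc


def check_health(uds_path: str = "/tmp/text-generation-server-0") -> None:
    # The shards listen on unix domain sockets rather than TCP ports.
    with grpc.insecure_channel(f"unix://{uds_path}") as channel:
        stub = generate_pb2_grpc.TextGenerationServiceStub(channel)
        # Health is a simple request/response RPC; it raises grpc.RpcError on failure.
        stub.Health(generate_pb2.HealthRequest())
        print("server is healthy")


if __name__ == "__main__":
    check_health()
```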