hf_text-generation-inference/integration-tests
Daniël de Kok 64142489b6
Add support for fused MoE Marlin for AWQ (#2616)
* Add support for fused MoE Marlin for AWQ

This uses the updated MoE Marlin kernels from vLLM.

* Add integration test for AWQ MoE
2024-10-08 11:56:41 +02:00
Name              Last updated                Last commit
images            2024-05-16 06:58:47 +02:00  Pali gemma modeling (#1895)
models            2024-10-08 11:56:41 +02:00  Add support for fused MoE Marlin for AWQ (#2616)
conftest.py       2024-10-04 17:51:48 +02:00  Add basic FP8 KV cache support (#2603)
poetry.lock       2024-09-11 18:10:40 +02:00  Prefix test - Different kind of load test to trigger prefix test bugs. (#2490)
pyproject.toml    2024-09-11 18:10:40 +02:00  Prefix test - Different kind of load test to trigger prefix test bugs. (#2490)
pytest.ini        2024-02-16 11:58:58 +01:00  chore: add pre-commit (#1569)
requirements.txt  2024-09-11 18:10:40 +02:00  Prefix test - Different kind of load test to trigger prefix test bugs. (#2490)