text-generation-inference/server/text_generation_server/layers/attention
Latest commit bab02ff2bc by drbh, 2024-07-26 10:29:09 -04:00:
feat: add ruff and resolve issue (#2262)

* feat: add ruff and resolve issue
* fix: update client exports and adjust after rebase
* fix: adjust syntax to avoid circular import (see the sketch below)
* fix: adjust client ruff settings
* fix: lint and refactor import check and avoid model enum as global names
* fix: improve fbgemm_gpu check and lints (see the sketch below)
* fix: update lints
* fix: prefer comparing model enum over str (see the sketch below)
* fix: adjust lints and ignore specific rules
* fix: avoid unneeded quantize check
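One of the bullets above mentions avoiding a circular import. The listing does not show the actual diff, but the usual fix is to defer the offending import to call time; a minimal Python sketch, with hypothetical module and registry names:

```python
# layers.py -- hypothetical module that participates in an import cycle
# (models.py imports layers.py, so layers.py must not import models.py
# at module load time).


def get_model_registry():
    # Deferring the import to call time breaks the cycle: by the time
    # this function runs, both modules have finished loading.
    from models import MODEL_REGISTRY  # hypothetical name

    return MODEL_REGISTRY
```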
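The improved fbgemm_gpu check from the same commit is likewise not shown here. A common pattern for such an availability check is to probe for the package without importing it (the helper name below is hypothetical):

```python
import importlib.util


def is_fbgemm_gpu_available() -> bool:
    # find_spec() locates the package without executing its import,
    # so a missing or broken fbgemm_gpu install cannot crash startup.
    return importlib.util.find_spec("fbgemm_gpu") is not None
```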
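Finally, "prefer comparing model enum over str" points at a standard Python idiom; a sketch with hypothetical names (the repository's actual enum is not part of this listing):

```python
from enum import Enum


class ModelType(Enum):  # hypothetical stand-in for the real enum
    LLAMA = "llama"
    MISTRAL = "mistral"


def uses_sliding_window(model_type: ModelType) -> bool:
    # Comparing against the enum member (not the raw string "mistral")
    # lets typos fail loudly as an AttributeError instead of silently
    # evaluating to False.
    return model_type == ModelType.MISTRAL
```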
| File | Last commit | Date |
| --- | --- | --- |
| __init__.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| common.py | [Major Change][Undecided yet] Move to FlashDecoding instead of PagedAttention kernel. (#1940) | 2024-07-01 23:28:00 +02:00 |
| cuda.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| flash_attn_triton.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| ipex.py | fix FlashDecoding change's regression in intel platform (#2161) | 2024-07-02 11:56:07 +02:00 |
| rocm.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |