text-generation-inference/server/text_generation_server/layers/attention
Latest commit bab02ff2bc by drbh, 2024-07-26 10:29:09 -04:00:
feat: add ruff and resolve issue (#2262)

* feat: add ruff and resolve issue
* fix: update client exports and adjust after rebase
* fix: adjust syntax to avoid circular import (see the sketch below)
* fix: adjust client ruff settings
* fix: lint and refactor import check and avoid model enum as global names
* fix: improve fbgemm_gpu check and lints (see the sketch below)
* fix: update lints
* fix: prefer comparing model enum over str (see the sketch below)
* fix: adjust lints and ignore specific rules
* fix: avoid unneeded quantize check
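One of the bullets above mentions avoiding a circular import. The listing does not show the actual diff, but the usual fix is to defer the offending import to call time; a minimal Python sketch, with hypothetical module and registry names:

```python
# layers.py -- hypothetical module that participates in an import cycle
# (models.py imports layers.py, so layers.py must not import models.py
# at module load time).


def get_model_registry():
    # Deferring the import to call time breaks the cycle: by the time
    # this function runs, both modules have finished loading.
    from models import MODEL_REGISTRY  # hypothetical name

    return MODEL_REGISTRY
```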
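The improved fbgemm_gpu check from the same commit is likewise not shown here. A common pattern for such an availability check is to probe for the package without importing it (the helper name below is hypothetical):

```python
import importlib.util


def is_fbgemm_gpu_available() -> bool:
    # find_spec() locates the package without executing its import,
    # so a missing or broken fbgemm_gpu install cannot crash startup.
    return importlib.util.find_spec("fbgemm_gpu") is not None
```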
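Finally, "prefer comparing model enum over str" points at a standard Python idiom; a sketch with hypothetical names (the repository's actual enum is not part of this listing):

```python
from enum import Enum


class ModelType(Enum):  # hypothetical stand-in for the real enum
    LLAMA = "llama"
    MISTRAL = "mistral"


def uses_sliding_window(model_type: ModelType) -> bool:
    # Comparing against the enum member (not the raw string "mistral")
    # lets typos fail loudly as an AttributeError instead of silently
    # evaluating to False.
    return model_type == ModelType.MISTRAL
```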
| File | Last commit | Date |
| --- | --- | --- |
| __init__.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| common.py | [Major Change][Undecided yet] Move to FlashDecoding instead of PagedAttention kernel. (#1940) | 2024-07-01 23:28:00 +02:00 |
| cuda.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| flash_attn_triton.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |
| ipex.py | fix FlashDecoding change's regression in intel platform (#2161) | 2024-07-02 11:56:07 +02:00 |
| rocm.py | feat: add ruff and resolve issue (#2262) | 2024-07-26 10:29:09 -04:00 |