hf_text-generation-inference/server/text_generation_server/layers/attention
Mohit Sharma 473d9a892d Merge remote-tracking branch 'upstream/main' into rocm_6.2_updates 2024-09-27 15:36:12 +00:00
__init__.py           Improve support for GPUs with capability < 8 (#2575)                2024-09-27 16:19:42 +02:00
common.py             fix issue for sliding window models                                 2024-09-24 10:53:19 +00:00
cuda.py               Improve support for GPUs with capability < 8 (#2575)                2024-09-27 16:19:42 +02:00
flash_attn_triton.py  feat: add ruff and resolve issue (#2262)                            2024-07-26 10:29:09 -04:00
flashinfer.py         More tensor cores. (#2558)                                          2024-09-24 23:57:26 +02:00
ipex.py               Improve support for GPUs with capability < 8 (#2575)                2024-09-27 16:19:42 +02:00
rocm.py               Merge remote-tracking branch 'upstream/main' into rocm_6.2_updates  2024-09-27 15:36:12 +00:00