hf_text-generation-inference/.github
Latest commit by Daniël de Kok (22fb1be588):
Fix cache block size for flash decoding (#2351)
* Fix cache block size for flash decoding

This seems to have been accidentally dropped during the TRT-LLM
PR rebase.

* Also run CI on changes to `backends`
Committed 2024-08-01 15:38:57 +02:00
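The second bullet of the commit message adds `backends` to the CI path filters. A minimal sketch of what such a trigger could look like in one of the files under `workflows`, assuming a standard GitHub Actions `paths` filter; the workflow name, branch, and job contents here are illustrative assumptions, not the repository's actual configuration:

```yaml
# Hedged sketch of a CI trigger that also fires on changes under
# `backends/`. Everything besides the `backends/**` pattern is an
# assumption for illustration.
name: CI

on:
  pull_request:
    paths:
      - "backends/**"
  push:
    branches:
      - main
    paths:
      - "backends/**"

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      # Check out the repository, then run the test suite.
      - uses: actions/checkout@v4
      - run: echo "Run the test suite here"
```

With a filter like this, a pull request touching only files under `backends/` still triggers the workflow, which is the behavior the commit message describes.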
Name                        Last commit                                        Last updated
ISSUE_TEMPLATE              chore: add pre-commit (#1569)                      2024-02-16 11:58:58 +01:00
workflows                   Fix cache block size for flash decoding (#2351)    2024-08-01 15:38:57 +02:00
PULL_REQUEST_TEMPLATE.md    chore(github): add templates (#264)                2023-05-02 15:43:19 +02:00