Commit Graph

4 Commits

Author SHA1 Message Date
Wang, Yi b6bb1d5160
Cpu dockerimage (#2367)
Add intel-cpu Docker image

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2024-08-12 14:10:30 +02:00
Daniël de Kok 22fb1be588
Fix cache block size for flash decoding (#2351)
* Fix cache block size for flash decoding

This seems to have been accidentally dropped during the TRT-LLM
PR rebase.

* Also run CI on changes to `backends`
2024-08-01 15:38:57 +02:00
Daniël de Kok 67ef0649cf
GPTQ CI improvements (#2151)
* Add more representative Llama GPTQ test

The Llama GPTQ test is updated to use a model with the commonly-used
quantizer config format and activation sorting. The old test is
kept around (but renamed) since it tests the format produced by
`text-generation-server quantize`.

* Add support for manually triggering a release build
2024-07-05 14:12:16 +02:00
Nicolas Patry 480d3b3304
New runner. Manual squash. (#2110)
* New runner. Manual squash.

* Network host.

* Put back trufflehog with proper extension.

* No network host?

* Moving buildx install after tailscale?

* 1.79
2024-06-24 18:08:34 +02:00