Commit Graph

1 Commit

Author: OlivierDehaene
SHA1: 6f88bd9390
Date: 2024-10-25 21:10:00 +00:00
Message:
feat: add triton kernels to decrease latency of large batches (#2687)

* feat: add triton kernels to decrease latency of large batches

* cast to int32

* fix kernel

* fix kernel

* disable triton on rocm

* fix speculation

* add slots filtering kernel