Commit Graph

4 Commits

Author SHA1 Message Date
OlivierDehaene db4cb5e4ed
fix(server): fix past key values logic (#216)
@njhill fyi
2023-04-21 15:59:18 +02:00
OlivierDehaene 343437c7b5
feat(router): add device and dtype info (#215)
2023-04-21 15:36:29 +02:00
OlivierDehaene e14ae3b5e9
feat(server): support quantization for flash models (#200)
closes #197
2023-04-19 12:51:11 +02:00
OlivierDehaene 299217c95c
feat(server): add flash attention llama (#144)
2023-04-11 16:38:22 +02:00