OlivierDehaene | ebc74d5666 | feat(router): use number of tokens in batch as input for dynamic batching (#226) Co-authored-by: Nick Hill <nickhill@us.ibm.com> | 2023-04-24 17:59:00 +02:00
Nick Hill | 4a7dd4085a | feat(server): reduce memory requirement (#214) | 2023-04-24 14:15:42 +02:00
OlivierDehaene | 343437c7b5 | feat(router): add device and dtype info (#215) | 2023-04-21 15:36:29 +02:00
OlivierDehaene | 709d8936f6 | feat(router): drop requests when client closes the channel (#202) | 2023-04-20 11:07:40 +02:00
OlivierDehaene | 5fa8ae041c | feat(server): optimize decode for sane tokenizers (#170) | 2023-04-12 12:03:10 +02:00
OlivierDehaene | 299217c95c | feat(server): add flash attention llama (#144) | 2023-04-11 16:38:22 +02:00
OlivierDehaene | 9987960062 | feat(router): make router input validation optional (#164) | 2023-04-09 20:22:27 +02:00
OlivierDehaene | 05e9a796cc | feat(server): flash neoX (#133) | 2023-03-24 14:02:14 +01:00
OlivierDehaene | b49dbf2d88 | fix(server): use server tokenizer as gt (#128) | 2023-03-16 12:12:26 +01:00
OlivierDehaene | 8ad60b752f | fix(server): add position ids to neox (#126) | 2023-03-15 13:12:49 +01:00
OlivierDehaene | 941cd42e0c | fix(server): fix index out of range for watermarking (#110) | 2023-03-08 18:29:08 +01:00
OlivierDehaene | 3fef90d50f | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00