Commit Graph

11 Commits

Author SHA1 Message Date
OlivierDehaene ad66f6ef9a
feat(server): optim flash causal lm decode_token (#285) 2023-05-09 18:26:19 +02:00
OlivierDehaene 85aa7e2e7b
feat(server): support hf endpoint weight layout (#266) 2023-05-03 11:36:24 +02:00
OlivierDehaene 4096000e34
fix(server): fix typo in tokenizers decode (#269)
closes #268
2023-05-03 10:10:34 +02:00
OlivierDehaene db4cb5e4ed
fix(server): fix past key values logic (#216)
@njhill fyi
2023-04-21 15:59:18 +02:00
OlivierDehaene 343437c7b5
feat(router): add device and dtype info (#215) 2023-04-21 15:36:29 +02:00
OlivierDehaene e14ae3b5e9
feat(server): support quantization for flash models (#200)
closes #197
2023-04-19 12:51:11 +02:00
OlivierDehaene 880a76eed5
feat(server): support sharded santacoder (#167) 2023-04-12 17:18:08 +02:00
OlivierDehaene 5fa8ae041c
feat(server): optimize decode for sane tokenizers (#170) 2023-04-12 12:03:10 +02:00
OlivierDehaene 9987960062
feat(router): make router input validation optional (#164) 2023-04-09 20:22:27 +02:00
OlivierDehaene 3f2542bb6a
fix(server): fix escape characters in stop sequence (#155) 2023-04-05 19:37:41 +02:00
OlivierDehaene c0aeb32583
feat(server): flash santacoder (#153) 2023-04-03 19:06:42 +02:00