OlivierDehaene | 4236e41b0d | feat(server): Improved doc | 2022-11-07 12:53:56 +01:00
OlivierDehaene | cea6051eff | feat(launcher): Pass CUDA_VISIBLE_DEVICES to the shard | 2022-11-04 18:31:08 +01:00
OlivierDehaene | 427d7cc444 | feat(server): Support AutoModelForSeq2SeqLM | 2022-11-04 18:03:04 +01:00
OlivierDehaene | c5665f5c8b | feat(server): Support generic AutoModelForCausalLM | 2022-11-04 14:22:47 +01:00
OlivierDehaene | 755fc0e403 | fix(models): Revert buggy support for AutoModel | 2022-11-03 16:07:54 +01:00
OlivierDehaene | b3b7ea0d74 | feat: Use json formatter by default in docker image | 2022-11-02 17:29:56 +01:00
OlivierDehaene | 3cf6368c77 | feat(server): Support all AutoModelForCausalLM on a best effort basis | 2022-10-28 19:24:00 +02:00
OlivierDehaene | 09674e6df9 | feat(server): Support bitsandbytes | 2022-10-27 14:25:29 +02:00
OlivierDehaene | beb552127a | feat(client): Simplify sharded logic | 2022-10-22 23:40:05 +02:00
Nicolas Patry | c8ce9b2515 | feat(server): Use safetensors; Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com> | 2022-10-22 20:00:15 +02:00
Thomas Wang | be8827fe41 | Create LICENSE (#2) | 2022-10-22 10:44:52 +02:00
OlivierDehaene | c837893370 | feat(router): Add max_waiting_tokens | 2022-10-21 16:40:05 +02:00
OlivierDehaene | 895a341d06 | fix(validation): Fix error messages | 2022-10-21 10:59:15 +02:00
Olivier Dehaene | f16f2f5ae1 | v0.1.0 | 2022-10-20 19:14:44 +02:00
Olivier Dehaene | 92c1ecd008 | feat: Add arguments to CLI | 2022-10-17 18:27:33 +02:00
Olivier Dehaene | 5e5d8766a2 | feat: Improve error handling | 2022-10-17 14:59:00 +02:00
Olivier Dehaene | 00e6ce44b1 | Update aml deployment | 2022-10-17 10:39:59 +02:00
Olivier Dehaene | bcb53903b8 | feat: Add AML deployment | 2022-10-15 20:21:50 +02:00
Olivier Dehaene | bf99afe916 | feat: Docker image | 2022-10-14 15:56:21 +02:00
Olivier Dehaene | 39df4d9975 | Use axum | 2022-10-11 18:14:39 +02:00
Olivier Dehaene | e86ecbac63 | ValidationError was not correctly handled | 2022-10-11 16:53:40 +02:00
Olivier Dehaene | 4c693e6524 | Refactored gRPC interface; Added validation logic | 2022-10-11 16:50:54 +02:00
Olivier Dehaene | fa9a088467 | Add load testing | 2022-10-11 10:36:51 +02:00
Olivier Dehaene | 1d986983d5 | fix: cleanup | 2022-10-08 12:34:25 +02:00
Olivier Dehaene | 295831a481 | Init | 2022-10-08 12:30:12 +02:00