OlivierDehaene
1bb394631d
fix(docker): fix docker image (#184)
2023-04-14 17:31:13 +02:00
OlivierDehaene
53ee09c0b0
feat(dockerfile): better layer caching (#159)
2023-04-14 10:12:21 +02:00
OlivierDehaene
1883d8ecde
feat(docker): improve flash_attention caching (#160)
2023-04-09 19:59:16 +02:00
OlivierDehaene
d503e8f09d
feat: aws sagemaker compatible image (#147)
The only difference is that now it pushes to
registry.internal.huggingface.tech/api-inference/community/text-generation-inference/sagemaker:...
instead of
registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sagemaker-...
---------
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
2023-03-29 21:38:30 +02:00
OlivierDehaene
05e9a796cc
feat(server): flash neoX (#133)
2023-03-24 14:02:14 +01:00
OlivierDehaene
e3ded361b2
feat(ci): improve CI speed (#94)
2023-03-03 15:07:27 +01:00
OlivierDehaene
17bc841b1b
feat(server): enable hf-transfer (#76)
2023-02-18 14:04:11 +01:00
OlivierDehaene
9af454142a
feat: add distributed tracing (#62)
2023-02-13 13:02:45 +01:00
OlivierDehaene
1ad3250b89
fix(docker): increase shm size (#60)
2023-02-08 17:53:33 +01:00
OlivierDehaene
20c3c5940c
feat(router): refactor API and add openAPI schemas (#53)
2023-02-03 12:43:37 +01:00
OlivierDehaene
13e7044ab7
fix(dockerfile): fix docker build (#32)
2023-01-24 19:52:39 +01:00
OlivierDehaene
ab2ad91da3
fix(docker): fix api-inference deployment (#30)
2023-01-23 17:33:08 +01:00
OlivierDehaene
f9d0ec376a
feat(docker): Make the image compatible with api-inference (#29)
2023-01-23 17:11:27 +01:00
OlivierDehaene
6c781025ae
feat(rust): Update to 1.65
2022-11-14 13:59:56 +01:00
OlivierDehaene
fa43fb71be
fix(server): Fix Transformers fork version
2022-11-08 17:42:38 +01:00
OlivierDehaene
4236e41b0d
feat(server): Improved doc
2022-11-07 12:53:56 +01:00
OlivierDehaene
b3b7ea0d74
feat: Use json formatter by default in docker image
2022-11-02 17:29:56 +01:00
OlivierDehaene
3cf6368c77
feat(server): Support all AutoModelForCausalLM on a best effort basis
2022-10-28 19:24:00 +02:00
OlivierDehaene
09674e6df9
feat(server): Support bitsandbytes
2022-10-27 14:25:29 +02:00
Nicolas Patry
c8ce9b2515
feat(server): Use safetensors
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
2022-10-22 20:00:15 +02:00
Olivier Dehaene
f16f2f5ae1
v0.1.0
2022-10-20 19:14:44 +02:00
Olivier Dehaene
92c1ecd008
feat: Add arguments to CLI
2022-10-17 18:27:33 +02:00
Olivier Dehaene
5e5d8766a2
feat: Improve error handling
2022-10-17 14:59:00 +02:00
Olivier Dehaene
bf99afe916
feat: Docker image
2022-10-14 15:56:21 +02:00