Nicolas Patry
411b0d4e1f
chore(github): add templates ( #264 )
2023-05-02 15:43:19 +02:00
OlivierDehaene
593a563414
feat(docker): add nvidia env vars ( #255 )
2023-04-27 19:18:33 +02:00
OlivierDehaene
98a3e0d135
chore(server): update huggingface-hub ( #227 )
2023-04-24 15:57:13 +02:00
OlivierDehaene
97df0c7bc0
misc: update to rust 1.69 ( #221 )
2023-04-21 21:00:30 +02:00
OlivierDehaene
b6ee0ec7b0
feat(router): add git sha to info route ( #208 )
2023-04-19 21:36:59 +02:00
OlivierDehaene
6837b2eb77
fix(docker): remove unused dependencies ( #205 )
2023-04-19 19:39:31 +02:00
OlivierDehaene
5d27f5259b
fix(server): fix hf_transfer issue with private repos ( #203 )
2023-04-19 17:36:16 +02:00
OlivierDehaene
7a1ba58557
fix(docker): fix docker image dependencies ( #187 )
2023-04-17 00:26:47 +02:00
OlivierDehaene
379c5c4da2
fix(docker): revert dockerfile changes ( #186 )
2023-04-14 19:30:30 +02:00
OlivierDehaene
f9047562d0
fix(docker): fix image ( #185 )
2023-04-14 18:58:38 +02:00
OlivierDehaene
1bb394631d
fix(docker): fix docker image ( #184 )
2023-04-14 17:31:13 +02:00
OlivierDehaene
53ee09c0b0
feat(dockerfile): better layer caching ( #159 )
2023-04-14 10:12:21 +02:00
OlivierDehaene
1883d8ecde
feat(docker): improve flash_attention caching ( #160 )
2023-04-09 19:59:16 +02:00
OlivierDehaene
d503e8f09d
feat: aws sagemaker compatible image ( #147 )
The only difference is that now it pushes to
registry.internal.huggingface.tech/api-inference/community/text-generation-inference/sagemaker:...
instead of
registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sagemaker-...
---------
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
2023-03-29 21:38:30 +02:00
OlivierDehaene
05e9a796cc
feat(server): flash neoX ( #133 )
2023-03-24 14:02:14 +01:00
OlivierDehaene
e3ded361b2
feat(ci): improve CI speed ( #94 )
2023-03-03 15:07:27 +01:00
OlivierDehaene
17bc841b1b
feat(server): enable hf-transfer ( #76 )
2023-02-18 14:04:11 +01:00
OlivierDehaene
9af454142a
feat: add distributed tracing ( #62 )
2023-02-13 13:02:45 +01:00
OlivierDehaene
1ad3250b89
fix(docker): increase shm size ( #60 )
2023-02-08 17:53:33 +01:00
OlivierDehaene
20c3c5940c
feat(router): refactor API and add openAPI schemas ( #53 )
2023-02-03 12:43:37 +01:00
OlivierDehaene
13e7044ab7
fix(dockerfile): fix docker build ( #32 )
2023-01-24 19:52:39 +01:00
OlivierDehaene
ab2ad91da3
fix(docker): fix api-inference deployment ( #30 )
2023-01-23 17:33:08 +01:00
OlivierDehaene
f9d0ec376a
feat(docker): Make the image compatible with api-inference ( #29 )
2023-01-23 17:11:27 +01:00
OlivierDehaene
6c781025ae
feat(rust): Update to 1.65
2022-11-14 13:59:56 +01:00
OlivierDehaene
fa43fb71be
fix(server): Fix Transformers fork version
2022-11-08 17:42:38 +01:00
OlivierDehaene
4236e41b0d
feat(server): Improved doc
2022-11-07 12:53:56 +01:00
OlivierDehaene
b3b7ea0d74
feat: Use json formatter by default in docker image
2022-11-02 17:29:56 +01:00
OlivierDehaene
3cf6368c77
feat(server): Support all AutoModelForCausalLM on a best effort basis
2022-10-28 19:24:00 +02:00
OlivierDehaene
09674e6df9
feat(server): Support bitsandbytes
2022-10-27 14:25:29 +02:00
Nicolas Patry
c8ce9b2515
feat(server): Use safetensors
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
2022-10-22 20:00:15 +02:00
Olivier Dehaene
f16f2f5ae1
v0.1.0
2022-10-20 19:14:44 +02:00
Olivier Dehaene
92c1ecd008
feat: Add arguments to CLI
2022-10-17 18:27:33 +02:00
Olivier Dehaene
5e5d8766a2
feat: Improve error handling
2022-10-17 14:59:00 +02:00
Olivier Dehaene
bf99afe916
feat: Docker image
2022-10-14 15:56:21 +02:00