Commit Graph

74 Commits

Author SHA1 Message Date
OlivierDehaene 1bb394631d
fix(docker): fix docker image (#184) 2023-04-14 17:31:13 +02:00
OlivierDehaene 53ee09c0b0
fea(dockerfile): better layer caching (#159) 2023-04-14 10:12:21 +02:00
OlivierDehaene 1883d8ecde
feat(docker): improve flash_attention caching (#160) 2023-04-09 19:59:16 +02:00
OlivierDehaene d503e8f09d
feat: aws sagemaker compatible image (#147)
The only difference is that now it pushes to
registry.internal.huggingface.tech/api-inference/community/text-generation-inference/sagemaker:...
instead of
registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sagemaker-...

---------

Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
2023-03-29 21:38:30 +02:00
OlivierDehaene 05e9a796cc
feat(server): flash neoX (#133) 2023-03-24 14:02:14 +01:00
OlivierDehaene e3ded361b2
feat(ci): improve CI speed (#94) 2023-03-03 15:07:27 +01:00
OlivierDehaene 17bc841b1b
feat(server): enable hf-transfer (#76) 2023-02-18 14:04:11 +01:00
OlivierDehaene 9af454142a
feat: add distributed tracing (#62) 2023-02-13 13:02:45 +01:00
OlivierDehaene 1ad3250b89
fix(docker): increase shm size (#60) 2023-02-08 17:53:33 +01:00
OlivierDehaene 20c3c5940c
feat(router): refactor API and add openAPI schemas (#53) 2023-02-03 12:43:37 +01:00
OlivierDehaene 13e7044ab7
fix(dockerfile): fix docker build (#32) 2023-01-24 19:52:39 +01:00
OlivierDehaene ab2ad91da3
fix(docker): fix api-inference deployment (#30) 2023-01-23 17:33:08 +01:00
OlivierDehaene f9d0ec376a
feat(docker): Make the image compatible with api-inference (#29) 2023-01-23 17:11:27 +01:00
OlivierDehaene 6c781025ae feat(rust): Update to 1.65 2022-11-14 13:59:56 +01:00
OlivierDehaene fa43fb71be fix(server): Fix Transformers fork version 2022-11-08 17:42:38 +01:00
OlivierDehaene 4236e41b0d feat(server): Improved doc 2022-11-07 12:53:56 +01:00
OlivierDehaene b3b7ea0d74 feat: Use json formatter by default in docker image 2022-11-02 17:29:56 +01:00
OlivierDehaene 3cf6368c77 feat(server): Support all AutoModelForCausalLM on a best effort basis 2022-10-28 19:24:00 +02:00
OlivierDehaene 09674e6df9 feat(server): Support bitsandbytes 2022-10-27 14:25:29 +02:00
Nicolas Patry c8ce9b2515
feat(server): Use safetensors
Co-authored-by: OlivierDehaene <23298448+OlivierDehaene@users.noreply.github.com>
2022-10-22 20:00:15 +02:00
Olivier Dehaene f16f2f5ae1 v0.1.0 2022-10-20 19:14:44 +02:00
Olivier Dehaene 92c1ecd008 feat: Add arguments to CLI 2022-10-17 18:27:33 +02:00
Olivier Dehaene 5e5d8766a2 feat: Improve error handling 2022-10-17 14:59:00 +02:00
Olivier Dehaene bf99afe916 feat: Docker image 2022-10-14 15:56:21 +02:00