Commit Graph

31 Commits

Author SHA1 Message Date
fxmarty b2b5df0e94
Add ROCm support (#1243)
This PR adds support for AMD Instinct MI210 & MI250 GPUs, with paged
attention and FAv2 support.

Remaining items to discuss, on top of possible others:
* Should we have a
`ghcr.io/huggingface/text-generation-inference:1.1.0+rocm` hosted image,
or is it too early?
* Should we set up a CI on MI210/MI250? I don't have access to the
runners of TGI though.
* Are we comfortable with those changes being directly in TGI, or do we
need a fork?

---------

Co-authored-by: Felix Marty <felix@hf.co>
Co-authored-by: OlivierDehaene <olivier@huggingface.co>
Co-authored-by: Your Name <you@example.com>
2023-11-27 14:08:12 +01:00
OlivierDehaene 8acdc1fae7 hotfix 1.1.1 2023-11-16 18:35:09 +01:00
OlivierDehaene e3e487dc71
feat(server): support trust_remote_code (#363) 2023-05-23 20:40:39 +02:00
OlivierDehaene 5f67923cac
feat: add nightly load testing (#358) 2023-05-23 17:42:19 +02:00
oOraph 0a6494785c
fix(ci): fix security group (#359)
# What does this PR do?
Switch security group used for ci
(open outbound rules)

Signed-off-by: Raphael <oOraph@users.noreply.github.com>
Co-authored-by: Raphael <oOraph@users.noreply.github.com>
2023-05-23 16:49:11 +02:00
OlivierDehaene 5a58226130
fix(server): fix decode token (#334)
Fixes #333

---------

Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-05-16 23:23:27 +02:00
OlivierDehaene dbdc587ddd
feat(integration-tests): improve comparison and health checks (#336) 2023-05-16 20:22:11 +02:00
OlivierDehaene e71471bec9
feat: add snapshot testing (#282) 2023-05-15 23:36:30 +02:00
OlivierDehaene 66b277321d
feat(ci): custom gpu runners (#328) 2023-05-15 15:53:08 +02:00
Nicolas Patry 411b0d4e1f
chore(github): add templates (#264) 2023-05-02 15:43:19 +02:00
OlivierDehaene 274513e6a3
fix(ci): fix sha in docker image (#212) 2023-04-20 18:50:47 +02:00
OlivierDehaene 709d8936f6
feat(router): drop requests when client closes the channel (#202) 2023-04-20 11:07:40 +02:00
OlivierDehaene b6ee0ec7b0
feat(router): add git sha to info route (#208) 2023-04-19 21:36:59 +02:00
OlivierDehaene 7a1ba58557
fix(docker): fix docker image dependencies (#187) 2023-04-17 00:26:47 +02:00
OlivierDehaene 1bb394631d
fix(docker): fix docker image (#184) 2023-04-14 17:31:13 +02:00
OlivierDehaene 01c0e368e5
fix(ci): fix cosign error (#183) 2023-04-14 12:35:26 +02:00
OlivierDehaene 53ee09c0b0
feat(dockerfile): better layer caching (#159) 2023-04-14 10:12:21 +02:00
OlivierDehaene 12e5633c4d
fix(ci): fix ci permissions (#181) 2023-04-13 16:32:37 +02:00
OlivierDehaene c1e2ea3b78
feat(ci): faster scanning (#180) 2023-04-13 16:23:47 +02:00
OlivierDehaene 13f1cd024b
feat(ci): use large runners (#179) 2023-04-13 16:11:48 +02:00
OlivierDehaene 9683c37bd3
feat(ci): add Trivy and scan docker image (#178) 2023-04-13 15:43:17 +02:00
OlivierDehaene 643a39d556
feat(ci): add image signing with cosign (#175) 2023-04-13 15:26:34 +02:00
OlivierDehaene 64347b05ff
fix(ci): fix CVE in github-slug-action (#174) 2023-04-13 12:43:05 +02:00
OlivierDehaene 55106ec476
fix(ci): fix sagemaker action (#148) 2023-03-29 22:27:01 +02:00
OlivierDehaene d503e8f09d
feat: aws sagemaker compatible image (#147)
The only difference is that now it pushes to
registry.internal.huggingface.tech/api-inference/community/text-generation-inference/sagemaker:...
instead of
registry.internal.huggingface.tech/api-inference/community/text-generation-inference:sagemaker-...

---------

Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
2023-03-29 21:38:30 +02:00
OlivierDehaene 05e9a796cc
feat(server): flash neoX (#133) 2023-03-24 14:02:14 +01:00
OlivierDehaene 603e20b5f7
feat(ci): add ci paths (#134) 2023-03-23 18:01:30 +01:00
OlivierDehaene e3ded361b2
feat(ci): improve CI speed (#94) 2023-03-03 15:07:27 +01:00
OlivierDehaene e114d87486
feat(ci): push to AML registry (#56) 2023-02-06 14:33:56 +01:00
OlivierDehaene 20c3c5940c
feat(router): refactor API and add openAPI schemas (#53) 2023-02-03 12:43:37 +01:00
OlivierDehaene 404ed7a1f6
feat(ci): Docker build and push (#46) 2023-01-31 20:14:05 +01:00