Michael Feil
|
ff703cb867
|
Adding ctranslate2 quantization and inference: moving the contribution (#1)
* rebasing the commit on preemo fork.
* reformatting and changes.
* update dockerfile
* update changes for dockerfile
* adapt path
---------
Co-authored-by: michaelfeil <me@michaelfeil.eu>
|
2023-10-02 11:12:49 -07:00 |
Nick Hill
|
e4b26aa10b
|
fix(server): avoid errors for very small top_p values (#544)
See https://github.com/huggingface/transformers/pull/24111
I didn't add validation to the `__init__` method since it's not done for
other values/warpers.
|
2023-07-04 20:11:33 +02:00 |
OlivierDehaene
|
53aa9194c8
|
fix(server): fix warpers on CPU (#472)
Closes #471
|
2023-06-20 11:06:10 +02:00 |
OlivierDehaene
|
62f91f78ac
|
feat(server): support vectorized warpers in flash causal lm (#317)
Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
|
2023-05-26 12:30:27 +02:00 |