Commit Graph

4 Commits

Author SHA1 Message Date
Michael Feil ff703cb867
Adding ctranslate2 quantization and inference: moving the contribution (#1)
* rebasing the commit on preemo fork.

* reformatting and changes.

* update dockerfile

* update changes for dockerfile

* adapt path

---------

Co-authored-by: michaelfeil <me@michaelfeil.eu>
2023-10-02 11:12:49 -07:00
Nick Hill e4b26aa10b
fix(server): avoid errors for very small top_p values (#544)
See https://github.com/huggingface/transformers/pull/24111

I didn't add validation to the `__init__` method since it's not done for
other values/warpers.
2023-07-04 20:11:33 +02:00
OlivierDehaene 53aa9194c8
fix(server): fix warpers on CPU (#472)
Closes #471
2023-06-20 11:06:10 +02:00
OlivierDehaene 62f91f78ac
feat(server): support vectorized warpers in flash causal lm (#317)
Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>
2023-05-26 12:30:27 +02:00