hf_text-generation-inference

Jae-Won Chung b9633c46d0 Fix typing in `Model.generate_token` (#733)	2023-07-31 14:35:14 +02:00

## What does this PR do?

This PR fixes a minor type annotation issue in the signature of `Model.generate_token`.

All existing overrides of `Model.generate_token` return `Tuple[List[Generation], Optional[B]]`:

- `3ef5ffbc64/server/text_generation_server/models/causal_lm.py (L535-L537)`
- `3ef5ffbc64/server/text_generation_server/models/flash_causal_lm.py (L802-L804)`
- `3ef5ffbc64/server/text_generation_server/models/seq2seq_lm.py (L589-L591)`

I suspect that back in `017a2a8c` when `GeneratedText` and `Generation` were separated, the function signature was not updated.

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [x] Did you read the [contributor guideline](https://github.com/huggingface/transformers/blob/main/CONTRIBUTING.md#start-contributing-pull-requests), Pull Request section?
- [ ] Was this discussed/approved via a Github issue or the [forum](https://discuss.huggingface.co/)? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the [documentation guidelines](https://github.com/huggingface/transformers/tree/main/docs), and [here are tips on formatting docstrings](https://github.com/huggingface/transformers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?

CC @OlivierDehaene
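
For readers unfamiliar with the annotation in question, the return type named in the commit message, `Tuple[List[Generation], Optional[B]]`, can be illustrated with a minimal sketch. Everything below other than the names `Model`, `generate_token`, `Generation`, and `B` is an assumption made for illustration: the fields of `Generation`, the `Batch` class, and `EchoModel` are hypothetical stand-ins, not the repository's actual definitions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, List, Optional, Tuple, TypeVar


@dataclass
class Generation:
    # Hypothetical stand-in; the real class is assumed to carry more fields.
    request_id: int
    token_text: str


@dataclass
class Batch:
    # Hypothetical stand-in for a batch of in-flight requests.
    request_ids: List[int]
    remaining_steps: int


# B is the batch type a concrete model works with.
B = TypeVar("B", bound=Batch)


class Model(ABC, Generic[B]):
    @abstractmethod
    def generate_token(self, batch: B) -> Tuple[List[Generation], Optional[B]]:
        """Run one decoding step.

        Returns the generations produced in this step and the batch to
        continue with, or None once every request has finished.
        """
        raise NotImplementedError


class EchoModel(Model[Batch]):
    # Toy override (hypothetical): emits one placeholder token per request.
    def generate_token(self, batch: Batch) -> Tuple[List[Generation], Optional[Batch]]:
        generations = [Generation(rid, "tok") for rid in batch.request_ids]
        remaining = batch.remaining_steps - 1
        next_batch = Batch(batch.request_ids, remaining) if remaining > 0 else None
        return generations, next_batch
```

In this sketch the `Optional[B]` half of the tuple is what lets a caller loop until the batch is exhausted: it keeps calling `generate_token` while the second element is not `None`.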
models	Fix typing in `Model.generate_token` (#733)	2023-07-31 14:35:14 +02:00
pb	feat(server): clear cache on error (#143)	2023-03-28 11:29:35 +02:00
utils	Local gptq support. (#738)	2023-07-31 10:32:52 +02:00
__init__.py	feat(clients): Python client (#103)	2023-03-07 18:52:22 +01:00
cache.py	fix(server): decrease memory fragmentation (#557)	2023-07-06 14:28:33 +02:00
cli.py	feat(server): Reworking the quantization script so it's still universal (not llama specific) (#587)	2023-07-18 12:19:05 +02:00
interceptor.py	feat(server): empty cache on errors	2023-07-12 17:06:19 +02:00
server.py	fix(server): fix quantization python requirements (#708)	2023-07-27 12:28:10 +02:00
tracing.py	feat(clients): Python client (#103)	2023-03-07 18:52:22 +01:00