hf_text-generation-inference

Commit Graph

Author	SHA1	Message	Date
Nicolas Patry	8dca3b04f8	Force weights_only (before fully breaking pickle files anyway). (#1710 )	2024-04-05 19:23:57 +02:00
xiaobin	4cce84301b	fit for baichuan models (#981 ) As more and more people begin to use Baichuan's open-source models, the influence of Baichuan models is growing, especially in China. Many community members are interested in adding support for Baichuan models to TGI. Meanwhile, Baichuan is a very open company, and in the future, it plans to open-source more and more models, taking all this into consideration, we would like to add support for the Baichuan model to TGI. To do this, we need to make some changes, which we hope can be merged into the main branch of TGI. In the future, we would be happy to help maintain support for Baichuan models in TGI. We sincerely hope that our pull request can be accepted. Thank you. By the way, the changes of this time mainly for supporting Baichuan-7B. --------- Co-authored-by: xiaoyuze <xiaoyuze@baichuan.com> Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>	2023-09-08 16:51:34 +02:00
OlivierDehaene	5b9de4a1d3	fix(server): blacklist local files (#609 ) Close #589 #602	2023-07-13 21:54:55 +02:00
Nicolas Patry	e943a294bc	fix(server): harden the weights choice to save on disk. (#561 ) - Look at `transformers` base class to check for `_key_to_ignore_on_load_missing` or `_tied_weights` which are the standard attributes to select the keys to NOT save on disk (since they are ignored) - Modified safetensors code (to be reflected in safetensors even if it's an internal function). - Will not work for trust_remote_code=True repos (like santacoder). Should help with : https://github.com/huggingface/text-generation-inference/issues/555 and : https://github.com/huggingface/text-generation-inference/pull/501 and https://github.com/huggingface/text-generation-inference/issues/556 and https://github.com/huggingface/text-generation-inference/issues/482#issuecomment-1623713593	2023-07-07 14:50:12 +02:00
Nicolas Patry	49b4b33e80	feat(server): Update convert logic. (#483 ) Should be more robust to shared tensors (ok when using `from_pretrained`). But forcing us to add new checks in our loading code (since the chosen key to keep might be different from `transformers`). --------- Co-authored-by: Ubuntu <ubuntu@ip-172-31-41-161.ec2.internal>	2023-06-23 12:40:46 +02:00
OlivierDehaene	ece7ffa40a	feat(server): improve flash attention import errors (#465 ) @lewtun, is this enough? Closes #458 Closes #456	2023-06-19 09:53:45 +02:00
OlivierDehaene	62f91f78ac	feat(server): support vectorized warpers in flash causal lm (#317 ) Co-authored-by: Joel Lamy-Poirier <joel.lamy-poirier@servicenow.com>	2023-05-26 12:30:27 +02:00
Nicolas Patry	b4aa87db58	fea(server): decrease convert RAM requirements (#286 )	2023-05-05 17:57:02 +02:00
Nicolas Patry	690fc31757	fix(server): fix convert (#284 )	2023-05-05 15:28:08 +02:00
Nicolas Patry	f08343d44d	fix(server): Removes the parallelism in file convertion (during download) (#275 )	2023-05-04 15:22:54 +02:00
OlivierDehaene	3fef90d50f	feat(clients): Python client (#103 )	2023-03-07 18:52:22 +01:00

11 Commits