hf_text-generation-inference/server/text_generation_server/models
xiaobin 4cce84301b
fit for baichuan models (#981)
As more and more people begin to use Baichuan's open-source models, their
influence is growing, especially in China. Many community members are
interested in adding support for Baichuan models to TGI. Meanwhile,
Baichuan is a very open company that plans to open-source more and more
models in the future. Taking all of this into consideration, we would
like to add support for Baichuan models to TGI. To do so, we need to
make some changes, which we hope can be merged into the main branch of
TGI. Going forward, we would be happy to help maintain support for
Baichuan models in TGI. We sincerely hope that our pull request can be
accepted. Thank you.

By the way, the changes in this pull request are mainly for supporting Baichuan-7B.

---------

Co-authored-by: xiaoyuze <xiaoyuze@baichuan.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-09-08 16:51:34 +02:00
custom_modeling fit for baichuan models (#981) 2023-09-08 16:51:34 +02:00
__init__.py fit for baichuan models (#981) 2023-09-08 16:51:34 +02:00
bloom.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
flash_causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
flash_llama.py fix: LlamaTokenizerFast to AutoTokenizer at flash_llama.py (#619) 2023-08-14 14:20:18 +02:00
flash_neox.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
flash_rw.py Fix Falcon weight mapping for H2O.ai checkpoints (#953) 2023-08-31 21:15:14 +02:00
flash_santacoder.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
galactica.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
gpt_neox.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
idefics.py Adding Idefics multi modal model. (#842) 2023-08-17 14:38:49 +02:00
idefics_causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
model.py Fix typing in `Model.generate_token` (#733) 2023-07-31 14:35:14 +02:00
mpt.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
opt.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
rw.py "Fix" for rw-1b. (#860) 2023-08-17 09:05:41 +02:00
santacoder.py Directly load GPTBigCode to specified device (#618) 2023-07-21 11:27:31 +02:00
seq2seq_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
t5.py fix(server): T5 weights names. (#582) 2023-07-12 10:01:42 +02:00
types.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00