hf_text-generation-inference/server/text_generation_server/models
xiaobin 4cce84301b
fit for baichuan models (#981)
As more and more people begin to use Baichuan's open-source models, their
influence is growing, especially in China. Many community members are
interested in adding support for Baichuan models to TGI. Meanwhile,
Baichuan is a very open company that plans to open-source more and more
models in the future. Taking all of this into consideration, we would
like to add support for Baichuan models to TGI. To do so, we need to
make some changes, which we hope can be merged into the main branch of
TGI. Going forward, we would be happy to help maintain support for
Baichuan models in TGI. We sincerely hope that our pull request can be
accepted. Thank you.

By the way, the changes in this pull request are mainly for supporting Baichuan-7B.

---------

Co-authored-by: xiaoyuze <xiaoyuze@baichuan.com>
Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
2023-09-08 16:51:34 +02:00
custom_modeling fit for baichuan models (#981) 2023-09-08 16:51:34 +02:00
__init__.py fit for baichuan models (#981) 2023-09-08 16:51:34 +02:00
bloom.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
flash_causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
flash_llama.py fix: LlamaTokenizerFast to AutoTokenizer at flash_llama.py (#619) 2023-08-14 14:20:18 +02:00
flash_neox.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
flash_rw.py Fix Falcon weight mapping for H2O.ai checkpoints (#953) 2023-08-31 21:15:14 +02:00
flash_santacoder.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
galactica.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
gpt_neox.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
idefics.py Adding Idefics multi modal model. (#842) 2023-08-17 14:38:49 +02:00
idefics_causal_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
model.py Fix typing in `Model.generate_token` (#733) 2023-07-31 14:35:14 +02:00
mpt.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
opt.py feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) 2023-07-25 13:00:27 +02:00
rw.py "Fix" for rw-1b. (#860) 2023-08-17 09:05:41 +02:00
santacoder.py Directly load GPTBigCode to specified device (#618) 2023-07-21 11:27:31 +02:00
seq2seq_lm.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00
t5.py fix(server): T5 weights names. (#582) 2023-07-12 10:01:42 +02:00
types.py Rebased #617 (#868) 2023-08-28 11:43:47 +02:00