hf_text-generation-inference/server/text_generation_server/models/custom_modeling
zspo bd3088748e
add FastLinear import (#750)
# What does this PR do?

Fixes #749 
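
Roughly, the fix adds `FastLinear` to the shared layers import in `opt_modeling.py`, which referenced the class without importing it. A minimal sketch of the kind of change (the import path and sibling names are assumed from the other `*_modeling.py` files, not copied from this diff):

```python
# opt_modeling.py -- sketch of the import fix (assumed layout, not the exact diff).
# FastLinear is added alongside the tensor-parallel layers the file already
# pulls in; a missing import like this surfaces as a NameError at model load.
from text_generation_server.utils.layers import (
    FastLinear,  # newly added
    TensorParallelColumnLinear,
    TensorParallelEmbedding,
    TensorParallelRowLinear,
)
```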

## Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.


Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>
2023-08-02 20:04:46 +02:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | | |
| `bloom_modeling.py` | feat: better errors for warmup and TP (#575) | 2023-07-10 14:47:15 +02:00 |
| `flash_llama_modeling.py` | Adding Rope scaling. (#741) | 2023-07-31 15:38:47 +02:00 |
| `flash_neox_modeling.py` | Adding Rope scaling. (#741) | 2023-07-31 15:38:47 +02:00 |
| `flash_rw_modeling.py` | Adding Rope scaling. (#741) | 2023-07-31 15:38:47 +02:00 |
| `flash_santacoder_modeling.py` | feat(server): Using `quantize_config.json` instead of GPTQ_BITS env variables. (#671) | 2023-07-25 13:00:27 +02:00 |
| `mpt_modeling.py` | chore: fix typo in mpt_modeling.py (#737) | 2023-07-31 15:43:44 +02:00 |
| `neox_modeling.py` | feat: better errors for warmup and TP (#575) | 2023-07-10 14:47:15 +02:00 |
| `opt_modeling.py` | add FastLinear import (#750) | 2023-08-02 20:04:46 +02:00 |
| `t5_modeling.py` | fix(server): Adding logger import to t5_modeling.py (#585) | 2023-07-12 10:40:32 +02:00 |