hf_text-generation-inference/server/text_generation_server/models

Latest commit: c9bdaa8b73 by OlivierDehaene, "feat(server): reduce mlp and attn in one op for flash neox (#145)", 2023-03-28 16:51:41 +02:00
| File | Last commit | Date |
| --- | --- | --- |
| __init__.py | fix(server): Avoid using try/except to determine kind of AutoModel (#142) | 2023-03-27 09:23:22 +02:00 |
| bloom.py | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00 |
| causal_lm.py | feat(server): flash neoX (#133) | 2023-03-24 14:02:14 +01:00 |
| flash_neox.py | feat(server): cleanup flash neox loading (#139) | 2023-03-26 16:37:21 +02:00 |
| flash_neox_modeling.py | feat(server): reduce mlp and attn in one op for flash neox (#145) | 2023-03-28 16:51:41 +02:00 |
| galactica.py | fix(server): add position ids to neox (#126) | 2023-03-15 13:12:49 +01:00 |
| gpt_neox.py | fix(server): add position ids to neox (#126) | 2023-03-15 13:12:49 +01:00 |
| model.py | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00 |
| santacoder.py | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00 |
| seq2seq_lm.py | fix(server): use server tokenizer as gt (#128) | 2023-03-16 12:12:26 +01:00 |
| t5.py | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00 |
| types.py | feat(clients): Python client (#103) | 2023-03-07 18:52:22 +01:00 |