hf_text-generation-inference

Daniël de Kok 53aec27328 server quantize: store quantizer config in standard format (#2299): create `quantization_config` option in the model config; don't store the quantizer config in tensors anymore.	2024-07-30 15:16:20 +02:00
attention	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
awq	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
gptq	server quantize: store quantizer config in standard format (#2299)	2024-07-30 15:16:20 +02:00
marlin	Install Marlin from standalone package (#2320)	2024-07-29 15:37:10 +02:00
__init__.py	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
bnb.py	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
conv.py	Refactor layers. (#1866)	2024-05-13 12:44:30 +02:00
eetq.py	feat(fp8): use fbgemm kernels and load fp8 weights directly (#2248)	2024-07-20 19:02:04 +02:00
exl2.py	Add support for Deepseek V2 (#2224)	2024-07-19 17:23:20 +02:00
fp8.py	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
layernorm.py	Removing IPEX_AVAIL. (#2115)	2024-06-25 13:20:57 +02:00
linear.py	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
lora.py	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
medusa.py	fix: use path inside of speculator config (#1935)	2024-05-22 20:46:29 +02:00
mlp.py	MLPSpeculator. (#1865)	2024-05-14 12:33:18 +02:00
rotary.py	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
speculative.py	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00
tensor_parallel.py	feat: add ruff and resolve issue (#2262)	2024-07-26 10:29:09 -04:00