90a1d04a2f
This change adds support for MoE models that use GPTQ quantization. Currently only models with the following properties are supported: - No `desc_act` with tensor parallelism, unless `group_size=-1`. - No asymmetric quantization. - No AWQ. |
||
---|---|---|
images | ||
models | ||
conftest.py | ||
poetry.lock | ||
pyproject.toml | ||
pytest.ini | ||
requirements.txt |
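
The constraints listed in the commit message can be read as a simple predicate over a model's quantization settings. The sketch below is only an illustration of that reading: `QuantConfig`, its field names (`method`, `sym`), and `is_supported_moe_quant` are hypothetical and are not taken from the repository; only `desc_act` and `group_size` appear in the commit message. It also assumes that "tensor parallelism" means more than one shard.

```python
from dataclasses import dataclass


@dataclass
class QuantConfig:
    """Hypothetical container for the settings named in the commit message."""
    method: str      # assumed values: "gptq" or "awq"
    desc_act: bool   # activation-order flag from the commit message
    group_size: int  # -1 is the value the commit message exempts
    sym: bool        # assumed flag: True for symmetric quantization


def is_supported_moe_quant(cfg: QuantConfig, tensor_parallel_size: int) -> bool:
    """Return True if cfg satisfies the constraints stated in the commit message."""
    # "No AWQ": this sketch only accepts GPTQ.
    if cfg.method != "gptq":
        return False
    # "No asymmetric quantization."
    if not cfg.sym:
        return False
    # "No desc_act with tensor parallelism, unless group_size=-1."
    if cfg.desc_act and tensor_parallel_size > 1 and cfg.group_size != -1:
        return False
    return True


if __name__ == "__main__":
    cfg = QuantConfig(method="gptq", desc_act=True, group_size=128, sym=True)
    # desc_act is fine on one shard but rejected once sharded with group_size != -1.
    print(is_supported_moe_quant(cfg, tensor_parallel_size=1))
    print(is_supported_moe_quant(cfg, tensor_parallel_size=2))
```

The real project may perform these checks elsewhere or raise errors instead of returning a boolean; the point here is only to make the three restrictions concrete.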