153ff3740b

* add gptq and awq int4 support in intel platform
* fix ci failure
* set kv cache dtype
* refine the code according to the review command
* Simplifying conditionals + reverting integration tests values.
* Unused import
* Fix redundant import.
* Revert change after rebase.
* Upgrading the tests (TP>1 fix changes to use different kernels.)
* Update server/text_generation_server/layers/gptq/__init__.py

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Wang, Yi A <yi.a.wang@intel.com>

Name | ||
---|---|---|
__init__.py | ||
cuda.py | ||
custom_autotune.py | ||
exllama.py | ||
exllamav2.py | ||
ipex.py | ||
quantize.py | ||
utils.py |