fix incorrect output of Qwen2-7B-Instruct-GPTQ-Int4 and Qwen2-7B-Instruct-AWQ
ipex kernel provide func like add_bias, so no need add it outside
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add gptq and awq int4 support in intel platform
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* fix ci failure
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* set kv cache dtype
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* refine the code according to the review command
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Simplifying conditionals + reverting integration tests values.
* Unused import
* Fix redundant import.
* Revert change after rebase.
* Upgrading the tests (TP>1 fix changes to use different kernels.)
* Update server/text_generation_server/layers/gptq/__init__.py
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
Co-authored-by: Wang, Yi A <yi.a.wang@intel.com>