Add support for GPTQ Marlin kernels
GPTQ Marlin extends the Marlin kernels to support common GPTQ
configurations:
- bits: 4 or 8
- groupsize: -1, 32, 64, or 128
- desc_act: true/false
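As a rough illustration of the list above, the check for whether a
checkpoint can use these kernels might look like the sketch below. The
names `GPTQConfig` and `can_use_gptq_marlin` are hypothetical and not
taken from this change; only the accepted values come from the list.

```python
from dataclasses import dataclass

# Accepted values, copied from the list in this message.
SUPPORTED_BITS = (4, 8)
SUPPORTED_GROUPSIZES = (-1, 32, 64, 128)


@dataclass(frozen=True)
class GPTQConfig:
    """Hypothetical container for the three GPTQ settings named above."""

    bits: int
    groupsize: int
    desc_act: bool


def can_use_gptq_marlin(config: GPTQConfig) -> bool:
    """Return True if the configuration falls within the supported set.

    desc_act is not checked because both true and false are supported.
    """
    return (
        config.bits in SUPPORTED_BITS
        and config.groupsize in SUPPORTED_GROUPSIZES
    )
```

For example, `can_use_gptq_marlin(GPTQConfig(bits=4, groupsize=128,
desc_act=True))` is true, while a 3-bit configuration is rejected.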
Using the GPTQ Marlin kernels requires repacking the parameters into the
Marlin quantizer format.
The kernels were contributed by Neural Magic to vLLM. We vendor them
here for convenience.