doc: clarify that `--quantize` is not needed for pre-quantized models (#2536)

Daniël de Kok 2024-09-19 22:17:15 +02:00 committed by GitHub
parent c103760172
commit abd24dd385
3 changed files with 9 additions and 2 deletions

@@ -55,7 +55,9 @@ Options:
 ## QUANTIZE
 ```shell
 --quantize <QUANTIZE>
-    Whether you want the model to be quantized
+    Quantization method to use for the model. It is not necessary to specify this option for pre-quantized models, since the quantization method is read from the model configuration.
+
+    Marlin kernels will be used automatically for GPTQ/AWQ models.
 
     [env: QUANTIZE=]
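
The help text above describes behaviour rather than syntax, so a short usage sketch may help. The commands below are illustrative only: the model ids are placeholders, not real repositories, and the accepted `--quantize` values depend on the TGI build. The pattern follows the updated description: omit the flag for checkpoints that already ship quantization metadata, and pass it only to quantize an unquantized model on the fly.

```shell
# Pre-quantized checkpoint (e.g. a GPTQ or AWQ export): no --quantize needed.
# The quantization method is read from the model configuration, and Marlin
# kernels are selected automatically where supported.
text-generation-launcher --model-id some-org/some-model-GPTQ

# Unquantized checkpoint that should be quantized on the fly: name the method
# explicitly (it can also be supplied through the QUANTIZE environment variable).
text-generation-launcher --model-id some-org/some-model --quantize bitsandbytes
```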

@@ -157,6 +157,7 @@
 pyright
 pytest
 pytest-asyncio
+redocly
 ruff
 syrupy
 ]);

@@ -367,7 +367,11 @@ struct Args {
     #[clap(long, env)]
     num_shard: Option<usize>,
-    /// Whether you want the model to be quantized.
+    /// Quantization method to use for the model. It is not necessary to specify this option
+    /// for pre-quantized models, since the quantization method is read from the model
+    /// configuration.
+    ///
+    /// Marlin kernels will be used automatically for GPTQ/AWQ models.
     #[clap(long, env, value_enum)]
     quantize: Option<Quantization>,
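
"Read from the model configuration" in the new doc comment refers to the quantization metadata that pre-quantized checkpoints ship in their `config.json`. As a rough way to see what the launcher will pick up, the check below assumes the Hugging Face `quantization_config` convention with a `quant_method` field; treat the exact field names as an assumption for any given checkpoint.

```shell
# Inspect a local checkpoint's config.json for quantization metadata.
# Pre-quantized GPTQ/AWQ exports typically carry a `quantization_config`
# block whose `quant_method` names the scheme; if it is present, the
# launcher does not need an explicit --quantize flag.
jq '.quantization_config.quant_method // "not pre-quantized"' config.json
```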