Update README.md
parent a9838bba2f
commit da9746586b
@@ -15,8 +15,8 @@ We at Preemo are currently busy working on our first release of our other produc
 Our long-term goal is to grow the community around this repository, as a playground for trying out new ideas and optimizations in LLM inference. We at Preemo will implement features that interest us, but we also welcome contributions from the community, as long as they are modularized and composable.
 
-## Extra features in comparison to Hugging Face `text-generation-inference`
+## Extra features in comparison to Hugging Face `text-generation-inference` v0.9.4
 
 ### 4bit quantization
 
 4bit quantization is available using the [NF4 and FP4 data types from bitsandbytes](https://arxiv.org/pdf/2305.14314.pdf). It can be enabled by providing `--quantize bitsandbytes-nf4` or `--quantize bitsandbytes-fp4` as a command line argument to `text-generation-launcher`.
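For context, a minimal sketch of how the flag described in this hunk is used when launching the server is shown below. The model id and port are illustrative placeholders, not values taken from this commit.

```sh
# Minimal sketch: start text-generation-launcher with 4bit NF4 quantization enabled.
# "my-org/my-model" and the port are placeholders, not part of this commit;
# swap in --quantize bitsandbytes-fp4 to use the FP4 data type instead.
text-generation-launcher \
    --model-id my-org/my-model \
    --quantize bitsandbytes-nf4 \
    --port 8080
```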