Update README.md

Michael Feil 2023-08-03 23:23:02 +02:00 committed by GitHub
parent a9838bba2f
commit da9746586b
GPG Key ID: 4AEE18F83AFDEB23
1 changed file with 2 additions and 2 deletions


@@ -15,8 +15,8 @@ We at Preemo are currently busy working on our first release of our other produc
 Our long-term goal is to grow the community around this repository, as a playground for trying out new ideas and optimizations in LLM inference. We at Preemo will implement features that interest us, but we also welcome contributions from the community, as long as they are modularized and composable.
-## Extra features in comparison to Hugging Face `text-generation-inference`
+## Extra features in comparison to Hugging Face `text-generation-inference` v0.9.4
 ### 4bit quantization
-4bit quantization is available using the [NF4 and FP4 data types from bitsandbytes](https://arxiv.org/pdf/2305.14314.pdf). It can be enabled by providing `--quantize bitsandbytes-nf4` or `--quantize bitsandbytes-fp4` as a command line argument to `text-generation-launcher`.
+4bit quantization is available using the [NF4 and FP4 data types from bitsandbytes](https://arxiv.org/pdf/2305.14314.pdf). It can be enabled by providing `--quantize bitsandbytes-nf4` or `--quantize bitsandbytes-fp4` as a command line argument to `text-generation-launcher`.
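
For illustration only, and not part of the commit: a minimal sketch of how the `--quantize` flag described in the changed paragraph might be passed to `text-generation-launcher` from a script. The helper name `build_launcher_args` and the model id `my-org/my-model` are hypothetical placeholders, and the sketch assumes the launcher's `--model-id` flag is used to select the model. The code only builds and prints the command line; it does not start the server.

```python
import shlex
from typing import List

# The two 4bit quantization values named in the README paragraph.
SUPPORTED_4BIT_MODES = ("bitsandbytes-nf4", "bitsandbytes-fp4")


def build_launcher_args(model_id: str, quantize: str) -> List[str]:
    """Build an argv list for text-generation-launcher (hypothetical helper)."""
    if quantize not in SUPPORTED_4BIT_MODES:
        raise ValueError(
            f"expected one of {SUPPORTED_4BIT_MODES}, got {quantize!r}"
        )
    # Assumption: the model is selected with --model-id.
    return [
        "text-generation-launcher",
        "--model-id", model_id,
        "--quantize", quantize,
    ]


if __name__ == "__main__":
    # "my-org/my-model" is a placeholder, not a real model id.
    args = build_launcher_args("my-org/my-model", "bitsandbytes-nf4")
    print(shlex.join(args))
```

The resulting list could be handed to `subprocess.run(args)` on a machine where the launcher is installed; swapping in `bitsandbytes-fp4` selects the FP4 data type instead.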