Hotfixing the link. (#2811)

This commit is contained in:
Nicolas Patry 2024-12-10 01:20:07 +05:30 committed by GitHub
parent 042791fbd5
commit a70dd2998b
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
1 changed files with 1 additions and 1 deletions

View File

@ -72,7 +72,7 @@ Long: `MODEL_ID=$MODEL_ID HOST=localhost:8000 k6 run load_tests/long.js`
### Results ### Results
![benchmarks_v3](https://github.com/huggingface/text-generation-inference/blob/main/assets/v3_benchmarks.png) ![benchmarks_v3](https://github.com/huggingface/text-generation-inference/blob/042791fbd5742b1644d42c493db6bec669df6537/assets/v3_benchmarks.png)
Our benchmarking results show significant performance gains, with a 13x speedup over vLLM with prefix caching, and up to 30x speedup without prefix caching. These results are consistent with our production data and demonstrate the effectiveness of our optimized LLM architecture. Our benchmarking results show significant performance gains, with a 13x speedup over vLLM with prefix caching, and up to 30x speedup without prefix caching. These results are consistent with our production data and demonstrate the effectiveness of our optimized LLM architecture.