Fixed README ToC (#2196)
Co-authored-by: Vinayak Kamath <Vinayak.Kamath@target.com>
parent fe710af25f
commit f5ba9bfd52
27
README.md
@@ -20,19 +20,20 @@ to power Hugging Chat, the Inference API and Inference Endpoint.

 ## Table of contents

-- [Get Started](#get-started)
-- [API Documentation](#api-documentation)
-- [Using a private or gated model](#using-a-private-or-gated-model)
-- [A note on Shared Memory](#a-note-on-shared-memory-shm)
-- [Distributed Tracing](#distributed-tracing)
-- [Local Install](#local-install)
-- [CUDA Kernels](#cuda-kernels)
-- [Optimized architectures](#optimized-architectures)
-- [Run Mistral](#run-a-model)
-- [Run](#run)
-- [Quantization](#quantization)
-- [Develop](#develop)
-- [Testing](#testing)
+- [Get Started](#get-started)
+- [Docker](#docker)
+- [API documentation](#api-documentation)
+- [Using a private or gated model](#using-a-private-or-gated-model)
+- [A note on Shared Memory (shm)](#a-note-on-shared-memory-shm)
+- [Distributed Tracing](#distributed-tracing)
+- [Architecture](#architecture)
+- [Local install](#local-install)
+- [Optimized architectures](#optimized-architectures)
+- [Run locally](#run-locally)
+- [Run](#run)
+- [Quantization](#quantization)
+- [Develop](#develop)
+- [Testing](#testing)

 Text Generation Inference (TGI) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and [more](https://huggingface.co/docs/text-generation-inference/supported_models). TGI implements many features, such as:
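The entries this commit corrects, such as `[Run Mistral](#run-a-model)`, paired a title with an anchor that no longer matched it. The sketch below shows one way such mismatches could be caught. It is not part of this commit or of TGI. The helper names `heading_to_anchor` and `stale_entries` are hypothetical, and the slug rule (lowercase, drop punctuation, turn spaces into hyphens) is a simplified approximation of how Markdown renderers derive heading anchors. It also assumes each ToC title repeats its heading text verbatim, so a shortened title is reported even when its link still resolves.

```python
import re


def heading_to_anchor(heading: str) -> str:
    """Approximate the anchor a Markdown renderer derives from a heading."""
    slug = heading.strip().lower()
    # Drop everything except word characters, spaces, and hyphens.
    slug = re.sub(r"[^\w\- ]", "", slug)
    # Each remaining space becomes a hyphen.
    return "#" + slug.replace(" ", "-")


def stale_entries(entries):
    """Return the (title, link) pairs whose link differs from the title's anchor."""
    return [(title, link) for title, link in entries
            if heading_to_anchor(title) != link]


# Three entries from the old table of contents in this diff.
old_toc = [
    ("Get Started", "#get-started"),
    ("A note on Shared Memory", "#a-note-on-shared-memory-shm"),
    ("Run Mistral", "#run-a-model"),
]

for title, link in stale_entries(old_toc):
    print(f"{title!r} links to {link}, expected {heading_to_anchor(title)}")
```

Run against the updated entries, the same check reports nothing, because each new title slugifies to exactly the anchor it links to.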