Adding some docs.

Nicolas Patry 2024-02-27 15:38:02 +01:00
parent bf700e7eef
commit cea291718e
5 changed files with 9 additions and 0 deletions

@@ -52,6 +52,8 @@ Text Generation Inference (TGI) is a toolkit for deploying and serving Large Lan
- Logits warper (temperature scaling, top-p, top-k, repetition penalty, more details see [transformers.LogitsProcessor](https://huggingface.co/docs/transformers/internal/generation_utils#transformers.LogitsProcessor))
- Stop sequences
- Log probabilities
- [Speculation](https://huggingface.co/docs/text-generation-inference/conceptual/speculation): roughly 2x lower latency
- [Guidance/JSON](https://huggingface.co/docs/text-generation-inference/conceptual/guidance): specify an output format to speed up inference and ensure the output is valid according to a given spec.
- Custom Prompt Generation: Easily generate text by providing custom prompts to guide the model's output
- Fine-tuning Support: Utilize fine-tuned models for specific tasks to achieve higher accuracy and performance

@@ -37,4 +37,8 @@
    title: Safetensors
  - local: conceptual/flash_attention
    title: Flash Attention
  - local: conceptual/speculation
    title: Speculation (Medusa, ngram)
  - local: conceptual/guidance
    title: Guidance, JSON, tools (using outlines)
  title: Conceptual Guides

@@ -0,0 +1 @@
## Guidance

@@ -0,0 +1 @@
## Speculation

@@ -0,0 +1 @@
## Speculation