Adding some docs.
commit cea291718e (parent bf700e7eef)
@@ -52,6 +52,8 @@ Text Generation Inference (TGI) is a toolkit for deploying and serving Large Lan
 - Logits warper (temperature scaling, top-p, top-k, repetition penalty, more details see [transformers.LogitsProcessor](https://huggingface.co/docs/transformers/internal/generation_utils#transformers.LogitsProcessor))
 - Stop sequences
 - Log probabilities
+- [Speculation](https://huggingface.co/docs/text-generation-inference/conceptual/speculation) ~2x latency
+- [Guidance/JSON](https://huggingface.co/docs/text-generation-inference/conceptual/guidance). Specify output format to speed up inference and make sure the output is valid according to some specs.
 - Custom Prompt Generation: Easily generate text by providing custom prompts to guide the model's output
 - Fine-tuning Support: Utilize fine-tuned models for specific tasks to achieve higher accuracy and performance
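The bullets above cover logits warpers, stop sequences, and log probabilities. As a rough sketch of how these surface in a request, the snippet below posts to the `/generate` route of a TGI server assumed to be running at `http://localhost:8080`; the prompt and parameter values are placeholders, and the parameter names follow TGI's generate API.

```python
# Hedged sketch: exercising logits warpers, stop sequences, and log
# probabilities against an already-running TGI server (URL is assumed).
import requests

TGI_URL = "http://localhost:8080"  # placeholder for your deployment

payload = {
    "inputs": "Deep learning is",
    "parameters": {
        "max_new_tokens": 32,
        "do_sample": True,
        # Logits warpers
        "temperature": 0.7,
        "top_p": 0.95,
        "top_k": 50,
        "repetition_penalty": 1.1,
        # Stop sequences
        "stop": ["\n\n"],
        # Per-token details in the response include log probabilities
        "details": True,
    },
}

resp = requests.post(f"{TGI_URL}/generate", json=payload, timeout=60)
resp.raise_for_status()
data = resp.json()

print(data["generated_text"])
for token in data["details"]["tokens"]:
    print(token["text"], token["logprob"])
```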
@@ -37,4 +37,8 @@
     title: Safetensors
   - local: conceptual/flash_attention
     title: Flash Attention
+  - local: conceptual/speculation
+    title: Speculation (Medusa, ngram)
+  - local: conceptual/guidance
+    title: Guidance, JSON, tools (using outlines)
   title: Conceptual Guides
@@ -0,0 +1 @@
+## Guidance
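The new guidance page is only a heading so far. As a hedged preview of the Guidance/JSON feature it will document, the sketch below asks a TGI server to constrain its output to a JSON schema. The `grammar` parameter shape is an assumption based on the published guidance guide rather than anything in this commit, and the URL and schema are placeholders.

```python
# Hedged sketch of guided JSON generation. The "grammar" parameter shape is
# an assumption; check the published guidance guide for the exact format.
import requests

TGI_URL = "http://localhost:8080"  # placeholder for your deployment

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}

payload = {
    "inputs": "Extract the person: 'Ada Lovelace, 36 years old.'",
    "parameters": {
        "max_new_tokens": 64,
        # Constrain decoding so the output is valid against the schema.
        "grammar": {"type": "json", "value": schema},
    },
}

resp = requests.post(f"{TGI_URL}/generate", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["generated_text"])
```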
@@ -0,0 +1 @@
+## Speculation
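The new speculation page is likewise just a heading. The toy sketch below illustrates the idea behind n-gram speculation (one of the schemes named in the table-of-contents entry, alongside Medusa): cheap draft tokens are proposed by matching the most recent n-gram earlier in the sequence, then verified against the model in a single step. The "model" here is a toy deterministic function and every name is illustrative; this is not TGI's implementation.

```python
# Toy illustration of n-gram speculation (prompt-lookup style drafting).
# `toy_next_token` stands in for a real language model's greedy next-token
# choice; in TGI the drafts would be verified by the actual model.

def ngram_draft(tokens, n=2, k=3):
    """Propose up to k draft tokens by matching the last n-gram earlier in the sequence."""
    if len(tokens) < n:
        return []
    tail = tokens[-n:]
    for i in range(len(tokens) - n - 1, -1, -1):
        if tokens[i:i + n] == tail:
            return tokens[i + n:i + n + k]
    return []

def toy_next_token(tokens):
    """Deterministic stand-in for a greedy LM forward pass."""
    return (tokens[-1] * 7 + 3) % 20

def speculative_step(tokens):
    """Accept the longest draft prefix the model agrees with, plus one model token."""
    context = list(tokens)
    accepted = []
    for draft in ngram_draft(context):
        if draft != toy_next_token(context):
            break  # first disagreement: discard the rest of the draft
        accepted.append(draft)
        context.append(draft)
    accepted.append(toy_next_token(context))  # always emit one "real" token
    return accepted

# A repetitive sequence lets the n-gram draft get accepted, so a single
# verification step yields several tokens instead of one.
sequence = [5, 18, 9, 6, 5, 18]
print(speculative_step(sequence))  # -> [9, 6, 5, 18]
```

When drafts are frequently accepted, several tokens are produced per model pass, which is where the rough "~2x latency" figure in the README feature list comes from.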