Adding some docs.
parent bf700e7eef
commit cea291718e
@@ -52,6 +52,8 @@ Text Generation Inference (TGI) is a toolkit for deploying and serving Large Lan
 - Logits warper (temperature scaling, top-p, top-k, repetition penalty, more details see [transformers.LogitsProcessor](https://huggingface.co/docs/transformers/internal/generation_utils#transformers.LogitsProcessor))
 - Stop sequences
 - Log probabilities
+- [Speculation](https://huggingface.co/docs/text-generation-inference/conceptual/speculation) ~2x lower latency
+- [Guidance/JSON](https://huggingface.co/docs/text-generation-inference/conceptual/guidance). Specify output format to speed up inference and make sure the output is valid according to some specs.
 - Custom Prompt Generation: Easily generate text by providing custom prompts to guide the model's output
 - Fine-tuning Support: Utilize fine-tuned models for specific tasks to achieve higher accuracy and performance

@@ -37,4 +37,8 @@
     title: Safetensors
   - local: conceptual/flash_attention
     title: Flash Attention
+  - local: conceptual/speculation
+    title: Speculation (Medusa, ngram)
+  - local: conceptual/guidance
+    title: Guidance, JSON, tools (using outlines)
   title: Conceptual Guides
@@ -0,0 +1 @@
+## Guidance
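The new Guidance page is only a heading in this commit. As a rough illustration of what "specify output format" could look like from a client, here is a minimal sketch that builds a request body carrying a JSON Schema. The `grammar` parameter layout, the schema, the prompt, and both helper names are assumptions made for illustration and are not taken from this commit; no request is actually sent.

```python
import json

# Hypothetical JSON Schema describing the shape the model should emit.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
    },
    "required": ["name", "age"],
}


def build_guided_request(prompt: str, schema: dict, max_new_tokens: int = 64) -> dict:
    """Build a request body asking the server to constrain output to `schema`.

    The `grammar` field layout is an assumption, not confirmed by this commit.
    """
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "grammar": {"type": "json", "value": schema},
        },
    }


def has_required_keys(generated_text: str, schema: dict) -> bool:
    """Minimal client-side check: output parses as a JSON object with the required keys."""
    try:
        data = json.loads(generated_text)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and all(
        key in data for key in schema.get("required", [])
    )


payload = build_guided_request("Describe a person named Ada, aged 36.", schema)
body = json.dumps(payload)  # the string a client would POST to the server
```

The client-side check is deliberately shallow (it does not validate types); with server-side guidance the output should already conform, so the check only guards against truncated generations.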
@@ -0,0 +1 @@
+## Speculation
@@ -0,0 +1 @@
+## Speculation
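The Speculation pages are likewise empty in this commit. To give a feel for the "ngram" variant named in the table of contents, the sketch below shows one common way n-gram speculation can work: look up the most recent earlier occurrence of the last `n` tokens, propose the tokens that followed it as a draft, then keep only the draft prefix the model agrees with. The function names and the greedy acceptance rule are assumptions for illustration, not TGI's implementation.

```python
from typing import List


def ngram_propose(tokens: List[int], n: int = 2, k: int = 3) -> List[int]:
    """Propose up to k draft tokens by copying what followed the most
    recent earlier occurrence of the last n tokens."""
    if len(tokens) < n:
        return []
    tail = tokens[-n:]
    # Scan backwards, starting just before the tail so it cannot match itself.
    for start in range(len(tokens) - n - 1, -1, -1):
        if tokens[start:start + n] == tail:
            return tokens[start + n:start + n + k]
    return []


def accept_prefix(draft: List[int], verified: List[int]) -> List[int]:
    """Keep the longest draft prefix that matches the model's own choices
    (greedy acceptance; `verified` stands in for one verification pass)."""
    accepted = []
    for drafted, actual in zip(draft, verified):
        if drafted != actual:
            break
        accepted.append(drafted)
    return accepted


history = [1, 2, 3, 4, 1, 2]
draft = ngram_propose(history)             # [3, 4, 1]
accepted = accept_prefix(draft, [3, 4, 9])  # [3, 4]
```

Every accepted draft token saves one sequential decoding step, which is where the latency gain comes from; when the draft is rejected immediately, generation falls back to ordinary one-token-at-a-time decoding.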