
# Consuming Text Generation Inference

## ChatUI

ChatUI is an open-source interface built for large language model serving. It offers many customization options, including web search via the SERP API. ChatUI can automatically consume the Text Generation Inference server and even provides the option to switch between different TGI endpoints. You can try it out at Hugging Chat, or use ChatUI Docker Spaces to deploy your own Hugging Chat to Spaces.

To serve both ChatUI and TGI in the same environment, simply add your own endpoints to the `MODELS` variable in the `.env.local` file inside the `chat-ui` repository. Provide the endpoints pointing to where TGI is served.

```
{
// rest of the model config here
"endpoints": [{"url": "https://HOST:PORT/generate_stream"}]
}
```
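Before wiring an endpoint into ChatUI, it can help to check what the `generate_stream` route returns. TGI streams Server-Sent Events where each `data:` line carries a JSON payload containing the newly generated token. The sketch below is a minimal helper for building a request body and parsing one such event line; the `inputs`/`parameters` request shape and the `token.text` field follow TGI's streaming API, and the prompt text is just a placeholder.

```python
import json

# Example request body for TGI's /generate_stream route; the prompt is a
# placeholder, and "parameters" accepts the usual generation options.
payload = {
    "inputs": "What is Deep Learning?",
    "parameters": {"max_new_tokens": 20},
}


def parse_sse_line(line: str):
    """Extract the generated token text from one SSE line, or None.

    TGI emits lines of the form `data: {...}`; comment/keep-alive lines
    (or anything else) are ignored.
    """
    if not line.startswith("data:"):
        return None
    event = json.loads(line[len("data:"):].strip())
    # Each streamed event carries the new token under token.text.
    return event.get("token", {}).get("text")
```

You could feed lines from a streaming HTTP response (e.g. `requests.post(url, json=payload, stream=True)`) through `parse_sse_line` to reconstruct the generated text token by token.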