diff --git a/docs/source/basic_tutorials/consuming_tgi.md b/docs/source/basic_tutorials/consuming_tgi.md
index 9cdc9b14..2439930f 100644
--- a/docs/source/basic_tutorials/consuming_tgi.md
+++ b/docs/source/basic_tutorials/consuming_tgi.md
@@ -11,4 +11,16 @@ To serve both ChatUI and TGI in same environment, simply add your own endpoints
 // rest of the model config here
 "endpoints": [{"url": "https://HOST:PORT/generate_stream"}]
 }
-```
\ No newline at end of file
+```
+
+## Inference Client
+
+`huggingface-hub` is a Python library for interacting with and managing repositories and endpoints on the Hugging Face Hub. `InferenceClient` is a class that lets users interact with models hosted on the Hugging Face Hub, as well as with models served by any TGI endpoint. Once you start the TGI server, simply instantiate `InferenceClient()` with the URL of the endpoint serving the model. You can then call `text_generation()` to hit the endpoint through Python.
+
+```python
+from huggingface_hub import InferenceClient
+client = InferenceClient(model=URL_TO_ENDPOINT_SERVING_TGI)
+client.text_generation(prompt="Write code for a snake game")
+```
+
+You can check out the details of the `text_generation()` function [here](https://huggingface.co/docs/huggingface_hub/main/en/package_reference/inference_client#huggingface_hub.InferenceClient.text_generation).
\ No newline at end of file
diff --git a/docs/source/basic_tutorials/preparing_model.md b/docs/source/basic_tutorials/preparing_model.md
new file mode 100644
index 00000000..e69de29b
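
Under the hood, `text_generation()` issues an HTTP POST to TGI's `/generate` (or `/generate_stream`) route with a JSON body. A minimal sketch of building such a body without the client library — the helper name `build_generate_payload` is hypothetical, but the `inputs`/`parameters` field names follow TGI's REST API:

```python
import json

def build_generate_payload(prompt, max_new_tokens=64, temperature=None):
    # Shape of the JSON body expected by TGI's /generate and /generate_stream routes:
    # {"inputs": "<prompt>", "parameters": {...generation options...}}
    parameters = {"max_new_tokens": max_new_tokens}
    if temperature is not None:
        parameters["temperature"] = temperature
    return {"inputs": prompt, "parameters": parameters}

payload = build_generate_payload("Write code for a snake game", max_new_tokens=128)
# Send `body` as the POST body with the Content-Type: application/json header.
body = json.dumps(payload)
```

Any HTTP client (e.g. `curl` or `urllib.request`) can then POST this body to `https://HOST:PORT/generate`; `InferenceClient` simply wraps this request/response cycle for you.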