Minor docs style fixes (#806)

Omar Sanseviero 2023-08-10 14:32:51 +02:00 committed by GitHub
parent 04f7c2d86b
commit 7dbaef3f5b
3 changed files with 12 additions and 12 deletions


@@ -6,7 +6,7 @@ There are many ways you can consume Text Generation Inference server in your app
 After the launch, you can query the model using either the `/generate` or `/generate_stream` routes:
-```shell
+```bash
 curl 127.0.0.1:8080/generate \
     -X POST \
     -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
@@ -20,14 +20,13 @@ curl 127.0.0.1:8080/generate \
 You can simply install `huggingface-hub` package with pip.
-```python
+```bash
 pip install huggingface-hub
 ```
 Once you start the TGI server, instantiate `InferenceClient()` with the URL to the endpoint serving the model. You can then call `text_generation()` to hit the endpoint through Python.
 ```python
 from huggingface_hub import InferenceClient
 client = InferenceClient(model=URL_TO_ENDPOINT_SERVING_TGI)
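Note: the hunk ends before the `text_generation()` call the prose refers to. A minimal sketch of that call, assuming a locally running TGI server and the illustrative prompt used above:

```python
from huggingface_hub import InferenceClient

# Assumed endpoint URL for a local TGI server; replace with your own endpoint.
client = InferenceClient(model="http://127.0.0.1:8080")
# Hit the endpoint through Python; prompt and max_new_tokens are illustrative.
output = client.text_generation("What is Deep Learning?", max_new_tokens=20)
print(output)
```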


@@ -16,7 +16,7 @@ Text Generation Inference is available on pypi, conda and GitHub.
 To install and launch locally, first [install Rust](https://rustup.rs/) and create a Python virtual environment with at least
 Python 3.9, e.g. using conda:
-```shell
+```bash
 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
 conda create -n text-generation-inference python=3.9
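Note: the hunk stops at environment creation; activating the environment before building is the assumed next step (the activation line falls outside this hunk):

```bash
# Assumed follow-up to the snippet above: activate the freshly created environment.
conda activate text-generation-inference
```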
@@ -27,7 +27,7 @@ You may also need to install Protoc.
 On Linux:
-```shell
+```bash
 PROTOC_ZIP=protoc-21.12-linux-x86_64.zip
 curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v21.12/$PROTOC_ZIP
 sudo unzip -o $PROTOC_ZIP -d /usr/local bin/protoc
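Note: the next hunk's header shows this snippet ends with `rm -f $PROTOC_ZIP`; the lines in between are not visible in this diff. A sketch of how a protoc install of this kind is usually completed, where the header extraction step is an assumption:

```bash
# Assumed remainder of the Linux protoc install: extract the bundled headers, then remove the archive.
sudo unzip -o $PROTOC_ZIP -d /usr/local 'include/*'
rm -f $PROTOC_ZIP
```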
@@ -37,13 +37,13 @@ rm -f $PROTOC_ZIP
 On MacOS, using Homebrew:
-```shell
+```bash
 brew install protobuf
 ```
 Then run to install Text Generation Inference:
-```shell
+```bash
 BUILD_EXTENSIONS=True make install # Install repository and HF/transformer fork with CUDA kernels
 ```
@@ -51,7 +51,7 @@ BUILD_EXTENSIONS=True make install # Install repository and HF/transformer fork
 On some machines, you may also need the OpenSSL libraries and gcc. On Linux machines, run:
-```shell
+```bash
 sudo apt-get install libssl-dev gcc -y
 ```
@@ -59,13 +59,14 @@ sudo apt-get install libssl-dev gcc -y
 Once installation is done, simply run:
-```shell
+```bash
 make run-falcon-7b-instruct
 ```
 This will serve Falcon 7B Instruct model from the port 8080, which we can query.
 To see all options to serve your models, check in the [codebase](https://github.com/huggingface/text-generation-inference/blob/main/launcher/src/main.rs) or the CLI:
-```
+```bash
 text-generation-launcher --help
 ```
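Note: beyond `--help`, the launcher can also be pointed at a model directly. The `--model-id` and `--port` flags exist in the TGI launcher, though the model ID and port below are illustrative:

```bash
# Illustrative direct launch; run `text-generation-launcher --help` for the full flag list.
text-generation-launcher --model-id tiiuae/falcon-7b-instruct --port 8080
```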


@@ -19,7 +19,7 @@ To use GPUs, you need to install the [NVIDIA Container Toolkit](https://docs.nvi
 Once TGI is running, you can use the `generate` endpoint by doing requests. To learn more about how to query the endpoints, check the [Consuming TGI](./basic_tutorials/consuming_tgi) section.
-```shell
+```bash
 curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' -H 'Content-Type: application/json'
 ```
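Note: the docker command that starts TGI sits above this hunk and is not shown. A sketch of what that launch typically looks like, where the model ID and volume path are illustrative:

```bash
# Illustrative TGI launch via Docker; adjust the model ID and volume to your setup.
model=tiiuae/falcon-7b-instruct
volume=$PWD/data  # share a volume to avoid re-downloading weights on every restart
docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:1.0.0 --model-id $model
```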
@@ -27,7 +27,7 @@ curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","par
 To see all possible flags and options, you can use the `--help` flag. It's possible to configure the number of shards, quantization, generation parameters, and more.
-```shell
+```bash
 docker run ghcr.io/huggingface/text-generation-inference:1.0.0 --help
 ```