### Nginx

Make sure your proxies all have a long timeout:
```
proxy_read_timeout 300;
proxy_connect_timeout 300;
proxy_send_timeout 300;
```

The LLM middleware has a request timeout of 95 seconds, so the longer proxy timeout ensures Nginx does not cut off a request before the middleware itself times out.

### Model Preparation

Make sure your model's `tokenizer_config.json` has its maximum length (typically the `model_max_length` field) set to a value, such as `4096`, that is equal to or greater than your token limit.
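
As a quick sanity check, the comparison can be scripted. The sketch below is only an illustration: it assumes the limit is stored under the `model_max_length` key (the usual Hugging Face field, which this README does not name), and the helper name `check_model_max_length` is made up for this example.

```python
import json
from pathlib import Path


def check_model_max_length(model_dir: str, token_limit: int) -> bool:
    """Return True if tokenizer_config.json allows at least `token_limit` tokens."""
    config_path = Path(model_dir) / "tokenizer_config.json"
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)

    # Assumption: the maximum length lives under "model_max_length".
    max_length = config.get("model_max_length")
    if max_length is None:
        # Key is missing, so the limit cannot be confirmed.
        return False
    return max_length >= token_limit
```

For example, `check_model_max_length("/path/to/model", 4096)` should return `True` when the configured value is at least `4096`. The path here is a placeholder.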