README.md

Nginx

  1. Make sure all of your proxies have a long timeout:

         proxy_read_timeout 300;
         proxy_connect_timeout 300;
         proxy_send_timeout 300;

The LLM middleware has a request timeout of 120 seconds, so the longer 300-second proxy timeouts ensure Nginx never cuts off a request before the middleware does.
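
Putting the directives above together, a minimal proxy block might look like the following sketch. The hostname and upstream address/port are assumptions, not part of this repo; adjust them for your deployment:

```nginx
server {
    listen 80;
    server_name llm.example.com;  # hypothetical hostname

    location / {
        # Longer than the middleware's 120-second request timeout
        proxy_read_timeout 300;
        proxy_connect_timeout 300;
        proxy_send_timeout 300;

        # Avoid buffering so streamed responses reach clients immediately
        proxy_buffering off;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://127.0.0.1:5000;  # assumed middleware address and port
    }
}
```

Run `nginx -t` after editing to validate the configuration before reloading.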