README.md

Nginx

  1. Make sure all of your proxies have a long timeout:

         proxy_read_timeout 300;
         proxy_connect_timeout 300;
         proxy_send_timeout 300;

The LLM middleware has a request timeout of 120 seconds, so the longer 300-second proxy timeouts ensure Nginx never cuts off a request before the middleware does.
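
Putting the directives above together, a minimal proxy block might look like the following sketch. The hostname and upstream address/port are assumptions, not part of this repo; adjust them for your deployment:

```nginx
server {
    listen 80;
    server_name llm.example.com;  # hypothetical hostname

    location / {
        # Longer than the middleware's 120-second request timeout
        proxy_read_timeout 300;
        proxy_connect_timeout 300;
        proxy_send_timeout 300;

        # Avoid buffering so streamed responses reach clients immediately
        proxy_buffering off;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://127.0.0.1:5000;  # assumed middleware address and port
    }
}
```

Run `nginx -t` after editing to validate the configuration before reloading.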