
Nginx

Make sure your proxies all have a long timeout:

proxy_read_timeout 300;
proxy_connect_timeout 300;
proxy_send_timeout 300;

The LLM middleware has a request timeout of 95 seconds, so these longer proxy timeouts keep Nginx from closing the connection before the middleware responds.
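
For context, here is a minimal sketch of how these directives might sit inside a proxy block. The upstream address 127.0.0.1:5000 is an assumption; point proxy_pass at wherever your middleware actually listens:

server {
    listen 80;

    location / {
        # Forward requests to the LLM middleware (address is an assumption).
        proxy_pass http://127.0.0.1:5000;

        # Keep these longer than the middleware's 95 second request timeout.
        proxy_read_timeout 300;
        proxy_connect_timeout 300;
        proxy_send_timeout 300;
    }
}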

Model Preparation

Make sure the maximum sequence length in your model's tokenizer_config.json (typically the model_max_length field) is set equal to or greater than your token limit, for example 4096.
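
For example, with a 4096-token limit, the relevant field in tokenizer_config.json would look something like this excerpt (other fields omitted; the field name assumes the usual Hugging Face tokenizer config layout):

{
    "model_max_length": 4096
}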