### Nginx

Make sure your proxies all have a long timeout:

```
proxy_read_timeout 300;
proxy_connect_timeout 300;
proxy_send_timeout 300;
```

The LLM middleware has a request timeout of 95 seconds, so the longer proxy timeout (300 seconds here) ensures nginx never cuts off a request before the middleware does.

### Model Preparation

Make sure your model's `tokenizer_config.json` has its maximum sequence length (e.g. `4096`) set equal to or greater than your token limit.
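As a minimal sketch, in a Hugging Face-style `tokenizer_config.json` the field in question is typically `model_max_length` (an assumption here; your serving stack may read a different key):

```json
{
  "model_max_length": 4096
}
```

If this value is lower than your configured token limit, the tokenizer may silently truncate prompts before they reach the model.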