### Nginx

Make sure your proxies all have a long timeout:

```
proxy_read_timeout 300;
proxy_connect_timeout 300;
proxy_send_timeout 300;
```

The LLM middleware has a request timeout of 95 seconds, so the longer proxy timeout (300 seconds here) ensures nginx never cuts off a request before the middleware does.

### Model Preparation

Make sure your model's `tokenizer_config.json` has its maximum sequence length (e.g. `4096`) set equal to or greater than your token limit.
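As a minimal sketch, in a Hugging Face-style `tokenizer_config.json` the field in question is typically `model_max_length` (an assumption here; your serving stack may read a different key):

```json
{
  "model_max_length": 4096
}
```

If this value is lower than your configured token limit, the tokenizer may silently truncate prompts before they reach the model.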