hf_text-generation-inference/router/src
Latest commit: afd04dc71e by OlivierDehaene, feat(server): update vllm version (#723), 2023-07-28 15:36:38 +02:00
File           Last commit                                                            Date
health.rs      feat(server): only compute prefill logprobs when asked (#406)          2023-06-02 17:12:30 +02:00
infer.rs       feat(server): auto max_batch_total_tokens for flash att models (#630)  2023-07-19 09:31:25 +02:00
lib.rs         chore: update openapi schema                                           2023-06-05 18:16:08 +02:00
main.rs        feat(server): update vllm version (#723)                               2023-07-28 15:36:38 +02:00
queue.rs       feat(server): auto max_batch_total_tokens for flash att models (#630)  2023-07-19 09:31:25 +02:00
server.rs      feat(server): add local prom and health routes if running w/ ngrok     2023-07-21 16:56:30 +02:00
validation.rs  feat(launcher): add arg validation and drop subprocess (#595)          2023-07-13 14:22:37 +02:00