Merge cluster to master #3

Merged
cyberes merged 163 commits from cluster into master 2023-10-27 19:19:22 -06:00
1 changed file with 1 addition and 2 deletions
Showing only changes of commit 28c250385d

@@ -34,12 +34,11 @@ from llm_server.sock import init_wssocket
 # TODO: return an `error: True`, error code, and error message rather than just a formatted message
 # TODO: what happens when all backends are offline? What about the "online" key in the stats page?
 # TODO: redis SCAN vs KEYS??
-# TODO: implement blind RRD controlled via header and only used when there is a queue on the primary backend(s)
 # TODO: is frequency penalty the same as ooba repetition penalty???
 # TODO: make sure openai_moderation_enabled works on websockets, completions, and chat completions
-# TODO: if a backend is at its limit of concurrent requests, choose a different one
 # Lower priority
+# TODO: if a backend is at its limit of concurrent requests, choose a different one
 # TODO: make error messages consitient
 # TODO: support logit_bias on OpenAI and Ooba endpoints.
 # TODO: add a way to cancel VLLM gens. Maybe use websockets?