archive project

Cyberes 2024-10-27 12:13:26 -06:00
parent 9a1d41a9b7
commit b13fda722a
2 changed files with 4 additions and 1 deletion


@@ -2,6 +2,8 @@
 _An HTTP API to serve local LLM Models._
+
+**ARCHIVED PROJECT:** this project was created before any good solution existed for managing LLM endpoints and has now been superseded by many good options. [LiteLLM](https://github.com/BerriAI/litellm) is the best replacement. If a need for an un-authenticated public model arises, check out [cyberes/litellm-public](https://git.evulid.cc/cyberes/litellm-public).
 The purpose of this server is to abstract your LLM backend from your frontend API. This enables you to switch your backend while providing a stable frontend to clients.
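For context, the purpose line in that hunk describes a thin proxy layer: clients always hit the same frontend route while the backend behind it can be swapped. A minimal sketch of that idea, assuming Flask and a single hypothetical `BACKEND_URL` setting (none of this is taken from this repository):

```python
# Sketch of the backend-abstraction idea from the README (hypothetical,
# not this project's actual code). Clients call the stable /v1/completions
# route; only BACKEND_URL changes when the LLM backend is swapped out.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
BACKEND_URL = 'http://localhost:5000/api/v1/generate'  # swappable backend

@app.route('/v1/completions', methods=['POST'])
def completions():
    # Forward the client request to whichever backend is configured and
    # relay its response unchanged, so the frontend contract stays stable.
    resp = requests.post(BACKEND_URL, json=request.get_json(), timeout=120)
    return jsonify(resp.json()), resp.status_code
```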


@@ -21,6 +21,7 @@ from llm_server.sock import init_wssocket
 # TODO: is frequency penalty the same as ooba repetition penalty???
 # TODO: make sure openai_moderation_enabled works on websockets, completions, and chat completions
 # TODO: insert pydantic object into database
+# TODO: figure out blocking API disconnect https://news.ycombinator.com/item?id=41168033
 # Lower priority
 # TODO: if a backend is at its limit of concurrent requests, choose a different one
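On the new TODO about blocking API disconnects: under a WSGI server such as the one Flask uses, a streaming response generator is closed when the client goes away, which raises `GeneratorExit` inside it. One hedged way to notice the disconnect is a cleanup block in that generator; a sketch, not this project's code:

```python
# Sketch: noticing a client disconnect mid-stream under Flask/WSGI.
# When the client drops, the server closes the generator, raising
# GeneratorExit inside it, so the finally block runs either way.
from flask import Flask, Response

app = Flask(__name__)

@app.route('/v1/stream')
def stream():
    def generate():
        try:
            for i in range(1000):
                yield f'data: token {i}\n\n'
        finally:
            # Runs on normal completion *and* on client disconnect;
            # a real server would release the backend slot here.
            print('stream finished or client disconnected')
    return Response(generate(), mimetype='text/event-stream')
```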