Commit Graph

191 Commits

Author SHA1 Message Date
Cyberes 2ab2e6eed1 fix service file 2024-05-08 22:19:57 -06:00
Cyberes 20366fbd08 misc adjustments 2024-05-07 22:56:36 -06:00
Cyberes fe23a2282f refactor, add Llm-Disable-Openai header 2024-05-07 17:41:53 -06:00
Cyberes 5bd1044fad openai error message cleanup 2024-05-07 17:07:34 -06:00
Cyberes fd09c783d3 refactor a lot of things, major cleanup, use postgresql 2024-05-07 17:03:41 -06:00
Cyberes ee9a0d4858 redo config 2024-05-07 12:20:53 -06:00
Cyberes ff82add09e redo database connection, add pooling, minor logging changes, other clean up 2024-05-07 09:48:51 -06:00
Cyberes ab408c6c5b ready for public release 2024-03-18 12:42:44 -06:00
Cyberes 4b3e0671c6 clean some stuff up, bump VLLM version 2024-01-10 15:01:26 -07:00
Cyberes 0e7f04ab2d fix gunicorn logging 2023-12-21 14:24:50 -07:00
Cyberes 8cab76712f update service 2023-12-21 13:22:47 -07:00
Cyberes 885b2d64f7 fix homepage incorrectly showing online when all backends are offline 2023-11-13 20:57:19 -07:00
Cyberes 009039dbd8 fix server exception 2023-10-30 14:42:50 -06:00
Cyberes 0059e7956c Merge cluster to master (#3) (Co-authored-by: Cyberes <cyberes@evulid.cc>; Reviewed-on: #3) 2023-10-27 19:19:22 -06:00
Cyberes e1d3fca6d3 try to cancel inference if disconnected from client 2023-09-28 09:55:31 -06:00
Cyberes e42f2b6819 fix negative queue on stats 2023-09-28 08:47:39 -06:00
Cyberes 347a82b7e1 avoid sending to backend to tokenize if it's greater than our specified context size 2023-09-28 03:54:20 -06:00
Cyberes 467b804ad7 raise printer interval 2023-09-28 03:47:27 -06:00
Cyberes 315d42bbc5 divide by 0??? 2023-09-28 03:46:01 -06:00
Cyberes 59f2aac8ad rewrite redis usage 2023-09-28 03:44:30 -06:00
Cyberes a4a1d6cce6 fix double logging 2023-09-28 01:34:15 -06:00
Cyberes ecdf819088 fix try/finally with continue, fix wrong subclass signature 2023-09-28 00:11:34 -06:00
Cyberes e86a5182eb redo background processes, reorganize server.py 2023-09-27 23:36:44 -06:00
Cyberes 097d614a35 fix duplicate logging from console printer thread 2023-09-27 21:28:25 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 43299b32ad clean up background threads 2023-09-27 19:39:04 -06:00
Cyberes 35e9847b27 set inference workers to daemon, add finally to inference worker, hide estimated avg tps 2023-09-27 18:36:51 -06:00
Cyberes 105b66d5e2 unify error message handling 2023-09-27 14:48:47 -06:00
Cyberes 957a6cd092 fix error handling 2023-09-27 14:36:49 -06:00
Cyberes aba2e5b9c0 don't use db pooling, add LLM-ST-Errors header to disable formatted errors 2023-09-26 23:59:22 -06:00
Cyberes 048e5a8060 fix API key handling 2023-09-26 22:49:53 -06:00
Cyberes d9bbcc42e6 more work on openai endpoint 2023-09-26 22:09:11 -06:00
Cyberes e0af2ea9c5 convert to gunicorn 2023-09-26 13:32:33 -06:00
Cyberes 0eb901cb52 don't log entire request on failure 2023-09-26 12:32:19 -06:00
Cyberes b44dda7a3a option to show SYSTEM tokens in stats 2023-09-25 23:39:50 -06:00
Cyberes e37cde5d48 exclude system token more places 2023-09-25 23:22:16 -06:00
Cyberes bbdb9c9d55 try to prevent "### XXX" responses on openai 2023-09-25 23:14:35 -06:00
Cyberes 11e84db59c update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming, 2023-09-25 22:32:48 -06:00
Cyberes 2d299dbae5 openai_force_no_hashes 2023-09-25 22:01:57 -06:00
Cyberes 8240a1ebbb fix background log not doing anything 2023-09-25 18:18:29 -06:00
Cyberes 8184e24bff fix sending error messages when streaming 2023-09-25 17:37:58 -06:00
Cyberes 7ce60079d7 fix typo 2023-09-25 17:24:51 -06:00
Cyberes 30282479a0 fix flask exception 2023-09-25 17:22:28 -06:00
Cyberes 135bd743bb fix homepage slowness, fix incorrect 24 hr prompters, fix redis wrapper, 2023-09-25 17:20:21 -06:00
Cyberes 52e6965b5e don't count SYSTEM tokens for recent prompters, fix sql exclude for SYSTEM tokens 2023-09-25 13:00:39 -06:00
Cyberes 3eaabc8c35 fix copied code 2023-09-25 12:38:02 -06:00
Cyberes 44e692c9cf remove debug print 2023-09-25 12:35:36 -06:00
Cyberes 1646a00987 implement streaming on openai, improve streaming, run DB logging in background thread 2023-09-25 12:30:40 -06:00
Cyberes bbe5d5a8fe improve openai endpoint, exclude system tokens more places 2023-09-25 09:32:23 -06:00
Cyberes 6459a1c91b allow setting simultaneous IP limit per-token, fix token use tracker, fix tokens on streaming 2023-09-25 00:55:20 -06:00