Commit Graph

59 Commits

Author SHA1 Message Date
Cyberes 2fed87d340 remove timed-out items from queue 2023-10-17 11:46:39 -06:00
Cyberes 2c7773cc4f get streaming working again 2023-10-16 16:22:52 -06:00
Cyberes 83f3ba8919 trying to fix workers still processing after backend goes offline 2023-10-15 15:11:37 -06:00
Cyberes 18e37a72ae add model selection to openai endpoint 2023-10-09 23:51:26 -06:00
Cyberes ae4d4e5ca9 fix exception 2023-10-09 10:31:35 -06:00
Cyberes e8964fcfd2 fix the queue?? 2023-10-05 21:37:18 -06:00
Cyberes 08df52a4fd fix exception when not valid model 2023-10-05 12:28:00 -06:00
Cyberes acf409abfc fix background logger, add gradio chat example 2023-10-04 19:24:47 -06:00
Cyberes 1670594908 fix import error 2023-10-04 16:29:19 -06:00
Cyberes 09fa69e031 fix 2023-10-04 13:37:39 -06:00
Cyberes d78ef652fc c 2023-10-04 13:21:43 -06:00
Cyberes 5e90fa54d4 handle model offline 2023-10-04 13:18:47 -06:00
Cyberes 77db34a6a7 g 2023-10-04 12:59:19 -06:00
Cyberes a15b5465df c 2023-10-04 12:44:09 -06:00
Cyberes e16f415749 fix 2023-10-03 13:49:00 -06:00
Cyberes 32ad97e57c do default model rather than default backend, adjust moderation endpoint logic and add timeout, exclude system tokens from recent proompters, calculate number of moderators from endpoint concurrent gens, adjust homepage 2023-10-03 13:40:08 -06:00
Cyberes 94141b8ecf fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences 2023-10-02 20:53:08 -06:00
Cyberes b0089859d7 fix ratelimiting 2023-10-02 02:05:15 -06:00
Cyberes 21da2f6373 fix openai error message 2023-10-01 22:58:08 -06:00
Cyberes f7e9687527 finish openai endpoints 2023-10-01 16:04:53 -06:00
Cyberes 2a3ff7e21e update openai endpoints 2023-10-01 14:15:01 -06:00
Cyberes 1151bb5475 adjust stats 2023-09-30 20:42:48 -06:00
Cyberes 114f36e709 functional 2023-09-30 19:41:50 -06:00
Cyberes 624ca74ce5 mvp 2023-09-29 00:09:44 -06:00
Cyberes e7b57cad7b set up cluster config and basic background workers 2023-09-28 18:40:24 -06:00
Cyberes 59f2aac8ad rewrite redis usage 2023-09-28 03:44:30 -06:00
Cyberes a4a1d6cce6 fix double logging 2023-09-28 01:34:15 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 43299b32ad clean up background threads 2023-09-27 19:39:04 -06:00
Cyberes 105b66d5e2 unify error message handling 2023-09-27 14:48:47 -06:00
Cyberes 957a6cd092 fix error handling 2023-09-27 14:36:49 -06:00
Cyberes aba2e5b9c0 don't use db pooling, add LLM-ST-Errors header to disable formatted errors 2023-09-26 23:59:22 -06:00
Cyberes 048e5a8060 fix API key handling 2023-09-26 22:49:53 -06:00
Cyberes d9bbcc42e6 more work on openai endpoint 2023-09-26 22:09:11 -06:00
Cyberes 52e6965b5e don't count SYSTEM tokens for recent prompters, fix sql exclude for SYSTEM tokens 2023-09-25 13:00:39 -06:00
Cyberes bbe5d5a8fe improve openai endpoint, exclude system tokens more places 2023-09-25 09:32:23 -06:00
Cyberes 6459a1c91b allow setting simultaneous IP limit per-token, fix token use tracker, fix tokens on streaming 2023-09-25 00:55:20 -06:00
Cyberes cb99c3490e rewrite tokenizer, restructure validation 2023-09-24 13:02:30 -06:00
Cyberes 62412f4873 add config setting for hostname 2023-09-23 23:24:08 -06:00
Cyberes 84a1fcfdd8 don't store host if it's an IP 2023-09-23 23:14:22 -06:00
Cyberes 76a1428ba0 implement streaming for vllm 2023-09-23 17:57:23 -06:00
Cyberes 8593198216 close mysql cursor 2023-09-20 21:19:26 -06:00
Cyberes 03e3ec5490 port to mysql, use vllm tokenizer endpoint 2023-09-20 20:30:31 -06:00
Cyberes eb3179cfff fix recent proompters to work with gunicorn 2023-09-17 19:06:53 -06:00
Cyberes 3c1254d3bf cache stats in background 2023-09-17 18:55:36 -06:00
Cyberes 77edbe779c actually validate prompt length lol 2023-09-14 18:31:13 -06:00
Cyberes 3100b0a924 set up queue to work with gunicorn processes, other improvements 2023-09-14 17:38:20 -06:00
Cyberes 93a344f4c5 check if the backend crapped out, print some more stuff 2023-09-14 14:26:25 -06:00
Cyberes 79b1e01b61 option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup 2023-09-14 14:05:50 -06:00
Cyberes bcedd2ab3d adjust logging, add more vllm stuff 2023-09-13 11:22:33 -06:00