Commit Graph

157 Commits

Author SHA1 Message Date
Cyberes 2fed87d340 remove timed-out items from queue 2023-10-17 11:46:39 -06:00
Cyberes 9e3cbc9d2e fix streaming slowdown? 2023-10-16 23:36:25 -06:00
Cyberes c3c053e071 test 2023-10-16 23:29:17 -06:00
Cyberes 806e522d16 don't pickle streaming 2023-10-16 18:35:10 -06:00
Cyberes 2c7773cc4f get streaming working again 2023-10-16 16:22:52 -06:00
Cyberes 24aab3cd93 fix streaming disabled 2023-10-15 20:59:11 -06:00
Cyberes 3ec9b2347f fix wrong datatype 2023-10-15 17:24:18 -06:00
Cyberes 83f3ba8919 trying to fix workers still processing after backend goes offline 2023-10-15 15:11:37 -06:00
Cyberes 18e37a72ae add model selection to openai endpoint 2023-10-09 23:51:26 -06:00
Cyberes e8964fcfd2 fix the queue?? 2023-10-05 21:37:18 -06:00
Cyberes e9f6fdf65e fix streaming? 2023-10-05 20:14:28 -06:00
Cyberes 08df52a4fd fix exception when not valid model 2023-10-05 12:28:00 -06:00
Cyberes acf409abfc fix background logger, add gradio chat example 2023-10-04 19:24:47 -06:00
Cyberes 1670594908 fix import error 2023-10-04 16:29:19 -06:00
Cyberes 6dc3529190 show online status on stats page 2023-10-03 23:39:25 -06:00
Cyberes 1a7f22ec55 adjust again 2023-10-03 20:47:37 -06:00
Cyberes 67f5df9bb9 fix stats page 2023-10-03 20:42:53 -06:00
Cyberes 33b4b8404b clean up streaming 2023-10-03 14:10:50 -06:00
Cyberes 581a0fec99 fix exception 2023-10-03 13:47:18 -06:00
Cyberes 32ad97e57c do default model rather than default backend, adjust moderation endpoint logic and add timeout, exclude system tokens from recent proompters, calculate number of moderators from endpoint concurrent gens, adjust homepage 2023-10-03 13:40:08 -06:00
Cyberes 94141b8ecf fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences 2023-10-02 20:53:08 -06:00
Cyberes f7e9687527 finish openai endpoints 2023-10-01 16:04:53 -06:00
Cyberes 2a3ff7e21e update openai endpoints 2023-10-01 14:15:01 -06:00
Cyberes 93d19fb95b fix exception 2023-10-01 10:25:32 -06:00
Cyberes d203973e80 fix routes 2023-10-01 01:13:13 -06:00
Cyberes 25ec56a5ef get streaming working, remove /v2/ 2023-10-01 00:20:00 -06:00
Cyberes b10d22ca0d cache the home page in the background 2023-09-30 23:03:42 -06:00
Cyberes 9235725bdd adjust message 2023-09-30 21:35:55 -06:00
Cyberes 61856b4383 adjust message 2023-09-30 21:34:32 -06:00
Cyberes 7af3dbd76b add message about settings 2023-09-30 21:31:25 -06:00
Cyberes 592eb08cb1 add message for /v1/ 2023-09-30 21:07:12 -06:00
Cyberes 166b2316e8 deprecate v1 2023-09-30 20:59:24 -06:00
Cyberes e0f86d053a reorganize to api v2 2023-09-30 19:42:41 -06:00
Cyberes 114f36e709 functional 2023-09-30 19:41:50 -06:00
Cyberes 624ca74ce5 mvp 2023-09-29 00:09:44 -06:00
Cyberes e7b57cad7b set up cluster config and basic background workers 2023-09-28 18:40:24 -06:00
Cyberes e1d3fca6d3 try to cancel inference if disconnected from client 2023-09-28 09:55:31 -06:00
Cyberes e42f2b6819 fix negative queue on stats 2023-09-28 08:47:39 -06:00
Cyberes 347a82b7e1 avoid sending to backend to tokenize if it's greater than our specified context size 2023-09-28 03:54:20 -06:00
Cyberes 59f2aac8ad rewrite redis usage 2023-09-28 03:44:30 -06:00
Cyberes a4a1d6cce6 fix double logging 2023-09-28 01:34:15 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 43299b32ad clean up background threads 2023-09-27 19:39:04 -06:00
Cyberes 35e9847b27 set inference workers to daemon, add finally to inference worker, hide estimated avg tps 2023-09-27 18:36:51 -06:00
Cyberes 105b66d5e2 unify error message handling 2023-09-27 14:48:47 -06:00
Cyberes aba2e5b9c0 don't use db pooling, add LLM-ST-Errors header to disable formatted errors 2023-09-26 23:59:22 -06:00
Cyberes 048e5a8060 fix API key handling 2023-09-26 22:49:53 -06:00
Cyberes d9bbcc42e6 more work on openai endpoint 2023-09-26 22:09:11 -06:00
Cyberes e0af2ea9c5 convert to gunicorn 2023-09-26 13:32:33 -06:00
Cyberes 0eb901cb52 don't log entire request on failure 2023-09-26 12:32:19 -06:00