Commit Graph

113 Commits

Author SHA1 Message Date
Cyberes 28c250385d add todo 2023-10-27 19:00:49 -06:00
Cyberes 563630547a add robots.txt 2023-10-23 17:32:33 -06:00
Cyberes b4e01e129d fix when all offline 2023-10-23 17:28:59 -06:00
Cyberes 3cf73fec9b fix a few exceptions when all backends go offline 2023-10-23 15:22:57 -06:00
Cyberes 92e4ecd8a1 refer to queue for tracking IP count rather than separate value 2023-10-18 09:03:10 -06:00
Cyberes 2c7773cc4f get streaming working again 2023-10-16 16:22:52 -06:00
Cyberes 83f3ba8919 trying to fix workers still processing after backend goes offline 2023-10-15 15:11:37 -06:00
Cyberes 18e37a72ae add model selection to openai endpoint 2023-10-09 23:51:26 -06:00
Cyberes 8df667bc0a t 2023-10-05 19:25:08 -06:00
Cyberes 64d7a9edbb fix 2023-10-05 18:09:24 -06:00
Cyberes 3d0a5cf0a2 t 2023-10-05 18:06:36 -06:00
Cyberes 01fb619b9b f 2023-10-05 18:05:31 -06:00
Cyberes 1670594908 fix import error 2023-10-04 16:29:19 -06:00
Cyberes 6723dd79dc fix exception 2023-10-04 16:04:03 -06:00
Cyberes 95d781725e t 2023-10-04 12:42:18 -06:00
Cyberes 4deb32bf1c test 2023-10-04 10:32:11 -06:00
Cyberes 5f4e4710c1 option to prioritize by parameter count 2023-10-04 10:19:44 -06:00
Cyberes 67f5df9bb9 fix stats page 2023-10-03 20:42:53 -06:00
Cyberes 33b4b8404b clean up streaming 2023-10-03 14:10:50 -06:00
Cyberes 32ad97e57c do default model rather than default backend, adjust moderation endpoint logic and add timeout, exclude system tokens from recent proompters, calculate number of moderators from endpoint concurrent gens, adjust homepage 2023-10-03 13:40:08 -06:00
Cyberes 94141b8ecf fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences 2023-10-02 20:53:08 -06:00
Cyberes 51881ae39d fix tokenizer 2023-10-01 17:19:34 -06:00
Cyberes 2a3ff7e21e update openai endpoints 2023-10-01 14:15:01 -06:00
Cyberes 93d19fb95b fix exception 2023-10-01 10:25:32 -06:00
Cyberes d203973e80 fix routes 2023-10-01 01:13:13 -06:00
Cyberes 25ec56a5ef get streaming working, remove /v2/ 2023-10-01 00:20:00 -06:00
Cyberes c5b30d985c adjust jinja template 2023-09-30 22:11:51 -06:00
Cyberes 7af3dbd76b add message about settings 2023-09-30 21:31:25 -06:00
Cyberes 592eb08cb1 add message for /v1/ 2023-09-30 21:07:12 -06:00
Cyberes 91ba2fad1b add proompter stats back in 2023-09-30 20:11:14 -06:00
Cyberes e0f86d053a reorganize to api v2 2023-09-30 19:42:41 -06:00
Cyberes 114f36e709 functional 2023-09-30 19:41:50 -06:00
Cyberes 624ca74ce5 mvp 2023-09-29 00:09:44 -06:00
Cyberes e7b57cad7b set up cluster config and basic background workers 2023-09-28 18:40:24 -06:00
Cyberes e1d3fca6d3 try to cancel inference if disconnected from client 2023-09-28 09:55:31 -06:00
Cyberes a4a1d6cce6 fix double logging 2023-09-28 01:34:15 -06:00
Cyberes e86a5182eb redo background processes, reorganize server.py 2023-09-27 23:36:44 -06:00
Cyberes 097d614a35 fix duplicate logging from console printer thread 2023-09-27 21:28:25 -06:00
Cyberes adc0905c6f fix imports 2023-09-27 21:20:08 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 43299b32ad clean up background threads 2023-09-27 19:39:04 -06:00
Cyberes 35e9847b27 set inference workers to daemon, add finally to inference worker, hide estimated avg tps 2023-09-27 18:36:51 -06:00
Cyberes 74f16afa67 update dockerfile 2023-09-27 16:12:36 -06:00
Cyberes 105b66d5e2 unify error message handling 2023-09-27 14:48:47 -06:00
Cyberes 957a6cd092 fix error handling 2023-09-27 14:36:49 -06:00
Cyberes aba2e5b9c0 don't use db pooling, add LLM-ST-Errors header to disable formatted errors 2023-09-26 23:59:22 -06:00
Cyberes d9bbcc42e6 more work on openai endpoint 2023-09-26 22:09:11 -06:00
Cyberes e0af2ea9c5 convert to gunicorn 2023-09-26 13:32:33 -06:00
Cyberes b44dda7a3a option to show SYSTEM tokens in stats 2023-09-25 23:39:50 -06:00
Cyberes bbdb9c9d55 try to prevent "### XXX" responses on openai 2023-09-25 23:14:35 -06:00