Commit Graph

105 Commits

Author SHA1 Message Date
Cyberes 8df667bc0a t 2023-10-05 19:25:08 -06:00
Cyberes 64d7a9edbb fix 2023-10-05 18:09:24 -06:00
Cyberes 3d0a5cf0a2 t 2023-10-05 18:06:36 -06:00
Cyberes 01fb619b9b f 2023-10-05 18:05:31 -06:00
Cyberes 1670594908 fix import error 2023-10-04 16:29:19 -06:00
Cyberes 6723dd79dc fix exceptoin 2023-10-04 16:04:03 -06:00
Cyberes 95d781725e t 2023-10-04 12:42:18 -06:00
Cyberes 4deb32bf1c test 2023-10-04 10:32:11 -06:00
Cyberes 5f4e4710c1 option to prioritize by parameter count 2023-10-04 10:19:44 -06:00
Cyberes 67f5df9bb9 fix stats page 2023-10-03 20:42:53 -06:00
Cyberes 33b4b8404b clean up streaming 2023-10-03 14:10:50 -06:00
Cyberes 32ad97e57c do default model rather than default backend, adjust moderation endpoint logic and add timeout, exclude system tokens from recent proompters, calculate number of moderators from endpoint concurrent gens, adjust homepage 2023-10-03 13:40:08 -06:00
Cyberes 94141b8ecf fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences 2023-10-02 20:53:08 -06:00
Cyberes 51881ae39d fix tokenizer 2023-10-01 17:19:34 -06:00
Cyberes 2a3ff7e21e update openai endpoints 2023-10-01 14:15:01 -06:00
Cyberes 93d19fb95b fix exception 2023-10-01 10:25:32 -06:00
Cyberes d203973e80 fix routes 2023-10-01 01:13:13 -06:00
Cyberes 25ec56a5ef get streaming working, remove /v2/ 2023-10-01 00:20:00 -06:00
Cyberes c5b30d985c adjust jinja template 2023-09-30 22:11:51 -06:00
Cyberes 7af3dbd76b add message about settings 2023-09-30 21:31:25 -06:00
Cyberes 592eb08cb1 add message for /v1/ 2023-09-30 21:07:12 -06:00
Cyberes 91ba2fad1b add proompter stats back in 2023-09-30 20:11:14 -06:00
Cyberes e0f86d053a reorganize to api v2 2023-09-30 19:42:41 -06:00
Cyberes 114f36e709 functional 2023-09-30 19:41:50 -06:00
Cyberes 624ca74ce5 mvp 2023-09-29 00:09:44 -06:00
Cyberes e7b57cad7b set up cluster config and basic background workers 2023-09-28 18:40:24 -06:00
Cyberes e1d3fca6d3 try to cancel inference if disconnected from client 2023-09-28 09:55:31 -06:00
Cyberes a4a1d6cce6 fix double logging 2023-09-28 01:34:15 -06:00
Cyberes e86a5182eb redo background processes, reorganize server.py 2023-09-27 23:36:44 -06:00
Cyberes 097d614a35 fix duplicate logging from console printer thread 2023-09-27 21:28:25 -06:00
Cyberes adc0905c6f fix imports 2023-09-27 21:20:08 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 43299b32ad clean up background threads 2023-09-27 19:39:04 -06:00
Cyberes 35e9847b27 set inference workers to daemon, add finally to inference worker, hide estimated avg tps 2023-09-27 18:36:51 -06:00
Cyberes 74f16afa67 update dockerfile 2023-09-27 16:12:36 -06:00
Cyberes 105b66d5e2 unify error message handling 2023-09-27 14:48:47 -06:00
Cyberes 957a6cd092 fix error handling 2023-09-27 14:36:49 -06:00
Cyberes aba2e5b9c0 don't use db pooling, add LLM-ST-Errors header to disable formatted errors 2023-09-26 23:59:22 -06:00
Cyberes d9bbcc42e6 more work on openai endpoint 2023-09-26 22:09:11 -06:00
Cyberes e0af2ea9c5 convert to gunicorn 2023-09-26 13:32:33 -06:00
Cyberes b44dda7a3a option to show SYSTEM tokens in stats 2023-09-25 23:39:50 -06:00
Cyberes bbdb9c9d55 try to prevent "### XXX" responses on openai 2023-09-25 23:14:35 -06:00
Cyberes 11e84db59c update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming, 2023-09-25 22:32:48 -06:00
Cyberes 2d299dbae5 openai_force_no_hashes 2023-09-25 22:01:57 -06:00
Cyberes 8240a1ebbb fix background log not doing anything 2023-09-25 18:18:29 -06:00
Cyberes 289b40181c forgot to test all config possibilities 2023-09-25 17:23:43 -06:00
Cyberes 30282479a0 fix flask exception 2023-09-25 17:22:28 -06:00
Cyberes 135bd743bb fix homepage slowness, fix incorrect 24 hr prompters, fix redis wrapper, 2023-09-25 17:20:21 -06:00
Cyberes bbe5d5a8fe improve openai endpoint, exclude system tokens more places 2023-09-25 09:32:23 -06:00
Cyberes 6459a1c91b allow setting simultaneous IP limit per-token, fix token use tracker, fix tokens on streaming 2023-09-25 00:55:20 -06:00