Commit Graph

71 Commits

Author SHA1 Message Date
Cyberes 5a61bdccd4 f 2023-10-05 18:07:59 -06:00
Cyberes 3d0a5cf0a2 t 2023-10-05 18:06:36 -06:00
Cyberes acf409abfc fix background logger, add gradio chat example 2023-10-04 19:24:47 -06:00
Cyberes 1670594908 fix import error 2023-10-04 16:29:19 -06:00
Cyberes 62d5d43da4 handle backend offline in tokenizer 2023-10-04 13:34:59 -06:00
Cyberes 7acaa3c885 g 2023-10-04 13:32:54 -06:00
Cyberes 754a4cbdf3 r 2023-10-04 13:11:43 -06:00
Cyberes d0eec88dbd f 2023-10-04 13:03:58 -06:00
Cyberes 6bad5b3fa0 t 2023-10-04 13:02:53 -06:00
Cyberes 4deb32bf1c test 2023-10-04 10:32:11 -06:00
Cyberes f88e2362c5 remove some debug prints 2023-10-03 20:01:28 -06:00
Cyberes 581a0fec99 fix exception 2023-10-03 13:47:18 -06:00
Cyberes 62eb0196cc t 2023-10-03 00:13:55 -06:00
Cyberes 0f5e22191c test 2023-10-03 00:12:37 -06:00
Cyberes cd325216e2 test 2023-10-02 22:45:07 -06:00
Cyberes 94141b8ecf fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences 2023-10-02 20:53:08 -06:00
Cyberes b0089859d7 fix ratelimiting 2023-10-02 02:05:15 -06:00
Cyberes 21da2f6373 fix openai error message 2023-10-01 22:58:08 -06:00
Cyberes a594729d00 fix keyerror 2023-10-01 22:37:13 -06:00
Cyberes 51881ae39d fix tokenizer 2023-10-01 17:19:34 -06:00
Cyberes f7e9687527 finish openai endpoints 2023-10-01 16:04:53 -06:00
Cyberes 2a3ff7e21e update openai endpoints 2023-10-01 14:15:01 -06:00
Cyberes 25ec56a5ef get streaming working, remove /v2/ 2023-10-01 00:20:00 -06:00
Cyberes 114f36e709 functional 2023-09-30 19:41:50 -06:00
Cyberes 624ca74ce5 mvp 2023-09-29 00:09:44 -06:00
Cyberes e7b57cad7b set up cluster config and basic background workers 2023-09-28 18:40:24 -06:00
Cyberes 347a82b7e1 avoid sending to backend to tokenize if it's greater than our specified context size 2023-09-28 03:54:20 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 957a6cd092 fix error handling 2023-09-27 14:36:49 -06:00
Cyberes aba2e5b9c0 don't use db pooling, add LLM-ST-Errors header to disable formatted errors 2023-09-26 23:59:22 -06:00
Cyberes d9bbcc42e6 more work on openai endpoint 2023-09-26 22:09:11 -06:00
Cyberes e0af2ea9c5 convert to gunicorn 2023-09-26 13:32:33 -06:00
Cyberes 11e84db59c update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming, 2023-09-25 22:32:48 -06:00
Cyberes 8240a1ebbb fix background log not doing anything 2023-09-25 18:18:29 -06:00
Cyberes 1646a00987 implement streaming on openai, improve streaming, run DB logging in background thread 2023-09-25 12:30:40 -06:00
Cyberes 320f51e01c further align openai endpoint with expected responses 2023-09-24 21:45:30 -06:00
Cyberes cb99c3490e rewrite tokenizer, restructure validation 2023-09-24 13:02:30 -06:00
Cyberes 76a1428ba0 implement streaming for vllm 2023-09-23 17:57:23 -06:00
Cyberes 81452ec643 adjust vllm info 2023-09-21 20:13:29 -06:00
Cyberes 03e3ec5490 port to mysql, use vllm tokenizer endpoint 2023-09-20 20:30:31 -06:00
Cyberes 354ad8192d fix division by 0, prettify /stats json, add js var to home 2023-09-16 17:37:43 -06:00
Cyberes 77edbe779c actually validate prompt length lol 2023-09-14 18:31:13 -06:00
Cyberes 3100b0a924 set up queue to work with gunicorn processes, other improvements 2023-09-14 17:38:20 -06:00
Cyberes 79b1e01b61 option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup 2023-09-14 14:05:50 -06:00
Cyberes c45e68a8c8 adjust requests timeout, add service file 2023-09-14 01:32:49 -06:00
Cyberes 05a45e6ac6 didnt test anything 2023-09-13 11:51:46 -06:00
Cyberes bcedd2ab3d adjust logging, add more vllm stuff 2023-09-13 11:22:33 -06:00
Cyberes 9740df07c7 add openai-compatible backend 2023-09-12 16:40:09 -06:00
Cyberes 1d9f40765e remove text-generation-inference backend 2023-09-12 13:09:47 -06:00
Cyberes 6152b1bb66 fix invalid param error, add manual model name 2023-09-12 10:30:45 -06:00