Cyberes | 62eb0196cc | t | 2023-10-03 00:13:55 -06:00
Cyberes | 0f5e22191c | test | 2023-10-03 00:12:37 -06:00
Cyberes | cd325216e2 | test | 2023-10-02 22:45:07 -06:00
Cyberes | 94141b8ecf | fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences | 2023-10-02 20:53:08 -06:00
Cyberes | b0089859d7 | fix ratelimiting | 2023-10-02 02:05:15 -06:00
Cyberes | 21da2f6373 | fix openai error message | 2023-10-01 22:58:08 -06:00
Cyberes | a594729d00 | fix keyerror | 2023-10-01 22:37:13 -06:00
Cyberes | 51881ae39d | fix tokenizer | 2023-10-01 17:19:34 -06:00
Cyberes | f7e9687527 | finish openai endpoints | 2023-10-01 16:04:53 -06:00
Cyberes | 2a3ff7e21e | update openai endpoints | 2023-10-01 14:15:01 -06:00
Cyberes | 25ec56a5ef | get streaming working, remove /v2/ | 2023-10-01 00:20:00 -06:00
Cyberes | 114f36e709 | functional | 2023-09-30 19:41:50 -06:00
Cyberes | 624ca74ce5 | mvp | 2023-09-29 00:09:44 -06:00
Cyberes | e7b57cad7b | set up cluster config and basic background workers | 2023-09-28 18:40:24 -06:00
Cyberes | 347a82b7e1 | avoid sending to backend to tokenize if it's greater than our specified context size | 2023-09-28 03:54:20 -06:00
Cyberes | e5fbc9545d | add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer | 2023-09-27 21:15:54 -06:00
Cyberes | 957a6cd092 | fix error handling | 2023-09-27 14:36:49 -06:00
Cyberes | aba2e5b9c0 | don't use db pooling, add LLM-ST-Errors header to disable formatted errors | 2023-09-26 23:59:22 -06:00
Cyberes | d9bbcc42e6 | more work on openai endpoint | 2023-09-26 22:09:11 -06:00
Cyberes | e0af2ea9c5 | convert to gunicorn | 2023-09-26 13:32:33 -06:00
Cyberes | 11e84db59c | update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming, | 2023-09-25 22:32:48 -06:00
Cyberes | 8240a1ebbb | fix background log not doing anything | 2023-09-25 18:18:29 -06:00
Cyberes | 1646a00987 | implement streaming on openai, improve streaming, run DB logging in background thread | 2023-09-25 12:30:40 -06:00
Cyberes | 320f51e01c | further align openai endpoint with expected responses | 2023-09-24 21:45:30 -06:00
Cyberes | cb99c3490e | rewrite tokenizer, restructure validation | 2023-09-24 13:02:30 -06:00
Cyberes | 76a1428ba0 | implement streaming for vllm | 2023-09-23 17:57:23 -06:00
Cyberes | 81452ec643 | adjust vllm info | 2023-09-21 20:13:29 -06:00
Cyberes | 03e3ec5490 | port to mysql, use vllm tokenizer endpoint | 2023-09-20 20:30:31 -06:00
Cyberes | 354ad8192d | fix division by 0, prettify /stats json, add js var to home | 2023-09-16 17:37:43 -06:00
Cyberes | 77edbe779c | actually validate prompt length lol | 2023-09-14 18:31:13 -06:00
Cyberes | 3100b0a924 | set up queue to work with gunicorn processes, other improvements | 2023-09-14 17:38:20 -06:00
Cyberes | 79b1e01b61 | option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup | 2023-09-14 14:05:50 -06:00
Cyberes | c45e68a8c8 | adjust requests timeout, add service file | 2023-09-14 01:32:49 -06:00
Cyberes | 05a45e6ac6 | didnt test anything | 2023-09-13 11:51:46 -06:00
Cyberes | bcedd2ab3d | adjust logging, add more vllm stuff | 2023-09-13 11:22:33 -06:00
Cyberes | 9740df07c7 | add openai-compatible backend | 2023-09-12 16:40:09 -06:00
Cyberes | 1d9f40765e | remove text-generation-inference backend | 2023-09-12 13:09:47 -06:00
Cyberes | 6152b1bb66 | fix invalid param error, add manual model name | 2023-09-12 10:30:45 -06:00
Cyberes | 40ac84aa9a | actually we don't want to emulate openai | 2023-09-12 01:04:11 -06:00
Cyberes | 747d838138 | move where the vllm model is set | 2023-09-11 21:05:22 -06:00
Cyberes | 4c9d543eab | implement vllm backend | 2023-09-11 20:47:19 -06:00
Cyberes | c14cc51f09 | get working with ooba again, give up on dockerfile | 2023-09-11 09:51:01 -06:00
Cyberes | bf39b8da63 | still having issues | 2023-08-31 09:24:37 -06:00
Cyberes | 47887c3925 | missed a spot, clean up json error handling | 2023-08-30 20:19:23 -06:00
Cyberes | 2816c01902 | refactor generation route | 2023-08-30 18:53:26 -06:00
Cyberes | f9b9051bad | update weighted_average_column_for_model to account for when there was an error reported, insert null for response tokens when error, correctly parse x-forwarded-for, correctly convert model reported by hf-textgen | 2023-08-29 15:46:56 -06:00
Cyberes | 23f3fcf579 | log errors to database | 2023-08-29 14:48:33 -06:00
Cyberes | b44dfa2471 | update info page | 2023-08-29 14:00:35 -06:00
Cyberes | ba0bc87434 | add HF text-generation-inference backend | 2023-08-29 13:46:41 -06:00
Cyberes | 0aa52863bc | forgot to start workers | 2023-08-23 20:33:49 -06:00