Cyberes | 9a1d41a9b7 | get functional again | 2024-07-07 15:05:35 -06:00 |
Cyberes | fd09c783d3 | refactor a lot of things, major cleanup, use postgresql | 2024-05-07 17:03:41 -06:00 |
Cyberes | ee9a0d4858 | redo config | 2024-05-07 12:20:53 -06:00 |
Cyberes | ff82add09e | redo database connection, add pooling, minor logging changes, other clean up | 2024-05-07 09:48:51 -06:00 |
Cyberes | 0059e7956c | Merge cluster to master (#3); Co-authored-by: Cyberes <cyberes@evulid.cc>; Reviewed-on: #3 | 2023-10-27 19:19:22 -06:00 |
Cyberes | a4a1d6cce6 | fix double logging | 2023-09-28 01:34:15 -06:00 |
Cyberes | 43299b32ad | clean up background threads | 2023-09-27 19:39:04 -06:00 |
Cyberes | 105b66d5e2 | unify error message handling | 2023-09-27 14:48:47 -06:00 |
Cyberes | 957a6cd092 | fix error handling | 2023-09-27 14:36:49 -06:00 |
Cyberes | aba2e5b9c0 | don't use db pooling, add LLM-ST-Errors header to disable formatted errors | 2023-09-26 23:59:22 -06:00 |
Cyberes | 11e84db59c | update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming, | 2023-09-25 22:32:48 -06:00 |
Cyberes | 320f51e01c | further align openai endpoint with expected responses | 2023-09-24 21:45:30 -06:00 |
Cyberes | 03e3ec5490 | port to mysql, use vllm tokenizer endpoint | 2023-09-20 20:30:31 -06:00 |
Cyberes | 3100b0a924 | set up queue to work with gunicorn processes, other improvements | 2023-09-14 17:38:20 -06:00 |
Cyberes | 79b1e01b61 | option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup | 2023-09-14 14:05:50 -06:00 |
Cyberes | 3d40ed4cfb | shit code | 2023-09-13 11:58:38 -06:00 |
Cyberes | 1582625e09 | how did this get broken | 2023-09-13 11:56:30 -06:00 |
Cyberes | bcedd2ab3d | adjust logging, add more vllm stuff | 2023-09-13 11:22:33 -06:00 |
Cyberes | 9740df07c7 | add openai-compatible backend | 2023-09-12 16:40:09 -06:00 |