Cyberes
|
fd09c783d3
|
refactor a lot of things, major cleanup, use postgresql
|
2024-05-07 17:03:41 -06:00 |
Cyberes
|
ee9a0d4858
|
redo config
|
2024-05-07 12:20:53 -06:00 |
Cyberes
|
ff82add09e
|
redo database connection, add pooling, minor logging changes, other clean up
|
2024-05-07 09:48:51 -06:00 |
Cyberes
|
0e7f04ab2d
|
fix gunicorn logging
|
2023-12-21 14:24:50 -07:00 |
Cyberes
|
009039dbd8
|
fix server exception
|
2023-10-30 14:42:50 -06:00 |
Cyberes
|
0059e7956c
|
Merge cluster to master (#3)
Co-authored-by: Cyberes <cyberes@evulid.cc>
Reviewed-on: #3
|
2023-10-27 19:19:22 -06:00 |
Cyberes
|
e1d3fca6d3
|
try to cancel inference if disconnected from client
|
2023-09-28 09:55:31 -06:00 |
Cyberes
|
a4a1d6cce6
|
fix double logging
|
2023-09-28 01:34:15 -06:00 |
Cyberes
|
e86a5182eb
|
redo background processes, reorganize server.py
|
2023-09-27 23:36:44 -06:00 |
Cyberes
|
097d614a35
|
fix duplicate logging from console printer thread
|
2023-09-27 21:28:25 -06:00 |
Cyberes
|
adc0905c6f
|
fix imports
|
2023-09-27 21:20:08 -06:00 |
Cyberes
|
e5fbc9545d
|
add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer
|
2023-09-27 21:15:54 -06:00 |
Cyberes
|
43299b32ad
|
clean up background threads
|
2023-09-27 19:39:04 -06:00 |
Cyberes
|
35e9847b27
|
set inference workers to daemon, add finally to inference worker, hide estimated avg tps
|
2023-09-27 18:36:51 -06:00 |
Cyberes
|
74f16afa67
|
update dockerfile
|
2023-09-27 16:12:36 -06:00 |
Cyberes
|
105b66d5e2
|
unify error message handling
|
2023-09-27 14:48:47 -06:00 |
Cyberes
|
957a6cd092
|
fix error handling
|
2023-09-27 14:36:49 -06:00 |
Cyberes
|
aba2e5b9c0
|
don't use db pooling, add LLM-ST-Errors header to disable formatted errors
|
2023-09-26 23:59:22 -06:00 |
Cyberes
|
d9bbcc42e6
|
more work on openai endpoint
|
2023-09-26 22:09:11 -06:00 |
Cyberes
|
e0af2ea9c5
|
convert to gunicorn
|
2023-09-26 13:32:33 -06:00 |
Cyberes
|
b44dda7a3a
|
option to show SYSTEM tokens in stats
|
2023-09-25 23:39:50 -06:00 |
Cyberes
|
bbdb9c9d55
|
try to prevent "### XXX" responses on openai
|
2023-09-25 23:14:35 -06:00 |
Cyberes
|
11e84db59c
|
update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming
|
2023-09-25 22:32:48 -06:00 |
Cyberes
|
2d299dbae5
|
openai_force_no_hashes
|
2023-09-25 22:01:57 -06:00 |
Cyberes
|
8240a1ebbb
|
fix background log not doing anything
|
2023-09-25 18:18:29 -06:00 |
Cyberes
|
289b40181c
|
forgot to test all config possibilities
|
2023-09-25 17:23:43 -06:00 |
Cyberes
|
30282479a0
|
fix flask exception
|
2023-09-25 17:22:28 -06:00 |
Cyberes
|
135bd743bb
|
fix homepage slowness, fix incorrect 24 hr prompters, fix redis wrapper
|
2023-09-25 17:20:21 -06:00 |
Cyberes
|
bbe5d5a8fe
|
improve openai endpoint, exclude system tokens more places
|
2023-09-25 09:32:23 -06:00 |
Cyberes
|
6459a1c91b
|
allow setting simultaneous IP limit per-token, fix token use tracker, fix tokens on streaming
|
2023-09-25 00:55:20 -06:00 |
Cyberes
|
320f51e01c
|
further align openai endpoint with expected responses
|
2023-09-24 21:45:30 -06:00 |
Cyberes
|
8d6b2ce49c
|
minor changes, add admin token auth system, add route to get backend info
|
2023-09-24 15:54:35 -06:00 |
Cyberes
|
cb99c3490e
|
rewrite tokenizer, restructure validation
|
2023-09-24 13:02:30 -06:00 |
Cyberes
|
62412f4873
|
add config setting for hostname
|
2023-09-23 23:24:08 -06:00 |
Cyberes
|
84a1fcfdd8
|
don't store host if it's an IP
|
2023-09-23 23:14:22 -06:00 |
Cyberes
|
fab7b7ccdd
|
active gen workers wait
|
2023-09-23 21:17:13 -06:00 |
Cyberes
|
03e3ec5490
|
port to mysql, use vllm tokenizer endpoint
|
2023-09-20 20:30:31 -06:00 |
Cyberes
|
3c1254d3bf
|
cache stats in background
|
2023-09-17 18:55:36 -06:00 |
Cyberes
|
3100b0a924
|
set up queue to work with gunicorn processes, other improvements
|
2023-09-14 17:38:20 -06:00 |
Cyberes
|
a89295193f
|
add moderation endpoint to openai api, update config
|
2023-09-14 15:07:17 -06:00 |
Cyberes
|
507327db49
|
lower caching of home page
|
2023-09-14 14:30:01 -06:00 |
Cyberes
|
79b1e01b61
|
option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup
|
2023-09-14 14:05:50 -06:00 |
Cyberes
|
035c17c48b
|
reformat info page info_html field
|
2023-09-13 20:40:55 -06:00 |
Cyberes
|
12e894032e
|
show the openai system prompt
|
2023-09-13 20:25:56 -06:00 |
Cyberes
|
9740df07c7
|
add openai-compatible backend
|
2023-09-12 16:40:09 -06:00 |
Cyberes
|
1d9f40765e
|
remove text-generation-inference backend
|
2023-09-12 13:09:47 -06:00 |
Cyberes
|
6152b1bb66
|
fix invalid param error, add manual model name
|
2023-09-12 10:30:45 -06:00 |
Cyberes
|
57ccedcfb9
|
adjust some things
|
2023-09-12 01:10:58 -06:00 |
Cyberes
|
a84386c311
|
move import check further up
|
2023-09-12 01:05:03 -06:00 |
Cyberes
|
40ac84aa9a
|
actually we don't want to emulate openai
|
2023-09-12 01:04:11 -06:00 |