Cyberes | 11e84db59c | update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming | 2023-09-25 22:32:48 -06:00
Cyberes | 8240a1ebbb | fix background log not doing anything | 2023-09-25 18:18:29 -06:00
Cyberes | 8184e24bff | fix sending error messages when streaming | 2023-09-25 17:37:58 -06:00
Cyberes | 7ce60079d7 | fix typo | 2023-09-25 17:24:51 -06:00
Cyberes | 30282479a0 | fix flask exception | 2023-09-25 17:22:28 -06:00
Cyberes | 135bd743bb | fix homepage slowness, fix incorrect 24 hr prompters, fix redis wrapper | 2023-09-25 17:20:21 -06:00
Cyberes | 52e6965b5e | don't count SYSTEM tokens for recent prompters, fix sql exclude for SYSTEM tokens | 2023-09-25 13:00:39 -06:00
Cyberes | 3eaabc8c35 | fix copied code | 2023-09-25 12:38:02 -06:00
Cyberes | 1646a00987 | implement streaming on openai, improve streaming, run DB logging in background thread | 2023-09-25 12:30:40 -06:00
Cyberes | 6459a1c91b | allow setting simultaneous IP limit per-token, fix token use tracker, fix tokens on streaming | 2023-09-25 00:55:20 -06:00
Cyberes | 320f51e01c | further align openai endpoint with expected responses | 2023-09-24 21:45:30 -06:00
Cyberes | 8d6b2ce49c | minor changes, add admin token auth system, add route to get backend info | 2023-09-24 15:54:35 -06:00
Cyberes | 2678102153 | handle error while streaming | 2023-09-24 13:27:27 -06:00
Cyberes | 0015e653b2 | adjust a few final things | 2023-09-23 22:30:59 -06:00
Cyberes | fab7b7ccdd | active gen workers wait | 2023-09-23 21:17:13 -06:00
Cyberes | 7ee2311183 | whats going on | 2023-09-23 21:10:14 -06:00
Cyberes | 94e845cd1a | if there's less than num concurrent wait time is 0 | 2023-09-23 21:09:21 -06:00
Cyberes | 41e622d19c | fix two exceptions | 2023-09-23 20:55:49 -06:00
Cyberes | f67ac8175b | fix wrong approach for streaming | 2023-09-23 18:44:07 -06:00
Cyberes | 8a4de7df44 | oops | 2023-09-23 18:01:12 -06:00
Cyberes | 76a1428ba0 | implement streaming for vllm | 2023-09-23 17:57:23 -06:00
Cyberes | f9a80f3028 | change proompters 1 min to 5 min | 2023-09-20 21:21:22 -06:00
Cyberes | 03e3ec5490 | port to mysql, use vllm tokenizer endpoint | 2023-09-20 20:30:31 -06:00
Cyberes | 2d390e6268 | *blushes* oopsie daisy | 2023-09-17 20:22:17 -06:00
Cyberes | eb3179cfff | fix recent proompters to work with gunicorn | 2023-09-17 19:06:53 -06:00
Cyberes | 3c1254d3bf | cache stats in background | 2023-09-17 18:55:36 -06:00
Cyberes | edf13db324 | calculate estimated wait time better | 2023-09-17 18:33:57 -06:00
Cyberes | 354ad8192d | fix division by 0, prettify /stats json, add js var to home | 2023-09-16 17:37:43 -06:00
Cyberes | a89295193f | add moderation endpoint to openai api, update config | 2023-09-14 15:07:17 -06:00
Cyberes | 8f4f17166e | adjust | 2023-09-14 14:36:22 -06:00
Cyberes | 93a344f4c5 | check if the backend crapped out, print some more stuff | 2023-09-14 14:26:25 -06:00
Cyberes | 79b1e01b61 | option to disable streaming, improve timeout on requests to backend, fix error handling, reduce duplicate code, misc other cleanup | 2023-09-14 14:05:50 -06:00
Cyberes | e79b206e1a | rename average_tps to estimated_avg_tps | 2023-09-14 01:35:25 -06:00
Cyberes | 12e894032e | show the openai system prompt | 2023-09-13 20:25:56 -06:00
Cyberes | 9740df07c7 | add openai-compatible backend | 2023-09-12 16:40:09 -06:00
Cyberes | 1d9f40765e | remove text-generation-inference backend | 2023-09-12 13:09:47 -06:00
Cyberes | 6152b1bb66 | fix invalid param error, add manual model name | 2023-09-12 10:30:45 -06:00
Cyberes | 5dd95875dd | oops | 2023-09-12 01:12:50 -06:00
Cyberes | 40ac84aa9a | actually we don't want to emulate openai | 2023-09-12 01:04:11 -06:00
Cyberes | 4c9d543eab | implement vllm backend | 2023-09-11 20:47:19 -06:00
Cyberes | 4b32401542 | oops wrong data structure | 2023-08-30 20:24:55 -06:00
Cyberes | 47887c3925 | missed a spot, clean up json error handling | 2023-08-30 20:19:23 -06:00
Cyberes | 8c04238e04 | disable stream for now | 2023-08-30 19:58:59 -06:00
Cyberes | 2816c01902 | refactor generation route | 2023-08-30 18:53:26 -06:00
Cyberes | bf648f605f | implement streaming for hf-textgen | 2023-08-29 17:56:12 -06:00
Cyberes | 26b04f364c | remove old code | 2023-08-29 15:57:28 -06:00
Cyberes | cef88b866a | fix wrong response status code | 2023-08-29 15:52:58 -06:00
Cyberes | f9b9051bad | update weighted_average_column_for_model to account for when there was an error reported, insert null for response tokens when error, correctly parse x-forwarded-for, correctly convert model reported by hf-textgen | 2023-08-29 15:46:56 -06:00
Cyberes | 2d9ec15302 | I swear I know what I'm doing | 2023-08-29 14:57:49 -06:00
Cyberes | 06b52c7648 | forgot to remove a snippet | 2023-08-29 14:53:03 -06:00
Cyberes | 23f3fcf579 | log errors to database | 2023-08-29 14:48:33 -06:00
Cyberes | ba0bc87434 | add HF text-generation-inference backend | 2023-08-29 13:46:41 -06:00
Cyberes | 6c0e60135d | exclude tokens with priority 0 from simultaneous requests ratelimit | 2023-08-28 00:03:25 -06:00
Cyberes | c16d70a24d | limit amount of simultaneous requests an IP can make | 2023-08-27 23:48:10 -06:00
Cyberes | 1a4cb5f786 | reorganize stats page again | 2023-08-27 22:24:44 -06:00
Cyberes | f43336c92c | adjust estimated wait time calculations | 2023-08-27 22:17:21 -06:00
Cyberes | 6a09ffc8a4 | log model used in request so we can pull the correct averages when we change models | 2023-08-26 00:30:59 -06:00
Cyberes | d64152587c | reorganize nvidia stats | 2023-08-25 15:02:40 -06:00
Cyberes | 0e6aadf5e1 | fix missing empty strings logged when errors | 2023-08-25 13:44:41 -06:00
Cyberes | 839bb115c6 | reorganize stats, add 24 hr proompters, adjust logging when error | 2023-08-25 12:20:16 -06:00
Cyberes | 26a0a13aa7 | actually we want this | 2023-08-24 23:57:46 -06:00
Cyberes | 0b4da89de2 | fix exception | 2023-08-24 23:57:25 -06:00
Cyberes | 25e3255c9b | fix issue with tokenizer | 2023-08-24 23:13:07 -06:00
Cyberes | 77fe1e237e | also handle when no response | 2023-08-24 22:53:54 -06:00
Cyberes | e5aca7b09d | adjust netdata json, don't log error messages during generation | 2023-08-24 22:53:06 -06:00
Cyberes | 0230ddda17 | dynamically fetch GPUs for netdata | 2023-08-24 21:56:15 -06:00
Cyberes | 16b986c206 | track nvidia power states through netdata | 2023-08-24 21:36:00 -06:00
Cyberes | 01b8442b95 | update current model when we generate_stats() | 2023-08-24 21:10:00 -06:00
Cyberes | ec3fe2c2ac | show total output tokens on stats | 2023-08-24 20:43:11 -06:00
Cyberes | 9b7bf490a1 | sort keys of stats dict | 2023-08-24 18:59:52 -06:00
Cyberes | 763dd832cc | update home, update readme, calculate estimated wait based on database stats | 2023-08-24 16:47:14 -06:00
Cyberes | 21174750ea | update readme | 2023-08-24 12:19:59 -06:00
Cyberes | afc138c743 | update readme | 2023-08-24 00:09:57 -06:00
Cyberes | f3fe514c11 | add home template | 2023-08-23 23:11:12 -06:00
Cyberes | cdda2c840c | don't test code, don't care | 2023-08-23 22:24:32 -06:00
Cyberes | 1eb8e885d0 | am dumb | 2023-08-23 22:22:38 -06:00
Cyberes | e52acb03a4 | log gen time to DB, also keep generation_elapsed under 3 min | 2023-08-23 22:20:39 -06:00
Cyberes | 3317bd5f1a | allow hiding of more variables | 2023-08-23 22:08:10 -06:00
Cyberes | 11a0b6541f | fix some stuff related to gunicorn workers | 2023-08-23 22:01:06 -06:00
Cyberes | 02c07bbd53 | pycharm deleted import | 2023-08-23 21:34:27 -06:00
Cyberes | de19af900f | add estimated wait time and other time tracking stats | 2023-08-23 21:33:52 -06:00
Cyberes | 6f8b70df54 | add a queue system | 2023-08-23 20:12:38 -06:00
Cyberes | a79d67adbb | do caching ourself on /model | 2023-08-23 16:40:20 -06:00
Cyberes | 64e1b1654f | more cloudflare finicky stuff | 2023-08-23 16:32:13 -06:00
Cyberes | f76d7bbc5d | more caching stuff | 2023-08-23 16:23:24 -06:00
Cyberes | a6b0bb0183 | actually we want 500 | 2023-08-23 16:09:36 -06:00
Cyberes | fd5796ed07 | oops | 2023-08-23 16:08:52 -06:00
Cyberes | 508089ce11 | model info timeout and additional info | 2023-08-23 16:07:43 -06:00
Cyberes | 1f5e2da637 | print fetch model error message | 2023-08-23 16:02:57 -06:00
Cyberes | 806073ee4c | update config | 2023-08-23 15:23:06 -06:00
Cyberes | ba063f7f1b | caching | 2023-08-23 12:40:13 -06:00
Cyberes | 33190e3cfe | fix stats for real | 2023-08-23 01:14:19 -06:00
Cyberes | 3bb27d6900 | track IPs for last min proompters | 2023-08-22 23:37:39 -06:00
Cyberes | 61b9e313d2 | cache again | 2023-08-22 23:14:56 -06:00
Cyberes | 36b793e8a2 | fix proompters_1_min again | 2023-08-22 23:01:09 -06:00
Cyberes | b051f8dd6b | remove caching on stats route | 2023-08-22 22:42:40 -06:00
Cyberes | 9f14b166dd | fix proompters_1_min, other minor changes | 2023-08-22 22:32:29 -06:00
Cyberes | 06ae8adf0d | add backend name to error messages | 2023-08-22 21:14:12 -06:00
Cyberes | a525093c75 | rename, more stats | 2023-08-22 20:42:38 -06:00
Cyberes | a9b7a7a2c7 | display error messages in sillytavern | 2023-08-22 20:28:41 -06:00