Merge cluster to master #3
cyberes
commented 2023-10-27 19:19:13 -06:00
Owner
No description provided.
cyberes
added 163 commits 2023-10-27 19:19:14 -06:00
e7b57cad7b
set up cluster config and basic background workers
624ca74ce5
mvp
114f36e709
functional
e0f86d053a
reorganize to api v2
e6267a7d46
remove vllm from requirements.txt
11a10f85c1
adjust home page
e553fa6e9f
adjust home page fontsize
91ba2fad1b
add proompter stats back in
1151bb5475
adjust stats
166b2316e8
deprecate v1
592eb08cb1
add message for /v1/
7af3dbd76b
add message about settings
61856b4383
adjust message
9235725bdd
adjust message
bc25d92c95
reduce tokens for backend tester
c5b30d985c
adjust jinja template
3ecb7bcf88
adjust jinja template
b10d22ca0d
cache the home page in the background
25ec56a5ef
get streaming working, remove /v2/
d203973e80
fix routes
93d19fb95b
fix exception
2a3ff7e21e
update openai endpoints
f7e9687527
finish openai endpoints
51881ae39d
fix tokenizer
a594729d00
fix keyerror
21da2f6373
fix openai error message
d1c4e68f8b
fix openai models response
b0089859d7
fix ratelimiting
4f226ae38e
handle requests to offline backends
94141b8ecf
fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences
aed5db4968
trying to narrow down error
cd325216e2
test
07d6f6d8e9
test
f6acd67738
t
70126acdf2
test
0f5e22191c
test
62eb0196cc
t
ca1baa4870
test
63c12ea830
fix
32ad97e57c
do default model rather than default backend, adjust moderation endpoint logic and add timeout, exclude system tokens from recent proompters, calculate number of moderators from endpoint concurrent gens, adjust homepage
581a0fec99
fix exception
e16f415749
fix
33b4b8404b
clean up streaming
f88e2362c5
remove some debug prints
67f5df9bb9
fix stats page
1a7f22ec55
adjust again
6dc3529190
show online status on stats page
5f4e4710c1
option to prioritize by parameter count
b76e77a66a
fix exception
4634e36eeb
test
7e3af3599d
test
4deb32bf1c
test
1b21cb69c1
test
95d781725e
t
a15b5465df
c
f3a13fcda8
c
6af5365015
c
7cb624c5f5
f
364b795268
fix
77db34a6a7
g
6bad5b3fa0
t
d0eec88dbd
f
754a4cbdf3
r
5e90fa54d4
handle model offline
d78ef652fc
c
7acaa3c885
g
62d5d43da4
handle backend offline in tokenizer
09fa69e031
fix
6723dd79dc
fix exception
1670594908
fix import error
acf409abfc
fix background logger, add gradio chat example
08df52a4fd
fix exception when not valid model
27e461c76b
test
19e62be3e8
t
979a945466
t
84c1ed8737
t
a53790ee37
fix???
a229b4d6c5
c
01fb619b9b
f
3d0a5cf0a2
t
5a61bdccd4
f
64d7a9edbb
fix
10eb6269b7
t
6be1e9acd3
t
fb8bc05b4c
t
0718f10eb9
t
e07e31df0a
fix
9b819573e8
fix import error
817c454c89
t
46d44f95ac
t
a37b12a221
t
96dd62478f
fix
50992116f5
fix
9befda5acb
c
5540112607
t
0bef14ea55
t
c4cc7bbaa0
f
8df667bc0a
t
67173f30dd
t
e9f6fdf65e
fix streaming?
da20d1807b
actually wait again
ea61766838
fix
e8964fcfd2
fix the queue??
3e5feb9c97
fix stat
467e1893ea
fix issue with null data on openai
ae4d4e5ca9
fix exception
5f7bf4faca
misc changes
18e37a72ae
add model selection to openai endpoint
f4e5b5275d
test
7286e38cb0
t
78114771b0
fix oai exception
1d1c45dc1a
add length penalty param to vllm
69b8c1e35c
fix openai confusion
169e216a38
add background thread to gradio
74cf8f309b
clean up
4e3985e156
fix wrong status code on openai streaming
ca7044bc90
update gradio chat
83f3ba8919
trying to fix workers still processing after backend goes offline
b3f0c4b28f
remove debug print
3ec9b2347f
fix wrong datatype
31ab4188f1
fix issues with queue and streaming
381bdb950f
remove debug print
24aab3cd93
fix streaming disabled
151b3e4769
begin streaming rewrite
2c7773cc4f
get streaming working again
f421436048
add nginx config
19a193b792
increase tokenization chunk size
20047fa0e4
2000 chunk size
1e68e10b62
fix GeneratorExit
21755450a3
test
81baf9616f
revert
806e522d16
don't pickle streaming
70cf6843e5
update requirements
c3c053e071
test
9e3cbc9d2e
fix streaming slowdown?
6f65791795
adjust
2ed0e01db6
background thread
7998cfca87
cleanup
2fed87d340
remove timed-out items from queue
4c2c164ce1
test
90adffaec8
test
be03569165
use backend handler to build parameters when sending test prompt
92e4ecd8a1
refer to queue for tracking IP count rather than separate value
50377eca22
track lag on get_ip_request_count()
56a2ca464b
change print
b9566e9db7
docs and stuff
6e74ce7c28
fix old code in completions
4f5b2dbecb
add tests
0abd4b94fb
track down keyerror
e838f591aa
fix keyerror?
763139c949
fix keyerror
1a15232400
tests: make sure all prompts are the same
f39e976b34
daemon printer: Calculate the queue size the same way it's done on the stats
e236e93a79
clean up a bit
d43f110a14
fix redis cycle and add no reset to daemon
3cf73fec9b
fix a few exceptions when all backends go offline
0771c2325c
fix inference workers quitting when a backend is offline, start adding logging, improve tokenizer error handling
177dabd209
Give some time for the background threads to get themselves ready to go
96ba48affc
make sure to regen stats on startup
b4e01e129d
fix when all offline
563630547a
add robots.txt
28c250385d
add todo
ee44371fdf
Merge branch 'master' into cluster
cyberes
merged commit 0059e7956c into master 2023-10-27 19:19:22 -06:00
cyberes
referenced this issue from a commit 2023-10-27 19:19:23 -06:00
Merge cluster to master (#3)