Cyberes | 2fed87d340 | remove timed-out items from queue | 2023-10-17 11:46:39 -06:00
Cyberes | 2c7773cc4f | get streaming working again | 2023-10-16 16:22:52 -06:00
Cyberes | 83f3ba8919 | trying to fix workers still processing after backend goes offline | 2023-10-15 15:11:37 -06:00
Cyberes | 18e37a72ae | add model selection to openai endpoint | 2023-10-09 23:51:26 -06:00
Cyberes | ae4d4e5ca9 | fix exception | 2023-10-09 10:31:35 -06:00
Cyberes | e8964fcfd2 | fix the queue?? | 2023-10-05 21:37:18 -06:00
Cyberes | 08df52a4fd | fix exception when not valid model | 2023-10-05 12:28:00 -06:00
Cyberes | acf409abfc | fix background logger, add gradio chat example | 2023-10-04 19:24:47 -06:00
Cyberes | 1670594908 | fix import error | 2023-10-04 16:29:19 -06:00
Cyberes | 09fa69e031 | fix | 2023-10-04 13:37:39 -06:00
Cyberes | d78ef652fc | c | 2023-10-04 13:21:43 -06:00
Cyberes | 5e90fa54d4 | handle model offline | 2023-10-04 13:18:47 -06:00
Cyberes | 77db34a6a7 | g | 2023-10-04 12:59:19 -06:00
Cyberes | a15b5465df | c | 2023-10-04 12:44:09 -06:00
Cyberes | e16f415749 | fix | 2023-10-03 13:49:00 -06:00
Cyberes | 32ad97e57c | do default model rather than default backend, adjust moderation endpoint logic and add timeout, exclude system tokens from recent proompters, calculate number of moderators from endpoint concurrent gens, adjust homepage | 2023-10-03 13:40:08 -06:00
Cyberes | 94141b8ecf | fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences | 2023-10-02 20:53:08 -06:00
Cyberes | b0089859d7 | fix ratelimiting | 2023-10-02 02:05:15 -06:00
Cyberes | 21da2f6373 | fix openai error message | 2023-10-01 22:58:08 -06:00
Cyberes | f7e9687527 | finish openai endpoints | 2023-10-01 16:04:53 -06:00
Cyberes | 2a3ff7e21e | update openai endpoints | 2023-10-01 14:15:01 -06:00
Cyberes | 1151bb5475 | adjust stats | 2023-09-30 20:42:48 -06:00
Cyberes | 114f36e709 | functional | 2023-09-30 19:41:50 -06:00
Cyberes | 624ca74ce5 | mvp | 2023-09-29 00:09:44 -06:00
Cyberes | e7b57cad7b | set up cluster config and basic background workers | 2023-09-28 18:40:24 -06:00
Cyberes | 59f2aac8ad | rewrite redis usage | 2023-09-28 03:44:30 -06:00
Cyberes | a4a1d6cce6 | fix double logging | 2023-09-28 01:34:15 -06:00
Cyberes | e5fbc9545d | add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer | 2023-09-27 21:15:54 -06:00
Cyberes | 43299b32ad | clean up background threads | 2023-09-27 19:39:04 -06:00
Cyberes | 105b66d5e2 | unify error message handling | 2023-09-27 14:48:47 -06:00
Cyberes | 957a6cd092 | fix error handling | 2023-09-27 14:36:49 -06:00
Cyberes | aba2e5b9c0 | don't use db pooling, add LLM-ST-Errors header to disable formatted errors | 2023-09-26 23:59:22 -06:00
Cyberes | 048e5a8060 | fix API key handling | 2023-09-26 22:49:53 -06:00
Cyberes | d9bbcc42e6 | more work on openai endpoint | 2023-09-26 22:09:11 -06:00
Cyberes | 52e6965b5e | don't count SYSTEM tokens for recent prompters, fix sql exclude for SYSTEM tokens | 2023-09-25 13:00:39 -06:00
Cyberes | bbe5d5a8fe | improve openai endpoint, exclude system tokens more places | 2023-09-25 09:32:23 -06:00
Cyberes | 6459a1c91b | allow setting simultaneous IP limit per-token, fix token use tracker, fix tokens on streaming | 2023-09-25 00:55:20 -06:00
Cyberes | cb99c3490e | rewrite tokenizer, restructure validation | 2023-09-24 13:02:30 -06:00
Cyberes | 62412f4873 | add config setting for hostname | 2023-09-23 23:24:08 -06:00
Cyberes | 84a1fcfdd8 | don't store host if it's an IP | 2023-09-23 23:14:22 -06:00
Cyberes | 76a1428ba0 | implement streaming for vllm | 2023-09-23 17:57:23 -06:00
Cyberes | 8593198216 | close mysql cursor | 2023-09-20 21:19:26 -06:00
Cyberes | 03e3ec5490 | port to mysql, use vllm tokenizer endpoint | 2023-09-20 20:30:31 -06:00
Cyberes | eb3179cfff | fix recent proompters to work with gunicorn | 2023-09-17 19:06:53 -06:00
Cyberes | 3c1254d3bf | cache stats in background | 2023-09-17 18:55:36 -06:00
Cyberes | 77edbe779c | actually validate prompt length lol | 2023-09-14 18:31:13 -06:00
Cyberes | 3100b0a924 | set up queue to work with gunicorn processes, other improvements | 2023-09-14 17:38:20 -06:00
Cyberes | 93a344f4c5 | check if the backend crapped out, print some more stuff | 2023-09-14 14:26:25 -06:00
Cyberes | 79b1e01b61 | option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup | 2023-09-14 14:05:50 -06:00
Cyberes | bcedd2ab3d | adjust logging, add more vllm stuff | 2023-09-13 11:22:33 -06:00