Cyberes | 151b3e4769 | begin streaming rewrite | 2023-10-16 00:18:05 -06:00
Cyberes | 381bdb950f | remove debug print | 2023-10-15 20:46:32 -06:00
Cyberes | 31ab4188f1 | fix issues with queue and streaming | 2023-10-15 20:45:01 -06:00
Cyberes | 3e5feb9c97 | fix stat | 2023-10-05 21:43:49 -06:00
Cyberes | e8964fcfd2 | fix the queue?? | 2023-10-05 21:37:18 -06:00
Cyberes | e9f6fdf65e | fix streaming? | 2023-10-05 20:14:28 -06:00
Cyberes | 67173f30dd | t | 2023-10-05 19:35:12 -06:00
Cyberes | 5540112607 | t | 2023-10-05 19:09:25 -06:00
Cyberes | 19e62be3e8 | t | 2023-10-05 17:41:01 -06:00
Cyberes | 27e461c76b | test | 2023-10-05 17:00:35 -06:00
Cyberes | 364b795268 | fix | 2023-10-04 12:57:11 -06:00
Cyberes | 7cb624c5f5 | f | 2023-10-04 12:47:59 -06:00
Cyberes | 95d781725e | t | 2023-10-04 12:42:18 -06:00
Cyberes | 1b21cb69c1 | test | 2023-10-04 12:40:29 -06:00
Cyberes | 7e3af3599d | test | 2023-10-04 10:29:58 -06:00
Cyberes | 94141b8ecf | fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences | 2023-10-02 20:53:08 -06:00
Cyberes | b0089859d7 | fix ratelimiting | 2023-10-02 02:05:15 -06:00
Cyberes | 114f36e709 | functional | 2023-09-30 19:41:50 -06:00
Cyberes | 624ca74ce5 | mvp | 2023-09-29 00:09:44 -06:00
Cyberes | e7b57cad7b | set up cluster config and basic background workers | 2023-09-28 18:40:24 -06:00
Cyberes | e42f2b6819 | fix negative queue on stats | 2023-09-28 08:47:39 -06:00
Cyberes | 59f2aac8ad | rewrite redis usage | 2023-09-28 03:44:30 -06:00
Cyberes | a4a1d6cce6 | fix double logging | 2023-09-28 01:34:15 -06:00
Cyberes | e5fbc9545d | add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer | 2023-09-27 21:15:54 -06:00
Cyberes | 43299b32ad | clean up background threads | 2023-09-27 19:39:04 -06:00
Cyberes | 35e9847b27 | set inference workers to daemon, add finally to inference worker, hide estimated avg tps | 2023-09-27 18:36:51 -06:00
Cyberes | 52e6965b5e | don't count SYSTEM tokens for recent prompters, fix sql exclude for SYSTEM tokens | 2023-09-25 13:00:39 -06:00
Cyberes | 3100b0a924 | set up queue to work with gunicorn processes, other improvements | 2023-09-14 17:38:20 -06:00
Cyberes | 4c9d543eab | implement vllm backend | 2023-09-11 20:47:19 -06:00
Cyberes | ba0bc87434 | add HF text-generation-inference backend | 2023-08-29 13:46:41 -06:00
Cyberes | 6c0e60135d | exclude tokens with priority 0 from simultaneous requests ratelimit | 2023-08-28 00:03:25 -06:00
Cyberes | c16d70a24d | limit amount of simultaneous requests an IP can make | 2023-08-27 23:48:10 -06:00
Cyberes | 11a0b6541f | fix some stuff related to gunicorn workers | 2023-08-23 22:01:06 -06:00
Cyberes | de19af900f | add estimated wait time and other time tracking stats | 2023-08-23 21:33:52 -06:00
Cyberes | 0aa52863bc | forgot to start workers | 2023-08-23 20:33:49 -06:00
Cyberes | 6f8b70df54 | add a queue system | 2023-08-23 20:12:38 -06:00