36 Commits

Author SHA1 Message Date
Cyberes 151b3e4769 begin streaming rewrite 2023-10-16 00:18:05 -06:00
Cyberes 381bdb950f remove debug print 2023-10-15 20:46:32 -06:00
Cyberes 31ab4188f1 fix issues with queue and streaming 2023-10-15 20:45:01 -06:00
Cyberes 3e5feb9c97 fix stat 2023-10-05 21:43:49 -06:00
Cyberes e8964fcfd2 fix the queue?? 2023-10-05 21:37:18 -06:00
Cyberes e9f6fdf65e fix streaming? 2023-10-05 20:14:28 -06:00
Cyberes 67173f30dd t 2023-10-05 19:35:12 -06:00
Cyberes 5540112607 t 2023-10-05 19:09:25 -06:00
Cyberes 19e62be3e8 t 2023-10-05 17:41:01 -06:00
Cyberes 27e461c76b test 2023-10-05 17:00:35 -06:00
Cyberes 364b795268 fix 2023-10-04 12:57:11 -06:00
Cyberes 7cb624c5f5 f 2023-10-04 12:47:59 -06:00
Cyberes 95d781725e t 2023-10-04 12:42:18 -06:00
Cyberes 1b21cb69c1 test 2023-10-04 12:40:29 -06:00
Cyberes 7e3af3599d test 2023-10-04 10:29:58 -06:00
Cyberes 94141b8ecf fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences 2023-10-02 20:53:08 -06:00
Cyberes b0089859d7 fix ratelimiting 2023-10-02 02:05:15 -06:00
Cyberes 114f36e709 functional 2023-09-30 19:41:50 -06:00
Cyberes 624ca74ce5 mvp 2023-09-29 00:09:44 -06:00
Cyberes e7b57cad7b set up cluster config and basic background workers 2023-09-28 18:40:24 -06:00
Cyberes e42f2b6819 fix negative queue on stats 2023-09-28 08:47:39 -06:00
Cyberes 59f2aac8ad rewrite redis usage 2023-09-28 03:44:30 -06:00
Cyberes a4a1d6cce6 fix double logging 2023-09-28 01:34:15 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 43299b32ad clean up background threads 2023-09-27 19:39:04 -06:00
Cyberes 35e9847b27 set inference workers to daemon, add finally to inference worker, hide estimated avg tps 2023-09-27 18:36:51 -06:00
Cyberes 52e6965b5e don't count SYSTEM tokens for recent prompters, fix sql exclude for SYSTEM tokens 2023-09-25 13:00:39 -06:00
Cyberes 3100b0a924 set up queue to work with gunicorn processes, other improvements 2023-09-14 17:38:20 -06:00
Cyberes 4c9d543eab implement vllm backend 2023-09-11 20:47:19 -06:00
Cyberes ba0bc87434 add HF text-generation-inference backend 2023-08-29 13:46:41 -06:00
Cyberes 6c0e60135d exclude tokens with priority 0 from simultaneous requests ratelimit 2023-08-28 00:03:25 -06:00
Cyberes c16d70a24d limit amount of simultaneous requests an IP can make 2023-08-27 23:48:10 -06:00
Cyberes 11a0b6541f fix some stuff related to gunicorn workers 2023-08-23 22:01:06 -06:00
Cyberes de19af900f add estimated wait time and other time tracking stats 2023-08-23 21:33:52 -06:00
Cyberes 0aa52863bc forgot to start workers 2023-08-23 20:33:49 -06:00
Cyberes 6f8b70df54 add a queue system 2023-08-23 20:12:38 -06:00