Commit Graph

19 Commits

Author SHA1 Message Date
Cyberes ee9a0d4858 redo config 2024-05-07 12:20:53 -06:00
Cyberes ff82add09e redo database connection, add pooling, minor logging changes, other clean up 2024-05-07 09:48:51 -06:00
Cyberes 0059e7956c Merge cluster to master (#3) (Co-authored-by: Cyberes <cyberes@evulid.cc>; Reviewed-on: #3) 2023-10-27 19:19:22 -06:00
Cyberes e42f2b6819 fix negative queue on stats 2023-09-28 08:47:39 -06:00
Cyberes 59f2aac8ad rewrite redis usage 2023-09-28 03:44:30 -06:00
Cyberes a4a1d6cce6 fix double logging 2023-09-28 01:34:15 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 43299b32ad clean up background threads 2023-09-27 19:39:04 -06:00
Cyberes 35e9847b27 set inference workers to daemon, add finally to inference worker, hide estimated avg tps 2023-09-27 18:36:51 -06:00
Cyberes 52e6965b5e don't count SYSTEM tokens for recent prompters, fix sql exclude for SYSTEM tokens 2023-09-25 13:00:39 -06:00
Cyberes 3100b0a924 set up queue to work with gunicorn processes, other improvements 2023-09-14 17:38:20 -06:00
Cyberes 4c9d543eab implement vllm backend 2023-09-11 20:47:19 -06:00
Cyberes ba0bc87434 add HF text-generation-inference backend 2023-08-29 13:46:41 -06:00
Cyberes 6c0e60135d exclude tokens with priority 0 from simultaneous requests ratelimit 2023-08-28 00:03:25 -06:00
Cyberes c16d70a24d limit amount of simultaneous requests an IP can make 2023-08-27 23:48:10 -06:00
Cyberes 11a0b6541f fix some stuff related to gunicorn workers 2023-08-23 22:01:06 -06:00
Cyberes de19af900f add estimated wait time and other time tracking stats 2023-08-23 21:33:52 -06:00
Cyberes 0aa52863bc forgot to start workers 2023-08-23 20:33:49 -06:00
Cyberes 6f8b70df54 add a queue system 2023-08-23 20:12:38 -06:00