Commit Graph

52 Commits

Author SHA1 Message Date
Cyberes 18e37a72ae add model selection to openai endpoint 2023-10-09 23:51:26 -06:00
Cyberes 2a3ff7e21e update openai endpoints 2023-10-01 14:15:01 -06:00
Cyberes 25ec56a5ef get streaming working, remove /v2/ 2023-10-01 00:20:00 -06:00
Cyberes e0f86d053a reorganize to api v2 2023-09-30 19:42:41 -06:00
Cyberes 114f36e709 functional 2023-09-30 19:41:50 -06:00
Cyberes 105b66d5e2 unify error message handling 2023-09-27 14:48:47 -06:00
Cyberes aba2e5b9c0 don't use db pooling, add LLM-ST-Errors header to disable formatted errors 2023-09-26 23:59:22 -06:00
Cyberes d9bbcc42e6 more work on openai endpoint 2023-09-26 22:09:11 -06:00
Cyberes 0eb901cb52 don't log entire request on failure 2023-09-26 12:32:19 -06:00
Cyberes 320f51e01c further align openai endpoint with expected responses 2023-09-24 21:45:30 -06:00
Cyberes a89295193f add moderation endpoint to openai api, update config 2023-09-14 15:07:17 -06:00
Cyberes 8f4f17166e adjust 2023-09-14 14:36:22 -06:00
Cyberes 93a344f4c5 check if the backend crapped out, print some more stuff 2023-09-14 14:26:25 -06:00
Cyberes 9740df07c7 add openai-compatible backend 2023-09-12 16:40:09 -06:00
Cyberes 6152b1bb66 fix invalid param error, add manual model name 2023-09-12 10:30:45 -06:00
Cyberes 40ac84aa9a actually we don't want to emulate openai 2023-09-12 01:04:11 -06:00
Cyberes 4c9d543eab implement vllm backend 2023-09-11 20:47:19 -06:00
Cyberes 4b32401542 oops wrong data structure 2023-08-30 20:24:55 -06:00
Cyberes 47887c3925 missed a spot, clean up json error handling 2023-08-30 20:19:23 -06:00
Cyberes 2816c01902 refactor generation route 2023-08-30 18:53:26 -06:00
Cyberes 26b04f364c remove old code 2023-08-29 15:57:28 -06:00
Cyberes cef88b866a fix wrong response status code 2023-08-29 15:52:58 -06:00
Cyberes f9b9051bad update weighted_average_column_for_model to account for when there was an error reported, insert null for response tokens when error, correctly parse x-forwarded-for, correctly convert model reported by hf-textgen 2023-08-29 15:46:56 -06:00
Cyberes 2d9ec15302 I swear I know what I'm doing 2023-08-29 14:57:49 -06:00
Cyberes 06b52c7648 forgot to remove a snippet 2023-08-29 14:53:03 -06:00
Cyberes 23f3fcf579 log errors to database 2023-08-29 14:48:33 -06:00
Cyberes ba0bc87434 add HF text-generation-inference backend 2023-08-29 13:46:41 -06:00
Cyberes 6c0e60135d exclude tokens with priority 0 from simultaneous requests ratelimit 2023-08-28 00:03:25 -06:00
Cyberes c16d70a24d limit amount of simultaneous requests an IP can make 2023-08-27 23:48:10 -06:00
Cyberes 0e6aadf5e1 fix missing empty strings logged when errors 2023-08-25 13:44:41 -06:00
Cyberes 839bb115c6 reorganize stats, add 24 hr proompters, adjust logging when error 2023-08-25 12:20:16 -06:00
Cyberes 26a0a13aa7 actually we want this 2023-08-24 23:57:46 -06:00
Cyberes 0b4da89de2 fix exception 2023-08-24 23:57:25 -06:00
Cyberes 25e3255c9b fix issue with tokenizer 2023-08-24 23:13:07 -06:00
Cyberes 77fe1e237e also handle when no response 2023-08-24 22:53:54 -06:00
Cyberes e5aca7b09d adjust netdata json, don't log error messages during generation 2023-08-24 22:53:06 -06:00
Cyberes afc138c743 update readme 2023-08-24 00:09:57 -06:00
Cyberes cdda2c840c don't test code, don't care 2023-08-23 22:24:32 -06:00
Cyberes 1eb8e885d0 am dumb 2023-08-23 22:22:38 -06:00
Cyberes e52acb03a4 log gen time to DB, also keep generation_elapsed under 3 min 2023-08-23 22:20:39 -06:00
Cyberes 11a0b6541f fix some stuff related to gunicorn workers 2023-08-23 22:01:06 -06:00
Cyberes de19af900f add estimated wait time and other time tracking stats 2023-08-23 21:33:52 -06:00
Cyberes 6f8b70df54 add a queue system 2023-08-23 20:12:38 -06:00
Cyberes 33190e3cfe fix stats for real 2023-08-23 01:14:19 -06:00
Cyberes 3bb27d6900 track IPs for last min proompters 2023-08-22 23:37:39 -06:00
Cyberes 9f14b166dd fix proompters_1_min, other minor changes 2023-08-22 22:32:29 -06:00
Cyberes 06ae8adf0d add backend name to error messages 2023-08-22 21:14:12 -06:00
Cyberes a525093c75 rename, more stats 2023-08-22 20:42:38 -06:00
Cyberes a9b7a7a2c7 display error messages in sillytavern 2023-08-22 20:28:41 -06:00
Cyberes 0d32db2dbd prototype hf-textgen and adjust logging 2023-08-22 19:58:31 -06:00