Commit Graph

163 Commits

Author SHA1 Message Date
Cyberes 747d838138 move where the vllm model is set 2023-09-11 21:05:22 -06:00
Cyberes 4c9d543eab implement vllm backend 2023-09-11 20:47:19 -06:00
Cyberes c14cc51f09 get working with ooba again, give up on dockerfile 2023-09-11 09:51:01 -06:00
Cyberes 4c49aa525a still working on dockerfile 2023-09-10 18:11:25 -06:00
Cyberes 170c912d71 reorganize dockerfile 2023-09-06 23:32:43 -06:00
Cyberes f213b9a3ae cuda nn 2023-09-06 22:27:48 -06:00
Cyberes b2b6cdabaa still working on dockerfile 2023-09-06 12:01:32 -06:00
Cyberes cc1db8a0ba more docker stuff 2023-09-04 20:15:45 -06:00
Cyberes a98d7edeb7 add docker file 2023-08-31 15:59:45 -06:00
Cyberes 2d8812a6cd fix crash again 2023-08-31 09:31:16 -06:00
Cyberes bf39b8da63 still having issues 2023-08-31 09:24:37 -06:00
Cyberes 4b32401542 oops wrong data structure 2023-08-30 20:24:55 -06:00
Cyberes 47887c3925 missed a spot, clean up json error handling 2023-08-30 20:19:23 -06:00
Cyberes 8c04238e04 disable stream for now 2023-08-30 19:58:59 -06:00
Cyberes 41b8232499 update example config 2023-08-30 18:59:29 -06:00
Cyberes 2816c01902 refactor generation route 2023-08-30 18:53:26 -06:00
Cyberes e45eafd286 update requirements.txt 2023-08-29 17:57:06 -06:00
Cyberes bf648f605f implement streaming for hf-textgen 2023-08-29 17:56:12 -06:00
Cyberes 26b04f364c remove old code 2023-08-29 15:57:28 -06:00
Cyberes cef88b866a fix wrong response status code 2023-08-29 15:52:58 -06:00
Cyberes f9b9051bad update weighted_average_column_for_model to account for when there was an error reported, insert null for response tokens when error, correctly parse x-forwarded-for, correctly convert model reported by hf-textgen 2023-08-29 15:46:56 -06:00
Cyberes da77a24eaa damn 2023-08-29 14:58:08 -06:00
Cyberes 2d9ec15302 I swear I know what I'm doing 2023-08-29 14:57:49 -06:00
Cyberes 06b52c7648 forgot to remove a snippet 2023-08-29 14:53:03 -06:00
Cyberes 23f3fcf579 log errors to database 2023-08-29 14:48:33 -06:00
Cyberes b44dfa2471 update info page 2023-08-29 14:00:35 -06:00
Cyberes ba0bc87434 add HF text-generation-inference backend 2023-08-29 13:46:41 -06:00
Cyberes 6c0e60135d exclude tokens with priority 0 from simultaneous requests ratelimit 2023-08-28 00:03:25 -06:00
Cyberes c16d70a24d limit amount of simultaneous requests an IP can make 2023-08-27 23:48:10 -06:00
Cyberes 1a4cb5f786 reorganize stats page again 2023-08-27 22:24:44 -06:00
Cyberes f43336c92c adjust estimated wait time calculations 2023-08-27 22:17:21 -06:00
Cyberes 441a870e85 calculate weighted average for stat tracking 2023-08-27 19:58:04 -06:00
Cyberes 6a09ffc8a4 log model used in request so we can pull the correct averages when we change models 2023-08-26 00:30:59 -06:00
Cyberes 0150bbf8dd actually we do know how long it will take 2023-08-25 15:17:01 -06:00
Cyberes d64152587c reorganize nvidia stats 2023-08-25 15:02:40 -06:00
Cyberes c6edeb2b70 There will be a wait if the queue is empty but prompts are processing 2023-08-25 13:53:23 -06:00
Cyberes 0e6aadf5e1 fix missing empty strings logged when errors 2023-08-25 13:44:41 -06:00
Cyberes 2543db87e8 fix database error 2023-08-25 12:25:30 -06:00
Cyberes 839bb115c6 reorganize stats, add 24 hr proompters, adjust logging when error 2023-08-25 12:20:16 -06:00
Cyberes 26a0a13aa7 actually we want this 2023-08-24 23:57:46 -06:00
Cyberes 0b4da89de2 fix exception 2023-08-24 23:57:25 -06:00
Cyberes 25e3255c9b fix issue with tokenizer 2023-08-24 23:13:07 -06:00
Cyberes 77fe1e237e also handle when no response 2023-08-24 22:53:54 -06:00
Cyberes e5aca7b09d adjust netdata json, don't log error messages during generation 2023-08-24 22:53:06 -06:00
Cyberes 06173f900e remove debug print 2023-08-24 22:02:15 -06:00
Cyberes 0230ddda17 dynamically fetch GPUs for netdata 2023-08-24 21:56:15 -06:00
Cyberes 16b986c206 track nvidia power states through netdata 2023-08-24 21:36:00 -06:00
Cyberes 01b8442b95 update current model when we generate_stats() 2023-08-24 21:10:00 -06:00
Cyberes ec3fe2c2ac show total output tokens on stats 2023-08-24 20:43:11 -06:00
Cyberes 9b7bf490a1 sort keys of stats dict 2023-08-24 18:59:52 -06:00