Cyberes
|
79b1e01b61
|
option to disable streaming, improve timeout on requests to backend, fix error handling, reduce duplicate code, misc other cleanup
|
2023-09-14 14:05:50 -06:00 |
Cyberes
|
e79b206e1a
|
rename average_tps to estimated_avg_tps
|
2023-09-14 01:35:25 -06:00 |
Cyberes
|
c45e68a8c8
|
adjust requests timeout, add service file
|
2023-09-14 01:32:49 -06:00 |
Cyberes
|
035c17c48b
|
reformat info page info_html field
|
2023-09-13 20:40:55 -06:00 |
Cyberes
|
15a0390875
|
typo
|
2023-09-13 20:27:20 -06:00 |
Cyberes
|
12e894032e
|
show the openai system prompt
|
2023-09-13 20:25:56 -06:00 |
Cyberes
|
320c3fc710
|
calculate time stats based on backend url
|
2023-09-13 12:34:14 -06:00 |
Cyberes
|
3d40ed4cfb
|
shit code
|
2023-09-13 11:58:38 -06:00 |
Cyberes
|
1582625e09
|
how did this get broken
|
2023-09-13 11:56:30 -06:00 |
Cyberes
|
05a45e6ac6
|
didn't test anything
|
2023-09-13 11:51:46 -06:00 |
Cyberes
|
84369d6c78
|
oops
|
2023-09-13 11:30:22 -06:00 |
Cyberes
|
bcedd2ab3d
|
adjust logging, add more vllm stuff
|
2023-09-13 11:22:33 -06:00 |
Cyberes
|
e053f48fdc
|
change gpt4 prompt
|
2023-09-12 16:47:08 -06:00 |
Cyberes
|
6ba1fc06d3
|
reorder homepage
|
2023-09-12 16:43:15 -06:00 |
Cyberes
|
9740df07c7
|
add openai-compatible backend
|
2023-09-12 16:40:09 -06:00 |
Cyberes
|
1d9f40765e
|
remove text-generation-inference backend
|
2023-09-12 13:09:47 -06:00 |
Cyberes
|
6152b1bb66
|
fix invalid param error, add manual model name
|
2023-09-12 10:30:45 -06:00 |
Cyberes
|
5dd95875dd
|
oops
|
2023-09-12 01:12:50 -06:00 |
Cyberes
|
57ccedcfb9
|
adjust some things
|
2023-09-12 01:10:58 -06:00 |
Cyberes
|
a84386c311
|
move import check further up
|
2023-09-12 01:05:03 -06:00 |
Cyberes
|
40ac84aa9a
|
actually we don't want to emulate openai
|
2023-09-12 01:04:11 -06:00 |
Cyberes
|
747d838138
|
move where the vllm model is set
|
2023-09-11 21:05:22 -06:00 |
Cyberes
|
4c9d543eab
|
implement vllm backend
|
2023-09-11 20:47:19 -06:00 |
Cyberes
|
c14cc51f09
|
get working with ooba again, give up on dockerfile
|
2023-09-11 09:51:01 -06:00 |
Cyberes
|
4c49aa525a
|
still working on dockerfile
|
2023-09-10 18:11:25 -06:00 |
Cyberes
|
170c912d71
|
reorganize dockerfile
|
2023-09-06 23:32:43 -06:00 |
Cyberes
|
f213b9a3ae
|
cuda nn
|
2023-09-06 22:27:48 -06:00 |
Cyberes
|
b2b6cdabaa
|
still working on dockerfile
|
2023-09-06 12:01:32 -06:00 |
Cyberes
|
cc1db8a0ba
|
more docker stuff
|
2023-09-04 20:15:45 -06:00 |
Cyberes
|
a98d7edeb7
|
add docker file
|
2023-08-31 15:59:45 -06:00 |
Cyberes
|
2d8812a6cd
|
fix crash again
|
2023-08-31 09:31:16 -06:00 |
Cyberes
|
bf39b8da63
|
still having issues
|
2023-08-31 09:24:37 -06:00 |
Cyberes
|
4b32401542
|
oops wrong data structure
|
2023-08-30 20:24:55 -06:00 |
Cyberes
|
47887c3925
|
missed a spot, clean up json error handling
|
2023-08-30 20:19:23 -06:00 |
Cyberes
|
8c04238e04
|
disable stream for now
|
2023-08-30 19:58:59 -06:00 |
Cyberes
|
41b8232499
|
update example config
|
2023-08-30 18:59:29 -06:00 |
Cyberes
|
2816c01902
|
refactor generation route
|
2023-08-30 18:53:26 -06:00 |
Cyberes
|
e45eafd286
|
update requirements.txt
|
2023-08-29 17:57:06 -06:00 |
Cyberes
|
bf648f605f
|
implement streaming for hf-textgen
|
2023-08-29 17:56:12 -06:00 |
Cyberes
|
26b04f364c
|
remove old code
|
2023-08-29 15:57:28 -06:00 |
Cyberes
|
cef88b866a
|
fix wrong response status code
|
2023-08-29 15:52:58 -06:00 |
Cyberes
|
f9b9051bad
|
update weighted_average_column_for_model to account for when there was an error reported, insert null for response tokens when error, correctly parse x-forwarded-for, correctly convert model reported by hf-textgen
|
2023-08-29 15:46:56 -06:00 |
Cyberes
|
da77a24eaa
|
damn
|
2023-08-29 14:58:08 -06:00 |
Cyberes
|
2d9ec15302
|
I swear I know what I'm doing
|
2023-08-29 14:57:49 -06:00 |
Cyberes
|
06b52c7648
|
forgot to remove a snippet
|
2023-08-29 14:53:03 -06:00 |
Cyberes
|
23f3fcf579
|
log errors to database
|
2023-08-29 14:48:33 -06:00 |
Cyberes
|
b44dfa2471
|
update info page
|
2023-08-29 14:00:35 -06:00 |
Cyberes
|
ba0bc87434
|
add HF text-generation-inference backend
|
2023-08-29 13:46:41 -06:00 |
Cyberes
|
6c0e60135d
|
exclude tokens with priority 0 from simultaneous requests ratelimit
|
2023-08-28 00:03:25 -06:00 |
Cyberes
|
c16d70a24d
|
limit amount of simultaneous requests an IP can make
|
2023-08-27 23:48:10 -06:00 |