local-llm-server

Commit Graph

Author	SHA1	Message	Date
Cyberes	79b1e01b61	option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup	2023-09-14 14:05:50 -06:00
Cyberes	e79b206e1a	rename average_tps to estimated_avg_tps	2023-09-14 01:35:25 -06:00
Cyberes	c45e68a8c8	adjust requests timeout, add service file	2023-09-14 01:32:49 -06:00
Cyberes	035c17c48b	reformat info page info_html field	2023-09-13 20:40:55 -06:00
Cyberes	15a0390875	typo	2023-09-13 20:27:20 -06:00
Cyberes	12e894032e	show the openai system prompt	2023-09-13 20:25:56 -06:00
Cyberes	320c3fc710	calculate time stats based on backend url	2023-09-13 12:34:14 -06:00
Cyberes	3d40ed4cfb	shit code	2023-09-13 11:58:38 -06:00
Cyberes	1582625e09	how did this get broken	2023-09-13 11:56:30 -06:00
Cyberes	05a45e6ac6	didnt test anything	2023-09-13 11:51:46 -06:00
Cyberes	84369d6c78	oops	2023-09-13 11:30:22 -06:00
Cyberes	bcedd2ab3d	adjust logging, add more vllm stuff	2023-09-13 11:22:33 -06:00
Cyberes	e053f48fdc	change gpt4 prompt	2023-09-12 16:47:08 -06:00
Cyberes	6ba1fc06d3	reorder homepage	2023-09-12 16:43:15 -06:00
Cyberes	9740df07c7	add openai-compatible backend	2023-09-12 16:40:09 -06:00
Cyberes	1d9f40765e	remove text-generation-inference backend	2023-09-12 13:09:47 -06:00
Cyberes	6152b1bb66	fix invalid param error, add manual model name	2023-09-12 10:30:45 -06:00
Cyberes	5dd95875dd	oops	2023-09-12 01:12:50 -06:00
Cyberes	57ccedcfb9	adjust some things	2023-09-12 01:10:58 -06:00
Cyberes	a84386c311	move import check furthger up	2023-09-12 01:05:03 -06:00
Cyberes	40ac84aa9a	actually we don't want to emulate openai	2023-09-12 01:04:11 -06:00
Cyberes	747d838138	move where the vllm model is set	2023-09-11 21:05:22 -06:00
Cyberes	4c9d543eab	implement vllm backend	2023-09-11 20:47:19 -06:00
Cyberes	c14cc51f09	get working with ooba again, give up on dockerfile	2023-09-11 09:51:01 -06:00
Cyberes	4c49aa525a	still working on dockerfile	2023-09-10 18:11:25 -06:00
Cyberes	170c912d71	reorganize dockerfile	2023-09-06 23:32:43 -06:00
Cyberes	f213b9a3ae	cuda nn	2023-09-06 22:27:48 -06:00
Cyberes	b2b6cdabaa	still working on dockerfile	2023-09-06 12:01:32 -06:00
Cyberes	cc1db8a0ba	more docker stuff	2023-09-04 20:15:45 -06:00
Cyberes	a98d7edeb7	add docker file	2023-08-31 15:59:45 -06:00
Cyberes	2d8812a6cd	fix crash again	2023-08-31 09:31:16 -06:00
Cyberes	bf39b8da63	still having issues	2023-08-31 09:24:37 -06:00
Cyberes	4b32401542	oops wrong data strucutre	2023-08-30 20:24:55 -06:00
Cyberes	47887c3925	missed a spot, clean up json error handling	2023-08-30 20:19:23 -06:00
Cyberes	8c04238e04	disable stream for now	2023-08-30 19:58:59 -06:00
Cyberes	41b8232499	update example config	2023-08-30 18:59:29 -06:00
Cyberes	2816c01902	refactor generation route	2023-08-30 18:53:26 -06:00
Cyberes	e45eafd286	update requirements.txt	2023-08-29 17:57:06 -06:00
Cyberes	bf648f605f	implement streaming for hf-textgen	2023-08-29 17:56:12 -06:00
Cyberes	26b04f364c	remove old code	2023-08-29 15:57:28 -06:00
Cyberes	cef88b866a	fix wrong response status code	2023-08-29 15:52:58 -06:00
Cyberes	f9b9051bad	update weighted_average_column_for_model to account for when there was an error reported, insert null for response tokens when error, correctly parse x-forwarded-for, correctly convert model reported by hf-textgen	2023-08-29 15:46:56 -06:00
Cyberes	da77a24eaa	damn	2023-08-29 14:58:08 -06:00
Cyberes	2d9ec15302	I swear I know what I'm doing	2023-08-29 14:57:49 -06:00
Cyberes	06b52c7648	forgot to remove a snippet	2023-08-29 14:53:03 -06:00
Cyberes	23f3fcf579	log errors to database	2023-08-29 14:48:33 -06:00
Cyberes	b44dfa2471	update info page	2023-08-29 14:00:35 -06:00
Cyberes	ba0bc87434	add HF text-generation-inference backend	2023-08-29 13:46:41 -06:00
Cyberes	6c0e60135d	exclude tokens with priority 0 from simultaneous requests ratelimit	2023-08-28 00:03:25 -06:00
Cyberes	c16d70a24d	limit amount of simultaneous requests an IP can make	2023-08-27 23:48:10 -06:00

284 Commits

All Branches