local-llm-server

Commit Graph

Author	SHA1	Message	Date
Cyberes	cb99c3490e	rewrite tokenizer, restructure validation	2023-09-24 13:02:30 -06:00
Cyberes	62412f4873	add config setting for hostname	2023-09-23 23:24:08 -06:00
Cyberes	84a1fcfdd8	don't store host if it's an IP	2023-09-23 23:14:22 -06:00
Cyberes	76a1428ba0	implement streaming for vllm	2023-09-23 17:57:23 -06:00
Cyberes	8593198216	close mysql cursor	2023-09-20 21:19:26 -06:00
Cyberes	03e3ec5490	port to mysql, use vllm tokenizer endpoint	2023-09-20 20:30:31 -06:00
Cyberes	eb3179cfff	fix recent proompters to work with gunicorn	2023-09-17 19:06:53 -06:00
Cyberes	3c1254d3bf	cache stats in background	2023-09-17 18:55:36 -06:00
Cyberes	77edbe779c	actually validate prompt length lol	2023-09-14 18:31:13 -06:00
Cyberes	3100b0a924	set up queue to work with gunicorn processes, other improvements	2023-09-14 17:38:20 -06:00
Cyberes	93a344f4c5	check if the backend crapped out, print some more stuff	2023-09-14 14:26:25 -06:00
Cyberes	79b1e01b61	option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup	2023-09-14 14:05:50 -06:00
Cyberes	bcedd2ab3d	adjust logging, add more vllm stuff	2023-09-13 11:22:33 -06:00
Cyberes	9740df07c7	add openai-compatible backend	2023-09-12 16:40:09 -06:00
Cyberes	1d9f40765e	remove text-generation-inference backend	2023-09-12 13:09:47 -06:00
Cyberes	6152b1bb66	fix invalid param error, add manual model name	2023-09-12 10:30:45 -06:00
Cyberes	40ac84aa9a	actually we don't want to emulate openai	2023-09-12 01:04:11 -06:00
Cyberes	747d838138	move where the vllm model is set	2023-09-11 21:05:22 -06:00
Cyberes	4c9d543eab	implement vllm backend	2023-09-11 20:47:19 -06:00
Cyberes	2d8812a6cd	fix crash again	2023-08-31 09:31:16 -06:00
Cyberes	47887c3925	missed a spot, clean up json error handling	2023-08-30 20:19:23 -06:00
Cyberes	2816c01902	refactor generation route	2023-08-30 18:53:26 -06:00

22 Commits