local-llm-server

Commit Graph

Author	SHA1	Message	Date
Cyberes	d9bbcc42e6	more work on openai endpoint	2023-09-26 22:09:11 -06:00
Cyberes	e0af2ea9c5	convert to gunicorn	2023-09-26 13:32:33 -06:00
Cyberes	11e84db59c	update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming,	2023-09-25 22:32:48 -06:00
Cyberes	8240a1ebbb	fix background log not doing anything	2023-09-25 18:18:29 -06:00
Cyberes	1646a00987	implement streaming on openai, improve streaming, run DB logging in background thread	2023-09-25 12:30:40 -06:00
Cyberes	320f51e01c	further align openai endpoint with expected responses	2023-09-24 21:45:30 -06:00
Cyberes	cb99c3490e	rewrite tokenizer, restructure validation	2023-09-24 13:02:30 -06:00
Cyberes	76a1428ba0	implement streaming for vllm	2023-09-23 17:57:23 -06:00
Cyberes	81452ec643	adjust vllm info	2023-09-21 20:13:29 -06:00
Cyberes	03e3ec5490	port to mysql, use vllm tokenizer endpoint	2023-09-20 20:30:31 -06:00
Cyberes	354ad8192d	fix division by 0, prettify /stats json, add js var to home	2023-09-16 17:37:43 -06:00
Cyberes	77edbe779c	actually validate prompt length lol	2023-09-14 18:31:13 -06:00
Cyberes	3100b0a924	set up queue to work with gunicorn processes, other improvements	2023-09-14 17:38:20 -06:00
Cyberes	79b1e01b61	option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup	2023-09-14 14:05:50 -06:00
Cyberes	c45e68a8c8	adjust requests timeout, add service file	2023-09-14 01:32:49 -06:00
Cyberes	05a45e6ac6	didnt test anything	2023-09-13 11:51:46 -06:00
Cyberes	bcedd2ab3d	adjust logging, add more vllm stuff	2023-09-13 11:22:33 -06:00
Cyberes	9740df07c7	add openai-compatible backend	2023-09-12 16:40:09 -06:00
Cyberes	1d9f40765e	remove text-generation-inference backend	2023-09-12 13:09:47 -06:00
Cyberes	6152b1bb66	fix invalid param error, add manual model name	2023-09-12 10:30:45 -06:00
Cyberes	40ac84aa9a	actually we don't want to emulate openai	2023-09-12 01:04:11 -06:00
Cyberes	747d838138	move where the vllm model is set	2023-09-11 21:05:22 -06:00
Cyberes	4c9d543eab	implement vllm backend	2023-09-11 20:47:19 -06:00
Cyberes	c14cc51f09	get working with ooba again, give up on dockerfile	2023-09-11 09:51:01 -06:00
Cyberes	bf39b8da63	still having issues	2023-08-31 09:24:37 -06:00
Cyberes	47887c3925	missed a spot, clean up json error handling	2023-08-30 20:19:23 -06:00
Cyberes	2816c01902	refactor generation route	2023-08-30 18:53:26 -06:00
Cyberes	f9b9051bad	update weighted_average_column_for_model to account for when there was an error reported, insert null for response tokens when error, correctly parse x-forwarded-for, correctly convert model reported by hf-textgen	2023-08-29 15:46:56 -06:00
Cyberes	23f3fcf579	log errors to database	2023-08-29 14:48:33 -06:00
Cyberes	b44dfa2471	update info page	2023-08-29 14:00:35 -06:00
Cyberes	ba0bc87434	add HF text-generation-inference backend	2023-08-29 13:46:41 -06:00
Cyberes	0aa52863bc	forgot to start workers	2023-08-23 20:33:49 -06:00
Cyberes	6f8b70df54	add a queue system	2023-08-23 20:12:38 -06:00
Cyberes	7eb930fafd	crap	2023-08-23 16:12:25 -06:00
Cyberes	9fc674878d	allow disabling ssl verification	2023-08-23 16:11:32 -06:00
Cyberes	508089ce11	model info timeout and additional info	2023-08-23 16:07:43 -06:00
Cyberes	1f5e2da637	print fetch model error message	2023-08-23 16:02:57 -06:00
Cyberes	a525093c75	rename, more stats	2023-08-22 20:42:38 -06:00
Cyberes	0d32db2dbd	prototype hf-textgen and adjust logging	2023-08-22 19:58:31 -06:00
Cyberes	a59dcea2da	more proxy stats	2023-08-22 16:50:49 -06:00
Cyberes	8cbf643fd3	MVP	2023-08-21 21:28:52 -06:00

41 Commits