Cyberes | ff82add09e | redo database connection, add pooling, minor logging changes, other clean up | 2024-05-07 09:48:51 -06:00
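The commit above mentions reintroducing database connection pooling. As an illustrative sketch only (this is not the project's actual code; the `ConnectionPool` class and `make_conn` factory are hypothetical names), a minimal thread-safe pool can be built from the standard library:

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Minimal pool sketch; `make_conn` is any factory returning a
    DB-API-style connection object."""

    def __init__(self, make_conn, size=4):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(make_conn())

    @contextmanager
    def connection(self, timeout=5):
        # Block until a connection is free; always return it to the pool.
        conn = self._pool.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._pool.put(conn)


# Toy usage: sqlite3 in-memory connections standing in for a real MySQL driver.
pool = ConnectionPool(lambda: sqlite3.connect(":memory:", check_same_thread=False), size=2)
with pool.connection() as conn:
    result = conn.execute("SELECT 1").fetchone()[0]
```

The context manager guarantees the connection is returned to the pool even if the caller raises, which is the main failure mode a hand-rolled checkout/checkin pair tends to get wrong.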
Cyberes | 0e7f04ab2d | fix gunicorn logging | 2023-12-21 14:24:50 -07:00
Cyberes | 009039dbd8 | fix server exception | 2023-10-30 14:42:50 -06:00
Cyberes | 0059e7956c | Merge cluster to master (#3) (Co-authored-by: Cyberes <cyberes@evulid.cc>; Reviewed-on: #3) | 2023-10-27 19:19:22 -06:00
Cyberes | e1d3fca6d3 | try to cancel inference if disconnected from client | 2023-09-28 09:55:31 -06:00
Cyberes | a4a1d6cce6 | fix double logging | 2023-09-28 01:34:15 -06:00
Cyberes | e86a5182eb | redo background processes, reorganize server.py | 2023-09-27 23:36:44 -06:00
Cyberes | 097d614a35 | fix duplicate logging from console printer thread | 2023-09-27 21:28:25 -06:00
Cyberes | adc0905c6f | fix imports | 2023-09-27 21:20:08 -06:00
Cyberes | e5fbc9545d | add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer | 2023-09-27 21:15:54 -06:00
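The "queue not decrementing IP requests" fix above points at a common bug class: a per-IP request counter that is incremented on entry but not reliably decremented on error or disconnect. As a hedged sketch (not the project's implementation; `IpConcurrencyLimiter` and its methods are hypothetical names), a context manager makes the decrement unconditional:

```python
import threading
from collections import defaultdict
from contextlib import contextmanager


class IpConcurrencyLimiter:
    """Track simultaneous requests per client IP; the context manager
    guarantees the counter is decremented even if the handler raises."""

    def __init__(self, max_simultaneous=3):
        self.max_simultaneous = max_simultaneous
        self._counts = defaultdict(int)
        self._lock = threading.Lock()

    @contextmanager
    def acquire(self, ip):
        with self._lock:
            if self._counts[ip] >= self.max_simultaneous:
                raise RuntimeError(f"too many simultaneous requests from {ip}")
            self._counts[ip] += 1
        try:
            yield
        finally:
            # Runs on normal exit, exceptions, and client disconnects alike.
            with self._lock:
                self._counts[ip] -= 1


limiter = IpConcurrencyLimiter(max_simultaneous=1)
with limiter.acquire("10.0.0.1"):
    active = limiter._counts["10.0.0.1"]
after = limiter._counts["10.0.0.1"]
```

Note this in-process sketch would not survive multiple gunicorn workers; a shared store such as Redis would be needed for cross-process counts.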
Cyberes | 43299b32ad | clean up background threads | 2023-09-27 19:39:04 -06:00
Cyberes | 35e9847b27 | set inference workers to daemon, add finally to inference worker, hide estimated avg tps | 2023-09-27 18:36:51 -06:00
Cyberes | 74f16afa67 | update dockerfile | 2023-09-27 16:12:36 -06:00
Cyberes | 105b66d5e2 | unify error message handling | 2023-09-27 14:48:47 -06:00
Cyberes | 957a6cd092 | fix error handling | 2023-09-27 14:36:49 -06:00
Cyberes | aba2e5b9c0 | don't use db pooling, add LLM-ST-Errors header to disable formatted errors | 2023-09-26 23:59:22 -06:00
Cyberes | d9bbcc42e6 | more work on openai endpoint | 2023-09-26 22:09:11 -06:00
Cyberes | e0af2ea9c5 | convert to gunicorn | 2023-09-26 13:32:33 -06:00
Cyberes | b44dda7a3a | option to show SYSTEM tokens in stats | 2023-09-25 23:39:50 -06:00
Cyberes | bbdb9c9d55 | try to prevent "### XXX" responses on openai | 2023-09-25 23:14:35 -06:00
Cyberes | 11e84db59c | update database, tokenizer handle null prompt, convert top_p to vllm on openai, actually validate prompt on streaming | 2023-09-25 22:32:48 -06:00
Cyberes | 2d299dbae5 | openai_force_no_hashes | 2023-09-25 22:01:57 -06:00
Cyberes | 8240a1ebbb | fix background log not doing anything | 2023-09-25 18:18:29 -06:00
Cyberes | 289b40181c | forgot to test all config possibilities | 2023-09-25 17:23:43 -06:00
Cyberes | 30282479a0 | fix flask exception | 2023-09-25 17:22:28 -06:00
Cyberes | 135bd743bb | fix homepage slowness, fix incorrect 24 hr prompters, fix redis wrapper | 2023-09-25 17:20:21 -06:00
Cyberes | bbe5d5a8fe | improve openai endpoint, exclude system tokens in more places | 2023-09-25 09:32:23 -06:00
Cyberes | 6459a1c91b | allow setting simultaneous IP limit per-token, fix token use tracker, fix tokens on streaming | 2023-09-25 00:55:20 -06:00
Cyberes | 320f51e01c | further align openai endpoint with expected responses | 2023-09-24 21:45:30 -06:00
Cyberes | 8d6b2ce49c | minor changes, add admin token auth system, add route to get backend info | 2023-09-24 15:54:35 -06:00
Cyberes | cb99c3490e | rewrite tokenizer, restructure validation | 2023-09-24 13:02:30 -06:00
Cyberes | 62412f4873 | add config setting for hostname | 2023-09-23 23:24:08 -06:00
Cyberes | 84a1fcfdd8 | don't store host if it's an IP | 2023-09-23 23:14:22 -06:00
Cyberes | fab7b7ccdd | active gen workers wait | 2023-09-23 21:17:13 -06:00
Cyberes | 03e3ec5490 | port to mysql, use vllm tokenizer endpoint | 2023-09-20 20:30:31 -06:00
Cyberes | 3c1254d3bf | cache stats in background | 2023-09-17 18:55:36 -06:00
Cyberes | 3100b0a924 | set up queue to work with gunicorn processes, other improvements | 2023-09-14 17:38:20 -06:00
Cyberes | a89295193f | add moderation endpoint to openai api, update config | 2023-09-14 15:07:17 -06:00
Cyberes | 507327db49 | lower caching of home page | 2023-09-14 14:30:01 -06:00
Cyberes | 79b1e01b61 | option to disable streaming, improve timeout on requests to backend, fix error handling, reduce duplicate code, misc other cleanup | 2023-09-14 14:05:50 -06:00
Cyberes | 035c17c48b | reformat info page info_html field | 2023-09-13 20:40:55 -06:00
Cyberes | 12e894032e | show the openai system prompt | 2023-09-13 20:25:56 -06:00
Cyberes | 9740df07c7 | add openai-compatible backend | 2023-09-12 16:40:09 -06:00
Cyberes | 1d9f40765e | remove text-generation-inference backend | 2023-09-12 13:09:47 -06:00
Cyberes | 6152b1bb66 | fix invalid param error, add manual model name | 2023-09-12 10:30:45 -06:00
Cyberes | 57ccedcfb9 | adjust some things | 2023-09-12 01:10:58 -06:00
Cyberes | a84386c311 | move import check further up | 2023-09-12 01:05:03 -06:00
Cyberes | 40ac84aa9a | actually we don't want to emulate openai | 2023-09-12 01:04:11 -06:00
Cyberes | 4c9d543eab | implement vllm backend | 2023-09-11 20:47:19 -06:00
Cyberes | c14cc51f09 | get working with ooba again, give up on dockerfile | 2023-09-11 09:51:01 -06:00