Commit Graph

113 Commits

Author SHA1 Message Date
Cyberes 28c250385d add todo 2023-10-27 19:00:49 -06:00
Cyberes 563630547a add robots.txt 2023-10-23 17:32:33 -06:00
Cyberes b4e01e129d fix when all offline 2023-10-23 17:28:59 -06:00
Cyberes 3cf73fec9b fix a few exceptions when all backends go offline 2023-10-23 15:22:57 -06:00
Cyberes 92e4ecd8a1 refer to queue for tracking IP count rather than separate value 2023-10-18 09:03:10 -06:00
Cyberes 2c7773cc4f get streaming working again 2023-10-16 16:22:52 -06:00
Cyberes 83f3ba8919 trying to fix workers still processing after backend goes offline 2023-10-15 15:11:37 -06:00
Cyberes 18e37a72ae add model selection to openai endpoint 2023-10-09 23:51:26 -06:00
Cyberes 8df667bc0a t 2023-10-05 19:25:08 -06:00
Cyberes 64d7a9edbb fix 2023-10-05 18:09:24 -06:00
Cyberes 3d0a5cf0a2 t 2023-10-05 18:06:36 -06:00
Cyberes 01fb619b9b f 2023-10-05 18:05:31 -06:00
Cyberes 1670594908 fix import error 2023-10-04 16:29:19 -06:00
Cyberes 6723dd79dc fix exception 2023-10-04 16:04:03 -06:00
Cyberes 95d781725e t 2023-10-04 12:42:18 -06:00
Cyberes 4deb32bf1c test 2023-10-04 10:32:11 -06:00
Cyberes 5f4e4710c1 option to prioritize by parameter count 2023-10-04 10:19:44 -06:00
Cyberes 67f5df9bb9 fix stats page 2023-10-03 20:42:53 -06:00
Cyberes 33b4b8404b clean up streaming 2023-10-03 14:10:50 -06:00
Cyberes 32ad97e57c do default model rather than default backend, adjust moderation endpoint logic and add timeout, exclude system tokens from recent proompters, calculate number of moderators from endpoint concurrent gens, adjust homepage 2023-10-03 13:40:08 -06:00
Cyberes 94141b8ecf fix processing not being decremented on streaming, fix confusion over queue, adjust stop sequences 2023-10-02 20:53:08 -06:00
Cyberes 51881ae39d fix tokenizer 2023-10-01 17:19:34 -06:00
Cyberes 2a3ff7e21e update openai endpoints 2023-10-01 14:15:01 -06:00
Cyberes 93d19fb95b fix exception 2023-10-01 10:25:32 -06:00
Cyberes d203973e80 fix routes 2023-10-01 01:13:13 -06:00
Cyberes 25ec56a5ef get streaming working, remove /v2/ 2023-10-01 00:20:00 -06:00
Cyberes c5b30d985c adjust jinja template 2023-09-30 22:11:51 -06:00
Cyberes 7af3dbd76b add message about settings 2023-09-30 21:31:25 -06:00
Cyberes 592eb08cb1 add message for /v1/ 2023-09-30 21:07:12 -06:00
Cyberes 91ba2fad1b add proompter stats back in 2023-09-30 20:11:14 -06:00
Cyberes e0f86d053a reorganize to api v2 2023-09-30 19:42:41 -06:00
Cyberes 114f36e709 functional 2023-09-30 19:41:50 -06:00
Cyberes 624ca74ce5 mvp 2023-09-29 00:09:44 -06:00
Cyberes e7b57cad7b set up cluster config and basic background workers 2023-09-28 18:40:24 -06:00
Cyberes e1d3fca6d3 try to cancel inference if disconnected from client 2023-09-28 09:55:31 -06:00
Cyberes a4a1d6cce6 fix double logging 2023-09-28 01:34:15 -06:00
Cyberes e86a5182eb redo background processes, reorganize server.py 2023-09-27 23:36:44 -06:00
Cyberes 097d614a35 fix duplicate logging from console printer thread 2023-09-27 21:28:25 -06:00
Cyberes adc0905c6f fix imports 2023-09-27 21:20:08 -06:00
Cyberes e5fbc9545d add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer 2023-09-27 21:15:54 -06:00
Cyberes 43299b32ad clean up background threads 2023-09-27 19:39:04 -06:00
Cyberes 35e9847b27 set inference workers to daemon, add finally to inference worker, hide estimated avg tps 2023-09-27 18:36:51 -06:00
Cyberes 74f16afa67 update dockerfile 2023-09-27 16:12:36 -06:00
Cyberes 105b66d5e2 unify error message handling 2023-09-27 14:48:47 -06:00
Cyberes 957a6cd092 fix error handling 2023-09-27 14:36:49 -06:00
Cyberes aba2e5b9c0 don't use db pooling, add LLM-ST-Errors header to disable formatted errors 2023-09-26 23:59:22 -06:00
Cyberes d9bbcc42e6 more work on openai endpoint 2023-09-26 22:09:11 -06:00
Cyberes e0af2ea9c5 convert to gunicorn 2023-09-26 13:32:33 -06:00
Cyberes b44dda7a3a option to show SYSTEM tokens in stats 2023-09-25 23:39:50 -06:00
Cyberes bbdb9c9d55 try to prevent "### XXX" responses on openai 2023-09-25 23:14:35 -06:00