This repository was archived on 2024-10-27. It can still be viewed and cloned, but pushing and opening issues or pull requests are disabled.
Directory listing: local-llm-server/llm_server/llm
Latest commit: 347a82b7e1 by Cyberes, 2023-09-28 03:54:20 -06:00: "avoid sending to backend to tokenize if it's greater than our specified context size" (a sketch of this pre-check follows the listing below).
Name            Last commit                 Last commit message
oobabooga       2023-09-24 21:45:30 -06:00  further align openai endpoint with expected responses
openai          2023-09-26 23:59:22 -06:00  don't use db pooling, add LLM-ST-Errors header to disable formatted errors
vllm            2023-09-28 03:54:20 -06:00  avoid sending to backend to tokenize if it's greater than our specified context size
__init__.py     2023-09-26 22:09:11 -06:00  more work on openai endpoint
generator.py    2023-09-27 21:15:54 -06:00  add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer
info.py         2023-09-14 14:05:50 -06:00  option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup
llm_backend.py  2023-09-27 14:36:49 -06:00  fix error handling
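
The latest commit message describes skipping the round-trip to the backend tokenizer when a prompt clearly exceeds the configured context size. The sketch below is a hypothetical illustration of that kind of pre-check, not the repository's actual code: the endpoint URL, setting names, response shape, and the characters-per-token heuristic are all assumptions.

```python
import requests  # assumed HTTP client; the real project may use something else

# Hypothetical configuration; names and values are illustrative only.
CONTEXT_SIZE = 4096                                        # max tokens the backend accepts
BACKEND_TOKENIZE_URL = "http://127.0.0.1:8000/tokenize"    # placeholder tokenize endpoint
CHARS_PER_TOKEN_ESTIMATE = 3                               # rough lower bound on chars per token


def count_tokens(prompt: str) -> int:
    """Return a token count, avoiding the backend call for obviously oversized prompts."""
    # Cheap character-based estimate first.
    estimated = len(prompt) // CHARS_PER_TOKEN_ESTIMATE
    if estimated > CONTEXT_SIZE:
        # The prompt cannot fit regardless of exact tokenization,
        # so skip the request to the backend tokenizer entirely.
        return estimated

    # Otherwise ask the backend for an exact count (response shape assumed).
    resp = requests.post(BACKEND_TOKENIZE_URL, json={"input": prompt}, timeout=5)
    resp.raise_for_status()
    return resp.json()["count"]
```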