local-llm-server

This repository has been archived on 2024-10-27. You can view files and clone it, but cannot push or open issues or pull requests.

Cyberes	e5fbc9545d	add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer	2023-09-27 21:15:54 -06:00
oobabooga	further align openai endpoint with expected responses	2023-09-24 21:45:30 -06:00
openai	don't use db pooling, add LLM-ST-Errors header to disable formatted errors	2023-09-26 23:59:22 -06:00
vllm	fix error handling	2023-09-27 14:36:49 -06:00
__init__.py	more work on openai endpoint	2023-09-26 22:09:11 -06:00
generator.py	add ratelimiting to websocket streaming endpoint, fix queue not decrementing IP requests, add console printer	2023-09-27 21:15:54 -06:00
info.py	option to disable streaming, improve timeout on requests to backend, fix error handling. reduce duplicate code, misc other cleanup	2023-09-14 14:05:50 -06:00
llm_backend.py	fix error handling	2023-09-27 14:36:49 -06:00