preemo_text-generation-infe.../server
Yang, Bo b5fadc4c28
Don't enable custom kernels if CUDA is not available (#6)
2023-08-02 09:51:54 -07:00
custom_kernels
exllama_kernels
tests Add AutoCausalLM (#5) 2023-08-02 09:35:40 -07:00
text_generation_server Don't enable custom kernels if CUDA is not available (#6) 2023-08-02 09:51:54 -07:00
.gitignore
Makefile fix(server): fix missing datasets in quantize 2023-07-27 14:50:45 +02:00
Makefile-flash-att
Makefile-flash-att-v2
Makefile-vllm feat(server): update vllm version (#723) 2023-07-28 15:36:38 +02:00
README.md
poetry.lock fix(server): fix missing datasets in quantize 2023-07-27 14:50:45 +02:00
pyproject.toml v0.9.4 (#713) 2023-07-27 19:25:15 +02:00
requirements.txt fix(server): fix missing datasets in quantize 2023-07-27 14:50:45 +02:00

README.md

Text Generation Inference Python gRPC Server

A Python gRPC server for Text Generation Inference.

Install

make install

Run

make run-dev
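
The two steps above can also be scripted. The following is a minimal sketch, not part of this repository: the helper names `build_make_command` and `run_target` are hypothetical, and it assumes `make` is on the PATH and that `server_dir` points at the directory containing this Makefile. Only the target names `install` and `run-dev` come from this README.

```python
import shutil
import subprocess
from pathlib import Path

# Target names taken from this README; everything else here is illustrative.
MAKE_TARGETS = ("install", "run-dev")


def build_make_command(target: str, server_dir: str = ".") -> list[str]:
    """Return the argv that runs one README make target inside server_dir."""
    if target not in MAKE_TARGETS:
        raise ValueError(f"unknown target: {target!r}")
    # `make -C DIR` changes into DIR before reading the Makefile.
    return ["make", "-C", str(Path(server_dir)), target]


def run_target(target: str, server_dir: str = ".") -> int:
    """Run one make target and return its exit code."""
    if shutil.which("make") is None:
        raise RuntimeError("make was not found on PATH")
    return subprocess.run(build_make_command(target, server_dir)).returncode
```

Calling `run_target("install")` and then, if it returns 0, `run_target("run-dev")` mirrors the Install and Run steps in order.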