hf_text-generation-inference/server
Miquel Farre f7cf45dfde fix 2024-11-18 13:01:50 -05:00
custom_kernels
exllama_kernels
exllamav2_kernels
tests
text_generation_server fix 2024-11-18 13:01:50 -05:00
.gitignore
Makefile
Makefile-awq
Makefile-eetq
Makefile-exllamav2
Makefile-flash-att
Makefile-flash-att-v2
Makefile-flashinfer
Makefile-lorax-punica
Makefile-selective-scan
Makefile-vllm
README.md
poetry.lock Add support for compressed-tensors w8a8 int checkpoints (#2745) 2024-11-18 17:20:31 +01:00
pyproject.toml Add support for compressed-tensors w8a8 int checkpoints (#2745) 2024-11-18 17:20:31 +01:00
requirements_cuda.txt connecting video to qwen2 2024-11-18 13:01:50 -05:00
requirements_intel.txt WIP video support 2024-11-18 13:01:50 -05:00
requirements_rocm.txt WIP video support 2024-11-18 13:01:50 -05:00

README.md

Text Generation Inference Python gRPC Server

A Python gRPC server for Text Generation Inference.

Install

make install

Run

make run-dev