hf_text-generation-inference/server
Wang, Yi f478aa77ad
hotfix: ipex fails since cuda moe kernel is not supported (#2532)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2024-09-20 10:02:55 +02:00
custom_kernels
exllama_kernels
exllamav2_kernels
tests
text_generation_server
.gitignore
Makefile
Makefile-awq
Makefile-eetq
Makefile-exllamav2
Makefile-fbgemm
Makefile-flash-att
Makefile-flash-att-v2
Makefile-flashinfer
Makefile-lorax-punica
Makefile-selective-scan
Makefile-vllm
README.md
poetry.lock
pyproject.toml
requirements_cuda.txt
requirements_intel.txt
requirements_rocm.txt

README.md

Text Generation Inference Python gRPC Server

A Python gRPC server for Text Generation Inference.

Install

make install

Run

make run-dev
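
The two steps above can also be scripted. The following is a minimal sketch, not part of the repository: the helper name `run_targets`, the `server` directory argument, and the install-then-run ordering are assumptions made for illustration. It only relies on the `install` and `run-dev` targets named in this README and on the standard `make -C <dir>` option. `make run-dev` presumably starts the server and keeps running until interrupted, so the sketch defaults to a dry run in its entry point.

```python
import subprocess
from pathlib import Path

# Targets named in the README; running install before run-dev is an assumption.
TARGETS = ["install", "run-dev"]


def run_targets(server_dir, targets=TARGETS, dry_run=False):
    """Run each make target in server_dir, stopping at the first failure.

    Returns the command lines (as argv lists) that were, or would be, run.
    run_targets is a hypothetical helper, not something the repository provides.
    """
    commands = [["make", "-C", str(Path(server_dir)), target] for target in targets]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if make exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # dry_run=True only prints the commands; pass dry_run=False to execute them.
    for cmd in run_targets("server", dry_run=True):
        print(" ".join(cmd))
```

Calling `run_targets("server", dry_run=False)` from the repository root would execute both targets in order. The dry-run mode is there so the commands can be inspected before anything is installed.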