
Text Generation Inference Python gRPC Server

A Python gRPC server for Text Generation Inference

Install

make install

Run

make run-dev
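
The two make targets above can also be driven from a script, for example in a local setup helper. The following is a minimal sketch, not part of this repository: the helper names `make_command` and `run_targets` are hypothetical, and it assumes `make` is on the PATH and that `cwd` points at the directory containing the Makefile.

```python
import subprocess
from pathlib import Path
from typing import List, Sequence


def make_command(target: str) -> List[str]:
    """Build the argv list for a single make target (hypothetical helper)."""
    return ["make", target]


def run_targets(targets: Sequence[str], cwd: Path, runner=subprocess.run) -> None:
    """Run each make target in order, stopping at the first failure."""
    for target in targets:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so later targets are skipped after a failure.
        runner(make_command(target), cwd=str(cwd), check=True)


# Example call (assumption: started from the directory with the Makefile):
# run_targets(["install", "run-dev"], cwd=Path.cwd())
```

Passing the runner as a parameter keeps the ordering logic testable without invoking make; in normal use the default `subprocess.run` is enough.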