local-llm-server/other/vllm/Docker/DOCKER.md

2023-09-26 14:48:34 -06:00
`docker run --shm-size 14g --gpus all -v /storage/models/awq/MythoMax-L2-13B-AWQ:/models/MythoMax-L2-13B-AWQ -e ENV_API_SERVER_ARGS="--model /models/MythoMax-L2-13B-AWQ --quantization awq --host 0.0.0.0 --port 7000 --max-num-batched-tokens 8192 --gpu-memory-utilization 1" -d cyberes_vllm_cloud`
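
The one-liner above is hard to scan, so here is a minimal sketch of the same invocation assembled from variables. The variable names (`MODEL_NAME`, `HOST_MODEL_DIR`, `CONTAINER_MODEL_DIR`, `PORT`, `VOLUME`, `API_SERVER_ARGS`) and the `DRY_RUN` switch are illustrative assumptions, not part of the image. The flags, paths, and image name are copied from the command above. By default the script only prints the arguments, one per line, so it can be inspected without a GPU or Docker installed.

```shell
#!/bin/sh
# Sketch only: rebuilds the docker run command above from variables.
# All variable names and the DRY_RUN switch are hypothetical helpers.
set -eu

MODEL_NAME="MythoMax-L2-13B-AWQ"
HOST_MODEL_DIR="/storage/models/awq/${MODEL_NAME}"   # model weights on the host
CONTAINER_MODEL_DIR="/models/${MODEL_NAME}"          # mount point inside the container
PORT=7000

# Bind mount: host directory -> container directory
VOLUME="${HOST_MODEL_DIR}:${CONTAINER_MODEL_DIR}"

# Arguments handed to the container through ENV_API_SERVER_ARGS,
# exactly as in the original command.
API_SERVER_ARGS="--model ${CONTAINER_MODEL_DIR} --quantization awq --host 0.0.0.0 --port ${PORT} --max-num-batched-tokens 8192 --gpu-memory-utilization 1"

# Store the command as positional parameters so that the -e value,
# which contains spaces, stays a single argument.
set -- docker run \
  --shm-size 14g \
  --gpus all \
  -v "${VOLUME}" \
  -e "ENV_API_SERVER_ARGS=${API_SERVER_ARGS}" \
  -d cyberes_vllm_cloud

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Print one argument per line for inspection; nothing is started.
  printf '%s\n' "$@"
else
  "$@"
fi
```

Running the script with `DRY_RUN=0` executes the command instead of printing it. Note that the original one-liner publishes no ports (there is no `-p` flag), so whether port 7000 is reachable from the host depends on how the container's network is set up.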