**A Docker container for running vLLM on Paperspace Gradient notebooks.**

1. Run `jupyter server --generate-config` and `jupyter server password` on your local machine, then copy Jupyter's config directory (typically `~/.jupyter`) to `./jupyter`
2. Place your Rathole client config at `./rathole-client.toml`
3. Build the image: `docker build . -t "paperspace-vllm"`
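
Before building, it can help to confirm that the inputs from steps 1 and 2 are in place. The following is a minimal sketch, not part of this repository: `check_build_context` is a hypothetical helper name, and it only checks that `./jupyter` exists as a directory and `./rathole-client.toml` exists as a file. It does not validate their contents.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper (an illustration, not shipped with this repo).
# Returns 0 if the directory given as $1 (default: .) contains the inputs
# named in steps 1 and 2; otherwise reports what is missing and returns 1.
check_build_context() {
    local dir="${1:-.}"
    local missing=0

    # Step 1: Jupyter's config directory, copied to ./jupyter
    if [ ! -d "$dir/jupyter" ]; then
        echo "missing directory: $dir/jupyter" >&2
        missing=1
    fi

    # Step 2: the Rathole client config
    if [ ! -f "$dir/rathole-client.toml" ]; then
        echo "missing file: $dir/rathole-client.toml" >&2
        missing=1
    fi

    return "$missing"
}
```

One way to use it is to chain it with the build, so the build only starts when both inputs are present: `check_build_context . && docker build . -t "paperspace-vllm"`.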

To test on your local machine, run this command:

```bash
# The image name matches the tag used in the build step
docker run --shm-size 14g --gpus all \
  -v /storage/models/awq/MythoMax-L2-13B-AWQ:/models/MythoMax-L2-13B-AWQ \
  -p 7000:7000 -p 8888:8888 \
  -e API_SERVER_ARGS="--model /models/MythoMax-L2-13B-AWQ --quantization awq --max-num-batched-tokens 99999 --gpu-memory-utilization 1" \
  paperspace-vllm
```