# Riffusion Inference Server

Riffusion is an app for real-time music generation with Stable Diffusion.

Read about it at https://www.riffusion.com/about and try it at https://www.riffusion.com/.

* Web app: https://github.com/hmartiro/riffusion-app
* Inference server: https://github.com/hmartiro/riffusion-inference
* Model checkpoint: https://huggingface.co/riffusion/riffusion-model-v1

This repository contains the Python backend that does the model inference and audio processing, including:

* a diffusers pipeline that performs prompt interpolation combined with image conditioning
* a module for (approximately) converting between spectrograms and waveforms
* a Flask server that provides model inference via API to the Next.js app

## Install

Tested with Python 3.9 and diffusers 0.9.0.

```
conda create --name riffusion-inference python=3.9
conda activate riffusion-inference
python -m pip install -r requirements.txt
```

## Run

Start the Flask server:

```
python -m riffusion.server --port 3013 --host 127.0.0.1 --checkpoint /path/to/diffusers_checkpoint
```

The model endpoint is now available at `http://127.0.0.1:3013/run_inference` via POST request.

Example input (see [InferenceInput](https://github.com/hmartiro/riffusion-inference/blob/main/riffusion/datatypes.py#L28) for the API):

```
{
  "alpha": 0.75,
  "num_inference_steps": 50,
  "seed_image_id": "og_beat",

  "start": {
    "prompt": "church bells on sunday",
    "seed": 42,
    "denoising": 0.75,
    "guidance": 7.0
  },

  "end": {
    "prompt": "jazz with piano",
    "seed": 123,
    "denoising": 0.75,
    "guidance": 7.0
  }
}
```

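The example input can be sent from a short client script. The following is a minimal sketch using only the Python standard library, not part of this repository: the helper names `build_payload` and `post_inference` are hypothetical, and it assumes the server accepts a JSON request body and replies with JSON. The URL matches the host and port in the server command above.

```python
import json
import urllib.request

# Assumed URL: same host and port as the server command above.
ENDPOINT = "http://127.0.0.1:3013/run_inference"


def build_payload(start_prompt, end_prompt, alpha=0.75):
    """Build a request body shaped like the example input (hypothetical helper)."""
    return {
        "alpha": alpha,
        "num_inference_steps": 50,
        "seed_image_id": "og_beat",
        "start": {
            "prompt": start_prompt,
            "seed": 42,
            "denoising": 0.75,
            "guidance": 7.0,
        },
        "end": {
            "prompt": end_prompt,
            "seed": 123,
            "denoising": 0.75,
            "guidance": 7.0,
        },
    }


def post_inference(payload, url=ENDPOINT, timeout=120):
    """POST the payload as JSON and return the parsed response.

    Assumes the response body is JSON; adjust if the server replies differently.
    """
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the server running:
#   result = post_inference(build_payload("church bells on sunday", "jazz with piano"))
```
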
Example output (see [InferenceOutput](https://github.com/hmartiro/riffusion-inference/blob/main/riffusion/datatypes.py#L54) for the API):

```
{
  "image": "< base64 encoded JPEG image >",
  "audio": "< base64 encoded MP3 clip >"
}
```
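
To use the response, the two base64 fields need to be decoded back into bytes. The sketch below shows one way to do that with the standard library. The helper names `decode_field` and `save_outputs` and the output file names are hypothetical. It also strips a `data:...;base64,` prefix if one is present, on the assumption that the server may return data URIs rather than bare base64 strings.

```python
import base64
from pathlib import Path


def decode_field(value):
    """Decode a base64 string, tolerating an optional data-URI prefix."""
    # Assumption: a field may look like "data:image/jpeg;base64,<payload>".
    if value.startswith("data:") and "," in value:
        value = value.split(",", 1)[1]
    return base64.b64decode(value)


def save_outputs(result, image_path="riff.jpg", audio_path="riff.mp3"):
    """Write the decoded "image" and "audio" fields of a response dict to disk."""
    Path(image_path).write_bytes(decode_field(result["image"]))
    Path(audio_path).write_bytes(decode_field(result["audio"]))
```

Here `result` is the parsed JSON response, a dict with the `image` and `audio` keys shown in the example output.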