aac93eb80e | ||
---|---|---|
riffusion | ||
seed_images | ||
.gitignore | ||
LICENSE | ||
README.md | ||
dev_requirements.txt | ||
requirements.txt |
README.md
Riffusion Inference Server
Riffusion is an app for real-time music generation with stable diffusion.
Read about it at https://www.riffusion.com/about and try it at https://www.riffusion.com/.
- Web app: https://github.com/hmartiro/riffusion-app
- Inference server: https://github.com/hmartiro/riffusion-inference
- Model checkpoint: https://huggingface.co/riffusion/riffusion-model-v1
This repository contains the Python backend does the model inference and audio processing, including:
- a diffusers pipeline that performs prompt interpolation combined with image conditioning
- a module for (approximately) converting between spectrograms and waveforms
- a flask server to provide model inference via API to the next.js app
Install
Tested with Python 3.9 and diffusers 0.9.0
conda create --name riffusion-inference python=3.9
conda activate riffusion-inference
python -m pip install -r requirements.txt
Run
Start the Flask server:
python -m riffusion.server --port 3013 --host 127.0.0.1 --checkpoint /path/to/diffusers_checkpoint
The model endpoint is now available at http://127.0.0.1:3013/run_inference
via POST request.
Example input (see InferenceInput for the API):
{
alpha: 0.75,
num_inference_steps: 50,
seed_image_id: "og_beat",
start: {
prompt: "church bells on sunday",
seed: 42,
denoising: 0.75,
guidance: 7.0,
},
end: {
prompt: "jazz with piano",
seed: 123,
denoising: 0.75,
guidance: 7.0,
},
}
Example output (see InferenceOutput for the API):
{
image: "< base64 encoded JPEG image >",
audio: "< base64 encoded MP3 clip >",,
}