riffusion-inference/riffusion/datatypes.py

"""
Data model for the riffusion API.
"""
from __future__ import annotations

import typing as T
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptInput:
    """
    Parameters for one end of interpolation.
    """

    # Text prompt fed into a CLIP model
    prompt: str

    # Random seed for denoising
    seed: int

    # Negative prompt to avoid (optional)
    negative_prompt: T.Optional[str] = None

    # Denoising strength
    denoising: float = 0.75

    # Classifier-free guidance strength
    guidance: float = 7.0


@dataclass(frozen=True)
class InferenceInput:
    """
    Parameters for a single run of the riffusion model, interpolating between
    a start and end set of PromptInputs. This is the API required for a request
    to the model server.
    """

    # Start point of interpolation
    start: PromptInput

    # End point of interpolation
    end: PromptInput

    # Interpolation alpha [0, 1]. A value of 0 uses start fully, a value of 1
    # uses end fully.
    alpha: float

    # Number of inner loops of the diffusion model
    num_inference_steps: int = 50

    # ID of seed image to use
    seed_image_id: str = "og_beat"

    # ID of mask image to use
    mask_image_id: T.Optional[str] = None


@dataclass(frozen=True)
class InferenceOutput:
    """
    Response from the model inference server.
    """

    # base64 encoded spectrogram image as a JPEG
    image: str

    # base64 encoded audio clip as an MP3
    audio: str

    # The duration of the audio clip, in seconds
    duration_s: float
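

# Usage sketch (not part of the original module): one way a server might turn
# a JSON request body into the nested, frozen dataclasses above. The helper
# name `parse_inference_input` and the sample payload are hypothetical, and the
# two input classes are repeated without comments only so the block runs on
# its own. The real server may well use a different parsing approach.

```python
import json
import typing as T
from dataclasses import FrozenInstanceError, asdict, dataclass


@dataclass(frozen=True)
class PromptInput:
    prompt: str
    seed: int
    negative_prompt: T.Optional[str] = None
    denoising: float = 0.75
    guidance: float = 7.0


@dataclass(frozen=True)
class InferenceInput:
    start: PromptInput
    end: PromptInput
    alpha: float
    num_inference_steps: int = 50
    seed_image_id: str = "og_beat"
    mask_image_id: T.Optional[str] = None


def parse_inference_input(payload: str) -> InferenceInput:
    """Hypothetical helper: build an InferenceInput from a JSON string."""
    data = json.loads(payload)
    # The nested prompt dicts must be converted explicitly; the dataclass
    # constructor does not recurse into plain dicts on its own.
    start = PromptInput(**data.pop("start"))
    end = PromptInput(**data.pop("end"))
    return InferenceInput(start=start, end=end, **data)


# Sample request: only the required fields plus one override.
request = json.dumps(
    {
        "start": {"prompt": "church bells", "seed": 42},
        "end": {"prompt": "jazz piano", "seed": 123, "denoising": 0.5},
        "alpha": 0.25,
    }
)

parsed = parse_inference_input(request)

# asdict() recurses into nested dataclasses, so the result can be serialized
# back to JSON.
round_trip = asdict(parsed)

# frozen=True makes instances immutable: assignment raises an error.
try:
    parsed.alpha = 0.5  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

# Omitted fields fall back to the defaults declared on the dataclasses, so a
# minimal request only needs `start`, `end`, and `alpha`.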