diffusers/scripts
Sanchit Gandhi b94880e536
Add AudioLDM (#2232)
* Add AudioLDM

* up

* add vocoder

* start unet

* unconditional unet

* clap, vocoder and vae

* clean-up: conversion scripts

* fix: conversion script token_type_ids

* clean-up: pipeline docstring

* tests: from SD

* clean-up: cpu offload vocoder instead of safety checker

* feat: adapt tests to audioldm

* feat: add docs

* clean-up: amend pipeline docstrings

* clean-up: make style

* clean-up: make fix-copies

* fix: add doc path to toctree

* clean-up: args for conversion script

* clean-up: paths to checkpoints

* fix: use conditional unet

* clean-up: make style

* fix: type hints for UNet

* clean-up: docstring for UNet

* clean-up: make style

* clean-up: remove duplicate in docstring

* clean-up: make style

* clean-up: make fix-copies

* clean-up: move imports to start in code snippet

* fix: pass cross_attention_dim as a list/tuple to unet

* clean-up: make fix-copies

* fix: update checkpoint path

* fix: unet cross_attention_dim in tests

* film embeddings -> class embeddings

* Apply suggestions from code review

Co-authored-by: Will Berman <wlbberman@gmail.com>

* fix: unet film embed to use existing args

* fix: unet tests to use existing args

* fix: make style

* fix: transformers import and version in init

* clean-up: make style

* Revert "clean-up: make style"

This reverts commit 5d6d1f8b324f5583e7805dc01e2c86e493660d66.

* clean-up: make style

* clean-up: use pipeline tester mixin tests where poss

* clean-up: skip attn slicing test

* fix: add torch dtype to docs

* fix: remove conversion script out of src

* fix: remove .detach from 1d waveform

* fix: reduce default num inf steps

* fix: swap height/width -> audio_length_in_s

* clean-up: make style

* fix: remove nightly tests

* fix: imports in conversion script

* clean-up: slim-down to two slow tests

* clean-up: slim-down fast tests

* fix: batch consistent tests

* clean-up: make style

* clean-up: remove vae slicing fast test

* clean-up: propagate changes to doc

* fix: increase test tol to 1e-2

* clean-up: finish docs

* clean-up: make style

* feat: vocoder / VAE compatibility check

* feat: possibly expand / cut audio waveform

* fix: pipeline call signature test

* fix: slow tests output len

* clean-up: make style

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>
2023-03-23 19:00:21 +01:00
__init__.py Fix conversion script 2022-07-15 17:00:41 +00:00
change_naming_configs_and_checkpoints.py [Copyright] 2023 (#2524) 2023-03-01 10:31:00 +01:00
conversion_ldm_uncond.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
convert_dance_diffusion_to_diffusers.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
convert_ddpm_original_checkpoint_to_diffusers.py Fix typos and add Typo check GitHub Action (#483) 2022-09-16 15:36:51 +02:00
convert_diffusers_to_original_stable_diffusion.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
convert_dit_to_diffusers.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
convert_k_upscaler_to_diffusers.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
convert_kakao_brain_unclip_to_diffusers.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
convert_ldm_original_checkpoint_to_diffusers.py [Copyright] 2023 (#2524) 2023-03-01 10:31:00 +01:00
convert_lora_safetensor_to_diffusers.py make style 2023-03-06 10:51:03 +00:00
convert_models_diffuser_to_diffusers.py Add UNet 1d for RL model for planning + colab (#105) 2022-11-14 13:48:48 -08:00
convert_ms_text_to_video_to_diffusers.py [MS Text To Video] Add first text to video (#2738) 2023-03-22 18:39:33 +01:00
convert_music_spectrogram_to_diffusers.py Music Spectrogram diffusion pipeline (#1044) 2023-03-23 14:06:17 +01:00
convert_ncsnpp_original_checkpoint_to_diffusers.py [Copyright] 2023 (#2524) 2023-03-01 10:31:00 +01:00
convert_original_audioldm_to_diffusers.py Add AudioLDM (#2232) 2023-03-23 19:00:21 +01:00
convert_original_controlnet_to_diffusers.py controlnet sd 2.1 checkpoint conversions (#2593) 2023-03-10 08:22:02 -08:00
convert_original_stable_diffusion_to_diffusers.py [From pretrained] Speed-up loading from cache (#2515) 2023-03-10 11:56:10 +01:00
convert_stable_diffusion_checkpoint_to_onnx.py [Copyright] 2023 (#2524) 2023-03-01 10:31:00 +01:00
convert_unclip_txt2img_to_image_variation.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
convert_vae_diff_to_onnx.py make style 2023-03-06 10:40:18 +00:00
convert_vae_pt_to_diffusers.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
convert_versatile_diffusion_to_diffusers.py Rename 'CLIPFeatureExtractor' class to 'CLIPImageProcessor' (#2732) 2023-03-23 13:49:22 +01:00
convert_vq_diffusion_to_diffusers.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00
generate_logits.py Replace flake8 with ruff and update black (#2279) 2023-02-07 23:46:23 +01:00