diffusers/scripts
Will Berman ef2ea33c3b
VQ-diffusion (#658)
* Changes for VQ-diffusion VQVAE

Add an option to specify the embedding dimension in VQModel:
By default, `VQModel` sets the embedding dimension to the number
of latent channels. The VQ-diffusion VQVAE has a smaller
embedding dimension (128) than its number of latent channels (256).
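
A minimal sketch of why the two sizes need to be configurable separately. It assumes a per-pixel linear projection (the equivalent of a 1x1 convolution) sits between the encoder output and the codebook lookup; the names `projection`, `codebook`, `project`, and `nearest_code` are hypothetical and are not taken from `VQModel`. Only the sizes 256 and 128 come from the text above; the codebook size is a toy value.

```python
import random

LATENT_CHANNELS = 256  # channels produced by the encoder (from the commit message)
EMBED_DIM = 128        # codebook embedding dimension (from the commit message)
NUM_EMBEDDINGS = 16    # toy codebook size, chosen only for this sketch

rng = random.Random(0)

# Hypothetical projection weights: one row per embedding dimension.
projection = [
    [rng.gauss(0, 0.05) for _ in range(LATENT_CHANNELS)] for _ in range(EMBED_DIM)
]
# Hypothetical codebook: NUM_EMBEDDINGS vectors of length EMBED_DIM.
codebook = [[rng.gauss(0, 1) for _ in range(EMBED_DIM)] for _ in range(NUM_EMBEDDINGS)]


def project(latent):
    """Map one latent pixel from LATENT_CHANNELS values to EMBED_DIM values."""
    return [sum(w * x for w, x in zip(row, latent)) for row in projection]


def nearest_code(vector):
    """Return the index of the codebook entry closest in squared L2 distance."""

    def sq_dist(entry):
        return sum((a - b) ** 2 for a, b in zip(entry, vector))

    return min(range(NUM_EMBEDDINGS), key=lambda i: sq_dist(codebook[i]))


latent_pixel = [rng.gauss(0, 1) for _ in range(LATENT_CHANNELS)]
projected = project(latent_pixel)
index = nearest_code(projected)
print(len(latent_pixel), len(projected), index)
```

If the embedding dimension were tied to the number of latent channels, the codebook rows would have to be 256 wide and the projection would be square.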

Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the down and up
UNet block helpers. VQ-diffusion's VQVAE uses those two block types.

* Changes for VQ-diffusion transformer

Modify attention.py so SpatialTransformer can be used for
VQ-diffusion's transformer.

SpatialTransformer:
- can now operate over discrete inputs (classes of vector embeddings) as well as continuous inputs
- `in_channels` was made optional in the constructor, so the two call sites that passed it as a positional arg now pass it as a kwarg
- forward pass modified to take optional timestep embeddings

ImagePositionalEmbeddings:
- added to provide positional embeddings to discrete inputs for latent pixels
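
A rough sketch of positional embeddings for discrete latent pixels. It assumes each pixel's embedding is the sum of a class (token) embedding, a row embedding, and a column embedding, with pixels in raster order; that layout and the names `token_table`, `row_table`, and `col_table` are assumptions for illustration, not the signature of `ImagePositionalEmbeddings`.

```python
def positional_embeddings(class_ids, height, width, token_table, row_table, col_table):
    """Embed a flattened grid of class ids, adding row and column embeddings."""
    assert len(class_ids) == height * width
    embedded = []
    for position, class_id in enumerate(class_ids):
        # Raster order: position = row * width + col.
        row, col = divmod(position, width)
        embedded.append(
            [
                t + r + c
                for t, r, c in zip(token_table[class_id], row_table[row], col_table[col])
            ]
        )
    return embedded


# Toy tables with embedding size 2, chosen so each contribution is easy to spot.
token_table = [[0.0, 0.0], [1.0, 1.0]]
row_table = [[10.0, 0.0], [20.0, 0.0]]
col_table = [[0.0, 100.0], [0.0, 200.0]]

grid = positional_embeddings([0, 1, 1, 0], 2, 2, token_table, row_table, col_table)
print(grid)
```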

BasicTransformerBlock:
- norm layers were made configurable so that VQ-diffusion can use AdaLayerNorm with timestep embeddings
- modified forward pass to take optional timestep embeddings

CrossAttention:
- can now optionally take a bias parameter for its query, key, and value linear layers

FeedForward:
- Internal layers are now configurable

ApproximateGELU:
- Activation function in VQ-diffusion's feedforward layer
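
For reference, a sketch of one common GELU approximation, `x * sigmoid(1.702 * x)`, next to the exact erf-based form. The commit message does not say which approximation `ApproximateGELU` uses, so the sigmoid form here is an assumption; the function names are hypothetical.

```python
import math


def gelu_exact(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_sigmoid_approx(x):
    """Sigmoid-based approximation: x * sigmoid(1.702 * x)."""
    return x / (1.0 + math.exp(-1.702 * x))


# Largest gap between the two forms on a grid over [-4, 4].
points = [i / 10.0 for i in range(-40, 41)]
max_gap = max(abs(gelu_exact(x) - gelu_sigmoid_approx(x)) for x in points)
print(max_gap)
```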

AdaLayerNorm:
- Norm layer modified to incorporate timestep embeddings
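
A simplified sketch of a timestep-conditioned layer norm. It assumes the timestep selects a per-feature scale and shift that modulate a plain layer norm as `norm(x) * (1 + scale) + shift`; the lookup tables `scale_table` and `shift_table` are hypothetical stand-ins for however the timestep embedding is actually turned into those two vectors.

```python
import math


def layer_norm(x, eps=1e-5):
    """Normalize a feature vector to zero mean and unit variance (no affine)."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def ada_layer_norm(x, timestep, scale_table, shift_table):
    """Layer norm whose scale and shift depend on the timestep."""
    scale = scale_table[timestep]
    shift = shift_table[timestep]
    return [n * (1.0 + s) + b for n, s, b in zip(layer_norm(x), scale, shift)]


features = [1.0, 2.0, 3.0, 4.0]
# Hypothetical tables: timestep 0 leaves the norm untouched, timestep 1 modulates it.
scale_table = [[0.0] * 4, [1.0] * 4]
shift_table = [[0.0] * 4, [0.5] * 4]

print(ada_layer_norm(features, 0, scale_table, shift_table))
print(ada_layer_norm(features, 1, scale_table, shift_table))
```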

* Add VQ-diffusion scheduler

* Add VQ-diffusion pipeline

* Add VQ-diffusion convert script to diffusers

* Add VQ-diffusion dummy objects

* Add VQ-diffusion markdown docs

* Add VQ-diffusion tests

* some renaming

* some fixes

* more renaming

* correct

* fix typo

* correct weights

* finalize

* fix tests

* Apply suggestions from code review

Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* finish

* finish

* up

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2022-11-03 16:10:28 +01:00
| File | Last commit | Date |
| --- | --- | --- |
| `__init__.py` | Fix conversion script | 2022-07-15 17:00:41 +00:00 |
| `change_naming_configs_and_checkpoints.py` | Style the `scripts` directory (#250) | 2022-08-25 15:46:09 +02:00 |
| `conversion_ldm_uncond.py` | Style the `scripts` directory (#250) | 2022-08-25 15:46:09 +02:00 |
| `convert_dance_diffusion_to_diffusers.py` | [Dance Diffusion] Add dance diffusion (#803) | 2022-10-25 18:39:25 +02:00 |
| `convert_ddpm_original_checkpoint_to_diffusers.py` | Fix typos and add Typo check GitHub Action (#483) | 2022-09-16 15:36:51 +02:00 |
| `convert_diffusers_to_original_stable_diffusion.py` | Checkpoint conversion script from Diffusers => Stable Diffusion (CompVis) (#701) | 2022-10-04 13:33:38 +02:00 |
| `convert_ldm_original_checkpoint_to_diffusers.py` | Style the `scripts` directory (#250) | 2022-08-25 15:46:09 +02:00 |
| `convert_ncsnpp_original_checkpoint_to_diffusers.py` | Style the `scripts` directory (#250) | 2022-08-25 15:46:09 +02:00 |
| `convert_original_stable_diffusion_to_diffusers.py` | CompVis -> diffusers script - allow converting from merged checkpoint to either EMA or non-EMA (#991) | 2022-10-26 12:32:07 +02:00 |
| `convert_stable_diffusion_checkpoint_to_onnx.py` | [Onnx] support half-precision and fix bugs for onnx pipelines (#932) | 2022-10-25 16:48:53 +02:00 |
| `convert_vq_diffusion_to_diffusers.py` | VQ-diffusion (#658) | 2022-11-03 16:10:28 +01:00 |
| `generate_logits.py` | Fix typos and add Typo check GitHub Action (#483) | 2022-09-16 15:36:51 +02:00 |