<!--Copyright 2022 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->

# Models

Diffusers contains pretrained models for popular algorithms and modules for creating the next set of diffusion models.
The primary function of these models is to denoise an input sample by modeling the distribution $p_\theta(\mathbf{x}_{t-1}|\mathbf{x}_t)$.
The models are built on the base class [`ModelMixin`], a `torch.nn.Module` with basic functionality for saving and loading models, both locally and from the Hugging Face Hub.

## ModelMixin
[[autodoc]] ModelMixin

## UNet2DOutput
[[autodoc]] models.unet_2d.UNet2DOutput

## UNet2DModel
[[autodoc]] UNet2DModel

## UNet1DOutput
[[autodoc]] models.unet_1d.UNet1DOutput

## UNet1DModel
[[autodoc]] UNet1DModel

## UNet2DConditionOutput
[[autodoc]] models.unet_2d_condition.UNet2DConditionOutput

## UNet2DConditionModel
[[autodoc]] UNet2DConditionModel

## DecoderOutput
[[autodoc]] models.vae.DecoderOutput

## VQEncoderOutput
[[autodoc]] models.vae.VQEncoderOutput

## VQModel
[[autodoc]] VQModel

## AutoencoderKLOutput
[[autodoc]] models.vae.AutoencoderKLOutput

## AutoencoderKL
[[autodoc]] AutoencoderKL

## Transformer2DModel
[[autodoc]] Transformer2DModel

## Transformer2DModelOutput
[[autodoc]] models.attention.Transformer2DModelOutput

## FlaxModelMixin
[[autodoc]] FlaxModelMixin

## FlaxUNet2DConditionOutput
[[autodoc]] models.unet_2d_condition_flax.FlaxUNet2DConditionOutput

## FlaxUNet2DConditionModel
[[autodoc]] FlaxUNet2DConditionModel

## FlaxDecoderOutput
[[autodoc]] models.vae_flax.FlaxDecoderOutput

## FlaxAutoencoderKLOutput
[[autodoc]] models.vae_flax.FlaxAutoencoderKLOutput

## FlaxAutoencoderKL
[[autodoc]] FlaxAutoencoderKL