* add AudioDiffusionPipeline and LatentAudioDiffusionPipeline
* add docs to toc
* fix tests
* fix tests
* fix tests
* fix tests
* fix tests
* Update pr_tests.yml
Fix tests
* parent 499ff34b3edc3e0c506313ab48f21514d8f58b09
author teticio <teticio@gmail.com> 1668765652 +0000
committer teticio <teticio@gmail.com> 1669041721 +0000
parent 499ff34b3edc3e0c506313ab48f21514d8f58b09
author teticio <teticio@gmail.com> 1668765652 +0000
committer teticio <teticio@gmail.com> 1669041704 +0000
add colab notebook
[Flax] Fix loading scheduler from subfolder (#1319)
[FLAX] Fix loading scheduler from subfolder
Fix/Enable all schedulers for in-painting (#1331)
* inpaint fix k lms
* onnox as well
* up
Correct path to schedlure (#1322)
* [Examples] Correct path
* uP
Avoid nested fix-copies (#1332)
* Avoid nested `# Copied from` statements during `make fix-copies`
* style
Fix img2img speed with LMS-Discrete Scheduler (#896)
Casting `self.sigmas` into a different dtype (the one of original_samples) is not advisable. In my img2img pipeline this leads to a long running time in the `integrate.quad` call later on- by long I mean more than 10x slower.
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Fix the order of casts for onnx inpainting (#1338)
Legacy Inpainting Pipeline for Onnx Models (#1237)
* Add legacy inpainting pipeline compatibility for onnx
* remove commented out line
* Add onnx legacy inpainting test
* Fix slow decorators
* pep8 styling
* isort styling
* dummy object
* ordering consistency
* style
* docstring styles
* Refactor common prompt encoding pattern
* Update tests to permanent repository home
* support all available schedulers until ONNX IO binding is available
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* updated styling from PR suggested feedback
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Jax infer support negative prompt (#1337)
* support negative prompts in sd jax pipeline
* pass batched neg_prompt
* only encode when negative prompt is None
Co-authored-by: Juan Acevedo <jfacevedo@google.com>
Update README.md: Minor change to Imagic code snippet, missing dir error (#1347)
Minor change to Imagic Readme
Missing dir causes an error when running the example code.
make style
change the sample model (#1352)
* Update alt_diffusion.mdx
* Update alt_diffusion.mdx
Add bit diffusion [WIP] (#971)
* Create bit_diffusion.py
Bit diffusion based on the paper, arXiv:2208.04202, Chen2022AnalogBG
* adding bit diffusion to new branch
ran tests
* tests
* tests
* tests
* tests
* removed test folders + added to README
* Update README.md
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* move Mel to module in pipeline construction, make librosa optional
* fix imports
* fix copy & paste error in comment
* fix style
* add missing register_to_config
* fix class docstrings
* fix class docstrings
* tweak docstrings
* tweak docstrings
* update slow test
* put trailing commas back
* respect alphabetical order
* remove LatentAudioDiffusion, make vqvae optional
* move Mel from models back to pipelines :-)
* allow loading of pretrained audiodiffusion models
* fix tests
* fix dummies
* remove reference to latent_audio_diffusion in docs
* unused import
* inherit from SchedulerMixin to make loadable
* Apply suggestions from code review
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* up
* convert dual unet
* revert dual attn
* adapt for vd-official
* test the full pipeline
* mixed inference
* mixed inference for text2img
* add image prompting
* fix clip norm
* split text2img and img2img
* fix format
* refactor text2img
* mega pipeline
* add optimus
* refactor image var
* wip text_unet
* text unet end to end
* update tests
* reshape
* fix image to text
* add some first docs
* dual guided pipeline
* fix token ratio
* propose change
* dual transformer as a native module
* DualTransformer(nn.Module)
* DualTransformer(nn.Module)
* correct unconditional image
* save-load with mega pipeline
* remove image to text
* up
* uP
* fix
* up
* final fix
* remove_unused_weights
* test updates
* save progress
* uP
* fix dual prompts
* some fixes
* finish
* style
* finish renaming
* up
* fix
* fix
* fix
* finish
Co-authored-by: anton-l <anton@huggingface.co>
* add conversion script for vae
* up
* up
* some fixes
* add text model
* use the correct config
* add docs
* move model in it's own file
* move model in its own file
* pass attenion mask to text encoder
* pass attn mask to uncond inputs
* quality
* fix image2image
* add imag2image in init
* fix import
* fix one more import
* fix import, dummy objetcs
* fix copied from
* up
* finish
Co-authored-by: patil-suraj <surajp815@gmail.com>
* add conversion script for vae
* uP
* uP
* more changes
* push
* up
* finish again
* up
* up
* up
* up
* finish
* up
* uP
* up
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* up
* up
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Changes for VQ-diffusion VQVAE
Add specify dimension of embeddings to VQModel:
`VQModel` will by default set the dimension of embeddings to the number
of latent channels. The VQ-diffusion VQVAE has a smaller
embedding dimension, 128, than number of latent channels, 256.
Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
unet block helpers. VQ-diffusion's VQVAE uses those two block types.
* Changes for VQ-diffusion transformer
Modify attention.py so SpatialTransformer can be used for
VQ-diffusion's transformer.
SpatialTransformer:
- Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
- `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
- modified forward pass to take optional timestep embeddings
ImagePositionalEmbeddings:
- added to provide positional embeddings to discrete inputs for latent pixels
BasicTransformerBlock:
- norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
- modified forward pass to take optional timestep embeddings
CrossAttention:
- now may optionally take a bias parameter for its query, key, and value linear layers
FeedForward:
- Internal layers are now configurable
ApproximateGELU:
- Activation function in VQ-diffusion's feedforward layer
AdaLayerNorm:
- Norm layer modified to incorporate timestep embeddings
* Add VQ-diffusion scheduler
* Add VQ-diffusion pipeline
* Add VQ-diffusion convert script to diffusers
* Add VQ-diffusion dummy objects
* Add VQ-diffusion markdown docs
* Add VQ-diffusion tests
* some renaming
* some fixes
* more renaming
* correct
* fix typo
* correct weights
* finalize
* fix tests
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* finish
* finish
* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* feat: add repaint
* fix: fix quality check with `make fix-copies`
* fix: remove old unnecessary arg
* chore: change default to DDPM (looks better in experiments)
* ".to(device)" changed to "device="
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* make generator device-specific
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* make generator device-specific and change shape
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* fix: add preprocessing for image and mask
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* fix: update test
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Update src/diffusers/pipelines/repaint/pipeline_repaint.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Add docs and examples
* Fix toctree
Co-authored-by: fja <fja@zurich.ibm.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* start
* add more logic
* Update src/diffusers/models/unet_2d_condition_flax.py
* match weights
* up
* make model work
* making class more general, fixing missed file rename
* small fix
* make new conversion work
* up
* finalize conversion
* up
* first batch of variable renamings
* remove c and c_prev var names
* add mid and out block structure
* add pipeline
* up
* finish conversion
* finish
* upload
* more fixes
* Apply suggestions from code review
* add attr
* up
* uP
* up
* finish tests
* finish
* uP
* finish
* fix test
* up
* naming consistency in tests
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* remove hardcoded 16
* Remove bogus
* fix some stuff
* finish
* improve logging
* docs
* upload
Co-authored-by: Nathan Lambert <nol@berkeley.edu>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* Initial version of `fp16` page.
* Fix typo in README.
* Change titles of fp16 section in toctree.
* PR suggestion
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* PR suggestion
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Clarify attention slicing is useful even for batches of 1
Explained by @patrickvonplaten after a suggestion by @keturn.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Do not talk about `batches` in `enable_attention_slicing`.
* Use Tip (just for fun), add link to method.
* Comment about fp16 results looking the same as float32 in practice.
* Style: docstring line wrapping.
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>