* move files a bit
* more refactors
* fix more
* more fixes
* fix more onnx
* make style
* upload
* fix
* up
* fix more
* up again
* up
* small fix
* Update src/diffusers/__init__.py
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* correct
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Initial code for attempt at improving SD <--> diffusers conversions for v2.0
* Updates to support round-trip between orig. SD 2.0 and diffusers models
* Corrected formatting to Black standard
* Correcting import formatting
* Fixed imports (properly this time)
* add some corrections
* remove inference files
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* add paint by example
* mkae loading possibel
* up
* Update src/diffusers/models/attention.py
* up
* finalize weight structure
* make example work
* make it work
* up
* up
* fix
* del
* add
* update
* Apply suggestions from code review
* correct transformer 2d
* finish
* up
* up
* up
* up
* fix
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* Apply suggestions from code review
* up
* finish
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* up
* convert dual unet
* revert dual attn
* adapt for vd-official
* test the full pipeline
* mixed inference
* mixed inference for text2img
* add image prompting
* fix clip norm
* split text2img and img2img
* fix format
* refactor text2img
* mega pipeline
* add optimus
* refactor image var
* wip text_unet
* text unet end to end
* update tests
* reshape
* fix image to text
* add some first docs
* dual guided pipeline
* fix token ratio
* propose change
* dual transformer as a native module
* DualTransformer(nn.Module)
* DualTransformer(nn.Module)
* correct unconditional image
* save-load with mega pipeline
* remove image to text
* up
* uP
* fix
* up
* final fix
* remove_unused_weights
* test updates
* save progress
* uP
* fix dual prompts
* some fixes
* finish
* style
* finish renaming
* up
* fix
* fix
* fix
* finish
Co-authored-by: anton-l <anton@huggingface.co>
* re-add RL model code
* match model forward api
* add register_to_config, pass training tests
* fix tests, update forward outputs
* remove unused code, some comments
* add to docs
* remove extra embedding code
* unify time embedding
* remove conv1d output sequential
* remove sequential from conv1dblock
* style and deleting duplicated code
* clean files
* remove unused variables
* clean variables
* add 1d resnet block structure for downsample
* rename as unet1d
* fix renaming
* rename files
* add get_block(...) api
* unify args for model1d like model2d
* minor cleaning
* fix docs
* improve 1d resnet blocks
* fix tests, remove permuts
* fix style
* add output activation
* rename flax blocks file
* Add Value Function and corresponding example script to Diffuser implementation (#884)
* valuefunction code
* start example scripts
* missing imports
* bug fixes and placeholder example script
* add value function scheduler
* load value function from hub and get best actions in example
* very close to working example
* larger batch size for planning
* more tests
* merge unet1d changes
* wandb for debugging, use newer models
* success!
* turns out we just need more diffusion steps
* run on modal
* merge and code cleanup
* use same api for rl model
* fix variance type
* wrong normalization function
* add tests
* style
* style and quality
* edits based on comments
* style and quality
* remove unused var
* hack unet1d into a value function
* add pipeline
* fix arg order
* add pipeline to core library
* community pipeline
* fix couple shape bugs
* style
* Apply suggestions from code review
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
* update post merge of scripts
* add mdiblock / outblock architecture
* Pipeline cleanup (#947)
* valuefunction code
* start example scripts
* missing imports
* bug fixes and placeholder example script
* add value function scheduler
* load value function from hub and get best actions in example
* very close to working example
* larger batch size for planning
* more tests
* merge unet1d changes
* wandb for debugging, use newer models
* success!
* turns out we just need more diffusion steps
* run on modal
* merge and code cleanup
* use same api for rl model
* fix variance type
* wrong normalization function
* add tests
* style
* style and quality
* edits based on comments
* style and quality
* remove unused var
* hack unet1d into a value function
* add pipeline
* fix arg order
* add pipeline to core library
* community pipeline
* fix couple shape bugs
* style
* Apply suggestions from code review
* clean up comments
* convert older script to using pipeline and add readme
* rename scripts
* style, update tests
* delete unet rl model file
* remove imports in src
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
* Update src/diffusers/models/unet_1d_blocks.py
* Update tests/test_models_unet.py
* RL Cleanup v2 (#965)
* valuefunction code
* start example scripts
* missing imports
* bug fixes and placeholder example script
* add value function scheduler
* load value function from hub and get best actions in example
* very close to working example
* larger batch size for planning
* more tests
* merge unet1d changes
* wandb for debugging, use newer models
* success!
* turns out we just need more diffusion steps
* run on modal
* merge and code cleanup
* use same api for rl model
* fix variance type
* wrong normalization function
* add tests
* style
* style and quality
* edits based on comments
* style and quality
* remove unused var
* hack unet1d into a value function
* add pipeline
* fix arg order
* add pipeline to core library
* community pipeline
* fix couple shape bugs
* style
* Apply suggestions from code review
* clean up comments
* convert older script to using pipeline and add readme
* rename scripts
* style, update tests
* delete unet rl model file
* remove imports in src
* add specific vf block and update tests
* style
* Update tests/test_models_unet.py
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
* fix quality in tests
* fix quality style, split test file
* fix checks / tests
* make timesteps closer to main
* unify block API
* unify forward api
* delete lines in examples
* style
* examples style
* all tests pass
* make style
* make dance_diff test pass
* Refactoring RL PR (#1200)
* init file changes
* add import utils
* finish cleaning files, imports
* remove import flags
* clean examples
* fix imports, tests for merge
* update readmes
* hotfix for tests
* quality
* fix some tests
* change defaults
* more mps test fixes
* unet1d defaults
* do not default import experimental
* defaults for tests
* fix tests
* fix-copies
* fix
* changes per Patrik's comments (#1285)
* changes per Patrik's comments
* update conversion script
* fix renaming
* skip more mps tests
* last test fix
* Update examples/rl/README.md
Co-authored-by: Ben Glickenhaus <benglickenhaus@gmail.com>
* Changes for VQ-diffusion VQVAE
Add specify dimension of embeddings to VQModel:
`VQModel` will by default set the dimension of embeddings to the number
of latent channels. The VQ-diffusion VQVAE has a smaller
embedding dimension, 128, than number of latent channels, 256.
Add AttnDownEncoderBlock2D and AttnUpDecoderBlock2D to the up and down
unet block helpers. VQ-diffusion's VQVAE uses those two block types.
* Changes for VQ-diffusion transformer
Modify attention.py so SpatialTransformer can be used for
VQ-diffusion's transformer.
SpatialTransformer:
- Can now operate over discrete inputs (classes of vector embeddings) as well as continuous.
- `in_channels` was made optional in the constructor so two locations where it was passed as a positional arg were moved to kwargs
- modified forward pass to take optional timestep embeddings
ImagePositionalEmbeddings:
- added to provide positional embeddings to discrete inputs for latent pixels
BasicTransformerBlock:
- norm layers were made configurable so that the VQ-diffusion could use AdaLayerNorm with timestep embeddings
- modified forward pass to take optional timestep embeddings
CrossAttention:
- now may optionally take a bias parameter for its query, key, and value linear layers
FeedForward:
- Internal layers are now configurable
ApproximateGELU:
- Activation function in VQ-diffusion's feedforward layer
AdaLayerNorm:
- Norm layer modified to incorporate timestep embeddings
* Add VQ-diffusion scheduler
* Add VQ-diffusion pipeline
* Add VQ-diffusion convert script to diffusers
* Add VQ-diffusion dummy objects
* Add VQ-diffusion markdown docs
* Add VQ-diffusion tests
* some renaming
* some fixes
* more renaming
* correct
* fix typo
* correct weights
* finalize
* fix tests
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Apply suggestions from code review
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* finish
* finish
* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
* start
* add more logic
* Update src/diffusers/models/unet_2d_condition_flax.py
* match weights
* up
* make model work
* making class more general, fixing missed file rename
* small fix
* make new conversion work
* up
* finalize conversion
* up
* first batch of variable renamings
* remove c and c_prev var names
* add mid and out block structure
* add pipeline
* up
* finish conversion
* finish
* upload
* more fixes
* Apply suggestions from code review
* add attr
* up
* uP
* up
* finish tests
* finish
* uP
* finish
* fix test
* up
* naming consistency in tests
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* remove hardcoded 16
* Remove bogus
* fix some stuff
* finish
* improve logging
* docs
* upload
Co-authored-by: Nathan Lambert <nol@berkeley.edu>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
* Conversion script
* ran black
* ran isort
* remove unused import
* map location so everything gets loaded onto CPU before conversion
* ran black again
* Update setup.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Fix typos
* Add a typo check action
* Fix a bug
* Changed to manual typo check currently
Ref: https://github.com/huggingface/diffusers/pull/483#pullrequestreview-1104468010
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Removed a confusing message
* Renamed "nin_shortcut" to "in_shortcut"
* Add memo about NIN
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* begin text2img conversion script
* add fn to convert config
* create config if not provided
* update imports and use UNet2DConditionModel
* fix imports, layer names
* fix unet coversion
* add function to convert VAE
* fix vae conversion
* update main
* create text model
* update config creating logic for unet
* fix config creation
* update script to create and save pipeline
* remove unused imports
* fix checkpoint loading
* better name
* save progress
* finish
* up
* up
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* [Vae and AutoencoderKL clean]
* save intermediate finished work
* more progress
* more progress
* finish modeling code
* save intermediate
* finish
* Correct tests
* up
* change model name
* renaming
* more changes
* up
* up
* up
* save checkpoint
* finish api / naming
* finish config renaming
* rename all weights
* finish really