Commit Graph

187 Commits

Author SHA1 Message Date
Victor Hall 3a6fe3b4a1 add huber loss, timestep clamping, slightly safer txt reading 2024-04-26 23:54:31 -04:00
Damian Stewart 9fc6ae7a09
prevent OOM with disabled unet when gradient checkpointing is enabled
unet needs to be in train() mode for gradient checkpointing to work
2024-01-16 10:23:52 +13:00
Victor Hall dfcc9e7f41 add dataloader plugin hooks for caption and pil image 2023-12-20 14:55:50 -05:00
Victor Hall a284c52dee update windows to torch 2.1 and add attention type option 2023-11-24 16:13:01 -05:00
Alex dcf2969640
Merge branch 'victorchall:main' into main 2023-11-17 20:52:34 +02:00
Victor Hall 6727b6d61f fix log_writer bug and move logs into specific project log folder 2023-11-17 12:30:57 -05:00
alexds9 5dc9f18061 1. Added AdaCoor optimizer.
2. Added pyramid noise.
3. Fixed problem with log_writer missing from EveryDreamOptimizer.
2023-11-17 16:08:43 +02:00
Victor Hall 0b843718b8
Merge pull request #237 from reijerh/misc-minor-fixes
Misc minor fixes + minor improvements to txt2img script
2023-11-13 13:35:58 -05:00
Gabriel Roldan b212b94d13
Fix device mismatch when using min snr gamma option 2023-11-13 12:17:56 -03:00
reijerh 7de666ec2d Misc minor fixes 2023-11-09 00:22:41 +01:00
Victor Hall 6ea721887c allow scheduler change for training 2023-11-05 21:14:54 -05:00
Victor Hall 21361a3622 option to swap training scheduler 2023-11-05 20:54:14 -05:00
Victor Hall c0a1955164
Merge pull request #233 from damian0815/feat_the_acculmunator
plugin: grad accumulation scheduler
2023-11-05 19:53:11 -05:00
Damian Stewart c485d4ea60
fix device mismatch with loss_scale 2023-11-01 09:29:41 +01:00
Damian Stewart a7343ad190 fix scale batch calculation 2023-11-01 08:11:42 +01:00
Damian Stewart da731268b2 put a file loss_scale.txt containing a float in a training folder to apply loss scale (eg -1 for negative examples) 2023-10-31 10:06:21 +01:00
Damian Stewart 9a69ce84cb typo 2023-10-22 20:23:57 +02:00
Damian Stewart 6434844432 add missing json, fix error 2023-10-22 20:09:53 +02:00
Damian Stewart 9396d2156e Merge remote-tracking branch 'upstream/main' into feat_the_acculmunator 2023-10-22 19:35:32 +02:00
Damian Stewart 26a1475f0c initial implementation of the_acculmunator 2023-10-22 19:26:35 +02:00
Gabriel Roldán f301677881 Fix save_ckpt_dir not being used when saving model 2023-10-01 00:09:24 -03:00
Victor Hall e8e4f0c2ea
Merge pull request #214 from luisgabrielroldan/keep_tags
Add --keep_tags to keep first N tags fixed on shuffle
2023-09-25 13:10:21 -04:00
Victor Hall a9c98f5866 bugged flag 2023-09-22 10:16:56 -04:00
Victor Hall 166c2e74e1 off by one on last epoch save 2023-09-21 21:29:36 -04:00
Gabriel Roldán 43984f2ad3
Add --keep_tags to keep first N tags fixed on shuffle 2023-09-20 19:53:30 -03:00
Victor Hall 09aa13c3dd
Merge branch 'main' into feat_rolling_save 2023-09-20 16:32:37 -04:00
Victor Hall 2dff3aa8d1 ema update 2023-09-18 16:13:22 -04:00
Damian Stewart a68ebe3658 fix typo 2023-09-17 20:17:54 +02:00
Damian Stewart 3fddef3698 put back make_save_path and fix error in plugin runner 2023-09-17 20:16:26 +02:00
Victor Hall fa5b38e26b some minor updates to ema 2023-09-12 21:37:27 -04:00
alexds9 7259ce873b 1. Samples format change to make sure the global step appears before the "ema" indication. 2023-09-11 00:13:26 +03:00
alexds9 39b3082bf4 1. Making sure to release VRAM in samples. 2023-09-10 22:42:01 +03:00
alexds9 d2d493c911 1. New parameters added to train.json and trainSD21.json - disabled by default.
2. Description added to ADVANCED_TWEAKING.md
2023-09-10 20:06:50 +03:00
alexds9 5b1760fff2 1. Added an argument ema_decay_resume_model to load an EMA model - it's loaded alongside the main model instead of copying the normal model. It's optional; without a loaded EMA model, it will copy the regular model to be the first EMA model, just like before.
2. Fixed findlast option for regular models not to load EMA models by default.
3. findlast can be used to load EMA model too when used with ema_decay_resume_model.
4. Added ema_device variable to store the device in torch type.
5. Cleaned prints and comments.
2023-09-07 19:53:20 +03:00
alexds9 cf4a082e11 1. Fix to EMA samples arguments not respecting False value. 2023-09-06 23:04:12 +03:00
alexds9 5bcf9407f0 1. Improved EMA support: sample generation with EMA/NOT-EMA arguments, saving checkpoints and diffusers for both, and ema_decay_target implemented.
2. enable_zero_terminal_snr separated from zero_frequency_noise_ratio.
2023-09-06 22:37:10 +03:00
alexds9 23df727a1f Added support for:
1. EMA decay. The EMA decay model is updated every ema_decay_interval by (1 - ema_decay_rate) and can be stored on CPU to save VRAM. Only the EMA model is saved now.
2. min_snr_gamma - improves convergence speed; more info: https://arxiv.org/abs/2303.09556
3. load_settings_every_epoch - Will load 'train.json' at the start of every epoch.
2023-09-06 13:38:52 +03:00
Damian Stewart 9b5b96a50b fixes for ZTSNR training 2023-08-15 20:49:28 +02:00
Victor Hall 8007869b84 log exception if something blows up so it ends up in the .log 2023-07-05 16:02:18 -04:00
Victor Hall 42c417171d improve plugins 2023-07-04 17:29:22 -04:00
Victor Hall 1afaf59ec9 fix default for plugins 2023-06-29 22:00:16 -04:00
Victor Hall a72d455fc5 missed issue with plugin 2023-06-29 20:44:14 -04:00
Victor Hall aa7e004869 first crack at plugins 2023-06-27 20:53:48 -04:00
Damian Stewart 227f56427b write correct epoch number of final save, and add flags to disable grad scaler tweaks, last epoch renaming, ckpt save 2023-06-17 19:18:04 +02:00
Victor Hall 6f64efaaaa
Merge pull request #193 from damian0815/feat_user_defined_batching
User defined batching
2023-06-10 13:08:19 -04:00
damian 59fc9891d4 shuffle named batches while respecting and accounting for grad_accum 2023-06-07 18:07:37 +02:00
Damian Stewart 53d0686086 add a batch_id.txt file to subfolders or a batch_id key to local yaml to force images for that folder to be processed in the same batch 2023-06-05 01:02:27 +02:00
Pat Shanahan e7d199e712
Fix spelling error 2023-06-04 10:13:37 -05:00
Victor Hall 1155a28867 zero terminal fixes 2023-06-03 21:41:56 -04:00
Victor Hall 81b7b00df7 use trained betas for zero terminal snr 2023-06-03 17:17:04 -04:00