documentation of new text encoder LR and aspect_ratio settings
commit ae281976ca (parent fe0083877f)
@@ -35,7 +35,8 @@ In place of `sample_prompts.txt` you can provide a `sample_prompts.json` file, w
     },
     {
         "prompt": "a photograph of ted bennet riding a bicycle",
-        "seed": -1
+        "seed": -1,
+        "aspect_ratio": 1.77778
     },
     {
         "random_caption": true,
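For orientation, here is a rough sketch of a complete `sample_prompts.json` using the new `aspect_ratio` key. The top-level keys are the ones described in the next hunk; the specific values, and the list form of `cfgs`, are illustrative rather than documented defaults:

```json
{
    "batch_size": 3,
    "seed": 555,
    "cfgs": [7, 4],
    "scheduler": "dpm++",
    "num_inference_steps": 15,
    "show_progress_bars": false,
    "samples": [
        {
            "prompt": "a photograph of ted bennet riding a bicycle",
            "seed": -1,
            "aspect_ratio": 1.77778
        },
        {
            "random_caption": true
        }
    ]
}
```

An `aspect_ratio` of 1.77778 requests a 16:9 sample; a sample can instead specify a `size` in multiples of 64.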
@@ -47,7 +48,7 @@ In place of `sample_prompts.txt` you can provide a `sample_prompts.json` file, w
 
 At the top you can set a `batch_size` (subject to VRAM limits), a default `seed` and `cfgs` to generate with, as well as a `scheduler` and `num_inference_steps` to control the quality of the samples. Available schedulers are `ddim` (the default) and `dpm++`. Finally, you can set `show_progress_bars` to `true` if you want to see progress bars during the sample generation process.
 
-Individual samples are defined under the `samples` key. Each sample can have a `prompt`, a `negative_prompt`, a `seed` (use `-1` to pick a different random seed each time), and a `size` (must be multiples of 64). Use `"random_caption": true` to pick a random caption from the training set each time.
+Individual samples are defined under the `samples` key. Each sample can have a `prompt`, a `negative_prompt`, a `seed` (use `-1` to pick a different random seed each time), and a `size` (must be multiples of 64) or an `aspect_ratio` (e.g. 1.77778 for 16:9). Use `"random_caption": true` to pick a random caption from the training set each time.
 
 ## LR
 
@@ -34,6 +34,8 @@ Lucidrains' [implementation](https://github.com/lucidrains/lion-pytorch) of the
 
 LR can be set in `optimizer.json` and left out of the main CLI arg or train.json, but if you use the main CLI arg or set it in the main train.json, that value will override the `optimizer.json` setting. This was done to make sure existing behavior will not break. To set the LR in `optimizer.json`, delete `"lr": 1.3e-6` from your main train.json and leave out the CLI arg.
 
+The text encoder LR can run at a different value from the U-net LR. This may help prevent over-fitting, especially if you're training from SD2 checkpoints. To set the text encoder LR, add a value for `text_encoder_lr_scale` to `optimizer.json`. For example, to set the text encoder LR to 50% of the U-net LR, add `"text_encoder_lr_scale": 0.5` to `optimizer.json`. The default value is `1.0`, meaning the text encoder and U-net are trained with the same LR.
+
 Betas, weight decay, and epsilon are documented in the [AdamW paper](https://arxiv.org/abs/1711.05101) and there is a wealth of information on the web, but consider tweaking them experimental. I cannot provide advice on what might be useful to tweak here.
 
 Note `lion` does not use epsilon.
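As a concrete illustration of the override rule above, a minimal `optimizer.json` sketch that carries the LR itself might look like the following (other optimizer keys such as `betas` and `weight_decay` are omitted here, and the value is only an example). For it to take effect, remove `"lr"` from your main train.json and leave the LR CLI arg unset:

```json
{
    "optimizer": "adamw8bit",
    "lr": 1.3e-6
}
```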
@@ -6,7 +6,7 @@
         "betas": "exponential decay rates for the moment estimates",
         "epsilon": "value added to denominator for numerical stability, unused for lion",
         "weight_decay": "weight decay (L2 penalty)",
-        "text_encoder_lr_scale": "if set, scale the text encoder's LR by this much relative to the unet LR"
+        "text_encoder_lr_scale": "scale the text encoder LR relative to the Unet LR. For example, if `lr` is 2e-6 and `text_encoder_lr_scale` is 0.5, the text encoder's LR will be set to `1e-6`."
     },
     "optimizer": "adamw8bit",
     "lr": 1e-6,
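As a worked example of the scale factor against the values shown above (a sketch, not the full shipped file): with `lr` at 1e-6 and `"text_encoder_lr_scale": 0.5`, the U-net trains at 1e-6 while the text encoder trains at 0.5 × 1e-6 = 5e-7.

```json
{
    "optimizer": "adamw8bit",
    "lr": 1e-6,
    "text_encoder_lr_scale": 0.5
}
```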