update lr recommendation

commit e5aa70c54f (parent 6c41be3539)

@@ -46,10 +46,12 @@ The value is defaulted at 0.04, which means 4% conditional dropout. You can set

Learning rate adjustment is a very important part of training. You can use the default settings, or you can tweak it. You should consider increasing the learning rate if you also increase your batch size (10+) using [gradient checkpointing](#gradient_checkpointing).

    -    --lr 1.5e-6 ^
    +    --lr 1.0e-6 ^
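
As a rough illustration of the batch-size caveat above, here is a minimal Python sketch of one common heuristic (square-root scaling of the learning rate with the batch-size ratio). The function name and the base batch size are hypothetical, and this heuristic is not taken from this project's docs.

```python
# Illustrative only: one common heuristic for raising the learning rate when
# the batch size goes up. Base values below are assumptions, not project defaults.
def scale_lr(base_lr: float, base_batch: int, new_batch: int) -> float:
    """Square-root scaling: lr grows with the square root of the batch-size ratio."""
    return base_lr * (new_batch / base_batch) ** 0.5

# e.g. moving from batch 4 at --lr 1.0e-6 to batch 10 suggests roughly 1.6e-6
print(f"--lr {scale_lr(1.0e-6, 4, 10):.1e}")
```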

By default, the learning rate is constant for the entire training session. However, if you want it to change by itself during training, you can use cosine.
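
To make the cosine option concrete, here is a minimal sketch using PyTorch's built-in `CosineAnnealingLR`. This only illustrates the shape of the schedule; the optimizer choice and step count are assumptions, and it is not how this trainer wires up its own scheduler.

```python
# Minimal sketch of a cosine learning-rate schedule in PyTorch (illustration only).
import torch

params = [torch.nn.Parameter(torch.zeros(1))]     # stand-in for the model's parameters
optimizer = torch.optim.AdamW(params, lr=1.0e-6)  # base LR from the recommendation above
total_steps = 1000                                # assumed length of the training run

scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=total_steps)

for step in range(total_steps):
    optimizer.step()    # a real training step would compute a loss and backprop first
    scheduler.step()    # LR follows a cosine curve from 1.0e-6 down toward 0
```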

General suggestion is 1e-6 for training SD1.5 at 512 resolution. For SD2.1 at 768, try a much lower value, such as 2e-7. [Validation](VALIDATION.md) can be helpful for tuning the learning rate.

## Clip skip

Aka "penultimate layer", this takes the output from the text encoder not from its last output layer, but from layers before it.