You can edit the example `train.json` file to your liking, then run the following command:
python train.py --config train.json
Be careful when editing the JSON file: any syntax error will cause the program to crash. Validate the file before running it, either with an online validator such as https://jsonlint.com/ or by opening it in VS Code, which highlights JSON syntax errors.
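If you would rather check the file locally, Python's standard `json` module can do the same job. This is a minimal sketch, not part of the trainer; the helper names `check_json` and `check_json_file` are made up for illustration.

```python
import json


def check_json(text):
    """Return (True, None) if text parses as JSON, else (False, message)."""
    try:
        json.loads(text)
        return True, None
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing failed
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


def check_json_file(path):
    """Read a file as UTF-8 and check that its contents parse as JSON."""
    with open(path, encoding="utf-8") as f:
        return check_json(f.read())


# Example use (hypothetical), run from the folder that holds train.json:
#   ok, message = check_json_file("train.json")
#   print("valid" if ok else message)
```

This only confirms the file is well-formed JSON; it does not check that the keys or values are ones `train.py` accepts.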
One thing to watch for: if the path in `data_root` or `resume_ckpt` contains backslashes, write each one as a double backslash (`\\`) or use a single forward slash (`/`) instead. There is an example `train.json` in the repo root.
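The reason is that a backslash starts an escape sequence inside a JSON string. The short sketch below uses Python's standard `json` module to show what a valid entry looks like; `encode_path_entry` is a hypothetical helper, and `x:\mydata` is only an example path.

```python
import json


def encode_path_entry(key, path):
    """Return a one-entry JSON object with the path correctly escaped."""
    return json.dumps({key: path})


# A raw string keeps the single backslash exactly as typed.
print(encode_path_entry("data_root", r"x:\mydata"))
# prints {"data_root": "x:\\mydata"}

# Forward slashes need no escaping at all.
print(json.loads('{"data_root": "x:/mydata"}')["data_root"])
# prints x:/mydata

# A single backslash is rejected because "\m" is not a valid JSON escape.
try:
    json.loads(r'{"data_root": "x:\mydata"}')
except json.JSONDecodeError as err:
    print("invalid:", err.msg)
```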
I recommend copying one of the examples below into a text file for future reference. Your settings are recorded in the logs folder, but you still need to write the command that starts training yourself.
Resuming from a checkpoint, 50 epochs, batch size 6, 3e-6 learning rate, constant scheduler, generate samples every 200 steps, 10 minute checkpoint interval, adam8bit, and using the default "input" folder for training data:
Training from the SD2 512 base model, 18 epochs, batch size 4, 1.2e-6 learning rate, constant LR, generate samples every 100 steps, 30 minute checkpoint interval, adam8bit, using images in the x:\mydata folder, training at resolution class of 640:
Training from the "SD21" model on the "jets" dataset on another drive, for 50 epochs, batch size 6, 1.5e-6 learning rate, cosine scheduler that will decay in 1500 steps, generate samples every 100 steps, save a checkpoint every 20 epochs, and use the AdamW 8bit optimizer:
If you want to resume training, point to the checkpoint folder in the logs, as described above, whenever possible, rather than converting a 2.0GB or 2.5GB pruned file back.