clarify docs

2023-06-08 10:17:08 +02:00 · 2023-06-08 10:17:08 +02:00 · 607f1e3ac4
parent f37d285272
commit 607f1e3ac4
1 changed file with 4 additions and 2 deletions
--- a/doc/ADVANCED_TWEAKING.md
+++ b/doc/ADVANCED_TWEAKING.md
@@ -153,12 +153,14 @@ Very tentatively, I suggest closer to 0.10 for short term training, and lower va

 ## Keeping images together (custom batching)

-If you have a subset of your dataset that expresses the same style or concept, training quality may be improved by putting all of these images through the trainer in the same batch every step (rather than the default behaviour, which is to shuffle them randomly throughout the entire dataset).
+If you have a subset of your dataset that expresses the same style or concept, training quality may be improved by putting all of these images through the trainer together in the same batch or batches, instead of the default behaviour (which is to shuffle them randomly throughout the entire dataset).

-To control this, put a file called `batch_id.txt` in each batch with a unique name for the data. For example if you have a bunch of images of horses and you are trying to train them as a single concept, you can assign a batch id such as "my_horses" to these images by putting the word `my_horses` inside `batch_id.txt` in your folder with horse images. 
+To control this, put a file called `batch_id.txt` into a folder to give a unique name to the training data in this folder. For example, if you have a bunch of images of horses and you are trying to train them as a single concept, you can assign a unique name such as "my_horses" to these images by putting the word `my_horses` inside `batch_id.txt` in your folder with horse images. 

 > Note that because this creates extra aspect ratio buckets, you need to be very careful about correlating the number of images to your training batch size. Aim to have an exact multiple of `batch_size` images at each aspect ratio. For example, if your `batch_size` is 6 and you have images with aspect ratios 4:3, 3:4, and 9:16, you should add or delete images until you have an exact multiple of 6 images (i.e. 6, 12, 18, ...) for each aspect ratio. If you do not do this, the bucketer will duplicate images to fill up each aspect ratio bucket. You'll probably also want to use manual validation to prevent the validator from messing this up, too.

+If you are using `.yaml` files for captioning, you can alternatively add a `batch_id: ` entry to either `local.yaml` or the individual images' `.yaml` files. Note that neither `.yaml` nor `batch_id.txt` files act recursively: they do not apply to subfolders.
+

 # Stuff you probably don't need to mess with, but well here it is:
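
The note in the hunk above asks for an exact multiple of `batch_size` images at each aspect ratio within a batch id. Below is a minimal sketch of that check. It is not part of the trainer: the function name `images_needed_per_ratio` is hypothetical, and it assumes images are grouped by their exact reduced aspect ratio. The real bucketer may instead assign images to the nearest bucket, so treat the result as an estimate.

```python
from collections import Counter
from fractions import Fraction


def images_needed_per_ratio(sizes, batch_size):
    """Map each aspect ratio to the number of images still needed
    to reach the next multiple of batch_size.

    sizes is an iterable of (width, height) pairs in pixels.
    Assumption: images are grouped by exact reduced aspect ratio.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # Fraction reduces automatically, so 1024x768 and 2048x1536 both become 4/3.
    counts = Counter(Fraction(w, h) for w, h in sizes)
    # (-n) % batch_size is the distance from n up to the next multiple.
    return {ratio: (-n) % batch_size for ratio, n in counts.items()}


# Example folder: six 4:3 images, five 3:4 images, thirteen 9:16 images.
sizes = [(1024, 768)] * 6 + [(768, 1024)] * 5 + [(576, 1024)] * 13
shortfall = images_needed_per_ratio(sizes, batch_size=6)
for ratio, missing in shortfall.items():
    print(f"{ratio.numerator}:{ratio.denominator} needs {missing} more image(s)")
```

With `batch_size=6`, this reports 0 missing for 4:3, 1 for 3:4, and 5 for 9:16. Deleting images to reach the previous multiple works equally well, as the note says.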