Created Train set scaling (markdown)

2022-10-19 12:29:06 -04:00 · 2022-10-19 12:29:06 -04:00 · 53b1241e74
parent b78b567e9a
commit 53b1241e74
1 changed files with 11 additions and 0 deletions
--- a/Train-set-scaling.md
+++ b/Train-set-scaling.md
@@ -0,0 +1,11 @@
 ![image](https://user-images.githubusercontent.com/15839076/196738736-382ba2ca-7f01-4d80-a655-a5e120ddce4b.png)
 Training settings have approximately followed this curve as the dataset has grown.
 Typical Dreambooth training falls on the far left of this graph: the community has many examples of people using 10-80 images and 800-2500 steps for typical face/person training. As I've scaled the Final Fantasy 7 Remake model, I've found a somewhat inverted exponential curve in the steps required as I add data, but I suspect this will flatten out to linear as we zoom out.  
 Adding in more ground truth data will also multiply the line along the Y axis (steps). For example, with 25% training data and 75% ground truth data, I suspect steps/training time will increase by a factor of 4, while better preserving the character of the base model.
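
The arithmetic behind that factor of 4 can be sketched as below. This is only a rough illustration of the suspicion above, assuming the step count scales inversely with the fraction of the dataset that is new training data; the helper name `scaled_steps` and the 2500-step baseline are hypothetical, not measured values.

```python
def scaled_steps(base_steps: int, training_fraction: float) -> int:
    """Estimate total steps when new training images make up only
    `training_fraction` of the dataset and the rest is ground truth data.

    Assumption (not measured): steps scale as 1 / training_fraction.
    """
    if not 0.0 < training_fraction <= 1.0:
        raise ValueError("training_fraction must be in (0, 1]")
    return round(base_steps / training_fraction)


# Hypothetical baseline of 2500 steps with no ground truth data mixed in.
# With 25% training data and 75% ground truth data, the multiplier is 1 / 0.25 = 4.
print(scaled_steps(2500, 0.25))  # 10000
```

A 50/50 mix would give a multiplier of 2 under the same assumption.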
 ![image](https://user-images.githubusercontent.com/15839076/196750042-58f20803-c5c7-426a-9adf-fd89085e531c.png)
 My expectation is that the price of adding substantial ground truth data (more steps and training time) will be repaid in better retention of the base model's quality.