Update README.md
This commit is contained in: parent 95711f635b, commit b6996e55d7
@@ -4,8 +4,15 @@ This is an implementation of Google's [Dreambooth](https://arxiv.org/abs/2208.12
This code repository is based on that of [Textual Inversion](https://github.com/rinongal/textual_inversion). Note that Textual Inversion only optimizes the word embedding, while Dreambooth fine-tunes the whole diffusion model.
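Conceptually, the difference looks like the toy PyTorch sketch below (the modules and sizes are stand-ins for illustration only, not names from this repo): Textual Inversion freezes the model and optimizes a single new embedding vector, while Dreambooth makes the whole diffusion model trainable.

```python
import torch
import torch.nn as nn

# Toy stand-ins; the real text encoder and U-Net are much larger.
token_embeddings = nn.Embedding(49408, 768)                                  # text-encoder vocabulary table
diffusion_model = nn.Sequential(nn.Linear(768, 768), nn.Linear(768, 768))    # pretend "U-Net"

# Textual Inversion: everything frozen, only one new 768-dim embedding is optimized.
for p in list(token_embeddings.parameters()) + list(diffusion_model.parameters()):
    p.requires_grad_(False)
new_token = nn.Parameter(token_embeddings.weight[0].clone())
ti_optimizer = torch.optim.AdamW([new_token], lr=5e-3)

# Dreambooth: the full diffusion model receives gradients and is updated.
for p in diffusion_model.parameters():
    p.requires_grad_(True)
db_optimizer = torch.optim.AdamW(diffusion_model.parameters(), lr=1e-6)

print("Textual Inversion trains", new_token.numel(), "parameters")
print("Dreambooth trains", sum(p.numel() for p in diffusion_model.parameters()), "parameters")
```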
The implementation makes minimal changes to the official Textual Inversion codebase. In fact, out of laziness, some Textual Inversion components, such as the embedding manager, are not deleted, even though they are never used here.
## Usage
### Preparation
To fine-tune a Stable Diffusion model, you need to obtain the pre-trained Stable Diffusion weights by following their [instructions](https://github.com/CompVis/stable-diffusion#stable-diffusion-v1). The weights can be downloaded from [HuggingFace](https://huggingface.co/CompVis). You can choose which checkpoint version to use; I use ```sd-v1-4-full-ema.ckpt```.
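If you want to sanity-check the downloaded checkpoint before training, here is a minimal sketch (the file path is an assumption; point it at wherever you saved the checkpoint):

```python
import torch

# Load the checkpoint on CPU just to confirm it is intact and see what it contains.
ckpt = torch.load("sd-v1-4-full-ema.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # CompVis .ckpt files wrap the weights in "state_dict"
print(f"{len(state_dict)} tensors; example key: {next(iter(state_dict))}")
```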
We also need to create a set of images for regularization, as Dreambooth's fine-tuning algorithm requires them; details of the algorithm can be found in the paper. The text prompt can be ```a class```, where ```class``` is a word describing the class of your object, such as ```dog```. I generate 8 images for regularization. Save the generated images (separately, one image per ```.png``` file) at ```/root/to/regularization/images```.
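One possible way to generate such a regularization set is with the external ```diffusers``` library. This is only a sketch, not a script from this repo; the Hub model ID, prompt, and output directory are assumptions you should adapt to your own class and paths.

```python
from pathlib import Path

import torch
from diffusers import StableDiffusionPipeline

out_dir = Path("/root/to/regularization/images")  # the directory used elsewhere in this README
out_dir.mkdir(parents=True, exist_ok=True)

# The model ID is an assumption; any Stable Diffusion v1 checkpoint compatible with your setup works.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

prompt = "a dog"  # "a <class>", with <class> describing your object
for i in range(8):  # 8 regularization images, matching the number used above
    image = pipe(prompt).images[0]
    image.save(out_dir / f"{i:02d}.png")
```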
### Training
### Generation