399 lines
12 KiB
Plaintext
399 lines
12 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "aa2c1ada",
|
|
"metadata": {
|
|
"id": "aa2c1ada"
|
|
},
|
|
"source": [
|
|
"# Dreambooth\n",
|
|
"### Notebook implementation by Joe Penna (@MysteryGuitarM on Twitter) - Improvements by David Bielejeski\n",
|
|
"https://github.com/JoePenna/Dreambooth-Stable-Diffusion\n",
|
|
"\n",
|
|
"### If on runpod / vast.ai / etc, spin up an A6000 or A100 pod using a Stable Diffusion template with Jupyter pre-installed."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "7b971cc0",
|
|
"metadata": {
|
|
"id": "7b971cc0"
|
|
},
|
|
"source": [
|
|
"## Build Environment"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "2AsGA1xpNQnb",
|
|
"metadata": {
|
|
"id": "2AsGA1xpNQnb"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# If running on Vast.AI, copy the code in this cell into a new notebook. Run it, then launch the `dreambooth_runpod_joepenna.ipynb` notebook from the jupyter interface.\n",
|
|
"!git clone https://github.com/JoePenna/Dreambooth-Stable-Diffusion"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "9e1bc458-091b-42f4-a125-c3f0df20f29d",
|
|
"metadata": {
|
|
"id": "9e1bc458-091b-42f4-a125-c3f0df20f29d",
|
|
"scrolled": true
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#BUILD ENV\n",
|
|
"%pip install omegaconf\n",
|
|
"%pip install einops\n",
|
|
"%pip install pytorch-lightning==1.6.5\n",
|
|
"%pip install protobuf==3.20.1\n",
|
|
"%pip install test-tube\n",
|
|
"%pip install transformers\n",
|
|
"%pip install kornia\n",
|
|
"%pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers\n",
|
|
"%pip install -e git+https://github.com/openai/CLIP.git@main#egg=clip\n",
|
|
"%pip install setuptools==59.5.0\n",
|
|
"%pip install pillow==9.0.1\n",
|
|
"%pip install torchmetrics==0.6.0\n",
|
|
"%pip install -e .\n",
|
|
"%pip install gdown\n",
|
|
"%pip install pydrive\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "dae11c10",
|
|
"metadata": {
|
|
"id": "dae11c10"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"## DOWNLOAD SD\n",
|
|
"%pip install gdown\n",
|
|
"\n",
|
|
"#Model hosted by David Bielejeski\n",
|
|
"!gdown https://drive.google.com/uc?id=1SXIecuTKoUTrPBh2Y95NiEtBBH9PbGAe"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "17d1d11a",
|
|
"metadata": {
|
|
"id": "17d1d11a"
|
|
},
|
|
"source": [
|
|
"# Regularization Images"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ed07a5df",
|
|
"metadata": {
|
|
"id": "ed07a5df"
|
|
},
|
|
"source": [
|
|
"Training teaches your new model both your token **but** re-trains your class simultaneously.\n",
|
|
"\n",
|
|
"From cursory testing, it does not seem like reg images affect the model too much. However, they do affect your class greatly, which will in turn affect your generations."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "67f9ff0c-b529-4c7c-8e26-8388d70a5d91",
|
|
"metadata": {
|
|
"id": "67f9ff0c-b529-4c7c-8e26-8388d70a5d91"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#GENERATE 400 images\n",
|
|
"!python scripts/stable_txt2img.py --seed 10 --ddim_eta 0.0 --n_samples 1 --n_iter 400 --scale 10.0 --ddim_steps 50 --ckpt model.ckpt \\\n",
|
|
"--prompt \"person\"\n",
|
|
"\n",
|
|
"# If you don't want to train against \"person\", change it to whatever you want above, and on some of the cells below:"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "3d1c7e1c",
|
|
"metadata": {
|
|
"id": "3d1c7e1c"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# zip up the files for downloading and reuse.\n",
|
|
"!apt-get install -y zip\n",
|
|
"!zip -r all_images.zip outputs/"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "mxPL2O0OLvBW",
|
|
"metadata": {
|
|
"id": "mxPL2O0OLvBW"
|
|
},
|
|
"source": [
|
|
"## Upload existing regularization images (200+ is recommended)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "e7EydXCjOV1v",
|
|
"metadata": {
|
|
"id": "e7EydXCjOV1v"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# grab the files here: https://github.com/djbielejeski/Stable-Diffusion-Regularization-Images\n",
|
|
"!git clone https://github.com/JoePenna/Stable-Diffusion-Regularization-Images\n",
|
|
"\n",
|
|
"!mkdir outputs\n",
|
|
"!mkdir outputs/txt2img-samples\n",
|
|
"!mkdir outputs/txt2img-samples/samples\n",
|
|
"\n",
|
|
"# RECOMMENDED - use person_ddim images (1500) - ddim @ 50 steps, CFG 7.0\n",
|
|
"mv -v Stable-Diffusion-Regularization-Images/person_ddim outputs/txt2img-samples outputs/txt2img-samples/samples\n",
|
|
"\n",
|
|
"# Use man_euler images\n",
|
|
"# provided by Niko Pueringer (Corridor Digital) - euler @ 40 steps, CFG 7.5\n",
|
|
"mv -v Stable-Diffusion-Regularization-Images/man_euler outputs/txt2img-samples outputs/txt2img-samples/samples\n",
|
|
"\n",
|
|
"# Use man_unsplash images - pictures from various photographers\n",
|
|
"mv -v Stable-Diffusion-Regularization-Images/man_unsplash outputs/txt2img-samples outputs/txt2img-samples/samples\n",
|
|
"\n",
|
|
"# Use woman_dimm images\n",
|
|
"# provided by David Bielejeski\n",
|
|
"mv -v Stable-Diffusion-Regularization-Images/woman_dimm outputs/txt2img-samples outputs/txt2img-samples/samples"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "309eab2d",
|
|
"metadata": {
|
|
"id": "309eab2d"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"!apt-get install -y unzip"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "459eb1b5",
|
|
"metadata": {
|
|
"id": "459eb1b5"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# Unzip existing files If you've uploaded the previously generated files, unzip them\n",
|
|
"!unzip -o folder_to_upload.zip -d .\n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "zshrC_JuMXmM",
|
|
"metadata": {
|
|
"id": "zshrC_JuMXmM"
|
|
},
|
|
"source": [
|
|
"# Upload your training images\n",
|
|
"\n",
|
|
"At the moment, it seems like 10-20 images of someone's face is enough.\n",
|
|
"\n",
|
|
"WARNING: Be sure to upload an *even* amount of images, otherwise the training inexplicably stops at 1500 steps.\n",
|
|
"\n",
|
|
"* 2-3 full body\n",
|
|
"* 3-5 upper body \n",
|
|
"* 5-12 close-up on face"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "bff9314e",
|
|
"metadata": {
|
|
"id": "bff9314e"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# or upload from your Google Drive\n",
|
|
"%pip install gdown\n",
|
|
"!gdown https://drive.google.com/uc?id={fileId}\n",
|
|
"!mv {name of file} training_samples/{name of file}"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ad4e50df",
|
|
"metadata": {
|
|
"id": "ad4e50df"
|
|
},
|
|
"source": [
|
|
"## Prep Training Variables"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "a26964af",
|
|
"metadata": {
|
|
"id": "a26964af"
|
|
},
|
|
"source": [
|
|
"Navigate to:\n",
|
|
"/workspace/Dreambooth-Stable-Diffusion/ldm/data/personalized.py\n",
|
|
"\n",
|
|
"Change `demoura` in line 11 to whatever you want your token to be.\n",
|
|
"\n",
|
|
"(Note: The original repo uses `sks` as a token -- but that has a relatively high correlation with the SKS semi-automatic rifle, which affects low-step generations)\n",
|
|
"\n",
|
|
"e.g. I changed mine to\n",
|
|
"```python\n",
|
|
"training_templates_smallest = [\n",
|
|
" 'joepenna {}',\n",
|
|
"]\n",
|
|
"```"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "37612c32",
|
|
"metadata": {
|
|
"id": "37612c32"
|
|
},
|
|
"source": [
|
|
"The last line of this file trains for `3000` steps.\n",
|
|
"/workspace/Dreambooth-Stable-Diffusion/configs/stable-diffusion/v1-finetune_unfrozen.yaml\n",
|
|
"\n",
|
|
"If training a person or subject, keep an eye on your project's `/images/train/samples_scaled_gs-00xxxx` generations.\n",
|
|
"\n",
|
|
"If training a style, keep an eye on your project's `/images/train/samples_gs-00xxxx` generations."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "12f19de7",
|
|
"metadata": {
|
|
"id": "12f19de7"
|
|
},
|
|
"source": [
|
|
"## Start Training (you should also change the code below)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "6fa5dd66-2ca0-4819-907e-802e25583ae6",
|
|
"metadata": {
|
|
"id": "6fa5dd66-2ca0-4819-907e-802e25583ae6",
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# START THE TRAINING\n",
|
|
"!python \"main.py\" \\\n",
|
|
" --base configs/stable-diffusion/v1-finetune_unfrozen.yaml \\\n",
|
|
" -t \\\n",
|
|
" --actual_resume \"model.ckpt\" \\\n",
|
|
" --reg_data_root \"/workspace/Dreambooth-Stable-Diffusion/outputs/txt2img-samples/samples\" \\\n",
|
|
" -n \"project_name\" \\\n",
|
|
" --gpus 0, \\\n",
|
|
" --data_root \"/workspace/Dreambooth-Stable-Diffusion/training_samples\" \\\n",
|
|
" --class_word \"person\" # << match this word to the class word from regularization images above"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "dc49d0bd",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Pruning (12GB to 2GB)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"# This will prune around 10GB from the ckpt file, then move it the \"trained_models\" folder\n",
|
|
"!python \"scripts/prune-ckpt.py\" --ckpt logs/{training_name}/checkpoints/last.ckpt\n",
|
|
"!mkdir trained_models\n",
|
|
"!mv logs/{training_name}/checkpoints/last-pruned.ckpt workspace/Dreambooth-Stable-Diffusion/trained_models/{training_name}.ckpt"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "d28d0139",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Generate Images With Your Trained Model!"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "80ddb03b",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"!python scripts/stable_txt2img.py \\\n",
|
|
"--ddim_eta 0.0 --n_samples 1 --n_iter 4 \\\n",
|
|
"--scale 7.0 --ddim_steps 50 --ckpt \"/workspace/Dreambooth-Stable-Diffusion/logs/{training_name}/last.ckpt\" \\\n",
|
|
"--prompt \"demoura person as a masterpiece portrait painting by John Singer Sargent in the style of Rembrandt\""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "cfm1iW_9OAw1",
|
|
"metadata": {
|
|
"id": "cfm1iW_9OAw1"
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"#Download the .ckpt file and use in your favorite Stable Diffusion repo!"
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"colab": {
|
|
"collapsed_sections": [],
|
|
"provenance": []
|
|
},
|
|
"kernelspec": {
|
|
"display_name": "Python 3 (ipykernel)",
|
|
"language": "python",
|
|
"name": "python3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.10.6"
|
|
},
|
|
"vscode": {
|
|
"interpreter": {
|
|
"hash": "b0fa6594d8f4cbf19f97940f81e996739fb7646882a419484c72d19e05852a7e"
|
|
}
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|