{
"cells": [
{
"cell_type": "markdown",
"id": "676114ae",
"metadata": {},
"source": [
"## Every Dream trainer\n",
"\n",
"You will need your data prepared first before starting! Don't waste rental fees if you're not ready to upload your files. Your files should be captioned before you start with either the caption as the filename or in text files for each image alongside the image files. See main README.md for more details. Tools are available to automatically caption your files.\n",
"\n",
"[Instructions](https://github.com/victorchall/EveryDream-trainer/blob/main/README.md)\n",
"\n",
"If you can sign up for Runpod here (shameless referral link): [Runpod](https://runpod.io?ref=oko38cd0)\n",
"\n",
"If you are confused by the wall of text, join the discord here: [EveryDream Discord](https://discord.gg/uheqxU6sXN)\n",
"\n",
"Make sure you have at least 40GB of Runpod **Volume** storage at a minimum so you don't waste training just 1 ckpt that is overtrained and have to start over. Penny pinching on storage is ultimately a waste of your time and money! This is setup to give you more than one ckpt so you don't overtrain.\n",
"\n",
"### Starting model\n",
"Make sure you have your hugging face token ready to download the 1.5 mode. You can get one here: https://huggingface.co/settings/tokens\n",
"If you don't have a User Access Token, create one. Or you can upload a starting checkpoint instead of using the HF download and skip that step, but you'll need to modify the starting model name when you start training (more info below)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0902e735",
"metadata": {},
"outputs": [],
"source": [
"# check system resources, make sure your GPU actually has 24GB\n",
"# You should see \"0 MB / 24576 MB\" in the middle of the printout\n",
"# if you see 0 MB / 22000 MB find a different instance or provider...\n",
"!grep Swap /proc/meminfo\n",
"!swapon -s\n",
"!nvidia-smi"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "bb6d14b7-3c37-4ec4-8559-16b4e9b8dd18",
"metadata": {},
"outputs": [],
"source": [
"!git clone https://github.com/victorchall/everydream-trainer\n",
"%cd everydream-trainer\n",
"\n",
"import codecs\n",
"finish_msg = codecs.encode('QBAR', 'rot_13')\n",
"\n",
"print(finish_msg)"
]
},
{
"cell_type": "markdown",
"id": "589bfca0",
"metadata": {
"tags": []
},
"source": [
"## Install dependencies\n",
"\n",
"**This will take a couple minutes. Wait until it says \"DONE\" to move on.** \n",
"You can ignore \"warnings.\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ab559338",
"metadata": {},
"outputs": [],
"source": [
"!pip install -q omegaconf\n",
"!pip install -q einops\n",
"!pip install -q pytorch-lightning==1.6.5\n",
"!pip install -q test-tube\n",
"!pip install -q transformers==4.19.2\n",
"!pip install -q kornia\n",
"!pip install -e git+https://github.com/CompVis/taming-transformers.git@master#egg=taming-transformers\n",
"!pip install -e git+https://github.com/openai/CLIP.git@main#egg=clip\n",
"!pip install -q setuptools==59.5.0\n",
"!pip install -q pillow==9.0.1\n",
"!pip install -q torchmetrics==0.6.0\n",
"!pip install -e .\n",
"!pip install huggingface_hub\n",
"!pip install ipywidgets==7.7.1\n",
"import ipywidgets as widgets\n",
"print(finish_msg)"
]
},
{
"cell_type": "markdown",
"id": "c230d91a",
"metadata": {},
"source": [
"## Now that dependencies are installed, ready to move on!"
]
},
{
"cell_type": "markdown",
"id": "17affc47",
"metadata": {},
"source": [
"## Log into huggingface\n",
"Run the cell below and paste your token into the prompt. You can get your token from your [huggingface account page](https://huggingface.co/settings/tokens).\n",
"\n",
"The token will not show on the screen, just press enter after you paste it.\n",
"\n",
"Then run the following cell to download the base checkpoint (may take a minute)."
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "02c8583e",
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "3b16e2951c6549f589afaadc67e5ac9d",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(HTML(value='<center> <img\\nsrc=https://huggingface.co/front/assets/huggingface_logo-noborder.sv…"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from huggingface_hub import notebook_login\n",
"\n",
"notebook_login()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "503322f5",
"metadata": {},
"outputs": [],
"source": [
"# Default is 1.5 with the new VAE, you can change this to another file on huggingface if you want:\n",
"from huggingface_hub import hf_hub_download\n",
"downloaded_model_path = hf_hub_download(\n",
" repo_id=\"panopstor/EveryDream\",\n",
" filename=\"sd_v1-5_vae.ckpt\",\n",
" use_auth_token=True\n",
")\n",
"print(downloaded_model_path) # cache location\n",
"print(finish_msg)"
]
},
{
"cell_type": "markdown",
"id": "0bf1e8cd",
"metadata": {},
"source": [
"# Upload training files\n",
"\n",
"Ues the navigation on the left to open **\"every-dream-trainer\"** folder, then open **\"input\"** and upload your training files using the **up arrow button** above the file explorer. \n",
"\n",
"You can upload your images while you wait for the base checkpoint to download above, but do not continue until the checkpoint download is done.\n",
"\n",
"If your captions are in .txt files, upload those in the same folder as the images. \n",
"\n",
"![upload](./demo/runpodupload.png)\n",
"\n",
"You can check there are files in the folder by running the cell below (optional, just prints first 10)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "af711a9c-7a50-49a1-a571-439d62a9587c",
"metadata": {},
"outputs": [],
"source": [
"!ls -U input | head -10\n",
"# at least a few image filenames should show below, if not you uploaded to the wrong folder"
]
},
{
"cell_type": "markdown",
"id": "873d9f3f",
"metadata": {},
"source": [
"## Tweak your YAML\n",
"You can adjust the YAML file to change the training parameters. The v1-fine yamls are in the everydream-trainer/configs/stable-diffusion folder. By default the \"v1-finetune_runpod.yaml\" is used here with some sane defaults for one subject with 20-30 images.\n",
"\n",
"You can also upload your own in that folder and use it below.\n",
"\n",
"Instructions are here: [EveryDream README](https://github.com/victorchall/EveryDream-trainer/blob/main/README.md) (hopefully you already read this?)\n",
"\n",
"[Runpod YAML](configs/stable-diffusion/v1-finetune_runpod.yaml) is a good starting point for small datasets (30-50 images) and **is the default in the command below**. It will only keep 2 checkpoints.\n",
"\n",
"If you are running on an A100 on Colab or otherwise, you can adjust the batch size up substantially. Batch size 16 on A100 40GB as been tested as working. In Colab, use the file navigation on the left to open the yaml.\n",
"\n",
"If you need help with larger training, join the Discord for advice on making a custom yaml."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "65e8a052-bc26-4339-81e0-3b4f78c1d286",
"metadata": {},
"outputs": [],
"source": [
"# change to use a custom yaml, but RUN THIS CELL NO MATTER WHAT to set your yaml path\n",
"my_yaml = \"configs/stable-diffusion/v1-finetune_runpod.yaml\"\n",
"# downloaded_model_path = \"v1-5-pruned-emaonly.ckpt\" # this is the default, but you can change it by uncommenting if you uploaded a file into /everydream-trainer\n",
"print(downloaded_model_path) # reminder in case something went wrong with download\n",
"print(f\"yaml set to: {my_yaml}\")"
]
},
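{
"cell_type": "markdown",
"id": "a1f00001",
"metadata": {},
"source": [
"Optional: peek at the main knobs in the selected yaml before you commit to a run. This is just a convenience grep; the key names in the pattern (`batch_size`, `max_epochs`, `repeats`) are the usual ones in the v1-finetune yamls, so adjust them if your custom yaml differs."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1f00002",
"metadata": {},
"outputs": [],
"source": [
"# optional sanity check: show common training settings in the selected yaml\n",
"# (the key names are the usual v1-finetune ones; adjust for a custom yaml)\n",
"!grep -n -E \"batch_size|max_epochs|repeats\" {my_yaml}"
]
},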
{
"cell_type": "markdown",
"id": "c7d909e7",
"metadata": {},
"source": [
"## Run the trainer\n",
"Next cell runs training. This will take a while depending on your number of images, repeats, and max_epochs. \n",
"\n",
"You can watch for test images in the logs folder."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c12e7cf3-42be-4537-a4f7-5723c0248562",
"metadata": {},
"outputs": [],
"source": [
"# run the trainer, wait until it finishes then SCROLL DOWN to the next cell\n",
"!python main.py --base {my_yaml} -t --actual_resume {downloaded_model_path} -n test --data_root input\n",
"print(finish_msg)"
]
},
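{
"cell_type": "markdown",
"id": "b2f00001",
"metadata": {},
"source": [
"Optional: list the sample images the trainer has written so far. The exact logs layout varies between versions, so this sketch just searches logs/ for image files instead of assuming a fixed path."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b2f00002",
"metadata": {},
"outputs": [],
"source": [
"# find sample images under logs/ (layout varies by version, so search broadly)\n",
"!find logs -name \"*.jpg\" -o -name \"*.png\" | head -10"
]
},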
{
"cell_type": "markdown",
"id": "e8c93085",
"metadata": {},
"source": [
"## Prune your checkpoints\n",
"This will create 2GB pruned files for all your checkpoints and delete the 11GB files.\n",
"\n",
"If you wish to resume, you may want to remove \"--delete\" command below and manually delete the 11GB files you dont want to keep. Typically you only need the last 1 checkpoint in 11gb for resuming."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5e70ae7e",
"metadata": {},
"outputs": [],
"source": [
"# prune the ckpts\n",
"!python scripts/autoprune_all.py --delete"
]
},
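{
"cell_type": "markdown",
"id": "c3f00001",
"metadata": {},
"source": [
"Optional: confirm pruning worked by listing the checkpoint files with their sizes; pruned ckpts should be roughly 2GB each."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c3f00002",
"metadata": {},
"outputs": [],
"source": [
"# list ckpt files with sizes; pruned files should be ~2GB each\n",
"!ls -lh *.ckpt"
]
},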
{
"cell_type": "markdown",
"id": "065dfd91",
"metadata": {},
"source": [
"## Test\n",
"Look in the file drawer on the left for your epoch ckpt names and try them out in the cell below one at a time. You can save time just downloading the one pruned file that looks the best. Try each out.\n",
"\n",
"Change the prompt and the ckpt_path below to appropriate values."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "954dfc20",
"metadata": {},
"outputs": [],
"source": [
"!python scripts/txt2img.py --ckpt_path \"epoch=02-step=01000.ckpt\" \\\n",
"--n_samples 2 \\\n",
"--n_iter 4 \\\n",
"--prompt \"a boy and his dog talking a walk down the sidewalk\" \\\n",
"--scale 6.0 \\\n",
"--outdir outputs "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e13533d6",
"metadata": {},
"outputs": [],
"source": [
"# run to show the images here, or use the file drawer on the left to look at them\n",
"import glob\n",
"from IPython.display import Image, display\n",
"for imageName in glob.glob('outputs/*.jpg'): #assuming JPG\n",
" display(Image(filename=imageName))\n",
" print(imageName)"
]
},
{
"cell_type": "markdown",
"id": "51456afe",
"metadata": {},
"source": [
"## Download your checkpoints\n",
"\n",
"Use the cell below to generate links, right click and save as to download. If you use Colab, you can skip this and use the Gdrive connect instead.\n",
"\n",
"**If the links don't work, you can double left click the ckpt file in the file drawer on the left, then go to \"File\" menu then \"Download\"** or use the Hugging Face upload below.\n",
"\n",
"[EveryDream Discord](https://discord.gg/uheqxU6sXN)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1f94f9ab",
"metadata": {},
"outputs": [],
"source": [
"import glob\n",
"from IPython.display import FileLink\n",
"for f in glob.glob(\"*.ckpt\"):\n",
" display(FileLink(f))\n",
"# right click save as to download the ckpt files"
]
},
{
"cell_type": "markdown",
"id": "c13d37a1",
"metadata": {},
"source": [
"# HuggingFace upload\n",
"Use the cell below to upload your checkpoint to your personal HuggingFace account if you want instead of manually downloading. You should already be authorized to Huggingface by token if you used the download/token cells above.\n",
"\n",
"Make sure to fill in the three fields at the top. This will only upload one file at a time, so you will need to run the cell below for each file you want to upload.\n",
"\n",
"* You can get your account name from your [HuggingFace account page](https://huggingface.co/settings/account). Look for your \"username\" field and paste it below.\n",
"\n",
"* You only need to setup a repository one time. You can create it here: [Create New HF Model](https://huggingface.co/new) Make sure you write down the repo name you make for future use. You can reuse it later.\n",
"\n",
"* You need to type the name of the ckpts one at a time in the cell below, find them in the left file drawer of your Runpod/Vast/Colab."
]
},
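{
"cell_type": "markdown",
"id": "d4f00001",
"metadata": {},
"source": [
"If you'd rather create the repository from code than from the website, the sketch below uses `create_repo` from huggingface_hub (the repo id is a placeholder; substitute your own username and repo name, and note the exact signature depends on your huggingface_hub version):"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4f00002",
"metadata": {},
"outputs": [],
"source": [
"# optional alternative to https://huggingface.co/new : create the repo from code\n",
"# \"your-username/your-repo-name\" is a placeholder; use your own values\n",
"from huggingface_hub import create_repo\n",
"create_repo(\"your-username/your-repo-name\", private=True, exist_ok=True)"
]
},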
{
"cell_type": "code",
"execution_count": null,
"id": "9fb962e0",
"metadata": {},
"outputs": [],
"source": [
"#list ckpts in root that are ready for download\n",
"import glob\n",
"\n",
"for f in glob.glob(\"*.ckpt\"):\n",
" print(f)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f7237ed4",
"metadata": {},
"outputs": [],
"source": [
"!huggingface-cli lfs-enable-largefiles\n",
"# fill in these three fields:\n",
"hfusername = \"panopstor\"\n",
"reponame = \"EveryDream\"\n",
"ckpt_name = \"epoch=01-step=00030-pruned.ckpt\"\n",
"\n",
"\n",
"target_name = ckpt_name.replace('-','').replace('=','')\n",
"import os\n",
"os.rename(ckpt_name,target_name)\n",
"repo_id=f\"{hfusername}/{reponame}\"\n",
"print(f\"uploading to HF: {repo_id}/{ckpt_name}\")\n",
"print(\"this make take a while...\")\n",
"\n",
"from huggingface_hub import HfApi\n",
"api = HfApi()\n",
"response = api.upload_file(\n",
" path_or_fileobj=target_name,\n",
" path_in_repo=target_name,\n",
" repo_id=repo_id,\n",
" repo_type=None,\n",
" create_pr=1,\n",
")\n",
"print(response)\n",
"print(finish_msg)\n",
"print(\"go to your repo and accept the PR this created to see your file\")"
]
},
{
"cell_type": "markdown",
"id": "54171a12",
"metadata": {},
"source": [
"# Gdrive connect\n",
"\n",
"For Colab only, copies your ckpts to your gdrive. If the EveryDreamCkpts folder already exists in your gdrive there will be an error, but it should still copy your files. "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "857015ff",
"metadata": {},
"outputs": [],
"source": [
"from google.colab import drive\n",
"drive.mount('/content/drive')\n",
"\n",
"!mkdir /content/drive/MyDrive/EveryDreamCkpts\n",
"!cp *.ckpt /content/drive/MyDrive/EveryDreamCkpts"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.13"
},
"vscode": {
"interpreter": {
"hash": "2e677f113ff5b533036843965d6e18980b635d0aedc1c5cebd058006c5afc92a"
}
}
},
"nbformat": 4,
"nbformat_minor": 5
}