Update LAION_SCRAPE.md

Signed-off-by: Victor Hall <victor.charles.hall@gmail.com>
Victor Hall 2023-01-09 22:46:22 -05:00 committed by GitHub
parent fc671937f0
commit 508a628fc5
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 2 additions and 1 deletions

@@ -8,6 +8,7 @@ It has been tested with 2B-en-aesthetic, but may need minor tweaks for some othe
https://huggingface.co/datasets/laion/laion2B-en-aesthetic
**This tool does not work unless you download a set of Laion parquet files, above link is suggested.** Download all 128 .parquet files and place them in the /laion folder.
The script will rename downloaded files to the best of its ability to the TEXT (caption) of the image with the original file extension, which can be plugged into the new class of caption-capable DreamBooth apps or the EveryDream trainer that will use the filename as the prompt for training.
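
As an illustration of the renaming step described in the line above, here is a minimal sketch of turning a caption into a filename while keeping the original extension. It is not the code in `scripts/download_laion.py`: the function name `caption_to_filename`, the set of stripped characters, the `max_len` cutoff, and the `untitled` fallback are all assumptions made for the example.

```python
import os
import re


def caption_to_filename(caption, original_name, max_len=200):
    """Hypothetical helper: build '<caption><original extension>'.

    Assumptions (not taken from the real script): characters that are
    invalid on common filesystems are dropped, whitespace is collapsed,
    and the caption is truncated to max_len characters.
    """
    # Keep the extension of the downloaded file exactly as it was.
    _, ext = os.path.splitext(original_name)

    # Drop characters that Windows or POSIX filesystems reject.
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "", caption)

    # Collapse runs of whitespace left behind by the removals.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()

    # Truncate, then trim trailing spaces and dots.
    cleaned = cleaned[:max_len].rstrip(" .")

    # Fall back to a placeholder if nothing usable remains.
    return (cleaned or "untitled") + ext
```

A trainer that reads the prompt from the filename would then see the cleaned caption text, with only the extension left over from the original download.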
@@ -47,4 +48,4 @@ You can throw commands in a shell/cmd script to run several searches, but I will
python scripts/download_laion.py --search_text " hokusai" --limit 200
python scripts/download_laion.py --search_text " bernini" --limit 200
python scripts/download_laion.py --search_text "Gustav Klimt" --limit 200
python scripts/download_laion.py --search_text "engon Schiele" --limit 200