update readme
Automatic captioning uses Salesforce's BLIP model to generate clean, sentence-style captions for input images before training.
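
For a rough idea of what this step is doing under the hood, here is a minimal captioning sketch using the Hugging Face `transformers` port of BLIP. The checkpoint name and API below are assumptions for illustration; the script in this repo may load and call BLIP differently.

```python
# Minimal BLIP captioning sketch (Hugging Face transformers port).
# Assumes the "Salesforce/blip-image-captioning-base" checkpoint and a CUDA GPU;
# illustrative only, not necessarily how this repo's script invokes BLIP.
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base").to("cuda")

image = Image.open("input/example.webp").convert("RGB")  # hypothetical input file
inputs = processor(image, return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=48)
print(processor.decode(out[0], skip_special_tokens=True))
```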
This requires an Nvidia GPU, but it is not terribly intensive work; it should run fine on something like a 1050 Ti with 4GB of VRAM.
I suggest using [Birme](https://www.birme.net/?target_width=512&target_height=512&auto_focal=false&image_format=webp&quality_jpeg=95&quality_webp=99) to crop and resize first, though there are various tools out there for this. I strongly suggest cropping carefully for training! It's best to crop to square before captioning, because you do not want captions describing things that are later removed by cropping. (See the sketch below for a scripted alternative.)
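
If you would rather batch the crop/resize in a script instead of Birme, a rough Pillow sketch (my own assumption, not part of this repo) could look like this:

```python
# Rough sketch: center-crop each image to a square and resize to 512x512.
# "raw/" is a hypothetical source folder that contains only images;
# output goes into the repo's input folder.
from pathlib import Path
from PIL import Image

Path("input").mkdir(exist_ok=True)
for path in Path("raw").glob("*"):
    img = Image.open(path).convert("RGB")
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side)).resize((512, 512), Image.LANCZOS)
    img.save(Path("input") / (path.stem + ".webp"), quality=99)
```

Note that a blind center crop can cut off the subject, which is exactly why cropping manually in a tool like Birme is worth the effort.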
Auto-captioning is fast and not very resource intensive, but it still uses the GPU; an Nvidia GPU with 2GB of VRAM is enough to run it.
Make sure the CUDA versions of torch and torchvision are installed by activating your environment and running this command:
```
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
```
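
To double-check that the CUDA build is actually the one active in your environment, a quick check with plain PyTorch calls (nothing specific to this repo) works:

```python
# Sanity check that the CUDA-enabled torch build is active.
import torch
print(torch.__version__)          # should report a +cu113 build
print(torch.cuda.is_available())  # should print True if the GPU is usable
```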
## Execute
Place your input images into the /input folder.