diff --git a/README.md b/README.md
index 4390916..46e9615 100644
--- a/README.md
+++ b/README.md
@@ -78,6 +78,9 @@ steps show the relative improvements of the checkpoints:
 
 Stable Diffusion is a latent diffusion model conditioned on the (non-pooled) text embeddings of a CLIP ViT-L/14 text encoder.
 
+
+#### Sampling Script
+
 After [obtaining the weights](#weights), link them
 ```
 mkdir -p models/ldm/stable-diffusion-v1/
@@ -88,24 +91,6 @@ and sample with
 python scripts/txt2img.py --prompt "a photograph of an astronaut riding a horse" --plms
 ```
 
-Another way to download and sample Stable Diffusion is by using the [diffusers library](https://github.com/huggingface/diffusers/tree/main#new--stable-diffusion-is-now-fully-compatible-with-diffusers)
-```py
-# make sure you're logged in with `huggingface-cli login`
-from torch import autocast
-from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
-
-pipe = StableDiffusionPipeline.from_pretrained(
-    "CompVis/stable-diffusion-v1-3-diffusers",
-    use_auth_token=True
-)
-
-prompt = "a photo of an astronaut riding a horse on mars"
-with autocast("cuda"):
-    image = pipe(prompt)["sample"][0]
-
-image.save("astronaut_rides_horse.png")
-```
-
 By default, this uses a guidance scale of `--scale 7.5`, [Katherine Crowson's implementation](https://github.com/CompVis/latent-diffusion/pull/51) of the [PLMS](https://arxiv.org/abs/2202.09778) sampler,
 and renders images of size 512x512 (which it was trained on) in 50 steps. All supported arguments are listed below (type `python scripts/txt2img.py --help`).
 
@@ -149,6 +134,28 @@ non-EMA to EMA weights. If you want to examine the effect of EMA vs no EMA, we p
 which contain both types of weights. For these, `use_ema=False` will load and use the non-EMA weights.
 
 
+#### Diffusers Integration
+
+Another way to download and sample Stable Diffusion is by using the [diffusers library](https://github.com/huggingface/diffusers/tree/main#new--stable-diffusion-is-now-fully-compatible-with-diffusers)
+```py
+# make sure you're logged in with `huggingface-cli login`
+from torch import autocast
+from diffusers import StableDiffusionPipeline, LMSDiscreteScheduler
+
+pipe = StableDiffusionPipeline.from_pretrained(
+    "CompVis/stable-diffusion-v1-3-diffusers",
+    use_auth_token=True
+)
+
+prompt = "a photo of an astronaut riding a horse on mars"
+with autocast("cuda"):
+    image = pipe(prompt)["sample"][0]
+
+image.save("astronaut_rides_horse.png")
+```
+
+
+
 ### Image Modification with Stable Diffusion
 
 By using a diffusion-denoising mechanism as first proposed by [SDEdit](https://arxiv.org/abs/2108.01073), the model can be used for different