Reproducible images by supplying latents to pipeline (#247)

* Accept latents as input for StableDiffusionPipeline. * Notebook to demonstrate reusable seeds (latents). * More accurate type annotation Co-authored-by: Suraj Patil <surajp815@gmail.com> * Review comments: move to device, raise instead of assert. * Actually commit the test notebook. I had mistakenly pushed an empty file instead. * Adapt notebook to Colab. * Update examples readme. * Move notebook to personal repo. Co-authored-by: Suraj Patil <surajp815@gmail.com>
2022-08-25 15:47:05 +02:00 · 2022-08-25 15:47:05 +02:00 · bfe37f3159
parent 89793a97e2
commit bfe37f3159
2 changed files with 18 additions and 7 deletions
--- a/examples/inference/readme.md
+++ b/examples/inference/readme.md
@ -48,3 +48,7 @@ with autocast("cuda"):
 images[0].save("fantasy_landscape.png")
 ```
 You can also run this example on colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/patil-suraj/Notebooks/blob/master/image_2_image_using_diffusers.ipynb)
+
+## Tweak prompts reusing seeds and latents
+
+You can generate your own latents to reproduce results, or tweak your prompt on a specific result you liked. [This notebook](stable-diffusion-seeds.ipynb) shows how to do it step by step. You can also run it in Google Colab [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pcuenca/diffusers-examples/blob/main/notebooks/stable-diffusion-seeds.ipynb).
--- a/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py
+++ b/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py
@ -46,6 +46,7 @@ class StableDiffusionPipeline(DiffusionPipeline):
        guidance_scale: Optional[float] = 7.5,
        eta: Optional[float] = 0.0,
        generator: Optional[torch.Generator] = None,
+        latents: Optional[torch.FloatTensor] = None,
        output_type: Optional[str] = "pil",
        **kwargs,
    ):
@ -98,12 +99,18 @@ class StableDiffusionPipeline(DiffusionPipeline):
            # to avoid doing two forward passes
            text_embeddings = torch.cat([uncond_embeddings, text_embeddings])

-        # get the intial random noise
+        # get the initial random noise unless the user supplied it
+        latents_shape = (batch_size, self.unet.in_channels, height // 8, width // 8)
+        if latents is None:
            latents = torch.randn(
-            (batch_size, self.unet.in_channels, height // 8, width // 8),
+                latents_shape,
                generator=generator,
                device=self.device,
            )
+        else:
+            if latents.shape != latents_shape:
+                raise ValueError(f"Unexpected latents shape, got {latents.shape}, expected {latents_shape}")
+            latents = latents.to(self.device)

        # set timesteps
        accepts_offset = "offset" in set(inspect.signature(self.scheduler.set_timesteps).parameters.keys())