[docs] Update ONNX doc to use `optimum` (#2702)

* minor edits to onnx and openvino docs.

* Apply suggestions from code review

Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>

---------

Co-authored-by: Ella Charlaix <80481427+echarlaix@users.noreply.github.com>
Sayak Paul 2023-03-17 22:47:42 +05:30 committed by GitHub
parent f4bbcb29c0
commit a16957159e
2 changed files with 33 additions and 41 deletions

@@ -13,61 +13,53 @@ specific language governing permissions and limitations under the License.
 # How to use the ONNX Runtime for inference
 
-🤗 Diffusers provides a Stable Diffusion pipeline compatible with the ONNX Runtime. This allows you to run Stable Diffusion on any hardware that supports ONNX (including CPUs), and where an accelerated version of PyTorch is not available.
+🤗 [Optimum](https://github.com/huggingface/optimum) provides a Stable Diffusion pipeline compatible with ONNX Runtime.
 
 ## Installation
 
-- TODO
+Install 🤗 Optimum with the following command for ONNX Runtime support:
+
+```
+pip install optimum["onnxruntime"]
+```
 
 ## Stable Diffusion Inference
 
-The snippet below demonstrates how to use the ONNX runtime. You need to use `OnnxStableDiffusionPipeline` instead of `StableDiffusionPipeline`. You also need to download the weights from the `onnx` branch of the repository, and indicate the runtime provider you want to use.
+To load an ONNX model and run inference with the ONNX Runtime, you need to replace [`StableDiffusionPipeline`] with `ORTStableDiffusionPipeline`. In case you want to load
+a PyTorch model and convert it to the ONNX format on-the-fly, you can set `export=True`.
 
 ```python
-# make sure you're logged in with `huggingface-cli login`
-from diffusers import OnnxStableDiffusionPipeline
+from optimum.onnxruntime import ORTStableDiffusionPipeline
 
-pipe = OnnxStableDiffusionPipeline.from_pretrained(
-    "runwayml/stable-diffusion-v1-5",
-    revision="onnx",
-    provider="CUDAExecutionProvider",
-)
-
+model_id = "runwayml/stable-diffusion-v1-5"
+pipe = ORTStableDiffusionPipeline.from_pretrained(model_id, export=True)
 prompt = "a photo of an astronaut riding a horse on mars"
-image = pipe(prompt).images[0]
+images = pipe(prompt).images[0]
+pipe.save_pretrained("./onnx-stable-diffusion-v1-5")
 ```
 
-The snippet below demonstrates how to use the ONNX runtime with the Stable Diffusion upscaling pipeline.
+If you want to export the pipeline in the ONNX format offline and later use it for inference,
+you can use the [`optimum-cli export`](https://huggingface.co/docs/optimum/main/en/exporters/onnx/usage_guides/export_a_model#exporting-a-model-to-onnx-using-the-cli) command:
 
-```python
-from diffusers import OnnxStableDiffusionPipeline, OnnxStableDiffusionUpscalePipeline
-
-prompt = "a photo of an astronaut riding a horse on mars"
-steps = 50
-
-txt2img = OnnxStableDiffusionPipeline.from_pretrained(
-    "runwayml/stable-diffusion-v1-5",
-    revision="onnx",
-    provider="CUDAExecutionProvider",
-)
-small_image = txt2img(
-    prompt,
-    num_inference_steps=steps,
-).images[0]
-
-generator = torch.manual_seed(0)
-upscale = OnnxStableDiffusionUpscalePipeline.from_pretrained(
-    "ssube/stable-diffusion-x4-upscaler-onnx",
-    provider="CUDAExecutionProvider",
-)
-large_image = upscale(
-    prompt,
-    small_image,
-    generator=generator,
-    num_inference_steps=steps,
-).images[0]
+```bash
+optimum-cli export onnx --model runwayml/stable-diffusion-v1-5 sd_v15_onnx/
 ```
 
+Then perform inference:
+
+```python
+from optimum.onnxruntime import ORTStableDiffusionPipeline
+
+model_id = "sd_v15_onnx"
+pipe = ORTStableDiffusionPipeline.from_pretrained(model_id)
+prompt = "a photo of an astronaut riding a horse on mars"
+images = pipe(prompt).images[0]
+```
+
+Notice that we didn't have to specify `export=True` above.
+
+You can find more examples in [optimum documentation](https://huggingface.co/docs/optimum/).
+
 ## Known Issues
 
 - Generating multiple prompts in a batch seems to take too much memory. While we look into it, you may need to iterate instead of batching.
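As a companion to the known issue above, here is a minimal sketch of the "iterate instead of batching" workaround, reusing the `ORTStableDiffusionPipeline` snippet added in this diff; the prompt list and output filenames are illustrative only.

```python
from optimum.onnxruntime import ORTStableDiffusionPipeline

# Load the pipeline once, exporting the PyTorch weights to ONNX on the fly.
pipe = ORTStableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", export=True
)

prompts = [
    "a photo of an astronaut riding a horse on mars",
    "a watercolor painting of a lighthouse at dawn",
]

# Workaround for the batching memory issue: run the prompts one at a time
# rather than passing the whole list to the pipeline in a single call.
for i, prompt in enumerate(prompts):
    image = pipe(prompt).images[0]
    image.save(f"output_{i}.png")
```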

@@ -36,4 +36,4 @@ prompt = "a photo of an astronaut riding a horse on mars"
 images = pipe(prompt).images[0]
 ```
 
-You can find more examples in [optimum documentation](https://huggingface.co/docs/optimum/intel/inference#export-and-inference-of-stable-diffusion-models).
+You can find more examples (such as static reshaping and model compilation) in [optimum documentation](https://huggingface.co/docs/optimum/intel/inference#export-and-inference-of-stable-diffusion-models).
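For the static reshaping and model compilation mentioned in the OpenVINO doc above, a rough sketch along the lines of the linked optimum-intel examples could look like the following; the exact `reshape`/`compile` method signatures are an assumption to double-check against that documentation.

```python
from optimum.intel import OVStableDiffusionPipeline

model_id = "runwayml/stable-diffusion-v1-5"
# export=True converts the PyTorch weights to the OpenVINO IR format on the fly.
pipe = OVStableDiffusionPipeline.from_pretrained(model_id, export=True)

# Statically reshape the model to a fixed batch size and resolution so OpenVINO
# can optimize for known input shapes (assumed argument names, see docs above).
pipe.reshape(batch_size=1, height=512, width=512, num_images_per_prompt=1)

# Compile the model ahead of time instead of at the first inference call.
pipe.compile()

prompt = "a photo of an astronaut riding a horse on mars"
images = pipe(prompt, height=512, width=512).images[0]
```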