Kohaku-Blueleaf
209c26a1cb
improve efficiency and support more devices
2024-01-09 22:11:44 +08:00
AUTOMATIC1111
a70dfb64a8
change import statements for #14478
2023-12-31 22:38:30 +03:00
Aarni Koskela
5768afc776
Add utility to inspect a model's parameters (to get dtype/device)
2023-12-31 13:22:43 +02:00
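The utility boils down to grabbing any parameter from a module and reading its attributes. A minimal sketch; the helper name and placement are assumptions, not necessarily the upstream API:

```python
import torch

def get_param(model: torch.nn.Module) -> torch.nn.Parameter:
    """Return the first parameter of a module, so callers can read the
    model's dtype/device without special-casing layers (sketch)."""
    for param in model.parameters():
        return param
    raise ValueError(f"No parameters found in {type(model).__name__}")

# usage: p = get_param(unet); p.dtype, p.device
```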
Kohaku-Blueleaf
9a15ae2a92
Merge branch 'dev' into test-fp8
2023-12-03 10:54:54 +08:00
AUTOMATIC1111
af5f0734c9
Merge pull request #14171 from Nuullll/ipex
...
Initial IPEX support for Intel Arc GPU
2023-12-02 19:22:32 +03:00
Kohaku-Blueleaf
110485d5bb
Merge branch 'dev' into test-fp8
2023-12-02 17:00:09 +08:00
AUTOMATIC1111
88736b5557
Merge pull request #14131 from read-0nly/patch-1
...
Update devices.py - Make 'use-cpu all' actually apply to 'all'
2023-12-02 09:46:19 +03:00
Nuullll
7499148ad4
Disable IPEX autocast due to its poor performance
2023-12-02 14:00:46 +08:00
Nuullll
8b40f475a3
Initial IPEX support
2023-11-30 20:22:46 +08:00
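For context, a hedged sketch of how IPEX device selection generally works: importing intel_extension_for_pytorch registers an "xpu" backend, which then behaves like any other torch device. The function name here is illustrative:

```python
import torch

def get_xpu_device():
    try:
        import intel_extension_for_pytorch  # noqa: F401  (registers torch.xpu)
    except ImportError:
        return None
    if hasattr(torch, "xpu") and torch.xpu.is_available():
        return torch.device("xpu")
    return None
```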
obsol
3cd6e1d0a0
Update devices.py
...
Fixes an issue where "--use-cpu all" properly makes SD run on the CPU but leaves ControlNet (and other extensions, I presume) pointed at the GPU, causing a crash in ControlNet due to a device mismatch between SD and CN.
https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/14097
2023-11-27 19:21:43 -05:00
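The underlying change is small: the per-task device lookup must honor "all" in addition to the task's own name. A sketch along the lines of the webui's devices.get_device_for, with the use_cpu list standing in for the parsed --use-cpu argument:

```python
import torch

def get_device_for(task: str, use_cpu: list[str]) -> torch.device:
    # "all" must force CPU for every caller, including extensions like
    # ControlNet that ask with their own task name, not just known tasks
    if task in use_cpu or "all" in use_cpu:
        return torch.device("cpu")
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")

# get_device_for("controlnet", use_cpu=["all"]) -> device(type='cpu')
```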
Kohaku-Blueleaf
043d2edcf6
Better naming
2023-11-19 15:56:31 +08:00
Kohaku-Blueleaf
598da5cd49
Use options instead of cmd_args
2023-11-19 15:50:06 +08:00
KohakuBlueleaf
ddc2a3499b
Add MPS manual cast
2023-10-28 16:52:35 +08:00
Kohaku-Blueleaf
d4d3134f6d
ManualCast for 10/16 series GPUs
2023-10-28 15:24:26 +08:00
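Both commits build on the same "manual cast" idea: on hardware without working fp16 autocast (GTX 10/16 series, MPS), wrap the forward of the compute layers so tensor inputs are cast to the working dtype on the fly. A condensed sketch, assuming the model weights are already stored in the target dtype; not the exact upstream code:

```python
import contextlib
import torch

def manual_cast_forward(orig_forward, dtype):
    # cast floating-point tensor arguments to the working dtype before
    # the layer runs, standing in for the autocast these devices lack
    def forward(self, *args, **kwargs):
        def cast(t):
            return t.to(dtype) if torch.is_tensor(t) and t.is_floating_point() else t
        return orig_forward(self, *[cast(a) for a in args],
                            **{k: cast(v) for k, v in kwargs.items()})
    return forward

@contextlib.contextmanager
def manual_cast(dtype=torch.float16):
    # patch the compute layers' class-level forward; restore on exit
    patched = {}
    for cls in (torch.nn.Linear, torch.nn.Conv2d):
        patched[cls] = cls.forward
        cls.forward = manual_cast_forward(cls.forward, dtype)
    try:
        yield
    finally:
        for cls, orig in patched.items():
            cls.forward = orig
```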
Kohaku-Blueleaf
eaa9f5162f
Add CPU fp8 support
...
Since norm layers need fp32, only the linear operation layers (conv2d/linear) are converted.
Also, the TE uses some PyTorch functions that don't support bf16 amp on CPU, so I added a condition to indicate whether the autocast is for the unet.
2023-10-24 01:49:05 +08:00
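The conversion described here only touches weight storage: norm layers stay in fp32, and the fp8 weights are cast back up at forward time (e.g. via a manual-cast wrapper like the one sketched above). A rough sketch, assuming a PyTorch build with float8 support (torch >= 2.1):

```python
import torch

def convert_linear_ops_to_fp8(model: torch.nn.Module) -> torch.nn.Module:
    # only the linear operation layers are stored as float8; norm
    # layers keep fp32, as noted in the commit message above
    for module in model.modules():
        if isinstance(module, (torch.nn.Linear, torch.nn.Conv2d)):
            module.weight.data = module.weight.data.to(torch.float8_e4m3fn)
    return model
```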
AUTOMATIC1111
46375f0592
fix for crash when running #12924 without --device-id
2023-09-09 09:39:37 +03:00
catboxanon
5681bf8016
More accurate check for enabling cuDNN benchmark on 16XX cards
2023-08-31 14:57:16 -04:00
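Compute capability alone is not enough here: 7.5 covers both GTX 16xx and RTX 20xx, so the refined check also matches on the device name. A sketch of the test:

```python
import torch

def enable_cudnn_benchmark_for_gtx16xx():
    # 16xx (Turing, capability 7.5) cards need cudnn.benchmark to do
    # fp16, but capability 7.5 alone also matches RTX 20xx cards;
    # hence the additional device-name check
    if not torch.cuda.is_available():
        return
    for devid in range(torch.cuda.device_count()):
        if (torch.cuda.get_device_capability(devid) == (7, 5)
                and torch.cuda.get_device_name(devid).startswith("NVIDIA GeForce GTX 16")):
            torch.backends.cudnn.benchmark = True
```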
AUTOMATIC1111
386245a264
split shared.py into multiple files; should resolve all circular reference import errors related to shared.py
2023-08-09 10:25:35 +03:00
AUTOMATIC1111
0d5dc9a6e7
rework RNG to use generators instead of generating noises beforehand
2023-08-09 08:43:31 +03:00
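The rework replaces eagerly precomputed noise tensors with per-seed torch.Generator objects that are consulted lazily. A simplified sketch; the class name mirrors the webui's rng.ImageRNG, but the details here are assumptions:

```python
import torch

class ImageRNG:
    def __init__(self, shape, seeds, device="cpu"):
        self.shape, self.device = shape, device
        self.generators = []
        for seed in seeds:
            g = torch.Generator(device=device)
            g.manual_seed(seed)
            self.generators.append(g)

    def next(self):
        # draw a fresh batch of noise on demand instead of storing it
        return torch.stack([torch.randn(self.shape, generator=g, device=self.device)
                            for g in self.generators])

# rng = ImageRNG((4, 64, 64), seeds=[1, 2]); noise = rng.next()
```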
AUTOMATIC1111
fca42949a3
rework torchsde._brownian.brownian_interval replacement to use device.randn_local and respect the NV setting.
2023-08-03 07:18:55 +03:00
AUTOMATIC1111
84b6fcd02c
add NV option for the Random number generator source setting, which allows generating the same pictures on CPU/AMD/Mac as on NVIDIA video cards.
2023-08-03 00:00:23 +03:00
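The NV source works by reimplementing, on the CPU, the Philox counter-based algorithm that CUDA's randn uses (the webui ships this as modules/rng_philox.py), so every backend can reproduce NVIDIA's noise. A hedged sketch of the dispatch only, with the Philox generator elided:

```python
import torch

def randn_local(seed, shape, source="GPU", device="cpu"):
    # "NV" would substitute a CPU-side Philox generator matching CUDA's
    # randn output (modules/rng_philox.py in the webui); "CPU" draws on
    # the CPU and moves the result, making seeds reproducible everywhere
    if source == "CPU":
        g = torch.Generator(device="cpu")
        g.manual_seed(seed)
        return torch.randn(shape, generator=g).to(device)
    g = torch.Generator(device=device)
    g.manual_seed(seed)
    return torch.randn(shape, generator=g, device=device)
```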
Aarni Koskela
b85fc7187d
Fix MPS cache cleanup
...
Importing torch does not import torch.mps, so the call failed.
2023-07-11 12:51:05 +03:00
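The fix itself is worth spelling out, since it trips people up: torch.mps is a submodule that "import torch" does not load. Something like:

```python
def torch_mps_gc() -> None:
    # "import torch" alone does not expose torch.mps; import the
    # submodule explicitly before calling empty_cache()
    from torch.mps import empty_cache
    empty_cache()
```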
AUTOMATIC1111
da8916f926
added torch.mps.empty_cache() to torch_gc()
...
changed a bunch of places that use torch.cuda.empty_cache() to use torch_gc() instead
2023-07-08 17:13:18 +03:00
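Centralizing cache cleanup in torch_gc() means every call site gets the right backend behavior. A sketch of the dispatcher, with per-backend details simplified:

```python
import torch

def torch_gc():
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        torch.cuda.ipc_collect()
    if getattr(torch.backends, "mps", None) and torch.backends.mps.is_available():
        torch_mps_gc()  # as sketched above; needs the explicit torch.mps import
```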
Aarni Koskela
ba70a220e3
Remove a bunch of unused/vestigial code
...
As found by Vulture and some eyes
2023-06-05 22:43:57 +03:00
AUTOMATIC
8faac8b963
run a basic torch calculation at startup in parallel to reduce the performance impact of the first generation
2023-05-21 21:55:14 +03:00
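The trick is to trigger PyTorch's lazy one-time initialization, which can take seconds and several hundred MB, in a background thread during startup. Roughly (the device/dtype defaults here are safe placeholders, not the upstream values):

```python
import threading
import torch

def first_time_calculation(device="cpu", dtype=torch.float32):
    # any small Linear call forces the expensive one-time backend
    # initialization, so the first real generation is fast
    x = torch.zeros((1, 1)).to(device, dtype)
    linear = torch.nn.Linear(1, 1).to(device, dtype)
    linear(x)

threading.Thread(target=first_time_calculation).start()
```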
AUTOMATIC
028d3f6425
ruff auto fixes
2023-05-10 11:05:02 +03:00
AUTOMATIC
5fe0dd79be
rename CPU RNG to RNG source in settings, add infotext and parameters copypaste support to RNG source
2023-04-29 11:29:37 +03:00
Deciare
d40e44ade4
Option to use CPU for random number generation.
...
Makes a given manual seed generate the same images across different
platforms, independently of the GPU architecture in use.
Fixes #9613.
2023-04-18 23:27:46 -04:00
brkirch
1b8af15f13
Refactor Mac specific code to a separate file
...
Move most Mac-related code to a separate file, and don't even load it unless the web UI is run under macOS.
2023-02-01 14:05:56 -05:00
brkirch
2217331cd1
Refactor MPS fixes to CondFunc
2023-02-01 06:36:22 -05:00
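CondFunc is the webui's small hijacking helper: wrap a callable so a substitute runs only when a predicate on the arguments holds. A condensed reimplementation of the pattern; the real one in sd_hijack_utils patches by dotted path string:

```python
class CondFunc:
    def __init__(self, orig_func, sub_func, cond_func):
        self._orig, self._sub, self._cond = orig_func, sub_func, cond_func

    def __call__(self, *args, **kwargs):
        # run the substitute only when the condition holds; otherwise
        # fall through to the original, untouched behavior
        if self._cond(self._orig, *args, **kwargs):
            return self._sub(self._orig, *args, **kwargs)
        return self._orig(*args, **kwargs)

# e.g. apply an MPS-only cumsum workaround, leaving other devices alone:
# torch.cumsum = CondFunc(torch.cumsum,
#                         lambda orig, t, *a, **kw: orig(t.to(torch.int32), *a, **kw),
#                         lambda orig, t, *a, **kw: t.device.type == "mps")
```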
brkirch
7738c057ce
MPS fix is still needed :(
...
Apparently I did not test with large enough images to trigger the bug with torch.narrow on MPS
2023-02-01 05:23:58 -05:00
AUTOMATIC1111
fecb990deb
Merge pull request #7309 from brkirch/fix-embeddings
...
Fix embeddings, upscalers, and refactor `--upcast-sampling`
2023-01-28 18:44:36 +03:00
brkirch
f9edd578e9
Remove MPS fix no longer needed for PyTorch
...
The torch.narrow fix was required for nightly PyTorch builds for a while to prevent a hard crash, but newer nightly builds don't have this issue.
2023-01-28 04:16:27 -05:00
brkirch
ada17dbd7c
Refactor conditional casting, fix upscalers
2023-01-28 04:16:25 -05:00
AUTOMATIC
9beb794e0b
clarify the option to disable NaN check.
2023-01-27 13:08:00 +03:00
AUTOMATIC
d2ac95fa7b
remove the need to place configs near models
2023-01-27 11:28:12 +03:00
brkirch
e3b53fd295
Add UI setting for upcasting attention to float32
...
Adds "Upcast cross attention layer to float32" option in Stable Diffusion settings. This allows for generating images using SD 2.1 models without --no-half or xFormers.
In order to make the upcast cross attention layer optimizations possible, it is necessary to indent several sections of code in sd_hijack_optimizations.py so that a context manager can be used to disable autocast. Also, even though Stable Diffusion (and Diffusers) only upcast q and k, unfortunately my findings were that most of the cross attention layer optimizations could not function unless v was upcast as well.
2023-01-25 01:13:04 -05:00
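In code terms the option amounts to doing the attention math in float32. A simplified sketch; upstream the change is spread across the optimized attention paths in sd_hijack_optimizations.py:

```python
import torch

def upcast_attention(q, k, v):
    # run the attention math in float32 (upstream also disables
    # autocast around this block so results aren't silently cast back
    # down); per the findings above, v is upcast alongside q and k
    dtype = q.dtype
    q, k, v = q.float(), k.float(), v.float()
    sim = (q @ k.transpose(-2, -1)) * (q.shape[-1] ** -0.5)
    return (sim.softmax(dim=-1) @ v).to(dtype)
```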
brkirch
84d9ce30cb
Add option for float32 sampling with float16 UNet
...
This also handles type casting so that ROCm and MPS torch devices work correctly without --no-half. One cast is required for deepbooru in deepbooru_model.py, and some explicit casting is required for img2img and inpainting. depth_model can't be converted to float16 or it won't work correctly on some systems (it's known to have issues on MPS), so in sd_models.py model.depth_model is removed for the model.half() call.
2023-01-25 01:13:02 -05:00
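The shape of the change: samplers keep working in float32, and values are cast to the UNet's dtype only at the model boundary, then back up on the way out. A schematic sketch with illustrative names:

```python
import torch

def apply_model_upcast_sampling(unet: torch.nn.Module, x, timesteps, cond):
    # cast to the UNet's storage dtype (e.g. float16) at the boundary
    dtype = next(unet.parameters()).dtype
    out = unet(x.to(dtype), timesteps.to(dtype), cond.to(dtype))
    return out.float()  # back to float32 for the sampler
```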
AUTOMATIC1111
aa60fc6660
Merge pull request #6922 from brkirch/cumsum-fix
...
Improve cumsum fix for MPS
2023-01-19 13:18:34 +03:00
brkirch
a255dac4f8
Fix cumsum for MPS in newer torch
...
The prior fix assumed that testing int16 was enough to determine if a fix is needed, but a recent fix for cumsum has int16 working but not bool.
2023-01-17 20:54:18 -05:00
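The probe therefore has to test both dtypes and fall back per dtype. A sketch of the shape of the fix; the real code lives in modules/mac_specific.py and is applied through the CondFunc helper described above:

```python
import torch

def cumsum_fix(t: torch.Tensor, dim: int) -> torch.Tensor:
    # newer torch fixed int16 cumsum on MPS while bool stayed broken,
    # so unsupported dtypes are routed through int32 and cast back up
    if t.device.type == "mps" and t.dtype in (torch.bool, torch.int8, torch.int16):
        return t.to(torch.int32).cumsum(dim).to(torch.int64)
    return t.cumsum(dim)
```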
AUTOMATIC
c361b89026
disable the new NaN check for the CI
2023-01-17 11:05:01 +03:00
AUTOMATIC
9991967f40
Add a check and explanation for a tensor with all NaNs.
2023-01-16 22:59:46 +03:00
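A sketch of what the check looks like: if every element of an intermediate tensor is NaN, fail early with a message pointing at the likely precision cause (upstream also honors the disable option mentioned above; the exact message wording here is paraphrased):

```python
import torch

class NansException(Exception):
    pass

def test_for_nans(x: torch.Tensor, where: str):
    if not torch.isnan(x).all():
        return
    message = f"A tensor with all NaNs was produced in {where}."
    if where == "unet":
        message += (" This could be caused by running the model in half"
                    " precision; try the --no-half commandline argument.")
    raise NansException(message)
```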
brkirch
8111b5569d
Add support for PyTorch nightly and local builds
2023-01-05 20:54:52 -05:00
brkirch
16b4509fa6
Add numpy fix for MPS on PyTorch 1.12.1
...
When saving training results with torch.save(), an exception is thrown:
"RuntimeError: Can't call numpy() on Tensor that requires grad. Use tensor.detach().numpy() instead."
So for MPS, check if Tensor.requires_grad and detach() if necessary.
2022-12-17 04:22:58 -05:00
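The workaround in one line: detach before converting, but only when needed. A sketch of the replacement function; upstream it is wired onto Tensor.numpy through a torch hijack rather than called directly:

```python
import torch

def numpy_fix(orig_numpy, tensor: torch.Tensor):
    # tensors that require grad can't be converted directly; detach
    # first so torch.save() of training results works on MPS
    if tensor.requires_grad:
        tensor = tensor.detach()
    return orig_numpy(tensor)

# usage: numpy_fix(torch.Tensor.numpy, some_tensor)
```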
AUTOMATIC
b6e5edd746
add built-in extension system
...
add support for adding upscalers in extensions
move LDSR, ScuNET and SwinIR to built-in extensions
2022-12-03 18:06:33 +03:00
AUTOMATIC
46b0d230e7
add comment for #4407 and remove seemingly unnecessary cudnn.enabled
2022-12-03 16:01:23 +03:00
AUTOMATIC
2651267e3a
fix #4407 breaking the UI entirely for cards other than the ones related to the PR
2022-12-03 15:57:52 +03:00
AUTOMATIC1111
681c0003df
Merge pull request #4407 from yoinked-h/patch-1
...
Fix issue with 16xx cards
2022-12-03 10:30:34 +03:00
brkirch
0fddb4a1c0
Rework MPS randn fix, add randn_like fix
...
torch.manual_seed() already sets a CPU generator, so there is no reason to create a CPU generator manually. torch.randn_like also needs an MPS fix for k-diffusion, but a torch hijack for randn_like already exists, so it can also be used for that.
2022-11-30 10:33:42 -05:00
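A sketch of the reworked fix: rely on torch.manual_seed() for the CPU generator, draw the noise on the CPU, and move it to MPS; randn_like gets the same treatment (shown here as plain functions rather than the actual torch hijack):

```python
import torch

def randn_mps(seed: int, shape, device) -> torch.Tensor:
    # torch.manual_seed() already seeds the global CPU generator, so
    # no manually created CPU generator is needed
    torch.manual_seed(seed)
    return torch.randn(shape, device="cpu").to(device)

def randn_like_mps(x: torch.Tensor) -> torch.Tensor:
    # same treatment for randn_like, which k-diffusion relies on
    return torch.randn(x.shape, dtype=x.dtype, device="cpu").to(x.device)
```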
AUTOMATIC1111
cc90dcc933
Merge pull request #4918 from brkirch/pytorch-fixes
...
Fixes for PyTorch 1.12.1 when using MPS
2022-11-27 13:47:01 +03:00