parent 1e0ebe0378
commit b5022bb43e

@@ -1,165 +0,0 @@
##big titty anon's list of artists

>Characters:
Artgerm (girls, semi-real, sexy) !!!
Tom Bagshaw (girls waist up, dark)
Magali Villeneuve (fantasy, females, males)
Loish (girl head, anime/comic)
Mandy Jurgens (semi-real girls, head)
Michael Garmash (fairly real girls, broad strokes) !!!
Mark Brooks (comic superheroes)
Krenz Cushart (semi anime, environments)
Wenjun Lin (very colorful with environments)
Charlie Bowater (fantasy, attractive)
Bayard Wu (fantasy, everything)
Diego Dayer (realistic male and female)
Peter Mohrbacher (weird creatures but detailed)
Jeremy Lipkin (realistic, broad strokes)
Gregory Manchess (males, females, animals, realistic)
Karol Bak (girls, fantasy, fancy)
Livia Prima (girls with environments)
Aleksi Briclot (fantasy, male and female)
Mark Arian (flowers in db, still really strong) !
Donato Giancola (fantasy with environments)
Sophie Anderson (classic, kids)
Leyendecker (classic)
Norman Rockwell (classic)
Luis Royo (monochrome)
Gaston Bussiere (classic)
Frank Frazetta (comic style)
Gil Elvgren (classic sexy girls) pretty good
Simon Bisley (comic style)
Nicolas Poussin (classic)
Boris Vallejo (sexy, semi-real comic style, very orange, wasteland)
Kyoung Hwan Kim (girls, sexy)
Svetlin Velinov (humans and creatures, detailed)
Maciej Kuciara (2D and 3D girls)
Hajime Sorayama (mostly chrome-colored android girls, seems to lead to messed-up limbs)
Gerald Brom (gloomy)
Tsutomu Nihei (monochrome manga style)
Guweiz (semi-realistic girls, fantasy, mostly half body or face)
Eric Zener (realistic, swimming)
Clay Mann (comic, attractive)
Joe Jusko (comic)
Jesper Ejsing (fantasy, monsters and characters)
Huang Guangjian (fantasy, humans)
Rossdraws (anime, girls)
Joshua Middleton (comic, superheroes)
Cedric Peyravernay (Dishonored characters) !!!
Alex Horley (fantasy, Blizzard)
Clyde Caldwell (fantasy, comic)
Dave Rapoza (fantasy, comic, semi-real)
christophe_young (fantasy, girls)
Irakli Nadar (girls, only faces)

>Detailed Art:
Justin Gerard (fantasy, somewhat orange) !!!
Raymond Swanland (fantasy)
Greg Rutkowski (fantasy)
Marc Simonetti (fantasy)
James Gurney (fantasy, but fairly realistic art style)

>Environmental art:
Stephan Martiniere (flashy, sci-fi, detailed)
Craig Mullins (fairly realistic fantasy, detailed)
Jordan Grimmer (fantasy, semi-real)
Leon Tukker (sci-fi cityscapes, detailed)
Ismail Inceoglu (colorful sci-fi and fantasy) !!!
Dan Mumford (bright colors, fairly monochrome, comic)
Tyler Edlin (fairly detailed fantasy art)
Tomasz Alen Kopera (things made of moss and roots)
Les Edwards (comic, semi-realistic)
Caspar David Friedrich (classic)
Darek Zabrocki (fantasy, fairly real)
Anato Finnstark (mostly two colors)
Tuomas Korpi (fantasy)
Makoto Shinkai (anime)
Syd Mead (flat color, also futuristic cars)
John Constable (classic)
Ivan Shishkin (classic, trees)
Raphael Lacoste (detailed, fantasy)
Bruce Pennington (colorful, fantasy)
Michael Whelan (colorful, fantasy)
Akihiko Yoshida (Bravely Default, Vagrant Story)
Rhads (very colorful)
Anton Fadeev (Duelyst backgrounds)
Noah Bradley (fantasy, colorful)
James Paick (fantasy, sci-fi)
Gilles Beloeil (fairly realistic)

##big titty anon's notes

wlop in base 1.4 sucks ass; in WD he's pretty good, it seems, though as said the floor was raised dramatically.
Cedric Peyravernay's girls are somehow hot even when using only him, even in base 1.4, which I really didn't expect from a Dishonored artist.
Michael Garmash turns all faces into those of goddesses. Artgerm in base 1.4 is amazing; in WD he isn't really necessary.
Justin Gerard adds a really nice fantasy feel that none of the others do.
Ismail Inceoglu adds a sci-fi feel that needs to be coupled with another artist to tame it a bit.
Stephan Martiniere makes things like clothes really shiny, which is nice, and I think he's the only one who does that.

##anime vector anon's tierlist

God Tier
art by ken sugimori, art by kentaro miura, art by tony taka, art by yoji shinkawa, cowboy bebop, flcl, frame arms girl, yuru camp

SSS Tier
isekai, josei, manwha, shoujo, 80's anime, aho girl, asobi asobase, atelier meruru, atelier ryza, atelier totori, bible black, full metal alchemist, goblin slayer, harem, hellsing, phantasy star, xenoblade, yotsuba, youjo senki

High Tier
animage, clamp, monthly comic alive, otaku usa, seinen, shonen, studio pierrot, virtual youtuber, visual novel, 5-toubun no hanayome, atelier rorona, azur lane, bakemonogatari, bubblegum crisis, dagashi kashi, danmachi, date a live, granblue, granblue draph, granblue erune, gunbuster, kancolle, konohana kitan, love hina, maison ikkoku, misato katsuragi, princess principal, raphtalia, reimu hakurei, touhou, rukia, rune factory, sailor moon, sayonara zetsubou sensei, sewayaki kitsune no senko-san, spice and wolf, strike witches, tera online, tera online elin, trigun, urusei yatsura, utawarerumono, world's end harem, lalafell

Mid Tier
anime, art by kenichi sonoda, comic kairakuten, dengeki maoh, dengeki moeoh, doujin, doujinshi, eroge, gainax, hentai, kyoto animation, manga, mangazine, megami magazine, monthly shonen ace, sekai project, tankoubon, tsunako, type-moon, ufotable, vtuber, 90's anime, ai yori aoushi, asuna yuuki, bishoujo, boku wa tomodachi ga sukunai, cardcaptor, eromanga-sensei, gochuumon wa usagi desu ka, hinata hyuuga, idolmaster, kodomo no jikan, kurumi tokisaki, madoka magica, mahou sensei negima, kanna kamui, hayate no gotoku, hajimete no gal, chobits, clannad, asuka langley soryu, maid dragon, meru the succubus, mitsuboshi, mitsuboshi colors, mizugi kanojo, monster musume, nekopara, nisekoi, omega quintet, oreimo, pripara, ranma, rayearth, re zero, rin tohsaka, seitokai yakuindomo, shokugeki no soma, steins gate, tanto cuore card, tenchi muyo, the world god only knows, toaru majutsu no index, tsukiko tsutsukakushi, xenosaga, vanillaware, night elf, draenei

Low Tier
comic isekairakuten, monthly newtype, azumanga daioh, cardcaptor sakura, darkstalkers, ddlc, dirty pair, ecchi, getsuyoubi no tawawa, haruhi suzumiya, hyperdimension neptunia, kemono friends, k-on, konosuba, kos-mos, megumin, no game no life, non non biyori, orihime, prisma illya, rei ayanami, rosario to vampire, rozen maiden, ryuko matoi, saki mahjong anime, senran kagura, the melancholy of haruhi suzumiya, to love ru, uma musume, vocaloid, wagnaria, worgen, pandaren

Garbage
studio trigger, fairy tail, gunsmith cats, high school dxd, high school of the dead, magical girl lyrical nanoha, precure, prison school, project a-ko, shinmai maou no testament, street fighter, symphogear, tanto cuore deck, toradora, yuru yuri, zero no tsukaima, martian successor nadesico, misty pokemon, monster girl encylopedia, pokemon girl, pokegirl, bosshi, art by bosshi, satoshi urushihara, art by satoshi urushihara, asanagi, art by asanagi, happoubi jin, art by happoubi jin, dofus, wakfu, maple story, aura kingdom

Gundam Tier
monthly gundam ace

Special Tier (very strong influence, varied/poor quality)
battle angel alita, dance in the vampire bund, full metal panic, gall force, girls und panzer, how not to summon a demon lord, kill la kill, little witch academia, made in abyss, sword art online

Special Tier 2 (sometimes decent vectors, but not quite understood by the AI)
devil hunter yohko, disgaea, illyasviel von einzbern, jahy-sama, lucky star, ryoko hakubi, shinobu oshino, slayers, soul eater, succubus, taimanin asagi, fran final fantasy, vulpera, tauren

##anime vector anon's notes on waifu diffusion

1: As you might expect from its name, WD works wonderfully for generating anime girls; however, it capably handles other things as well, such as scenery.
2: Vectors and prompts that worked in SD will typically work in WD, with the exception of photos or realism.
3: Because photos do not work well with WD, real-world anime merchandise such as nendos and figmas is better done in SD.
4: Since WD outputs are already going to be illustrations, there is no need for vectors such as 'illustration' or 'digital painting'; these vectors add very little to an overall prompt, or even detract from it.
5: Personally recommended quality-control vectors are variations of 'artstation' (such as 'artstation girl' or 'trending on artstation') and 'highly detailed'. These can safely be added to most prompts.
6: Unsurprisingly, 'face' and 'hand' seem to be the best vectors for faces and hands. For full body, 'artstation female body' works well; 'male body artstation' if you're some kinda prancing lala homo man. For a focus on specific body parts, just ask for that body part.
7: If you ask for booba, milk truck arrive; even a vector like 'flat chest' is going to give your girl huge badonkers. If you want the itty bitty titty committee, go for vectors like 'loli', or invoke the names of flat-chested characters.
8: Vectors like 'curvaceous' are often redundant, since WD will default to shapely ladies.
9: As in SD, furries, monster girls, elves and more can be made in WD (with some luck and patience).
10: If SD doesn't know a vector, WD doesn't know it either. Thus, for the time being, we are still stuck without a lot of wonderful artists like Satoshi Urushihara and Happoubi Jin.

##anime vector anon's notes, part 2

1: Characters can mostly be recreated with enough effort, time, and luck. Vaguely describe the character (their colors, clothes, age, body type, etc.) and then potentially add one or two artists whose style you find agreeable. For more popular or recognizable characters, you could also add their name or the series they're from.
2: Vectors can be overlapped to some interesting ends. For example, if you throw in a dozen different animals, you're not going to get an image with all of those animals in it; you're going to get a fantastical creature with elements from several of them at the same time. You may have noticed this already by trying to create things like catgirls, but the uses extend further than this. Clothes, for example: if you give the AI multiple outfits to work with, it's going to combine them in unexpected ways to give you uniquely dressed characters.
3: Some vectors have unexpected use cases. For example, I'd originally listed Idolmaster in mid tier; while my opinion on its overall quality is unchanged, it's a worthy vector if you're aiming for a group shot with multiple characters in it.
4: This may require further testing, but the advent of WD and similar anime models seems to have boosted the effectiveness and/or quality of lower-tier vectors. Preliminary results have been promising.
5: If you have access to negative prompts, USE THEM. My generic list for making anime booba:
bad anatomy, disfigured, deformed, malformed, mutant, monstrous, ugly, gross, disgusting, man, male, boy, old, blurry, fat, obese, out of frame, poorly drawn, extra limbs, extra fingers, missing limbs

##specific prompts from anime vector anon

This is just a selection of random prompts that I've particularly enjoyed, or that saw a positive response when previously shared. Feel free to play around with them, experiment, and share your results.

WD:
taimanin asagi, ninja, bodysuit, fishnet, artstation female body, highly detailed, art by kentaro yabuki
art by kentaro yabuki, wlop, artstation female body, highly detailed, little girl, skinny, red hair and eyes, midriff, black leather, elbow gloves, thighhighs, choker, black miniskirt
harem, group of girls, trending on artstation, highly detailed, akihiko yoshida, mark arian, sexy, bikini
lalafell, artstation female body, highly detailed, swimsuit, Alphonse Mucha
michael garmash, WLOP, chibi, curvaceous, huge breasts, artstation female body, highly detailed, thick, granblue erune, tera online elin, betty boop
art by ken sugimori, art by yoji shinkawa, artstation female body, highly detailed, furry, minotaur, huge breasts
art by ken sugimori, lois van baarle, ross tran, trending on artstation, highly detailed, girl, furry, anthropomorphic animal girl
loli, midriff, trending on artstation, highly detailed

SD:
fantasy, intricate, detailed, realistic, photo, plush, doll, toy, cute, Japanese, loli
intricate, detailed, realistic, photo, cute, girl, Japanese, nendoroid / anime figure
intricate, detailed, realistic, photo, sexy, fur coat, girl
lovecraft, pixar, realistic, intricate, detailed, scenery, landscape
realistic, intricate, detailed, concept art, fantasy, cat, dog, bunny, rabbit, lizard, snake, giraffe, elephant, bird, crocodile, hippopotamus, gorilla, grizzly bear, magically fused into a chimera
studio pierrot, intricate, realistic, detailed, digital painting, artstation, concept art, sharp focus, illustration, Victorian, dandy gentleman, wizard, muscular man, art by kentaro miura
fantasy, intricate, cinematic, digital painting, artstation, concept art, sharp focus, illustration, Artgerm, Greg Rutkowski, Alphonse Mucha, mecha panther woman
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.

@@ -0,0 +1 @@
f19e2f6dd76c5f88f4902213d58c0c5faec896ad
@@ -0,0 +1,206 @@
# Larger resolutions with Stable Diffusion

## Problem

Stable Diffusion 1.4 was trained on 512×512 images. That means generating pictures at any other dimensions is going to mess up your result. When straying too far from 512×512 and a 1:1 aspect ratio, you'll get twin heads on characters, long necks, broken composition, tiled repetition, and plenty of unwanted results in general. Here's a comparison.

| 512×512, *John Berkey Sci-Fi* | 1024×1024, *John Berkey Sci-Fi* (resized back to 512) |
| :---: | :---: |
| ![](https://i.imgur.com/vdkgfuM.jpg) | ![](https://i.imgur.com/aqmdGBA.jpg) |

| 512×512, *photo of a man in the park* | 320×896, *photo of a man in the park* (resized back to 512) |
| :---: | :---: |
| ![](https://i.imgur.com/2XUHUkc.jpg) | ![](https://i.imgur.com/TVkFCWK.jpg) |

Any picture with a clearly defined subject is going to end up like this. Some pictures, like landscapes, backgrounds, and other similar scenes, will actually benefit from repetition, to an extent.

![](https://i.imgur.com/6Guw71J.jpg)

However, even those will get garbled once you take it too far.

So, how do you make pictures at larger resolutions?

## SD Upscale / GoBIG

If you are using [this](https://github.com/AUTOMATIC1111/stable-diffusion-webui) Web UI, you have a feature called SD upscale (on the img2img tab). It's probably available in other wrappers for Stable Diffusion as well, but I will focus on this one. It upscales your picture 2x (512×512 becomes 1024×1024), using SD itself to invent more details, and it can be repeated to reach even larger resolutions. It doesn't take more memory, just proportionally more time. It can yield arbitrarily detailed pictures from a mere 512×512, and these are not fake but "real" details. It can even fix some faces and hands, since they tend to be drawn better at larger sizes.

The algorithm works like this (a minimal code sketch follows the list):

1. Upscale your image 2x by normal means.
2. Divide the 2x image into a bunch of tiles, with some overlap.
3. Run img2img on every tile, with respect to your prompt and settings.
4. Combine the tiles, blending the overlap to even out the seams.
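
To make the four steps concrete, here's a minimal sketch of the tiling loop in Python. The `img2img` callable is a stand-in for whatever img2img entry point your wrapper exposes (an assumption, not the webui's actual API), and seam blending is simplified to a plain paste:

```python
# Minimal sketch of SD upscale's tiling loop. `img2img` is a placeholder
# for your wrapper's img2img call; everything else is plain PIL.
from PIL import Image

def sd_upscale(img, img2img, prompt, tile=512, overlap=64, denoise=0.3, seed=1234):
    # Step 1: prescale 2x by normal means (swap in an ESRGAN model here).
    big = img.resize((img.width * 2, img.height * 2), Image.LANCZOS)
    out = big.copy()
    step = tile - overlap
    for y in range(0, big.height - overlap, step):
        for x in range(0, big.width - overlap, step):
            # Step 2: cut a tile, clamped so it never leaves the image.
            bx, by = min(x, big.width - tile), min(y, big.height - tile)
            crop = big.crop((bx, by, bx + tile, by + tile))
            # Step 3: img2img on the tile with the same prompt and settings.
            crop = img2img(crop, prompt, denoise, seed)
            # Step 4: paste back; a real implementation feathers the overlap
            # to even out the seams instead of hard-pasting.
            out.paste(crop, (bx, by))
    return out
```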

![](https://i.imgur.com/CdadwyC.jpg)

The settings page for it looks like this:

![](https://i.imgur.com/EyUvrHD.jpg)

#### Optimal settings

Tile size is best kept the same as the original size, because at different dimensions img2img will generate a completely different picture and your result will drift from the original. If you don't care about that, you can make tiles larger or smaller to fit the content better. Read the Limitations section below and think about how the tiles will be laid out with respect to the underlying content.

Keep in mind that if you give the tiles too little overlap, the result may differ considerably between tiles. If you give too much overlap, you'll waste performance and may get double seams in extreme cases.

The prompt can either be the same, describing the entire image (but also see the Limitations section), or just your styling vectors (or something generic, if your content is too diverse across the tiles), or something entirely different if you want to get creative with adding details.

If you don't want the result deviating too much from the original, keep the seed the same.

The main setting is Denoising strength. It works the same as in img2img, since SD upscale just runs img2img on each tile: the higher the denoising, the closer the result is to the prompt and CFG; the lower, the closer it is to the input picture. So:

- more denoising -> more details induced by the prompt and settings, but prone to unexpected hallucinations, visible seams, and differences between tiles
- less denoising -> fewer details, but safer

Usually, denoising above 0.45 gives undesired effects (though it depends on the picture).

CFG scale works exactly like it does in img2img, again because SD upscale is just tiled img2img.

If you want the picture to deviate as little as possible from the original (just adding details), keep all settings except denoising the same, including the tile size (same as the picture size), prompt, seed, etc. If you still want to set a different tile size, try playing with the seed resize feature, though I was unable to make it work reliably:

- tick Extra
- set the seed to the same value as the original
- set the little W and H sliders to the size of the original

#### Prescaler

The first step of the algorithm (prescaling) is crucial. Here's a comparison of two prescaling algorithms used in the SD upscale process.

| Lanczos | ESRGAN Remacri |
| :---: | :---: |
| ![](https://i.imgur.com/FSNf2kl.png) | ![](https://i.imgur.com/aRCgl7G.png) |

Why does this happen? Lanczos is purely algorithmic; ESRGAN Remacri is a neural upscaler tuned for crispness and detail preservation. While neither of them is even remotely close to SD upscale, Remacri keeps more detail for SD upscale to latch onto when hallucinating new details.

Two custom finetuned ESRGAN models were found to work particularly well with SD upscale: [remacri](https://drive.google.com/file/d/14pUxWLlOnzjZKOCsNguyNHchU6_581fc) (works better for backgrounds, though it tends to amplify brush strokes somewhat) and [lollypop](https://drive.google.com/file/d/1v-t2Op85wkME2Gnutiutp1Mqb1nkSM8q) (works better for more or less realistic people). You can experiment with other specialized ESRGAN models listed in the [Upscale Wiki](https://upscale.wiki/wiki/Model_Database).

Note: Real-ESRGAN is **not** ESRGAN. The naming is confusing, but Real-ESRGAN is a newer, different model which doesn't seem to have finetuned variants. Don't use it; it's better in theory but is shit for this particular purpose.

For the AUTOMATIC1111 wrapper we are talking about, drop your ESRGAN models into the ESRGAN folder; they will then be available in SD upscale.

#### Limitations

SD upscale has a considerable limitation: the prompt is the same for all tiles, and you can't lay the tiles out manually. The more tiles you need to cover the image, the worse the issue gets. Think small tiles on large resolutions.

![](https://i.imgur.com/4HNAA1A.jpg)

See the problem? Each tile covers very different content, yet they are all described by a single prompt. This can lead to SD upscale suddenly dreaming up a face in the grass, or inventing some other detail unrelated to that particular part of the picture.

![](https://i.imgur.com/DF2OrmV.jpg)

There are several workarounds for this:

- Use larger tiles, or just run the entire prescaled picture through img2img manually, as one piece, if you can fit it in your VRAM. Since img2img is guided by the underlying prescaled picture, larger tiles won't give repetition. You will inevitably deviate from your original picture, though, as your tile size will differ from the original size. Another person wrote a separate guide for this: https://rentry.org/b7vcb
- Don't set denoise too high on such images. More denoise = more chances for unexpected hallucinations.
- Only use styling vectors in your prompt, with no descriptions of the content, or even no prompt at all. The downside is that details won't be as relevant or as good, especially with subsequent upscaling.
- Use manual compositing in Krita or Photoshop. You can prescale anything, then manually detail it with img2img using any layout, prompt, and settings you want.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.

@@ -0,0 +1 @@
aec040fcca2ccaf3b7c2012edc8dcf1e6b796bca
Binary file not shown.

@@ -0,0 +1,14 @@
#!/bin/bash

wget "https://rentry.org/voldy/pdf" -O "VOLDY RETARD GUIDE.pdf"
wget "https://rentry.org/drfar/pdf" -O "Inpainting and Outpainting.pdf"
wget "https://rentry.org/textard/pdf" -O "RETARD'S GUIDE TO TEXTUAL INVERSION.pdf"
wget "https://rentry.org/anime_and_titties/pdf" -O "big titty anon's list of artists.pdf"
wget "https://rentry.org/informal-training-guide/pdf" -O "Informal Training Guide.pdf"
wget "https://rentry.org/865dy/pdf" -O "Getting Started on Paperspace.pdf"
wget "https://rentry.org/cputard/pdf" -O "CPU RETARD GUIDE.pdf"
wget "https://rentry.org/sd-nativeisekai/pdf" -O "Stable Diffusion Native Isekai.pdf"
wget "https://rentry.org/ayymd-stable-diffustion-v1_4-guide/pdf" -O "AyyMD Stable Diffuse v1.4 for Wangblows 10.pdf"
wget "https://rentry.org/sdamd/pdf" -O "Stable Diffusion AMD.pdf"
wget "https://rentry.org/yrpvv/pdf" -O "Standard Diffusion Models.pdf"
wget "https://rentry.org/sdupscale/pdf" -O "Larger resolutions with Stable Diffusion.pdf"
@@ -1,142 +0,0 @@
#->`-- Inpainting & Outpainting --`<-#

[TOC]

Are your generated images disappointing? Is the AI letting you down? Well, fear not! There are tools you can use to work with the AI to get the image you want.

This guide uses negative prompts, which only the [AUTOMATIC1111/Voldemort](https://github.com/AUTOMATIC1111/stable-diffusion-webui) repo supports.

## Inpainting

**What's inpainting?**
Inpainting is a way to "fill in" parts of an image. In the context of Stable Diffusion, that means making the AI regenerate part of the image.

**How do I get to the inpainting section of the WebUI?**
At the top of the WebUI, click the `img2img` tab.

**That's cool, but what do I use it for?**
It has two primary uses:
1. Fixing wonky parts of the image.
2. Working with the AI to modify an existing image.

### Let's do it!

**Prompt**
We're going to need a special prompt for inpainting, one that focuses on the elements of the image you want to change. I have a few prompts I copy and paste for certain parts of the image.

| | |
|-|-|
|**Faces**|<describe the face here> medium shot, extremely detailed, intricate, ((clear focus)), ((sharp focus)), perfect face, very deep eyes|
|**Anime Faces**|<describe the face here> medium shot, anime, extremely detailed, intricate, ((clear_focus)), ((sharp_focus)), perfect_face, very deep eyes, ((round pupils)), big anime eyes|
|**Hands**|perfect hands, realistic hands, extremely detailed hands, individual fingers, intricate fingers, 8k hands|

You can put whatever works best in the negative prompt.

**Change these settings** (a code sketch mapping them onto a programmatic pipeline follows the list)

Below the prompt box, click the second radio button: `Inpaint a part of image`.

`Sampling Steps`: set it to whatever is getting you good results.

`Masked content`: set to `original`. This setting controls what the AI does with the masked content. Basically, `fill` makes the AI erase and regenerate it, while `original` tells it to base its generation on the original content.

`Inpaint at full resolution`: check this box. This setting makes the AI upscale the masked region to your target resolution, do the inpainting, then resize it and place it back in the image. This enables very intricate generation of small regions of the image.

`Batch count`: it will probably take a few attempts for the AI to create what you want, so why not speed the process up by generating multiple images in one go?

`CFG Scale`: how closely the AI adheres to your prompt. Set it to whatever has been working best for you.

`Denoising strength`: determines how little respect the algorithm has for the image's content. At 0, nothing will change; at 1, you'll get an unrelated image. Start at 0.3 and work your way up from there.

`Height` & `Width`: set to the dimensions of your image.
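
For reference, the same knobs exist outside the WebUI. A minimal sketch using Hugging Face `diffusers` rather than the WebUI's internals (the checkpoint name and exact argument set are assumptions; check your installed version):

```python
# Sketch: the WebUI's inpainting settings expressed as diffusers arguments.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("scene.png").convert("RGB")  # source image
mask = Image.open("mask.png").convert("L")      # white = regenerate, black = keep

results = pipe(
    prompt="landing gear, shadows",
    negative_prompt="deformed, blurry, bad anatomy",
    image=image,
    mask_image=mask,
    strength=0.65,            # ~ Denoising strength
    guidance_scale=7.5,       # ~ CFG Scale
    num_images_per_prompt=4,  # ~ Batch count
).images
results[0].save("inpainted.png")
```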

I'll walk you through the process. I'm going to use this image, generated with the waifu-diffusion model.

![inpainting1](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpainting1.png)

It's a complex scene with a few issues. Let's start by adding landing gear. I'm going to open the image in my external photo editor (I use GIMP), sketch out the landing gear, then upload it to the WebUI and mask it.

![inpainting2](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpainting2.png)

On the left is the landing gear I drew. On the right is the masked image, the white being the mask I painted on with the WebUI's masking tool.

As you can see, my drawing is really poor. But it doesn't have to be a quality drawing; it just has to give the AI an idea of what you want. Make sure the perspective and rough colors are correct. I was also very generous with the masking, to allow it to generate shadows. My prompt will be "landing gear, shadows", and I'm going to turn up the denoising strength to give the AI some freedom to do what it wants. Next, I generated a batch of five images. The results weren't very good because the AI was just mimicking what I had drawn, so I increased the denoising strength to 0.65 and ran it again.

**Pro-Tip:** if it generates something you like, set it to that seed and adjust the prompt.

![inpainting3](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpainting3.png)

Hey, that's pretty good! It didn't add any shadows, but I can try to fix that by masking where they should be and running it with the prompt "shadows".

Next, let's remove that flag thing in front of the girl. Back to GIMP to color it out. I'll use the AI to blend the details and remove any leftover traces of my erasure.

I'm not satisfied with her skirt; it looks a little chunky. Let's try to knock out both the skirt and the face at the same time. Normally you'd want to focus on one thing at a time, but I want to capture the movement between the hair and clothes.

My prompt will be `cute girl looking straight ahead, hair blowing in wind, skirt blowing in wind, medium shot, anime, extremely detailed, intricate, ((clear focus)), ((sharp focus)), perfect face, very deep eyes`

My negative prompt will be `deformed, blurry, bad anatomy, disfigured, poorly drawn face, mutation, mutated, extra_limb, ugly, poorly drawn hands`. I'm going to generate five samples at denoise 0.4, giving the AI some room to play with the hair. After the first set of five, I decided 0.4 didn't create enough blowing hair, so I bumped it up to 0.65.

**Input:**

![inpainting7](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpainting7.png)

**Output:**

![grid](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpaintingrid1.png)

There certainly is a lot of shared energy between the skirt and hair. This one is my favorite:

![inpainting5](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpainting5.png)

The skirt was originally a little transparent near the backwards `C`, so I ran it through again with `transparent` in the negatives. The AI understood what I meant and fixed it for me.

Now, let's fix her face. Be careful to mask only the face and hair, to preserve what has already been fixed. I'm going to run this at `0.3` denoise, since we've already done a lot of work in this area.

Same prompt, minus the skirt part: `cute girl looking up at sky, hair blowing in wind, medium shot, anime, extremely detailed, intricate, ((clear focus)), ((sharp focus)), perfect face, very deep eyes`

**Input:**

![inpainting6](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpainting6.png)

**Output:**

![inpaintingrid2](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpaintingrid2.png)

**I chose this as my final image:**

![inpainting8](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/inpainting8.png)

And there you have it! It's a very simple, repetitive process that lets you work closely with the AI to create the exact image you've got in your head.
### Upload a mask

Click the `Upload mask` button. The image dialog will split into two sections: the top for your source image and the bottom for the mask.

The mask is a black-and-white PNG file. White marks places to modify; black marks places to leave the same. You can flip this in the `Masking mode` section. You can also generate a mask programmatically; see the sketch below.

Here's an example of a mask.

![mask1.png](https://raw.githubusercontent.com/Engineer-of-Stuff/stable-diffusion-paperspace/main/Docs/Assets/mask1.png)
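
Since the mask is just a binary image, a few lines of PIL can build one instead of painting it by hand. A minimal sketch (the rectangle coordinates are placeholders):

```python
# Sketch: create a black-and-white mask the same size as the source image.
# White = regions to regenerate; black = regions to keep.
from PIL import Image, ImageDraw

src = Image.open("scene.png")
mask = Image.new("L", src.size, 0)              # all black: keep everything
draw = ImageDraw.Draw(mask)
draw.rectangle((100, 300, 260, 420), fill=255)  # placeholder region to modify
mask.save("mask.png")
```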

## Outpainting

**What is outpainting?**
Outpainting allows you to extend the original image and create large-scale images in any aspect ratio. Outpainting takes the image's existing visual elements (shadows, reflections, and textures) into account to maintain the context of the original image.

I never got outpainting to work right. If you're trying to extend your image, I'd recommend inpainting creatively like this instead (a code sketch of the setup follows the list):

1. Extend your canvas in an image editor.
2. Draw what should be there.
3. Run img2img over the new area, with some overlap onto the original.
4. Manually combine the new and old areas to hide the transition.
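
Here's a minimal PIL sketch of the setup for steps 1 and 3: extending the canvas and masking the new strip plus some overlap into the original (all sizes are placeholders):

```python
# Sketch: extend the canvas 256px to the right and mask the new strip
# plus 64px of overlap, ready for img2img/inpainting over the new area.
from PIL import Image, ImageDraw

src = Image.open("scene.png")
extend, overlap = 256, 64

canvas = Image.new("RGB", (src.width + extend, src.height), "gray")
canvas.paste(src, (0, 0))    # original content on the left
canvas.save("extended.png")  # step 2: rough in the new area in an editor

mask = Image.new("L", canvas.size, 0)
ImageDraw.Draw(mask).rectangle(
    (src.width - overlap, 0, canvas.width, src.height), fill=255
)
mask.save("outpaint_mask.png")
```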

If you're getting tired of switching between GIMP and the WebUI, try Krita with an inpainting plugin (though it doesn't support negative prompts). [You'll need this.](https://www.flyingdog.de/sd/en/)

[The images in this guide are hosted on GitHub.](https://github.com/Engineer-of-Stuff/stable-diffusion-paperspace)
Docs/Models.md

@@ -1,119 +0,0 @@
#->`Standard Diffusion Models`<-#

[TOC]

# Quick & Easy Torrent Downloading

```bash
apt update
apt install -y aria2
aria2c "<magnet URL here>"
```

# Models

### Standard Model

**Torrent**

```bash
magnet:?xt=urn:btih:3A4A612D75ED088EA542ACAC52F9F45987488D1C&tr=udp://tracker.opentrackr.org:1337
```

**Web Download**

Voldy provided an alternative download if you don't want to use HuggingFace:

`https://drive.google.com/file/d/1wHFgl0ivCmIZv88hVZXkb8oy9qCuaBGA/view`

HuggingFace is much faster and more reliable, but you need to get access to the repo and provide your user token:

```bash
wget --header="Authorization: Bearer {user_token}" https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt -O sd-v1-4.ckpt
```

### Waifu Diffusion

**Torrent**

```bash
magnet:?xt=urn:btih:F45CECF4E9DE86DA83A78DD2CCCD7F27D5557A52&tr=udp://nyquist.localghost.org:6969
```

**Web Download**

Very slow:

```bash
wget https://thisanimedoesnotexist.ai/downloads/wd-v1-2-full-ema.ckpt
```

**Half-Size Model**

Smaller file size, with only a minor shift in output versus the full EMA model. If you're on the free tier, try this first; it's 3.5GB.

```bash
magnet:?xt=urn:btih:153590FD7E93EE11D8DB951451056C362E3A9150&dn=wd-v1-2-full-ema-pruned.ckpt&tr=udp://tracker.opentrackr.org:1337
```

### WD v1.2 and SD v1.4 Merged

```bash
magnet:?xt=urn:btih:UFIV4BI4MGWFLZSKPFQ5VFLNYL24ADUQ&dn=wd1-2_sd1-4_merged.ckpt&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
```

### trinart_stable_diffusion_v2

Another anime finetune. Pixiv-esque illustrations, not as cohesive as Waifu Diffusion.

The 60000-step version is the original; the 115000-step version is the 60000 one with additional training. Use the 60000-step version if the style nudging is too strong.

**115000**

```bash
wget https://huggingface.co/naclbit/trinart_stable_diffusion_v2/resolve/main/trinart2_step115000.ckpt -O trinart2_step115000.ckpt
```

**60000**

```bash
wget https://huggingface.co/naclbit/trinart_stable_diffusion_v2/resolve/main/trinart2_step60000.ckpt -O trinart2_step60000.ckpt
```

### Danbooru potato epoch0

A lewd Danbooru model that outputs ehh results because it needs more training.

```bash
magnet:?xt=urn:btih:f6976fbe3b9f93469bb62eb0c4950643b09f1f83&dn=Lewd-diffusion-pruned.ckpt&tr=udp%3a%2f%2ftracker.opentrackr.org%3a1337%2fannounce&tr=http%3a%2f%2ftracker.nucozer-tracker.ml%3a2710%2fannounce&tr=udp%3a%2f%2ftracker.openbittorrent.com%3a6969%2fannounce&tr=udp%3a%2f%2ftracker.moeking.me%3a6969%2fannounce&tr=http%3a%2f%2ftracker.openbittorrent.com%3a80%2fannounce&tr=udp%3a%2f%2ftracker.dler.org%3a6969%2fannounce&tr=http%3a%2f%2ftracker1.bt.moack.co.kr%3a80%2fannounce&tr=udp%3a%2f%2fexplodie.org%3a6969%2fannounce&tr=udp%3a%2f%2fopen.stealth.si%3a80%2fannounce&tr=udp%3a%2f%2f9.rarbg.com%3a2810%2fannounce&tr=udp%3a%2f%2fexodus.desync.com%3a6969%2fannounce&tr=udp%3a%2f%2fbt.oiyo.tk%3a6969%2fannounce&tr=udp%3a%2f%2fopen.demonii.com%3a1337%2fannounce&tr=https%3a%2f%2ftracker.lilithraws.org%3a443%2fannounce&tr=http%3a%2f%2ftracker3.ctix.cn%3a8080%2fannounce&tr=udp%3a%2f%2fchouchou.top%3a8080%2fannounce&tr=https%3a%2f%2fopentracker.i2p.rocks%3a443%2fannounce&tr=https%3a%2f%2ftracker.nanoha.org%3a443%2fannounce&tr=udp%3a%2f%2ftracker.torrent.eu.org%3a451%2fannounce&tr=https%3a%2f%2ftracker1.520.jp%3a443%2fannounce
```

# Upscalers

### Lollypop

```bash
magnet:?xt=urn:btih:JDZHD4PQVPHJU35C32GRJMSLYSR7YRHS&dn=lollypop.pth&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
```

### Remacri Upscaler

```bash
magnet:?xt=urn:btih:TNSKQM7JWWWOTIVEYS4GPSG2L2HFKSI7&dn=4x_foolhardy_Remacri.pth&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
```

[Github mirror](https://github.com/Engineer-of-Stuff/stable-diffusion-paperspace/blob/main/Docs/Datasets.md)
@@ -1,52 +0,0 @@
#->`--RETARD'S GUIDE TO TEXTUAL INVERSION--`<-#
->*Quitters never win, and winners never quit.*<-

Textual inversion lets you train datasets of specific styles or things, which are then tied to a specific word. It does this without affecting how the model file works as a whole, allowing you to inject keyword shortcuts. If that doesn't excite you, let me put it another way:
this lets you tie a keyword to specific people, things, places, or art styles in Stable Diffusion that you otherwise would be unable to reference directly. It's a game changer.

==**If you'd prefer to run the trainer locally:**==
You can do so using [**this repo**](https://github.com/huggingface/diffusers/tree/main/examples/textual_inversion).
However, this guide is for training on Colab, so you may have to figure some things out yourself.
(Also, local textual inversion requires at least 12GB of VRAM.)

!!!note In the beginning...
**NOTE: You will need Stable Diffusion installed before proceeding. Follow the recommended setup guide [**HERE**](https://rentry.org/voldy).**

**STEP 1:** In your file explorer, navigate to your root stable diffusion directory (`/stable-diffusion-webui`).
Create a new folder in this location called `embeddings` (if there isn't one already).

**STEP 2:** Visit the [Stable Diffusion Concept Library](https://huggingface.co/sd-concepts-library) and pick any model.
After selecting the one you wish to install, open a git bash in your `/embeddings` folder and type `git clone` followed by the model URL,
e.g. `git clone https://huggingface.co/sd-concepts-library/bonzi-monkey`

**STEP 3:** Enter the directory for the model you just downloaded.
Open `token-identifier.txt` and copy the contents, ignoring the `< >` (you need to enter them when you put the token into your prompt, but they're irrelevant to the file name).

**STEP 4:** Right-click on `learned_embeds.bin`, click rename, paste what you copied out of `token-identifier.txt`, and hit enter.

**STEP 5:** Move this newly renamed `.bin` file to the `/embeddings` directory. Feel free to delete the directory you cloned once you have moved the `.bin` file out of it. (Steps 2-5 are condensed into a code sketch after this section.)

**Step 6:** Launch Voldy's SD and make sure your newly added model works.
Pay attention to the name of the `.bin`; that's the keyword which tells Stable Diffusion to use it.
Use the name within `< >` brackets alongside normal prompts.
Example prompt: `<bonzi-monkey>, photorealistic, oil painting, vintage`
==Enjoy!==
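
For reference, steps 2 through 5 condense into a few lines of Python, using the bonzi-monkey example from above (the paths and concept name are just the example's; adjust them to yours):

```python
# Sketch of steps 2-5: clone a concept, name its embedding after the token,
# and drop it into the webui's embeddings folder.
import shutil, subprocess
from pathlib import Path

embeddings = Path("/stable-diffusion-webui/embeddings")
subprocess.run(
    ["git", "clone", "https://huggingface.co/sd-concepts-library/bonzi-monkey"],
    cwd=embeddings, check=True,
)

concept = embeddings / "bonzi-monkey"
token = (concept / "token-identifier.txt").read_text().strip().strip("<>")
shutil.move(str(concept / "learned_embeds.bin"), str(embeddings / f"{token}.bin"))
shutil.rmtree(concept)  # the cloned folder is no longer needed
```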
!!!note But I want to make my own!

**Step 1:** Once you've figured out what you want to integrate, pick 3-5 photos of your selection and crop them down into square images, preferably **512x512**.

**Step 2:** Upload your photos to a publicly accessible host.
[Imgur](https://imgur.com) allows hidden photos with a publicly accessible link.

**Step 3:** Go to the [Textual Inversion Colab](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_textual_inversion_training.ipynb#scrollTo=Yl3r7A_3ASxm) and follow the step-by-step instructions to run the model training, clicking the play button on the left at each new portion.
For your image files, make sure whatever you're inputting is the **direct link** ending in `.jpg`, `.png`, or something of the like.
!!! info Make sure you have a [Huggingface](https://huggingface.co/join) account. The Colab will give you a link to generate your token code; when you follow it, generate a **write token**.

**Step 4:** Once the training has fully run, click the folder on the left side of the screen.
Click `sd-concept-output`, click the three dots to the right of `learned_embeds.bin`, and click download.
Once the file has downloaded, rename it to whatever you provided as your placeholder token.

**Step 5:** Move this newly renamed `.bin` file to the `/embeddings` directory.

**Step 6:** Launch Voldy's SD and make sure your newly added model works.
==Have fun!==