AI Painting: webui



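Text-to-image applications have been extremely popular over the past six months. They differ from earlier vision tasks such as classification and detection: it is interesting that a machine can recognize what is in an image, but a child of about three can usually give the right answer to those tasks in a very short time. Text-to-image takes on something more complex, work that normally requires a professionally trained person and a fair amount of time. And because tools such as webui are open source, many more non-specialists can experience first-hand how striking text-to-image generation is.

I started trying out webui around the Spring Festival. I have been busy since then and put updates on hold, so this post only shows some images generated with webui. The next post will cover the technical details of stable diffusion!

This post uses webui to generate the images. There are already plenty of installation tutorials online, so I won't repeat them here.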

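Model download

First, you can download the models you need from https://civitai.com/. Unfortunately, the site started blocking IP addresses in mainland China a few days ago. Fortunately, there are sites that can partly replace it, for example https://www.4b3.com/ai-drawing-model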

text2image

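When generating, you can combine several groups of keywords with several models. Each model can be given a weight to set its share of the mix, which produces a blended style.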
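One way to picture the weighted mix, assuming the extra models are LoRA adapters (as the parameter lines in the examples that follow suggest): each LoRA contributes a low-rank update to a base weight matrix, scaled by its weight. The sketch below is only an illustration of that idea on one tiny matrix. The names `matmul` and `apply_loras` are made up here, the alpha/rank scaling that real implementations apply is left out, and a real model repeats this for many layers.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_loras(base, loras):
    """Return base + sum(weight * up @ down) over all (weight, up, down) entries.

    base  : 2D list, the original weight matrix
    loras : list of (weight, up, down); up @ down must have the shape of base
    """
    merged = [row[:] for row in base]  # copy so the base matrix is untouched
    for weight, up, down in loras:
        delta = matmul(up, down)       # low-rank update of this LoRA
        for i, row in enumerate(delta):
            for j, d in enumerate(row):
                merged[i][j] += weight * d
    return merged

# Two hypothetical rank-1 LoRAs mixed at weights 0.5 and 0.25.
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = (0.5, [[1.0], [0.0]], [[2.0, 0.0]])
lora_b = (0.25, [[0.0], [1.0]], [[0.0, 4.0]])
merged = apply_loras(base, [lora_a, lora_b])
```

Raising one weight and lowering another shifts the result toward that model's style, which is the effect the weight sliders in webui are meant to give.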

Model: the moxing model on Civitai

prompt: ((4k,masterpiece,best quality)), shuimobysim, traditional chinese ink painting, lotus, hanfu, maxiskit, dress conservatively
1 girl, solo, white hair, long hair, fox ears, white, bikini, fish, many fish near girl, look at viewer, tease
Negative prompt: (watermark),sketch, duplicate, ugly, huge eyes, text, logo, monochrome, worst face, (bad and mutated hands:1.3), (worst quality:2.0), (low quality:2.0), (blurry:2.0), horror, geometry, (bad hands), (missing fingers), multiple limbs, bad anatomy, (interlocked fingers:1.2), Ugly Fingers, (extra digit and hands and fingers and legs and arms:1.4), crown braid, ((2girl)), (deformed fingers:1.2), (long fingers:1.2),(bad-artist-anime),extra fingers,fewer fingers,hands up,bad hands, bad feet,shoes, stone, ((bad toe))
Steps: 30, Sampler: DPM++ SDE Karras, CFG scale: 3, Seed: 2171486205, Size: 384x896, Model hash: 56be194f47, Model: cetusMix_cetusVersion2Fp16, Denoising strength: 0.5, Clip skip: 2, ENSD: 31337, Hires upscale: 2, Hires upscaler: Nearest, AddNet Enabled: True, AddNet Module 1: LoRA, AddNet Model 1: Moxin_Shukezouma11(494301de3d6e), AddNet Weight A 1: 0.7, AddNet Weight B 1: 0.7, AddNet Module 2: LoRA, AddNet Model 2: Moxin_10(17cd20c7b6ea), AddNet Weight A 2: 0.3, AddNet Weight B 2: 0.3, AddNet Module 3: LoRA, AddNet Model 3: chilloutmixss_xss10(9d82c7787e79), AddNet Weight A 3: 0.5, AddNet Weight B 3: 0.5, AddNet Module 4: LoRA, AddNet Model 4: firekeeperLoraFrom_fierkeeper16(acd58eb24484), AddNet Weight A 4: 0.8, AddNet Weight B 4: 0.8
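The last line of the block above is the parameter line that webui records with each image. As a rough sketch, it can be split into key/value pairs and mapped onto a request body for webui's HTTP API. The helper names `parse_params` and `to_txt2img_payload` are made up for this example. The payload field names are the ones I understand `/sdapi/v1/txt2img` to accept when webui is started with `--api`, so treat them as an assumption and check them against your installed version. The regex is a simplification and does not handle quoted values that contain commas.

```python
import re

# Simplified pattern for "Key: value" pairs separated by commas.
PARAM_RE = re.compile(r"\s*([\w ]+?):\s*([^,]+?)\s*(?:,|$)")

def parse_params(line):
    """Split a webui parameter line into a {key: value} dict of strings."""
    return {key: value for key, value in PARAM_RE.findall(line)}

def to_txt2img_payload(prompt, negative_prompt, params):
    """Build a request body from parsed parameters (field names are assumptions)."""
    width, height = (int(n) for n in params["Size"].split("x"))
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": int(params["Steps"]),
        "sampler_name": params["Sampler"],
        "cfg_scale": float(params["CFG scale"]),
        "seed": int(params["Seed"]),
        "width": width,
        "height": height,
    }

# A shortened version of the parameter line shown above.
line = ("Steps: 30, Sampler: DPM++ SDE Karras, CFG scale: 3, "
        "Seed: 2171486205, Size: 384x896, Clip skip: 2")
params = parse_params(line)
payload = to_txt2img_payload("lotus, hanfu", "watermark", params)
```

The resulting dict could then be sent as JSON in a POST request to a locally running webui, which is handy for regenerating an image from a saved parameter line.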

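The images below use two LoRA models. One is KoreanDoll, which was quite popular a while ago and is used for the face and body. The other is a model of a League of Legends character and is used for the style. Mixing the two gives an effect similar to cosplay.

Model: the samira model on Civitai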

prompt: (8k, RAW photo:1.2),best quality, ultra high res, sky, field, grass, samira (league of legends), league of legends, 1girl, jewelry, tattoo, eyepatch, earrings, green eyes, braid, long hair, dark skin, gloves, armor, navel, bracelet, lips, hair over shoulder, smile, looking at viewer, arm tattoo, mole, mole above mouth
Negative prompt: sword, holding, lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, censored, letterbox, blurry
Steps: 20, Sampler: DPM++ SDE Karras, CFG scale: 7, Seed: 1059483644, Size: 512x512, Model hash: fc2511737a, Model: chilloutmix_NiPrunedFp32Fix, Denoising strength: 0.7, Clip skip: 2, ENSD: 31337, Hires resize: 768x1024, Hires upscaler: Latent

Model: the Jinx model on Civitai

prompt: JinxLol,mature female,1girl, solo,looking at viewer, navel, gloves, fingerless gloves, character name, midriff, bare shoulders, looking at viewer, gun, crop top, belt,outdoors,,
Negative prompt: (low quality, worst quality:1.3), (lowres), blurry,text,watermark,signature,artist name,letterboxed, female pubic hair,realism
Steps: 20, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 2754482513, Size: 512x768, Model hash: fc2511737a, Model: chilloutmix_NiPrunedFp32Fix, Denoising strength: 0.5, Clip skip: 2, Hires upscale: 2, Hires upscaler: Latent

Model: the Ahri model on Civitai

prompt:(8k, RAW photo:1.2),best quality, ultra high res, ahri, ahri_(league_of_legends), 1girl, absurdres, animal_ears, black_hair, breasts, detached_sleeves, distr, highres, facial_mark, fox_ears, fox_tail, hand_up, league_of_legends, long_hair, long_sleeves, magic, multiple_tails, orange_eyes, parted_lips, solo, standing, tail, full body
Negative prompt: (painting by bad-artist-anime:0.9), (painting by bad-artist:0.9), watermark, text, error, blurry, jpeg artifacts, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, artist name, (worst quality, low quality:1.4), bad anatomy
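The prompts above rely on webui's emphasis syntax: each pair of round brackets multiplies the attention on the enclosed text by 1.1, each pair of square brackets divides it by 1.1, and `(text:1.3)` applies an explicit multiplier. The sketch below computes the resulting weights for a prompt. `parse_attention` is a name made up for this example, and it is a simplified reading of the rules: escaped brackets and the prompt-editing forms of square brackets are not handled.

```python
import re

# Tokens: ":<number>)" closing an explicit weight, single brackets, plain text, stray ":".
TOKEN_RE = re.compile(r":\s*([+-]?\d*\.?\d+)\s*\)|[()\[\]]|[^()\[\]:]+|:")

def parse_attention(prompt):
    """Return a list of (text, weight) pairs for a prompt with emphasis brackets."""
    pieces = []          # [text, weight] entries in prompt order
    round_starts = []    # index in pieces where each open "(" began
    square_starts = []   # index in pieces where each open "[" began

    def scale(start, factor):
        for piece in pieces[start:]:
            piece[1] *= factor

    for match in TOKEN_RE.finditer(prompt):
        token = match.group(0)
        if token == "(":
            round_starts.append(len(pieces))
        elif token == "[":
            square_starts.append(len(pieces))
        elif match.group(1) is not None and round_starts:
            scale(round_starts.pop(), float(match.group(1)))  # (text:1.3)
        elif token == ")" and round_starts:
            scale(round_starts.pop(), 1.1)                    # (text)
        elif token == "]" and square_starts:
            scale(square_starts.pop(), 1 / 1.1)               # [text]
        else:
            pieces.append([token, 1.0])
    # Drop separators that are only commas or spaces and tidy the text.
    return [(t.strip(" ,"), round(w, 4)) for t, w in pieces if t.strip(" ,")]

weights = parse_attention("((4k,masterpiece,best quality)), lotus, (bad hands:1.3)")
```

Here the doubled brackets around the quality tags give them a weight of 1.21, `lotus` stays at 1.0, and `bad hands` gets 1.3.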

controlNet

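controlNet is also used to guide image generation. Its advantage is that, on top of text-to-image generation, a second image is supplied as guidance, so the generated image resembles it in some low-level features. The features that can be used fall into the following types:

  • canny: edge detection
  • depth: depth estimation
  • hed: edge detection
  • mlsd: line segment detection
  • normal_map: surface normal map (captures 3D shape)
  • openpose: pose skeleton extraction
  • openpose_hand: pose skeleton plus hands
  • scribble: black-and-white line draft extraction
  • fake_scribble: scribble-style extraction
  • segmentation: segmentation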
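To give a feel for what an edge-type feature looks like before it is handed to controlNet, here is a minimal sketch that thresholds the Sobel gradient of a grayscale grid. It is a simpler stand-in and not the Canny or HED algorithm that the preprocessors actually use. The function name `edge_map` and the `threshold` parameter are made up for this example.

```python
def edge_map(img, threshold=1.0):
    """Mark pixels whose Sobel gradient magnitude reaches the threshold.

    img is a 2D list of grayscale values; border pixels are left at 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            if (gx * gx + gy * gy) ** 0.5 >= threshold:
                out[y][x] = 1
    return out

# A toy image: dark left half, bright right half.
img = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
edges = edge_map(img)
```

Only the pixels along the dark-to-bright boundary are marked, and a map like this is the kind of low-level structure the generated image is pushed to follow.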

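Here is a site I recommend that catalogs the styles of many painters. It is a handy reference for getting a diffusion model to generate images in a similar style: https://lib.kalos.art/topic?model=1&topic=0

Guide image (a plush toy I won from a claw machine at the mall yesterday):

Generated image: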

prompt: ((4k,masterpiece,best quality)), cat, animal painting by Jasmine Becket-Griffith, detailed,multicoloured

And one more, in a Chinese style:

prompt: cat, animal painting by Bada Shanren, beautiful details, shuimobysim, traditional chinese ink painting
