Laura Sánchez García, Julio Antonio Soto Vicente
IE University (C4__466671 - Advanced Artificial Intelligence)
IE University - L. Sánchez, J.A. Soto · Diffusion models · Fall 2024

Outline
1 Intro
2 Denoising Diffusion Probabilistic Models
3 Advancements and improvements
4 Large diffusion models
5 Beyond image generation

Intro

A (certainly not complete) list of generative model families:
• Latent Variable models (incl. VAEs)
• Autoregressive models (incl. GPT-style Language Models)
• GANs
• Flow-based models (incl. Normalizing Flows)
• Energy-Based Models (incl. Score-based models)
• Diffusion models (kind of a mix of all previous points)
• Combinations

Image generation
[Figure: generated samples. Source: Ho et al. [2020]]
[Figure: "A photo of a Corgi dog riding a bike in Times Square. It is wearing sunglasses and a beach hat." Source: Saharia et al. [2022]]
[Figure. Source: Brooks et al. [2024]]

Denoising Diffusion Probabilistic Models

Outline
• The forward process
• The Nice™ property
• The reverse process
• Loss function
• Training algorithm
• The model
• Sampling algorithm

"Creating noise from data is easy; creating data from noise is generative modeling." — Song et al., 2020

DDPMs progressively generate images out of noise. [Figure. Source: Ho et al. [2020]]

In order for a model to learn how to do that, we need a process to generate suitable training data in the form of (noise, image) pairs. We will refer to the process of creating the training data as the forward diffusion process, which will progressively make an image noisier.

Our model will learn how to revert that noising process through what is known as the reverse diffusion process → progressively denoising a noisy image. Once trained, the model should therefore have learned how to denoise images. So we can generate some purely random noise, run it through our model, and get back an image generated out of pure noise!

More formally, DDPMs work through many steps t = 0, 1, …, T:
• x0 is the original image
• q(xt ∣ xt−1) is the forward diffusion process
• pθ(xt−1 ∣ xt) will be the reverse diffusion process (learned by our model with weights θ)

During forward diffusion we add Gaussian (Normal) noise to the image at every step t, producing noisy images x1, x2, …, xT. As t grows, the image becomes more and more noisy.

The forward process

q(xt ∣ xt−1) ≔ √(1−βt) · xt−1 + N(0, βt I)

• Take the image at some step t−1
• Generate Gaussian noise from an isotropic multivariate Normal of the size of xt
• Scale the xt−1 values by √(1−βt) (so the data scale does not grow as we add noise)
• Add the noise to the scaled image

It can be directly computed as

q(xt ∣ xt−1) ≔ N(xt; √(1−βt) xt−1, βt I)

The variance βt controls how much noise is added in each step t.¹

¹ In the paper it is made to grow linearly from β1 = 10⁻⁴ to βT = 0.02 for T = 1000.

The full forward process is therefore:

q(x1:T ∣ x0) ≔ ∏_{t=1}^{T} q(xt ∣ xt−1)

For a large T, the final image is basically only noise (all original image information is essentially lost), so it becomes roughly xT ~ N(0, I).

The Nice™ property

Trick to get any xt from x0 without having to compute the intermediate steps. Defining αt ≔ 1−βt and ᾱt ≔ ∏_{s=1}^{t} αs, we can use the reparametrization trick for the Normal distribution to get:

q(xt ∣ x0) = N(xt; √ᾱt x0, (1−ᾱt) I)

Details in Appendix A!

• Easier, faster computation
• Any image state xt comes from a (Normal) probability distribution, drastically simplifying derivations

In the paper this is described as "a notable property". I believe the first to call it a nice property was Weng [2021]. We will call it the Nice™ property.

The reverse process

We will train a model pθ to learn to perform the reverse process. Starting from p(xT) = N(0, I), it will try to recreate the image!

pθ(xt−1 ∣ xt) ≔ N(xt−1; μθ(xt, t), Σθ(xt, t))

• μθ(xt, t) will be a neural network prediction
• Σθ(xt, t) will be set to a value σt² I based on βt

And:

pθ(x0:T) ≔ p(xT) ∏_{t=1}^{T} pθ(xt−1 ∣ xt)

Summary: the forward process posterior is the ground-truth reverse diffusion process that the model will learn to approximate!

Loss function

Just like in VAEs, the loss function is based on the Evidence Lower Bound (ELBO):

ELBO = E_{q(x1:T ∣ x0)}[log (pθ(x0:T) / q(x1:T ∣ x0))]

which decomposes into (Details in Appendix B!):

E_q[ D_KL(q(xT ∣ x0) ‖ p(xT)) + ∑_{t>1} D_KL(q(xt−1 ∣ xt, x0) ‖ pθ(xt−1 ∣ xt)) − log pθ(x0 ∣ x1) ] = LT + ∑_{t>1} Lt−1 + L0

• LT → prior matching term. Has no learnable parameters, so it can be ignored
• Lt−1 → denoising matching terms, which training focuses on
• L0 → reconstruction term. Only learns how to go from x1 to x0, so the authors ended up ignoring it (simpler and better results)

The loss therefore focuses on Lt−1:

Lt−1 = E_q[ D_KL( q(xt−1 ∣ xt, x0) ‖ pθ(xt−1 ∣ xt) ) ]

Where:
• q(xt−1 ∣ xt, x0) is the forward process posterior (i.e. what would be the perfect, ground-truth reverse process) conditioned on x0
• pθ(xt−1 ∣ xt) will be our learned reverse process as defined above

The forward process posterior is tractable and can be computed as:

q(xt−1 ∣ xt, x0) = N(xt−1; μ̃t(xt, x0), β̃t I)

where

μ̃t(xt, x0) ≔ (√ᾱt−1 βt / (1−ᾱt)) x0 + (√αt (1−ᾱt−1) / (1−ᾱt)) xt,  β̃t ≔ ((1−ᾱt−1) / (1−ᾱt)) βt

Details in Appendix C!

The loss is therefore the KL divergence between two Normals: the forward process posterior q(xt−1 ∣ xt, x0) and the reverse process that our model will learn, pθ(xt−1 ∣ xt). Since both are Normal distributions, this KL divergence is:

E_q[ (1 / (2σt²)) ‖ μ̃t(xt, x0) − μθ(xt, t) ‖² ]

• μ̃t(xt, x0): the forward process posterior mean
• μθ(xt, t): the model's prediction

However: the authors decide instead to predict the noise ε added during the forward process. Reformulating (Details in Appendix D!), the simplified loss becomes:

E_{x0, ε, t}[ ‖ ε − εθ(xt, t) ‖² ]

• ε: noise added in the forward pass
• εθ(xt, t): model predicting that noise using xt and t as features

Training algorithm

[Algorithm 1: Training. Source: Ho et al. [2020]]

Where √ᾱt x0 + √(1−ᾱt) ε is just xt computed through the Nice™ property!

The model

The proposed model is a U-Net architecture (Ronneberger et al. [2015]) that includes self-attention blocks. They also include GroupNorm (Wu and He [2018]) in the ResNet and self-attention blocks. t is added on every ResNet block through positional encoding (Vaswani et al. [2017]).

Sampling algorithm

Once the model is trained, we can generate new images by:

[Algorithm 2: Sampling. Source: Ho et al. [2020]]

Step 4 just applies the reparametrization trick to the learned reverse process pθ(xt−1 ∣ xt) = N(xt−1; μθ(xt, t), Σθ(xt, t)).

Sampling is an iterative process: we progressively remove predicted noise. You may wonder: if at any single step we are predicting the full added noise ε, why don't we remove it completely in a single step? Answer: Details in Appendix E!

Advancements and improvements

Outline
• Variance/noise schedulers
• Learning the reverse process variance
• Faster sampling: DDIMs
• Conditional generation
• Classifier Guidance
• Classifier-Free Guidance
• Conditioning on images
• ControlNet
• Conditioning on text

Variance/noise schedulers

Nichol and Dhariwal [2021] propose a cosine scheduler for ᾱt, with a small offset s to prevent βt from being tiny when t is close to 0 (they set it to s = 0.008).

[Figure: comparison between the scheduler in DDPM and Nichol & Dhariwal's cosine scheduler proposal. Source: Nichol & Dhariwal [2021]]
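The forward process and the Nice™ property can be sketched in a few lines of NumPy. T and the linear β schedule follow the values quoted from the paper; the 16-dimensional "image" and the variable names are illustrative assumptions, not part of the original deck.

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear variance schedule from the DDPM paper: beta_1 = 1e-4 ... beta_T = 0.02
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # alpha-bar_t = prod_{s<=t} alpha_s

x0 = rng.standard_normal(16)     # stand-in for a flattened image

def q_step(x_prev, t):
    """One forward step: x_t = sqrt(1-beta_t) * x_{t-1} + sqrt(beta_t) * eps."""
    eps = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - betas[t]) * x_prev + np.sqrt(betas[t]) * eps

def q_sample(x0, t, eps):
    """Nice(tm) property: sample x_t directly from x_0 in closed form."""
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

# By t = T almost all signal is gone (alpha-bar_T is ~0), so x_T ~ N(0, I)
xT = q_sample(x0, T - 1, rng.standard_normal(x0.shape))
```

Under this schedule `alpha_bars[-1]` is on the order of 1e-5, which is why the deck can say that for large T the final image is essentially pure noise.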
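The step from the posterior-mean loss to the noise-prediction loss (the Appendix D reformulation) can be checked numerically: when xt is built from x0 and ε via the Nice™ property, the posterior mean μ̃t(xt, x0) equals the same mean rewritten in terms of ε. A small sketch under the same linear-schedule assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

t = 500
x0 = rng.standard_normal(16)
eps = rng.standard_normal(16)
# x_t from the Nice(tm) property
x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

ab_prev = alpha_bars[t - 1]
# Posterior mean mu-tilde_t written in terms of x0 and x_t
mu_tilde = (np.sqrt(ab_prev) * betas[t] / (1.0 - alpha_bars[t]) * x0
            + np.sqrt(alphas[t]) * (1.0 - ab_prev) / (1.0 - alpha_bars[t]) * x_t)
# The same mean written in terms of the forward noise eps
mu_eps = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
```

The two expressions agree to floating-point precision, which is exactly why predicting ε is enough to recover the posterior mean.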
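One iteration of the training algorithm (Algorithm 1) then reduces to: sample t, sample ε, build xt via the Nice™ property, and compare ε with the model's prediction. A minimal NumPy sketch; `eps_model` is a hypothetical stand-in for the U-Net εθ, which in practice is trained by gradient descent on this loss:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def training_loss(eps_model, x0):
    """One step of Algorithm 1: sample t and eps, build x_t, compare noises."""
    t = int(rng.integers(1, T))                              # t ~ Uniform({1..T-1})
    eps = rng.standard_normal(x0.shape)                      # target noise
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
    return np.mean((eps - eps_model(x_t, t)) ** 2)           # simplified loss

# Hypothetical dummy predictor: always outputs zero noise
dummy_model = lambda x_t, t: np.zeros_like(x_t)
x0 = rng.standard_normal(16)
loss = training_loss(dummy_model, x0)
```

With the zero predictor the loss is just the mean squared norm of the sampled noise; a trained εθ would drive it toward zero.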
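The iterative sampling procedure (Algorithm 2) can be sketched the same way. The update is step 4 of the algorithm with σt² = βt; the zero-noise predictor used below is a placeholder assumption so the loop runs end to end, where a real model would plug in its trained εθ:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def sample(eps_model, shape):
    """Algorithm 2: start from pure noise and iteratively denoise."""
    x = rng.standard_normal(shape)                    # x_T ~ N(0, I)
    for t in range(T - 1, -1, -1):
        z = rng.standard_normal(shape) if t > 0 else np.zeros(shape)
        eps_hat = eps_model(x, t)
        # mu_theta rewritten in terms of the predicted noise (step 4)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        x = x + np.sqrt(betas[t]) * z                 # sigma_t^2 = beta_t choice
    return x

# Placeholder predictor; a trained eps_theta would steer x toward the data
out = sample(lambda x, t: np.zeros_like(x), (16,))
```

Note that no noise z is added at the very last step, matching the algorithm in the paper.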