DIG: Data Intelligence Generation Lab

School of Computer Science and Information Technology, Shanxi University
Institute of Big Data Science and Industry, Shanxi University

Principles and Development of Generative Adversarial Networks: Generative Adversarial Nets
December 2023, Knowledge Engineering

OUTLINE
• Generative Adversarial Nets (GANs)
• Deep Convolutional Generative Adversarial Networks (DCGAN)
• Conditional Generative Adversarial Nets (CGAN)

Generative Adversarial Nets (GANs)

Supervised learning often trains better than unsupervised learning, but in the real world the labeled data it requires is relatively scarce. Researchers have therefore never stopped searching for better unsupervised learning strategies, hoping to learn representations, and even knowledge, of the real world from massive amounts of unlabeled data, and thereby understand it better. Among the many ways to evaluate unsupervised learning, generation is the most direct: only when we can generate, or create, our real world can we claim to understand it completely. Generative models, however, face two major difficulties. First, they require a great deal of prior knowledge to model the real world, including which priors and which distributions to choose, and the quality of this modeling directly determines how well the generative model performs. Second, real-world data is often complex, so the computation needed to fit the model can be enormous, sometimes prohibitively so.

Generative Adversarial Nets (GANs)

The Generative Adversarial Networks (GANs) proposed by Ian Goodfellow neatly sidestep both difficulties. Every GAN framework contains a pair of models: a generative model (G) and a discriminative model (D). It is the presence of D that frees G from needing prior knowledge of, or a complex model for, the real data: G can still learn to approximate the real data, until the samples it generates are so realistic that even D cannot tell them apart.

The optimization objective from the paper:

min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]

Generative Adversarial Nets (GANs)

Each training step samples a minibatch of m examples {x_1, x_2, ..., x_m} from the data and a minibatch of m noise samples {z_1, z_2, ..., z_m} for the Generator; the Discriminator is trained to tell the two apart.

Generative Adversarial Nets (GANs)

Code walkthrough and experimental results:

```python
# Define the discriminator
def discriminator(x):
    # Compute D_h1 = ReLU(x * D_W1 + D_b1); this layer's input is a
    # 784-element vector
    D_h1 = tf.nn.relu(tf.matmul(x, D_W1) + D_b1)
    # Compute the third layer's output. Because a sigmoid is used, the output
    # is a scalar in [0, 1] (see the weight definitions above), i.e. a
    # judgment of whether the input image is real (= 1) or fake (= 0)
    D_logit = tf.matmul(D_h1, D_W2) + D_b2
    D_prob = tf.nn.sigmoid(D_logit)
    # Return the probability of "real" and the third layer's input; D_logit is
    # returned so it can be fed to tf.nn.sigmoid_cross_entropy_with_logits()
    # to build the loss function
    return D_prob, D_logit
```

Generative Adversarial Nets (GANs)

```python
# Generate an m x n random matrix with entries drawn from a uniform
# distribution; the random z it produces is the generator's input
def sample_Z(m, n):
    return np.random.uniform(-1., 1., size=[m, n])

# Define the generator
def generator(z):
    # The first layer computes y = z * G_W1 + G_b1, then feeds it through the
    # activation, G_h1 = ReLU(y); G_h1 is the activation output of the second
    # layer
    G_h1 = tf.nn.relu(tf.matmul(z, G_W1) + G_b1)
    # The next two statements compute the activations propagated from the
    # second layer to the third; the result is a 784-element vector, which
    # reshaped to 28 x 28 represents an image
    G_log_prob = tf.matmul(G_h1, G_W2) + G_b2
    G_prob = tf.nn.sigmoid(G_log_prob)
    return G_prob
```

Generative Adversarial Nets (GANs)

```python
# Feed real images and generated images into the discriminator to judge real
# versus fake (the discriminator returns both the probability and the logit)
D_real, D_logit_real = discriminator(X)
D_fake, D_logit_fake = discriminator(G_sample)
# Discriminator loss and generator loss from the original paper
D_loss = -tf.reduce_mean(tf.log(D_real) + tf.log(1. - D_fake))
G_loss = -tf.reduce_mean(tf.log(D_fake))
# Optimize both with Adam; the var_list keyword names the weight matrices
# that minimizing each loss is allowed to update
D_solver = tf.train.AdamOptimizer().minimize(D_loss, var_list=theta_D)
G_solver = tf.train.AdamOptimizer().minimize(G_loss, var_list=theta_G)
```

Generative Adversarial Nets (GANs)

Advantages of GANs:
1. Judging by actual results, they appear to produce better samples than other models (sharper, clearer images).
2. The adversarial framework can train any kind of generator network. Most other frameworks require the generator network to have some particular functional form, such as a Gaussian output layer. More importantly, all the other frameworks require the generator to spread non-zero mass everywhere, whereas GANs can learn to generate points only on a thin manifold close to the data.
3. There is no need to design a model that follows any particular factorization; any generator network and any discriminator will work.
4. No repeated Markov-chain sampling is needed, and no inference is required during learning, which sidesteps the difficulty of approximating intractable probabilities.

Generative Adversarial Nets (GANs)

Drawbacks of GANs:
1. Non-convergence. The fundamental open problem: all the theory says GANs should excel at a Nash equilibrium, but gradient descent only guarantees reaching one for convex functions. When both players are represented by neural networks, it is possible for them to keep adjusting their strategies forever without ever actually reaching equilibrium [Ian Goodfellow of OpenAI, on Quora].
2. Hard to train: the collapse problem. GAN training can collapse, with the generator degenerating to always produce the same sample point, at which point learning cannot continue [Improved Techniques for Training GANs].
3. No up-front modeling, so the model is overly free and hard to control. Compared with other generative models, the adversarial setup no longer requires an assumed data distribution; it samples from a distribution directly, so in theory it can approximate the real data exactly, which is GANs' greatest advantage. The downside of skipping explicit modeling, however, is too much freedom: for larger images with more pixels, a plain GAN becomes hard to control. In the GAN paper [Goodfellow Ian, Pouget-Abadie J], each round of parameter updates runs k updates of D for every single update of G, partly out of this concern.

OUTLINE
• Generative Adversarial Nets (GANs)
• Deep Convolutional Generative Adversarial Networks (DCGAN)
• Conditional Generative Adversarial Nets (CGAN)

Conditional Generative Adversarial Nets (CGAN)

In this work we introduce the conditional version of generative adversarial nets, which can be constructed by simply feeding the data, y, we wish to condition on to both the generator and discriminator. We show that this model can generate MNIST digits conditioned on class labels. We also illustrate how this model could be used to learn a multi-modal model, and provide preliminary examples of an application to image tagging in which we demonstrate how this approach can generate descriptive tags which are not part of training labels.

Conditional Generative Adversarial Nets (CGAN)

Generative adversarial nets can be extended to a conditional model if both the generator and discriminator are conditioned on some extra information y. y could be any kind of auxiliary information, such as class labels or data from other modalities. We can perform the conditioning by feeding y into both the discriminator and generator as an additional input layer.

Conditional Generative Adversarial Nets (CGAN)

In the generator the prior input noise p_z(z) and y are combined in a joint hidden representation, and the adversarial training framework allows for considerable flexibility in how this hidden representation is composed. In the discriminator x and y are presented as inputs to a discriminative function (embodied again by an MLP in this case).

OUTLINE
• Generative Adversarial Nets (GANs)
• Deep Convolutional Generative Adversarial Networks (DCGAN)
• Conditional Generative Adversarial Nets (CGAN)

Deep Convolutional Generative Adversarial Networks (DCGAN)

In recent years, supervised learning with convolutional networks (CNNs) has seen huge adoption in computer vision applications. Comparatively, unsupervised learning with CNNs has received less attention. In this work we hope to help bridge the gap between the success of CNNs for supervised learning and unsupervised learning. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), that have certain architectural constraints, and demonstrate that they are a strong candidate for unsupervised learning. Training on various image datasets, we show convincing evidence that our deep convolutional adversarial pair learns a hierarchy of representations from object parts to scenes in both the generator and discriminator. Additionally, we use the learned features for novel tasks, demonstrating their applicability as general image representations.

Deep Convolutional Generative Adversarial Networks (DCGAN)

In this paper, we make the following contributions:
• We propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings. We name this class of architectures Deep Convolutional GANs (DCGAN).
• We use the trained discriminators for image classification tasks, showing competitive performance with other unsupervised algorithms.
• We visualize the filters learnt by GANs and empirically show that specific filters have learned to draw specific objects.
• We show that the generators have interesting vector arithmetic properties allowing for easy manipulation of many semantic qualities of generated samples.

Deep Convolutional Generative Adversarial Networks (DCGAN)

Background: Historical attempts to scale up GANs using CNNs to model images have been unsuccessful. We also encountered difficulties attempting to scale GANs using CNN architectures commonly used in the supervised literature. However, after extensive model exploration we identified a family of architectures that resulted in stable training across a range of datasets and allowed for training higher resolution and deeper generative models. Core to our approach is adopting and modifying three recently demonstrated changes to CNN architectures.

Deep Convolutional Generative Adversarial Networks (DCGAN)

Architecture guidelines for stable Deep Convolutional GANs:
• Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
• Use batchnorm in both the generator and the discriminator.
• Remove fully connected hidden layers for deeper architectures.
• Use ReLU activation in the generator for all layers except for the output, which uses Tanh.
• Use LeakyReLU activation in the discriminator for all layers.

Deep Convolutional Generative Adversarial Networks (DCGAN)

APPROACH AND MODEL ARCHITECTURE

The first is the all convolutional net, which replaces deterministic spatial pooling functions (such as max pooling) with strided convolutions. We use this approach in our generator, allowing it to learn its own spatial upsampling, and in our discriminator.

Second is the trend towards eliminating fully connected layers on top of convolutional features. The strongest example of this is global average pooling, which has been utilized in state-of-the-art image classification models (Mordvintsev et al.). We found global average pooling increased model stability but hurt convergence speed. A middle ground of directly connecting the highest convolutional features to the input and output, respectively, of the generator and discriminator worked well. The first layer of the GAN, which takes a uniform noise distribution Z as input, could be called fully connected as it is just a matrix multiplication, but the result is reshaped into a 4-dimensional tensor and used as the start of the convolution stack. For the discriminator, the last convolution layer is flattened and then fed into a single sigmoid output. See Fig. 1 for a visualization of an example model architecture.

Deep Convolutional Generative Adversarial Networks (DCGAN)

Generator model: (architecture figure, not included in this text version)

Deep Convolutional Generative Adversarial Networks (DCGAN)

Discriminator model:

```python
h0 = lrelu(conv2d(image, self.df_dim, name='d_h0_conv'))
h1 = lrelu(self.d_bn1(conv2d(h0, self.df_dim * 2, name='d_h1_conv')))
h2 = lrelu(self.d_bn2(conv2d(h1, self.df_dim * 4, name='d_h2_conv')))
h3 = lrelu(self.d_bn3(conv2d(h2, self.df_dim * 8, name='d_h3_conv')))
h4 = linear(tf.reshape(h3, [self.batch_size, -1]), 1, 'd_h4_lin')
```

Deep Convolutional Generative Adversarial Networks (DCGAN)

Third is Batch Normalization (Ioffe & Szegedy, 2015), which stabilizes learning by normalizing the input to each unit to have zero mean and unit variance. This helps deal with training problems that arise due to poor initialization and helps gradient flow in deeper models. Directly applying batchnorm to all layers, however, resulted in sample oscillation and model instability. This was avoided by not applying batchnorm to the generator output layer and the discriminator input layer. The ReLU activation (Nair & Hinton, 2010) is used in the generator with the exception of the output layer, which uses the Tanh function. Within the discriminator we found the leaky rectified activation (Maas et al., 2013; Xu et al., 2015) to work well, especially for higher resolution modeling. This is in contrast to the original GAN paper, which used the maxout activation (Goodfellow et al., 2013).

Deep Convolutional Generative Adversarial Networks (DCGAN)

Training details:
1. Mini-batch training with a batch size of 128.
2. All weights are initialized from a normal distribution with mean 0 and standard deviation 0.02.
3. The slope of the LeakyReLU is 0.2.
4. While earlier GANs used momentum to accelerate training, DCGAN uses the Adam optimizer with tuned hyperparameters.
5. Learning rate = 0.0002.
6. The momentum parameter beta1 is reduced from 0.9 to 0.5 to prevent oscillation and instability.

Deep Convolutional Generative Adversarial Networks (DCGAN)

4.1 LSUN
As visual quality of samples from generative image models has improved, concerns of over-fitting and memorization of training samples have risen. To demonstrate how our model scales with more data and higher resolution generation, we train a model on the LSUN bedrooms dataset containing a little over 3 million training examples. Recent analysis has shown that there is a direct link between how fast models learn and their generalization performance (Hardt et al., 2015). We show samples from one epoch of training (Fig. 2), mimicking online learning, in addition to samples after convergence (Fig. 3), as an opportunity to demonstrate that our model is not producing high quality samples via simply overfitting/memorizing training examples. No data augmentation was applied to the images.

Deep Convolutional Generative Adversarial Networks (DCGAN)

4.1.1 DEDUPLICATION
To further decrease the likelihood of the generator memorizing input examples (Fig. 2) we perform a simple image de-duplication process. We fit a 3072-128-3072 de-noising dropout regularized RELU autoencoder on 32x32 downsampled center-crops of training examples. The resulting code layer activations are then binarized via thresholding the ReLU activation, which has been shown to be an effective information preserving technique (Srivastava et al., 2014) and provides a convenient form of semantic-hashing, allowing for linear time de-duplication.
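The GAN objective quoted from the paper has a known optimum: for a fixed G, the optimal discriminator is D*(x) = p_data(x) / (p_data(x) + p_g(x)), and when p_g = p_data the value of the game is −log 4. A minimal sketch over discrete toy distributions (the function name and the distributions are illustrative, not from the slides):

```python
import math

def value_fn(p_data, p_g):
    """V(D*, G) for discrete distributions, with the optimal
    discriminator D*(x) = p_data(x) / (p_data(x) + p_g(x))."""
    v = 0.0
    for pd, pg in zip(p_data, p_g):
        d = pd / (pd + pg)           # optimal discriminator output
        v += pd * math.log(d)        # E_{x~p_data}[log D(x)]
        v += pg * math.log(1 - d)    # E_{z~p_z}[log(1 - D(G(z)))]
    return v

# When the generator matches the data distribution, D* = 1/2 everywhere
# and the value is -log 4.
p = [0.1, 0.2, 0.3, 0.4]
print(value_fn(p, p))  # ≈ -1.3863
```

When the generator matches the data, every term contributes log(1/2), and the two expectations sum to −2 log 2 = −log 4, the global optimum from the original paper's analysis.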
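The drawbacks section notes that the original paper updates D k times for every single update of G. A toy sketch of that alternating schedule, using plain NumPy with a 1-D Gaussian "dataset", a scalar logistic discriminator, and an affine generator; all of these stand-ins are illustrative assumptions (the slides' networks are MLPs on MNIST), and the generator uses the non-saturating loss −log D(G(z)) from the code above:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Illustrative toy setup: data ~ N(3, 1), generator G(z) = a*z + b,
# discriminator D(x) = sigmoid(w*x + c).
a, b = 1.0, 0.0
w, c = 0.1, 0.0
lr, k = 0.05, 2  # k discriminator steps per generator step

def d_out(x):
    return sigmoid(w * x + c)

for step in range(200):
    for _ in range(k):  # k discriminator updates ...
        x = rng.normal(3.0, 1.0, 64)
        z = rng.uniform(-1.0, 1.0, 64)
        gx = a * z + b
        # Manual gradients of D_loss = -mean(log D(x)) - mean(log(1 - D(G(z))))
        grad_real = -(1 - d_out(x))   # d/d(logit) of -log D(x)
        grad_fake = d_out(gx)         # d/d(logit) of -log(1 - D(gx))
        w -= lr * (np.mean(grad_real * x) + np.mean(grad_fake * gx))
        c -= lr * (np.mean(grad_real) + np.mean(grad_fake))
    # ... then one generator update on G_loss = -mean(log D(G(z)))
    z = rng.uniform(-1.0, 1.0, 64)
    gx = a * z + b
    dgrad = -(1 - d_out(gx)) * w      # dG_loss / d(gx)
    a -= lr * np.mean(dgrad * z)
    b -= lr * np.mean(dgrad)
```

After training, the generator's offset b has moved toward the data mean; the point of the sketch is only the loop structure (k D-steps, then 1 G-step), not the toy model itself.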
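The discriminator code returns D_logit so a loss can be built with tf.nn.sigmoid_cross_entropy_with_logits(), which TensorFlow documents as computing the numerically stable form max(x, 0) − x·z + log(1 + e^(−|x|)) for logit x and label z. A small check (in plain Python, an assumption-free restatement rather than TensorFlow itself) that this matches the naive sigmoid-then-log formula:

```python
import math

def sigmoid_xent_with_logits(x, z):
    """Stable sigmoid cross-entropy, the form documented for
    tf.nn.sigmoid_cross_entropy_with_logits (x: logit, z: 0/1 label)."""
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))

def naive_xent(x, z):
    """Naive version: sigmoid first, then log; overflows for large |x|."""
    p = 1.0 / (1.0 + math.exp(-x))
    return -(z * math.log(p) + (1 - z) * math.log(1 - p))

print(sigmoid_xent_with_logits(2.0, 1.0))  # ≈ 0.1269
```

The stable form never exponentiates a large positive number, which is why the loss is built from logits rather than from the sigmoid output D_prob.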
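CGAN performs its conditioning by feeding y into both the generator and discriminator as an additional input layer. A common realization (an assumption here; one standard reading of "additional input layer", as in MNIST CGAN implementations) concatenates a one-hot y with the noise z for G, and with the flattened image x for D:

```python
import numpy as np

def one_hot(labels, num_classes=10):
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

# MNIST-style sizes matching the earlier code: 100-D noise, 784-D images
batch, z_dim, img_dim, n_cls = 32, 100, 784, 10
z = np.random.uniform(-1.0, 1.0, size=(batch, z_dim))
x = np.random.uniform(0.0, 1.0, size=(batch, img_dim))
y = one_hot(np.random.randint(0, n_cls, size=batch), n_cls)

g_in = np.concatenate([z, y], axis=1)  # generator input: noise + label
d_in = np.concatenate([x, y], axis=1)  # discriminator input: image + label
print(g_in.shape, d_in.shape)  # (32, 110) (32, 794)
```

With this wiring, the generator's first layer sees the joint (z, y) representation, and the same label is presented to the discriminator alongside the real or generated image.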
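The fractional-strided (transposed) convolutions in the DCGAN guidelines let the generator grow the reshaped 4×4 noise tensor up to the output resolution. A size-arithmetic sketch using the transposed-convolution formula o = (i − 1)·s − 2p + k + output_padding; the 5×5 kernel and stride 2 match common DCGAN implementations, while the padding values are assumptions chosen so each layer exactly doubles the spatial size:

```python
def tconv_out(i, k=5, s=2, p=2, output_padding=1):
    """Spatial output size of a fractional-strided (transposed) convolution."""
    return (i - 1) * s - 2 * p + k + output_padding

# Generator path: project-and-reshape to 4x4, then double repeatedly
sizes = [4]
while sizes[-1] < 64:
    sizes.append(tconv_out(sizes[-1]))
print(sizes)  # [4, 8, 16, 32, 64]
```

The discriminator runs the same arithmetic in reverse: stride-2 (ordinary) convolutions halve the spatial size at each layer until the final flatten-and-sigmoid.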
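Batch Normalization, as described in the approach section, normalizes the input to each unit to zero mean and unit variance over the minibatch. A NumPy sketch of the per-batch computation (gamma, beta, and eps are the usual learnable scale/shift and numerical-stability constant):

```python
import numpy as np

def batch_norm(h, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each column (unit) over the batch, then scale and shift."""
    mu = h.mean(axis=0)
    var = h.var(axis=0)
    return gamma * (h - mu) / np.sqrt(var + eps) + beta

# Pre-activations with a poor scale/offset: mean 5, std 3
h = np.random.default_rng(1).normal(5.0, 3.0, size=(128, 16))
hn = batch_norm(h)
# Each unit (column) of hn now has ~zero mean and ~unit variance
```

Per the DCGAN text, this is applied everywhere except the generator output layer and the discriminator input layer, which the authors found necessary to avoid sample oscillation.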
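The activation guidelines (ReLU in the generator, Tanh at its output, LeakyReLU with slope 0.2 in the discriminator) can be sketched directly; the remark about rescaling images is added context, not from the slides:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def leaky_relu(x, alpha=0.2):
    # DCGAN's training details fix the slope at 0.2
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
g_hidden = relu(x)        # generator hidden layers
g_output = np.tanh(x)     # generator output layer, bounded in (-1, 1)
d_hidden = leaky_relu(x)  # discriminator layers
```

Because Tanh bounds the generator output in (−1, 1), DCGAN implementations typically rescale training images to that range so real and generated pixels live on the same scale.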
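The de-duplication step hinges on binarizing autoencoder code-layer activations into a semantic hash, so duplicates collide on the same key and can be dropped in linear time. A toy sketch of that idea (the 3-D codes and zero threshold are illustrative; the real setup uses the 128-D codes from the 3072-128-3072 autoencoder described above):

```python
import numpy as np

def code_hash(codes, threshold=0.0):
    """Binarize code activations and pack each row into a hashable key."""
    bits = (codes > threshold).astype(np.uint8)
    return [row.tobytes() for row in bits]

codes = np.array([[0.9, 0.0, 2.1],
                  [0.8, 0.0, 1.7],   # same on/off pattern as row 0 -> duplicate
                  [0.0, 1.2, 0.0]])
seen, keep = set(), []
for i, key in enumerate(code_hash(codes)):
    if key not in seen:
        seen.add(key)
        keep.append(i)
print(keep)  # → [0, 2]
```

A single pass over the dataset with a hash set gives the linear-time behavior the paper mentions; near-duplicates map to the same bit pattern even when their raw activations differ.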
