Introduction to Deep Learning
Huihui Liu
Mar. 1, 2017

Outline
- Conception of deep learning
- Development history
- Deep learning frameworks
- Deep neural network architectures
- Convolutional neural networks
  - Introduction
  - Network structure
  - Training tricks
- Application in Aesthetic Image Evaluation
- Idea

Deep Learning (Hinton, 2006)
Deep learning is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data. Its advantage is that features are extracted automatically rather than designed by hand.
- Computer vision
- Speech recognition
- Natural language processing

Development History
- 1943: MP model (W. S. McCulloch, W. Pitts)
- 1958: Single-layer perceptron (Rosenblatt)
- 1969: XOR problem (Marvin Minsky)
- 1986: BP algorithm (Geoffrey Hinton)
- 1989: CNN, LeNet (Yann LeCun)
- 1991: Vanishing gradient problem
- 1995: SVM
- 1997: LSTM
- 2006: DBN (Hinton)
- 2011: ReLU
- 2012: Dropout, AlexNet (Hinton)
- 2015: BN, Faster R-CNN, Residual Net
Key figures: Hinton, LeCun, Bengio

Deep Learning Frameworks

Deep neural network architectures
- Deep Belief Networks (DBN)
- Recurrent Neural Networks (RNN)
- Generative Adversarial Networks (GANs)
- Convolutional Neural Networks (CNN)
- Long Short-Term Memory (LSTM)

DBN (Deep Belief Network, 2006)
- Hidden units and visible units.
- Each unit is binary (0 or 1).
- Every visible unit connects to all the hidden units.
- Every hidden unit connects to all the visible units.
- There are no visible-visible or hidden-hidden connections.
Fig 1. RBM (restricted Boltzmann machine) structure.
Fig 2. DBN (deep belief network) structure.
Idea? A DBN is composed of multiple layers of RBMs.
How do we train these additional layers? With an unsupervised greedy approach, one layer at a time.
Hinton G E. Deep belief networks[J]. Scholarpedia, 2009, 4(6): 5947.

RNN (Recurrent Neural Network, 2013)
What? An RNN is designed to process sequence data. It remembers previous information and applies it to the calculation of the current output. That is, the nodes of the hidden layer are connected to each other, and the input of the hidden layer includes not only the output of the input layer but also the output of the hidden layer at the previous step.
Applications?
- Machine translation
- Generating image descriptions
- Speech recognition
How to train? BPTT (backpropagation through time).
Marhon S A, Cameron C J F, Kremer S C. Recurrent Neural Networks[M]//Handbook on Neural Information Processing. Springer Berlin Heidelberg, 2013: 29-65.

GANs (Generative Adversarial Networks, 2014)
GANs are inspired by the zero-sum game in game theory and consist of a pair of networks: a generator network and a discriminator network. The generator produces a sample from a random vector; the discriminator decides whether a given sample is natural or counterfeit. Both networks train together and improve until counterfeit and real samples can no longer be distinguished.
Applications:
- Image editing
- Image-to-image translation
- Generating text
- Generating images based on text
- Combination with reinforcement learning
- And more
Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[C]//Advances in Neural Information Processing Systems. 2014: 2672-2680.

Long Short-Term Memory (LSTM, 1997)

Neural Networks
Figure labels: Neuron; Neural network.

Convolutional Neural Networks (CNN)
A convolutional neural network is a kind of feedforward neural network characterized by a simple structure, fewer training parameters, and strong adaptability. A CNN avoids complex image pre-processing (e.g., extracting hand-crafted features); the original image can be fed in directly.
Basic components: convolution layers, pooling layers, fully connected layers.

Convolution layer
The convolution kernel slides over a 2-dimensional plane. At each position, every element of the kernel is multiplied by the image element beneath it, and the products are summed. Moving the kernel across the image yields a new image made up of these sums, one per kernel position.
- Local receptive field
- Weight sharing
Both reduce the number of parameters.

Pooling layer
The pooling layer compresses the input feature map, which reduces the number of parameters to train and the degree of overfitting of the model.
- Max-pooling: select the maximum value in the pooling window.
- Mean-pooling: compute the average of all values in the pooling window.

Fully connected layer and Softmax layer
Each node of the fully connected layer is connected to all the nodes of the previous layer; it combines the features extracted by the earlier layers.
Fig 1. Fully connected layer.
Fig 2. Complete CNN structure.
Fig 3. Softmax layer.

Training and Testing
Training stage:
Before training, the weights should be initialized with small, distinct random numbers.
Forward propagation
- Take a sample (X, Yp) from the sample set and feed X into the network.
- Compute the corresponding actual output Op.
Backpropagation
- Compute the difference between the actual output Op and the corresponding ideal output Yp.
- Adjust the weight matrix to minimize the error.
Testing stage:
Feed different images and labels into the trained convolutional neural network and compare the output with the actual value of each sample.

CNN Structure Evolution
- Roots: Neocognitron (1980); Hinton, BP; LeCun, LeNet (1989, 1998)
- Historical breakthrough: AlexNet (2012), with ReLU, Dropout, GPU + Big Data; ImageNet ILSVRC (ImageNet Large Scale Visual Recognition Challenge)
- Deeper network: VGG16, VGG19, MSRA-Net
- Enhanced functionality of the convolution module: NIN, GoogLeNet, Inception V2 (BN, Batch Normalization), Inception V3, Inception V4
- From classification task to detection task: R-CNN, SPP-Net, Fast R-CNN, Faster R-CNN (RPN)
- New functional units: FCN, FCN+CRF, STNet, CNN+RNN/LSTM
- Integration: ResNet

LeNet (LeCun, 1998)
LeNet is a convolutional neural network designed by Yann LeCun for handwritten digit recognition in 1998. It is one of the most representative experimental systems among early convolutional neural networks. LeNet includes convolution, pooling, and fully connected layers, which are the basic components of modern CNNs. LeNet is considered the beginning of the CNN.
Network structure: 3 convolution layers + 2 pooling layers + 1 fully connected layer + 1 output layer
Haykin S, Kosko B. Gradient Based Learning Applied to Document Recognition[D]. Wiley-IEEE Press, 2009.

AlexNet (Alex, 2012)
- Network structure: 5 convolution layers + 3 fully connected layers
- Nonlinear activation function: ReLU (rectified linear unit)
- Methods to prevent overfitting: Dropout, data augmentation
- Big-data training: ImageNet, an image database on the order of millions of images
- Others: GPU, LRN (local response normalization) layer
Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//International Conference on Neural Information Processing Systems. Curran Associates Inc. 2012: 1097-1105.

OverFeat (2013)
Sermanet P, Eigen D, Zhang X, et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks[J]. Eprint Arxiv, 2013.

VGG-Net (Oxford University, 2014)
- Input: a fixed-size 224*224 RGB image
- Filters: a very small receptive field, 3*3, with stride 1
- Max-pooling: 2*2 pixel window, with stride 2
Fig 1. Architecture of VGG16.
Table 1: ConvNet configurations (shown in columns). The convolutional layer parameters are denoted as "conv⟨receptive field size⟩-⟨number of channels⟩".
Why 3*3 filters?
- Stacked conv. layers have a large receptive field
- More non-linearity
- Fewer parameters to learn
Simonyan K, Zisserman A. Very Deep Convolutional Networks for Large-Scale Image Recognition[J]. Computer Science, 2014.

Network-in-Network (NIN, Shuicheng Yan, 2013)
Network structure: 4 MLPconv layers + global average pooling layer
Fig 1. Linear convolution vs. MLP convolution.
Fig 2. Fully connected layer vs. global average pooling layer.
Fig 3. NIN structure.
MLP convolution:
- Linear combination of multiple feature maps
- Cross-channel information integration
Global average pooling:
- Reduces the parameters
- Reduces the network size
- Avoids overfitting
Min Lin et al. Network in Network. Arxiv, 2013.

GoogLeNet (Inception V1, 2014)
- Proposed the Inception architecture and optimized it
- Removed the fully connected layer
- Used auxiliary classifiers to accelerate network convergence
Fig 1. Inception module, naive version.
Fig 2. Inception module with dimension reductions.
Fig 3. GoogLeNet network (22 layers).
Szegedy C, Liu W, Jia Y, et al. Going deeper with convolutions[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 1-9.

Inception V2 (2015)
Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[J]. arXiv preprint arXiv:1502.03167, 2015.

Inception V3 (2015)
Szegedy C, Vanhoucke V, Ioffe S, et al. Rethinking the inception architecture for computer vision[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 2818-2826.

ResNet (Kaiming He, 2015)
A simple and clean framework for training "very" deep networks.
State-of-the-art performance for:
- Image classification
- Object detection
- Semantic segmentation
- And more
Fig 1. Shortcut connections.
Fig 2. ResNet structure (152 layers).
He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[J]. 2015: 770-778.

FractalNet

Inception V4 (2016)
Szegedy C, Ioffe S, Vanhoucke V, et al. Inception-v4, inception-resnet and the impact of residual connections on learning[J]. arXiv preprint arXiv:1602.07261, 2016.

Inception-ResNet
He K, Zhang X, Ren S, et al. Deep Residual Learning for Image Recognition[J]. 2015: 770-778.

Comparison

SqueezeNet
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size

Xception

R-CNN (2014)
R-CNN: Region proposals + CNN
- Region proposals: Selective Search
- Resize the region proposals: warp all region proposals to the required size (227*227, the AlexNet input)
- Compute CNN features: extract a 4096-dimensional feature vector from each region proposal using AlexNet
- Classify: train a linear SVM classifier for each class
[1] Uijlings J R R, Sande K E A V D, Gevers T, et al. Selective Search for Object Recognition[J]. International Journal of Computer Vision, 2013, 104(2): 154-171.
[2] Girshick R, Donahue J, Darrell T, et al. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation[J]. 2014: 580-587.

SPP-Net (Spatial pyramid pooling network, 2015)
The SPP-Net method computes a convolutional feature map for the entire input image and then classifies each object proposal using a feature vector extracted from the shared feature map.
Advantages:
- Computes the feature map of the entire image once, which saves much time.
- Outputs a fixed-length feature vector for inputs of arbitrary size.
- Extracts features at different scales and can express more spatial information.
Fig 1. Top: a conventional CNN. Bottom: spatial pyramid pooling network structure.
Fig 2. A network structure with a spatial pyramid pooling layer.
He K, Zhang X, Ren S, et al. Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition[J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 2015, 37(9): 1904-1916.

Fast R-CNN (2015)
A Fast R-CNN network takes an entire image and a set of object proposals as input. The network processes the entire image with several convolutional (conv) and max pooling layers to produce a conv feature map. For each object proposal, a region of interest (RoI) pooling layer extracts a fixed-length feature vector from the feature map. Each feature vector is fed into a sequence of fully connected layers that finally branch into two sibling output layers.
Girshick R. Fast r-cnn[C]//Proceedings of the IEEE International Conference on Computer Vision. 2015: 1440-1448.

Faster R-CNN (2015)
Faster R-CNN = RPN + Fast R-CNN
A Region Proposal Network (RPN) takes an image (of any size) as input and outputs a set of rectangular object proposals, each with an objectness score.
Figure 1. Faster R-CNN is a single, unified network for object detection.
Figure 2. Region Proposal Network (RPN).
Ren S, He K, Girshick R, et al. Faster r-cnn: Towards real-time object detection with region proposal networks[C]//Advances in Neural Information Processing Systems. 2015: 91-99.

Training tricks
- Data augmentation
- Dropout
- ReLU
- Batch Normalization

Data Augmentation
- rotation
- flip
- zoom
- shift
- scale
- contrast
- noise disturbance
- color
- ...

Dropout (2012)
Dropout consists of setting to zero the output of each hidden neuron with probability p. The neurons that are "dropped out" in this way do not contribute to the forward pass and do not participate in backpropagation.

ReLU (Rectified Linear Unit)
Advantages:
- Simplified calculation
- Mitigates the vanishing gradient problem

Batch Normalization (2015)
A normalization layer is inserted at the input of each layer of the network. For a layer with d-dimensional input $x = (x^{(1)}, \ldots, x^{(d)})$, each dimension is normalized:
$$\hat{x}^{(k)} = \frac{x^{(k)} - \mathrm{E}[x^{(k)}]}{\sqrt{\mathrm{Var}[x^{(k)}]}}$$
This addresses internal covariate shift.
Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift[J]. arXiv preprint arXiv:1502.03167, 2015.

Application in Aesthetic Image Evaluation
- Dong Z, Shen X, Li H, et al. Photo Quality Assessment with DCNN that Understands Image Well[M]//MultiMedia Modeling. Springer International Publishing, 2015: 524-535.
- Lu X, Lin Z, Jin H, et al. Rating image aesthetics using deep learning[J]. IEEE Transactions on Multimedia, 2015, 17(11): 2021-2034.
- Wang W, Zhao M, Wang L, et al. A multi-scene deep learning model for image aesthetic evaluation[J]. Signal Processing Image Communication, 2016, 47: 511-518.

Photo Quality Assessment with DCNN that Understands Image Well
Figure labels: DCNN_Aesth; well-trained network; a two-class SVM classifier; DCNN_Aesth_SP; original images; segmented images; spatial pyramid; ImageNet; CUHK; AVA.

Rating image aesthetics using deep learning
- Supports heterogeneous inputs, i.e., global and local views.
- All parameters in the DCNN are jointly trained.
- Enables the network to judge image aesthetics while simultaneously considering both the global and local views of an image.
Fig 1. Global views and local views of an image.
Fig 2. SCNN architecture.
Fig 3. DCNN architecture.

A multi-scene deep learning model for image aesthetic evaluation
- Designs a scene convolutional layer consisting of multi-group descriptors in the network.
- Designs a pre-training procedure to initialize the model.
Architecture of MSDLM: 4 convolutional layers + 1 scene convolutional layer + 3 fully connected layers
Fig 1. The architecture of the multi-scene deep learning model (MSDLM).
Fig 2. The overview of the proposed MSDLM.

Example - Load the dataset

    def load_dataset():
        url = ...  # the rest of this function is missing from the source

Example Model

    import lasagne
    from lasagne import layers
    from lasagne.updates import nesterov_momentum
    from nolearn.lasagne import NeuralNet

    net1 = NeuralNet(
        # (name, layer class) pairs, in forward order
        layers=[
            ('input', layers.InputLayer),
            ('conv2d1', layers.Conv2DLayer),
            ('maxpool1', layers.MaxPool2DLayer),
            ('conv2d2', layers.Conv2DLayer),
            ('maxpool2', layers.MaxPool2DLayer),
            ('dropout1', layers.DropoutLayer),
            ('dense', layers.DenseLayer),
            ('dropout2', layers.DropoutLayer),
            ('output', layers.DenseLayer),
        ],
        # input layer: batches of 1-channel 28x28 images
        input_shape=(None, 1, 28, 28),
        # layer conv2d1
        conv2d1_num_filters=32,
        conv2d1_filter_size=(5, 5),
        conv2d1_nonlinearity=lasagne.nonlinearities.rectify,
        conv2d1_W=lasagne.init.GlorotUniform(),
        # layer maxpool1
        maxpool1_pool_size=(2, 2),
        # layer conv2d2
        conv2d2_num_filters=32,
        conv2d2_filter_size=(5, 5),
        conv2d2_nonlinearity=lasagne.nonlinearities.rectify,
        # layer maxpool2
        maxpool2_pool_size=(2, 2),
        # dropout1
        dropout1_p=0.5,
        # dense, i.e. fully connected layer
        dense_num_units=256,
        dense_nonlinearity=lasagne.nonlinearities.rectify,
        # dropout2
        dropout2_p=0.5,
        # output: softmax over 10 classes
        output_nonlinearity=lasagne.nonlinearities.softmax,
        output_num_units=10,
        # optimization method params
        update=nesterov_momentum,
        update_learning_rate=0.01,
        update_momentum=0.9,
        max_epochs=10,
        verbose=1,
    )

Example Train and Test

    # Train the network
    nn = net1.fit(X_train, y_train)
    # Using the a...
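
The convolution and pooling operations described on the "Convolution layer" and "Pooling layer" slides can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used by Lasagne or by the slides' author: the function names `conv2d_valid` and `pool2d`, the 5x5 toy image, and the 2x2 kernel are all assumptions made for the example. It uses stride 1 with no padding, and the kernel is not flipped, which matches the "multiply corresponding elements and sum" description.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply each kernel element by the image element under it, then sum.
            total = sum(
                kernel[a][b] * image[i + a][j + b]
                for a in range(kh)
                for b in range(kw)
            )
            row.append(total)
        output.append(row)
    return output


def pool2d(feature_map, size=2, mode="max"):
    """Non-overlapping pooling with a size x size window (stride equals size)."""
    reduce = max if mode == "max" else (lambda vals: sum(vals) / len(vals))
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + a][j + b]
                for a in range(size)
                for b in range(size)
            ]
            row.append(reduce(window))
        pooled.append(row)
    return pooled


# Hypothetical toy input: a 5x5 "image" and a 2x2 kernel.
image = [
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 0],
    [2, 1, 0, 1, 2],
    [1, 2, 1, 0, 1],
]
kernel = [[1, 0], [0, 1]]

feature_map = conv2d_valid(image, kernel)   # 4x4 feature map
print(feature_map)
print(pool2d(feature_map, mode="max"))      # [[4, 6], [4, 4]]
print(pool2d(feature_map, mode="mean"))     # [[2.0, 3.5], [2.0, 2.0]]
```

The same four kernel weights are reused at all sixteen positions, which is the weight sharing the slide credits with reducing the parameter count; pooling then shrinks the 4x4 map to 2x2.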

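
Two of the training tricks from the slides, Dropout and Batch Normalization, can also be sketched in plain Python. The sketch below is a simplified illustration under stated assumptions: the function names `dropout` and `batch_normalize`, the small constant `eps`, and the sample batch are hypothetical. It omits the learned scale and shift parameters of the Batch Normalization paper and any test-time rescaling for Dropout.

```python
import math
import random


def dropout(activations, p, rng):
    """Zero each activation independently with probability p (training time only)."""
    return [0.0 if rng.random() < p else a for a in activations]


def batch_normalize(batch, eps=1e-5):
    """Normalize each of the d dimensions of a mini-batch to zero mean, unit variance.

    `batch` is a list of d-dimensional samples; `eps` (an assumption here)
    guards against division by zero.
    """
    n, d = len(batch), len(batch[0])
    means = [sum(x[k] for x in batch) / n for k in range(d)]
    variances = [sum((x[k] - means[k]) ** 2 for x in batch) / n for k in range(d)]
    return [
        [(x[k] - means[k]) / math.sqrt(variances[k] + eps) for k in range(d)]
        for x in batch
    ]


rng = random.Random(0)  # seeded so repeated runs drop the same units
hidden = [0.5, 1.2, 0.0, 2.3, 0.7, 1.1]
print(dropout(hidden, p=0.5, rng=rng))

# Two dimensions on very different scales end up comparable after normalization.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_normalize(batch)
print(normalized)
```

Each dimension is normalized independently using statistics of the current mini-batch, which is why the second column (ten times larger) produces the same normalized values as the first.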