




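Tips for Deep Learning

Recipe of Deep Learning
- Train a neural network, then ask two questions in order.
- Good results on training data? If NO, the network is not well trained; fix that first.
- Good results on testing data? If NO (while training results are good), that is overfitting. If YES to both, you are done.
- Different approaches for different problems, e.g. dropout is for good results on testing data.

Do not always blame overfitting
- [Figure: error on training data and on testing data, from Deep Residual Learning for Image Recognition, /abs/1512.03385]
- Poor results on testing data indicate overfitting only when results on training data are good. If results on training data are also poor, the network is not well trained.

Hard to get the power of deep
- Deeper usually does not imply better, even for results on training data.

Vanishing gradient problem
- Layers near the input: smaller gradients, learn very slowly, still almost random.
- Layers near the output: larger gradients, learn very fast, already converge, but based on random!?
- Intuitive way to compute the derivatives: $\frac{\partial l}{\partial w} = ? \approx \frac{\Delta l}{\Delta w}$. Change a weight by $+\Delta w$ and observe the change $+\Delta l$ in the loss. A sigmoid compresses a large change at its input into a small change at its output, so a change in an early layer is attenuated layer by layer, which gives smaller gradients.

ReLU
- Rectified Linear Unit (ReLU): $a = z$ if $z > 0$, $a = 0$ otherwise.
- Reasons:
  1. Fast to compute
  2. Biological reason
  3. Equivalent to infinitely many sigmoids with different biases
  4. Addresses the vanishing gradient problem
- [Xavier Glorot, AISTATS'11] [Andrew L. Maas, ICML'13] [Kaiming He, arXiv'15]
- Neurons whose output is 0 can be removed, leaving a thinner linear network that does not have smaller gradients.
- ReLU variants: the slope on the negative side can be a parameter that is also learned by gradient descent.

Maxout
- Learnable activation function [Ian J. Goodfellow, ICML'13].
- The values in a layer are grouped, and each group outputs the max of its elements. [Figure: an input passing through two layers of max groups]
- You can have more than 2 elements in a group.
- ReLU is a special case of Maxout. A ReLU neuron computes $z = wx + b$, $a = \max(0, z)$. A maxout group with $z_1 = wx + b$ and $z_2 = 0$ computes $a = \max(z_1, z_2)$, which is the same function.
- More than ReLU: with $z_1 = wx + b$ and $z_2 = w'x + b'$, the activation function is learnable.
- The activation function in a maxout network can be any piecewise linear convex function. How many pieces it has depends on how many elements are in a group (2 elements in a group, 3 elements in a group).

Maxout: training
- Given a training example $x$, we know which $z$ would be the max in each group.
- Train this thin and linear network, made of the selected elements only.
- Different examples give different thin and linear networks.

Review: Adagrad
- [Figure: an error surface where one direction needs a larger learning rate and another a smaller learning rate]
- $w^{t+1} \leftarrow w^t - \frac{\eta}{\sqrt{\sum_{i=0}^{t} (g^i)^2}}\, g^t$
- Uses the first derivative to estimate the second derivative.

RMSProp
- The error surface can be very complex when training a NN: even along one direction, some regions need a larger learning rate and others a smaller learning rate.
- $w^1 \leftarrow w^0 - \frac{\eta}{\sigma^0} g^0$, with $\sigma^0 = |g^0|$
- $w^2 \leftarrow w^1 - \frac{\eta}{\sigma^1} g^1$, with $\sigma^1 = \sqrt{\alpha (\sigma^0)^2 + (1-\alpha)(g^1)^2}$
- $w^3 \leftarrow w^2 - \frac{\eta}{\sigma^2} g^2$, with $\sigma^2 = \sqrt{\alpha (\sigma^1)^2 + (1-\alpha)(g^2)^2}$
- $w^{t+1} \leftarrow w^t - \frac{\eta}{\sigma^t} g^t$, with $\sigma^t = \sqrt{\alpha (\sigma^{t-1})^2 + (1-\alpha)(g^t)^2}$
- $\sigma^t$ is the root mean square of the gradients, with previous gradients being decayed.

Hard to find optimal network parameters
- [Figure: total loss against the value of a network parameter $w$]
- Very slow at the plateau: $\partial L / \partial w \approx 0$
- Stuck at a saddle point: $\partial L / \partial w = 0$
- Stuck at local minima: $\partial L / \partial w = 0$

Momentum
- In the physical world, a moving object has momentum. How about putting this phenomenon into gradient descent?

Review: vanilla gradient descent
- Start at position $\theta^0$.
- Compute the gradient at $\theta^0$. Move to $\theta^1 = \theta^0 - \eta \nabla L(\theta^0)$.
- Compute the gradient at $\theta^1$. Move to $\theta^2 = \theta^1 - \eta \nabla L(\theta^1)$.
- Stop when $\nabla L(\theta^t) \approx 0$.

Gradient descent with momentum
- Movement is based not just on the gradient but also on the previous movement: movement = $\lambda$ times the movement of the last step, minus $\eta$ times the gradient at present.
- Start at point $\theta^0$, with movement $v^0 = 0$.
- Compute the gradient at $\theta^0$. Movement $v^1 = \lambda v^0 - \eta \nabla L(\theta^0)$. Move to $\theta^1 = \theta^0 + v^1$.
- Compute the gradient at $\theta^1$. Movement $v^2 = \lambda v^1 - \eta \nabla L(\theta^1)$. Move to $\theta^2 = \theta^1 + v^2$.
- $v^i$ is actually the weighted sum of all the previous gradients $\nabla L(\theta^0), \nabla L(\theta^1), \dots, \nabla L(\theta^{i-1})$: $v^0 = 0$, $v^1 = -\eta \nabla L(\theta^0)$, $v^2 = -\lambda \eta \nabla L(\theta^0) - \eta \nabla L(\theta^1)$.
- [Figure: cost curve] Real movement = negative of the gradient + momentum. Where $\partial L / \partial w = 0$, the momentum term can still move the parameter.
- Still no guarantee of reaching the global minimum, but it gives some hope.

Adam
- RMSProp + Momentum. [Figure: the Adam algorithm, with one running estimate marked "for momentum" and the other marked "for RMSProp"]

Early stopping
- [Figure: total loss against epochs, for the training set and the testing set]
- Stop at the epoch where the loss on the testing set is lowest. In practice a validation set stands in for the testing set.

Regularization
- New loss function to be minimized: find a set of weights that not only minimizes the original cost but is also close to zero.
- $L'(\theta) = L(\theta) + \frac{\lambda}{2} \lVert \theta \rVert_2^2$, where $L(\theta)$ is the original loss (e.g. minimize square error, cross entropy) and the second term is the regularization term.
- L2 regularization: $\lVert \theta \rVert_2^2 = (w_1)^2 + (w_2)^2 + \dots$ (biases are usually not considered).
- Gradient: $\frac{\partial L'}{\partial w} = \frac{\partial L}{\partial w} + \lambda w$
- Update: $w^{t+1} \leftarrow w^t - \eta \frac{\partial L'}{\partial w} = (1 - \eta\lambda)\, w^t - \eta \frac{\partial L}{\partial w}$. The factor $(1 - \eta\lambda)$ moves the weight closer to zero at every step: weight decay.
- L1 regularization: $L'(\theta) = L(\theta) + \lambda \lVert \theta \rVert_1$, with $\lVert \theta \rVert_1 = |w_1| + |w_2| + \dots$
- Update: $w^{t+1} \leftarrow w^t - \eta \frac{\partial L}{\partial w} - \eta\lambda\, \mathrm{sgn}(w^t)$. L1 always deletes a fixed amount $\eta\lambda$, whereas L2 shrinks each weight in proportion to its size.

Regularization: weight decay
- Our brain prunes out the useless links between neurons.
- Doing the same thing to a machine's brain improves the performance.

Dropout
- Training: each time before updating the parameters, each neuron has a p% chance to drop out.
- The new network is used for training. The structure of the network is changed: thinner!
- For each mini-batch, we resample the dropout neurons.
- Testing: no dropout. If the dropout rate at training is p%, multiply all the weights by (1 - p%).
- Assume that the dropout rate is 50%. If a weight is $w = 1$ after training, set $w = 0.5$ for testing.

Dropout: intuitive reason
- Training with dropout is like training with weights tied to your legs. Testing with no dropout is like taking the weights off: you become much stronger.
- When people team up, if everyone expects the partner to do the work, nothing will be done in the end. However, if you know your partner will drop out, you will do better ("my partner will slack off, so I have to do the work well"). When testing, no one actually drops out, so the results are good in the end.
- Why should the weights be multiplied by (1 - p%) when testing? Assume the dropout rate is 50%. During training, about half of the inputs to a neuron are missing. At testing there is no dropout, so with the same weights $w_1, \dots, w_4$ the weighted sum would be about twice as large. Multiplying each weight by 0.5 brings it back to the scale seen in training.

Dropout is a kind of ensemble
- Ensemble: split the training set into Set 1 to Set 4 and train a bunch of networks with different structures (Network 1 to Network 4). At testing, feed the testing data $x$ to every network and average the outputs $y_1, \dots, y_4$.
- Training of dropout: one mini-batch is used to train one network, and some parameters in the networks are shared.
- With M neurons there are $2^M$ possible networks.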
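The claim that ReLU is a special case of Maxout can be checked numerically. The following is a minimal sketch, assuming a scalar input and a group of linear pieces given as (weight, bias) pairs; the function names `relu` and `maxout` and the values of `w` and `b` are hypothetical and chosen only for illustration.

```python
def relu(z):
    """Rectified linear unit on a scalar."""
    return max(0.0, z)


def maxout(x, pieces):
    """Maxout on a scalar input: the max over the linear pieces w*x + b in one group."""
    return max(w * x + b for w, b in pieces)


# Hypothetical weight and bias for the comparison.
w, b = 2.0, -1.0
xs = [-2.0, -0.5, 0.0, 0.5, 3.0]

# ReLU applied to z = w*x + b.
relu_outputs = [relu(w * x + b) for x in xs]

# Maxout group with z1 = w*x + b and z2 = 0*x + 0.
maxout_outputs = [maxout(x, [(w, b), (0.0, 0.0)]) for x in xs]

# A group with three elements gives a convex function with up to three pieces.
three_pieces = [(-1.0, 0.0), (0.0, 0.5), (1.0, 0.0)]
convex_outputs = [maxout(x, three_pieces) for x in xs]
```

When the second piece is allowed to have its own learned weight and bias instead of being fixed at zero, the same `maxout` function represents shapes that ReLU cannot.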
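The momentum and RMSProp update rules can be tried on a toy problem. This is a sketch under stated assumptions, not a reference implementation: the one-dimensional loss $L(w) = (w - 3)^2$, the function names, and all hyperparameter values are made up for illustration. `lam` plays the role of the momentum coefficient and `alpha` the RMSProp decay; the small `eps` in the RMSProp denominator is an added safeguard against division by zero.

```python
import math


def grad(w):
    """Gradient of the hypothetical toy loss L(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def gd_momentum(w0, lr=0.1, lam=0.9, steps=300):
    """Gradient descent with momentum: v = lam * v - lr * grad, then w = w + v."""
    w, v = w0, 0.0  # the initial movement is zero
    for _ in range(steps):
        v = lam * v - lr * grad(w)
        w = w + v
    return w


def rmsprop(w0, lr=0.05, alpha=0.9, steps=500, eps=1e-8):
    """RMSProp: divide the learning rate by a decayed root mean square of gradients."""
    w, sigma = w0, None
    for _ in range(steps):
        g = grad(w)
        if sigma is None:
            sigma = abs(g)  # first step: sigma is the size of the first gradient
        else:
            sigma = math.sqrt(alpha * sigma ** 2 + (1.0 - alpha) * g ** 2)
        w = w - lr / (sigma + eps) * g
    return w


w_momentum = gd_momentum(0.0)
w_rmsprop = rmsprop(0.0)
```

With a fixed learning rate, RMSProp takes steps of roughly constant size, so in this sketch it ends up hovering near the minimum at 3 rather than settling on it exactly, while the momentum run keeps shrinking its steps as the gradient shrinks.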