




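Tips for Deep Learning

Recipe of Deep Learning
- Train a neural network, then ask: good results on training data? If NO, go back and change the training.
- If YES, ask: good results on testing data? If NO: overfitting! If YES on both, done.
- Different approaches for different problems, e.g. dropout is for good results on testing data.

Do not always blame overfitting
- Reference: Deep Residual Learning for Image Recognition, /abs/1512.03385
- Poor results on testing data: overfitting?
- Poor results on training data as well: the network is not well trained.

Good results on training data? Hard to get the power of deep
- Deeper usually does not imply better, even for results on training data.

Vanishing Gradient Problem
- Layers near the input: smaller gradients, learn very slowly, almost random.
- Layers near the output: larger gradients, learn very fast, already converged ... based on random!?
- Intuitive way to compute the derivatives: $\partial l / \partial w = ?$ Change a weight by $+\Delta w$ and observe the change $+\Delta l$ in the loss; $\Delta l / \Delta w$ approximates the derivative. A change near the input shrinks as it passes through each sigmoid layer, hence the smaller gradients.

ReLU: Rectified Linear Unit
- Reasons: 1. Fast to compute. 2. Biological reason. 3. Infinite sigmoids with different biases. 4. Handles the vanishing gradient problem.
- References: Xavier Glorot, AISTATS'11; Andrew L. Maas, ICML'13; Kaiming He, arXiv'15.
- Neurons whose output is 0 can be removed, leaving a thinner linear network that does not have smaller gradients.
- ReLU variants: the extra parameter of the variant is also learned by gradient descent.

Maxout
- Learnable activation function (Ian J. Goodfellow, ICML'13).
- The pre-activations are put into groups, and each group outputs its max. You can have more than 2 elements in a group.
- ReLU is a special case of Maxout. ReLU neuron: $z = wx + b$, $a = \max(0, z)$. Maxout group: $z_1 = wx + b$, $z_2 = 0$, $a = \max(z_1, z_2)$.
- More than ReLU: with $z_1 = wx + b$ and $z_2 = w'x + b'$, $a = \max(z_1, z_2)$ is a learnable activation function.
- The activation function in a maxout network can be any piecewise linear convex function. The number of pieces depends on the number of elements in a group (2 elements in a group, 3 elements in a group, ...).

Maxout: Training
- Given a training example x, we know which z would be the max.
- Train the resulting thin and linear network.
- Different examples give different thin and linear networks.

Good results on training data? Adaptive learning rate
- Review: Adagrad. Larger learning rate in flat directions, smaller learning rate in steep directions.
  $w^{t+1} \leftarrow w^t - \frac{\eta}{\sqrt{\sum_{i=0}^{t} (g^i)^2}} g^t$
- Adagrad uses the first derivative to estimate the second derivative.

RMSProp
- The error surface can be very complex when training a neural network: the same direction may need a larger learning rate in one region and a smaller learning rate in another.
- Updates:
  $w^1 \leftarrow w^0 - \frac{\eta}{\sigma^0} g^0$, $\sigma^0 = g^0$
  $w^2 \leftarrow w^1 - \frac{\eta}{\sigma^1} g^1$, $\sigma^1 = \sqrt{\alpha (\sigma^0)^2 + (1 - \alpha)(g^1)^2}$
  $w^3 \leftarrow w^2 - \frac{\eta}{\sigma^2} g^2$, $\sigma^2 = \sqrt{\alpha (\sigma^1)^2 + (1 - \alpha)(g^2)^2}$
  $w^{t+1} \leftarrow w^t - \frac{\eta}{\sigma^t} g^t$, $\sigma^t = \sqrt{\alpha (\sigma^{t-1})^2 + (1 - \alpha)(g^t)^2}$
- $\sigma^t$ is the root mean square of the gradients, with previous gradients being decayed.

Hard to find optimal network parameters
- Figure: total loss against the value of a network parameter w.
- Very slow at a plateau ($\partial L / \partial w \approx 0$), stuck at a saddle point ($\partial L / \partial w = 0$), stuck at local minima ($\partial L / \partial w = 0$).

Momentum
- In the physical world, a rolling ball keeps moving because of momentum. How about putting this phenomenon into gradient descent?
- Review: vanilla gradient descent. Start at position $\theta^0$. Compute the gradient at $\theta^0$ and move to $\theta^1 = \theta^0 - \eta \nabla L(\theta^0)$. Compute the gradient at $\theta^1$ and move to $\theta^2 = \theta^1 - \eta \nabla L(\theta^1)$. Stop when $\nabla L(\theta^t) \approx 0$.
- With momentum. Start at point $\theta^0$ with movement $v^0 = 0$. Compute the gradient at $\theta^0$, set $v^1 = \lambda v^0 - \eta \nabla L(\theta^0)$ and move to $\theta^1 = \theta^0 + v^1$. Compute the gradient at $\theta^1$, set $v^2 = \lambda v^1 - \eta \nabla L(\theta^1)$ and move to $\theta^2 = \theta^1 + v^2$.
- Movement is based not just on the gradient but also on the previous movement: the new movement is the movement of the last step minus the gradient at present.
- $v^i$ is actually the weighted sum of all the previous gradients $\nabla L(\theta^0), \nabla L(\theta^1), \dots, \nabla L(\theta^{i-1})$: $v^0 = 0$, $v^1 = -\eta \nabla L(\theta^0)$, $v^2 = -\lambda \eta \nabla L(\theta^0) - \eta \nabla L(\theta^1)$.
- Real movement = negative of the gradient + momentum. Where the gradient is 0, momentum can still carry the parameter forward. Still no guarantee of reaching the global minimum, but it gives some hope.

Adam
- RMSProp + Momentum: one running term for momentum, one for RMSProp.

Good results on testing data? Early stopping
- Figure: total loss against epochs for the training set and the testing set.
- Stop where the loss on the testing set is lowest; in practice, use a validation set to decide.

Good results on testing data? Regularization
- New loss function to be minimized: find a set of weights that not only minimizes the original cost but is also close to zero.
  $L'(\theta) = L(\theta) + \lambda \frac{1}{2} \|\theta\|_2^2$
  $L(\theta)$ is the original loss (e.g. square error, cross entropy); the second term is the regularization term.
- L2 regularization: $\|\theta\|_2^2 = (w_1)^2 + (w_2)^2 + \dots$ (biases are usually not considered).
- Gradient: $\frac{\partial L'}{\partial w} = \frac{\partial L}{\partial w} + \lambda w$
- Update: $w^{t+1} \leftarrow w^t - \eta \frac{\partial L'}{\partial w} = (1 - \eta \lambda) w^t - \eta \frac{\partial L}{\partial w}$. The factor $(1 - \eta \lambda)$ pulls the weight closer to zero at every step: weight decay.
- L1 regularization: $\|\theta\|_1 = |w_1| + |w_2| + \dots$
- Update: $w^{t+1} \leftarrow w^t - \eta \frac{\partial L}{\partial w} - \eta \lambda \, \mathrm{sgn}(w^t)$. L1 always subtracts a fixed amount $\eta \lambda$ in the direction of zero, whereas L2 shrinks the weight in proportion to its size.

Regularization: Weight Decay
- Our brain prunes out the useless links between neurons.
- Doing the same thing to a machine's brain improves the performance.

Good results on testing data? Dropout
- Training: each time before updating the parameters, each neuron has a p% chance to drop out.
- The structure of the network is changed: thinner! The new network is used for training.
- For each mini-batch, we resample the dropout neurons.
- Testing: no dropout. If the dropout rate at training is p%, multiply all the weights by (1 - p%).
- Assume that the dropout rate is 50%. If a weight w = 1 after training, set w = 0.5 for testing.

Dropout: Intuitive Reason
- Training with dropout is like training with weights tied to your legs. Testing with no dropout is like taking the weights off: you become much stronger.
- When people team up, if everyone expects the partner to do the work, nothing gets done in the end. However, if you know your partner will drop out, you will do better ("my partner will slack off, so I have to do my part well"). When testing, no one actually drops out, so good results are obtained eventually.
- Why should the weights be multiplied by (1 - p%), where p% is the dropout rate, when testing? Assume the dropout rate is 50%. At testing, with no dropout, all four weights $w_1, \dots, w_4$ of a neuron are active, so its input is roughly twice what it was during training. Multiplying each weight by 0.5 restores the scale.

Dropout is a kind of ensemble
- Ensemble, training: sample Set 1, Set 2, Set 3, Set 4 from the training set and train a bunch of networks with different structures (Network 1 to Network 4).
- Ensemble, testing: feed the testing data x to every network and average the outputs y1, y2, y3, y4.
- Training with dropout: each mini-batch (minibatch 1, 2, 3, 4, ...) trains one network, and some parameters in the networks are shared.
- With M neurons there are $2^M$ possible networks.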
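The Maxout slides state that ReLU is a special case of Maxout and that a maxout group can represent any piecewise linear convex function. A minimal sketch of that claim for a single scalar input follows; the helper names `relu` and `maxout` and the numeric values are illustrative assumptions, not taken from the slides.

```python
def relu(z):
    """Rectified Linear Unit: max(0, z)."""
    return max(0.0, z)


def maxout(x, pieces):
    """Maxout over one group.

    `pieces` is a list of (w, b) pairs (hypothetical layout); each pair
    defines one linear pre-activation z = w * x + b, and the group
    outputs the largest z.
    """
    return max(w * x + b for w, b in pieces)


# Illustrative weight and bias, chosen arbitrarily.
w, b = 2.0, -1.0
xs = [-2.0, -0.5, 0.0, 0.5, 1.0, 3.0]

# ReLU neuron: a = max(0, w*x + b).
relu_out = [relu(w * x + b) for x in xs]

# Maxout group with z1 = w*x + b and z2 = 0 reproduces the same outputs.
maxout_out = [maxout(x, [(w, b), (0.0, 0.0)]) for x in xs]

# With two learnable pieces the shape is no longer ReLU,
# e.g. (1, 0) and (-1, 0) give the absolute value function.
abs_like = maxout(-3.0, [(1.0, 0.0), (-1.0, 0.0)])
```

Adding a third `(w, b)` pair to the group gives a convex function with up to three linear pieces, which matches the "3 elements in a group" case.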
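The momentum and RMSProp update rules can be traced by hand on a toy problem. The sketch below assumes a one-dimensional loss L(θ) = θ², writes η as `eta`, the momentum weight λ as `lam`, and the RMSProp decay α as `alpha`. The small `eps` in the denominator and the absolute value for the first σ are additions for numerical safety, not part of the slides.

```python
import math


def grad(theta):
    """Gradient of the toy loss L(theta) = theta**2 (an assumed example)."""
    return 2.0 * theta


def momentum_steps(theta, eta, lam, n_steps):
    """Gradient descent with momentum: v <- lam * v - eta * g, theta <- theta + v."""
    v = 0.0  # v^0 = 0
    history = []
    for _ in range(n_steps):
        g = grad(theta)
        v = lam * v - eta * g  # movement of last step minus present gradient
        theta = theta + v
        history.append((theta, v, g))
    return history


def rmsprop_steps(w, eta, alpha, n_steps, eps=1e-8):
    """RMSProp: divide the learning rate by a decayed RMS of past gradients."""
    sigma = None
    for _ in range(n_steps):
        g = grad(w)
        if sigma is None:
            sigma = abs(g)  # sigma^0 = g^0, taken as a magnitude here
        else:
            sigma = math.sqrt(alpha * sigma ** 2 + (1.0 - alpha) * g ** 2)
        w = w - eta / (sigma + eps) * g
    return w


eta, lam = 0.1, 0.9
(theta1, v1, g0), (theta2, v2, g1) = momentum_steps(1.0, eta, lam, 2)

# v^2 is a weighted sum of the previous gradients.
v2_expanded = -lam * eta * g0 - eta * g1

w_after_one_step = rmsprop_steps(1.0, eta, alpha=0.9, n_steps=1)
```

On the first RMSProp step the gradient is divided by its own magnitude, so the parameter moves by almost exactly `eta` regardless of how steep the loss is at the starting point.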
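The ensemble view of dropout can be checked directly for a single linear neuron: averaging the outputs of all 2^M thinned networks gives the same value as one full network whose weights are multiplied by (1 - p). The sketch below assumes M = 4 inputs, a dropout rate of 50% (so every mask is equally likely), and arbitrary illustrative values for `x` and `w`. The equality is exact only because the neuron is linear; with nonlinear activations it holds approximately.

```python
from itertools import product


def linear_output(x, w, mask):
    """Output of one linear neuron; mask[i] = 0 drops input i, 1 keeps it."""
    return sum(wi * xi * mi for wi, xi, mi in zip(w, x, mask))


# Illustrative inputs and trained weights (not from the slides).
x = [1.0, 2.0, -1.0, 0.5]
w = [0.3, -0.2, 0.8, 1.0]
p = 0.5  # dropout rate

# Enumerate every thinned network: M inputs give 2**M masks.
masks = list(product([0, 1], repeat=len(x)))

# Ensemble prediction: average the outputs of all thinned networks.
ensemble_avg = sum(linear_output(x, w, m) for m in masks) / len(masks)

# Test-time shortcut: keep every input, multiply each weight by (1 - p).
scaled_w = [(1.0 - p) * wi for wi in w]
test_output = linear_output(x, scaled_w, [1] * len(x))
```

The shortcut needs one forward pass instead of 2^M, which is why the weights are rescaled at testing rather than averaging over sampled networks.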