Deep Learning and its Application
Convolutional Neural Networks (CNN) Architectures
SJTU Deep Learning Lecture

Case Studies: AlexNet, VGG, GoogLeNet, ResNet
Also: NiN (Network in Network), Wide ResNet, ResNeXt, Stochastic Depth, Squeeze-and-Excitation Network, DenseNet, FractalNet, SqueezeNet, NASNet
Review: LeNet-5 [LeCun et al., 1998]
- Conv filters were 5x5, applied at stride 1
- Subsampling (pooling) layers were 2x2, applied at stride 2
- i.e. the architecture is [CONV-POOL-CONV-POOL-FC-FC]
[1] Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86(11), 1998
Case Study: AlexNet [Krizhevsky et al., 2012]
Architecture: CONV1 - MAX POOL1 - NORM1 - CONV2 - MAX POOL2 - NORM2 - CONV3 - CONV4 - CONV5 - MAX POOL3 - FC6 - FC7 - FC8
Input: 227x227x3 images
First layer (CONV1): 96 11x11 filters applied at stride 4
Q: What is the output volume size? Hint: (227-11)/4+1 = 55
Output volume: [55x55x96]
Q: What is the total number of parameters in this layer?
Parameters: (11*11*3)*96 = 35K
[1] ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012
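The slide's size and parameter arithmetic can be checked with two small helpers. This is a sketch, not lecture material; `conv_out` and `conv_params` are hypothetical names for the standard formulas.

```python
def conv_out(size, kernel, stride, pad=0):
    """Spatial output size of a convolution: (W - K + 2P) / S + 1."""
    return (size - kernel + 2 * pad) // stride + 1

def conv_params(kernel, in_ch, out_ch):
    """Weights per conv layer: K*K*C_in per filter, times C_out filters."""
    return kernel * kernel * in_ch * out_ch

# AlexNet CONV1: 96 11x11 filters at stride 4 on a 227x227x3 input
print(conv_out(227, 11, 4))    # 55 -> output volume [55x55x96]
print(conv_params(11, 3, 96))  # 34848, i.e. the slide's ~35K
```

The same two formulas cover every conv layer in the architectures that follow.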
Case Study: AlexNet
Input: 227x227x3 images
After CONV1: 55x55x96
Second layer (POOL1): 3x3 filters applied at stride 2
Output volume: 27x27x96
Parameters: 0!
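The pooling output size follows the same formula with no padding; a one-line sketch (helper name is mine, not the lecture's):

```python
def pool_out(size, kernel, stride):
    """Max pooling output size: (W - K) / S + 1; pooling has no weights."""
    return (size - kernel) // stride + 1

# AlexNet POOL1: 3x3 filters at stride 2 on the 55x55x96 CONV1 output
print(pool_out(55, 3, 2))  # 27 -> [27x27x96], with 0 parameters
```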
Case Study: AlexNet
Full (simplified) AlexNet architecture:
[227x227x3] INPUT
[55x55x96] CONV1: 96 11x11 filters at stride 4, pad 0
[27x27x96] MAX POOL1: 3x3 filters at stride 2
[27x27x96] NORM1: normalization layer
[27x27x256] CONV2: 256 5x5 filters at stride 1, pad 2
[13x13x256] MAX POOL2: 3x3 filters at stride 2
[13x13x256] NORM2: normalization layer
[13x13x384] CONV3: 384 3x3 filters at stride 1, pad 1
[13x13x384] CONV4: 384 3x3 filters at stride 1, pad 1
[13x13x256] CONV5: 256 3x3 filters at stride 1, pad 1
[6x6x256] MAX POOL3: 3x3 filters at stride 2
[4096] FC6: 4096 neurons
[4096] FC7: 4096 neurons
[1000] FC8: 1000 neurons (class scores)

Details/Retrospectives:
- first use of ReLU
- used Norm layers
- heavy data augmentation
- dropout 0.5
- batch size 128
- SGD momentum 0.9
- learning rate 1e-2, reduced by 10 manually when val accuracy plateaus
- L2 weight decay 5e-4
- 7-CNN ensemble: 18.2% -> 15.4%
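The shape column of the listing above can be replayed mechanically. A sketch with made-up variable names; the NORM layers are omitted since they do not change the activation shape.

```python
# (name, kind, out_channels, kernel, stride, pad) for each shape-changing layer
layers = [
    ("CONV1", "conv", 96, 11, 4, 0),
    ("POOL1", "pool", None, 3, 2, 0),
    ("CONV2", "conv", 256, 5, 1, 2),
    ("POOL2", "pool", None, 3, 2, 0),
    ("CONV3", "conv", 384, 3, 1, 1),
    ("CONV4", "conv", 384, 3, 1, 1),
    ("CONV5", "conv", 256, 3, 1, 1),
    ("POOL3", "pool", None, 3, 2, 0),
]

size, ch = 227, 3
shapes = []
for name, kind, out_ch, k, s, p in layers:
    size = (size - k + 2 * p) // s + 1  # same formula for conv and pool
    if kind == "conv":
        ch = out_ch                      # pooling keeps the channel count
    shapes.append((name, size, ch))

for name, size, ch in shapes:
    print(f"{name}: {size}x{size}x{ch}")
```

The printout reproduces the bracketed volumes in the listing, ending at 6x6x256 before FC6.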
Case Study: AlexNet
Historical note: trained on a GTX 580 GPU with only 3 GB of memory. The network is spread across 2 GPUs, with half the neurons (feature maps) on each GPU, i.e. [55x55x48] x 2.
CONV1, CONV2, CONV4, CONV5: connections only with feature maps on the same GPU.
CONV3, FC6, FC7, FC8: connections with all feature maps in the preceding layer, i.e. communication across GPUs.

Case Study: VGGNet
[Simonyan and Zisserman, 2014]
Small filters, deeper networks:
- 8 layers (AlexNet) -> 16-19 layers (VGGNet)
- only 3x3 CONV at stride 1, pad 1, and 2x2 MAX POOL at stride 2
- 11.7% top-5 error in ILSVRC'13 (ZFNet) -> 7.3% top-5 error in ILSVRC'14
[1] Very deep convolutional networks for large-scale image recognition, ICLR 2015
Case Study: VGGNet
Q: Why use smaller filters? (3x3 conv)
Q: What is the effective receptive field of three 3x3 conv (stride 1) layers? [7x7]
A stack of three 3x3 conv (stride 1) layers has the same effective receptive field as one 7x7 conv layer, but is deeper with more non-linearities, and has fewer parameters: 3*(3^2*C^2) = 27C^2 vs. 7^2*C^2 = 49C^2 for C channels per layer.
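Both claims are easy to verify numerically; a sketch with an arbitrary example channel count (C = 256 is my choice, not from the slides):

```python
def stack_params(kernel, n_layers, C):
    """Weights in a stack of n KxK convs with C channels in and out."""
    return n_layers * (kernel * kernel * C * C)

C = 256
print(stack_params(3, 3, C))  # 27*C^2 = 1769472
print(stack_params(7, 1, C))  # 49*C^2 = 3211264, nearly double

# Effective receptive field grows by (kernel - 1) per stride-1 layer:
rf = 1
for _ in range(3):
    rf += 3 - 1
print(rf)  # 7 -> same receptive field as a single 7x7 conv
```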
Case Study: VGGNet
VGG16 layer by layer (memory = activation count per image, params = weight count):

INPUT:     [224x224x3]   memory: 224*224*3 = 150K    params: 0
CONV3-64:  [224x224x64]  memory: 224*224*64 = 3.2M   params: (3*3*3)*64 = 1,728
CONV3-64:  [224x224x64]  memory: 224*224*64 = 3.2M   params: (3*3*64)*64 = 36,864
POOL2:     [112x112x64]  memory: 112*112*64 = 800K   params: 0
CONV3-128: [112x112x128] memory: 112*112*128 = 1.6M  params: (3*3*64)*128 = 73,728
CONV3-128: [112x112x128] memory: 112*112*128 = 1.6M  params: (3*3*128)*128 = 147,456
POOL2:     [56x56x128]   memory: 56*56*128 = 400K    params: 0
CONV3-256: [56x56x256]   memory: 56*56*256 = 800K    params: (3*3*128)*256 = 294,912
CONV3-256: [56x56x256]   memory: 56*56*256 = 800K    params: (3*3*256)*256 = 589,824
CONV3-256: [56x56x256]   memory: 56*56*256 = 800K    params: (3*3*256)*256 = 589,824
POOL2:     [28x28x256]   memory: 28*28*256 = 200K    params: 0
CONV3-512: [28x28x512]   memory: 28*28*512 = 400K    params: (3*3*256)*512 = 1,179,648
CONV3-512: [28x28x512]   memory: 28*28*512 = 400K    params: (3*3*512)*512 = 2,359,296
CONV3-512: [28x28x512]   memory: 28*28*512 = 400K    params: (3*3*512)*512 = 2,359,296
POOL2:     [14x14x512]   memory: 14*14*512 = 100K    params: 0
CONV3-512: [14x14x512]   memory: 14*14*512 = 100K    params: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512]   memory: 14*14*512 = 100K    params: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512]   memory: 14*14*512 = 100K    params: (3*3*512)*512 = 2,359,296
POOL2:     [7x7x512]     memory: 7*7*512 = 25K       params: 0
FC:        [1x1x4096]    memory: 4096                params: 7*7*512*4096 = 102,760,448
FC:        [1x1x4096]    memory: 4096                params: 4096*4096 = 16,777,216
FC:        [1x1x1000]    memory: 1000                params: 4096*1000 = 4,096,000

TOTAL memory: 24M * 4 bytes ~= 96 MB / image (for a forward pass)
TOTAL params: 138M parameters
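The 138M total can be recomputed from the table. A sketch, weights only (biases are ignored, as in the per-layer counts above); `cfg` is my shorthand for the VGG16 layer list, with "M" marking a 2x2 max pool.

```python
cfg = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
       512, 512, 512, "M", 512, 512, 512, "M"]

params, in_ch = 0, 3
for v in cfg:
    if v != "M":                      # pooling layers have no weights
        params += 3 * 3 * in_ch * v   # every conv in VGG16 is 3x3
        in_ch = v

# Fully connected layers: 7x7x512 -> 4096 -> 4096 -> 1000
params += 7 * 7 * 512 * 4096 + 4096 * 4096 + 4096 * 1000
print(f"{params:,}")  # 138,344,128 -> the table's ~138M
```

Note that the three FC layers alone contribute about 124M of the 138M parameters.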
Case Study: VGGNet
Details:
- ILSVRC'14: 2nd in classification, 1st in localization
- similar training procedure as AlexNet [Krizhevsky 2012]
- no Local Response Normalisation (LRN)
- use VGG16 or VGG19 (VGG19 is only slightly better, and uses more memory)
- use ensembles for best results
- FC7 features generalize well to other tasks

Case Study: GoogLeNet (Inception v1)
[Szegedy et al., 2015]
- Reduce the channel number with 1x1 convs
- An auxiliary loss is used to improve the model; the classifier is used for inference
[1] Going deeper with convolutions, CVPR 2015
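The point of the 1x1 "reduce" convs is cheaper compute. A sketch with illustrative numbers of my own choosing (a 5x5 conv on a 28x28x256 input producing 32 maps, with and without a 1x1 reduction to 64 channels); the helper name is hypothetical.

```python
def conv_ops(hw, k, c_in, c_out):
    """Multiply-adds for a KxK conv on an hw x hw x c_in input."""
    return hw * hw * k * k * c_in * c_out

direct = conv_ops(28, 5, 256, 32)
reduced = conv_ops(28, 1, 256, 64) + conv_ops(28, 5, 64, 32)
print(f"{direct:,}")   # 160,563,200 multiply-adds
print(f"{reduced:,}")  # 52,985,856 -> roughly 3x cheaper
```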
Case Study: ResNet [He et al., 2015]
- Very deep networks using residual connections
- 152-layer model for ImageNet
- ILSVRC'15 classification winner (3.57% top-5 error)
- Swept all classification and detection competitions in ILSVRC'15 and COCO'15!
Residual block: an identity shortcut adds the input x to the layer output F(x), giving F(x) + x.
[1] Deep Residual Learning for Image Recognition, CVPR 2016
[2] Identity Mappings in Deep Residual Networks, ECCV 2016
Case Study: ResNet
What happens when we continue stacking deeper layers on a "plain" convolutional neural network?
Q: What's strange about these training and test curves? [Hint: look at the order of the curves]
The 56-layer model performs worse on both training and test error -> the deeper model performs worse, but this is not caused by overfitting!
Case Study: ResNet
Solution: use network layers to fit a residual mapping instead of directly trying to fit the desired underlying mapping H(x).
Residual block: H(x) = F(x) + x, with an identity shortcut around the "plain" layers.
Use the layers to fit the residual F(x) = H(x) - x instead of H(x) directly.
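A minimal numpy sketch of H(x) = F(x) + x; the weights are random placeholders and the two layers are plain matrix multiplies rather than the 3x3 convolutions a real block would use.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """H(x) = F(x) + x: the layers only have to fit the residual F."""
    f = relu(x @ w1) @ w2  # F(x): two weight layers with a ReLU between
    return relu(f + x)     # identity shortcut, then the final ReLU

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 64))
w1 = rng.standard_normal((64, 64)) * 0.01
w2 = rng.standard_normal((64, 64)) * 0.01

out = residual_block(x, w1, w2)
print(out.shape)  # (4, 64): the shortcut requires matching shapes

# With zero weights, F(x) = 0 and the block reduces to the identity (plus ReLU),
# which is why extra residual blocks do not hurt the way extra plain layers can:
ident = residual_block(x, np.zeros((64, 64)), np.zeros((64, 64)))
assert np.allclose(ident, relu(x))
```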
Case Study: ResNet
Full ResNet architecture:
- stack residual blocks
- every residual block has two 3x3 conv layers
- periodically, double the number of filters and downsample spatially using stride 2 (/2 in each dimension)
- additional conv layer at the beginning
- no FC layers at the end (only an FC-1000 to output the class scores)
Case Study: ResNet
Total depths of 34, 50, 101, or 152 layers for ImageNet.
Case Study: ResNet
For deeper networks (ResNet-50+), use a "bottleneck" layer to improve efficiency (similar to GoogLeNet):
- 1x1 conv, 64 filters, to project to 28x28x64
- the 3x3 conv then operates over only 64 feature maps
- 1x1 conv, 256 filters, projects back to 256 feature maps (28x28x256)
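The saving from the bottleneck is easy to quantify: compare its weights with two plain 3x3 convs at full width (256 channels in and out, as above). A sketch; the helper name is mine.

```python
def conv_params(k, c_in, c_out):
    """Weights in a KxK conv layer: K*K*C_in*C_out."""
    return k * k * c_in * c_out

plain = 2 * conv_params(3, 256, 256)
bottleneck = (conv_params(1, 256, 64)     # 1x1 reduce to 64 maps
              + conv_params(3, 64, 64)    # cheap 3x3 on 64 maps
              + conv_params(1, 64, 256))  # 1x1 project back to 256
print(f"{plain:,}")       # 1,179,648
print(f"{bottleneck:,}")  # 69,632 -> ~17x fewer weights per block
```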
Training ResNet in practice:
- Batch Normalization after every CONV layer
- He ("Xavier/2") initialization [He et al., ICCV 2015]
- SGD + momentum (0.9)
- learning rate 0.1, divided by 10 when the validation error plateaus
- mini-batch size 256
- weight decay of 1e-5
- no dropout used
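The recipe above can be sketched in a few lines: the momentum update, plus a learning rate divided by 10 when validation error plateaus. The error sequence and the `patience` threshold are toy values of my own, not from the lecture.

```python
def sgd_momentum_step(w, g, v, lr, mu=0.9):
    """SGD with momentum: v <- mu*v - lr*g; w <- w + v."""
    v = mu * v - lr * g
    return w + v, v

w, v = 1.0, 0.0
w, v = sgd_momentum_step(w, g=0.5, v=v, lr=0.1)  # one toy update step

# Plateau-based schedule: drop the LR when val error stops improving.
lr, patience, best, wait = 0.1, 3, float("inf"), 0
for epoch_err in [0.9, 0.7, 0.6, 0.61, 0.62, 0.60, 0.55]:
    if epoch_err < best - 1e-4:       # meaningful improvement
        best, wait = epoch_err, 0
    else:
        wait += 1
        if wait >= patience:          # validation error plateaued
            lr, wait = lr / 10, 0     # divide the learning rate by 10
print(f"{lr:.2f}")  # 0.01 after one plateau in this toy error sequence
```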
Case Study: ResNet
Experimental Results
Able to tra