
Deep Learning and its Application
Convolutional Neural Networks (CNN)

SJTU Deep Learning Lecture

CNN Architectures

Case Studies: AlexNet, VGG, GoogLeNet, ResNet
Also: NiN (Network in Network), Wide ResNet, ResNeXt, Stochastic Depth, Squeeze-and-Excitation Network, DenseNet, FractalNet, SqueezeNet, NASNet

Review: LeNet-5 [LeCun et al., 1998]

Conv filters were 5x5, applied at stride 1.
Subsampling (pooling) layers were 2x2, applied at stride 2.
i.e. the architecture is [CONV-POOL-CONV-POOL-FC-FC].

[1] Gradient-based learning applied to document recognition, Proceedings of the IEEE, 86(11), 1998
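The LeNet-5 layer sizes can be traced numerically. A minimal sketch, with the caveat that the 32x32 input size comes from the LeNet-5 paper rather than this slide, and `out_size` is a hypothetical helper:

```python
# Trace the [CONV-POOL-CONV-POOL] spatial sizes of LeNet-5.
# Assumption: 32x32 input, as in the original LeNet-5 paper.
def out_size(w, f, s):
    """Spatial output size of a filter f applied at stride s on a w x w input."""
    return (w - f) // s + 1

w = 32                    # input image: 32x32
w = out_size(w, 5, 1)     # CONV 5x5, stride 1 -> 28
w = out_size(w, 2, 2)     # POOL 2x2, stride 2 -> 14
w = out_size(w, 5, 1)     # CONV 5x5, stride 1 -> 10
w = out_size(w, 2, 2)     # POOL 2x2, stride 2 -> 5
print(w)  # 5 -> 5x5 feature maps feed the FC layers
```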

Case Study: AlexNet [Krizhevsky et al., 2012]

Architecture: CONV1 - MAX POOL1 - NORM1 - CONV2 - MAX POOL2 - NORM2 - CONV3 - CONV4 - CONV5 - MAX POOL3 - FC6 - FC7 - FC8

Input: 227x227x3 images
First layer (CONV1): 96 11x11 filters applied at stride 4
=> Q: What is the output volume size? Hint: (227-11)/4+1 = 55
Output volume: [55x55x96]
Q: What is the total number of parameters in this layer?
Parameters: (11*11*3)*96 = 35K

[1] ImageNet Classification with Deep Convolutional Neural Networks, NIPS 2012
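The CONV1 arithmetic generalizes to any conv layer. A minimal sketch reproducing the slide's numbers; `conv_output_size` and `conv_params` are hypothetical helpers, not from the lecture:

```python
def conv_output_size(w, f, stride, pad=0):
    """Spatial output size of a conv layer: (W + 2P - F)/S + 1."""
    return (w + 2 * pad - f) // stride + 1

def conv_params(f, c_in, num_filters):
    """Number of weights in a conv layer: (F*F*C_in) per filter."""
    return f * f * c_in * num_filters

# AlexNet CONV1: 96 11x11 filters, stride 4, on a 227x227x3 input.
print(conv_output_size(227, 11, 4))  # 55 -> output volume 55x55x96
print(conv_params(11, 3, 96))        # 34848, i.e. the slide's ~35K
```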

Case Study: AlexNet

Input: 227x227x3 images
After CONV1: 55x55x96
Second layer (POOL1): 3x3 filters applied at stride 2
Output volume: 27x27x96
Parameters: 0!
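The POOL1 numbers follow the same formula, with the key difference that pooling learns nothing. A minimal sketch (hypothetical helper, not from the lecture):

```python
def pool_output_size(w, f, stride):
    """Pooling uses the same (W - F)/S + 1 arithmetic as convolution."""
    return (w - f) // stride + 1

# AlexNet POOL1: 3x3 window, stride 2, on the 55x55x96 CONV1 output.
print(pool_output_size(55, 3, 2))  # 27 -> output volume 27x27x96
# Pooling has no weights, so this layer contributes 0 parameters.
```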

Case Study: AlexNet

Full (simplified) AlexNet architecture:
[227x227x3]  INPUT
[55x55x96]   CONV1: 96 11x11 filters at stride 4, pad 0
[27x27x96]   MAX POOL1: 3x3 filters at stride 2
[27x27x96]   NORM1: normalization layer
[27x27x256]  CONV2: 256 5x5 filters at stride 1, pad 2
[13x13x256]  MAX POOL2: 3x3 filters at stride 2
[13x13x256]  NORM2: normalization layer
[13x13x384]  CONV3: 384 3x3 filters at stride 1, pad 1
[13x13x384]  CONV4: 384 3x3 filters at stride 1, pad 1
[13x13x256]  CONV5: 256 3x3 filters at stride 1, pad 1
[6x6x256]    MAX POOL3: 3x3 filters at stride 2
[4096]       FC6: 4096 neurons
[4096]       FC7: 4096 neurons
[1000]       FC8: 1000 neurons (class scores)

Details/Retrospectives:
- first use of ReLU
- used Norm layers
- heavy data augmentation
- dropout 0.5
- batch size 128
- SGD Momentum 0.9
- learning rate 1e-2, reduced by 10 manually when val accuracy plateaus
- L2 weight decay 5e-4
- 7-CNN ensemble: 18.2% -> 15.4%
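Every spatial size in the listing above can be re-derived from the (W + 2P - F)/S + 1 formula. A minimal sketch with a hypothetical `out_size` helper:

```python
def out_size(w, f, s, p=0):
    """Spatial output size of a conv/pool layer: (W + 2P - F)/S + 1."""
    return (w + 2 * p - f) // s + 1

w = 227                      # INPUT: 227x227x3
w = out_size(w, 11, 4, 0)    # CONV1, 11x11 stride 4, pad 0 -> 55
assert w == 55
w = out_size(w, 3, 2)        # POOL1, 3x3 stride 2 -> 27
assert w == 27
w = out_size(w, 5, 1, 2)     # CONV2, 5x5 stride 1, pad 2 -> 27 (size preserved)
w = out_size(w, 3, 2)        # POOL2, 3x3 stride 2 -> 13
w = out_size(w, 3, 1, 1)     # CONV3/4/5, 3x3 stride 1, pad 1 -> 13 (size preserved)
w = out_size(w, 3, 2)        # POOL3, 3x3 stride 2 -> 6
print(w)  # 6 -> 6x6x256 feeds FC6
```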

Historical note: AlexNet was trained on a GTX 580 GPU with only 3 GB of memory. The network was spread across 2 GPUs, with half the neurons (feature maps) on each GPU: [55x55x48] x 2.

CONV1, CONV2, CONV4, CONV5: connections only with feature maps on the same GPU.

CONV3, FC6, FC7, FC8: connections with all feature maps in the preceding layer, communication across GPUs.

Case Study: VGGNet

[Simonyan and Zisserman, 2014]

Small filters, deeper networks:
- 8 layers (AlexNet) -> 16-19 layers (VGG16Net)
- only 3x3 CONV stride 1, pad 1 and 2x2 MAX POOL stride 2
- 11.7% top-5 error in ILSVRC'13 (ZFNet) -> 7.3% top-5 error in ILSVRC'14

[1] Very deep convolutional networks for large-scale image recognition, ICLR 2015

Case Study: VGGNet

Q: Why use smaller filters? (3x3 conv)
Q: What is the effective receptive field of three 3x3 conv (stride 1) layers? [7x7]
A stack of three 3x3 conv (stride 1) layers has the same effective receptive field as one 7x7 conv layer, but is deeper, with more non-linearities, and has fewer parameters: 3*(3^2 C^2) vs. 7^2 C^2 for C channels per layer.

VGG16:
INPUT:     [224x224x3]   memory: 224*224*3   = 150K  params: 0
CONV3-64:  [224x224x64]  memory: 224*224*64  = 3.2M  params: (3*3*3)*64    = 1,728
CONV3-64:  [224x224x64]  memory: 224*224*64  = 3.2M  params: (3*3*64)*64   = 36,864
POOL2:     [112x112x64]  memory: 112*112*64  = 800K  params: 0
CONV3-128: [112x112x128] memory: 112*112*128 = 1.6M  params: (3*3*64)*128  = 73,728
CONV3-128: [112x112x128] memory: 112*112*128 = 1.6M  params: (3*3*128)*128 = 147,456
POOL2:     [56x56x128]   memory: 56*56*128   = 400K  params: 0
CONV3-256: [56x56x256]   memory: 56*56*256   = 800K  params: (3*3*128)*256 = 294,912
CONV3-256: [56x56x256]   memory: 56*56*256   = 800K  params: (3*3*256)*256 = 589,824
CONV3-256: [56x56x256]   memory: 56*56*256   = 800K  params: (3*3*256)*256 = 589,824
POOL2:     [28x28x256]   memory: 28*28*256   = 200K  params: 0
CONV3-512: [28x28x512]   memory: 28*28*512   = 400K  params: (3*3*256)*512 = 1,179,648
CONV3-512: [28x28x512]   memory: 28*28*512   = 400K  params: (3*3*512)*512 = 2,359,296
CONV3-512: [28x28x512]   memory: 28*28*512   = 400K  params: (3*3*512)*512 = 2,359,296
POOL2:     [14x14x512]   memory: 14*14*512   = 100K  params: 0
CONV3-512: [14x14x512]   memory: 14*14*512   = 100K  params: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512]   memory: 14*14*512   = 100K  params: (3*3*512)*512 = 2,359,296
CONV3-512: [14x14x512]   memory: 14*14*512   = 100K  params: (3*3*512)*512 = 2,359,296
POOL2:     [7x7x512]     memory: 7*7*512     = 25K   params: 0
FC:        [1x1x4096]    memory: 4096               params: 7*7*512*4096  = 102,760,448
FC:        [1x1x4096]    memory: 4096               params: 4096*4096     = 16,777,216
FC:        [1x1x1000]    memory: 1000               params: 4096*1000     = 4,096,000

TOTAL memory: 24M * 4 bytes ~= 96MB / image (for a forward pass)
TOTAL params: 138M parameters
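Both headline numbers, the 138M parameter total and the 3x3-stack savings, can be checked directly from the table above. A minimal sketch, not from the lecture:

```python
# VGG16 conv layers as (in_channels, out_channels); all filters are 3x3.
convs = [(3, 64), (64, 64), (64, 128), (128, 128),
         (128, 256), (256, 256), (256, 256),
         (256, 512), (512, 512), (512, 512),
         (512, 512), (512, 512), (512, 512)]
conv_params = sum(3 * 3 * cin * cout for cin, cout in convs)
fc_params = 7 * 7 * 512 * 4096 + 4096 * 4096 + 4096 * 1000
total = conv_params + fc_params
print(total)  # 138344128 -> the slide's "138M parameters"

# Three stacked 3x3 convs match a 7x7 receptive field at lower weight cost:
C = 512
print(3 * (3 * 3 * C * C), 7 * 7 * C * C)  # 7077888 vs 12845056
```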

Case Study: VGGNet

Details:
- ILSVRC'14: 2nd in classification, 1st in localization
- similar training procedure as AlexNet [Krizhevsky 2012]
- no Local Response Normalisation (LRN)
- use VGG16 or VGG19 (VGG19 only slightly better, more memory)
- use ensembles for best results
- FC7 features generalize well to other tasks

Case Study: GoogLeNet (Inception v1)

[Szegedy et al., 2015]

- Reduce the channel number by 1x1 convs.
- An auxiliary loss is used to improve training of the model; the final classifier is used for inference.

[1] Going deeper with convolutions, CVPR 2015
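The point of the 1x1 channel reduction is weight (and compute) savings before an expensive conv. A minimal sketch with illustrative channel counts, not GoogLeNet's exact ones:

```python
def conv_weights(f, c_in, c_out):
    """Weights in a conv layer with f x f filters."""
    return f * f * c_in * c_out

# Direct 3x3 conv on 256 channels vs. a 1x1 reduction to 64 channels first.
direct = conv_weights(3, 256, 256)
reduced = conv_weights(1, 256, 64) + conv_weights(3, 64, 256)
print(direct, reduced)  # 589824 vs 163840
```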

Case Study: ResNet [He et al., 2015]

Very deep networks using residual connections:
- 152-layer model for ImageNet
- ILSVRC'15 classification winner (3.57% top-5 error)
- swept all classification and detection competitions in ILSVRC'15 and COCO'15!

Residual block (figure): the input x passes through an identity shortcut and through the learned residual branch F(x) (conv - relu - conv); the block outputs relu(F(x) + x).

[1] Deep Residual Learning for Image Recognition, CVPR 2016
[2] Identity Mappings in Deep Residual Networks, ECCV 2016

Case Study: ResNet

What happens when we continue stacking deeper layers on a "plain" convolutional neural network?
Q: What's strange about these training and test curves? [Hint: look at the order of the curves]
The 56-layer model performs worse on both training and test error.
-> The deeper model performs worse, but it is not caused by overfitting!

Case Study: ResNet

Solution: use network layers to fit a residual mapping instead of directly trying to fit the desired underlying mapping.
"Plain" layers try to learn H(x) directly. A residual block instead uses its layers to fit the residual F(x) = H(x) - x, and outputs H(x) = F(x) + x, with x carried around the block by an identity shortcut.
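The shortcut's key property is that a zero residual makes the block an identity mapping, so extra blocks cannot hurt in principle. A minimal numerical sketch (a toy vector version, not the lecture's conv implementation):

```python
def relu(v):
    return [max(0.0, x) for x in v]

def residual_block(x, f):
    """Apply residual function f, add the identity shortcut x, then ReLU."""
    fx = f(x)
    return relu([a + b for a, b in zip(fx, x)])

# With a zero residual, the block passes non-negative inputs through unchanged:
out = residual_block([1.0, 2.0, 3.0], lambda v: [0.0] * len(v))
print(out)  # [1.0, 2.0, 3.0]
```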

Case Study: ResNet

Full ResNet architecture:
- stack residual blocks
- every residual block has two 3x3 conv layers
- periodically, double the number of filters and downsample spatially using stride 2 (/2 in each dimension)
- additional conv layer at the beginning
- no FC layers at the end (only an FC-1000 to output class scores)

Case Study: ResNet

Total depths of 34, 50, 101, or 152 layers for ImageNet.

Case Study: ResNet

For deeper networks (ResNet-50+), use a "bottleneck" layer to improve efficiency (similar to GoogLeNet):
- 1x1 conv, 64 filters, projects to 28x28x64
- 3x3 conv operates over only 64 feature maps
- 1x1 conv, 256 filters, projects back to 256 feature maps (28x28x256)

Training ResNet in practice:
- Batch Normalization after every CONV layer
- Xavier/2 initialization from He et al., ICCV 2015
- SGD + Momentum (0.9)
- learning rate: 0.1, divided by 10 when validation error plateaus
- mini-batch size 256
- weight decay of 1e-5
- no dropout used

Case Study: ResNet
Experimental Results:
Able to tra
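The bottleneck design described above trades two full-width 3x3 convs for a 1x1/3x3/1x1 stack on reduced channels. A minimal sketch of the weight comparison, not from the lecture:

```python
def conv_weights(f, c_in, c_out):
    """Weights in a conv layer with f x f filters."""
    return f * f * c_in * c_out

# Bottleneck block on a 256-channel input (as in the slide's 28x28x256 example):
bottleneck = (conv_weights(1, 256, 64)     # 1x1 reduces 256 -> 64 channels
              + conv_weights(3, 64, 64)    # 3x3 operates on only 64 maps
              + conv_weights(1, 64, 256))  # 1x1 projects back to 256 maps
# Plain residual block: two 3x3 convs at full 256-channel width.
plain = 2 * conv_weights(3, 256, 256)

print(bottleneck, plain)  # 69632 vs 1179648
```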
