Deep Learning and its Application, Lecture 6

Graph Neural Network
Deep Learning and its Application (SJTU Deep Learning Lecture)

How to represent a graph?
- Undirected graph
- Directed graph
- Additional elements of a graph
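The two standard in-memory representations can be sketched as follows; the graph, names, and NumPy encoding below are illustrative examples, not notation from the slides.

```python
import numpy as np

# A small undirected graph with 4 nodes and edges (0,1), (0,2), (1,2), (2,3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

# Adjacency-matrix representation: A[i, j] = 1 iff nodes i and j are connected.
A = np.zeros((n, n), dtype=int)
for i, j in edges:
    A[i, j] = A[j, i] = 1  # symmetric for an undirected graph

# Adjacency-list representation: the neighbors of each node.
adj = {i: [] for i in range(n)}
for i, j in edges:
    adj[i].append(j)
    adj[j].append(i)

degree = A.sum(axis=1)  # row sums of A give node degrees
```

A directed graph simply drops the symmetric assignment, and additional elements (node features, edge weights) become extra arrays indexed the same way.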

Distinction of graph-structured data

Different from image data: a graph in general does not have a 2D grid structure and instead represents relationships between objects. In addition, graph data is not limited to visual information and can represent relationships between abstract entities.

Different from text data: a graph represents relationships between objects rather than just sequences of words. Graph data can also be used to represent relationships between entities in text data, such as co-occurrence relationships between words in a document (e.g. the classical couplet "苟利国家生死以, 岂因祸福避趋之").

Different from 3-D point clouds: a graph represents relationships between objects rather than just a collection of points and their position coordinates. Graph data can also be used to represent relationships between points in a point cloud, such as nearest-neighbor relationships.

Graph-structured data are ubiquitous: social networks, molecules, knowledge graphs.

Progress in graph generation, e.g. AI-aided drug discovery: Hoogeboom, E., Satorras, V. G., Vignac, C., & Welling, M. (2022, June). Equivariant diffusion for molecule generation in 3D. In International Conference on Machine Learning (pp. 8867-8887).

Related tasks
1. Node classification: predict a categorical label for each node in a graph. For example, in a social network, nodes may represent people, and the task could be to predict their occupation based on the connections between them.
2. Graph classification: predict a categorical label for an entire graph. In a molecule graph, the task could be to predict whether a molecule is toxic or not.
3. Link prediction: predict the existence of edges between nodes in a graph. For example, in a social network, the task could be to predict whether two people are likely to become friends based on their other existing connections.

Message passing
Goal: integrate the information from neighboring nodes to encode contextual graph information. The idea behind message passing is to allow information to flow between nodes, so the network can learn the relationships and patterns in the graph structure. By passing messages between nodes, the network can capture the dependencies and interactions between nodes, leading to improved representations and better performance on graph-related tasks, e.g. node classification, graph classification, and link prediction.

Message passing: general formulation

In each iteration of message passing:
- The aggregate operation collects information from neighboring nodes and summarizes it into a compact representation, which is then passed on to the target node. This operation allows the network to gather information from the surrounding nodes and make use of it to update the representation of the target node.
- The update operation takes the information from the aggregate operation and updates the representation of the target node. This updated representation includes the information from the original node features as well as the information from its neighboring nodes. The updated representation then serves as input for the next iteration of message passing.
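The aggregate-and-update loop above can be sketched numerically. Below is a minimal one-round implementation in NumPy; the choice of mean aggregation and a ReLU update, and the function and weight names, are illustrative assumptions rather than the exact formulation on the slides.

```python
import numpy as np

def message_passing_round(A, H, W_self, W_neigh):
    """One round of message passing.

    Aggregate: mean of neighbor features (A @ H, row-normalized by degree).
    Update: combine each node's own features with its aggregated message.
    A: (n, n) adjacency matrix; H: (n, d) node features;
    W_self, W_neigh: (d, d') weight matrices of the update step.
    """
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)   # guard isolated nodes
    M = (A @ H) / deg                                   # aggregate step
    return np.maximum(H @ W_self + M @ W_neigh, 0.0)    # ReLU update step

# A 3-node path graph 0-1-2 with 2-d features; identity weights for clarity.
A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
H1 = message_passing_round(A, H, np.eye(2), np.eye(2))
```

Stacking several such rounds lets information propagate over multi-hop neighborhoods, which is exactly what deeper GNNs do.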

Some typical GNNs
1. Graph Convolutional Networks (GCN) [1]: proposed by Kipf and Welling in 2016, GCN is one of the pioneering GNN models. It adapts the convolution operation from traditional CNNs to work on graph-structured data, enabling feature learning on nodes in the graph.
2. GraphSAGE [2]: developed by Hamilton, Ying, and Leskovec in 2017, GraphSAGE is an inductive GNN model that learns to generate embeddings for nodes by aggregating information from their local neighborhood. This model is particularly useful for graphs with unseen nodes or graphs evolving over time.
3. Graph Attention Networks (GAT) [3]: introduced by Veličković et al. in 2017, GAT incorporates attention mechanisms into graph neural networks. GAT allows nodes to weight the importance of their neighbors dynamically, enabling the model to focus on more relevant information during the aggregation process.

Issues in GNNs
Scalability: GNNs can struggle with large-scale graphs, as the computational complexity and memory requirements increase with the size of the graph. Designing scalable GNNs that can efficiently handle large graphs while maintaining high performance is a critical challenge.
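To make the GAT idea concrete, here is a dense NumPy sketch of attention-coefficient computation in the spirit of [3]: a shared projection, a LeakyReLU-scored attention vector over concatenated features, and a softmax over each neighborhood. The shapes, the explicit double loop, and the slope value are illustrative simplifications, not the paper's optimized formulation.

```python
import numpy as np

def gat_attention_weights(A, H, W, a):
    """GAT-style attention weights alpha[i, j] over each node's neighborhood.

    A: (n, n) adjacency (self-loops included); H: (n, d) features;
    W: (d, d') shared projection; a: (2*d',) attention vector.
    """
    Z = H @ W                                    # projected features
    n = A.shape[0]
    e = np.full((n, n), -np.inf)                 # -inf masks non-edges
    for i in range(n):
        for j in range(n):
            if A[i, j]:
                s = a @ np.concatenate([Z[i], Z[j]])
                e[i, j] = s if s > 0 else 0.2 * s  # LeakyReLU score
    e -= e.max(axis=1, keepdims=True)            # stabilize the softmax
    alpha = np.exp(e)                            # exp(-inf) = 0 for non-edges
    alpha /= alpha.sum(axis=1, keepdims=True)    # softmax over neighbors
    return alpha

# Path graph 0-1-2 with self-loops.
A = np.array([[1, 1, 0],
              [1, 1, 1],
              [0, 1, 1]])
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
alpha = gat_attention_weights(A, H, np.eye(2), np.ones(4))
```

The aggregation step then uses `alpha @ Z` instead of a fixed mean, which is what lets GAT weight neighbors dynamically.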

e.g. [NodeFormer, Wu et al.], [DIFFormer, Wu et al.]

Heterogeneous graphs: graphs with different types of nodes and edges (heterogeneous graphs) are common in real-world applications. Developing GNNs that can effectively handle and exploit the rich information in heterogeneous graphs is still an ongoing research area.

Dynamic graphs: many real-world graphs are dynamic, with nodes and edges being added or removed over time. Developing GNNs capable of adapting to and learning from dynamic graphs remains an open challenge. e.g. [EasyDGL, Chen et al.]

Oversmoothing: oversmoothing can occur in GNNs when the number of layers is large. It means that the features of different nodes become increasingly similar during the message-passing and information-aggregation process, leading to reduced discriminative power.

Summary
In conclusion, Graph Neural Networks (GNNs) have emerged as a powerful tool for learning from graph-structured data. By leveraging local and global information through message-passing and aggregation mechanisms, GNNs have demonstrated remarkable performance in various domains, including social network analysis, recommendation systems, and drug discovery. Despite their impressive achievements, GNNs still face challenges such as scalability, oversmoothing, and generalization. Ongoing research is focused on addressing these issues and further improving GNNs for diverse applications. As we continue to advance our understanding of GNNs, we can expect these models to play an increasingly significant role in addressing complex problems across a wide range of domains, unlocking new possibilities for data-driven decision-making and insights.

References

[1] Kipf, Thomas N., and Max Welling. "Semi-Supervised Classification with Graph Convolutional Networks." International Conference on Learning Representations.
[2] Hamilton, Will, Zhitao Ying, and Jure Leskovec. "Inductive representation learning on large graphs." Advances in Neural Information Processing Systems 30 (2017).
[3] Veličković, Petar, et al. "Graph Attention Networks." International Conference on Learning Representations.
[4] Wu, Qitian, et al. "NodeFormer: A scalable graph structure learning transformer for node classification." Advances in Neural Information Processing Systems 35 (2022): 27387-27401.
[5] Wu, Qitian, et al. "DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion." The Eleventh International Conference on Learning Representations.
[6] Chen, Chao, et al. "EasyDGL: Encode, Train and Interpret for Continuous-time Dynamic Graph Learning." arXiv preprint arXiv:2303.12341 (2023).
[7] Wu, Qitian, et al. "Handling Distribution Shifts on Graphs: An Invariance Perspective." International Conference on Learning Representations.
[8] Yang, Nianzu, et al. "Learning substructure invariance for out-of-distribution molecular representations." Advances in Neural Information Processing Systems. 2022.
[9] Zhang, Hengrui, et al. "From canonical correlation analysis to self-supervised graph neural networks." Advances in Neural Information Processing Systems 34 (2021): 76-89.
[10] Zhang, Shaofeng, et al. "M-Mix: Generating Hard Negatives via Multi-sample Mixing for Contrastive Learning." Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 2022.

DNN: Advanced Training Techniques

Deep Learning and its Application

Advanced Optimizers
- Nesterov Momentum (NAG)
- Adaptive learning rate: AdaGrad
- Adaptive learning rate: AdaDelta
- Adaptive learning rate: RMSProp
- Adaptive learning rate: Adam

Nesterov Momentum (NAG)
Standard momentum vs. Nesterov momentum: in NAG the gradient is evaluated after the current velocity is applied. Add a "correction factor" to look ahead. This anticipatory update prevents us from going too fast and results in increased responsiveness. It works better for batch updates than for SGD.

Learning rate
Momentum provides "smart" gradients; another important factor of gradient descent is the learning rate. Too small is slow; too large is non-optimal. Choosing a proper learning rate can be difficult, and a single global learning rate may not be suitable for all parameters.
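One NAG step on a toy quadratic can be sketched as follows; the function name and the hyper-parameter values are illustrative, and the key point is that the gradient is taken at the look-ahead point, not at the current parameters.

```python
import numpy as np

def nag_step(theta, v, grad_fn, lr=0.1, mu=0.9):
    """One Nesterov accelerated gradient step.

    The gradient is evaluated at the look-ahead point theta + mu * v
    (the "correction factor"), rather than at theta itself.
    """
    lookahead = theta + mu * v
    v = mu * v - lr * grad_fn(lookahead)
    return theta + v, v

# Minimize f(x) = x^2 (gradient 2x) starting from x = 5.
grad = lambda x: 2 * x
theta, v = np.array([5.0]), np.zeros(1)
for _ in range(100):
    theta, v = nag_step(theta, v, grad)
```

Replacing `lookahead` with `theta` in the gradient call recovers standard momentum, which makes the difference between the two rules easy to see.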

Adaptive learning rate: AdaGrad
AdaGrad makes greater progress in the more gently sloped directions.
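A minimal AdaGrad step, assuming the usual accumulator-of-squared-gradients form (the hyper-parameter values below are illustrative). The demo shows the per-parameter effect: a coordinate with a 10x larger gradient gets a 10x smaller effective step, so both coordinates make equal progress.

```python
import numpy as np

def adagrad_step(theta, G, grad, lr=0.5, eps=1e-8):
    """One AdaGrad step.

    G accumulates squared gradients; the effective learning rate
    lr / sqrt(G) shrinks faster in steep directions, so gently sloped
    directions keep making progress.
    """
    G = G + grad ** 2
    theta = theta - lr * grad / (np.sqrt(G) + eps)
    return theta, G

theta, G = np.array([1.0, 1.0]), np.zeros(2)
g = np.array([10.0, 1.0])        # much steeper in the first coordinate
theta, G = adagrad_step(theta, G, g)
```

Because G only grows, the step sizes decrease monotonically, which is exactly the behavior AdaDelta (next) tries to soften.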

Adaptive learning rate: AdaDelta
AdaDelta is an extension of AdaGrad that seeks to reduce its aggressive, monotonically decreasing learning rate. We do not need to preset the global learning rate, as it does not appear in the update rule.
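A sketch of the AdaDelta rule: the step size is the ratio of two decaying RMS averages (of past updates and of past gradients), so no global learning rate appears. The decay rate and epsilon below are illustrative defaults; note that from a cold start the method makes only gradual initial progress on this toy problem.

```python
import numpy as np

def adadelta_step(theta, Eg2, Edx2, grad, rho=0.95, eps=1e-6):
    """One AdaDelta step.

    Eg2: running average of squared gradients.
    Edx2: running average of squared parameter updates.
    The step -RMS(dx)/RMS(g) * grad needs no preset learning rate.
    """
    Eg2 = rho * Eg2 + (1 - rho) * grad ** 2
    dx = -np.sqrt(Edx2 + eps) / np.sqrt(Eg2 + eps) * grad
    Edx2 = rho * Edx2 + (1 - rho) * dx ** 2
    return theta + dx, Eg2, Edx2

# Minimize f(x) = x^2 (gradient 2x) starting from x = 5.
theta, Eg2, Edx2 = np.array([5.0]), np.zeros(1), np.zeros(1)
for _ in range(200):
    theta, Eg2, Edx2 = adadelta_step(theta, Eg2, Edx2, 2 * theta)
```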

Adaptive learning rate: RMSProp
Pros:
- Useful for non-convex functions
- An effective and practical optimization algorithm for deep neural networks
Cons:
- Requires a preset learning rate

Combining RMSProp and Nesterov momentum
Apply momentum to the RMSProp-rescaled gradient.

RMSProp with momentum
Pros:
- Inherits the pros of RMSProp
- More robust to local minima
Cons:
- Requires a preset global learning rate
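The combination can be sketched as follows, assuming the common form in which momentum is applied to the RMSProp-rescaled gradient; the decay rates and the demo learning rate are illustrative, and the preset learning rate `lr` is exactly the remaining drawback noted above.

```python
import numpy as np

def rmsprop_momentum_step(theta, v, Eg2, grad,
                          lr=0.001, rho=0.9, mu=0.9, eps=1e-8):
    """RMSProp with momentum.

    Eg2: running average of squared gradients (RMSProp rescaling).
    v: velocity accumulated over the rescaled gradients (momentum).
    """
    Eg2 = rho * Eg2 + (1 - rho) * grad ** 2
    v = mu * v - lr * grad / (np.sqrt(Eg2) + eps)
    return theta + v, v, Eg2

# Minimize f(x) = x^2 (gradient 2x) starting from x = 5.
theta, v, Eg2 = np.array([5.0]), np.zeros(1), np.zeros(1)
for _ in range(500):
    theta, v, Eg2 = rmsprop_momentum_step(theta, v, Eg2, 2 * theta, lr=0.005)
```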

Adaptive learning rate: Adam
Adaptive Moment Estimation (Adam): a new combination of RMSProp and momentum.
- A variant of RMSProp with momentum
- With bias correction
- Robust to the choice of hyper-parameters
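The three bullets map directly onto the update rule: a first-moment estimate (momentum), a second-moment estimate (RMSProp-style rescaling), and bias correction for the zero initialization of both. The hyper-parameter values in this sketch are illustrative (the demo uses an unusually large learning rate to converge quickly on a toy quadratic).

```python
import numpy as np

def adam_step(theta, m, v, grad, t, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam step; t is the 1-based iteration counter."""
    m = b1 * m + (1 - b1) * grad           # first moment (momentum)
    v = b2 * v + (1 - b2) * grad ** 2      # second moment (RMSProp-style)
    m_hat = m / (1 - b1 ** t)              # bias correction: m, v start at 0
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Minimize f(x) = x^2 (gradient 2x) starting from x = 5.
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 301):
    theta, m, v = adam_step(theta, m, v, 2 * theta, t)
```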

Optimizing gradient descent
SGD optimization visualized on loss-surface contours and at a saddle point.

Back-Propagation

Back-propagation algorithm (1)

Review of multi-layer neural networks: the feed-forward operation is a chain of function compositions.

Back-propagation algorithm (2)
Loss function example: squared error. Network example: a simple one-layer linear model, from which the derivative of the loss function (for a single sample) follows directly.

Back-propagation algorithm (3)
General unit activation in a multilayer network. Forward propagation: calculate the activation for each unit. The loss L depends on a weight only through the activation of the unit it feeds into; the corresponding partial derivative defines that unit's error signal. (Activation function; input/output of a hidden layer.)

Back-propagation algorithm (4)
- Output unit with a linear output function
- Hidden unit which sends inputs to later units: check all nodes connected to it and apply the chain rule
- Update the weights with the learning rate

Back-propagation algorithm (5)
The BP algorithm for a multi-layer NN can be decomposed into four steps:
1. Feed-forward computation
2. Back-propagation to the output layer
3. Back-propagation to the hidden layer
4. Weight updates

(Kai Yu. SJTU Deep Learning Lecture.)
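The four steps can be sketched end to end for a one-hidden-layer network with sigmoid hidden units, a linear output, and squared-error loss; the shapes, seed, learning rate, and variable names are illustrative assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def bp_step(x, y, W1, W2, lr=0.5):
    """One pass of the four BP steps for a single training sample."""
    # 1. Feed-forward computation
    z = sigmoid(W1 @ x)                  # hidden layer output
    y_hat = W2 @ z                       # linear output unit
    # 2. Back-propagation to the output layer
    delta_out = y_hat - y                # error signal for L = 0.5||y_hat - y||^2
    # 3. Back-propagation to the hidden layer (chain rule through sigmoid')
    delta_hid = (W2.T @ delta_out) * z * (1.0 - z)
    # 4. Weight updates
    W2 = W2 - lr * np.outer(delta_out, z)
    W1 = W1 - lr * np.outer(delta_hid, x)
    return W1, W2, 0.5 * float(np.sum((y_hat - y) ** 2))

rng = np.random.default_rng(0)
W1, W2 = 0.5 * rng.normal(size=(3, 2)), 0.5 * rng.normal(size=(1, 3))
x, y = np.array([1.0, -1.0]), np.array([0.5])
losses = []
for _ in range(50):
    W1, W2, loss = bp_step(x, y, W1, W2)
    losses.append(loss)
```

Repeating the step drives the squared error on this single sample down, which is the behavior the four-step decomposition is meant to deliver.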

Activation function: Sigmoid
- Sigmoids saturate and kill gradients: when the neuron's activation saturates at either tail (0 or 1), the gradient at these regions is almost zero.
- Sigmoid outputs are not zero-centered: if the data coming into a neuron is always positive, then the gradients on the weights will become either all positive or all negative (the zigzag problem).

Activation function: Tanh

Activation function: ReLU
ReLU (Rectified Linear Unit). Q: Is ReLU a linear or a non-linear activation function? (It is piecewise linear, hence non-linear overall.)
- Greatly accelerates the convergence of stochastic gradient descent compared to the sigmoid/tanh functions.
- ReLU can be implemented by simply thresholding a matrix of activations at zero (no exponential operations).
- ReLU units can "die" during training: a large gradient flowing through a ReLU neuron could cause the weights to update in such a way that the neuron will never activate on any data point again.
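The four activation functions discussed in this section are one-liners; the leaky slope value below is a common illustrative default.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))       # saturates at 0 and 1, not zero-centered

def tanh(a):
    return np.tanh(a)                     # zero-centered, but still saturates

def relu(a):
    return np.maximum(a, 0.0)             # a simple threshold, no exponentials

def leaky_relu(a, slope=0.01):
    return np.where(a > 0, a, slope * a)  # small negative slope avoids dead units

a = np.array([-2.0, 0.0, 2.0])
```

Note how `relu` is literally "thresholding a matrix of activations at zero", and how `leaky_relu` keeps a nonzero gradient for negative inputs, addressing the dying-ReLU issue above.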

Activation function: Leaky ReLU

Stochastic Gradient Descent (SGD)
Training a neural network: Batch GD vs. Stochastic GD. BGD updates slowly; SGD updates blindly.

Physical analogy of SGD
The negative gradient is regarded as a force moving a particle in the parameter space. Assume unit mass and an average force-decaying factor during acceleration; this determines the velocity, and after one time step the particle moves to a new point in parameter space. Problem: the starting velocity at each point is not zero, leading to oscillation.

Momentum
Moving matter has inertia, or momentum! Consider the velocity of the previous update and assume an average decaying factor resulting from friction; the velocity update then combines the decayed previous velocity with the force from the new gradient. After one time step, the particle moves to a new point in parameter space.

Momentum algorithm
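The algorithm reduces to a two-line update; the hyper-parameter values are illustrative. The demo applies a constant gradient to show the "inertia" effect: consistent gradient directions build up speed until the velocity reaches its terminal value -lr * g / (1 - mu).

```python
import numpy as np

def momentum_step(theta, v, grad, lr=0.1, mu=0.9):
    """Classical momentum: the velocity accumulates an exponentially
    decaying average of past gradients; mu plays the role of friction."""
    v = mu * v - lr * grad
    return theta + v, v

# A constant unit gradient: velocity approaches -lr / (1 - mu) = -1.0.
theta, v = np.zeros(1), np.zeros(1)
for _ in range(100):
    theta, v = momentum_step(theta, v, np.ones(1))
```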

Momentum analysis
- Increases updates for dimensions whose gradients point in the same directions
- Reduces updates for dimensions whose gradients change directions
- Gains faster convergence and reduced oscillation

Attention Mechanism
Deep Learning and its Application

Transformer: Encoder
Transformer is a key component of BERT/GPT.
- Parallel computing
- Replacing RNN & LSTM, becoming the most effective feature extractor

Transformer components:
- Positional Encoding

- Residual Connection (Add & Norm)
- Encoder-Decoder Attention

The attention mechanism by itself cannot distinguish the position order of input words: "The animals cross the street." vs. "Cross the street the animals."

Position Encoding
Even dimensions use sine and odd dimensions use cosine, where pos is the position and i is the dimension. For any fixed offset k, PE(pos+k) can be represented as a linear function of PE(pos).

Residuals

Transformer: Decoder
- The encoder's inputs flow through a self-attention layer.
- The outputs of the self-attention layer are fed to a feed-forward neural network.
- The decoder has both those layers, but between them is an attention layer that helps the decoder focus on relevant parts of the input sentence.

Multi-Head Attention
- Encoder: attention between every two tokens
- Decoder: attention only over the preceding tokens

References
[1] Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems. 2017.
[2] Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
[3] Serrano, Sofia, and Noah A. Smith. "Is Attention Interpretable?" arXiv preprint arXiv:1906.03731 (2019).
[4] Jain, Sarthak, and Byron C. Wallace. "Attention is not explanation." arXiv preprint arXiv:1902.10186 (2019).
[5] Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018).
[6] Radford, Alec, et al. "Improving language understanding by generative pre-training." URL https://s3-us-west-2.amazonaws.com/openai-assets/researchcovers/languageunsupervised/languageunderstandingpaper.pdf (2018).
[7] Yang, Zhilin, et al. "XLNet: Generalized Autoregressive Pretraining for Language Understanding." arXiv preprint arXiv:1906.08237 (2019).
[8] Dai, Zihang, et al. "Transformer-XL: Attentive language models beyond a fixed-length context." arXiv preprint arXiv:1901.02860 (2019).
[9] Veličković, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).
[10] https://jalammar.github.io/illustrated-transformer/
[11] /p/31547842

Attention Mechanism
The attention mechanism is widely used in NLP, CV, etc. and has many variants.
- Assign different weights to each part of the input or the model structure.
- Extract more significant hidden information without additional storage overhead.
- Make neural networks interpretable (a disputed claim).

Seq2Seq shortcoming (bottleneck problem)
- Words in one sentence share the same weight.
- Only the last hidden state of the encoder is passed to the decoder.

Kyunghyun Cho et al. Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, arXiv preprint arXiv:1406.1078.
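The two mechanisms at the core of this section, sinusoidal position encoding and scaled dot-product attention, can be sketched together in NumPy; the shapes and toy inputs are illustrative, and the PE sketch assumes an even model dimension.

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal PE: PE[pos, 2i] = sin(pos / 10000^(2i/d_model)),
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model)). Assumes d_model is even."""
    pos = np.arange(max_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angle = pos / np.power(10000.0, i / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)            # each query's weights sum to 1
    return w @ V, w

pe = positional_encoding(10, 8)
Q = np.eye(3)
K = np.eye(3)
V = np.arange(9.0).reshape(3, 3)
out, w = attention(Q, K, V)
```

Adding `pe[:n]` to the token embeddings is what injects word order, since attention itself is permutation-invariant, as noted above.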

Attention variants

Seq2Seq with attention
- Global attention: all hidden states are used.
- Local attention: an aligned position is calculated for each decoding word; the aligned position determines a window of hidden states.

Autonomous Driving: Background
Junchi Yan (严骏驰), Shanghai Jiao Tong University

Robotaxis and autonomous urban sanitation driving: (1) vehicle assembly bay; (2) joint platooning of multiple types of sanitation vehicles; (3) real autonomous driving in urban sanitation scenes, supported by HD maps, point clouds, and video.

Background: the task
The autonomous driving task: given sensor data (cameras: images; millimeter-wave radar / LiDAR: point clouds; IMU/GPS: vehicle state; HD maps, etc.), drive the passenger to the destination safely, comfortably, and quickly.

Background: perception
Perception includes object detection, object tracking, lane detection, etc. Its role is to extract the state of the surrounding environment (static and dynamic objects) from sensor data, e.g. the positions and velocities of other vehicles, the position and current color of traffic lights, and lane positions and topology.

Background: prediction
Prediction includes intention recognition, trajectory prediction, etc. Its role is to infer the future positions of surrounding dynamic objects (pedestrians, cyclists, vehicles, etc.) from the historical and current environment states produced by the perception module.

Background: planning
Planning includes behavioral decision-making, path planning, and motion control. Its role is to produce the ego vehicle's driving plan and control strategy from the predicted future environment states, the HD map, and the passenger's destination.

End-to-end Autonomous Driving Architectures
Junchi Yan, Shanghai Jiao Tong University

End-to-end autonomous driving: taxonomy
The traditional on-vehicle solution is a modular system architecture. Its drawbacks: information loss, cascading errors, redundant/repeated computation, and upstream tasks that do not directly optimize the driving objective (a detected nearby vehicle matters far more than a distant one!). End-to-end autonomous driving exploits the end-to-end optimization advantage of neural networks and massive data to explore new solutions.

Two broad classes: explicit vs. implicit end-to-end.
- Explicit end-to-end keeps the traditional division of labor, but the intermediate stages are differentiable. Advantages: intermediate visualizations make debugging easy, and rule constraints are easy to add in the planning-and-control stage. Drawbacks: multi-task training is unstable, and a multi-task model still loses some performance compared with single-task models. (Planning-oriented Autonomous Driving, CVPR 2023 Award Candidate; Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline, NeurIPS 2022)
- Implicit end-to-end skips most modules and outputs the final result directly. Advantage: the final task is optimized end to end. Drawback: driving errors are hard to debug.

End-to-end autonomous driving: explicit
ST-P3: the first end-to-end autonomous driving framework based on surround-view cameras with explicit intermediate representations. Perception: BEV semantic segmentation of vehicles/pedestrians and the map at the current time. Prediction: BEV semantic segmentation at future times (future positions of vehicles/pedestrians). Planning: perception results + rules + prediction results -> coarse-grained planning -> GRU refinement -> fine-grained planning. (ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning, ECCV 2022)

UniAD (CVPR 2023 Award Candidate, 12/9155): uses Transformers as the medium to complete most autonomous driving tasks explicitly and end to end. Queries of downstream tasks exchange information with queries of upstream tasks via cross-attention, and every task reaches SOTA. In its planning module, motion queries serve as priors, attention is applied over BEV features according to the desired command, and the output is optimized under occupancy and kinematic constraints. Demo videos and visualizations show the ability to recover from upstream errors, but the long-tail problem in perception still remains. (Planning-oriented Autonomous Driving, CVPR 2023 Award Candidate)

End-to-end autonomous driving: implicit
CILRS. Challenge: planning-and-control algorithms based on behavior cloning suffer from causal confusion. Solution: use the image network's features to predict the vehicle's current speed, encouraging the model to perceive dynamic information from the visual input rather than relying only on the previous-step speed for decisions. This mitigates the problem to some degree but does not solve it. (Exploring the Limitations of Behavior Cloning for Autonomous Driving, ICCV 2019)

Exploring output forms for end-to-end models: TCP. Challenge: outputting control signals is good at turning but collides often; outputting trajectories collides less but takes wide turns, failing e.g. on 90-degree corners. Solution: multi-task branches with feature interaction between the branches. Using only a single monocular camera as input, the model ranked first on the CARLA Autonomous Driving Leaderboard at release, far ahead of other methods using multi-sensor input (multiple cameras and LiDAR). (Trajectory-guided Control Prediction for End-to-end Autonomous Driving: A Simple yet Strong Baseline, NeurIPS 2022)

End-to-end autonomous driving: pre-training
Self-supervised pre-training for end-to-end models: PPGeo. Massive unlabeled data is collected every day in autonomous driving scenarios, so exploiting self-supervised learning is crucial, yet the contrastive learning and MIM methods common in CV do not model the motion priors in driving. Solution: in stage one, learn three estimation networks (depth, pose change, camera intrinsics) by reconstruction between consecutive frames (the MonoDepth framework); in stage two, freeze the trained depth and intrinsics networks and replace the pose network with the backbone to be pre-trained, so the model must predict the next frame from a single-frame input, encouraging the backbone to learn ego-motion-related information. On end-to-end tasks, PPGeo pre-training far exceeds ImageNet pre-training and MAE/MoCo self-supervised training, and visualizations show that PPGeo features better capture ego-motion-related information. (PPGeo: Policy Pre-training for Autonomous Driving via Self-supervised Geometric Modeling, ICLR 2023)

Typical Autonomous Driving Modules
Junchi Yan, Shanghai Jiao Tong University

Perception module: input
Input: multiple sensors (surround-view cameras + LiDAR + millimeter-wave radar + ego-state sensors IMU/GPS) plus an HD map (optional). (nuScenes: A multimodal dataset for autonomous driving, CVPR 2020)

Perception module: output
Output: positions and velocities of surrounding objects (3D object detection + 3D object tracking), road structure understanding, and lane detection.

Perception module: sensor fusion
Key problem: fuse multi-sensor information into a single output according to its spatio-temporal correspondences.
- Early fusion: fuse the inputs of different sensors at the data or feature level and detect from the fused features.
- Late fusion: each sensor detects independently; the detections are then merged and de-duplicated.
Early fusion at the feature level: BEVFusion.

Early fusion (feature level, e.g. BEVFusion). Advantages: no need to design complex merging and de-duplication rules; the whole system is differentiable; the fused features can be shared by multiple downstream modules. Drawback: training is coupled to a specific sensor configuration.
Late fusion. Advantages: decoupling, robustness, and direct reuse of existing object detectors. Drawback: rule-based merging and de-duplication cannot exploit big data.
Early fusion at the data level: PointPainting.
(PointPainting: Sequential Fusion for 3D Object Detection, CVPR 2020; BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation, IROS 2022)

Perception module: BEV feature fusion
BEV (bird's-eye-view) feature fusion: extract features from each sensor input with its own backbone, then project the features onto a BEV grid according to the geometric relationships. LiDAR, radar, and HD-map features only need the z-axis flattened to obtain the BEV, but camera features require a transform from pixel coordinates to ego-vehicle coordinates.
- Transform 1: 2D -> 3D, i.e. project each pixel to its corresponding position in 3D space, e.g. LSS. Challenge: each pixel corresponds to a ray in 3D space, so depth must be estimated. Advantage: with point-cloud assistance the depth distribution is relatively easy to generate. Drawbacks: sparse depth supervision; large depth-estimation errors from point clouds in high-speed scenes; object boundaries are hard to handle. (Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D, ECCV 2020)
- Transform 2: 3D -> 2D, i.e. for each position in 3D space, query the image features at the corresponding location, e.g. BEVFormer. Advantages: no depth estimation needed; easy to train at scale. Drawbacks: large data requirements; high computational complexity. (BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers, ECCV 2022)

Perception module: joint lane detection
PersFormer: obtain 2D-BEV reference points from image features via inverse perspective mapping and detect lanes in 3D space with a Transformer. It greatly outperforms existing methods on both self-collected and public datasets. ECCV 2022 Oral; 290+ GitHub stars within a year: /OpenDriveLab/PersFormer_3DLane. It also releases OpenLane, the first large-scale real-world 3D lane dataset in industry and academia: /OpenPerceptionX/OpenLane. (PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark, ECCV 2022 Oral)
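The geometric core of the 2D -> 3D transform can be illustrated with a pinhole back-projection sketch: given camera intrinsics K and an estimated depth, a pixel lifts onto a point along its viewing ray in the camera frame. The intrinsics values, function name, and single-pixel setup below are hypothetical simplifications of what methods like LSS do densely for every pixel and depth bin.

```python
import numpy as np

def lift_pixel(u, v, depth, K):
    """Back-project pixel (u, v) at a given depth to a 3D point in the
    camera frame, assuming a pinhole model with intrinsics K.
    Without the depth, the pixel only constrains a ray, not a point."""
    uv1 = np.array([u, v, 1.0])
    return depth * (np.linalg.inv(K) @ uv1)

# Toy intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

# The principal point lifts onto the optical axis at the chosen depth.
p = lift_pixel(320.0, 240.0, 10.0, K)
```

A further extrinsic transform (camera-to-ego) would then place the lifted points on the ego-centric BEV grid.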

Perception module: lane detection
OpenDenseLane: a large and diverse LiDAR-based lane dataset: /Thinklab-SJTU/OpenDenseLane. (OpenDenseLane: A Dense LiDAR-Based Dataset for HD Map Construction, ICME 2022)

Perception module: road structure understanding
RoadGenome (OpenLane-V2): the first real-world lane dataset for road-topology reasoning in industry and academia: /OpenDriveLab/OpenLane-V2. (RoadGenome: A Topology Reasoning Benchmark for Scene Understanding in Autonomous Driving, arXiv)

Prediction module: input
Input: historical positions, velocities, heading angles, etc. of dynamic objects (vehicles, pedestrians, cyclists), positions of static obstacles, and the positions and types of lanes and traffic signs.

Prediction module: output
Output: future positions of the dynamic objects in the scene (multiple possible futures). Metrics: average L2 error, final-point L2 error, miss rate.

Prediction module: scene encoding
Key problem: how to encode features for dynamic objects from heterogeneous, dynamic, unstructured input.
- Method 1, raster-based encoding: "draw" the extracted information back onto a BEV canvas and encode it with a CNN. Advantage: mature CV backbones can be used directly. Drawbacks: sparse feature maps are inefficient; the mapping between physical and pixel coordinates introduces errors; different types of semantic relations are hard to encode explicitly. (ChauffeurNet: Learning to Drive by Imitating the Best and Synthesizing the Worst, Waymo, 2018)
- Method 2, vector-based encoding: process each input into vectors with its own backbone and model relations with a GNN/Transformer (VectorNet uses a Transformer; LaneGCN uses four kinds of GCNs). Advantage: directly uses the abstracted scene-element information to model relations. Drawback: efficient graph-network operators must be implemented on the target chips. (VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation, CVPR 2020; LaneGCN: Learning Lane Graph Representations for Motion Forecasting, ECCV 2020 Oral)

Prediction module: heterogeneous graph Transformer
HDGT (Heterogeneous Driving Graph Transformer): the information sources in a driving scene are dense and diverse (vehicles, pedestrians, lanes, traffic lights, etc.), so the scene is modeled as a heterogeneous graph whose nodes are scene elements and whose edges are the relations between them; the spatial relativity of elements is handled by viewpoint transforms of node features; and the massive data of autonomous driving motivates a Transformer-based graph network. (HDGT: Heterogeneous Driving Graph Transformer for Multi-Agent Trajectory Prediction via Scene Encoding, arXiv)

Prediction module: goal-based prediction
TNT, a two-stage trajectory predictor. Prior: drivers have destinations, and given the destination, the exact fluctuation of the trajectory matters little. Stage one predicts goals to ensure diversity; stage two completes the trajectory conditioned on each goal to improve consistency. L2-based metrics get worse, but miss-rate-based metrics improve substantially, indicating higher diversity. (TNT: Target-driveN Trajectory Prediction, CoRL 2020)

Prediction module: temporal-consistency optimization
(Towards Capturing the Temporal Dynamics for Trajectory Prediction: a Coarse-to-Fine Approach, CoRL 2022)
