Presentation Materials on Driving Innovation in Automotive Technology - Li Auto Autonomous Driving - 2024-04 - Autonomous Driving

The Convergence of Autonomous Driving and System 1 & 2 Thinking

Peng Jia

Li Auto, China

Contents

Li AD Overview

Li AD Technology Highlights

Li Auto's View on Autonomous Driving

Rule-driven L2: 2D/Mono3D

Data-driven L3: BEV/End2End

Knowledge-driven L4: VLM/World Model

[Figure: real-world everyday driving scenarios, divided into Known Scenarios, Expanded Driving Scenarios, and Unknown Scenarios]

Li AD Framework

SYSTEM 1: Intuition & instinct (95%)

Unconscious, Fast, Associative, Automatic pilot

SYSTEM 2: Rational thinking

Takes effort, Slow, Logical, Lazy, Indecisive

System 1: End-to-End Model for L3 AD

Fast end-to-end response to the surrounding environment.

System 2: Large Multimodal-Action Model

Explores and reasons logically in unknown environments. Modalities include language, vision, point clouds, CAN bus, and navigation, used to solve L4 unknown scenes.

System 1 / System 2 Training Loop

[Figure labels, Vehicle side: Sensors, L3 End-to-End Model, L4 Multimodal LLM, Perception, Decision & Planning, Control, Short-term Memory, Recognition, General Knowledge]

[Figure labels, Cloud side: Sim Reinforcement Learning Model, Evaluation Network, Generative World Model]

Well-recognized works from the Li AD team

MUTR3D: 2021 world's 1st in camera-based 3D tracking

FUTR3D: Industry-leading multi-sensor 3D detection model

DenseTNT: 1st place solution in the ICCV 2021 INTERPRET Challenge

DETR3D: 2021 world's 1st in camera-based 3D detection

HDMapNet: CVPR 2021 ADP3 Workshop (Best Paper Nomination)

CoRL 2021 DETR3D: /pdf/2110.06922.pdf

CVPR 2022 MUTR3D: /pdf/2205.00613.pdf

ICML 2023 VectorMapNet: /pdf/2206.08920.pdf

CVPR 2023 NPN: /pdf/2304.08481.pdf

CVPR 2023 FUTR3D: /pdf/2203.10642.pdf

ICRA 2022 HDMapNet: /pdf/2107.06307.pdf

CVPR 2023 VIP3D: /pdf/2208.01582.pdf

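Architecture of AD Max 3.0

[Figure: AD Max 3.0 architecture. Labels:]

Sensors and inputs: Camera ×7, LiDAR, Radar, Navigation Map

Compute: NVIDIA DRIVE Orin ×2

Networks: One Model (Multi-task Perception), Prediction & Planning Network, End2End Traffic Signal Network

Other blocks: Static BEV, Object BEV, Occupancy, MPC, Spatio-Temporal Planner, ……

Safety: Safety Perception, Safety Planner

Shadow: Shadow App 1, Shadow App 2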

Inference Perf Optimization

For Perception Pipeline: 111 ms / 9 fps => 48 ms / 21 fps

Optimization Action | Opt Type | Latency (percentage)
Apply MPS to avoid CUDA context switch overhead (system of multiple processes with GPU calls). | Pipeline | -9.91%
Remove unneeded CUDA calls (e.g., cudaWaitExternalSemaphore). | Pipeline | -5.60%
Enlarge CUDA_DEVICE_MAX_CONNECTIONS to resolve false dependency among CUDA streams. | Pipeline | -4.80%
Schedule model heads with different inference frequency. | Pipeline | -8.10%
Replace D2D with H2D copy in imaging streams to utilize the GPU Copy Engine instead of CUDA cores. | Pipeline | -3.25%
Optimize the bevpoolv2 plugin by reducing warp divergence & fp16 (from Lidar_AI_Solution). | Model | -10.80%
Eliminate reformatting kernels due to unfused QAT nodes. | Model | -3.60%
Apply TRT MHA fast kernel to accelerate transformer blocks. | Model | -5.60%
Use Sparsity. | Model | -3.60%
Offload ROI-align to PVA. | Model | -1.80%

Final perf improvement: -57.06%
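
The figures on this slide can be cross-checked with a little arithmetic. The sketch below assumes the per-item percentages are additive shares of the original 111 ms latency; the slide does not say how they were measured, so that reading is an assumption. Under it, the items sum to the reported 57.06%, and the two latencies convert to roughly the quoted frame rates.

```python
# Per-item latency reductions from the table, in percent of the baseline.
reductions = [9.91, 5.60, 4.80, 8.10, 3.25,   # pipeline-level items
              10.80, 3.60, 5.60, 3.60, 1.80]  # model-level items

total_reduction = round(sum(reductions), 2)


def fps(latency_ms: float) -> float:
    """Frames per second for a pipeline that takes latency_ms per frame."""
    return 1000.0 / latency_ms


baseline_ms, optimized_ms = 111.0, 48.0
# Reduction implied by the headline latencies alone.
measured_reduction = (baseline_ms - optimized_ms) / baseline_ms * 100.0

print(f"sum of table items: {total_reduction:.2f}%")
print(f"111 ms -> 48 ms:    {measured_reduction:.2f}% lower")
print(f"frame rate:         {fps(baseline_ms):.1f} -> {fps(optimized_ms):.1f} fps")
```

The headline latencies imply a reduction slightly below the summed 57.06%; the difference may simply come from rounding 111 ms and 48 ms to whole milliseconds.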

Li AD L4 Research: Drive-VLM

Sequence of Images -> DriveVLM -> Scene Description -> Critical Objects Analysis -> Hierarchical Planning

Input:
Based on the summarized analysis, please determine the meta actions and decisions.

Drive-VLM:
Meta Actions: ['Slow down', 'Stop', 'Reverse'].
Decisions: Drive slowly, stop in front of the construction workers, and then reverse.

Input:
Based on the above decisions and analysis, please indicate the 3 s future trajectory of the ego vehicle.

Drive-VLM:
Future trajectory: [(x1, y1), (x2, y2), …, (xn, yn)]

Input:
Picture: <img>
Please indicate the scene condition.

Drive-VLM:
Weather: Sunny.
Time: Day.
Road Environment: Urban.
Alternative Lane: Own lane ahead impassable, occupied by parked vehicles; left lane passable, oncoming lane, proceed with caution.
Ego Lane Position: Do not apply.

[Figure labels: Vision Encoder, Feature Alignment]

Input:
Please list the key objects.

Drive-VLM:
Critical objects: <Two construction workers>
Box: <(x1, y1), (x2, y2)>

[Figure labels: Matching, 3D Perception Results]

Input:
Matched objects: Two construction workers
The position and history trajectory of matched objects in BEV region: <…>
Unmatched objects: None
Describe the object and indicate its influence.

Drive-VLM:
Characteristic: Garbage cleaning in progress on the right lane ahead.
Influence: Blocking the route of our vehicle.
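
The slide shows critical objects from Drive-VLM (2D boxes) being matched against 3D perception results, but it does not state the matching criterion. The following is a minimal sketch under stated assumptions: the 3D detections have already been projected into 2D image boxes, and a pair counts as matched when its intersection over union (IoU) reaches a threshold. The function `match_objects`, the 0.5 threshold, and the example boxes are all hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in image pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_objects(vlm_boxes: Dict[str, Box],
                  perception_boxes: Dict[int, Box],
                  threshold: float = 0.5):
    """Greedily match each VLM object to its best unused perception box."""
    matched: Dict[str, int] = {}
    unmatched: List[str] = []
    used = set()
    for name, box in vlm_boxes.items():
        best_id, best_iou = None, threshold
        for track_id, pbox in perception_boxes.items():
            score = iou(box, pbox)
            if track_id not in used and score >= best_iou:
                best_id, best_iou = track_id, score
        if best_id is None:
            unmatched.append(name)   # no perception box overlaps enough
        else:
            matched[name] = best_id
            used.add(best_id)
    return matched, unmatched


# Hypothetical boxes: one object overlaps a perception track, one does not.
vlm = {"construction workers": (100, 200, 180, 360),
       "traffic cone": (400, 300, 420, 340)}
tracks = {7: (105, 205, 185, 355), 9: (600, 100, 700, 200)}
matched, unmatched = match_objects(vlm, tracks)
print(matched, unmatched)
```

In this sketch, matched objects could then be described to the model together with their BEV position and history trajectory, while unmatched ones would be described from the image alone, mirroring the "Matched objects" and "Unmatched objects" fields in the prompt.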

Input:
Ego-states and historical trajectory: <…>
Based on the analysis of scene and critical objects, determine the driving meta actions and decisions.

Slow-Fast Dual System Collaboration

Traditional AV Pipeline: 3D Perception, Motion Prediction, Trajectory Planning
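
One way to picture slow-fast collaboration is a loop in which the fast system runs on every control tick while the slow system refreshes its guidance only occasionally. The sketch below is purely illustrative: the slide does not describe the actual interface between the two systems, and `Guidance`, `slow_system`, `fast_system`, the tick rates, and the speed values are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Guidance:
    """Coarse decision from the slow system (hypothetical structure)."""
    meta_action: str      # e.g. "slow down", "keep speed"
    issued_at_tick: int


def slow_system(tick: int, obstacle_ahead: bool) -> Guidance:
    """Stand-in for the VLM: reasons about the scene at low frequency."""
    return Guidance("slow down" if obstacle_ahead else "keep speed", tick)


def fast_system(current_speed: float, guidance: Optional[Guidance]) -> float:
    """Stand-in for the real-time planner: returns a target speed in m/s."""
    if guidance is not None and guidance.meta_action == "slow down":
        return max(0.0, current_speed - 1.0)
    return current_speed


def run(ticks: int, slow_every: int, obstacle_from_tick: int) -> List[float]:
    speed, guidance, history = 10.0, None, []
    for tick in range(ticks):
        # The slow system only refreshes its guidance every `slow_every` ticks.
        if tick % slow_every == 0:
            guidance = slow_system(tick, obstacle_ahead=tick >= obstacle_from_tick)
        # The fast system runs on every tick using the latest guidance.
        speed = fast_system(speed, guidance)
        history.append(speed)
    return history


speeds = run(ticks=8, slow_every=4, obstacle_from_tick=2)
print(speeds)
```

In this toy run the obstacle appears at tick 2, but the speed only starts dropping at tick 4, when the slow system next updates. That lag is the reason a design like this keeps a fast system in the loop on every tick.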

* Submitted to CVPR 24

https://openreview.net/forum?id=jL4YMzXYII

LLM Deployed on NVIDIA DRIVE Orin

LLaMA2-3B (BS=1, Input_len=128, Output_len=128)

Platform | Config | Context Latency (ms) | Decoder Perf (tokens/s)
DriveOS Linux, Orin | INT4 (GPTQ) | 52.5 | 65.6

LLaMA2-7B (BS=1, Input_len=128, Output_len=128)

Platform | Config | Context Latency (ms) | Decoder Perf (tokens/s)
DriveOS Linux, Orin | INT4 (GPTQ) | 73.15 | 41.8
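
To put these numbers in terms of response time, the sketch below estimates how long a full 128-token answer would take. It reads the table as 52.5 ms / 65.6 tokens/s for the 3B model and 73.15 ms / 41.8 tokens/s for the 7B model. It also assumes that "Context Latency" is the time to process the 128-token input and that "Decoder Perf" is a steady generation rate. Both interpretations are assumptions, and `end_to_end_seconds` is a hypothetical helper.

```python
def end_to_end_seconds(context_latency_ms: float,
                       decode_tokens_per_s: float,
                       output_len: int = 128) -> float:
    """Input-processing time plus time to generate output_len tokens, in seconds."""
    return context_latency_ms / 1000.0 + output_len / decode_tokens_per_s


for name, ctx_ms, tps in [("LLaMA2-3B", 52.5, 65.6), ("LLaMA2-7B", 73.15, 41.8)]:
    total = end_to_end_seconds(ctx_ms, tps)
    per_token_ms = 1000.0 / tps
    print(f"{name}: {per_token_ms:.1f} ms/token, "
          f"about {total:.2f} s for 128 output tokens")
```

Under these assumptions a full-length answer takes on the order of seconds.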

Li AD Sim Research: Street Gaussians

[Figure: original scenes shown next to Street-gaussian swapping results]

[Figure: real log and camera simulation. Panels: Origin scene, Linre scene, Unisim scene]

[Figure: method overview. Labels: Scene representation, Background model, Object model, Optimizable Tracked boxes, Geometry model (Position μ, Rotation R, Opacity α, Scale S), Dynamic appearance model (∑ of Time basis ⊗ SH basis), Composition, Point-based Rendering, Rendering Images, Decomposition, Semantic maps]

Metric | 3DGS [16] | NSG [31] | MARS [51] | Ours
PSNR↑ | 29.95 | 30.23 | 31.37 | 34.54
PSNR*↑ | 17.74 | 22.05 | 23.07 | 25.16
SSIM↑ | 0.907 | 0.866 | 0.904 | 0.936
LPIPS↓ | 0.140 | 0.331 | 0.246 | 0.091
FPS↑ | 277 | 0.47 | 0.68 | 133

Table 1. Quantitative results on the Waymo [40] dataset. The rendering image resolution is 1066 × 1600. "PSNR*" denotes the PSNR of moving objects.
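
For readers unfamiliar with the metrics, PSNR is 10·log10(MAX² / MSE) between a rendered image and its reference. The sketch below shows the standard formula and one plausible way to restrict it to moving objects with a pixel mask. How the paper actually computes PSNR* is not stated on the slide, so the masking step is an assumption, and the arrays are synthetic.

```python
from typing import Optional

import numpy as np


def psnr(rendered: np.ndarray, reference: np.ndarray,
         mask: Optional[np.ndarray] = None, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB; mask restricts it to selected pixels."""
    diff2 = (rendered.astype(np.float64) - reference.astype(np.float64)) ** 2
    if mask is not None:
        diff2 = diff2[mask.astype(bool)]  # keep only masked pixels (all channels)
    mse = diff2.mean()
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Synthetic example: the rendering error is confined to the left half,
# which stands in for the "moving object" region.
rng = np.random.default_rng(0)
reference = rng.random((4, 6, 3))
rendered = reference.copy()
rendered[:, :3] += 0.1
object_mask = np.zeros((4, 6), dtype=bool)
object_mask[:, :3] = True

full_psnr = psnr(rendered, reference)
object_psnr = psnr(rendered, reference, mask=object_mask)
print(f"full image: {full_psnr:.2f} dB, masked region: {object_psnr:.2f} dB")
```

Because the error here sits entirely inside the mask, the masked score is lower than the full-image score. The PSNR* row of the table follows the same pattern, with every method scoring lower than in its PSNR row.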

* Submitted to CVPR 24

https://openreview.net/forum?id=jL4YMzXYII

Li AD Research: BEV-CLIP (Multimodal Data Retrieval)

[Figure labels: (a) BEV Encoder, Language Encoder with LoRA, Weight matrix, Text embedding (example text: "Peds crossing crosswalk, many cars……"), KG Embedding, Knowledge graph; (b) Shared Cross-modal Prompt; (c) BEV Caption Generation Head, Contrastive loss]

Figure 2. Overall structure of BEV-CLIP.

(a) Processing of BEV and text features. The images from the 6 surrounding cameras are encoded into a BEV feature by the BEV Encoder with frozen parameters. At the same time, the input text embedding is concatenated with the keyword-matched Knowledge Graph node embedding and fed into the Language Encoder with a LoRA branch for processing. (b) Shared cross-modal prompt (SCP), which aligns the BEV and linguistic features in the same hidden space. (c) Joint supervision of caption generation and retrieval tasks. ⊙ denotes dot product.
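
The caption names a contrastive loss for the retrieval task without giving its form. As a rough illustration, the sketch below implements the symmetric CLIP-style (InfoNCE) loss commonly used to align two modalities through dot-product similarity. Treating it as BEV-CLIP's loss is an assumption, and the temperature, embedding sizes, and random inputs are all made up.

```python
import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def log_softmax(logits: np.ndarray, axis: int) -> np.ndarray:
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def contrastive_loss(bev_emb: np.ndarray, text_emb: np.ndarray,
                     temperature: float = 0.07) -> float:
    """Symmetric InfoNCE loss; row i of each matrix is a matching pair."""
    bev = l2_normalize(bev_emb)
    txt = l2_normalize(text_emb)
    logits = bev @ txt.T / temperature  # pairwise dot-product similarities
    idx = np.arange(len(logits))
    bev_to_text = -log_softmax(logits, axis=1)[idx, idx].mean()
    text_to_bev = -log_softmax(logits, axis=0)[idx, idx].mean()
    return float((bev_to_text + text_to_bev) / 2.0)


rng = np.random.default_rng(0)
text = rng.normal(size=(4, 8))
bev_aligned = text + 0.01 * rng.normal(size=(4, 8))  # BEV features close to their captions
bev_shuffled = bev_aligned[::-1]                     # every pair deliberately mismatched

aligned = contrastive_loss(bev_aligned, text)
shuffled = contrastive_loss(bev_shuffled, text)
print(f"aligned pairs: {aligned:.4f}, mismatched pairs: {shuffled:.4f}")
```

The loss is small when each BEV feature is most similar to its own caption and large when the pairs are mismatched, which is the property a retrieval system needs from its embedding space.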
