
[Figure: AI-generated visual representation of a storage system for AI]

Modern Storage Technologies for the AI Era

Marcus Paradies (LMU)

March 19, 2025

Outline

(1) For which tasks do we need storage in AI?

(2) Which storage technologies could we use for that?

Disclaimer: I will mostly talk about Storage4AI and not AI4Storage

Storage Requirements in Deep Learning

Training Deep Neural Networks

– A DNN is trained over multiple rounds termed epochs

– Each epoch processes all items in the dataset exactly once, and consists of multiple iterations

– Each iteration processes a random, disjoint subset of the data termed a minibatch

– The DNN is trained until a target accuracy is reached

[Figure: training pipeline. Data Loading (local/remote storage, cache; CPU) → Online Data Preparation (decoding, transform, augmentation; CPU) → Model Training (forward prop., loss calc., back prop., weight upd.; GPU)]
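The epoch and minibatch structure described above can be sketched in a few lines of Python. This is a minimal illustration, not a data loader from any framework: the function names `minibatches` and `run_epochs` are hypothetical, and a fixed number of epochs stands in for the "until a target accuracy is reached" stopping rule.

```python
import random

def minibatches(num_items, batch_size, rng):
    """Yield disjoint random minibatches that together cover one epoch."""
    order = list(range(num_items))
    rng.shuffle(order)  # a fresh random order for every epoch
    for start in range(0, num_items, batch_size):
        yield order[start:start + batch_size]

def run_epochs(num_items, batch_size, num_epochs, seed=0):
    """Return, for each epoch, its list of minibatches (lists of item indices)."""
    rng = random.Random(seed)
    return [list(minibatches(num_items, batch_size, rng))
            for _ in range(num_epochs)]

epochs = run_epochs(num_items=10, batch_size=4, num_epochs=2)
for epoch_id, batches in enumerate(epochs):
    print(f"epoch {epoch_id}: {batches}")
```

Every item index appears exactly once per epoch, but in a different order each time. That access pattern is what the storage system has to serve.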

(1) Data Loading

– Data loaded from local (e.g., SSD) or remote storage (e.g., S3)

– Caching data might help (but requires special caching policies due to dataset sizes)
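Why a special caching policy? A small simulation can illustrate it. The sketch below assumes the access pattern from the training slides (every item once per epoch, in random order) and a cache smaller than the dataset. It compares classic LRU with a simple "fill once, never evict" policy. The function `simulate` and the policy names are made up for this illustration and do not reproduce any specific system.

```python
import random
from collections import OrderedDict

def simulate(num_items, capacity, epochs, policy, seed=0):
    """Count cache hits for epoch-style random access under a given policy."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(epochs):
        order = list(range(num_items))
        rng.shuffle(order)              # each item is read once per epoch
        for item in order:
            if item in cache:
                hits += 1
                if policy == "lru":
                    cache.move_to_end(item)   # mark as most recently used
            elif len(cache) < capacity:
                cache[item] = True            # cache still has free space
            elif policy == "lru":
                cache.popitem(last=False)     # evict least recently used
                cache[item] = True
            # policy == "static": cache is full, keep its contents as they are
    return hits

for policy in ("lru", "static"):
    print(policy, simulate(num_items=1000, capacity=200, epochs=5, policy=policy))
```

With the static policy, every cached item is hit exactly once in each epoch after the first. LRU can never do better on this pattern, because only items read near the end of the previous epoch can still be cached, and many of them are evicted before they are read again.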

(2) Online Data Preparation

– Input data are decoded, preprocessed, and augmented on CPUs (mostly) to create tensors for computations on GPUs

– Online preprocessing is inevitably repeated multiple times during each training run (e.g., augmentation is a random operation that randomly tweaks each element to create diverse data and improve model accuracy)
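The following toy example shows why the result of online preprocessing cannot simply be computed once and stored. It assumes a 4x4 "image" encoded as 16 raw bytes. `decode` and `augment` are hypothetical stand-ins for real decoders and augmentation libraries.

```python
import random

def decode(raw):
    """Deterministic step: turn raw bytes into a 4x4 grid of pixel values."""
    width = 4
    return [list(raw[i:i + width]) for i in range(0, len(raw), width)]

def augment(image, rng):
    """Random step: flip horizontally with probability 0.5, then crop 2x2."""
    if rng.random() < 0.5:
        image = [row[::-1] for row in image]
    top = rng.randrange(len(image) - 1)
    left = rng.randrange(len(image[0]) - 1)
    return [row[left:left + 2] for row in image[top:top + 2]]

raw = bytes(range(16))   # stand-in for one encoded training sample
rng = random.Random(0)
for epoch in range(3):
    print(epoch, augment(decode(raw), rng))
```

The decoded image is the same in every epoch, so it could be cached. The augmented tensor usually differs, so that step has to run again each time the sample is used.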

(3) Model Training

– The weights of the model are updated according to forward & backward computations on the GPUs

– Multiple epochs are performed for higher accuracy by iterating over the same dataset multiple times

– Augmentation is required to prevent the model from overfitting to the same dataset
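The four training stages (forward propagation, loss calculation, backward propagation, weight update) can be illustrated with a deliberately tiny model. The sketch assumes a one-parameter linear model y = w * x trained with minibatch gradient descent in plain Python on the CPU. The function `train_step` and the learning rate are choices made for this example only.

```python
def train_step(w, batch, lr):
    """One iteration on a minibatch of (x, y) pairs for the model y_hat = w * x."""
    # Forward propagation
    preds = [w * x for x, _ in batch]
    # Loss calculation (mean squared error)
    loss = sum((p - y) ** 2 for p, (_, y) in zip(preds, batch)) / len(batch)
    # Backward propagation: gradient of the loss with respect to w
    grad = sum(2 * (p - y) * x for p, (x, y) in zip(preds, batch)) / len(batch)
    # Weight update
    return w - lr * grad, loss

data = [(x, 3.0 * x) for x in (1.0, 2.0, 3.0, 4.0)]   # ground truth: w = 3
w = 0.0
for epoch in range(50):                    # multiple epochs over the same data
    for i in range(0, len(data), 2):       # minibatches of size 2
        w, loss = train_step(w, data[i:i + 2], lr=0.02)
print(f"learned w = {w:.6f}")
```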

Observations

– GPUs evolve at a crazy pace (and keep doing so; this is actually not the problem)

– The time spent on CPUs for preprocessing can be longer than that spent on GPUs for tensor computations!

→ Expensive GPUs wait for CPU computations → GPU underutilization

– Even worse: the shift towards data-centric AI requires more advanced data preprocessing such as cleaning and augmentation

NVIDIA Rubin System

Large-Scale Deep Recommendation Model Training @ Meta

– Storage and online preprocessing can already consume more power than the actual GPU trainers themselves in Meta’s datacenters

– This directly constrains training capacity due to fixed datacenter power budgets

Storage: Exabytes, BW: Tbps

[10] Zhao et al. - “Understanding data storage and ingestion for large-scale deep recommendation model training: industrial product” (ISCA’22), 2022.

Analyzing and Mitigating Data Stalls in DNN Training

Data Stalls

– Fetch Stalls (I/O bound)

– Prep Stalls (CPU bound)

[5] Mohan et al. - “Analyzing and Mitigating Data Stalls in DNN Training” (VLDB’21), 2021.
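A back-of-the-envelope model helps tell the two stall types apart. The sketch below assumes that fetching, preprocessing, and GPU compute overlap perfectly, so the slowest stage sets the time per iteration. This is a simplification for illustration, not the measurement method of [5], and `classify_stall` is a hypothetical helper.

```python
def classify_stall(fetch_s, prep_s, gpu_s):
    """Classify the bottleneck of a fully pipelined input pipeline.

    Returns the stall type and the fraction of each iteration
    during which the GPU waits for data.
    """
    iteration_s = max(fetch_s, prep_s, gpu_s)   # slowest stage dominates
    stall_s = iteration_s - gpu_s               # time the GPU sits idle
    if stall_s == 0:
        kind = "no stall"
    elif fetch_s >= prep_s:
        kind = "fetch stall (I/O bound)"
    else:
        kind = "prep stall (CPU bound)"
    return kind, stall_s / iteration_s

# Example per-iteration stage times in seconds (made-up numbers)
print(classify_stall(fetch_s=0.10, prep_s=0.30, gpu_s=0.20))
print(classify_stall(fetch_s=0.40, prep_s=0.10, gpu_s=0.20))
```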

Reducing the Overhead of Preprocessing

Selective Caching

– Store data items in local storage / remote memory (e.g., memory pools via RDMA)

Offload preprocessing to specialized hardware (FPGAs and GPUs)

– Users must manually convert CPU operations to the limited operations supported by the specialized hardware, which is non-trivial work

– In addition, it is difficult to offload general operations (e.g., user-defined functions or third-party libraries)

Offload (selected) preprocessing tasks to remote storage

DNN Checkpointing: What is it?

Learned model parameters are written to persistent storage every so often during training for fault tolerance

– Large-scale DNNs can take several days or even weeks to train across tens to thousands of hardware accelerators

– The job resumes training from the state captured in the most recent persistent checkpoint (also for debugging purposes)

[Figure: only the labels “3B model” and “optimizer state” survive extraction]
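The checkpoint and resume cycle can be sketched with the Python standard library. This is a minimal illustration under simplifying assumptions: the training state is a small dictionary, `pickle` stands in for a framework's serializer, and the names `save_checkpoint`, `load_checkpoint`, and `train` are hypothetical.

```python
import os
import pickle
import tempfile

def save_checkpoint(path, state):
    """Persist `state` atomically: write a temp file, fsync it, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())       # force the data to stable storage
    os.replace(tmp_path, path)     # readers see either the old or the new file

def load_checkpoint(path):
    """Return the most recent checkpoint, or None if none exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)

def train(path, total_iterations, checkpoint_every):
    """Resume from the last checkpoint (if any) and train to completion."""
    state = load_checkpoint(path) or {"iteration": 0, "weights": [0.0]}
    while state["iteration"] < total_iterations:
        state["weights"][0] += 0.1         # stand-in for a real training step
        state["iteration"] += 1
        if state["iteration"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state
```

If a job started with `train("ckpt.pkl", 5, 2)` fails after iteration 5, a restart resumes from iteration 4 and loses one iteration of work. Frequent checkpoints shrink that loss but cost more time in `save_checkpoint`.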

DNN Checkpointing: Sources of Failures

(1) Scale of hardware deployments and training time continue to grow

(2) Using cheaper cloud resources, such as spot VMs, to lower costs also dramatically increases the frequency of failures (spot VMs can be preempted)

(3) Failures include hardware, network, and power failures, as well as software bugs and out-of-memory issues

Numbers

– A study from Meta shows that 50% of ML training jobs encounter a failure within less than 16 minutes of execution

– Microsoft reports a 45-minute mean time between job failures in a multi-tenant GPU cluster

– Thorpe et al. found that a GPU cluster of 64 spot VMs in AWS EC2 experienced 127 distinct preemption events in 24 hours

DNN Checkpointing

Traditional checkpointing (e.g., PyTorch, TensorFlow) [8]

– Tx and Ux: model training and update steps for iteration x

– Px: time to persist state to storage

– Cx: time to copy checkpoint state from GPU to DRAM

Improved checkpointing (CheckFreq) [4, 8]

[4] Mohan, Phanishayee, and Chidambaram - “CheckFreq: Frequent, Fine-Grained DNN Checkpointing” (FAST’21), 2021.

[8] Strati, Friedman, and Klimovic - “PCcheck: Persistent Concurrent Checkpointing for ML” (ASPLOS’25), 2025.
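The idea of splitting a checkpoint into a short copy phase (C) and a background persist phase (P) can be sketched as follows. This is an illustration in the spirit of the improved scheme, not the CheckFreq or PCcheck implementation. It assumes CPU-resident state, `copy.deepcopy` as the snapshot, a Python thread as the background writer, and at most one checkpoint in flight. The class name `AsyncCheckpointer` is hypothetical.

```python
import copy
import os
import pickle
import tempfile
import threading

class AsyncCheckpointer:
    """Two-phase checkpoint: blocking in-memory snapshot, background persist."""

    def __init__(self, path):
        self.path = path
        self._writer = None

    def wait(self):
        """Block until the checkpoint in flight (if any) is on storage."""
        if self._writer is not None:
            self._writer.join()
            self._writer = None

    def checkpoint(self, state):
        self.wait()                          # at most one checkpoint in flight
        snapshot = copy.deepcopy(state)      # C: short stall while copying
        self._writer = threading.Thread(target=self._persist, args=(snapshot,))
        self._writer.start()                 # P: overlaps with later iterations

    def _persist(self, snapshot):
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "wb") as f:
            pickle.dump(snapshot, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)      # old checkpoint stays valid until now

with tempfile.TemporaryDirectory() as d:
    ckpt = AsyncCheckpointer(os.path.join(d, "ckpt.pkl"))
    state = {"iteration": 0, "weights": [0.0]}
    for _ in range(6):
        state["weights"][0] += 0.1           # training continues while P runs
        state["iteration"] += 1
        if state["iteration"] % 2 == 0:
            ckpt.checkpoint(state)
    ckpt.wait()
    with open(ckpt.path, "rb") as f:
        restored = pickle.load(f)
print(restored["iteration"])
```

Training only stalls for the in-memory copy. If persisting takes longer than the interval between checkpoints, the `wait()` call at the start of `checkpoint` becomes the bottleneck, which is the limitation that concurrent checkpointing targets.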

PCcheck without pipelining [8]

PCcheck with pipelining [8]

[8] Strati, Friedman, and Klimovic - “PCcheck: Persistent Concurrent Checkpointing for ML” (ASPLOS’25), 2025.

Modern Storage Technologies

Key Questions (at least for this talk!):

How do we access data faster?

How do we avoid moving data?

How do we store data economically (in the long term)?

How do we access data faster?

Where is Time Spent When Accessing Storage?

[Figure: storage access stack with the layers Application, DRAM, POSIX, Operating System (block layer, FS layer, etc.), Protocol (e.g., SATA), and Storage Device]

What can we do about this?

[Figure: the same storage access stack, annotated in successive builds with NAND Flash SSDs!, NVMe!, Kernel Bypass!, and Async I/O!]

NAND Flash SSDs

– Low latency (μs), high throughput (GB/s), large number of IOPS

– Massive inherent parallelism

[3] Kuschewski et al. - “High-Performance Query Processing with NVMe Arrays: Spilling without Killing Performance” (SIGMOD’25), 2024.

[1] Cai et al. - “Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery” (arXiv), 2017.
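To benefit from the device's internal parallelism, the host has to keep many requests in flight. The sketch below is one simple way to do that from Python, assuming a Unix system (`os.pread` is not available on Windows) and a thread pool as a stand-in for a deeper I/O queue. The helper `read_blocks` and the 4 KiB block size are choices made for this example. Reads go through the page cache here, so the demo shows the access pattern rather than real device performance.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

BLOCK = 4096

def read_blocks(path, block_ids, queue_depth):
    """Read the given 4 KiB blocks with up to `queue_depth` requests in flight."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=queue_depth) as pool:
            # os.pread takes an explicit offset, so threads can share one descriptor
            return list(pool.map(lambda b: os.pread(fd, BLOCK, b * BLOCK), block_ids))
    finally:
        os.close(fd)

# Demo file: block i is filled with byte value i
with tempfile.NamedTemporaryFile(delete=False) as f:
    for i in range(64):
        f.write(bytes([i]) * BLOCK)
    demo_path = f.name

blocks = read_blocks(demo_path, [63, 0, 17, 42], queue_depth=4)
first_bytes = [b[0] for b in blocks]
os.remove(demo_path)
print(first_bytes)
```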

NVMe Protocol

NVMe (Non-Volatile Memory Express) is a high-speed storage protocol designed specifically for solid-state drives (SSDs) to communicate with a computer’s CPU more efficiently

→ Lower latency, higher IOPS & throughput

General Working
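The general working of NVMe revolves around paired submission and completion queues. The toy model below only illustrates that flow. It is not the wire protocol: real queues are ring buffers in host memory with head and tail doorbell registers, and the class `QueuePair`, its methods, and the dictionary used as "media" are all invented for this sketch.

```python
from collections import deque

class QueuePair:
    """Toy model of one NVMe submission/completion queue pair."""

    def __init__(self, depth):
        self.depth = depth
        self.sq = deque()    # submission queue: commands written by the host
        self.cq = deque()    # completion queue: results written by the device
        self.doorbell = 0    # number of commands the host has announced

    def submit(self, command_id, lba):
        """Host places a read command for logical block `lba` into the queue."""
        if len(self.sq) >= self.depth:
            raise RuntimeError("submission queue full")
        self.sq.append((command_id, lba))

    def ring_doorbell(self):
        """Host announces all queued commands to the device in one step."""
        self.doorbell = len(self.sq)

    def device_process(self, media):
        """Device fetches announced commands, executes them, posts completions."""
        for _ in range(self.doorbell):
            command_id, lba = self.sq.popleft()
            self.cq.append((command_id, media[lba]))
        self.doorbell = 0

    def reap(self):
        """Host collects all completions that are currently available."""
        done = list(self.cq)
        self.cq.clear()
        return done

media = {0: b"block-0", 8: b"block-8"}
qp = QueuePair(depth=4)
qp.submit(command_id=1, lba=8)
qp.submit(command_id=2, lba=0)
qp.ring_doorbell()          # one notification covers both commands
qp.device_process(media)
completions = qp.reap()
print(completions)
```

Batching several commands behind one doorbell write, and giving each core its own queue pair, is a large part of how the protocol supports high IOPS.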

Performance overheads induced by OS stack

Figure: Kernel’s latency overhead with 512 B random reads. HDD is Seagate Exos X16, NAND is Intel Optane 750LSNAND, NVM-1 is first generation Intel Optane SSD (900P), and NVM-2 is second generation Intel Optane SSD (P5800X).

Layer             Latency     Share
kernel crossing   351 ns      5.6%
read syscall      199 ns      3.2%
ext4              2,006 ns    32.0%
bio               379 ns      6.0%
NVMe driver       113 ns      1.8%
storage device    3,224 ns    51.4%
total             6.27 μs     100.0%

Table: Average latency breakdown of a 512 B random read() syscall using Intel Optane SSD gen 2

→ Avoid traversing the kernel’s storage stack and moving data back and forth between the kernel and user space when issuing dependent storage requests

[11] Zhong et al. - “BPF for storage: an exokernel-inspired approach” (HotOS’21), 2021.
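The table's message becomes clearer when all software layers are added up. The short calculation below uses only the numbers from the table. Grouping everything except "storage device" as software overhead is an interpretation made for this example.

```python
latency_ns = {
    "kernel crossing": 351,
    "read syscall": 199,
    "ext4": 2006,
    "bio": 379,
    "NVMe driver": 113,
    "storage device": 3224,
}

total_ns = sum(latency_ns.values())
software_ns = total_ns - latency_ns["storage device"]
software_share = 100 * software_ns / total_ns

print(f"total: {total_ns} ns, software: {software_ns} ns ({software_share:.1f}%)")
# prints: total: 6272 ns, software: 3048 ns (48.6%)
```

On this fast device, almost half of the time for a 512 B read is spent in software, which is why kernel bypass and cheaper I/O interfaces matter.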

Async I/O

io_uring

async, scalable, minimizes syscalls & kernel round trips

Feeding Data to the GPU

– CPU is responsible for coordination and data access

– CPU’s DRAM is used as bounce buffer

– unnecessary memory copies

– increased CPU overhead and increased latency

The Problem(s) of CPU-Centric Storage Access

– CPU as orchestrator between compute and storage access requires synchronization

– GPU kernels have to be launched multiple times

– CPU cannot accurately determine which parts of the data are needed (and when!)

CPU-Centric Storage Access [7]

[7] Qureshi et al. - “GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture” (ASPLOS’23), 2023.

Accelerator-Initiated I/O

BaM Model [7]

Big Accelerator Memory (BaM) Architecture [7]

[7] Qureshi et al. - “GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture” (ASPLOS’23), 2023.

How do we avoid moving data?

Data Movement Considered Harmful

Why do we move data?

– Data resides in deep storage hierarchies and has to be moved to compute resources

Why is this bad?

– Wastes bandwidth resources and energy

– Often (traditionally) involves CPU resources

What else could we gain?

– (Further) specialization of compute resources

– Higher degree of parallelism

– Freeing up CPU cycles

– Security & privacy

Computational Storage to the Rescue!

Computational Storage FTW (?)

What is Computational Storage?

Architectures that provide Computational Storage Functions coupled to storage, offloading host processing or reducing data movement.

These architectures enable improvements in application performance and/or infrastructure efficiency through the integration of compute resources (outside of the traditional compute & memory architecture) either directly with storage or between the host and the storage. The goal of these architectures is to enable parallel computation and/or to alleviate constraints on existing compute, memory, storage, and I/O.

Source: SNIA Dictionary, snia.org

(Selection of) Computational Storage Hardware

Computational Storage Architectures

[Figure: Comp. Storage Processor, Comp. Storage Device, Comp. Storage Array]

Examples for CSEEs (Computational Storage Engine Environments): OS, container, eBPF, FPGA bitstream

Source: SNIA, Computational Storage Architecture and Programming Model v1.0

λ-IO Architecture & Details

[9] Yang et al. - “λ-IO: a unified IO stack for computational storage” (FAST’23), 2023.

λ-IO CSD Results

Simple Applications

TPC-H, Spark SQL (filter + proj. only)

[9] Yang et al. - “λ-IO: a unified IO stack for computational storage” (FAST’23), 2023.
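Why are filter and projection good candidates for offloading? The sketch below counts the bytes that cross the storage interface in both cases. It is a plain Python simulation, not the λ-IO API: the row layout `RECORD`, the functions `host_side` and `in_storage`, and the predicate are hypothetical.

```python
import struct

RECORD = struct.Struct("<iif")   # hypothetical row layout: id, quantity, price

def make_table(num_rows):
    """Build a packed table whose quantity column cycles through 0..49."""
    return b"".join(RECORD.pack(i, i % 50, float(i)) for i in range(num_rows))

def host_side(table, max_quantity):
    """Baseline: ship every row to the host, then filter and project there."""
    moved = len(table)
    ids = [row[0] for row in RECORD.iter_unpack(table) if row[1] < max_quantity]
    return ids, moved

def in_storage(table, max_quantity):
    """Offload: the device filters and projects, and ships only matching ids."""
    result = b"".join(struct.pack("<i", row[0])
                      for row in RECORD.iter_unpack(table)
                      if row[1] < max_quantity)
    ids = [value[0] for value in struct.iter_unpack("<i", result)]
    return ids, len(result)

table = make_table(1000)
host_ids, host_bytes = host_side(table, max_quantity=5)
csd_ids, csd_bytes = in_storage(table, max_quantity=5)
print(f"host-side: {host_bytes} B moved, in-storage: {csd_bytes} B moved")
```

Both paths return the same ids, but the offloaded version moves only the projected column of the matching rows. The saving depends on the selectivity of the filter and on how narrow the projection is.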

In-Storage Transparent Compression

– Build hardware-accelerated, transparent compression into the SSD

– Compression is applied on level (fixed compression scheme, compression cannot be steered by application)

Benefits

+ Saves storage capacity

+ Frees up CPU

+ No modifications to application necessary

Caveats

– No control over compression scheme

[2] Huang et al. - “Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives” (VLDB’24), 2023.

Example Application: B+-tree compression

– B+-tree is the most widely used indexing data structure

– Suffers from higher storage space usage and higher write amplification (compared to LSM-trees, for small-sized records)

[6] Qiao et al. - “Closing the B+-tree vs. LSM-tree Write Amplification Gap on Modern Storage Hardware with Built-in Transparent Compression” (FAST’22), 2022.
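One intuition for why transparent compression helps B+-trees is that partially filled pages contain a lot of padding, and padding compresses to almost nothing. The sketch below assumes software `zlib` as a stand-in for the drive's hardware compressor, random bytes as incompressible record data, and zero bytes as the unused part of a 4 KiB page. `make_page` and `physical_size` are illustrative helpers.

```python
import random
import zlib

PAGE_SIZE = 4096

def make_page(fill_fraction, seed=0):
    """Build a page with random (incompressible) records and zero padding."""
    rng = random.Random(seed)
    used = int(PAGE_SIZE * fill_fraction)
    payload = bytes(rng.getrandbits(8) for _ in range(used))
    return payload + bytes(PAGE_SIZE - used)

def physical_size(page):
    """Bytes that would actually be stored after compression."""
    return len(zlib.compress(page))

for fill in (1.0, 0.5, 0.25):
    size = physical_size(make_page(fill))
    print(f"fill {fill:.2f}: logical {PAGE_SIZE} B, compressed {size} B")
```

The logical page size stays at 4 KiB, but the stored size tracks the used portion of the page. A drive that compresses internally therefore hides much of the space and write overhead of sparse pages from the flash.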


Challenges faced by Computational Storage

– No real “Killer App” yet

– Real concerns & perceived issues with complexity

– Low complexity, good benefits

– High complexity, high benefits

– Standardization (→ being addressed)

– Security (→ being worked on)

– Aligning the ecosystem

– Lack of access to hardware and prototypes

How do we store data economically (in the long term)?

Storage Hierarchy

– Storage landscape is diversifying

– Deep stora
