AI-generated visual representation of a storage system for AI
Marcus Paradies (LMU)
Modern Storage Technologies for the AI Era
March 19, 2025 1/47
Modern Storage Technologies for the AI Era
Marcus Paradies
March 19, 2025
Outline
(1) For which tasks do we need storage in AI?
(2) Which storage technologies could we use for that?
Disclaimer: I will mostly talk about Storage4AI and not AI4Storage
Storage Requirements in Deep Learning
Training Deep Neural Networks
– A DNN is trained over multiple rounds termed epochs
– Each epoch processes all items in the dataset exactly once, and consists of multiple iterations
– Each iteration processes a random, disjoint subset of the data termed a minibatch
– The DNN is trained until a target accuracy is reached
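The epoch and minibatch structure described above can be sketched in a few lines of Python. This is only an illustration under assumptions: `minibatches`, `train`, the `step` callback (which stands in for one real training iteration and returns the current accuracy), and the `max_epochs` safety limit are hypothetical names that do not come from the talk.

```python
import random

def minibatches(num_items, batch_size, rng):
    """Yield disjoint, randomly ordered index lists that cover every item exactly once."""
    indices = list(range(num_items))
    rng.shuffle(indices)
    for start in range(0, num_items, batch_size):
        yield indices[start:start + batch_size]

def train(num_items, batch_size, target_accuracy, step, max_epochs=100, seed=0):
    """Run epochs until step() reports the target accuracy (or max_epochs is hit)."""
    rng = random.Random(seed)
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        # One epoch: every item is visited once, split into several iterations.
        for batch in minibatches(num_items, batch_size, rng):
            accuracy = step(batch)  # hypothetical callback: one training iteration
        if accuracy >= target_accuracy:
            return epoch, accuracy
    return max_epochs, accuracy
```

Because the indices are reshuffled at the start of every epoch, the same item lands in a different minibatch each time, which is why storage sees a random access pattern over the whole dataset.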
Figure: training pipeline. Data Loading (Local/Remote Storage, Cache) on CPU → Online Data Preparation (Decoding, Transform, Augmentation) on CPU → Model Training (Forward Prop., Loss Calc., Back Prop., Weight upd.) on GPU
(1) Data Loading
– Data loaded from local (e.g., SSD) or remote storage (e.g., S3)
– Caching data might help (but requires special caching policies due to dataset sizes)
(2) Online Data Preparation
– Input data are decoded, preprocessed, and augmented on CPUs (mostly) to create tensors for computations on GPUs
– Online preprocessing is inevitably repeated multiple times for each training run (e.g., augmentation is a random operation that tweaks each element to create diverse data and improve model accuracy)
(3) Model Training
– The weights of the model are updated according to forward & backward computations on the GPUs
– Multiple epochs are performed for higher accuracy by iterating over the same dataset multiple times
– Augmentation is required to prevent the model from overfitting on the same dataset
Observations
– GPUs evolve at a crazy pace (and keep doing so; this is actually not the problem)
– The time spent on CPUs for preprocessing can be longer than that spent on GPUs for tensor computations!
→ Expensive GPUs wait for CPU computations → GPU underutilization
– Even worse: the shift towards data-centric AI requires more advanced data preprocessing such as cleaning and augmentation
NVIDIA Rubin System
Large-Scale Deep Recommendation Model Training @ Meta
– Storage and online preprocessing can already consume more power than the actual GPU trainers themselves in Meta’s datacenters
– This directly constrains training capacity due to fixed datacenter power budgets
Storage: Exabytes, BW: Tbps
[10] Zhao et al. - “Understanding data storage and ingestion for large-scale deep recommendation model training: industrial product” (ISCA’22), 2022.
Analyzing and Mitigating Data Stalls in DNN Training
Data Stalls
– Fetch Stalls (I/O bound)
– Prep Stalls (CPU bound)
[5] Mohan et al. - “Analyzing and Mitigating Data Stalls in DNN Training” (VLDB’21), 2021.
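The distinction between fetch stalls and prep stalls can be made concrete with a small model. The sketch below assumes that fetching, preprocessing, and GPU compute are perfectly overlapped, so that one iteration takes as long as its slowest stage; this is a simplification for illustration, not the measurement methodology of the cited paper, and `classify_iteration` is a hypothetical helper.

```python
def classify_iteration(fetch_ms, prep_ms, gpu_ms):
    """Classify one fully pipelined iteration by its slowest stage.

    Returns (kind, stall_ms, gpu_utilization), where stall_ms is the time
    the GPU spends waiting for data in this iteration.
    """
    iteration_ms = max(fetch_ms, prep_ms, gpu_ms)
    if iteration_ms == gpu_ms:
        kind = "gpu bound"      # data always arrives in time, no data stall
    elif fetch_ms >= prep_ms:
        kind = "fetch stall"    # I/O is the slowest stage
    else:
        kind = "prep stall"     # CPU preprocessing is the slowest stage
    return kind, iteration_ms - gpu_ms, gpu_ms / iteration_ms
```

For example, with 2 ms of fetching, 8 ms of preprocessing, and 5 ms of GPU work per minibatch, the model reports a prep stall of 3 ms and a GPU utilization of 62.5%.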
Reducing the Overhead of Preprocessing
Selective Caching
– Store data items in local storage / remote memory (e.g., memory pools via RDMA)
Offload preprocessing to specialized hardware (FPGAs and GPUs)
– Users must manually convert CPU operations to the limited operations supported by the specialized hardware, which is non-trivial work
– In addition, it is difficult to offload general operations (e.g., user-defined functions or third-party libraries)
Offload (selected) preprocessing tasks to remote storage
DNN Checkpointing: What is it?
Learned model parameters are written to persistent storage every so often during training for fault tolerance
– Large-scale DNNs can take several days or even weeks to train across tens to thousands of hardware accelerators
– The job resumes training from the state captured in the most recent persistent checkpoint (also for debugging purposes)
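A minimal sketch of periodic checkpointing and resuming, assuming a toy model whose state is a list of floats stored as JSON. The function names, the file layout, and the stand-in training step are hypothetical; real frameworks serialize tensors and optimizer state instead.

```python
import json
import os
import tempfile

def save_checkpoint(path, iteration, weights):
    """Write the state to a temp file, force it to disk, then atomically rename it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump({"iteration": iteration, "weights": weights}, f)
        f.flush()
        os.fsync(f.fileno())
    # After the rename, readers see either the old or the new checkpoint, never a partial one.
    os.replace(tmp_path, path)

def load_checkpoint(path):
    """Return (iteration, weights) from the latest checkpoint, or a fresh start."""
    if not os.path.exists(path):
        return 0, [0.0]
    with open(path) as f:
        state = json.load(f)
    return state["iteration"], state["weights"]

def train(path, total_iterations, checkpoint_every):
    """Resume from the last checkpoint and persist state every checkpoint_every iterations."""
    iteration, weights = load_checkpoint(path)
    while iteration < total_iterations:
        weights = [w + 1.0 for w in weights]  # stand-in for a real training step
        iteration += 1
        if iteration % checkpoint_every == 0:
            save_checkpoint(path, iteration, weights)
    return iteration, weights
```

If the job fails at iteration 7 with `checkpoint_every=3`, a restart resumes from iteration 6, so at most `checkpoint_every - 1` iterations of work are lost. That is the trade-off between checkpoint frequency and recomputation.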
DNN Checkpointing: Sources of Failures
(1) Scale of hardware deployments and training time continue to grow
(2) Using cheaper cloud resources, such as spot VMs, to lower costs also dramatically increases the frequency of failures (spot VMs can be preempted)
(3) Failures include hardware, network, and power failures, as well as software bugs and out-of-memory issues
Numbers
– A study from Meta shows that 50% of ML training jobs encounter a failure within less than 16 minutes of execution
– Microsoft reports a 45-minute mean time between job failures in a multi-tenant GPU cluster
– Thorpe et al. found that a GPU cluster of 64 spot VMs in AWS EC2 experienced 127 distinct preemption events in 24 hours
DNN Checkpointing
Traditional checkpointing (e.g., PyTorch, TensorFlow) [8]
– Tx and Ux: model training and update steps for iteration x
– Px: time to persist state to storage
– Cx: time to copy checkpoint state from GPU to DRAM
Improved checkpointing (CheckFreq) [4, 8]
[4] Mohan, Phanishayee, and Chidambaram - “CheckFreq: Frequent, Fine-Grained DNN Checkpointing” (FAST’21), 2021.
[8] Strati, Friedman, and Klimovic - “PCcheck: Persistent Concurrent Checkpointing for ML” (ASPLOS’25), 2025.
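The general idea of separating the copy step (C) from the persist step (P), so that persisting overlaps with later training iterations, can be sketched with a background thread. This is an illustrative sketch under assumptions and not the algorithm of CheckFreq or PCcheck: `BackgroundCheckpointer`, `slow_persist`, and the dictionary used as model state are hypothetical, and `copy.deepcopy` stands in for the GPU-to-DRAM copy.

```python
import copy
import threading
import time

class BackgroundCheckpointer:
    """Snapshot in the training thread (C), persist in a background thread (P)."""

    def __init__(self, persist_fn):
        self._persist_fn = persist_fn
        self._worker = None

    def checkpoint(self, iteration, state):
        self.wait()                      # allow at most one persist in flight
        snapshot = copy.deepcopy(state)  # C: training pauses only for this copy
        self._worker = threading.Thread(
            target=self._persist_fn, args=(iteration, snapshot))
        self._worker.start()             # P: overlaps with the next iterations

    def wait(self):
        if self._worker is not None:
            self._worker.join()
            self._worker = None

persisted = []

def slow_persist(iteration, snapshot):
    time.sleep(0.01)                     # stand-in for a slow storage write
    persisted.append((iteration, snapshot))

def run(num_iterations, checkpoint_every):
    state = {"w": 0}
    ckpt = BackgroundCheckpointer(slow_persist)
    for it in range(1, num_iterations + 1):
        state["w"] += 1                  # stand-in for the T_x and U_x steps
        if it % checkpoint_every == 0:
            ckpt.checkpoint(it, state)
    ckpt.wait()                          # make sure the last checkpoint is durable
    return state
```

The snapshot isolates the persisted state from later weight updates, so training can continue while the write is still in progress. In this sketch, training stalls only if a new checkpoint is requested before the previous persist has finished.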
PCcheck without pipelining [8]
PCcheck with pipelining [8]
[8] Strati, Friedman, and Klimovic - “PCcheck: Persistent Concurrent Checkpointing for ML” (ASPLOS’25), 2025.
Modern Storage Technologies
Modern Storage Technologies
Key Questions (at least for this talk!):
How do we access data faster?
How do we avoid moving data?
How do we store data economically (in the long term)?
How do we access data faster?
Where is Time Spent When Accessing Storage?
Figure: storage access stack with the layers Application (DRAM), POSIX, Operating System (block layer, FS layer, etc.), Protocol (e.g., SATA), and Storage Device
What can we do about this?
Figure: the same storage access stack, annotated step by step with “NAND Flash SSDs!”, “NVMe!”, “Kernel Bypass!”, and “Async I/O!”
NAND Flash SSDs
– Low latency (μs), high throughput (GB/s), large number of IOPS
– Massive inherent parallelism
[3] Kuschewski et al. - “High-Performance Query Processing with NVMe Arrays: Spilling without Killing Performance” (SIGMOD’25), 2024.
[1] Cai et al. - “Errors in Flash-Memory-Based Solid-State Drives: Analysis, Mitigation, and Recovery” (arXiv), 2017.
NVMe Protocol
NVMe (Non-Volatile Memory Express) is a high-speed storage protocol designed specifically for solid-state drives (SSDs) to communicate with a computer’s CPU more efficiently
→ Lower latency, higher IOPS & throughput
General Working
Performance overheads induced by OS stack
Figure: Kernel’s latency overhead with 512 B random reads. HDD is Seagate Exos X16, NAND is Intel Optane 750 LSNAND, NVM-1 is first generation Intel Optane SSD (900P), and NVM-2 is second generation Intel Optane SSD (P5800X).
Component         Latency     Share
kernel crossing   351 ns      5.6%
read syscall      199 ns      3.2%
ext4              2,006 ns    32.0%
bio               379 ns      6.0%
NVMe driver       113 ns      1.8%
storage device    3,224 ns    51.4%
total             6.27 μs     100.0%
Table: Average latency breakdown of a 512 B random read() syscall using Intel Optane SSD gen 2
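The shares in the table follow directly from the per-component latencies. As a quick check, the short script below recomputes them and adds up everything that is not the storage device itself; the variable names are chosen here for illustration only.

```python
# Per-component latencies in nanoseconds, copied from the table.
latency_ns = {
    "kernel crossing": 351,
    "read syscall": 199,
    "ext4": 2006,
    "bio": 379,
    "NVMe driver": 113,
    "storage device": 3224,
}

total_ns = sum(latency_ns.values())  # 6272 ns, i.e. about 6.27 μs
share = {name: round(100 * ns / total_ns, 1) for name, ns in latency_ns.items()}

# Everything except the device itself is software overhead in the kernel path.
software_ns = total_ns - latency_ns["storage device"]
software_share = round(100 * software_ns / total_ns, 1)

print(total_ns, software_ns, software_share)  # 6272 3048 48.6
```

In other words, with a device this fast, the software stack accounts for almost half (48.6%) of the end-to-end read latency, with ext4 alone contributing about a third.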
→ Avoid traversing the kernel’s storage stack and moving data back and forth between the kernel and user space when issuing dependent storage requests
[11] Zhong et al. - “BPF for storage: an exokernel-inspired approach” (HotOS’21), 2021.
Async I/O
io_uring
async, scalable, minimizes syscalls & kernel round trips
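The Python standard library has no io_uring binding, so the sketch below does not use io_uring. It only illustrates the underlying idea of keeping several reads in flight and collecting completions as they arrive, using a thread pool and `os.pread` (Unix only). Each `os.pread` call is still one syscall, so the syscall-batching benefit of io_uring is not reproduced here. `read_blocks` and `demo` are hypothetical helper names.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed

BLOCK = 4096

def read_blocks(path, block_ids, max_in_flight=8):
    """Submit all block reads up front, then collect completions in any order."""
    fd = os.open(path, os.O_RDONLY)
    try:
        with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
            # "Submission": one positional read per requested block.
            pending = {pool.submit(os.pread, fd, BLOCK, b * BLOCK): b
                       for b in block_ids}
            # "Completion": results are gathered as soon as each read finishes.
            return {pending[f]: f.result() for f in as_completed(pending)}
    finally:
        os.close(fd)

def demo():
    """Create a 4-block file where block i is filled with byte i, then read 3 blocks."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "blocks.bin")
        with open(path, "wb") as f:
            for i in range(4):
                f.write(bytes([i]) * BLOCK)
        return read_blocks(path, [3, 0, 2])
```

Calling `demo()` returns a dictionary keyed by block number. The order in which the reads complete is not fixed, which mirrors how asynchronous interfaces decouple submission order from completion order.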
Feeding Data to the GPU
– CPU is responsible for coordination and data access
– CPU’s DRAM is used as bounce buffer
– Unnecessary memory copies
– Increased CPU overhead and increased latency
The Problem(s) of CPU-Centric Storage Access
– CPU as orchestrator between compute and storage access requires synchronization
– GPU kernels have to be launched multiple times
– CPU cannot accurately determine which parts of the data are needed (and when!)
CPU-Centric Storage Access [7]
[7] Qureshi et al. - “GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture” (ASPLOS’23), 2023.
Accelerator-Initiated I/O
BaM Model [7]
Big Accelerator Memory (BaM) Architecture [7]
[7] Qureshi et al. - “GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture” (ASPLOS’23), 2023.
How do we avoid moving data?
Data Movement Considered Harmful
Why do we move data?
– Data resides in deep storage hierarchies and has to be moved to compute resources
Why is this bad?
– Wastes bandwidth resources and energy
– Traditionally often involves CPU resources
What else could we gain?
– (Further) specialization of compute resources
– Higher degree of parallelism
– Freeing up CPU cycles
– Security & privacy
Computational Storage to the Rescue!
Computational Storage FTW (?)
What is Computational Storage?
Architectures that provide Computational Storage Functions coupled to storage, offloading host processing or reducing data movement.
These architectures enable improvements in application performance and/or infrastructure efficiency through the integration of compute resources (outside of the traditional compute & memory architecture) either directly with storage or between the host and the storage. The goal of these architectures is to enable parallel computation and/or to alleviate constraints on existing compute, memory, storage, and I/O.
Source: SNIA Dictionary, snia.org
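The "reducing data movement" part of the definition can be illustrated with a toy comparison of where a filter runs. Everything in this sketch is an assumption made for illustration: the record layout, the use of JSON size as a stand-in for bytes crossing the interconnect, and the function names. It does not model any real computational storage device.

```python
import json

def make_records(n):
    """Hypothetical table with a numeric column and a wide text column."""
    return [{"id": i, "price": i % 100, "comment": "x" * 50} for i in range(n)]

def encoded_size(records):
    """Bytes needed to ship the records, using JSON as a stand-in wire format."""
    return len(json.dumps(records).encode("utf-8"))

def host_side_filter(stored, predicate):
    """Classic path: all records are moved to the host, then filtered there."""
    moved = encoded_size(stored)
    result = [r for r in stored if predicate(r)]
    return result, moved

def in_storage_filter(stored, predicate):
    """Computational storage path: the filter runs next to the data."""
    result = [r for r in stored if predicate(r)]
    return result, encoded_size(result)  # only matching records are moved
```

With 1,000 records and a predicate that keeps 5% of them, both paths return the same 50 records, but the in-storage variant ships only the matches across the interconnect.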
(Selection of) Computational Storage Hardware
Computational Storage Architectures
Comp. Storage Processor, Comp. Storage Device, Comp. Storage Array
Examples for CSEEs: OS, container, eBPF, FPGA bitstream
Source: SNIA, Computational Storage Architecture and Programming Model v1.0
λ-IO Architecture & Details
[9] Yang et al. - “λ-IO: a unified IO stack for computational storage” (FAST’23), 2023.
λ-IO CSD Results
Simple Applications
TPC-H, Spark SQL (filter + proj. only)
[9] Yang et al. - “λ-IO: a unified IO stack for computational storage” (FAST’23), 2023.
In-Storage Transparent Compression
– Build hardware-accelerated, transparent compression into the SSD
– Compression uses a fixed scheme and cannot be steered by the application
Benefits
+ Saves storage capacity
+ Frees up CPU
+ No modifications to the application necessary
Caveats
– No control over compression scheme
[2] Huang et al. - “Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives” (VLDB’24), 2023.
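What "transparent" means from the application's point of view can be shown with a toy block store. This is a sketch under assumptions: `CompressingBlockStore` is a hypothetical class, and software `zlib` stands in for the drive's hardware compression engine. The caller reads and writes plain fixed-size blocks and has no way to select or tune the compression scheme.

```python
import zlib

class CompressingBlockStore:
    """Toy block store: callers see plain 4 KiB blocks; compression is hidden inside."""

    BLOCK_SIZE = 4096

    def __init__(self):
        self._blocks = {}  # logical block address -> compressed bytes

    def write(self, lba, data):
        if len(data) != self.BLOCK_SIZE:
            raise ValueError("writes must be exactly one block")
        self._blocks[lba] = zlib.compress(data)  # fixed scheme, not caller-selectable

    def read(self, lba):
        return zlib.decompress(self._blocks[lba])

    def logical_bytes(self):
        """Capacity as seen by the application."""
        return len(self._blocks) * self.BLOCK_SIZE

    def physical_bytes(self):
        """Capacity actually occupied after compression."""
        return sum(len(b) for b in self._blocks.values())
```

Writing a block of repetitive data and reading it back returns the identical bytes, while `physical_bytes()` stays well below `logical_bytes()`. The application code is the same as it would be for an uncompressed store.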
In-Storage Transparent Compression
Example Application: B+-tree compression
– B+-tree: most widely used indexing data structure
– Suffers from higher storage space usage and higher write amplification (compared to LSM-trees, for small-sized records)
[6] Qiao et al. - “Closing the B+-tree vs. LSM-tree Write Amplification Gap on Modern Storage Hardware with Built-in Transparent Compression” (FAST’22), 2022.
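Why small records lead to high write amplification in a page-based structure can be seen from simple arithmetic. The model below is a deliberate simplification made for illustration: it assumes that one record update rewrites exactly one page, it ignores logging and page splits, and the page size, record size, and compression ratio are hypothetical values. It is not the technique of the cited paper.

```python
def write_amplification(bytes_written_to_storage, bytes_of_user_data):
    """Ratio of bytes physically written to bytes the user actually changed."""
    return bytes_written_to_storage / bytes_of_user_data

def page_rewrite_amplification(page_size, record_size):
    """A whole page is rewritten to persist one updated record."""
    return write_amplification(page_size, record_size)

def compressed_page_amplification(page_size, record_size, compression_ratio):
    """Same update when the drive stores the page compressed by compression_ratio."""
    return write_amplification(page_size / compression_ratio, record_size)

print(page_rewrite_amplification(8192, 128))          # 64.0
print(compressed_page_amplification(8192, 128, 4.0))  # 16.0
```

Under these assumptions, updating a 128-byte record in an 8 KiB page writes 64 times more data than the user changed, and a drive that compresses the page 4:1 reduces the physically written amount to 16 times.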
Challenges faced by Computational Storage
– No real “Killer App” yet
– Real concerns & perceived issues with complexity
– Low complexity, good benefits
– High complexity, high benefits
– Standardization (→ being addressed)
– Security (→ being worked on)
– Aligning the ecosystem
– Lack of access to hardware and prototypes
How do we store data economically (in the long term)?
Storage Hierarchy
– Storage landscape is diversifying
– Deep stora