Morgan Stanley | RESEARCH | Global Idea
January 15, 2026 10:30 PM GMT
Global Technology
Memory – How to Play the New AI Bottleneck
Memory sits in a capacity-constrained cycle with unusually long order visibility driven by AI inference. For 2026, the risk is execution and transition, not demand. A steeper pricing climb and favourable conditions likely persist through 2027. Multiples have expanded, but we think stock calls can still work with much higher earnings upside from here.
Inference becomes a memory challenge, not just compute. Memory access increasingly determines the performance of longer text, image/video and Agentic AI workflows, with far more robust memory requirements than prior AI models to support context, autonomy and continuous learning. These systems require superior server DRAM and enterprise NAND to function effectively.
Memory cycle – a steeper pricing climb. Memory pricing power is shifting at lightning speed. We expect a steeper upcycle with rapid gains in DRAM, HBM, NAND, and legacy memory. Innovation and architectural redesign continue to improve memory efficiency, enabling AI systems to deliver better latency and cost profiles, materially enhancing user experience. This lowers the economic barrier to adoption and unlocks a significantly larger AI TAM, even as aggregate memory demand continues to scale with broader deployment. Our analysis suggests that text-only AI inference alone could account for 35% of 2026 global memory supply for DRAM and 92% for NAND.
What's changed? The key debate is now shifting to whether supply can catch up and where the true chokepoint sits, and therefore whether tightness and pricing power persist or normalize. Near-term price expectations matter less, in our view, but our channel checks indicate potential upside to already aggressive 70%+ QoQ hikes for both DRAM and NAND. Inventory levels continue to fall across the supply chain.
Capex acceleration is inevitable, focused on DRAM, and we do expect more meaningful greenfield expansions from 2027. The supply-demand gap for legacy memory is widening further for DDR4/3, NOR and SLC/MLC NAND, and we raise Price Targets across the board.
Bottlenecks are the winners – buy memory and semicap, especially EUV. We prefer higher pricing power in DRAM (Samsung, SK hynix, MU), legacy memory (Winbond), HDD (WDC), and capex benefits via SPE (ASML) and packaging (DISCO) vs. downstream hardware and consumer-facing margin pressure. Negative factors to watch include the impact of demand destruction for system/hardware device vendors and challenging YoY growth comparisons from 2H26 for memory.
Morgan Stanley & Co. International plc+
Shawn Kim
Equity Analyst
+44 20 7677-1018
Shawn.Kim@
Lee Simpson
Equity Analyst
+44 20 7425-3378
Lee.Simpson@
Nigel van Putten
Equity Analyst
+44 20 7425-2803
Nigel.Putten@
Morgan Stanley Asia Limited+
Duan Liu
Equity Analyst
+852 2239-7357
Duan.Liu@
Michelle Kim
Research Associate
+852 3963-0183
Michelle.Kim1@
Morgan Stanley & Co. International plc+
Amelia M Scicluna
Research Associate
+44 20 7425-6694
Amelia.Scicluna@
Morgan Stanley MUFG Securities Co., Ltd.+
Tetsuya Wadaki
Equity Analyst
+81 3 6836-8890
Tetsuya.Wadaki@
Morgan Stanley Taiwan Limited+
Charlie Chan
Equity Analyst
+886 2 2730-1725
Charlie.Chan@
Morgan Stanley & Co. LLC
Joseph Moore
Equity Analyst
+1 212 761-7516
Joseph.Moore@
Morgan Stanley Taiwan Limited+
Daniel Yen, CFA
Equity Analyst
+886 2 2730-2863
Daniel.Yen@
Morgan Stanley MUFG Securities Co., Ltd.+
Kazuo Yoshikawa, CFA
Equity Analyst
+81 3 6836-8408
Kazuo.Yoshikawa@
Morgan Stanley & Co. LLC
Erik W Woodring
Equity Analyst
+1 212 296-8083
Erik.Woodring@
Morgan Stanley does and seeks to do business with companies covered in Morgan Stanley Research. As a result, investors should be aware that the firm may have a conflict of interest that could affect the objectivity of Morgan Stanley Research. Investors should consider Morgan Stanley Research as only a single factor in making their investment decision.
For analyst certification and other important disclosures, refer to the Disclosure Section, located at the end of this report.
+= Analysts employed by non-U.S. affiliates are not registered with FINRA, may not be associated persons of the member and may not be subject to FINRA restrictions on communications with a subject company, public appearances and trading securities held by a research analyst account.
Dylan Liu
Research Associate
Dylan.Liu@
+1 212 761-4519
Mason Wayne
Research Associate
Mason.Wayne@
+1 212 761-6012
Morgan Stanley MUFG Securities Co., Ltd.
Suzune Tamura, CFA
Equity Analyst
Suzune.Tamura@
+81 3 6836-8891
Morgan Stanley Asia Limited+
Ethan Jia
Research Associate
Ethan.Jia@
+852 3963-2287
Morgan Stanley & Co. LLC
Shane Brett
Equity Analyst
Shane.Brett@
+1 212 761-1022
Technology - European Semiconductors
Europe
Industry View: In-Line
How to Play the AI Bottleneck?
Historic memory shortage – a precursor to a historic semiconductor production footprint. With DRAM prices now surpassing metals as a benchmark of scarcity, the memory sector enters a period of extended capacity constraint. Memory makers, along with advanced logic foundry manufacturers, must find ways to secure and manage supply chains for rapidly growing AI infrastructure consumption and ensure expansion in wafer manufacturing. The bottlenecks in the semiconductor industry become the winners in stock performance, and the knock-on effect on adjacent parts of semiconductors tends to be underestimated – the key bottleneck to AI has shifted to commodity DRAM and NAND from CoWoS and HBM. Next could be semiconductor equipment – in particular EUV (ASML Holding NV: Stronger Set-Up for 2027; Raise PT to €1,400). Hence our global top 10 picks to play the memory bottleneck:
• DRAM – Samsung and MU; we also like SK hynix
• Legacy Memory – Winbond
• Storage – WDC
• Advanced packaging – DISCO
• Semicap – AMAT, ASMI
• EUV – ASML
The nature of the correlation. The semicap (semiconductor capital equipment) and logic foundry cycles are closely and intrinsically correlated to the DRAM cycle, with the memory market acting as a primary driver of the broader semiconductor industry's cyclical nature. In the short run, there is no sign of an imminent peak in memory pricing and profitability. Add the recent strong TSMC capex guide and leading-edge foundry tightness, and capex for the industry is set to accelerate towards all-time highs by 2027-28. Semicap companies generally experience the upturn (and downturn) in profitability one to two quarters ahead of memory makers, as equipment orders are placed well in advance of production. However, stocks inflect around the same time, as illustrated in Exhibit 1. The share performance lag is significant and a material catch-up is more probable.
Exhibit 1: ASML vs. DRAM YoY performance – significant laggard
[Chart: YoY performance, ASML vs. DRAM; y-axis from -100% to 250%]
Source: FactSet, Morgan Stanley Research.
Note: Three DRAM companies = Samsung Electronics, Micron and SK hynix
Exhibit 2: ASML orders well below peak today
[Chart: ASML Quarterly Net Bookings (EUR millions), split into Memory and Logic, 1Q19 to 3Q25; y-axis from 0 to 10,000]
Source: Company Data, Morgan Stanley Research
Exhibit 3: Semicap exposure landscape
Source: Applied Materials, Morgan Stanley Research
EUV Lessons from History – Time to Play Chess, Not Checkers
ASML – Party Like It's 2010. The period between 2010-12 was extremely capacity-constrained in EUV lithography, and saw extremely limited volumes of production from ASML as the sole viable EUV tool supplier. As EUV tools were not yet production-worthy, they required co-development, risk-sharing and guaranteed demand to justify continued investment. In other words, access to EUV tools was determined less by price and more by risk tolerance and strategic commitment. We see a similar potential set-up with EUV capacity increasingly constrained into 2027-28e. What's different this time is that both SK hynix and Micron are using 6 layers of EUV for future 1c/gamma DRAM and have lots of cash on the balance sheet, competing for tools vs. Samsung and advanced logic foundry.
Samsung hoarding EUV tools. Samsung Electronics positioned itself as a lead EUV partner by placing early tool reservations, engaging in a deep engineering collaboration and expressing willingness to absorb potential yield risks. Samsung Electronics ended up absorbing a disproportionate share of early EUV tool availability, effectively 'crowding out' other logic players like TSMC and Intel. As a result, TSMC and Intel were forced to rely longer on advanced DUV multi-patterning tools. This episode preceded and catalyzed a shift from transactional equipment purchasing to strategic co-ownership of critical supply-chain assets.
Exhibit 4: South Korea Semiconductor Equipment Imports
[Chart: Equipment imports in US$ mn (LHS, 0 to 3,500) and 3mma YoY in % (RHS, -200 to 600), 2002 to 2025]
Source: Korea Customs, Morgan Stanley Research
Strategic assets. In 2012, ASML announced a customer co-investment program sized at €1.38bn in minority stake purchases and long-term R&D funding, with Intel (15%), Samsung (3%) and TSMC (5%) participating and ASML remaining independent with no control rights held by any customers. Although the announcement of this partnership generated a muted market reaction, this episode highlighted a structural feature of the semiconductor supply chain – control over bottleneck capital equipment can temporarily reshape competitive dynamics, even without formal exclusivity. Today, we see similar discussions around Advanced Packaging Capacity, HBM supply chains and Foundry co-investment models.
Memory players' current engagement with ASML's EUV (High-NA EUV tool) takes the form of acquisition and strategic partnership: ASML has been building a substantial research and support campus in Hwaseong, South Korea, designed to strengthen collaboration with both Samsung and SK hynix.
• Samsung is planning to purchase multiple ASML High-NA EUV tools with plans to install them for 2nm foundry production (not memory). This represents a lead deployment among major players rather than a passive position.
• In 2025, SK hynix installed an ASML Twinscan EXE:5200B High-NA EUV system at its M16 fab in Icheon, South Korea. This marks the first High-NA EUV tool deployed for memory production outside R&D usage. As the first to integrate a High-NA EUV system in a mass production context, SK hynix is implicitly partnering with ASML on both deployment and development of new lithography capabilities.
Exhibit 5: ASML outperformed the MSCI 2010-2012
[Chart: Share price performance relative to 01/01/2010, ASML vs. MSCI; y-axis from -40% to 120%]
Source: FactSet, Morgan Stanley Research
Exhibit 6: ASML NTM PE 2010-12
[Chart: ASML Holding NV - PE - NTM; y-axis from 6 to 18]
Source: FactSet, Morgan Stanley Research
Memory – why the sudden bottleneck?
Memory becomes a significant bottleneck for AI development. Key-Value (KV) Cache is emerging as the primary memory scaling constraint in transformer inference. As context lengths and concurrency rise, KV Cache memory grows linearly, saturating high-bandwidth memory well before compute limits are reached. This makes inference increasingly memory-bound and underpins the industry's push toward architectural and software-level memory efficiency. AI inference is fundamentally different from LLM training and is becoming increasingly difficult to scale. Recent breakthroughs and emerging usage patterns make inference more memory-intensive, often requiring additional high-bandwidth memory rather than less. Current trends that are increasing the memory requirements of inference and amplifying the KV Cache problem include:
• Context - Memory in Agentic AI is deeply intertwined with context – without context, even the most sophisticated conversation becomes meaningless (tendency to hallucinate). Context constantly retrieves prior shared history to make sense of the present. It increases compute demand and, more significantly, memory demand, as the amount of information the LLM can look at rises meaningfully when generating a quality answer.
• Reasoning generates a long sequence of thinking before the final answer, similar to how people solve a problem step-by-step. This significantly increases latency, and the long sequence of thought tokens strains the memory.
• Multimodal workloads (images, audio, video generation) involve larger data types that consume far more memory than text generation.
• Mixture of Experts (MoE) expands memory usage with the use of multiple experts (for example, China's DeepSeek v3 has 256 experts) invoked selectively rather than a single dense feed-forward block, which allows model size to grow significantly for higher quality (relative to a modest increase in training cost) and helps training.
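As a rough illustration of the linear growth in KV Cache described above, the sketch below applies the commonly used sizing rule for a transformer decoder: one key tensor and one value tensor per layer, per token, per concurrent sequence. The model dimensions are hypothetical placeholders chosen for illustration and are not figures from this report; FP16 (2 bytes per element) is assumed.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_elem=2):
    """Approximate KV cache size in bytes for a decoder-only transformer."""
    # Factor of 2: one key tensor plus one value tensor in every layer.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model shape (an assumption, not taken from the report).
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128

for seq_len in (2_000, 20_000, 200_000):      # context length in tokens
    for batch in (1, 64):                     # concurrent sequences
        gib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len, batch) / 2**30
        print(f"context={seq_len:>7} concurrency={batch:>3} "
              f"KV cache ~ {gib:,.1f} GiB")
```

Doubling either the context length or the number of concurrent sequences doubles the result, which is why long-context and high-concurrency serving can exhaust HBM before compute becomes the limit.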
Exhibit 7: AI Inference Prefill vs. Decoding stages
Source: Morgan Stanley Research
The AI hardware race has pivoted towards less glamorous memory from previous compute horsepower. Compute determines AI, but memory now determines how far and how fast it can scale. As AI systems move further away from training towards inference, the physics of performance changes with AI inference workloads, especially Agentic AI, which is increasingly more memory-bound than compute-bound. Memory lives at different layers in an AI Agent system (short-term working memory, stored general facts, long-term memory span, pre-trained external knowledge base, tool outputs, user history). As models grow in size and context windows expand, the challenge is less about how fast chips can compute and more about how quickly data can be fed to those processors.
A step change in AI is underway in 2026... The AI industry is shifting from generative AI to Agentic AI, with 2025 being the year when we mastered reasoning and introduced Agentic AI, to 2026, which will be about moving AI from experimentation to core infrastructure and enterprise agent adoption. These agents are now more reliable, have stronger memory and fewer hallucinations, and continuous learning has begun. We are in the process of fusing frontier models with customized open-source systems running on enterprise servers. The next end market will be much bigger – that is Physical, when we move intelligence from the cloud into industrial AI and humanoids.
…with a dramatic impact for memory. Agentic AI drives massive demand for DRAM and NAND by requiring significantly higher memory capacity and performance to support its core functions of context, autonomy, planning and continuous learning. We are shifting from reactive single-task models to proactive, autonomous, and continuously learning systems that require significant and reliable memory resources to function effectively. This shift has led DRAM makers to prioritize the production of high-end, AI-specific memory, driving up overall DRAM demand and prices.
Exhibit 8: Agentic AI – Memory tiers illustration
Source: Towards Data Science, Morgan Stanley Research
What is the long-term bull case for memory?
We're still at the beginning, not the end. Despite recent progress, ChatGPT is only three years old, operational 1GW data centers don't yet exist, and experts see "no walls in sight" for pre-training. AI models continue to exhibit substantial headroom for improvement through both increased compute and efficiency gains, as demonstrated by recent breakthroughs such as DeepSeek V4. Beyond traditional scaling, we are uncovering new optimization levers across training and inference, including dynamic reasoning depth (how much a model 'thinks' before responding). At the same time, the growing adoption of vision and multimodal AI models is structurally increasing memory requirements, as these workloads process high-dimensional inputs and maintain larger intermediate representations in fast memory. We have found new scaling relationships in other parts of model training and inference, including how much to think before answering a question.
Exhibit 9: Number of AI models released
[Chart: Number of AI models released per year, 2019 to 2025; series: Multimodal, Video, Image Generation, Vision, Language, All Large-Scale AI systems; y-axis from 0 to 180]
Source: Our World in Data, Morgan Stanley Research
The AI buildout keeps hitting new infrastructure limits. Memory is facing the largest global scaling of any tech wave in history. AI agents are quickly becoming a critical bottleneck, and the next wave of progress with AI agents will come not from better reasoning but from better context handling. An AI assistant that remembers everything is more useful than a bigger model that remembers little. This means more memory layers can unlock far more value than reasoning improvements.
Memory and context management are increasingly the bottleneck. The agent's source code is just the orchestration layer, whereas the heavy lifting happens in how memory gets ingested, organized, and retrieved. Memory systems are quickly becoming the hidden complexity behind agents. If code used to be the bottleneck, memory might be the new one. LLMs remember just enough context to sustain a conversation, but not a lifetime of them. For many users, that's fine, but for anyone who imagines AI as a permanent cognitive partner or agent, it's a structural frustration.
• Storage cost: Keeping billions of users' full conversation histories, indexed and instantly retrievable, would explode storage needs and retrieval latency.
• Processing cost: Even if the data is stored, every query would need more compute to search, rank and contextualize. That translates directly into higher per-prompt costs.
• Hardware cost: A typical AI server uses 8x more memory than traditional servers, and with each generation, that number climbs even higher.
Innovation and higher efficiency. DeepSeek's recent research paper 'Conditional Memory via Scalable Lookup: A New Axis of Sparsity for LLMs' demonstrates a pathway to scaling model capacity beyond HBM limits by decoupling reasoning from knowledge storage. In this architecture, latency-critical reasoning remains resident in HBM, while large, less frequently accessed knowledge memory is off-loaded to CXL-attached DDR5. This effectively introduces memory as a new axis of sparsity, enabling meaningful model scaling even as HBM capacity remains constrained.
More NAND – NVIDIA Inference Context Memory Storage Platform. At CES, Nvidia showcased its inference context memory storage platform, which serves as a KV Cache for inferencing. This platform is powered by the NVIDIA BlueField-4 DPU (Data Processing Unit) and inserts another tier of eSSD storage, which manages the off-loading and sharing of KV Cache data. In a typical configuration, this infrastructure allows for an additional 16TB of high-speed SSD storage (presumably NVMe) to be directly associated with each Rubin GPU, functioning as an extension of the system's memory hierarchy to handle extremely large context lengths. With the DPU moving from BlueField-3 to BlueField-4, DRAM content is also upgraded from 32GB DDR5 to 128GB LPDDR5X.
Exhibit 10: NVIDIA Inference Context Memory Storage Platform
Source: NVIDIA CES 2026 Keynote Context Storage
Sizing the Inference TAM
It is almost impossible to calculate the exact demand on memory given the changing dynamics on user growth, applications, and technology innovation. What investors are primarily trying to understand is how much incremental TAM can be added as KV Cache expands. Currently, most of the hot KV Cache is stored in HBM/DRAM, which means not only that it is expensive but also that the memory duration will be short. Allowing KV Cache to be offloaded to eSSD makes longer context and longer memory duration possible, which can in turn significantly enhance current applications and create a better inference setup for Agentic AI development and penetration.
In Exhibit 11, we calculated the tiered memory usage of a ChatGPT-like model with key assumptions below (full assumptions in Appendix: Memory Usage Tier Breakdown Assumptions):
• We assumed 800 million weekly active users (Peak QPS 300,000 req/s)
• Input tokens/request: 2,000 tokens
• We assumed live KV is split 50%/50% between HBM and DRAM, and reused KV Cache is 40% on DRAM vs. 60% on NAND (KV Cache in FP16 precision)
• Assumed text-only applications; image/video demand is not considered in this model
• We stick to current common industry practice for the rest of our assumptions to the extent possible
In conclusion, for a 200,000 GPU cluster that runs such a model, the HBM usage is around 200PB, DRAM 4EB, NAND 42EB on warm data/KV Cache offload, and data lake demand is around 260EB. If we assume globally there are three such models, the total AI inference demand will account for 17%, 35% and 92% of 2026 global memory supply respectively for HBM, DRAM and NAND.
Exhibit 11: Tiered Memory/Storage Usage Breakdown
Tier | Component | Total (TiB) | Total (PiB) | Notes
HBM | Live KV (hot) | 330 | 0 | 50% of live KV
DRAM | Live KV (warm) | 330 | 0 | 50% of live KV
Rack SSD | Live KV (offload) | - | - |
DRAM | KV reuse cache (24h, 50%) | 4,169,941 | 4,072 | Time-integrated retained KV 40% in DRAM, 60% in Rack SSD
Rack SSD | KV reuse cache (24h, 50%) | 6,254,911 | 6,108 | Time-integrated retained KV 40% in DRAM, 60% in Rack SSD
HBM | GPU overheads | 18,750 | 18 | (workspace + runtime + side models)
HBM | Weights on active experts (top-2/128) | 186,731 | 182 | Set input if known
DRAM | Host buffers/state | 0 | 0 | per-seq host state
Rack SSD | Local Rack SSD caches/logs | 36,379,788 | 35,527 | 200TB/node (decimal)
HBM Total | | 226,291 | 226 |
DRAM Total | | 4,585,261 | 4,585 |
Rack SSD Total | | 46,877,348 | 46,877 |
Datalake | RAG + logs + caches (central) | 293,556,000 | 293,556 |
Source: Morgan Stanley Research estimates
Note: detailed assumptions in Appendix
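A quick cross-check of Exhibit 11, offered as a sketch and not as the authors' method: summing each tier's components in TiB and converting to decimal TB reproduces the printed Total rows to within rounding. That the Total rows are in decimal units is an inference from the arithmetic, not something the exhibit states.

```python
TIB_BYTES = 2**40    # binary tebibyte
TB_BYTES = 10**12    # decimal terabyte

# Component values in TiB, copied from Exhibit 11.
components_tib = {
    "HBM": [330, 18_750, 186_731],
    "DRAM": [330, 4_169_941, 0],
    "Rack SSD": [6_254_911, 36_379_788],
}

# Tier totals exactly as printed in Exhibit 11.
printed_totals = {"HBM": 226_291, "DRAM": 4_585_261, "Rack SSD": 46_877_348}


def tier_total_tb(values_tib):
    """Sum TiB components and express the result in decimal TB (assumed unit)."""
    return sum(values_tib) * TIB_BYTES / TB_BYTES


for tier, values in components_tib.items():
    computed = tier_total_tb(values)
    print(f"{tier:>8}: sum={sum(values):>12,} TiB -> {computed:>14,.0f} TB "
          f"(printed {printed_totals[tier]:,})")
```

In binary units the same sums are roughly 201 PiB for HBM and 4,072 PiB for DRAM, which lines up with the "around 200PB" and "4EB" quoted in the text.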
Extending context length and increasing the KV Cache will have the most obvious incremental uplift on DRAM and Rack SSD (assuming the effective live window caps the usage of HBM, and that Datalake is more relevant to model size, RAG and other factors). By increasing the input tokens from 2,000 tokens/request to 5,000 tokens/request, holding other assumptions unchanged, the incremental KV Cache will increase the DRAM demand by around 2EB and Rack SSD by 3EB per model.
Exhibit 12: Sensitivity test on input tokens/request per model
Input tokens/request | HBM total (PB) | DRAM total (PB) | Rack SSD total (EB) | Datalake total (EB)
2,000 | 226 | 4,585 | 47 | 294
5,000 | 226 | 6,648 | 50 | 294
10,000 | 226 | 10,087 | 55 | 294
20,000 | 226 | 16,964 | 65 | 294
50,000 | 226 | 37,597 | 96 | 294
100,000 | 226 | 71,983 | 148 | 294
200,000 | 226 | 140,757 | 251 | 294
Source: Morgan Stanley Research estimates
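The DRAM column of Exhibit 12 is consistent with KV Cache scaling linearly in input tokens. The sketch below fits a straight line through the first and last rows and checks the rows in between; the per-token slope and fixed intercept are derived here for illustration and are not figures stated in the report.

```python
# (input tokens per request, DRAM total in PB), copied from Exhibit 12.
points = [
    (2_000, 4_585), (5_000, 6_648), (10_000, 10_087), (20_000, 16_964),
    (50_000, 37_597), (100_000, 71_983), (200_000, 140_757),
]


def fit_line(first, last):
    """Return (slope, intercept) of the line through two (x, y) points."""
    (x0, y0), (x1, y1) = first, last
    slope = (y1 - y0) / (x1 - x0)
    return slope, y0 - slope * x0


slope, intercept = fit_line(points[0], points[-1])


def dram_pb(tokens):
    """DRAM demand in PB implied by the fitted line (illustrative only)."""
    return intercept + slope * tokens


print(f"slope ~ {slope:.4f} PB per input token, intercept ~ {intercept:,.0f} PB")
for tokens, reported in points:
    print(f"{tokens:>7} tokens: fitted {dram_pb(tokens):>10,.0f} PB, "
          f"exhibit {reported:>7,} PB")
```

Every intermediate row falls within rounding of the fitted line, so under these assumptions each additional 1,000 input tokens per request adds a roughly constant amount of DRAM demand per model.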
Exhibit 13: Inference-stage storage by memory tier
Storage tier | What lives here in practice
HBM (GPU memory)
• Model weights (active shards)
• Hot KV cache (recent tokens only, typically last 256–1,024 tokens)
• Temporary activations/workspaces (attention scratch, GEMM buffers, logits, NCCL buffers)
• Runtime metadata (KV block tables, schedulers, CUDA graphs)
• Occasionally small side models (safety filters, routers)
Host DRAM (CPU memory)
• Warm KV cache (paged/offloaded)
• Evicted KV blocks from HBM (majority of KV at scale)
• Request/session state (token queues, schedulers, batching metadata)
• Prompt text + tokenized inputs
• Optional CPU-resident weights (rare for high-throughput chat)
Local/in-rack NAND (NVMe/SSD)
• Cold KV spill (emergency or batch/offline inference only)
• KV checkpoints for long jobs
• Local caches (prompt cache, embedding cache)
• Node-level logs and short-term telemetry buffers
Datalake (shared storage)
• Model storage (checkpoints, sharded weights, multiple variants)
• KV spill for offline/recovery (not latency-sensitive)
• RAG corpora and indices
• Logs, telemetry, traces
• Long-lived caches (prompt reuse, embeddings, evaluation artifacts)
Source: Morgan Stanley Research
What's changed?
Price hikes everywhere. Given recent buyer-seller negotiations, we see DRAM pricing momentum remaining exceptionally strong into 1Q26, with upside risk to our prior forecast. The global DRAM market has entered a phase of dramatic price inflation, driven primarily by the three major suppliers reallocating capacity toward high-margin server DRAM and HBM to meet robust AI inference and infrastructure demand from major CSPs. This shift has created a severe 'capacity crowding-out effect' for PC, mobile, and consumer DRAM, resulting in a firmly entrenched high-price, low-volume seller's market. With US CSPs willing to absorb substantial price increases and actively signing long-term contracts to secure supply,