Morgan Stanley RESEARCH Global Idea

January 15, 2026 10:30 PM GMT

Global Technology

Memory – How to Play the New AI Bottleneck

Memory sits in a capacity-constrained cycle with unusually long order visibility driven by AI inference. For 2026, the risk is execution and transition, not demand. A steeper pricing climb and favourable conditions likely persist through 2027. Multiples have expanded, but we think stock calls can still work with much higher earnings upside from here.

Inference becomes a memory challenge, not just compute. Memory access increasingly determines the performance of longer text, image/video and Agentic AI workflows, with far more robust memory requirements than prior AI models to support context, autonomy and continuous learning. These systems require superior server DRAM and enterprise NAND to function effectively.

Memory cycle – a steeper pricing climb. Memory pricing power is shifting at lightning speed. We expect a steeper upcycle with rapid gains in DRAM, HBM, NAND, and legacy memory. Innovation and architectural redesign continue to improve memory efficiency, enabling AI systems to deliver better latency and cost profiles, materially enhancing user experience. This lowers the economic barrier to adoption and unlocks a significantly larger AI TAM, even as aggregate memory demand continues to scale with broader deployment. Our analysis suggests that text-only AI inference alone could account for 35% of 2026 global memory supply for DRAM and 92% for NAND.

What's changed? The key debate is now shifting to whether supply can catch up and where the true choke point sits, and therefore whether tightness and pricing power persist or normalize. Near-term price expectations matter less, in our view, but our channel checks indicate potential upside to already aggressive 70%+ QoQ hikes for both DRAM and NAND. Inventory levels continue to fall across the supply chain.

Capex acceleration is inevitable, focused on DRAM, and we do expect more meaningful greenfield expansions from 2027. The supply-demand gap for legacy memory is widening further for DDR4/3, NOR and SLC/MLC NAND and we raise Price Targets across the board.

Bottlenecks are the winners – buy memory and semicap, especially EUV. We prefer higher pricing power in DRAM (Samsung, SK hynix, MU), legacy memory (Winbond), HDD (WDC), and capex benefits via SPE (ASML), packaging (DISCO) vs. downstream hardware and consumer-facing margin pressure. Negative factors to watch include the impact of demand destruction for system/hardware device vendors and challenging YoY growth comparisons from 2H26 for memory.

Morgan Stanley & Co. International plc+

Shawn Kim
Equity Analyst
+44 20 7677-1018
Shawn.Kim@

Lee Simpson
Equity Analyst
+44 20 7425-3378
Lee.Simpson@

Nigel van Putten
Equity Analyst
+44 20 7425-2803
Nigel.Putten@

Morgan Stanley Asia Limited+

Duan Liu
Equity Analyst
+852 2239-7357
Duan.Liu@

Michelle Kim
Research Associate
+852 3963-0183
Michelle.Kim1@

Morgan Stanley & Co. International plc+

Amelia M Scicluna
Research Associate
+44 20 7425-6694
Amelia.Scicluna@

Morgan Stanley MUFG Securities Co., Ltd.+

Tetsuya Wadaki
Equity Analyst
+81 3 6836-8890
Tetsuya.Wadaki@

Morgan Stanley Taiwan Limited+

Charlie Chan
Equity Analyst
+886 2 2730-1725
Charlie.Chan@

Morgan Stanley & Co. LLC

Joseph Moore
Equity Analyst
+1 212 761-7516
Joseph.Moore@

Morgan Stanley Taiwan Limited+

Daniel Yen, CFA
Equity Analyst
+886 2 2730-2863
Daniel.Yen@

Morgan Stanley MUFG Securities Co., Ltd.+

Kazuo Yoshikawa, CFA
Equity Analyst
+81 3 6836-8408
Kazuo.Yoshikawa@

Morgan Stanley & Co. LLC

Erik W Woodring
Equity Analyst
+1 212 296-8083
Erik.Woodring@

Morgan Stanley does and seeks to do business with companies covered in Morgan Stanley Research. As a result, investors should be aware that the firm may have a conflict of interest that could affect the objectivity of Morgan Stanley Research. Investors should consider Morgan Stanley Research as only a single factor in making their investment decision.

For analyst certification and other important disclosures, refer to the Disclosure Section, located at the end of this report.

+ = Analysts employed by non-U.S. affiliates are not registered with FINRA, may not be associated persons of the member and may not be subject to FINRA restrictions on communications with a subject company, public appearances and trading securities held by a research analyst account.


Dylan Liu
Research Associate
+1 212 761-4519
Dylan.Liu@

Mason Wayne
Research Associate
+1 212 761-6012
Mason.Wayne@

Morgan Stanley MUFG Securities Co., Ltd.

Suzune Tamura, CFA
Equity Analyst
+81 3 6836-8891
Suzune.Tamura@

Morgan Stanley Asia Limited+

Ethan Jia
Research Associate
+852 3963-2287
Ethan.Jia@

Morgan Stanley & Co. LLC

Shane Brett
Equity Analyst
+1 212 761-1022
Shane.Brett@

Technology - European Semiconductors

Europe

Industry View: In-Line


How to Play the AI Bottleneck?

Historic memory shortage – a precursor to a historic semiconductor production footprint. With DRAM prices now surpassing metals as a benchmark of scarcity, the memory sector enters a period of extended capacity constraint. Memory along with advanced logic foundry manufacturers must find ways to secure and manage supply chains for rapidly growing AI infrastructure consumption and ensure expansion in wafer manufacturing. The bottlenecks in the semiconductor industry become the winners in stock performance and the knock-on effect on adjacent parts of semiconductors tends to be underestimated – the key bottleneck to AI has shifted to commodity DRAM and NAND from CoWoS and HBM. Next could be semiconductor equipment – in particular EUV (ASML Holding NV: Stronger Set-Up for 2027; Raise PT to €1,400). Hence our global top 10 picks to play the memory bottleneck:

• DRAM – Samsung, MU and also like SK hynix

• Legacy Memory – Winbond

• Storage – WDC

• Advanced packaging – DISCO

• Semicap – AMAT, ASMI

• EUV – ASML

The nature of the correlation. The semicap (semiconductor capital equipment) and logic foundry cycles are closely and intrinsically correlated to the DRAM cycle, with the memory market acting as a primary driver of the broader semiconductor industry's cyclical nature. In the short run, there is no sign of an imminent peak in memory pricing and profitability. Add the recent strong TSMC capex guide and leading-edge foundry tightness, and capex for the industry is set to accelerate towards all-time highs by 2027-28. Semicap companies generally experience the upturn (and downturn) in profitability one to two quarters ahead of memory makers, as equipment orders are placed well in advance of production. However, stocks inflect around the same time, as illustrated in Exhibit 1. The share performance lag is significant and a material catch-up more probable.

Exhibit 1: ASML vs. DRAM YoY performance – significant laggard

[Chart: YoY share performance, ASML vs. DRAM; y-axis from -100% to 250%]

Source: FactSet, Morgan Stanley Research.

Note: Three DRAM companies = Samsung Electronics, Micron and SK Hynix

Exhibit 2: ASML orders well below peak today

[Chart: ASML Quarterly Net Bookings (EUR millions), split into Memory and Logic, 1Q19 to 3Q25; y-axis from 0 to 10,000]

Source: Company Data, Morgan Stanley Research

Exhibit 3: Semicap exposure landscape

Source: Applied Materials, Morgan Stanley Research

EUV Lessons from History – Time to Play Chess, Not Checkers

ASML – Party Like It's 2010. The period between 2010-12 was extremely capacity-constrained in EUV lithography, and saw extremely limited volumes of production from ASML as the sole viable EUV tool supplier. As EUV tools were not yet production-worthy, they required co-development, risk-sharing and guaranteed demand to justify continued investment. In other words, access to EUV tools was determined less by price and more by risk tolerance and strategic commitment. We see a similar potential set-up with EUV capacity increasingly constrained into 2027-28e. What's different this time is that both SK hynix and Micron are using six layers of EUV for future 1c/gamma DRAM and have lots of cash on the balance sheet, competing for tools vs. Samsung and advanced logic foundry.

Samsung hoarding EUV tools. Samsung Electronics positioned itself as a lead EUV partner by placing early tool reservations, engaging in deep engineering collaboration and expressing a willingness to absorb potential yield risks. Samsung Electronics ended up absorbing a disproportionate share of early EUV tool availability, effectively 'crowding out' other logic players like TSMC and Intel. As a result, TSMC and Intel were forced to rely longer on advanced DUV multi-patterning tools. This episode preceded and catalyzed a shift from transactional equipment purchasing to strategic co-ownership of critical supply-chain assets.


Exhibit 4: South Korea Semiconductor Equipment Imports

[Chart: Equipment imports in US$ (mn), axis from 0 to 3,500, with 3mma YoY (%) on the RHS, axis from -200 to 600; 2002 to 2025]

Source: Korea Customs, Morgan Stanley Research

Strategic assets. In 2012, ASML announced a customer co-investment program sized at €1.38bn in minority stake purchases and long-term R&D funding, with Intel (15%), Samsung (3%) and TSMC (5%) participating and ASML remaining independent with no control rights held by any customers. Although the announcement of this partnership generated a muted market reaction, this episode highlighted a structural feature of the semiconductor supply chain – control over bottleneck capital equipment can temporarily reshape competitive dynamics, even without formal exclusivity. Today, we see similar discussions around Advanced Packaging Capacity, HBM supply chains and Foundry co-investment models.

Memory players' current engagement with ASML's EUV (High-NA EUV tool) in the form of acquisition and strategic partnership: ASML has been building a substantial research and support campus in Hwasung, South Korea, designed to strengthen collaboration with both Samsung and SK hynix.

• Samsung is planning to purchase multiple ASML High-NA EUV tools with plans to install them for 2nm foundry production (not memory). This represents a lead deployment among major players rather than a passive position.

• In 2025, SK hynix installed an ASML Twinscan EXE:5200B High-NA EUV system at its M16 fab in Icheon, South Korea. This marks the first High-NA EUV tool deployed for memory production outside R&D usage. As the first to integrate a High-NA EUV system in a mass production context, SK hynix is implicitly partnering with ASML on both deployment and development of new lithography capabilities.


Exhibit 5: ASML outperformed the MSCI 2010-2012

[Chart: Share price performance relative to 01/01/2010, ASML vs. MSCI; y-axis from -40% to 120%]

Source: FactSet, Morgan Stanley Research

Exhibit 6: ASML NTM PE 2010-12

[Chart: ASML Holding NV - PE - NTM; y-axis from 6 to 18]

Source: FactSet, Morgan Stanley Research

Memory – why the sudden bottleneck?

Memory becomes a significant bottleneck for AI development. Key-Value (KV) Cache is emerging as the primary memory scaling constraint in transformer inference. As context lengths and concurrency rise, KV Cache memory grows linearly, saturating high-bandwidth memory well before compute limits are reached. This makes inference increasingly memory-bound and underpins the industry's push toward architectural and software-level memory efficiency. AI inference is fundamentally different from LLM training and is becoming increasingly difficult to scale. Recent breakthroughs and emerging usage patterns make inference more memory-intensive, often requiring additional high-bandwidth memory rather than less. Current trends that are increasing the memory requirements of inference and amplifying the KV Cache problem include:

• Context – Memory in Agentic AI is deeply intertwined with context: without context, even the most sophisticated conversation becomes meaningless (tendency to hallucinate). Context constantly retrieves prior shared history to make sense of the present. It increases compute demand and, to a much greater degree, memory demand, because the amount of information the LLM can look at rises meaningfully when generating a quality answer.

• Reasoning generates a long sequence of thinking before the final answer, similar to how people solve a problem step-by-step. This significantly increases latency, and the long sequence of thought tokens strains the memory.

• Multimodal workloads (images, audio, video generation) involve larger data types that consume far more memory than text generation.

• Mixture of Experts (MoE) expands memory usage with the use of multiple experts (for example, China's DeepSeek V3 has 256 experts) invoked selectively rather than a single dense feed-forward block, which allows model size to grow significantly for higher quality (relative to a modest increase in training cost) and helps training.
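
The linear growth of KV Cache with context length and concurrency can be made concrete with the standard sizing formula for a transformer decoder. The sketch below is a minimal illustration rather than the report's own model: the layer count, KV-head count and head dimension are hypothetical values chosen for the example, and FP16 (2 bytes per element) is assumed.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   seq_len, batch_size, bytes_per_elem=2):
    """Approximate KV cache size for a standard transformer decoder.

    The leading factor of 2 covers one key and one value tensor per layer.
    bytes_per_elem=2 corresponds to FP16.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


# Hypothetical model shape (an assumption for illustration only).
LAYERS, KV_HEADS, HEAD_DIM = 80, 8, 128

# KV footprint of a single token in a single sequence.
bytes_per_token = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len=1, batch_size=1)
print(f"KV bytes per token: {bytes_per_token:,}")

# Footprint scales linearly in both context length and concurrency.
for seq_len in (2_000, 20_000, 200_000):
    for batch_size in (1, 64):
        gib = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, seq_len, batch_size) / 2**30
        print(f"context={seq_len:>7,} tokens, concurrency={batch_size:>2}: {gib:,.1f} GiB")
```

Doubling either the context length or the number of concurrent sequences doubles the footprint, which is why long-context, high-concurrency serving can exhaust HBM before compute becomes the limit.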

Exhibit 7: AI Inference Prefill vs. Decoding stages

Source: Morgan Stanley Research

The AI hardware race has pivoted from compute horsepower towards less glamorous memory. Compute determines AI but memory now determines how far and how fast it can scale. As AI systems move further away from training towards inference, the physics of performance changes with AI inference workloads, especially Agentic AI, which is increasingly more memory-bound than compute-bound. Memory lives at different layers in an AI Agent system (short-term working memory, stored general facts, long-term memory span, pre-trained external knowledge base, tool outputs, user history). As models grow in size and context windows expand, the challenge is less about how fast chips can compute and more about how quickly data can be fed to those processors.

A step change in AI is underway in 2026... The AI industry is shifting from generative AI to Agentic AI: 2025 was the year when we mastered reasoning and introduced Agentic AI, and 2026 will be about moving AI from experimentation to core infrastructure and enterprise agent adoption. These agents are now more reliable, have stronger memory and have fewer hallucinations, and continuous learning has begun. We are in the process of fusing frontier models with customized open-source systems running on enterprise servers. The next end market will be much bigger – that is Physical, when we move intelligence from the cloud into industrial AI and humanoids.

…with a dramatic impact for memory. Agentic AI drives massive demand for DRAM and NAND by requiring significantly higher memory capacity and performance to support its core functions of context, autonomy, planning and continuous learning. We are shifting from reactive single-task models to proactive, autonomous, and continuously learning systems that require significant and reliable memory resources to function effectively.

This shift has led DRAM suppliers to prioritize the production of high-end, AI-specific memory, driving a surge in overall DRAM demand and prices.

Exhibit 8: Agentic AI – Memory tiers illustration

Source: Towards Data Science, Morgan Stanley Research

What is the long-term bull case for memory?

We're still at the beginning, not the end. Despite recent progress, ChatGPT is only three years old, operational 1GW datacenters don’t yet exist, and experts see “no walls in sight” for pre-training. AI models continue to exhibit substantial headroom for improvement through both increased compute and efficiency gains, as demonstrated by recent breakthroughs such as DeepSeek V4. Beyond traditional scaling, we are uncovering new optimization levers across training and inference, including dynamic reasoning depth (how much a model 'thinks' before responding). At the same time, the growing adoption of vision and multimodal AI models is structurally increasing memory requirements, as these workloads process high-dimensional inputs and maintain larger intermediate representations in fast memory. We have found new scaling relationships in other parts of model training and inference, including how much to think before answering a question.

Exhibit 9: Number of AI models released

[Chart: large-scale AI systems released per year, 2019 to 2025, by type: Language, Vision, Image Generation, Video, Multimodal; y-axis from 0 to 180]

Source: Our World in Data, Morgan Stanley Research

The AI buildout keeps hitting new infrastructure limits. Memory is facing the largest global scaling of any tech wave in history. AI agents are quickly becoming a critical bottleneck, and the next wave of progress with AI agents will come not from better reasoning but from better context handling. An AI assistant that remembers everything is more useful than a bigger model that remembers little. This means more memory layers will unlock far more value than reasoning improvements.

Memory and context management are increasingly the bottleneck. The agent’s source code is just the orchestration layer, whereas the heavy lifting happens in how memory gets ingested, organized, and retrieved. Memory systems are quickly becoming the hidden complexity behind agents. If code used to be the bottleneck, memory might be the new one. LLMs remember just enough context to sustain a conversation, but not a lifetime of them. For many users, that’s fine, but for anyone who imagines AI as a permanent cognitive partner or agent, it’s a structural frustration.

• Storage cost: Keeping billions of users’ full conversation histories, indexed and instantly retrievable, would explode storage needs and retrieval latency.

• Processing cost: Even if the data is stored, every query would need more compute to search, rank and contextualize. That translates directly into higher per-prompt costs.

• Hardware cost: A typical AI server uses 8x more memory than traditional servers, and with each generation, that number climbs even higher.

Innovation and higher efficiency. DeepSeek's recent research paper 'Conditional Memory via Scalable Lookup: A New Axis of Sparsity for LLMs' demonstrates a pathway to scaling model capacity beyond HBM limits by decoupling reasoning from knowledge storage. In this architecture, latency-critical reasoning remains resident in HBM, while large, less frequently accessed knowledge memory is off-loaded to CXL-attached DDR5. This effectively introduces memory as a new axis of sparsity, enabling meaningful model scaling even as HBM capacity remains constrained.

More NAND – NVIDIA Inference Context Memory Storage Platform. At CES, Nvidia showcased its inference context memory storage platform, which serves as a KV Cache for inferencing. This platform is powered by the NVIDIA BlueField-4 DPU (Data Processing Unit) and inserts another tier of eSSD storage, which manages the off-loading and sharing of KV Cache data. In a typical configuration, this infrastructure allows for an additional 16TB of high-speed SSD storage (presumably NVMe) to be directly associated with each Rubin GPU, functioning as an extension of the system's memory hierarchy to handle extremely large context lengths. With the DPU moving from BlueField-3 to 4, DRAM content is also upgraded from 32GB DDR5 to 128GB LPDDR5X.


Exhibit 10: NVIDIA Inference Context Memory Storage Platform

Source: NVIDIA CES 2026 Keynote Context Storage

Sizing the Inference TAM

It is almost impossible to calculate the exact demand on memory given the changing dynamics on user growth, applications, and technology innovation. What investors are primarily trying to understand is how much incremental TAM can be added as KV Cache expands. Currently, most of the hot KV Cache is stored in HBM/DRAM, which means not only that it is expensive but also that the memory duration will be short. Allowing KV Cache to be offloaded to eSSD makes longer context and longer memory duration possible, which can in turn significantly enhance current applications and create a better inference setup for Agentic AI development and penetration.

In Exhibit 11, we calculated the tiered memory usage of a ChatGPT-like model with key assumptions below (full assumptions in Appendix: Memory Usage Tier Breakdown Assumptions):

• We assumed 800 million weekly active users (peak QPS 300,000 req/s)

• Input tokens/request: 2,000 tokens

• We assumed live KV is split 50%/50% between HBM and DRAM, and reused KV Cache is 40% on DRAM vs. 60% on NAND (KV Cache in FP16 precision)

• Assumed text-only applications; image/video demand is not considered in this model

• We stick to current common industry practice for the rest of our assumptions to the extent possible
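
As a rough illustration of how these assumptions feed the tier breakdown, the sketch below derives the peak input token rate and applies the stated tier splits. The tier-level KV totals are obtained by adding the component rows of Exhibit 11; the helper `split_by_tier` and the variable names are hypothetical, and this is a consistency check rather than a reconstruction of the full model.

```python
# Assumptions stated in the report.
PEAK_QPS = 300_000               # requests per second at peak
INPUT_TOKENS_PER_REQUEST = 2_000

LIVE_KV_SPLIT = {"HBM": 0.5, "DRAM": 0.5}
REUSE_KV_SPLIT = {"DRAM": 0.4, "Rack SSD": 0.6}

# Tier-level totals in TiB, obtained by adding the component rows of Exhibit 11.
LIVE_KV_TOTAL_TIB = 330 + 330
REUSE_KV_TOTAL_TIB = 4_169_941 + 6_254_911


def split_by_tier(total, shares):
    """Allocate a total across tiers; shares must sum to 1."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("tier shares must sum to 1")
    return {tier: total * share for tier, share in shares.items()}


peak_input_tokens_per_second = PEAK_QPS * INPUT_TOKENS_PER_REQUEST
live_kv = split_by_tier(LIVE_KV_TOTAL_TIB, LIVE_KV_SPLIT)
reuse_kv = split_by_tier(REUSE_KV_TOTAL_TIB, REUSE_KV_SPLIT)

print(f"Peak input tokens/s: {peak_input_tokens_per_second:,}")
for name, alloc in (("Live KV", live_kv), ("Reused KV", reuse_kv)):
    for tier, tib in alloc.items():
        print(f"{name:<9} {tier:<8} {tib:>12,.0f} TiB")
```

Applying the 40%/60% split to the combined reuse cache reproduces the DRAM and Rack SSD rows of Exhibit 11 to within rounding.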

In conclusion, for a 200,000 GPU cluster that runs such a model, the HBM usage is around 200PB, DRAM 4EB, NAND 42EB on warm data/KV Cache offload and datalake demand is around 260EB. If we assume globally there are three such models, the total AI inference demand will account for 17%, 35% and 92% of 2026 global memory supply respectively for HBM, DRAM and NAND.


Exhibit 11: Tiered Memory/Storage Usage Breakdown

Tier | Component | Total (TiB) | Total (PiB) | Notes
HBM | Live KV (hot) | 330 | 0 | 50% of live KV
DRAM | Live KV (warm) | 330 | 0 | 50% of live KV
Rack SSD | Live KV (offload) | - | - |
DRAM | KV reuse cache (24h, 50%) | 4,169,941 | 4,072 | Time-integrated retained KV; 40% in DRAM, 60% in Rack SSD
Rack SSD | KV reuse cache (24h, 50%) | 6,254,911 | 6,108 | Time-integrated retained KV; 40% in DRAM, 60% in Rack SSD
HBM | GPU overheads | 18,750 | 18 | (workspace + runtime + side models)
HBM | Weights on active experts (top-2/128) | 186,731 | 182 | Set input if known
DRAM | Host buffers/state | 0 | 0 | per-seq host state
Rack SSD | Local Rack SSD caches/logs | 36,379,788 | 35,527 | 200TB/node (decimal)
HBM Total | | 226,291 | 226 |
DRAM Total | | 4,585,261 | 4,585 |
Rack SSD Total | | 46,877,348 | 46,877 |
Datalake | RAG + logs + caches (central) | 293,556,000 | 293,556 |

Source: Morgan Stanley Research estimates

Note: detailed assumptions in Appendix

Note on units: the Total rows match the component sums only after converting TiB to decimal TB (multiplying by 2^40/10^12), so they appear to be stated in TB and PB rather than TiB and PiB.
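
The relationship between the component rows and the Total rows of Exhibit 11 can be checked with a few lines of arithmetic. The sketch below is a reader's consistency check, not part of the report's model: it sums the components per tier in TiB and converts to decimal TB. The final line assumes that one 200 TB "node" corresponds to one of the 200,000 GPUs, which is an inference from the figures rather than something the report states.

```python
TIB_TO_TB = 2**40 / 10**12  # 1 TiB = 1.099511627776 decimal TB

# Component rows from Exhibit 11, in TiB.
components_tib = {
    "HBM": [330, 18_750, 186_731],        # live KV, GPU overheads, active-expert weights
    "DRAM": [330, 4_169_941, 0],          # live KV, KV reuse cache, host buffers
    "Rack SSD": [6_254_911, 36_379_788],  # KV reuse cache, local caches/logs
}

# Total rows as printed in Exhibit 11.
reported_totals = {"HBM": 226_291, "DRAM": 4_585_261, "Rack SSD": 46_877_348}

# Sum each tier in TiB, then convert to decimal TB.
recomputed_tb = {tier: sum(parts) * TIB_TO_TB for tier, parts in components_tib.items()}

for tier, total_tb in recomputed_tb.items():
    print(f"{tier:<8} sum={sum(components_tib[tier]):>11,} TiB "
          f"-> {total_tb:>13,.0f} TB (reported {reported_totals[tier]:,})")

# Assumption: 200,000 nodes x 200 TB (decimal) each, expressed in TiB.
local_ssd_tib = 200_000 * 200 / TIB_TO_TB
print(f"Local Rack SSD caches/logs: {local_ssd_tib:,.0f} TiB")
```

Each recomputed total lands within about one unit of the printed figure, which is consistent with rounding in the component rows.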

Extending context length and increasing the KV Cache will have the most obvious incremental uplift on DRAM and Rack SSD (assuming the effective live window caps the usage of HBM, and that Datalake is more relevant to model size, RAG and other factors). By increasing the input tokens from 2,000 tokens/request to 5,000 tokens/request, holding other assumptions unchanged, the incremental KV Cache will increase DRAM demand by around 2EB and Rack SSD by 3EB per model.

Exhibit 12: Sensitivity test on input token/request per model

Input tokens/request | HBM TOTAL (PB) | DRAM TOTAL (PB) | Rack SSD TOTAL (EB) | Datalake TOTAL (EB)
2,000 | 226 | 4,585 | 47 | 294
5,000 | 226 | 6,648 | 50 | 294
10,000 | 226 | 10,087 | 55 | 294
20,000 | 226 | 16,964 | 65 | 294
50,000 | 226 | 37,597 | 96 | 294
100,000 | 226 | 71,983 | 148 | 294
200,000 | 226 | 140,757 | 251 | 294

Source: Morgan Stanley Research estimates
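
The sensitivity figures in Exhibit 12 look close to linear in input tokens per request, as would be expected if the KV reuse cache scales in proportion to tokens. The sketch below tests that reading by drawing a straight line through the first and last rows and measuring how far the intermediate rows deviate. The function `line_through_endpoints` is a hypothetical helper, and the linear form is a reader's interpretation, not a formula given in the report.

```python
# Rows of Exhibit 12: input tokens/request -> (DRAM total in PB, Rack SSD total in EB).
rows = {
    2_000: (4_585, 47),
    5_000: (6_648, 50),
    10_000: (10_087, 55),
    20_000: (16_964, 65),
    50_000: (37_597, 96),
    100_000: (71_983, 148),
    200_000: (140_757, 251),
}


def line_through_endpoints(points):
    """Return (slope, intercept) of the line through the first and last point."""
    (x0, y0), (x1, y1) = points[0], points[-1]
    slope = (y1 - y0) / (x1 - x0)
    return slope, y0 - slope * x0


tokens = sorted(rows)
dram_slope, dram_icpt = line_through_endpoints([(t, rows[t][0]) for t in tokens])
ssd_slope, ssd_icpt = line_through_endpoints([(t, rows[t][1]) for t in tokens])

# Largest deviation of any row from the fitted line, in the row's own units.
dram_max_err = max(abs(dram_slope * t + dram_icpt - rows[t][0]) for t in tokens)
ssd_max_err = max(abs(ssd_slope * t + ssd_icpt - rows[t][1]) for t in tokens)

print(f"DRAM: {dram_slope:.4f} PB per additional input token/request, "
      f"max deviation {dram_max_err:.2f} PB")
print(f"Rack SSD: {ssd_slope * 1000:.4f} PB per additional input token/request, "
      f"max deviation {ssd_max_err:.2f} EB")
```

Under this reading, each additional input token per request adds roughly 0.69 PB of DRAM per model, and the Rack SSD slope is about 1.5 times the DRAM slope, in line with the 60%/40% split assumed for the reused KV Cache.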

Exhibit 13: Inference-stage storage by memory tier

Storage tier: what lives here in practice

HBM (GPU memory)
• Model weights (active shards)
• Hot KV cache (recent tokens only, typically last 256–1,024 tokens)
• Temporary activations/workspaces (attention scratch, GEMM buffers, logits, NCCL buffers)
• Runtime metadata (KV block tables, schedulers, CUDA graphs)
• Occasionally small side models (safety filters, routers)

Host DRAM (CPU memory)
• Warm KV cache (paged/offloaded)
• Evicted KV blocks from HBM (majority of KV at scale)
• Request/session state (token queues, schedulers, batching metadata)
• Prompt text + tokenized inputs
• Optional CPU-resident weights (rare for high-throughput chat)

Local/in-rack NAND (NVMe/SSD)
• Cold KV spill (emergency or batch/offline inference only)
• KV checkpoints for long jobs
• Local caches (prompt cache, embedding cache)
• Node-level logs and short-term telemetry buffers

Datalake (shared storage)
• Model storage (checkpoints, sharded weights, multiple variants)
• KV spill for offline/recovery (not latency-sensitive)
• RAG corpora and indices
• Logs, telemetry, traces
• Long-lived caches (prompt reuse, embeddings, evaluation artifacts)

Source: Morgan Stanley Research

What's changed?

Price hikes everywhere. Given recent buyer-seller negotiations, we see DRAM pricing momentum remaining exceptionally strong into 1Q26, with upside risk to our prior forecast. The global DRAM market has entered a phase of dramatic price inflation, driven primarily by the three major suppliers reallocating capacity toward high-margin server DRAM and HBM to meet robust AI inference and infrastructure demand from major CSPs. This shift has created a severe 'capacity crowding-out effect' for PC, mobile, and consumer DRAM, resulting in a firmly entrenched high-price, low-volume seller's market. With US CSPs willing to absorb substantial price increases and actively signing long-term contracts to secure supply,
