华为韬定律-多层电子系统的时间缩微理论(A Time Scaling Theory for Multi-Layer Electronic Systems)_第1页
华为韬定律-多层电子系统的时间缩微理论(A Time Scaling Theory for Multi-Layer Electronic Systems)_第2页
华为韬定律-多层电子系统的时间缩微理论(A Time Scaling Theory for Multi-Layer Electronic Systems)_第3页
华为韬定律-多层电子系统的时间缩微理论(A Time Scaling Theory for Multi-Layer Electronic Systems)_第4页
华为韬定律-多层电子系统的时间缩微理论(A Time Scaling Theory for Multi-Layer Electronic Systems)_第5页
已阅读5页,还剩19页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

ATimeScalingTheoryforMulti-LayerElectronicSystems

TingboHe

Huawei

Abstract

Forsixdecades,Moore'sgeometricscalingdroveprogressinsemiconductors.Thatindustrycompactnolongerholds:returnsfrompuredimensionalshrinkinghaveflattened,leading-edgedesignbudgetsexceedonebilliondollarsperchip,andcost-per-transistoratthemostadvancednodesisnolongerfalling.Thisperspectivearguesforasuccessorscalingprinciple—tscaling—thatadoptstimeitself,ratherthantransistorarea,astheprimarymetricofprogress,applyingasinglecharacteristictimeconstanttastheunifyingoptimizationtargetacrosstwelveordersofmagnitude,fromaswitchingtransistortoadata-centerworkload.Twoproduction-scaledemonstrationsarepresented.OnamobileSoC,LogicFolding—amethodologythatpartitionsdigital,analog,andmemorycircuitsacrossverticallystackedactivetiers—deliversa55%step-wiseincreaseintransistordensityanda41%power-efficiencygainatafixeddevicenode.OnAIsystems,aco-designedstackcomprisingthememory-semanticUnifiedBusfabric,near-packagedHi-ONEopticalI/O,andedge-to-surface3DFoldingprojectsmorethan100×growthinhardwareintegrationby2035.Thedeeperclaimismethodological:tscalingisthefirstscalingprinciplesinceDennardtoestablishasharedoptimizationtargetacrosstheentirecomputingstack.

Lead

Sincethemid-1960s,thesemiconductorindustryhasmeasuredprogressinnanometers.Everyeighteenmonths,transistorsshrank,frequenciesrose,andthecostperlogicgatefell.Moore'sLaw

Thisversionposted2026-05-25.

Thisversionposted2026-05-25.

functionedasbothanempiricalobservationandhelpedestablishanindustrycompactuponwhichtheentirecomputingstackwasbuilt.Thatindustrycompactnolongerholds.Beyondthe7nmnode,geometricscalingnolongerdeliversitshistoricaldividends.Lithographytoolingisapproachingthephysicallimitsofpatterning,EUVdepreciationdominateswafercost,andtheper-transistorpricecurvehasflattened—andinsomecasesreversed.Fororganizationswhoseaccesstothemostadvancedlithographyisconstrained,theconstraintbecamebindingearlierandbearsdownmoreseverely.

Thecentralquestionfortheindustryhasthereforechanged.Itisnolonger"howmuchfurthercanthetransistorshrink?"Itis"whatshouldbescaled,andagainstwhatobjective?"

Overthepastsixyears,theauthor'steamatHuaweiSemiconductorhasinvestigatedthisquestioninsiliconacrossmobileSoCs,AIaccelerators,systemfabrics,andpackaging.Theconclusionisthattheanswerliesnotinanothernode,norinanothertransistorarchitecture,butinachangeoftheprimaryoptimizationtargetitself.Thisperspectivearguesthatthenextdecadeofelectronic-systemevolutionshouldbeguidednotbygeometricscaling,butbytimescaling—thesystematicreductionofasinglecharacteristictimeconstanttacrosseverylayerofthestack,fromatransistorswitchinginapicosecondtoadata-centerworkloadrespondinginasecond.

Thecasefortscalingisdevelopedbelowasbothascientificmethodologyandanindustrialroadmap,drawingonlessonsfrom381chipsbroughttovolumeproductionbetweenMay2020andMay2026.

1.TheEndoftheGeometricEra

Formostofitshistory,thesemiconductorindustryhashadonejob:makethetransistorsmaller.GordonMoore's1965observation—thattransistordensitydoublesapproximatelyeverytwoyears—wascomplementedadecadelaterbyRobertDennard'sscalingtheory,whichestablishedthatproportionalshrinkingofvoltageanddimensionscouldmaintainaconstantelectricfield.Together,geometricscalingandDennardscalingdeliveredexponentialimprovementsin

Thisversionposted2026-05-25.

performanceperwattandperformanceperdollarfornearlyfivedecades.

Thisarrangementunraveledintwostages.Around2005,Dennardscalingbrokefirst:voltageceasedtoscaleproportionallywithfeaturesize,andthedark-siliconerabegan.Geometricscalingpersistedlonger,sustainedbyFinFETandsubsequentlygate-all-around(GAA)devicearchitectures.Beyond7nm,however,returnsfrompuredimensionalscalinghaveflattened.Thereasonsarenowwelldocumented:velocitysaturationreducesthedependenceofintrinsicdelayonchannellengthfromquadratictolinear;theparasiticresistanceandcapacitanceoflocalinterconnectsincreasinglydominatethestandard-celldelaybudget;maskcosts,EUVdepreciation,anddesign-rulecomplexityhavedrivenleading-edgechipdesignbudgetspastonebilliondollarsperchipatthe2nmnode.

Theeconomicconsequencesareequallyinescapable.Costpertransistorhasflattenedatadvancednodesand,attheleadingedge,isnowrising.Theindustrycompactthatsustainedthelastfiftyyears—moretransistorsatlowercosteverygeneration—nolongerholds.

ForHuaweiSemiconductor,thistransitionarrivedwithanadditionalconstraint:restrictedaccesstothemostadvancedlithographytooling.Assumingthatanothernodewouldresolvetheproblemwasnolongertenable.Sixyearsago,thegeometricroadmapplateaued,forcingamorefundamentalquestion—onethat,inretrospect,theentireindustrywilleventuallyhavetoconfront.

2.Time,NotSpace:TheRealCurrencyofMoore'sEra

Reducedtoitsessentialeffectontheenduser,Moore'sLawwasneverfundamentallyaboutgeometry.Smallertransistorsimprovedsystemperformancebecausetheyswitchedfaster.Denserinterconnectsimprovedperformancebecausesignalstraversedshorterdistances.Higherintegrationimprovedperformancebecausedatacrossedfewerboundaries.Whateachgenerationdelivered,inessence,wasareductionintime—picosecondtonanosecondatthedevice,nanosecondtomicrosecondatthechip,microsecondtosecondatthesystem.Spatialscalingservedmerelyastheinstrumentforcompressingtime.

Thisversionposted2026-05-25.

Oncethisisrecognized,anobviousreframingpresentsitself.Timeitselfshouldbeadoptedastheprimarymetric.Acharacteristictimeconstanttcanbedefinedateverylayerofthestack—transistor,circuit,chip,andsystem—anditsreductiontreatedastheunifyingoptimizationtarget.Geometricscalingthenbecomesonetechniqueamongmanyforreducingt,ratherthantheonly

one.

Thisprincipleiscalledtscaling,andisproposedhereasthesuccessortogeometricMoorescalingastheguidingprincipleofsemiconductorevolution.Formally,tistreatedasalayeredconstructthatdecomposesas

T=f(Ttransistor,Tcircuit,Tchip,Tsystem)

whereTtransistor,Tcircuit,Tchip,andTsystemrepresentthetimeconstantsatthetransistor,circuit,chip,andsystemlayer,respectively.Eachlayer'stcomposedfromthelayersbeneathittogetherwiththeorganizationalandcommunicationoverheadsintroducedatthatlayer.Theworkingspaceoftspansapproximatelytwelveordersofmagnitudeintime(picosecondstoseconds)andacomparablerangeinspace(nanometerstokilometers).Ateachlayer,distinctmechanismsareavailableforreducingt:

·Transistor:intrinsicswitchingdelay,addressedthroughmobilityenhancement,strainengineering,high-k/metalgate,andGAAarchitectures,and,increasingly,throughreductionoftheparasiticRandCoflocalinterconnects,whichnowexceedtheintrinsictransittimeby

severalfactors.

·Circuit:RCpropagationdelayalongsignalpaths,addressedthroughlower-resistivityconductors,low-kdielectrics,and—mostconsequentially—throughreductionofwirelengthviaverticalintegration.

·Chip:computeandmemory-accesslatency,addressedthrougharchitecturalchoices,pipelinedepth,memoryhierarchy,andon-chipfabrics.

·System:end-to-endmessageandsynchronizationtime,addressedthroughinterconnecttopology,protocolstack,andfabricdesign.

Ausefulgenerationalruleemergesfromthislayeredformulation:

wherethescalingfactoraisapplication-specificratherthanuniversal.Productionexperiencetodateindicatesa≈1.3×peryearforpower-constrainedmobiledevices,≈1.5×peryearforsafety-criticalautonomoussystems,andupto10×peryearforAIworkloads,wherethroughputtranslatesdirectlyintoeconomicvalue.

Whatrenderstausefulprimarymetric,ratherthanarelabelingofexistingones,isthatitisthesamemetricacrosstheentirestack.Frequency,latency,bandwidth,andthroughputareallgovernedbytattheirrespectivelayers.Aprocesstechnologist,acircuitdesigner,andasystemarchitectcandebatethesamequantityinidenticalunits.tisthelanguagethatenablesend-to-endstackco-optimization—andtheeraofindependentoptimizationateachlayer,withtimingemergingasaresidual,hasconcluded.

3.LogicFolding:AMobile-SoCProofPoint

Thefirstproduction-scaletestoftscalingwasconductedinmobile.AsmartphoneSoCistheunusualcaseinwhichonechipconstitutestheentiresystem.Multi-socketparallelismisnotavailable;nothousand-nodefabriccanmaskaslowlink.Allperformancedeliveredtotheuseroriginatesfromasingledie,underafew-wattpowerenvelope,againstthermallimitssetbyhandheldform-factorconstraints.

After2020,whenaccesstoleading-edgenodeswasrestricted,theoperativequestionbecame:withthenodefixed,howcangeneration-over-generationimprovementscontinuetobedeliveredonasingledie?

TheanswerthatemergediscalledLogicFolding.

Definition.LogicFoldingisadesignmethodologythatpartitionsdigital,analog,andmemorycircuitsacrossverticallystackedactivetierstojointlyoptimizeperformance,power,andareafollowingthetimescalingprinciple.

Thisversionposted2026-05-25.

Thisversionposted2026-05-25.

Digitalcircuitsdivideintocombinationallogic—theBooleannetworkbetweenregisters—andsequentiallogic—theflip-flopsthatholdstate.Theperformanceceilingofadigitalsystemissetbythecritical-pathdelaybetweenadjacentflip-flopstages,whichinturnisdominatedbyinterconnectRCandgatecountalongthatpath.Conventionaloptimizationplacesgatesinaplaneandrouteswiresthroughametalstackabove;thelongerthewire,thegreatertheparasiticRC,andtheslowerthecriticalpath.

LogicFoldingabandonstheplanarassumption.Critical-pathgatesaredistributedacrosstwo(andeventuallymore)verticallystackedactivetiers,connectedthroughultra-fine-pitchhybridbonding.Fromthecircuitdesigner'sperspective,thetwotiersbehaveasasinglecontinuousfabric,withcellsdistributedacrossthewaferboundaryasifitwereanadditionalmetallayer.Signalwiresbecomesubstantiallyshorter,parasiticRCdecreasessharply,clockskewtightens,andthechipoperatesatahigherclockfrequencyatthesamedevicenode.

TohelpLogicFoldingdeliverthesegains,itisadvantageoustokeepthegearratiobetweenhybrid-bondingpitchandtop-metalpitchcomparativelylow—roughlybelow3inpractice,withlowerratiosgenerallybetter.Withtoday'stop-metalpitcharound720nm,thistranslatesintoahybrid-bondingpitchbelow2μm—andideallytoagearratioofapproximately1,atwhichthebird-cageroutingoverheadatthebondinginterfaceeffectivelyvanishes.Achievingthispitch,togetherwiththerequiredoverlayaccuracy(<0.5μm),TSVscaling(CDandKOZsub-1.5μm,pitchsub-6μm),andyield(~100%withsmartredundancy),requiredamulti-yearprocess-developmenteffortacrossthesupplierandpartnerecosystem.

Theresults,measuredonKirin2026,areconcrete:

·Transistordensityrosestep-wisefrom155to238MTr/mm²inasinglegeneration(transistordensityiscalculatedusingtheformula;theareautilizationofKirinSoCdesignis68%)—amagnitudeofimprovementthatpreviouslyrequiredthreeyearsofgeometricscaling.

·SoCperformance-corepowerefficiencyimprovedby41%andmaximumclockfrequency

Thisversionposted2026-05-25.

rosebynearly13%.

·Ahigh-speedglobalNetwork-on-Chipdatapathconstructedacrossbothupperandlowertiersreducedthedata-pathfootprintby55%,withimprovedpower-deliverystability.

·Apost-siliconclock-skewadjustmentschemecontributedover5%SoCperformanceindependently.

·OnSRAM—whereaccessspeed,energy-per-bit,andareadependstronglyonbit-lineandword-linelength—LogicFoldingshortenedcriticalpaths,reducedenergyperbit,and

increasedoperatingfrequencybyover40%.

·Onarepresentativeprocessingcore,thedouble-layerfoldingarchitecturereducedclock-

buffercountbymorethan50%,clockskewby25%,andwirelengthbyapproximately30%.

Thesegainswereachievedatafixeddevicenode,obtainednotthroughanewlithographystepbutthroughatopologicalreorganizationofthespatialdistributionoflogicinthreedimensions.

TheLogicFoldingimplementationshippinginKirin2026isdeliberatelyconservative.Thehybrid-bondingpitchreached1.5μm;TSVlandingadvancedonlyonestepbelowthetopmetal;foldingwasappliedselectivelyalongkeycriticalpathsratherthanacrosstheentiredesign.Evenso,theCPUperformance-corefrequencyreturnsto3.1GHzthisyear.

Overthenextdecade,LogicFoldingisexpectedtoevolvefromlocalcritical-pathfoldingtofull-scale,multi-layerfolding—three,four,andmoreactivetiersperpackage—enabledbylower-temperaturehybridbonding(relaxingthethermalbudgetacrosstiers)andbyTSVlandingmigratingfromthetopmetaldowntoM6,whichliberatesover30%ofhigh-levelroutingresources.From2026to2035,transistordensityisprojectedtorisetoward400MTr/mm²andbeyond.Simultaneously,LogicFoldingenablesKirintosubstantiallystepupCPUcorefrequency,andpavesthewaystowards4GHzandbeyond(Table1).Theroadmapisfeasibleand,incostterms,economicallyviable.

Table1.TrendoftheoperatingfrequencyofKirinCPUperformancecore.

Thisversionposted2026-05-25.

SoC

Architecture

Frequency(GHz)

State

2023

Kirin9000s

Planar

2.6

Massproduct

2024

Kirin9020

Planar

2.65

Massproduct

2025

Kirin9030pro

Planar

2.75

Massproduct

2026

Kirin2026

LogicFolding

3.1

Silicon

2027

Kirin2027

LogicFolding

3.39

Silicon

2028

Kirin2028

LogicFolding

3.71

Pre-silicon

2029

Kirin2029

LogicFolding

4

Pre-silicon

SidebarA—LogicFoldingataGlance

·Hybrid-bondingpitch:sub-2μm(1.5μminKirin2026;targetgearratio≈1)

·Overlayaccuracy:under0.5μm

·TSVCD/KOZ:sub-1.5μm;pitchsub-6μm;failurerate<100ppm;repairrate99.9%

·Yield:~100%withsmartredundancy

·Transistordensity:155→238MTr/mm²inasinglestep

·Power-efficiency/frequencygain(SoCP-core):+41%/+13%

·SRAMoperatingfrequency:+40%+

·Clock-buffercount/clockskew/wirelengthonarepresentativecore:-50%/-25%/-30%

4.FromPicosecondstoMicroseconds:tScalingintheAIDataCenter

AnaturalquestioniswhetheraprincipledevelopedinthemilliwattsmartphoneregimesurvivestranslationtothegigawattregimeofAItrainingandinference.AIworkloadsoccupytheoppositeendofthetspectrum:notasinglechipbuthundredsorthousandsofchipsbehavingasonemachine,withaggregatecomputeincreasingbyapproximatelysixordersofmagnitudeoverthepastdecade.Theanswerisaffirmative—providedtistreatedasasystem-levelobjectiveandappliedacrossthewholechain,ratherthanwithinasingleaccelerator.

TwofactsshapetheAIsideofthetargument.First,AIsystemscontinuetogrow—fromonechip,todozens,tohundreds,andincreasinglytotensofthousands.Second,theenergybudgetandthematerialsbudgetofmodernAIsystemsaredominatedbydata,notbycompute.Over80%ofenergy

Thisversionposted2026-05-25.

inalargeAIclusterisconsumedbydatamovement;over70%ofsystemcostisallocatedtodatastorage.Theimplicationisdirect:reducingthetimedataspendsintransit—betweenchips,betweenracks,andwithinthepackage—isatleastasimportantasreducingthetimecomputespendscomputing.

tscalingisinstantiatedatAIscalethroughthreecoordinatedlayers:asystemfabric(UnifiedBus),anear-packagedopticalengine(Hi-ONE),andatopologicalreorganizationofthepackageitself(3DFolding).

4.1UnifiedBus—At-FirstSystemFabric

Traditionalmulti-node,multi-acceleratorarchitecturesmovedataacrossmultiplestackedprotocols:PCletothehost,NVLinkorproprietaryfabricswithinthechassis,EthernetorInfiniBandbetweenchassis,andsoftware-stackremote-memoryaccessontop.Eachlayerentailsaprotocolconversion,additionalserialization,anextraDMAbuffer,andafurtherhandshake.Everyconversionaddslatency,reducesreliability,andincursadditionalcost.

UnifiedBus(UB)replacesthisstackwithasingleprotocolthatoperateswithinandacrossthechassis—afullypeer-to-peerfabricthatexposesmemorysemanticsnativelyacrossthewholesystem.Datamovementisreducedtoconversion-free,peer-to-peertransmissionatthememory-semanticlayer,withhardware-managedcoherenceinplaceofsoftware-stackmessagepassing.

Themeasuredbenefitisapproximatelytwoordersofmagnitude:end-to-endremote-accesslatencyfallsfromthetensofmicrosecondstypicalofTCP/IP-classstackstoapproximately100ns—a~500×reductioninsystemtalongthedominantcommunicationaxis.Attherackscale,thisbringsthesystemasymptoticallyclosetoasingle,fabric-coherentmachine—designatedinternallyasaSystem-as-One-Chip.

4.2Hi-ONE—OpticalI/OatthePackage

Oncecommunicationlatencyisreduced,thenextbottleneckshifts.Increasingthedensityofchipswithinasinglerackpushespowerdensityandreliabilitypasttheirlimits—andpusheselectricalSerDespasttheirs.At400Gb/sperAIchip,coppercablingremainswellunderstoodandreliable.

Thisversionposted2026-05-25.

Atmulti-Tb/sperchip,copperbecomesphysicallyimpractical:SerDesreachcontracts,cablingbecomesprohibitivelybulky,panelinstallationbecomesinfeasible,andthermalandpower-deliverymarginsareexhausted.

TheapproachdevelopedatHuaweiSemiconductoristheHigh-densityOptical-interconnect-NodeEngine,Hi-ONE—anear-packagedopticalenginethatdelivers8Tb/spermodule,matchingtheUBbandwidthofanAIchiponasingleopticallink.ItreducestherequiredSerDesreachfrom~100cmto~5cm,eliminatesbulkycabling,andextendsreachfromunderameterto100meters—renderinghigh-densityinterconnectfordistributed,gigawatt-scaledatacentersphysicallyrealizable.

ThedesignphilosophyunderlyingHi-ONEisitselfat-scalingargument.InplaceofaheavyDSPforhighsignalfidelity,Hi-ONEadoptsalinearapproach—ananalogequalization-enhanceddriverandtrans-impedanceamplifier—andpermitstheUBprotocoltotolerateadeliberatelyrelaxedbit-errorrate.Thiscross-layertradebetweenprotocollayerandphysicallayerreducespower,cost,andintegrationcomplexity,andepitomizesthecross-layertrade-offthatat-firstmethodologyrewards.

4.3TheN²-vs-NDilemma,andWhy3DFoldingIsInevitable

ThedeepestreasonAIacceleratorswillnotstopat2.5Dfan-outisgeometric,andmeritsexplicitstatementbecauseitdeterminesthepost-2030roadmap.

Inaconventional2.5DAIchip,thelogicdieoccupiesthecenterofthepackage,HBMstacksandSerDeslineitsedges,andvoltageregulatorssurroundthepackage.Everymemorysignal,everyinterconnectsignal,andeveryampereofsupplycurrentmusttraversethedie'sedgetoreachthecomputeresourceswithin.IfthediehassidelengthN,then:

·computecapacityscalesasN²(area),

·butmemorybandwidth,interconnect,andpowerdelivery—allcarriedbythe2.5Dfan-outalongtheedge—scaleonlyasN(perimeter).

Thewideningdivergencebetweenthesequadraticandlinearcurvesconstitutesthefan-outdilemma,

Thisversionposted2026-05-25.

anditaccountsforthestallingof2.5Dscalingindependentofhowaggressivetheunderlyinglogicnodebecomes.Notransistor-levelimprovementclosesatopologicaldeficit.

3DFoldingresolvesthisdilemmabyrelocatingtheedge-boundresourcesontosurfaces.Powerdelivery(viabacksidepowerandintegratedvoltageregulators),high-speedmemory(viahybridbondingtologic),andopticalI/O(vianear-packagedHi-ONE)allmigratefromperimetertoverticalsurface—and,oncelocatedonasurface,theyscaleasN²,matchingthequadraticpaceof

compute.ThepackageisnolongeralogicdiesurroundedbyaperimeterbeltofmemoryandSerDes;itbecomesaverticallyintegratedstackinwhichmemory,fabric,power,andlogicallscaletogether.

Theroadmapplacesthisevolutiononanexplicittimeline.Throughapproximately2030,AIaccelerators(theAscendSuperPoDline—Ascend910Cin2025,Ascend950in2026,andthe990tofollow)relyonacombinationofmaturetechniques:chiplets,2.5Dfan-out,and3Dstackingviamicro-bumpandstandard-pitchhybridbonding.Around2030,Ascend990willintroduceLogicFoldingintotheAIacceleratorclass,andfromthatpoint3DFoldingbecomestheprincipalcarrierofathrough2035.Alongthispath,hardwareintegrationisprojectedtoincreasebymorethan100×by2035,withtreductiondistributedacrosseverylayerofthestackratherthanconcentratedatthedevicelevel.

SidebarB—tatAISystemScale

·UBremote-accesslatency:~10sofμs→~100ns(≈500×treduction)

·HiONEper-modulebandwidth:8Tb/s(matchesper-chipUBbandwidth)

·HiONESerDesreach:~100cm→~5cm;panel-to-panelreach:<1m→100m

·Fan-outdilemma:computeαN²,perimeter-boundBW/I/O/powerαN

·3DFolding:relocatesBW,opticalI/O,andpowerdeliveryfromedgesontosurfaces,restoringN²parity

·2026→2035projectedhardware-integrationgrowth:>100×

Thisversionposted2026-05-25.

5.LogicandMemory:FromDecouplingtoRe-Fusion

Oneimplicationoftscalingwarrantsseparatediscussion,becauseitsconsequencesareindustrialaswellastechnical.

Inthe8086era,theindustrydeliberatelydecoupledprocessorsandmemorythroughstandardizedmemorybuses.Thatdecouplingpermittedtwoindustriestoscaleindependently:processorperformanceadvancedrapidlyalongtheMoorecurve,whilememoryvendorsdevelopedavast,separatemarketalongsideit.

TheAIeraisreversingthisdecoupling.Thecontinuingexpansionofcomputedensityispushingmemorybandwidth,latency,power,andpackagingtotheirlimits.HBM,hybridbonding,and3D-stackedSRAMaresymptomsofasingleunderlyingfact:formodernAIworkloads,datamovementisascriticalascomputationitself,andlogicandmemoryareonceagainbeingdrivenintotightphysicalintegration.Astheyfuse,thebalanceofinfluenceinthesupplychainisshiftingtowardmemoryandpackagingvendors.

Thetechnologicaldirectionisunambiguous,buttheeconomicresolutionisnotyetsettled.EnduringsuccessintheAIhardwareerawillaccruetothosewhocanfuselogicandmemorytechnologicallyandestablishaneconomicpartnershipthatallowsbothindustriestosharethebenefitsofthatfusionoverthelongterm.Thisisnotmerelyaresearchproblem;itisastructuralproblemfortheindustrytoaddressoverthenextdecade.Byrenderingthecross-layercostofeveryseparationvisible,tscalingensuresthattheproblemcannotbedeferred.

6.OpenChallenges

Itwouldbemisleadingtopresenttscalingasacompletedsystem.Severalsubstantiveproblemsremainopen,andareidentifiedherebothtohighlightongoingworkandtoinvitecollaboration.

Toolchainsandmethodologies.Today'sEDAwasdevelopedforanerainwhicharea,timing,andpowerwereoptimizedalongthreeseparateaxes,withsystemtemergingasaresidual.Full-scale

Thisversionposted2026-05-25.

LogicFoldingrequiresthetoolchaintotreatmultiplestackeddiesasasinglecontinuousdesignentity—partitioninglogicatcellgranularityratherthanblockgranularity,placingacrossthefullvolumeunderaunifiedcostfunction,andperformingtimingclosureacrossinter-diepathswherevertical-interconnectparasitics,KOZexclusions,andinter-waferprocessvariationinteractinwaysthattraditional2D-trainedtoolsdonotaddressadequately.Preliminaryinternaltoolshavebeendevelopedthatproduceusefulresults,andmethodologydetailswillbepublishedinthecomingmonths.Aτ-nativetoolchain—open,multi-physics,and3D-native—isthesinglemostimportantenablinginvestmentforthenextdecade.

Inter-waferprocessvariation.LogicFoldingbondswafersfrompotentiallydistinctlots—andinsomecasesdistinctnodes.Inter-wafervariationinVt,drivecurrent,andinterconnectRCismateriallygreaterthanwithin-wafervariation,andfallsmostheavilyonclockdistributionandhold-timemargins.Smartredundancy,adaptivecompensation,andT-awaresignoffflowsarenecessarycomponentsoftheresponse.

Vertical-interconnectoverhead.EveryhybridbondandeveryTSVincursafiniteresistanceandcapacitancepenalty,andTSVKOZdisplacesstandardcells.LogicFoldingmustthereforebejustifiedlayerbylayerthroughthesimpleinequality

TBenefit(effectivesiliconarea+wirelengthreduction)

>Tpenalty(verticalinterconnectRC)

Thisthresholdhasbeencrossedformobilecriticalpathsandformemory;thethresholdisworkload-specific,andtheboundarywillmoveasbondingpitchshrinks.

Energy.tisatimelaw,notajoulelaw.Asuper-nodeoperating10×fasterbutwith10×greaterpowerconsumptionviolatesnoscalingprinciple,yetexceedsgridcapacity.tscalingthereforerequiresanenergycompanion:memory-semanticfabricsthateliminatestackoverhead,near-/co-packagedopticsthatreducepicojoulesperbitbyordersofmagnitude,backsidepowerdelivery,compute-in/near-memory,andthedisciplinedpracticeoftradingtheadroombackforpower

Thisversionposted2026-05-25.

(DVFSatdata-centerscale—thesamemechanismthatenabledsmartphonebatterylongevity).Importantly,theadroomitselfprovidesenergyheadroomwhenallocatedinthatdirection.

Benchmarks.Theindustry'scurrentperformancebenchmarks—Linpack,MLPerf,SPEC—weredesignedforanerainwhichasinglescalarperworkloadsufficed.Aτ-scalingindustryrequiresT-profilebenchmarks—vectorsthatexposethedominanttateachlayerofasystemtogetherwiththeheadroomremainingatthatlayer.Thedominant-tlayeris,bydefinition,thenextinvestment.

7.SixYearsIn,TenYearsOut

BetweenMay2020andMay2026,HuaweiSemiconductordesignedandbroughttovolumeproduction381chipsservingmobile,AI,automotive,industrial,andinfrastructuremarkets.Acrossthatportfolio,thetscalingthesishasheldup:

·Atthedeviceandcircuitlayers,transistordensityhasrisenfrom155toward400+MTr/mm²b

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论