
Memory Coherence in Shared Virtual Memory Systems

KAI LI
Princeton University
and
PAUL HUDAK
Yale University

The memory coherence problem in designing and implementing a shared virtual memory on loosely coupled multiprocessors is studied in depth. Two classes of algorithms, centralized and distributed, for solving the problem are presented. A prototype shared virtual memory on an Apollo ring based on these algorithms has been implemented. Both theoretical and practical results show that the memory coherence problem can indeed be solved efficiently on a loosely coupled multiprocessor.

Categories and Subject Descriptors: C.2.1 [Computer-Communication Networks]: Network Architecture and Design - network communications; C.2.4 [Computer-Communication Networks]: Distributed Systems - network operating systems; D.4.2 [Operating Systems]: Storage Management - distributed memories; virtual memory; D.4.7 [Operating Systems]: Organization and Design - distributed systems

General Terms: Algorithms, Design, Experimentation, Measurement, Performance

Additional Key Words and Phrases: Loosely coupled multiprocessors, memory coherence, parallel programming, shared virtual memory

1. INTRODUCTION

The benefits of a virtual memory go without saying; almost every high-performance sequential computer in existence today has one. In fact, it is hard to believe that loosely coupled multiprocessors would not also benefit from virtual memory. One can easily imagine how virtual memory would be incorporated into a shared-memory parallel machine because the memory hierarchy need not be much different from that of a sequential machine. On a multiprocessor in which the physical memory is distributed, however, the implementation is not obvious.

This research was supported in part by National Science Foundation grants MCS-8302018, DCR-8106181, and CCR-8814265. A preliminary version of this paper appeared in the Proceedings of the 5th Annual ACM Symposium on Principles of Distributed Computing [36].
Authors' addresses: K. Li, Department of Computer Science, Princeton University, Princeton, NJ 08544; P. Hudak, Department of Computer Science, Yale University, New Haven, CT 06520.
Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice
and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1989 ACM 0734-2071/89/1100-0321 $01.50
ACM Transactions on Computer Systems, Vol. 7, No. 4, November 1989, Pages 321-359.

The shared virtual memory described in this paper provides a virtual address space that is shared among all processors in a loosely coupled distributed-memory multiprocessor system. Application programs can use the shared virtual memory just as they do a traditional virtual memory, except, of course, that processes can run on different processors in parallel. The shared virtual memory not only pages data between physical memories and disks, as in a conventional virtual memory system, but it also pages data between the physical memories of the individual processors. Thus data can naturally migrate between processors on demand. Furthermore, just as a conventional virtual memory swaps processes, so does the shared virtual memory. Thus the shared virtual memory provides a natural and efficient form of process migration between processors in a distributed system. This is quite a gain because process migration is usually very difficult to implement. In effect, process migration subsumes remote procedure calls.

The main difficulty in building a shared virtual memory is solving the memory coherence problem. This problem is similar to that which arises with multicache schemes for shared-memory multiprocessors, but they are different in many ways. In this paper we concentrate on the memory coherence problem for a shared virtual memory. A number of algorithms are presented, analyzed, and compared. A prototype system called IVY has been implemented on a local area network of Apollo workstations. The experimental results of nontrivial parallel programs run on the prototype show the viability of a shared virtual memory. The success of this implementation suggests an operating mode for such architectures in which parallel programs can exploit the total processing power and memory capabilities in a far more unified way than the traditional message-passing approach.

2. SHARED VIRTUAL MEMORY

A shared virtual memory is a single address space shared by a number of processors (Figure 1). Any processor can access any memory location in the address space directly. Memory mapping managers implement the mapping between local memories and the shared virtual memory address space. Other than mapping, their chief responsibility is to keep the address space coherent at all times; that is, the value returned by a read operation is always the same as the value written by the most recent write operation to the same address.

A shared virtual memory address space is partitioned into pages. Pages that are marked read-only can have copies residing in the physical memories of many processors at the same time. But a page marked write can reside in only one processor's physical memory. The memory mapping manager views its local memory as a large cache of the shared virtual memory address space for its associated processor. Like the traditional virtual memory [17], the shared memory itself exists only virtually. A memory reference causes a page fault when the page containing the memory location is not in a processor's current physical memory. When this happens, the memory mapping manager retrieves the page from either disk or the memory of another processor. If the page of the faulting memory reference has copies on other processors, then the memory mapping manager must do some work to keep the memory coherent and then continue the faulting instruction. This paper discusses both centralized manager algorithms and distributed manager algorithms, and in particular shows that a class of distributed manager algorithms can retrieve pages efficiently while keeping the memory coherent.

Fig. 1. Shared virtual memory mapping.

Our model of a parallel program is a set of processes (or threads) that share a single virtual address space. These processes are lightweight: they share the same address space, and thus the cost of a context switch, process creation, or process termination is small, say, on the order of a few procedure calls (Roy Levin, personal communication, 1986). One of the key goals of the shared virtual memory, of course, is to allow processes of a program to execute on different processors in parallel. To do so, the appropriate process manager and memory allocation manager must be integrated properly with the
memory mapping manager. The process manager and the memory allocation manager are described elsewhere [34]. We refer to the whole system as a shared virtual memory system.

The performance of parallel programs on a shared virtual memory system depends primarily on two things: the number of parallel processes and the degree of updating of shared data (which creates contention on the communication channels). Since any processor can reference any page in the shared virtual memory address space and memory pages are moved and copied on demand, the shared virtual memory system does not exhibit pathological thrashing for unshared data or shared data that is read-only. Furthermore, updating shared data does not necessarily cause thrashing if a program exhibits locality of reference. One of the main justifications for traditional virtual memory is that memory references in sequential programs generally exhibit a high degree of locality [16, 17]. Although memory references in parallel programs may behave differently from those in sequential ones, a single process is still a sequential program and should exhibit a high degree of locality. Contention among parallel processes for the same piece of data depends on the algorithm, of course, but a common goal in designing parallel algorithms is to minimize such contention for optimal performance.

There is a large body of literature related to the research of shared virtual memory. The closest areas are virtual memory and parallel computing on loosely coupled multiprocessors.

Research on virtual memory management began in the 1960s [15] and has been an important topic in operating system design ever since. The research focused on the design of virtual memory systems for uniprocessors. A number of the early systems used memory mapping to provide access to different address spaces. The representative systems are Tenex and Multics [5, 13]. In these systems, processes in different address spaces can share data structures in mapped memory pages. But the memory mapping design was exclusively for uniprocessors.

Spector proposed a remote reference/remote operation model [42] in which a master process on a processor performs remote references and a slave process on another processor performs remote operations. Using processor names as part of the address in remote reference primitives, this model allows a loosely coupled multiprocessor to behave in a way similar to Cm* [24, 29] or Butterfly [6] in which a shared memory is built from local physical memories in a static manner. Although implementing remote memory reference primitives in microcode can greatly improve efficiency, the cost of accessing a remote memory location is still several orders of magnitude more expensive than a local memory reference. The model is useful for data transfer in distributed computing, but it is unsuitable for parallel computing.

Among the distributed operating systems for loosely coupled multiprocessors, Apollo Aegis [2, 32, 33] and Accent [20, 38] have had a strong influence on the integration of virtual memory and interprocess communication. Both Aegis and Accent permit mapped access to data objects that can be located anywhere in a distributed system. Both of them view physical memory as a cache of virtual storage. Aegis uses mapped read and write memory as its fundamental communication paradigm. Accent has a similar facility called copy-on-write and a mechanism that allows processes to pass data by value. The data sharing between processes in these systems is limited at the object level; the system designs are for distributed computing rather than parallel computing.

Realistic parallel computing work on loosely coupled multiprocessors has been limited. Much work has focused on message passing [11, 19, 39]. It is possible to gain large speedups over a uniprocessor by message passing, but programming applications is difficult [11]. Furthermore, as mentioned above, message passing has difficulties in passing complicated data structures.

Another direction has been to use a set of primitives, available to the programmer in the source language, to access a global data space for storing shared data structures [8, 11]. The chief problem with such an approach is the user's need to control the global data space explicitly, which can become especially complex when passing large data structures or when attempting process migration. In a shared virtual memory such as we propose, no explicit data movement is required (it happens implicitly upon memory reference), and complex data is moved as easily as simple data. Another serious problem with the explicit global data space approach is that efficiency is impaired even for local data since use of a primitive implies at least the overhead of a procedure call. This problem becomes especially acute if one of the primitive operations occurs in an inner loop, in which case execution on one processor is much slower than that of the best sequential program, that is, one in which the operation is replaced with a standard memory reference. In contrast, when using our shared virtual memory, the inner loop would look just the same as its sequential version, and thus the overhead for accessing local data would be exactly the cost of a standard memory reference. The point is that, once the pages holding a global data structure are paged in, the mechanism for accessing the data structure is precisely the same as on a uniprocessor.

The concept of a shared virtual memory for loosely coupled multiprocessors was first proposed in [36] and elaborated in the Ph.D. dissertation [34]. Details of the first implementation, IVY, on a network of workstations were reported in [34] and [35]. On the basis of this early work, a shared virtual memory system was later designed for the Lotus operating system kernel [21]. Most recently, the concept has been applied to a large-scale interconnection network based shared-memory multiprocessor [12] and a large-scale hypercube multiprocessor [37].

Other related work includes software caches and analysis of memory references. The VMP project at Stanford implements a software virtual addressed cache [10] to provide multicomputers with a coherent shared memory space. Their initial experience shows that a cache line size can be as large as 128 or 256 bytes without performance degradation. The cache consistency protocol is similar to the dynamic distributed manager algorithm for shared virtual memory in this paper and its preliminary version [34]. Finally, techniques for analyzing memory references of parallel programs [18, 45] may be applicable to analyzing the behaviors of parallel programs using a shared virtual memory system.

3. MEMORY COHERENCE PROBLEM

A memory is coherent if the value returned by a read operation is always the same as the value written by the most recent write operation to the same address. An architecture with one memory access path should have no coherence problem. A single access path, however, may not satisfy today's demand for high performance. The memory coherence problem was first encountered when caches appeared in uniprocessors (see [40] for a survey) and has become more complicated with the introduction of multicaches for shared memories on multiprocessors [9, 23, 25, 31, 43, 46, and Chuck Thacker, personal communication, 1984].

The memory coherence problem in a shared virtual memory system differs, however, from that in multicache systems. A multicache multiprocessor usually has a number of processors sharing a physical memory through their private caches. Since the size of a cache is relatively small and the bus connecting it to the shared memory is relatively fast, a sophisticated coherence protocol is usually implemented in the multicache hardware such that the time delay of conflicting writes to a memory location is small. On the other hand, a shared virtual memory on a loosely coupled multiprocessor has no physically shared memory, and the communication cost between processors is nontrivial. Thus conflicts are not likely to be solved with negligible delay, and they resemble much more a page fault in a traditional virtual memory system.

There are two design choices that greatly influence the implementation of a shared virtual memory: the granularity of the memory units (i.e., the page size) and the strategy for maintaining coherence. These two design issues are studied in the next two subsections.

3.1 Granularity

In a typical loosely coupled multiprocessor, sending large packets of data (say one thousand bytes) is not much more expensive than sending small ones (say less than ten bytes) [41]. This similarity in cost is usually due to the software protocols and overhead of the virtual memory layer of the operating system. If these overheads are acceptable, relatively large memory units are possible in a shared virtual memory. On the other hand, the larger the memory unit, the greater the chance for contention. Detailed knowledge of a particular implementation might allow the client's programmer to minimize contention by arranging concurrent memory accesses to locations in different memory units. Either clients or the shared virtual memory storage allocator may try to employ such strategies, but this may introduce inefficient use of memory. So, the possibility of contention indicates the need for relatively small memory units.

A suitable compromise in granularity is the typical page as used in conventional virtual memory implementations, which vary in size on today's computers from 256 bytes to 8K bytes. Our experience indicates that a page size of about 1K bytes is suitable with respect to contention, and as mentioned above should not impose undue communications overhead. We expect that smaller page sizes (perhaps as low as 256 bytes) work well also, but we are not as confident about larger page sizes, due to the contention problem. The right size is clearly application dependent, however, and we simply do not have the implementation experience to say what size is best for a sufficiently broad range of parallel programs. In any case, choosing a page size consistent with that used in conventional virtual memory implementations has the advantage of allowing one to use existing page fault schemes. In particular, one can use the protection mechanisms in a hardware Memory Management Unit (MMU) that allow single instructions to trigger page faults and to trap appropriate fault handlers. A program can set the access rights to the pages in such a way that memory accesses that could violate memory coherence cause a page fault; thus the memory coherence problem can be solved in a modular way in the page fault handlers and their servers.

3.2 Memory Coherence Strategies

It is helpful to first consider the spectrum of choices one has for solving the memory coherence problem. These choices can be classified by the way in which one deals with page synchronization and page ownership, as shown in Table I.

Table I. Spectrum of Solutions to the Memory Coherence Problem

                                        Page ownership strategy
                      ------------------------------------------------------------------
                                                         Dynamic
                                       -------------------------------------------------
Page synchronization                   Centralized      Distributed manager
method                Fixed            manager          Fixed            Dynamic
----------------------------------------------------------------------------------------
Invalidation          Not allowed      Okay             Good             Good
Write-broadcast       Very expensive   Very expensive   Very expensive   Very expensive

Page Synchronization. There are two basic approaches to page synchronization: invalidation and write-broadcast. In the invalidation approach, there is only one owner processor for each page. This processor has either write or read access to the page. If a processor Q has a write fault to a page p, its fault handler then invalidates all copies of p, changes the access of p to write, moves a copy of p to Q if Q does not have one already, and returns to the faulting instruction. After returning, processor Q owns page p and can proceed with the write operation and other read or write operations until the page ownership is relinquished to some other processor. Processor Q, of course, does not need to move the copy of the page if it owns the page for reading. If a processor Q has a read fault to a page p, the fault handler then changes the access of p to read on the processor that has write access to p, moves a copy of p to Q and sets the access of p to read, and returns to the faulting instruction. After returning, processor Q can proceed with the read operation and other read operations to this page in the same way that normal local memory does until p is relinquished to someone else.

In the write-broadcast approach, a processor treats a read fault just as it does in the invalidation approach. However, if a processor has a write fault, the fault handler then writes to all copies of the page, and returns to the faulting instruction.

The main problem with this approach is that it requires special hardware support. Every write to a shared page needs to generate a fault on the writing processor and update all copies because the philosophy of a shared virtual memory requires that pages be shared freely. To prevent the processor from having the same page fault again when returning to the faulting instruction, the hardware must be able to skip the faulted write cycle. We do not know of any existing hardware with this functionality.

The theoretical analysis on snoopy cache coherence [30] suggests that combining the invalidation approach with the write-broadcast approach may be a better solution. However, whether this approach can apply to the shared virtual memory is an open problem because the overhead of a write fault is much more than a write on a snoopy cache bus. Since the algorithms using write-broadcast do not seem practical for loosely coupled multiprocessors, they are not considered further in this paper.

Page Ownership. The ownership of a page can be fixed or dynamic. In the fixed ownership approach, a page is always owned by the same processor. Other processors are never given full write access to the page; rather they must negotiate with the owning processor and must generate a write fault every time they need to update the page. As with the write-broadcast approach, fixed page ownership is an expensive solution for existing loosely coupled multiprocessors. Furthermore, it constrains desired modes of parallel computation. Thus we only consider dynamic ownership strategies, as indicated in Table I.

The strategies for maintaining dynamic page ownership can be subdivided into two classes: centralized and distributed. Distributed managers can be further classified as either fixed or dynamic, referring to the distribution of ownership data.

The resulting combinations of strategies are shown in Table I, where we have marked as "very expensive" or "not allowed" all combinations involving write-broadcast synchronization or fixed page ownership. This paper only considers the remaining choices: algorithms based on invalidation using either a centralized manager, a fixed distributed manager, or a dynamic distributed manager.

3.3 Page Table, Locking, and Invalidation

All of the algorithms for solving the memory coherence problem in this paper are described by using page fault handlers, their servers, and the data structure on which they operate. The data structures in different algorithms may be different, but they have at least the following information about each page:

    access: indicates the accessibility to the page,
    copy set: contains the processor numbers that have read copies of the page, and
    lock: synchronizes multiple page faults by different processes on the same processor and synchronizes remote page requests.

Following uniprocessor virtual memory convention, this data structure is called a page table. Every processor usually has a page table on it, but the same page entry in different page tables may be different.

There are two primitives operating on the lock field in the page table:

    lock( PTable[ p ].lock ):
        LOOP
            IF test-and-set the lock bit THEN EXIT;
            IF fail THEN queue this process;

    unlock( PTable[ p ].lock ):
        clear the lock bit;
        IF a process is waiting on the lock THEN resume the process;

These two primitives are used to synchronize multiple page fault requests on the same processor or different processors.

Another primitive that we use in memory coherence algorithms is invalidate. There are at
least three ways to invalidate the copies of a page: individual, broadcast, and multicast. The individual invalidation is just a simple loop:

    Invalidate:
        Invalidate( p, copy_set )
            FOR i in copy_set DO
                send an invalidation request to processor i;

Broadcast or multicast invalidation does not need a copy set; each just requires a simple broadcast message.

The server of the invalidation operation is simple:

    Invalidate server:
        PTable[ p ].access := nil;

Although there are many ways to implement remote operations, it is reasonable to assume that any remote operation requires two messages, a request and a reply, and that a reliable communication protocol is used so that once a processor sends a request (no matter whether it is a point-to-point message, broadcast, or multicast), it eventually receives a reply. With such an assumption, for m copies on an N-processor system, an individual invalidation requires 2m messages, m for requests, and m for replies. A broadcast invalidation sends m + 1 messages and receives N + m - 1 messages of which N - 1 messages are received in parallel. A multicast invalidation needs to send m + 1 messages and receive 2m messages of which m messages are received in parallel.

The cost of receiving k messages in parallel is greater than or equal to that of receiving one message and less than or equal to that of receiving k messages sequentially. If all k processors are idle, receipt of these messages in parallel costs nothing since the idle time would otherwise be wasted. On the other hand, if all k processors are busy, receipt of the messages would cost more since all k processors would need to be interrupted in order to process the messages.

Clearly, multicast invalidation has the best performance, although most loosely coupled systems do not have a multicast facility that can use the page table information. Broadcast invalidation is expensive when N is large.

A copy set can be represented by a bit vector [1] when N is small (e.g., less than 64). When N is large, we may need to compact the copy set field. Three simple compaction methods are considered:

    linked bit vector: represents a copy set as a linked list that only links meaningful bit vectors together to save space.
    neighbor bit vector: uses a bit vector as its copy set for neighbor processors (directly connected processors). This method requires processors to propagate invalidation requests.
    vaguely defined set: uses a tag to indicate whether there is a valid copy set. This allows the shared virtual memory to dynamically allocate memory for copy sets.

More detailed discussion on page table compaction can be found in [34].

4. CENTRALIZED MANAGER ALGORITHMS

4.1 A Monitor-Like Centralized Manager Algorithm

Our centralized manager is similar to a monitor [7, 27] consisting of a data structure and some procedures that provide mutually exclusive access to the data structure. The coherence problem in multicache systems has a similar solution [9]. The centralized manager resides on a single processor and maintains a table called Info which has one entry for each page, each entry having three fields:

    (1) The owner field contains the single processor that owns that page, namely, the most recent processor to have write access to it.
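
As a rough illustration of the per-page information and the individual invalidation described in Section 3.3, the following Python sketch models a page table entry and counts the messages exchanged. It is a single-process simulation under stated assumptions, not the IVY implementation: the names PageEntry, invalidate, and message_counts are hypothetical, the lock is reduced to a boolean flag, and message passing is replaced by direct updates of the other processors' tables.

```python
from dataclasses import dataclass, field


@dataclass
class PageEntry:
    """One page table entry with the three fields named in Section 3.3."""
    access: str = "nil"                          # "nil", "read", or "write"
    copy_set: set = field(default_factory=set)   # processors holding read copies
    locked: bool = False                         # stand-in for the lock bit (assumption)


def invalidate(page_tables, requester, p):
    """Individual invalidation of page p: one request and one reply per copy.

    page_tables maps a processor number to its own table {page: PageEntry}.
    Returns the number of messages exchanged (2 per copy holder).
    """
    messages = 0
    for i in sorted(page_tables[requester][p].copy_set):
        # Invalidate server on processor i: PTable[p].access := nil
        page_tables[i][p].access = "nil"
        messages += 2                            # request + reply
    return messages


def message_counts(m, n):
    """Message counts quoted in the text for m copies on an n-processor system."""
    return {
        "individual": {"total": 2 * m},
        "broadcast": {"sent": m + 1, "received": n + m - 1, "parallel": n - 1},
        "multicast": {"sent": m + 1, "received": 2 * m, "parallel": m},
    }


# Usage: four processors, page 0; processor 0 records read copies on 1 and 2.
tables = {proc: {0: PageEntry()} for proc in range(4)}
tables[0][0].copy_set = {1, 2}
tables[1][0].access = "read"
tables[2][0].access = "read"
sent = invalidate(tables, requester=0, p=0)      # 2 copies, so 2m = 4 messages
```

The sketch leaves the copy set itself untouched after invalidation, since what happens to it belongs to the fault handlers rather than to the invalidate primitive as described above.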
