21-Instruction Issue Logic for High-Performance, Interruptable Pipelined Processors.pdf


INSTRUCTION ISSUE LOGIC FOR HIGH-PERFORMANCE, INTERRUPTABLE PIPELINED PROCESSORS

Gurindar S. Sohi and Sriram Vajapeyam
Computer Sciences Department
University of Wisconsin-Madison
1210 West Dayton Street
Madison, Wisconsin 53706

ABSTRACT

The performance of pipelined processors is severely limited by data dependencies. In order to achieve high performance, a mechanism to alleviate the effects of data dependencies must exist. If a pipelined CPU with multiple functional units is to be used in the presence of a virtual memory hierarchy, a mechanism must also exist for determining the state of the machine precisely. In this paper, we combine the issues of dependency resolution and preciseness of state. We present a design for instruction issue logic that resolves dependencies dynamically and, at the same time, guarantees a precise state of the machine, without a significant hardware overhead. Detailed simulation studies for the proposed mechanism, using the Lawrence Livermore loops as a benchmark, are presented.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
© 1987 ACM 0084-7495/87/0600-0027 $00.75

1. INTRODUCTION

As the demand for processing power increases, computer system designers are forced to use techniques that result in high-performance processing units. A widely used technique is pipelining [1], in which the overall logic of the system is split into several stages with each stage performing a sub-task of a complete task. Considerable overlap can be achieved because each stage can perform a sub-task for a different task.

Pipelined CPUs have two major impediments to their performance: (i) data dependencies and (ii) branch instructions. An instruction cannot begin execution until its operands are available. If an operand is the result of a previous instruction, the instruction must wait till the previous instruction has completed execution, thereby degrading performance. The performance degradation due to branch instructions is even more severe. Not only must a conditional branch instruction wait for its condition to be known (resulting in "bubbles" in the pipeline), an additional penalty is incurred in fetching an instruction from the taken branch path to the instruction decode and issue stage.

A major problem that arises in pipelined computer design is that an interrupt can be imprecise [2,3]. This problem is especially severe in multiple functional unit computers in which instructions can complete execution out of program order [2,4]. For a high-performance, pipelined CPU, an adequate solution must be found for the imprecise interrupt problem and means must be provided for overcoming the performance-degrading factors.

1.1. Background and Previous Work

The detrimental effects of branch instructions can be alleviated by using delayed branch instructions. However, the utility of delayed branch instructions is limited for long pipelines. In such cases, other means must exist to alleviate the detrimental effects. A common approach is to use branch prediction [5,6]. Using prediction techniques, the probable execution path of a branch instruction is determined. Instructions from the predicted path can then be fetched into instruction buffers or even executed in a conditional mode [2,7,8]. While the conditional mode of execution will result in a higher pipeline throughput, especially if the outcome of the branches is predicted correctly, a hardware mechanism must exist which will allow the machine to recover from an incorrect sequence of conditional instructions.

Both hardware and software solutions exist to the data dependency problem. Software solutions use code scheduling techniques (combined with a large set of registers) to increase the dependency distance and to provide interlocks [9]. Hardware solutions employ waiting stations or reservation stations where an instruction can wait for its operands and allow subsequent instructions to proceed [10].

In a pipelined machine, imprecise interrupts can be caused by instruction-generated traps such as arithmetic exceptions and page faults. An imprecise interrupt can leave the machine in an irrecoverable state. While the occurrence of arithmetic exceptions is rare, the occurrence of page faults in a machine that supports virtual memory is not. Therefore, if virtual memory is to be used with a pipelined CPU, it is crucial that interrupts be precise. Several hardware solutions to the problem are described in [3]. We are unaware of any software solutions to the imprecise interrupt problem for multiple functional unit computers. A software solution will be extremely difficult, if not impossible. Not only must the software allow for the worst-case execution time for any instruction, it must also keep track of instructions that have completed out of program order and generate the appropriate code sequence to undo the effects of those instructions. In either case, some hardware support must be provided to maintain run-time information.

1.2. Outline of the Paper

In this paper, we treat the problems of dependency resolution and imprecise interrupts simultaneously. Since a hardware mechanism must exist for implementing precise interrupts, why not extend this mechanism to resolve dependencies and allow out-of-order instruction execution? In section 2, we discuss Tomasulo's dependency resolution algorithm and extend it, giving several variations, so that the cost of implementing it is not prohibitive even for a large number of registers. In section 3, we discuss the problem of imprecise interrupts and present solutions. Section 4 describes a unit that resolves dependencies as well as implements precise interrupts. The precise interrupt and dependency resolution mechanisms mutually aid and simplify each other. A simulation analysis of the proposed mechanism using several Livermore loops as benchmarks is carried out in section 5. Finally, we discuss how our mechanism might be used to alleviate the degradation due to branch instructions.

Throughout the paper, we discuss incremental modifications to the basic principles. Data supporting our claims for such modifications have been omitted for reasons of conciseness. However, we do present detailed simulation data for our final design.

1.3. Model Architecture

The model architecture that we use for our studies is presented in Figure 1. It has the same capabilities and executes the same instruction set as the scalar unit of the CRAY-1 [4,11]. However, there is a major difference. In our architecture, all instructions, whether they are composed of 1 parcel (16 bits) or 2 parcels (32 bits), can issue in a single cycle if issue conditions are favorable. Therefore, the best-case execution time of a conditional branch instruction is 4 clock cycles after the condition is known as opposed to 5 clock cycles for the CRAY-1 [11]. The CRAY-1 was chosen because it represents a state-of-the-art scalar unit and its execution can be modeled precisely. The authors also had easy access to tools that could be used to generate instruction traces for the CRAY-1 scalar unit [12].

The model machine, therefore, consists of several functional units connected to a common result bus. Only one function can output data onto the result bus in any clock cycle. Instructions are fetched by the instruction fetch unit and decoded and issued by the decode and issue unit. Once dependencies have been resolved in the decode and issue unit, instructions are forwarded to the functional units for execution. The results of the functional units are written directly into the register file. The register file consists of 8 A, 8 S, 64 B and 64 T registers.

[Figure 1. The Basic Architecture. Recoverable labels: From Memory, Instruction Fetch Unit, Register File, Functional Units, Result Bus.]

2. DEPENDENCY RESOLUTION - OUT-OF-ORDER INSTRUCTION EXECUTION

When an instruction reaches the decode and issue stage in the pipeline, checks must be made to determine if the operands for the instruction are available, i.e., if all dependencies for this instruction have been resolved. If an operand is not available, the instruction must wait. Consequently, subsequent instructions cannot proceed even though they may be ready to execute. Subsequent instructions can proceed if the waiting instruction "steps aside", and allows other instructions to bypass it while it waits for its operands. Reservation stations permit an instruction to do this [10].

2.1. Tomasulo's Algorithm

Tomasulo's dependency resolution algorithm was first presented for the floating point unit of the IBM 360/91 [10]. An extension of this algorithm for the scalar unit of the CRAY-1 is presented in [13]. The algorithm operates as follows. An instruction whose operands are not available when it enters the decode and issue stage is forwarded to a Reservation Station (RS) associated with the functional unit that it will be using. It waits in the RS until its data dependencies have been resolved, i.e., its operands are available. Once at a reservation station, an instruction can resolve its dependencies by monitoring the Common Data Bus (the Result Bus in our model architecture). When all the operands for an instruction are available, it is dispatched to the appropriate functional unit for execution. The result bus can be reserved either when the instruction is dispatched to the functional unit [13] or soon before it is about to leave the functional unit [10].

Each source register is assigned a bit that determines if the register is busy. A register is busy if it is the destination of an instruction that is still in execution. A destination register is also called a sink register [10]. Each sink register is assigned a tag which identifies the result that must be written into the register. Since any register in the register file can be a sink, each register must be assigned a tag. Each reservation station has the following fields:

Source Operand 1 | Source Operand 2 | Destination

If a source register is busy when the instruction reaches the issue stage, the tag for the source register is obtained and the instruction is forwarded to a reservation station. If the sink register is busy, the instruction fetches a new tag, updates the tag of the sink register and proceeds to a reservation station. The registers as well as the reservation stations monitor the result bus and update their contents when a matching tag is found. Memory is treated as a special functional unit. Details of the algorithm can be found in [10] and [13].

While this algorithm is straightforward and effective, it is expensive to implement because each register needs to be tagged and each tag needs associative comparison hardware to carry out the tag-matching process. This may not be practical if the number of possible sink fields, i.e., the number of registers, is large. For our model architecture, which has 8 A, 8 S, 64 B and 64 T registers, clearly the use of 144 tag-matching hardware units is impractical.

2.2. Extensions to Tomasulo's Algorithm

2.2.1. A Separate Tag Unit

On closer inspection we see that very few of all possible sink registers may actually be active, i.e., be waiting for a result, at any given time. Therefore, if we associate a tag with each possible sink register, a lot of associative tag-matching hardware will be idle at any given time. Why not have a common tag pool and assign a tag only to a currently active sink register rather than associating a tag with each possible sink field? In Tomasulo's algorithm, a currently active register is one whose busy bit is on. We consolidate the tags from all currently active registers into a Tag Unit (TU). Each register now has only a single busy bit. At instruction issue time, if a source register is busy, the TU is que
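
The tag-matching behaviour of Tomasulo's algorithm summarized in Section 2.1 can be illustrated with a small software model. The following is only a sketch under stated assumptions, not the hardware design of the paper: the class and method names (Register, Operand, ReservationStation, Machine, issue, broadcast, step) are hypothetical, the ready/value/tag layout of each source operand is an assumed encoding, every instruction is routed through a reservation station for simplicity, and at most one result is placed on the result bus per step.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List, Optional

# Hypothetical operation set, for illustration only.
OPS = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}


@dataclass
class Register:
    value: int = 0
    busy: bool = False          # on while an in-flight instruction will write here
    tag: Optional[int] = None   # identifies the result this sink register awaits


@dataclass
class Operand:
    ready: bool
    value: Optional[int] = None
    tag: Optional[int] = None   # tag to watch for on the result bus if not ready


@dataclass
class ReservationStation:
    op: str
    src1: Operand               # "Source Operand 1" field
    src2: Operand               # "Source Operand 2" field
    dest_tag: int               # "Destination" field

    def ready(self) -> bool:
        return self.src1.ready and self.src2.ready


class Machine:
    def __init__(self, names: List[str]) -> None:
        self.regs: Dict[str, Register] = {n: Register() for n in names}
        self.stations: List[ReservationStation] = []
        self._tags = count(1)   # source of fresh tags (assumed unbounded)

    def _read(self, name: str) -> Operand:
        reg = self.regs[name]
        if reg.busy:            # busy source: carry its tag instead of a value
            return Operand(ready=False, tag=reg.tag)
        return Operand(ready=True, value=reg.value)

    def issue(self, op: str, dest: str, a: str, b: str) -> ReservationStation:
        # Read the sources before retagging the sink, so "X = X op Y" works.
        src1, src2 = self._read(a), self._read(b)
        tag = next(self._tags)
        sink = self.regs[dest]
        sink.busy, sink.tag = True, tag   # a newer writer replaces an older tag
        rs = ReservationStation(op, src1, src2, tag)
        self.stations.append(rs)
        return rs

    def broadcast(self, tag: int, value: int) -> None:
        # Registers and reservation stations both monitor the result bus.
        for reg in self.regs.values():
            if reg.busy and reg.tag == tag:
                reg.value, reg.busy, reg.tag = value, False, None
        for rs in self.stations:
            for src in (rs.src1, rs.src2):
                if not src.ready and src.tag == tag:
                    src.value, src.ready, src.tag = value, True, None

    def step(self) -> bool:
        # Dispatch at most one ready instruction and broadcast its result.
        for rs in self.stations:
            if rs.ready():
                self.stations.remove(rs)
                self.broadcast(rs.dest_tag, OPS[rs.op](rs.src1.value, rs.src2.value))
                return True
        return False
```

In this model, issuing a multiply into S3 followed by an add that reads S3 leaves the add waiting in its station with the multiply's tag; the first call to step() broadcasts the product, which both clears S3's busy bit and fills the add's pending operand. Note that every register carries its own tag field here, which is exactly the per-register cost that the separate Tag Unit of Section 2.2.1 is meant to avoid.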
