12-The Case for the Reduced Instruction Set Computer.pdf

The Case for the Reduced Instruction Set Computer

David A. Patterson
Computer Science Division
University of California
Berkeley, California 94720

David R. Ditzel
Bell Laboratories
Computing Science Research Center
Murray Hill, New Jersey 07974

INTRODUCTION

One of the primary goals of computer architects is to design computers that are more cost-effective than their predecessors. Cost-effectiveness includes the cost of hardware to manufacture the machine, the cost of programming, and costs incurred related to the architecture in debugging both the initial hardware and subsequent programs. If we review the history of computer families we find that the most common architectural change is the trend toward ever more complex machines. Presumably this additional complexity has a positive tradeoff with regard to the cost-effectiveness of newer models. In this paper we propose that this trend is not always cost-effective, and in fact, may even do more harm than good. We shall examine the case for a Reduced Instruction Set Computer (RISC) being as cost-effective as a Complex Instruction Set Computer (CISC). This paper will argue that the next generation of VLSI computers may be more effectively implemented as RISCs than CISCs.

As examples of this increase in complexity, consider the transitions from IBM System/3 to the System/38 [Utley78] and from the DEC PDP-11 to the VAX-11. The complexity is indicated quantitatively by the size of the control store; for DEC the size has grown from 256x56 in the PDP 11/40 to 5120x96 in the VAX 11/780.

REASONS FOR INCREASED COMPLEXITY

Why have computers become more complex? We can think of several reasons:

Speed of Memory vs. Speed of CPU. John Cocke says that the complexity began with the transition from the 701 to the 709 [Cocke80]. The 701 CPU was about ten times as fast as the core main memory; this made any primitives that were implemented as subroutines much slower than primitives that were instructions. Thus the floating point subroutines became part of the 709 architecture with dramatic gains. Making the 709 more complex resulted in an advance that made it more cost-effective than the 701. Since then, many "higher-level" instructions have been added to machines in an attempt to improve performance. Note that this trend began because of the imbalance in speeds; it is not clear that architects have asked themselves whether this imbalance still holds for their designs.

Microcode and LSI Technology. Microprogrammed control allows the implementation of complex architectures more cost-effectively than hardwired control [Husson70]. Advances in integrated circuit memories made in the late 60's and early 70's have caused microprogrammed control to be the more cost-effective approach in almost every case. Once the decision is made to use microprogrammed control, the cost to expand an instruction set is very small; only a few more words of control store. Since the sizes of control memories are often powers of 2, sometimes the instruction set can be made more complex at no extra hardware cost by expanding the microprogram to completely fill the control memory. Thus the advances in implementation technology resulted in cost-effective implementation of architectures that essentially moved traditional subroutines into the architecture. Examples of such instructions are string editing, integer-to-floating conversion, and mathematical operations such as polynomial evaluation.

Code Density. With early computers, memory was very expensive. It was therefore cost-effective to have very compact programs. Complex instruction sets are often heralded for their "supposed" code compaction. Attempting to obtain code density by increasing the complexity of the instruction set is often a double-edged sword however, as more instructions and addressing modes require more bits to represent them. Evidence suggests that code compaction can be as easily achieved merely by cleaning up the original instruction set. While code compaction is important, the cost of 10% more memory is often far cheaper than the cost of squeezing 10% out of the CPU by architectural "innovations". Cost for a large scale CPU is in additional circuit packages needed, while cost for a single chip CPU is more likely to be in slowing down performance due to larger (hence slower) control PLAs.

Marketing Strategy. Unfortunately, the primary goal of a computer company is not to design the most cost-effective computer; the primary goal of a computer company is to make the most money by selling computers. In order to sell computers manufacturers must convince customers that their design is superior to their competitor's. Complex instruction sets are certainly primary "marketing" evidence of a better computer. In order to keep their jobs, architects must keep selling new and better designs to their internal management. The number of instructions and their "power" is often used to promote an architecture, regardless of the actual use or cost-effectiveness of the complex instruction set. In some sense the manufacturers and designers cannot be blamed for this as long as buyers of computers do not question the issue of complexity vs. cost-effectiveness. For the case of silicon houses, a fancy microprocessor is often used as a draw card, as the real profit comes from luring customers into buying large amounts of memory to go with their relatively inexpensive CPU.

Upward Compatibility. Coincident with marketing strategy is the perceived need for upward compatibility. Upward compatibility means that the primary way to improve a design is to add new, and usually more complex, features. Seldom are instructions or addressing modes removed from an architecture, resulting in a gradual increase in both the number and complexity of instructions over a series of computers. New architectures tend to have a habit of including all instructions found in the machines of successful competitors, perhaps because architects and customers have no real grasp over what defines a "good" instruction set.

Support for High Level Languages. As the use of high level languages becomes increasingly popular, manufacturers have become eager to provide more powerful instructions to support them. Unfortunately there is little evidence to suggest that any of the more complicated instruction sets have actually provided such support. On the contrary, we shall argue that in many cases the complex instruction sets are more detrimental than useful. The effort to support high-level languages is laudable, but we feel that often the focus has been on the wrong issues.

Use of Multiprogramming. The rise of timesharing required that computers be able to respond to interrupts with the ability to halt an executing process and restart it at a later time. Memory management and paging additionally required that instructions could be halted before completion and later restarted. Though neither of these had a large effect on the design of instruction sets themselves, they had a direct effect on the implementation. Complex instructions and addressing modes increase the state that has to be saved on any interrupt. Saving this state often involves the use of shadow registers and a large increase in the complexity of the microcode. This complexity largely disappears on a machine without complicated instructions or addressing modes with side effects.

HOW HAVE CISCS BEEN USED?

One of the interesting results of rising software costs is the increasing reliance on high-level languages. One consequence is that the compiler writer is replacing the assembly-language programmer in deciding which instructions the machine will execute. Compilers are often unable to utilize complex instructions, nor do they use the insidious tricks in which assembly language programmers delight. Compilers and assembly language programmers also rightfully ignore parts of the instruction set which are not useful under the given time-space tradeoffs. The result is that often only a fairly small part of the architecture is being used.

For example, measurements of a particular IBM 360 compiler found that 10 instructions accounted for 80% of all instructions executed, 16 for 90%, 21 for 95%, and 30 for 99% [Alexander75]. Another study of various compilers and assembly language programs concluded that "little flexibility would be lost if the set of instructions on the CDC-3600 were reduced to ½ or ¼ of the instructions now available." [Foster71] Shustek points out for the IBM 370 that "as has been observed many times, very few opcodes account for most of a program's execution. The COBOL program, for example, executes 84 of the available 183 instructions, but 48 represent 99.08% of all instructions executed, and 26 represent 90.28%." [Shustek78] Similar statistics are found when examining the use of addressing modes.

CONSEQUENCES OF CISC IMPLEMENTATIONS

Rapid changes in technology and the difficulties in implementing CISCs have resulted in several interesting effects.

Faster memory. The advances in semiconductor memory have made several changes to the assumptions about the relative difference in speed between the CPU and main memory. Semiconductor memories are both fast and relatively inexpensive. The recent use of cache memories in many systems further reduces the difference between CPU and memory speeds.

Irrational Implementations. Perhaps the most unusual aspect of the implementation of a complex architecture is that it is difficult to have "rational" implementations. By this we mean that special purpose instructions are not always faster than a sequence of simple instructions. One example was discovered by Peuto and Shustek for the IBM 370 [Peuto,Shustek77]; they found that a sequence of load instructions is faster than a load multiple instruction for fewer than 4 registers. This case covers 40% of the load multiple instructions in typical programs. Another comes from the VAX-11/780. The INDEX instruction is used to calculate the address of an array element while at the same time checking to see that the index fits in the array bounds. This is clearly an important function to accurately detect errors in high-level language statements. We found that for the VAX 11/780, by replacing this single "high level" instruction with several simple instructions (COMPARE, JUMP LESS UNSIGNED, ADD, MULTIPLY), we could perform the same function 45% faster. Furthermore, if the compiler took advantage of the case where the lower bound was zero, the simple instruction sequence was 60% faster. Clearly smaller code does not always imply faster code, nor do "higher-level" instructions imply faster code.

Lengthened Design Time. One of the costs that is sometimes ignored is the time to develop a new architecture. Even though the replication costs of a CISC may be low, the design time is greatly expanded. It took DEC only 6 months to design and begin delivery of the PDP-1, but it now takes at least three years to go through the same cycle for a machine like the VAX. [fn 1] This long design time can have a major effect on the quality of the resulting implementation; the machine is either announced with a three year old technology or the designers must try to forecast a good implementation technology and attempt to pioneer that technology while building the machine. It is clear that reduced design time would have very positive benefits on the resulting machine.

[fn 1] Some have offered other explanations. Everything takes longer now (software, mail, nuclear power plants), so why shouldn't computers? It was also mentioned that a young, hungry company would probably take less time than an established company. Although these observations may partially explain DEC's experiences, we believe that, regardless of the circumstances, the complexity of the architecture will affect the design cycle.

Increased Design Errors. One of the major problems of complex instruction sets is debugging the design; this usually means removing errors from the microprogram control. Although difficult to document, it is likely that these corrections were a major problem with the IBM 360 family, as almost every member of the family used read only control store. The 370 line uses alterable control store exclusively, due perhaps to decreased hardware costs, but more likely from the bad experience with errors on the 360. The control store is loaded from a floppy disk, allowing microcode to be maintained similarly to operating systems; bugs are repaired and new floppies with updated versions of the microcode are released to the field. The VAX 11/780 design team realized the potential for microcode errors. Their solution was to use a Field Programmable Logic Array and 1024 words of Writable Control Store (WCS) to patch microcode errors. Fortunately DEC is more open about their experiences so we know that more than 50 patches have been made. Few believe that the last error has been found. [fn 2]

[fn 2] Each patch means several microinstructions must be put into WCS, so the 50 patches require 252 microinstructions. Since there was a good chance of errors in the complex VAX instructions, some of these were implemented only in WCS so the patches and the existing instructions use a substantial portion of the 1024 words.

RISC AND VLSI

The design of single chip VLSI computers makes the above problems with CISCs even more critical than with their multi-chip SSI implementations. Several factors indicate a Reduced Instruction Set Computer as a reasonable design alternative.

Implementation Feasibility. A great deal depends on being able to fit an entire CPU design on a single chip. A complex architecture has less of a chance of being realized in a given technology than does a less complicated architecture. A good example of this is DEC's VAX series of computers. Though the high-end models may seem impressive, the complexity of the architecture makes its implementation on a single chip extremely difficult with current design rules, if not totally impossible. Improvement in VLSI technology will eventually make a single chip version feasible, but only after less complex but equally functional 32-bit architectures can be realized. RISC computers therefore benefit from being realizable at an earlier date.

Design Time. Design difficulty is a crucial factor in the success of a VLSI computer. If VLSI technology continues to at least double chip density roughly every two years, a design that takes only two years to design and debug can potentially use a much superior technology and hence be more effective than a design that takes four years to design and debug. Since the turnaround time for a new mask is generally measured in months, each batch of errors delays product delivery another quarter; common examples are the 1-2 year delays in the Z8000 and MC68000.

Speed. The ultimate test for cost-effectiveness is the speed at which an implementation executes a given algorithm. Better use of chip area and availability of newer technology through reduced debugging time contribute to the speed of the chip. A RISC potentially gains in speed merely from a simpler design. Taking out a single address mode or instruction may lead to a less complicated control structure. This in turn can lead to smaller control PLAs, smaller microcode memories, and fewer gates in the critical path of the machine; all of these can lead to a faster minor cycle time. If leaving out an instruction or address mode causes the machine to speed up the minor cycle by 10%, then the addition would have to speed up the machine by more than 10% to be cost-effective. So far, we have seen little hard evidence that complicated instruction sets are cost-effective in this manner. [fn 3]

[fn 3] In fact, there is evidence to the contrary. Harvey Cragon, chief architect of the TI ASC, said that this machine implemented a complex mechanism to improve performance of indexed reference inside of loops. Although they succeeded in making these operations run faster, he felt it made the ASC slower in other situations. The impact was to make the ASC slower than simpler computers designed by Cray [Cragon80].

Better use of chip area. If you have the area, why not implement the CISC? For a given chip area there are many tradeoffs for what can be realized. We feel that the area gained back by designing a RISC architecture rather than a CISC architecture can be used to make the RISC even more attractive than the CISC. For example, we feel that the entire system performance might improve more if silicon area were instead used for on-chip caches [Patterson,Séquin80], larger and faster transistors, or even pipelining. As VLSI technology improves, the RISC architecture can always stay one step ahead of the comparable CISC. When the CISC becomes realizable on a single chip, the RISC will have the silicon area to use pipelining techniques; when the CISC gets pipelining the RISC will have on-chip caches, etc. The CISC also suffers by the fact that its intrinsic complexity often makes advanced techniques even harder to implement.

SUPPORTING A HIGH-LEVEL LANGUAGE COMPUTER SYSTEM

Some would argue that simplifying an architecture is a backwards step in the support of high-level languages. A recent paper [Ditzel,Patterson80] points out that a "high level" architecture is not necessarily the most important aspect in achieving a High-Level Language Computer System. A High-Level Language Computer System has been defined as having the following characteristics:

(1) Uses high-level languages for all programming, debugging and other user/system interactions.
(2) Discovers and reports syntax and execution time errors in terms of the high-level language source program.
(3) Does not have any outward appearance of transformations from the user programming language to any internal languages.

Thus the only important characteristic is that a combination of hardware and software assures that the programmer is always interacting with the computer in terms of a high-level language. At no time need the programmer be aware of any lower levels in the writing or the debugging of a program. As long as this requirement is met, then the goal is achieved. Thus it makes no difference in a High-Level Language Computer System whether it is implemented with a CISC that maps one-to-one with the tokens of the language, or if the same function is provided with a very fast but simple machine.

The experience we have from compilers suggests that the burden on compiler writers is eased when the instruction set is simple and uniform. Complex instructions that supposedly support high-level functions are often impossible to generate from compilers. [fn 4] Complex instructions are increasingly prone to implementing the "wrong" function as the level of the instruction increases. This is because the function becomes so specialized that it becomes useless for other operations. [fn 5] Complex instructions can generally be replaced with a small number of lower level instructions, often with little or no loss in performance. [fn 6] The time to generate a compiler for a CISC is additionally increased because bugs are more likely to occur in generating code for complex instructions. [fn 7]

[fn 4] Evidence for and against comes from DEC. The complex MARK instruction was added to the PDP-11 in order to improve the performance of subroutine calls; because this instruction did not do exactly what the programmers wanted, it is almost never used. The damaging evidence comes from the VAX; it is rumored that the VMS FORTRAN compiler apparently produces a very large fraction (.8) of the potential VAX instructions.

[fn 5] We would not be surprised if FORTRAN and BLISS were used as models for several occurrences of this type of instruction on the VAX. Consider the branch if lower bit set and branch if lower bit clear instructions, which precisely implement conditional branching for BLISS, but are useless for the more common branch if zero and if not zero found in many other languages; this common occurrence requires two instructions. Similar instructions and addressing modes exist which appeal to FORTRAN.

[fn 6] Peuto and Shustek observed that the complex decimal and character instructions of the IBM and Amdahl computers generally resulted in relatively poor performance in the high-end models. They suggest that simpler instructions may lead to increased performance [Peuto,Shustek77]. They also measured the dynamic occurrence of pairs of instructions; significant results here would support the CISC philosophy. Their conclusion: "An examination of the frequent opcode pairs fails to uncover any pair which occurs frequently enough to suggest creating additional instructions to replace it."

[fn 7] In porting the C compiler to the VAX, over half of the bugs and about a third of the complexity resulted from the complicated INDEXED MODE.

There is a fair amount of evidence that more complicated instructions designed to make compilers easier to write often do not accomplish their goal. Several reasons account for this. First, because of the plethora of instructions there are many ways to accomplish a given elementary operation, a situation confusing for both the compiler and compiler writer. Second, many compiler writers assume that they are dealing with a rational implementation when in fact they are not. The result is that the "appropriate" instruction often turns out to be the wrong choice. For example, pushing a register on the stack with PUSHL R0 is slower than pushing it with the move instruction MOVL R0,-(SP) on the VAX 11/780. We can think of a dozen more examples offhand, for this and almost every other complicated machine. One has to take special care not to use an instruction "because it's there." These problems cannot be "fixed" by different models of the same architecture without totally destroying either program portability or the reputation of a good compiler writer, as a change in the relative instruction timings would require a new code generator to retain optimal code generation.

The desire to support high-level languages encompasses both the achievement of a HLLCS and reducing compiler complexity. We see few cases where a RISC is substantially worse off than a CISC, leading us to conclude that a properly designed RISC seems as reasonable an architecture for supporting high-level languages as a CISC.

WORK ON RISC ARCHITECTURES

At Berkeley. Investigation of a RISC architecture has gone on for several months now under the supervision of D.A. Patterson and C.H. Séquin. By a judicious choice of the proper instruction set and the design of a corresponding architecture, we feel that it should be possible to have a very simple instruction set that can be very fast. This may lead to a substantial net gain in overall program execution speed. This is the concept of the Reduced Instruction Set Computer. The implementations of RISCs will almost certainly be less costly than the implementations of CISCs. If we can show that simple architectures are just as effective to the high-level language programmer as CISCs such as VAX or the IBM S/38, we can claim to have made an effective design.

At Bell Labs. A project to design computers based upon measurements of the C programming language has been under investigation by a small number of individuals at Bell Laboratories Computing Science Research Center for a number of years. A prototype 16-bit machine was designed and constructed by A.G. Fraser. 32-bit architectures have been investigated by S.R. Bourne, D.R. Ditzel, and S.C. Johnson. Johnson used an iterative technique of proposing a machine, writing a compiler, measuring the results to propose a better machine, and then repeating the cycle over a dozen times. Though the initial intent was not specifically to come up with a simple design, the result was a RISC-like 32-bit architecture whose code density was as compact as the PDP-11 and VAX [Johnson79].

At IBM. Undoubtedly the best example of a RISC is the 801 minicomputer, developed by IBM Research in Yorktown Heights, N.Y. [Electronics76] [Datamation79]. This project is several years old and has had a large design team exploring the use of a RISC architecture in combination with very advanced compiler technology. Though many details are lacking, their early results seem quite extraordinary. They are able to benchmark programs in a subset of PL/I that runs about five times the performance of an IBM S/370 model 168. We are certainly looking forward to more detailed information.

CONCLUSION

There are undoubtedly many examples where particular "unique" instructions can greatly improve the speed of a program. Rarely have we seen examples where the same benefits apply to the system as a whole. For a wide variety of computing environments we feel that careful pruning of an instruction set leads to a cost-effective implementation. Computer architects ought to ask themselves the following questions when designing a new instruction set:

- If this instruction occurs infrequently, is it justifiable on the grounds that it is necessary and unsynthesizable, for example, a Supervisor Call instruction?
- If the instruction occurs infrequently and is synthesizable, can it be justified on the grounds that it is a heavily time-consuming operation, for example, floating point operations?
- If the instruction is synthesizable from a small number of more basic instructions, what is the overall impact on program size and speed if the instruction is left out?
- Is the instruction obtainable for free, for example, by utilizing unused control store or by using an operation already provided by the ALU?
- If it is obtainable for "free", what will be the cost in debugging, documentation, and the cost in future implementations?
- Is it likely that a compiler will be able to generate the instruction easily?

We have assumed that it is worthwhile to minimize the "complexity" (perhaps measured in design time and gates) and maximize "performance" (perhaps using average execution time expressed in gate delays as a technology-independent time unit) while meeting the definition of a High-Level Language Computer System. In particular, we feel that VLSI computers will benefit the most from the RISC concepts. Too often, the rapid advancements in VLSI technology have been used as a panacea to promote architectural complexity. We see each transistor as being precious for at least the next ten years. While the trend towards architectural complexity may be one path towards improved computers, this paper proposes another path, the Reduced Instruction Set Computer.

ACKNOWLEDGEMENTS

For their prompt and constructive comments on this paper, the authors wish to express thanks to A.V. Aho, D. Bhandarkar, R. Campbell, G. Corcoran, G. Chesson, R. Cmelik, A.G. Fraser, S.L. Graham, S.C. Johnson, P. Kessler, T. London, J. Ousterhout, D. Poplawski, M. Powell, J. Reiser, L. Rowe, B. Rowland, J. Swensen, A.S. Tanenbaum, C. Séquin, Y. Tamir, G. Taylor, and J. Wakerly. Those students at Berkeley who have participated in the RISC project are acknowledged for their excellent work. RISC research at Berkeley was sponsored in part by the Defense Advanced Research Projects Agency (DoD), ARPA Order No. 3803, and monitored by Naval Electronic System Command under Contract No. N00039-78-G-0013-0004.

REFERENCES

[Alexander75] W.C. Alexander and D.B. Wortman, "Static and Dynamic Characteristics of XPL Programs," Computer, Vol. 8, No. 11, pp. 41-46, November 1975.

[Cocke80] J. Cocke, private communication, February 1980.

[Cragon80] H.A. Cragon, in his talk presenting the paper "The Case Against High-Level Language Computers," at the International Workshop on High-Level Language Computer Architecture, May 1980.

[Datamation79] Datamation, "IBM Mini a Radical Departure," October 1979, pp. 53-55.

[Ditzel,Patterson80] D.R. Ditzel and D.A. Patterson, "Retrospective on High-Level Language Computer Architecture," Seventh Annual International Symposium on Computer Architecture, May 6-8, 1980, La Baule, France.

[Electronics76] Electronics Magazine, "Altering Computer Architecture is Way to Raise Throughput, Suggests IBM Researchers," December 23, 1976, pp. 30-31.

[Foster71] C.C. Foster, R.H. Gonter and E.M. Riseman, "Measures of Op-Code Utilization," IEEE Transactions on Computers, May 1971, pp. 582-584.

[Husson70] S.S. Husson, Microprogramming: Principles and Practices, Prentice-Hall, Englewood, N.J., pp. 109-112, 1970.

[Johnson79] S.C. Johnson, "A 32-bit Processor Design," Computer Science Technical Report 80, Bell Labs, Murray Hill, New Jersey, April 2, 1979.

[Patterson,Séquin80] D.A. Patterson and C.H. Séquin, "Design Considerations for Single-Chip Computers of the Future," IEEE Journal of Solid-State Circuits, IEEE Transactions on Computers, Joint Special Issue on Microprocessors and Microcomputers, Vol. C-29, no. 2, pp. 108-116, February 1980.

[Peuto,Shustek77] B.L. Peuto and L.J. Shustek, "An Instruction Timing Model of CPU Performance," Conference Proc., Fourth Annual Symposium on Computer Architecture, March 1977.

[Shustek78] L.J. Shustek, "Analysis and Performance of Computer Instruction Sets," Stanford Linear Accelerator Center Report 205, Stanford University, May 1978, pp. 56.

[Utley78] B.G. Utley et al, "IBM System/38 Technical Developments," IBM GS80-0237, 1978.

ERRATA

The footnote at the bottom of page 29 in "The Case for the Reduced Instruction Set Computer" (the footnote on Harvey Cragon and the TI ASC) contains an error. Harvey Cragon was speaking on reducing the semantic gap by making computers more complicated. The most appropriate slide is quoted directly below:

"... When doing vector operations the performance was approximately the same on these two machines [the CDC 7600 and the TI ASC]. Memory bandwidth was approximately equal. The buffering of the hardware DO LOOP accomplished the same memory bandwidth reduction as did the buffering of the normal instruction stream on the 7600. After the proper macros had been written for the 7600, and the calling procedure incorporated into the compiler, equal access to the vector capability was provided from FORTRAN. Due in large measure to the complexity introduced by the vector hardware, scalar performance on the ASC is less than the 7600. The last, and most telling argument, is that more hardware was needed on the ASC than was required on the 7600. The additional hardware for building the rather elaborate DO LOOP operations did not have the payoff that we had anticipated. The experience with the ASC, and other experiences, have led me to question all the glowing promises which are made if we will only close the semantic gap. The promised benefits do not materialize, generality is lost."

Clearly the relative speeds of the two depend upon the mix of scalar and vector operations in a given application. We also inadvertently changed Harvey's middle initial; he is Harvey G. Cragon.

Dave Patterson
Dave Ditzel
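
ILLUSTRATIVE SKETCH: OPCODE COVERAGE

The usage measurements cited under HOW HAVE CISCS BEEN USED (for instance, 10 instructions accounting for 80% of all instructions executed) can be reproduced for any instruction trace with a few lines of code. The following is a minimal sketch in Python, not the method used in the cited studies. The function name opcodes_needed and the sample trace are hypothetical, invented for illustration only; the counts are not data from [Alexander75], [Foster71] or [Shustek78].

```python
from collections import Counter


def opcodes_needed(trace, fraction):
    """Return the smallest number of distinct opcodes whose combined
    dynamic count covers at least `fraction` of the executed instructions."""
    counts = Counter(trace)
    total = sum(counts.values())
    if total == 0:
        return 0
    covered = 0
    # Walk the opcodes from most to least frequently executed.
    for needed, (_, count) in enumerate(counts.most_common(), start=1):
        covered += count
        if covered >= fraction * total:
            return needed
    return len(counts)


# Hypothetical dynamic trace (made-up counts): a few opcodes dominate.
sample_trace = (
    ["LOAD"] * 40 + ["STORE"] * 25 + ["BRANCH"] * 20
    + ["ADD"] * 10 + ["MUL"] * 4 + ["EDIT"] * 1
)
```

With this made-up trace, opcodes_needed(sample_trace, 0.8) returns 3 and opcodes_needed(sample_trace, 0.9) returns 4, out of 6 distinct opcodes. Applied to a real dynamic trace, the same calculation would show how much of an instruction set a given workload actually exercises.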
