The Case for the Reduced Instruction Set Computer

David A. Patterson
Computer Science Division
University of California
Berkeley, California 94720

David R. Ditzel
Bell Laboratories
Computing Science Research Center
Murray Hill, New Jersey 07974

INTRODUCTION

One of the primary goals of computer architects is to design computers that are more cost-effective than their predecessors. Cost-effectiveness includes the cost of hardware to manufacture the machine, the cost of programming, and costs incurred related to the architecture in debugging both the initial hardware and subsequent programs. If we review the history of computer families we find that the most common architectural change is the trend toward ever more complex machines. Presumably this additional complexity has a positive tradeoff with regard to the cost-effectiveness of newer models. In this paper we propose that this trend is not always cost-effective, and in fact, may even do more harm than good. We shall examine the case for a Reduced Instruction Set Computer (RISC) being as cost-effective as a Complex Instruction Set Computer (CISC). This paper will argue that the next generation of VLSI computers may be more effectively implemented as RISC's than CISC's.

As examples of this increase in complexity, consider the transitions from IBM System/3 to the System/38 [Utley78] and from the DEC PDP-11 to the VAX 11. The complexity is indicated quantitatively by the size of the control store; for DEC the size has grown from 256x56 in the PDP 11/40 to 5120x96 in the VAX 11/780.

REASONS FOR INCREASED COMPLEXITY

Why have computers become more complex? We can think of several reasons:

Speed of Memory vs. Speed of CPU. John Cocke says that the complexity began with the transition from the 701 to the 709 [Cocke80]. The 701 CPU was about ten times as fast as the core main memory; this made any primitives that were implemented as subroutines much slower than primitives that were instructions. Thus the floating point subroutines became part of the 709 architecture with dramatic gains. Making the 709 more complex resulted in an advance that made it more cost-effective than the 701. Since then, many "higher-level" instructions have been added to machines in an attempt to improve performance. Note that this trend began because of the imbalance in speeds; it is not clear that architects have asked themselves whether this imbalance still holds for their designs.

Microcode and LSI Technology. Microprogrammed control allows the implementation of complex architectures more cost-effectively than hardwired control [Husson70]. Advances in integrated circuit memories made in the late 60's and early 70's have caused microprogrammed control to be the more cost-effective approach in almost every case. Once the decision is made to use microprogrammed control, the cost to expand an instruction set is very small: only a few more words of control store. Since the sizes of control memories are often powers of 2, sometimes the instruction set can be made more complex at no extra hardware cost by expanding the microprogram to completely fill the control memory. Thus the advances in implementation technology resulted in cost-effective implementation of architectures that essentially moved traditional subroutines into the architecture. Examples of such instructions are string editing, integer-to-floating conversion, and mathematical operations such as polynomial evaluation.

Code Density. With early computers, memory was very expensive. It was therefore cost-effective to have very compact programs. Complex instruction sets are often heralded for their "supposed" code compaction. Attempting to obtain code density by increasing the complexity of the instruction set is often a double-edged sword however, as more instructions and addressing modes require more bits to represent them. Evidence suggests that code compaction can be as easily achieved merely by cleaning up the original instruction set. While code compaction is important, the cost of 10% more memory is often far cheaper than the cost of squeezing 10% out of the CPU by architectural "innovations." Cost for a large scale cpu is in additional circuit packages needed, while cost for a single chip cpu is more likely to be in slowing down performance due to larger (hence slower) control PLA's.

Marketing Strategy. Unfortunately, the primary goal of a computer company is not to design the most cost-effective computer; the primary goal of a computer company is to make the most money by selling computers. In order to sell computers manufacturers must convince customers that their design is superior to their competitor's. Complex instruction sets are certainly primary "marketing" evidence of a better computer. In order to keep their jobs, architects must keep selling new and better designs to their internal management. The number of instructions and their "power" is often used to promote an architecture, regardless of the actual use or cost-effectiveness of the complex instruction set. In some sense the manufacturers and designers cannot be blamed for this as long as buyers of computers do not question the issue of complexity vs. cost-effectiveness. For the case of silicon houses, a fancy microprocessor is often used as a draw card, as the real profit comes from luring customers into buying large amounts of memory to go with their relatively inexpensive cpu.

Upward Compatibility. Coincident with marketing strategy is the perceived need for upward compatibility. Upward compatibility means that the primary way to improve a design is to add new, and usually more complex, features. Seldom are instructions or addressing modes removed from an architecture, resulting in a gradual increase in both the number and complexity of instructions over a series of computers. New architectures tend to have a habit of including all instructions found in the machines of successful competitors, perhaps because architects and customers have no real grasp over what defines a "good" instruction set.

Support for High Level Languages. As the use of high level languages becomes increasingly popular, manufacturers have become eager to provide more powerful instructions to support them. Unfortunately there is little evidence to suggest that any of the more complicated instruction sets have actually provided such support. On the contrary, we shall argue that in many cases the complex instruction sets are more detrimental than useful. The effort to support high-level languages is laudable, but we feel that often the focus has been on the wrong issues.

Use of Multiprogramming. The rise of timesharing required that computers be able to respond to interrupts with the ability to halt an executing process and restart it at a later time. Memory management and paging additionally required that instructions could be halted before completion and later restarted. Though neither of these had a large effect on the design of instruction sets themselves, they had a direct effect on the implementation. Complex instructions and addressing modes increase the state that has to be saved on any interrupt. Saving this state often involves the use of shadow registers and a large increase in the complexity of the microcode. This complexity largely disappears on a machine without complicated instructions or addressing modes with side effects.

HOW HAVE CISC'S BEEN USED?

One of the interesting results of rising software costs is the increasing reliance on high-level languages. One consequence is that the compiler writer is replacing the assembly-language programmer in deciding which instructions the machine will execute. Compilers are often unable to utilize complex instructions, nor do they use the insidious tricks in which assembly language programmers delight. Compilers and assembly language programmers also rightfully ignore parts of the instruction set which are not useful under the given time-space tradeoffs. The result is that often only a fairly small part of the architecture is being used.

For example, measurements of a particular IBM 360 compiler found that 10 instructions accounted for 80% of all instructions executed, 16 for 90%, 21 for 95%, and 30 for 99% [Alexander75]. Another study of various compilers and assembly language programs concluded that "little flexibility would be lost if the set of instructions on the CDC-3600 were reduced to ½ or ¼ of the instructions now available." [Foster71] Shustek points out for the IBM 370 that "as has been observed many times, very few opcodes account for most of a program's execution. The COBOL program, for example, executes 84 of the available 183 instructions, but 48 represent 99.08% of all instructions executed, and 26 represent 90.28%." [Shustek78] Similar statistics are found when examining the use of addressing modes.

CONSEQUENCES OF CISC IMPLEMENTATIONS

Rapid changes in technology and the difficulties in implementing CISCs have resulted in several interesting effects.

Faster memory. The advances in semiconductor memory have made several changes to the assumptions about the relative difference in speed between the CPU and main memory. Semiconductor memories are both fast and relatively inexpensive. The recent use of cache memories in many systems further reduces the difference between CPU and memory speeds.

Irrational Implementations. Perhaps the most unusual aspect of the implementation of a complex architecture is that it is difficult to have "rational" implementations. By this we mean that special purpose instructions are not always faster than a sequence of simple instructions. One example was discovered by Peuto and Shustek for the IBM 370 [Peuto,Shustek77]; they found that a sequence of load instructions is faster than a load multiple instruction for fewer than 4 registers. This case covers 40% of the load multiple instructions in typical programs. Another comes from the VAX-11/780. The INDEX instruction is used to calculate the address of an array element while at the same time checking to see that the index fits in the array bounds. This is clearly an important function to accurately detect errors in high-level languages statements. We found that for the VAX 11/780, replacing this single "high level" instruction by several simple instructions (COMPARE, JUMP LESS UNSIGNED, ADD, MULTIPLY) we could perform the same function 45% faster! Furthermore, if the compiler took advantage of the case where the lower bound was zero, the simple instruction sequence was 60% faster. Clearly smaller code does not always imply faster code, nor do "higher-level" instructions imply faster code.

Lengthened Design Time. One of the costs that is sometimes ignored is the time to develop a new architecture. Even though the replication costs of a CISC may be low, the design time is greatly expanded. It took DEC only 6 months to design and begin delivery of the PDP-1, but it now takes at least three years to go through the same cycle for a machine like the VAX.[1] This long design time can have a major effect on the quality of the resulting implementation; the machine is either announced with a three year old technology or the designers must try to forecast a good implementation technology and attempt to pioneer that technology while building the machine. It is clear that reduced design time would have very positive benefits on the resulting machine.

[1] Some have offered other explanations. Everything takes longer now (software, mail, nuclear power plants), so why shouldn't computers? It was also mentioned that a young, hungry company would probably take less time than an established company. Although these observations may partially explain DEC's experiences, we believe that, regardless of the circumstances, the complexity of the architecture will affect the design cycle.

Increased Design Errors. One of the major problems of complex instruction sets is debugging the design; this usually means removing errors from the microprogram control. Although difficult to document, it is likely that these corrections were a major problem with the IBM 360 family, as almost every member of the family used read only control store. The 370 line uses alterable control store exclusively, due perhaps to decreased hardware costs, but more likely from the bad experience with errors on the 360. The control store is loaded from a floppy disk allowing microcode to be maintained similarly to operating systems; bugs are repaired and new floppies with updated versions of the microcode are released to the field. The VAX 11/780 design team realized the potential for microcode errors. Their solution was to use a Field Programmable Logic Array and 1024 words of Writable Control Store (WCS) to patch microcode errors. Fortunately DEC is more open about their experiences so we know that more than 50 patches have been made. Few believe that the last error has been found.[2]

[2] Each patch means several microinstructions must be put into WCS, so the 50 patches require 252 microinstructions. Since there was a good chance of errors in the complex VAX instructions, some of these were implemented only in WCS so the patches and the existing instructions use a substantial portion of the 1024 words.

RISC AND VLSI

The design of single chip VLSI computers makes the above problems with CISC's even more critical than with their multi-chip SSI implementations. Several factors indicate a Reduced Instruction Set Computer as a reasonable design alternative.

Implementation Feasibility. A great deal depends on being able to fit an entire CPU design on a single chip. A complex architecture has less of a chance of being realized in a given technology than does a less complicated architecture. A good example of this is DEC's VAX series of computers. Though the high end models may seem impressive, the complexity of the architecture makes its implementation on a single chip extremely difficult with current design rules, if not totally impossible. Improvement in VLSI technology will eventually make a single chip version feasible, but only after less complex but equally functional 32-bit architectures can be realized. RISC computers therefore benefit from being realizable at an earlier date.

Design Time. Design difficulty is a crucial factor in the success of VLSI computers. If VLSI technology continues to at least double chip density roughly every two years, a design that takes only two years to design and debug can potentially use a much superior technology and hence be more effective than a design that takes four years to design and debug. Since the turnaround time for a new mask is generally measured in months, each batch of errors delays product delivery another quarter; common examples are the 1-2 year delays in the Z8000 and MC68000.

Speed. The ultimate test for cost-effectiveness is the speed at which an implementation executes a given algorithm. Better use of chip area and availability of newer technology through reduced debugging time contribute to the speed of the chip. A RISC potentially gains in speed merely from a simpler design. Taking out a single address mode or instruction may lead to a less complicated control structure. This in turn can lead to smaller control PLA's, smaller microcode memories, and fewer gates in the critical path of the machine; all of these can lead to a faster minor cycle time. If leaving out an instruction or address mode causes the machine to speed up the minor cycle by 10%, then the addition would have to speed up the machine by more than 10% to be cost-effective. So far, we have seen little hard evidence that complicated instruction sets are cost-effective in this manner.[3]

[3] In fact, there is evidence to the contrary. Harvey Cragon, chief architect of the TI ASC, said that this machine implemented a complex mechanism to improve performance of indexed reference inside of loops. Although they succeeded in making these operations run faster, he felt it made the ASC slower in other situations. The impact was to make the ASC slower than simpler computers designed by Cray [Cragon80].

Better use of chip area. If you have the area, why not implement the CISC? For a given chip area there are many tradeoffs for what can be realized. We feel that the area gained back by designing a RISC architecture rather than a CISC architecture can be used to make the RISC even more attractive than the CISC. For example, we feel that the entire system performance might improve more if silicon area were instead used for on-chip caches [Patterson,Séquin80], larger and faster transistors, or even pipelining. As VLSI technology improves, the RISC architecture can always stay one step ahead of the comparable CISC. When the CISC becomes realizable on a single chip, the RISC will have the silicon area to use pipelining techniques; when the CISC gets pipelining the RISC will have on chip caches, etc. The CISC also suffers by the fact that its intrinsic complexity often makes advanced techniques even harder to implement.

SUPPORTING A HIGH-LEVEL LANGUAGE COMPUTER SYSTEM

Some would argue that simplifying an architecture is a backwards step in the support of high-level languages. A recent paper [Ditzel,Patterson80] points out that a "high level" architecture is not necessarily the most important aspect in achieving a High-Level Language Computer System. A High-Level Language Computer System has been defined as having the following characteristics:

(1) Uses high level languages for all programming, debugging and other user/system interactions.
(2) Discovers and reports syntax and execution time errors in terms of the high level language source program.
(3) Does not have any outward appearance of transformations from the user programming language to any internal languages.

Thus the only important characteristic is that a combination of hardware and software assures that the programmer is always interacting with the computer in terms of a high level language. At no time need the programmer be aware of any lower levels in the writing or the debugging of a program. As long as this requirement is met, then the goal is achieved. Thus it makes no difference in a High-Level Language Computer System whether it is implemented with a CISC that maps one-to-one with the tokens of the language, or if the same function is provided with a very fast but simple machine.

The experience we have from compilers suggests that the burden on compiler writers is eased when the instruction set is simple and uniform. Complex instructions that supposedly support high level functions are often impossible to generate from compilers.[4] Complex instructions are increasingly prone to implementing the "wrong" function as the level of the instruction increases. This is because the function becomes so specialized that it becomes useless for other operations.[5] Complex instructions can generally be replaced with a small number of lower level instructions, often with little or no loss in performance.[6] The time to generate a compiler for a CISC is additionally increased because bugs are more likely to occur in generating code for complex instructions.[7]

[4] Evidence for and against comes from DEC. The complex MARK instruction was added to the PDP-11 in order to improve the performance of subroutine calls; because this instruction did not do exactly what the programmers wanted it is almost never used. The damaging evidence comes from the VAX; it is rumored that the VMS FORTRAN compiler apparently produces a very large fraction (.8) of the potential VAX instructions.

[5] We would not be surprised if FORTRAN and BLISS were used as models for several occurrences of this type of instruction on the VAX. Consider the branch if lower bit set and branch if lower bit clear instructions, which precisely implement conditional branching for BLISS, but are useless for the more common branch if zero and if not zero found in many other languages; this common occurrence requires two instructions. Similar instructions and addressing modes exist which appeal to FORTRAN.

[6] Peuto and Shustek observed that the complex decimal and character instructions of the IBM and Amdahl computers generally resulted in relatively poor performance in the high end models. They suggest that simpler instructions may lead to increased performance [Peuto,Shustek77]. They also measured the dynamic occurrence of pairs of instructions; significant results here would support the CISC philosophy. Their conclusion: "An examination of the frequent opcode pairs fails to uncover any pair which occurs frequently enough to suggest creating additional instructions to replace it."

[7] In porting the C compiler to the VAX, over half of the bugs and about a third of the complexity resulted from the complicated INDEXED MODE.

There is a fair amount of evidence that more complicated instructions designed to make compilers easier to write often do not accomplish their goal. Several reasons account for this. First, because of the plethora of instructions there are many ways to accomplish a given elementary operation, a situation confusing for both the compiler and compiler writer. Second, many compiler writers assume that they are dealing with a rational implementation when in fact they are not. The result is that the "appropriate" instruction often turns out to be the wrong choice. For example, pushing a register on the stack with PUSHL R0 is slower than pushing it with the move instruction MOVL R0,-(SP) on the VAX 11/780. We can think of a dozen more examples offhand, for this and almost every other complicated machine. One has to take special care not to use an instruction "because it's there." These problems cannot be "fixed" by different models of the same architecture without totally destroying either program portability or the reputation of a good compiler writer, as a change in the relative instruction timings would require a new code generator to retain optimal code generation.

The desire to support high level languages encompasses both the achievement of a HLLCS and reducing compiler complexity. We see few cases where a RISC is substantially worse off than a CISC, leading us to conclude that a properly designed RISC seems as reasonable an architecture for supporting high level languages as a CISC.

WORK ON RISC ARCHITECTURES

At Berkeley. Investigation of a RISC architecture has gone on for several months now under the supervision of D. A. Patterson and C. H. Séquin. By a judicious choice of the proper instruction set and the design of a corresponding architecture, we feel that it should be possible to have a very simple instruction set that can be very fast. This may lead to a substantial net gain in overall program execution speed. This is the concept of the Reduced Instruction Set Computer. The implementations of RISC's will almost certainly be less costly than the implementations of CISC's. If we can show that simple architectures are just as effective to the high-level language programmer as CISC's such as VAX or the IBM S/38, we can claim to have made an effective design.

At Bell Labs. A project to design computers based upon measurements of the C programming language has been under investigation by a small number of individuals at Bell Laboratories Computing Science Research Center for a number of years. A prototype 16-bit machine was designed and constructed by A. G. Fraser. 32-bit architectures have been investigated by S. R. Bourne, D. R. Ditzel, and S. C. Johnson. Johnson used an iterative technique of proposing a machine, writing a compiler, measuring the results to propose a better machine, and then repeating the cycle over a dozen times. Though the initial intent was not specifically to come up with a simple design, the result was a RISC-like 32-bit architecture whose code density was as compact as the PDP-11 and VAX [Johnson79].

At IBM. Undoubtedly the best example RISC is the 801 minicomputer, developed by IBM Research in Yorktown Heights, N.Y. [Electronics76] [Datamation79]. This project is several years old and has had a large design team exploring the use of a RISC architecture in combination with very advanced compiler technology. Though many details are lacking, their early results seem quite extraordinary. They are able to benchmark programs in a subset of PL/I that runs about five times the performance of an IBM S/370 model 168. We are certainly looking forward to more detailed information.

CONCLUSION

There are undoubtedly many examples where particular "unique" instructions can greatly improve the speed of a program. Rarely have we seen examples where the same benefits apply to the system as a whole. For a wide variety of computing environments we feel that careful pruning of an instruction set leads to a cost-effective implementation. Computer architects ought to ask themselves the following questions when designing a new instruction set.

- If this instruction occurs infrequently, is it justifiable on the grounds that it is necessary and unsynthesizable, for example, a Supervisor Call instruction?
- If the instruction occurs infrequently and is synthesizable, can it be justified on the grounds that it is a heavily time consuming operation, for example, floating point operations?
- If the instruction is synthesizable from a small number of more basic instructions, what is the overall impact on program size and speed if the instruction is left out?
- Is the instruction obtainable for free, for example, by utilizing unused control store or by using an operation already provided by the ALU?
- If it is obtainable for "free", what will be the cost in debugging, documentation, and the cost in future implementations?
- Is it likely that a compiler will be able to generate the instruction easily?

We have assumed that it is worthwhile to minimize the "complexity" (perhaps measured in design time and gates) and maximize "performance" (perhaps using average execution time expressed in gate delays as a technology-independent time unit) while meeting the definition of a High-Level Language Computer System. In particular, we feel that VLSI computers will benefit the most from the RISC concepts. Too often, the rapid advancements in VLSI technology have been used as a panacea to promote architectural complexity. We see each transistor as being precious for at least the next ten years. While the trend towards architectural complexity may be one path towards improved computers, this paper proposes another path, the Reduced Instruction Set Computer.

ACKNOWLEDGEMENTS

For their prompt and constructive comments on this paper, the authors wish to express thanks to A. V. Aho, D. Bhandarkar, R. Campbell, G. Corcoran, G. Chesson, R. Cmelik, A. G. Fraser, S. L. Graham, S. C. Johnson, P. Kessler, T. London, J. Ousterhout, D. Poplawski, M. Powell, J. Reiser, L. Rowe, B. Rowland, J. Swensen, A. S. Tanenbaum, C. Séquin, Y. Tamir, G. Taylor, and J. Wakerly. Those students at Berkeley who have participated in the RISC project are acknowledged for their excellent work. RISC research at Berkeley was sponsored in part by the Defense Advanced Research Projects Agency (DoD), ARPA Order No. 3803, and monitored by Naval Electronic System Command under Contract No. N00039-78-G-0013-0004.

REFERENCES

[Alexander75] W. C. Alexander and D. B. Wortman, "Static and Dynamic characteristics of XPL Programs," Computer, pp. 41-46, November 1975, Vol. 8, No. 11.

[Cocke80] J. Cocke, private communication, February, 1980.

[Cragon80] H. A. Cragon, in his talk presenting the paper "The Case Against High-Level Language Computers," at the International Workshop on High-Level Language Computer Architecture, May 1980.

[Datamation79] Datamation, "IBM Mini a Radical Departure," October 1979, pp. 53-55.

[Ditzel,Patterson80] D. R. Ditzel and D. A. Patterson, "Retrospective on High-Level Language Computer Architecture," Seventh Annual International Symposium on Computer Architecture, May 6-8, 1980, La Baule, France.

[Electronics76] Electronics Magazine, "Altering Computer Architecture is Way to Raise Throughput, Suggests IBM Researchers," December 23, 1976, pp. 30-31.

[Foster71] C. C. Foster, R. H. Gonter and E. M. Riseman, "Measures of Op-Code Utilization," IEEE Transactions on Computers, May, 1971, pp. 582-584.

[Husson70] S. S. Husson, Microprogramming: Principles and Practices, Prentice-Hall, Englewood, N.J., pp. 109-112, 1970.

[Johnson79] S. C. Johnson, "A 32-bit Processor Design," Computer Science Technical Report #80, Bell Labs, Murray Hill, New Jersey, April 2, 1979.

[Patterson,Séquin80] D. A. Patterson and C. H. Séquin, "Design Considerations for Single-Chip Computers of the Future," IEEE Journal of Solid-State Circuits, IEEE Transactions on Computers, Joint Special Issue on Microprocessors and Microcomputers, Vol. C-29, no. 2, pp. 108-116, February 1980.

[Peuto,Shustek77] B. L. Peuto and L. J. Shustek, "An Instruction Timing Model of CPU Performance," Conference Proc., Fourth Annual Symposium on Computer Architecture, March 1977.

[Shustek78] L. J. Shustek, "Analysis and Performance of Computer Instruction Sets," Stanford Linear Accelerator Center Report 205, Stanford University, May, 1978, pp. 56.

[Utley78] B. G. Utley et al, "IBM System/38 Technical Developments," IBM GS80-0237, 1978.

ERRATA

The footnote at the bottom of page 29 (footnote 3) in "The Case for the Reduced Instruction Set Computer" contains an error. Harvey Cragon was speaking on reducing the semantic gap by making computers more complicated. The most appropriate slide is quoted directly below:

"... When doing vector operations the performance was approximately the same on these two machines [the CDC 7600 and the TI ASC]. Memory bandwidth was approximately equal. The buffering of the hardware DO LOOP accomplished the same memory bandwidth reduction as did the buffering of the normal instruction stream on the 7600. After the proper macros had been written for the 7600, and the calling procedure incorporated into the compiler, equal access to the vector capability was provided from FORTRAN. Due in large measure to the complexity introduced by the vector hardware, scalar performance on the ASC is less than the 7600. The last, and most telling argument, is that more hardware was needed on the ASC than was required on the 7600. The additional hardware for building the rather elaborate DO LOOP operations did not have the payoff that we had anticipated. The experience with the ASC, and other experiences, have led me to question all the glowing promises which are made if we will only close the semantic gap. The promised benefits do not materialize, generality is lost."

Clearly the relative speeds of the two depend upon the mix of scalar and vector operations in a given application. We also inadvertently changed Harvey's middle initial; he is Harvey G. Cragon.

Dave Patterson
Dave Ditzel
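
The break-even argument in the Speed paragraph under RISC AND VLSI can be made concrete with a short sketch. The model here is an assumption for illustration and is not taken from the paper: execution time is treated as the number of minor cycles executed times the minor cycle time. The function names and the workload size are hypothetical; only the 10% figure comes from the text.

```python
def execution_time(cycles: float, cycle_time: float) -> float:
    """Assumed model: time = minor cycles executed x minor cycle time."""
    return cycles * cycle_time


def break_even_cycle_ratio(cycle_speedup: float) -> float:
    """Fraction of the simpler machine's cycle count at which the machine
    that keeps the extra instruction merely ties.

    cycle_speedup is the fractional reduction in minor cycle time gained
    by leaving the instruction out (0.10 for the 10% example in the text).
    """
    if not 0.0 <= cycle_speedup < 1.0:
        raise ValueError("cycle_speedup must be in [0, 1)")
    return 1.0 - cycle_speedup


if __name__ == "__main__":
    simple_cycle_time = 0.9      # minor cycle is 10% shorter without the instruction
    complex_cycle_time = 1.0     # baseline minor cycle with the instruction kept
    simple_cycles = 1_000_000    # hypothetical workload on the simpler machine

    ratio = break_even_cycle_ratio(0.10)
    complex_cycles = simple_cycles * ratio  # cycle count at which both machines tie

    print(f"richer machine ties only if it runs in {ratio:.0%} of the cycles")
    print(execution_time(simple_cycles, simple_cycle_time),
          execution_time(complex_cycles, complex_cycle_time))
```

Under this model the added instruction pays for itself only if it removes more than 10% of the cycles the simpler machine would execute across the whole workload, which is the condition the paragraph states in words.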