会员注册 | 登录 | 微信快捷登录 支付宝快捷登录 QQ登录 微博登录 | 帮助中心 人人文库renrendoc.com美如初恋!
站内搜索 百度文库

热门搜索: 直缝焊接机 矿井提升机 循环球式转向器图纸 机器人手爪发展史 管道机器人dwg 动平衡试验台设计

43-Very High-Speed Computing Systems.pdf43-Very High-Speed Computing Systems.pdf -- 5 元

宽屏显示 收藏 分享

资源预览需要最新版本的Flash Player支持。
您尚未安装或版本过低,建议您

PROCEEDINGSOFTHEm,VOL.54,NO.12,DECEMBER,1966190123.H.A.Ernst,TCS,anexperimentalmultiprogrammingsystemfortheIBM7090,IBMCorp.,YorktownHeights,N.Y.,ResearchRept.RJ248,June1963,41pages.theIBM7090.IBMResearchRept.RJ248,June1963,41pages.24.M.Lehman.R.Eshed,andZ.Netter,SABRAC,atimesharinglowcostcomputer.Commun.ACM,vol.6,pp.427429,August1963.25.R.V.SmithandD.N.Senzig,Computerorganizationforarrayprocessing,IBMCorp.,YorktownHeights,N.Y.,ResearchRept.RC1330,December1964.26.A.S.Critchlow,Generalizedmultiprocessingandmultiprogrammingsystems,1963AFIPSProc.FJCC,pp.107126.27.M.E.Conway,Amultiprocessorsystemdesign,ibid.,pp.139146.28.R.R.SeeberandA.B.Lindquist.Associativelogicforhighlyparallelsystems,ihid.,pp.489493.29.R.M.Meade,604machinedescription,IBMinternalmemo.,December1963.38pages.30.M.Lehman,R.Eshed,andZ.Netter,SABRAanewgenerationserialcomputer,IEEETrans.onElectronicComputers,vol.EC12.pp.618628,December1963.31.M.W.Allen,T.Pearcey.J.P.Penny,G.A.Rose,andJ.G.Sanderson.CIRRUS,aneconomicalmultiprogramcomputerwithmicroprogramcontrol,ibid.,pp.663671.32.W.F.MillerandR.A.Aschenbrenner,TheGUSmulticomputersystem,ibid.,pp.671676.33.G.Estrin,B.Russenl,R.Turn,andJ.Bibb,Parallelprocessinginarestructurablecomputersystem,ibid.,pp.747755.34.G.GregoryandR.McReynolds,TheSolomoncomputer,ibid.,35.H.S.Bright,APhilcomultiprocessingsystem,1964Proc.FJCC,pp.97141.36.R.G.EwingandP.M.Davies,Anassociativeprocessor,1964AFIPSProc.FJCC,pp.147158.37.H.A.Kinslow,Thetimesharingmonitorsystem,ibid.,pp.443454.38.J.Nievergelt.Parallelmethodsforintegratingordinarydifferentialequations,Commun.ACM,vol.7,pp.731733,December1964.39.W.H.Desmonde,RealTimeDuraProcessingSystems.EnglewoodCliffs,N.J.PrenticeHall,1964.40.M.Lehman.Serialmodeoperationandhighspeedparallelprocessing,InformationProcessing,1965Proc.IFIP,pt.2.NewYorkSpartan,1966,pp.631633.41.R.V.SmithandD.N.Senzig,Computerorganizationforarraypp.774755.processing.1965Proc.FJCC,pp.117128.42.E.W.Dijkstra,Solutionofaprobleminconcurrentprogrammingcontrol.Commun.ACM,vol.8,p.569.September1965.43.J.B.Dennis,Segmentationandthedesignofmultiprogrammedcomputersystems,J.ACM,vol12,pp.589602,October1965.44.F.J.CorbatoandV.A.Vyssotsky,Introductionandoverviewofthemulticssystem,1965Proc.FJCC,pp.185196.45.E.L.Glaser.J.Couleur,andG.Oliver,Systemdesignofacomputerfortimesharingapplications,ibid..pp.197202.46.V.A.Vyssotsky,F.J.Corbato,andR.M.Graham,Structureofthemulticssupervisor,ibid..pp.203212.47.R.C.DaleyandP.G.Neumann,Ageneralpurposefilesystemforsecondarystorage,ibid.,pp.213229.48.J.F.Ossanna,L.E.Mikus,andS.D.Dursten,Communicationandinputoutputswitchinginamultiplecomputingsystem,ihid..49.J.W.Forgie,Atimeandmemorysharingexecutiveprogramforquickresponseonlineapplications,ibid.,pp.599610.50.J.D.McCullogh,K.H.Speiennan,andF.W.Zurcher,Designforamultipleusermultiprocessingsystem,ibid.,pp.611618.51.W.T.Comfort,Acomputingsystemdesignforuserdevice,ibid.,pp.619628.52.J.P.Anderson,Programstructuresforparallelprocessing,Commun.ACM,vol.8,pp.786788,December1965.53.B.W.Arden,B.A.Galler,T.C.D.OBrien,andF.H.Westervelt,Programandaddressingstructureinatimesharingenvironment,J.ACM,vol.13,pp.116.January1966.54.J.H.Katz,Simulationofamultiprocessorcomputersystem,SRDRept.LA009,February1966,tobepublishedin1966Proc.SJCC.55.G.S.ShedlerandM.M.Lehman,Parallelcomputationandthesolutionofpolynomialequations,IBM,YorktownHeights,N.Y.,ResearchRept.RC1550,February1966.56.H.Hellerman,Parallelprocessingofalgebraicexpressions,IEEETrans.onElectronicComputers,vol.EC15,pp.8291,February1966.57.J.B.DennisandE.C.VanHorn,Programmingsemanticsformultiprogrammedcomputation,Commun.ACM,vol.9,pp.143155,March1966.58.N.Wirth,Anoteonprogramstructuresforparallelprogramming,Commun.ACM,vol.9,pp.32G321,May1966.59.D.E.Knuth,Additionalcommentsonproblemsinconcurrentprogrammingcontrol,ihid.,pp.321322.pp.231241.VeryHighspeedComputingSystemsMICHAELJ.FLYNNMEMBER,IEEEAbstractVeryhighspeedcomputersmaybeclnssifiedasfollows1SingleJktmctionStrdingleDataStreamSISD2SileImbnctioaStreamMultipleDataStreamSIMD3MultiplehstmcthStrePntSingleDataStreamMSD4MnltipleInstroctiooStreamMultipleDataStreamD.Stream,asnsedhere,referstothesequenceofdataorirstructiomasseenbythemachinedaringtbeexecutionofaprogram.mecoastitaeatsofasystemstorage,exeation,andhtrudonbandhgbraochingaredkcussdwitbregardtorecentdevelopmen6and/orsystemsLimitatim.TheCOlsMnentsaredkcuwdmtermofcoocarrentSEDManuscriptreceivedJune30,1966revisedAugust16,1966.ThisworkwasperformedundertheauspicesoftheU.S.AtomicEnergyCommission.TheauthoriswithNorthwesternUniversity,Evanston,Ill.,andArgonneNationalLaboratory,Argonne,Ill.systemsCDC6600seriesand,ioparticnlar,IBMModd90series,sincemnltiplestreamorganizationsusuallydonotquireanymoreelaboratecompooents.Representativeorganizationsareselectedfromeachclassandthearrangementofthecomtitwntsisshown.INTRODUCTIONMANYSIGNIFICANTscientificproblemsrequiretheuseofprodigiousamountsofcomputingtime.Inordertohandletheseproblemsadequately,thelargescalescientificcomputerhasbeendeveloped.Thiscomputeraddressesitselftoaclassofproblemscharacterizedbyhavingahighratioofcomputingrequirementtoinput/outputrequirementsapartiallydefactosituation1902PROCEEDINGSOFTHEIEEEDECEMBERcausedbytheunavailabilityofmatchinginput/outputequipment.Thecomplexityoftheseprocessors,coupledwiththeadvancementofthestateofthecomputingarttheyrepresent,hasfocusedattentiononscientificcomputers.Insightthusgainedisfrequentlyapredictorofcomputerdevelopmentsonamoreuniversalbasis.Thispaperisanattempttoexplorelargescientificcomputingequipment,reviewingpossibleorganizationsstartingwiththeconcurrentorganizationswhicharepresentlyinoperationandthenexaminingtheothertheoreticalorganizationalpossibilities.ORGANIZATIONThecomputingprocess,initsessentialform,istheperformanceofasequenceofinstructionsonasetofdata.Eachinstructionperformsacombinatorialmanipulationalthough,foreconomy,.subsequencingisalsoinvolvedononeortwoelementsofthedataset.Iftheelementwereasinglebitandonlyonemchbitcouldbemanipulatedatanyunitoftime,wewouldhaveavariationoftheTuringmachinethestrictlyserialsequmtialmachine.Thenaturalextensionofthisistointroduceadatasetwhoseelementsmorecloselycorrespondtoanaturaldataquantumcharacter,integer,floatingpointnumber,etc..Sincethesizeofdatumhasincreased,sotoohasthenumberofcombinatorialmanipulationsthatcanbeperformedmanipulationsontwonbitargumentshave22possibleoutcomes.Ofcourse,attentionisrestrictedtothoseoperationswhichhavearithmeticorlogical.significance.Aprogramconsistsofanorderedsetofinstructions.Theprogramhasconsiderablyfewerwrittenorstoredinstructionsthanthenumberofmachineinstructionstobeperformed.Thedifferenceisintherecursionsorloopswhichareinherentintheprogram.Itishighlyadvantageousifthealgorithmbeingimplementedishighlyrecursive.Thebasicmechanismforsettinguptheloopsistheconditionalbranchinstructions.ForconvenienceweadopttwoworkingdefinitionsZnstructionStreamisthesequenceofinstructionsasperformedbythemachineDataStreamisthesequenceofdatacalledforbytheinstructionstreamincludinginputandpartialortemporaryresults.Thesetwoconceptsarequiteusefulincategorizingcomputerorganizationsinanattempttoavoidtheubiquitousandambiguoustermparallelism.OrganizationswillbecharacterizedbythemultiplicityofthehardwareprovidedtoservicetheInstruction.andDataStreams.Themutiplicityistakenasthemaximumpossiblenumberofsimultaneousoperationsinstructionsoroperandsdatabeinginthesamephaseofexecutionatthemostconstrainedcomponentof.theorganization.SeveralquestionsareimmediatelyevidentwhatisaninstructionwhatisanoperandhowistheconstrainingcomponentfoundTheseproblemscanbeansweredbetterbyestablishmentofareference.IftheIBM704.werecomparedtotheTuringmachine,the704wouldappearhighlyparallel.Ontheotherhand,ifadefinitionweremadeintermsofthenaturaldataunitcalledforbyaproblem,thesituationwouldbeequallyuntenable,sinceinmanyproblemsonewouldconsideralargematrixofdataaunit.ThuswearbitrarilyselectareferenceorganizationtheIBM70470927090.Thisorganizationisthenregardedastheprototypeoftheclassofmachineswhichwelabel1SingleInstructionStreamSingleDataStreamSISD.Threeadditionalorganizationalclassesareevident.2SingleInstructionStreamMultipleDataStream3MultipleInstructionStreamSingleDataStream4MultipleInstructionStreamMultipleDataStreamSIMDMISDMIMD.Beforecontinuing,wedefinetwoadditionalusefulnotions.Bandwidthisanexpressionoftimerateofoccurrence.Inparticular,computationalorexecutionbandwidthisthenumberofinstructionsprocessedpersecondandstoragebandwidthistheretrievalrateofoperandandoperationmemorywordswords/second.Latencyorlatentperiodisthetotaltimeassociatedwiththeprocessingfromexcitationtoresponseofaparticulardataunitataphaseinthecomputingprocess.Thusfar,categorizationhasdependedonthemultiplicityofsimultaneouseventsatthesystemscomponentwhichimposesthemostconstraints.Theratioofthenumberofsimultaneousinstructionsbeingprocessedtothisconstrainedmultiplicityiscalledtheconfluenceorconcurrenceofthesystem.ConfluenceisillustratedinFig.1foranSISDorganization.Itseffectistoincreasethecomputationalbandwidthinstructionsprocessed/secondbymaximizingtheutilityGENERATEAOORESSOFINSTRUCTIONFETCHINSTRUCTIONDECOOENSTRUCTIONGENERATEAOORESSOFOPERANO111FETCHOPERfDEXECUTEINSTRUCTIONINST.2INST.IL...4INSTRUCTIONISTARTSINSTRUCTION2STARTSINSTRUCTION3STARTSFig.1.Concurrencyandinstructionprocessing.1966FLYNNVERYHIGHSPEEDCOMPUTING1903oftheconstrainingcomponentorbottleneck.Theprocessingofthefirstinstructionproceedsinthephasesshown.Inordertoincreasethecomputationalspeed,webeginprocessinginstruction2assoonasinstruction1completesitsfirstphase.Clearly,itisdesirabletominimizethetimeineachphase,butthereisnoadvantageinminimizingbelowthetimerequiredbyaparticularphaseormechanism.Suppose,foragivenorganization,thattheinstructiondecoderisanabsoluteserialmechanismwithresolutiontime,At.Iftheaverageinstructionpreparationandexecutiontimeist,,thenthecomputationalbandwidthmaybeimprovedbyt,/At.Thusifonlyoneinstructioncanbehandledatatimeatthebottleneck,thenthemaximumachevableperformanceisl/Ar.1beingthemultiplicityoftheconstraint.Inordertoprocessagivennumberofinstructionsinaparticularunitoftime,acertainbandwidthofmemorymustbeachievedtoinsureanamplesupplyofoperandsandoperationsintheformofinstructions.Similarly,thedatamustbeoperatedonexecutedatarateconsistentwiththedesiredcomputationalrate.CONSTITUENTSOFTHESYSTEMWewilltreatstorage,execution,andinstructionhandlingbranchingasthemajorconstituentsofasystem.Sincemultiplestreamorganizationsusuallyrepresentmultipleattachmentsofoneormoreoftheaboveconstituents,welosenogeneralityindiscussingtheconstituentsofasystemastheyhaveevolvedinSISDorganizations.Indeed,confluentSISDorganizationsbytheirnaturemustallowarbitraryinteractionbetweenelementsofthedatastreamorinstructionstream.Multiplestreamorganizationmaylimitthisinteraction,thussimplifyingsomeoftheconsiderationsinanarea.Therefore,fornow,weshallbemainlyconcernedwithtechniquestoextendcomputationalperformanceinacontextofaconfluentSISDsystem.Wewillfurtherassumethattheserialmechanismwhichconstrainstheorganizationisthedecodingofinstructions.Thus,ontheaverage,theprocessingofoneinstructionperdecodecyclewillbeanupperlimitonperformance.TherelationshipbetweentheconstituentsisshowninFig.2fortheSISDorganization.I\/I71STORAGEjnEXECUTIONBANDWIDTHINSTRUCTIONSTREAYINSTRUCTIONHANDLINGUNITOPERAUOSTREAYn...0IFig.2.SISDOrganization.A.StorageTheinstructionanddatastreamsareassumedsequential.Thus,accessingtostoragewillalsobesequential.Iftherewereavailablestoragewhoseaccessmechanismcouldbeoperatedinonedecodecycleandthisaccessingcouldberepeatedeverycycle,thesystemwouldberelativelysimple.Theonlyinterferencewoulddevelopwhenoperandshadtobefetchedinapatternconflictingwiththeinstructions.Unfortunately,accessinginmoststoragemediaisconsiderablyslowerthandecoding.ThismakestheuseofinterleavingtechniquesnecessarytoachievetherequiredmemorybandwidthseeFig.3.Intheinterleavingscheme,nmemoriesareused.Wordsaredistributedineachofthememoriessequentiallymodulon.Inmemoryiisstoredwordi,ni,and2ni,etc.However,theaccessingmechanismdoesnotalterthelatencysituationforthesystem,andthsmustbeincludedinthedesign.Memorybandwidthmustbesufficienttohandletheaccessingofinstructionsandoperands,thestorageofresultsandinputoutputtraffic9.Theamountofinputoutputbandwidthisproblemdependent.Largememoryrequirementswillnecessitatetransferofblocksofdatatoandfrommemory.Theresultingtrafficwillbeinverselyproportionaltotheoverallsizeofthememory.However,anincreaseinthesizeofthememorytominimizethisinterferencealsoactstoincreasethebulkofthememory,andusuallytheinterconnectiondistance.Theresult,then,maybeanincreaseinthelatencycausedbythesecommunicationsproblems.Figure3wasgeneratedaftertheworkofFlores9andassumescompletelyrandomaddressrequests.ThewaitingtimeistheaverageadditionalamountoftimeovertheaccesstimerequiredtoretrieveanitemduetoconflictingACCESSREGENERATIONLATENCYtatrII.1.InUNITSBUSYn.atMEMORYCYCLE\ASSUMEDEMANDRATEOFnWORDSPERMEMORYCYCLE\WAITINGTIMEASAFRACTIONOF1.0MEMORYCYCLEIRANDOMADDRESSREQUESTSI.oI.52oNUMBEROFMEMORYUNITSDEMANDRATEFig.3.Interleavingmemoryunits.1904PROCEEDINGSOFTHEIEEEDECEMBERrequests.Sincememoryrequestsareusuallysequentialforinstructionsandatleastregularfordata,theresultisoverlypessimisticforallbutheavilybranchdependentproblems.Typicalprogramswillexperienceabouthalfthewaitingtimeshown.Thislatencyinthesystemgreatlyincreasesthecomplexityofthecontrolmechanismforthestoragesystem.Thestorageunitmustnowincludequeuingmechanismstoorganize,fetch,andstorerequestswhichareinprocessorcouldnotbehonoredduetoconflicts.Thesequeuingregisters,containingoutstandingservicerequests,mustbecontinuallycomparedagainstnewrequestsforservice.Extensivecontrolinterlocksmustbemadeavailabletoserveatleastthefollowingfunctions2,334.directthefetchedwordtotheappropriaterequestorpreventoutofsequencefetchstoreorstorefetchinthesamememorylocationeliminateduplicaterequestsofthesamememorylocationespeciallywherethememorylocationcontainsmorethanoneinstructionordataunitanalyzekeysorboundarieswherefetchorstorememoryprotectionareused.Inaddition,confluentcomputingsystemsfrequentlyemploybufferstominimizetrafficrequirementsonthememory2.Aninstructionbuffermightcontainthenextninstructionsinthesequenceaswellasahistoryofminstructionstogetherwithseverallevelsofalternatepathsofbranch.Thepresenceofahistoricalpictureofinstructionsintheinstructionbufferallowsfortheopportunitytostoresmallloops,thusavoidingthepenaltyofreaccessinginstructions.Anoperandbufferisusedbytheexecutionunitasitsintrinsicstoragemedium.Theinstructionunitwouldrestructureinstructionsthatcalledforoperandsfrommemoryintoinstructionswhichcallforthecontentsoftheseregisters.Theinstructionunitwould,afterdecodingtheinstruction,initiatetherequestsfortheappropriateoperandsanddirecttheirplacementintotheoperandbuffer.Thustheexecutionunitwouldactasanindependentcomputer,whosestoragewouldbelimitedtothecontentsofthisbuffer,andwhoseinstructionswouldbeinashortenedformat.B.ExecutionWecenterourdiscussiononthefloatingpointinstructionssincetheyarethemostwidelyusedinlargescientificcomputing,andrequirethemostsophisticatedexecution.Inordertoachievetheappropriatebandwidthlevels,wecanrepeatthedeploymentschemethatwasusedformemory,onlyheretheindependentunitswillbededicatedtoservicingoneclassofinstructions.Inadditiontodirectlyincreasingthebandwidthbyprovidinganumberofunits,thespecializationoftheunitaidsinexecutionefficiencyinseveralotherways.Inparticular,eachunitmightactasasmallinsularunitoflogic,henceminimizingintraunitwirecommunicationproblems.Secondly,thededicatedunithasfewerlogicaldecisionstomakethanauniversalunit.Thirdly,theunitmaybeimplementedinamoreefficientfashion.Onemayalsoconsiderpipeliningtechniques6toimprovethebandwidthwithinaparticularinsularunit,inadditiontooptimizinganalgorithmtominimizelatenttime.Pipeliningisaprocesswhereinnaturalpointsinthedecisionmakinghardwarearesoughttolatchupintermediateresultsandresuscitatetheuseoftheunit.Thelatchingmayincludeextradecisionelementsforstorage,butthesemayalsobeusedinaidingthecontroloftimeskewdifferencesinthevariousparallelpaths.IFloatingAddSubtractOperationsFloatingaddsubtractoperationsconsistofthreebasicparts.First,thejustificationofthefractionalpartsofthetwooperandsbytheamountofthedifferenceoftheexponents.Second,theaddingofthefractionalpartsorappropriatecomplement.Third,thepostnormalizationofthefractioniftheresulthasaleadingzeroinasignificantpositionoranoverflowoccurs.Becauseofthedecisionmakingshiftingproblemsassociatedwiththeexponenthandling,anormallatchpointforpipeliningthetwoaddoperationsinoneunitduplexingtheunitisattheinterfacefromthepreshiftintotheadderorafterthefirstleveloftheadderstructure.Thefloatingaddclassofinstructionhasbeenreported5tooperateinthe120nanosecondrangeforaduplexedunitwith56bitfractionSystems360format.2FloatingMultiply5,lo,12,13Theessentialdecisionprocessofthemultiplicationalgorithmistheadditionofthemultiplicandtoitselfbyasmanytimesasareindicatedbythemultiplier.Iftherearenbitsinamultiplierandmultiplicand,thenthemultiplicandmustbeaddedtoitselfntimeswithaoneplaceshiftbeforeeachaddition.Standardtechniquesexistforreducingthenumberofadditionse.g.,multiplierbits1111.maybetreatedas1oooO.1.requiredbyencodingthemultiplierintoalessernumberofsignedbits,say42.Oncethisisdone,Wallace12suggeststhedirectadditionofthe42,nbit,shiftedmultiplicands.Thebasicdecisionelementinaddingofeachbitofthen/2operandsisthesocalledbinaryfulladder,whichtakesthreeinputsofequalsignificanceandproducestwooutputs,oneofthesamesignificance,thesum,andoneofhighersignificance,thecarry.Bytheassociativelaw,thecarryisinjectedintoanyavailablelowerleveladdstructureoftheappropriatesignificance.Theneteffectissometimesreferredtoasaflushadder.Thefinalresult,ofcourse,isthereductionofthen/2multiplesintotworesultsegmentsthesumsandthecarrieswhicharethenassimilatedbyaconventionalcarrypropagateadderwithananticipationmechanism.Thebasicproblemistheimplementationofthisalgorithm.Forlargen,theplanesegmentsthatpartitionthealgorithmareinconsistentwithconventionalpackagingsizesandinterconnectionlimitationsseebelow.ConsiderasimplervariationFig.4bSIhereonlyn/mmultiplesofthemultiplicandareretiredperiterationmiterationsarerequired.Then/mmultiplesaredecodedinton/2madditionsofthemultiplier.Thesemultiplesare1966FLYNNVERYHIGHSPEEDCOMPUTING1905theninsertedintothefirstleveloftheaddertreeFig.4b.Nowtheaddassimilationofbothcarriesandsumsproceedsasbeforewiththeintroductionofintermediatestagingpointswheretheresultsaretemporarilylatched.Thestoragepointsactasskewrelativetimingcontrolandimprovetheoverallexecutionbandwidthefficiencyofthealgorithm.Afterthefirstsetofmultipleshasbeeninsertedintothefirstlevelofthetreeandassimilated,itislatchedatthefirstlatchpoint.Thestoragepointservestodecouplethefirstlevelfromsubsequentlevelsofthehardwarethereforethefirstsetofmultiplesmaynowproceedtobeabsorbedinthesecondleveland,simultaneously,asecondsetofmultiplesmaybeinsertedintothefirstlevelofthetree.Theprocesscontinuesand,atthebottomofthetree,thecarriesandsumsareassimilatedinanadderwithcarrypropagation.Noticethatthelatencyinthisvariationmaybetwicethatoftheoriginalalgorithm.Alsonoticethatilii1,,/J/1/1REGULARCUTTINGPLANEREGISTERSCONTAININGMULTIPLESASSUME61TREEOFLATCHINGFULLADDERSCARRYPRWAGATEINCLUDESCARRYADDERANTICIPATIONUNITINTERCONNECTIONSARESHOWNONCUTTINGPLANE0000006ithbitMULTIPLEREGISTERSPARTIALSUMACCUMUL4TlONithREWRcumNPLANEbFig.4.Outlineofamultipliertree.LiliilithPL4NEPROFILEpotentialbandwidthinthisalgorithmhasnotsufferedsinceasecondmultiplicationmayproceedindependentlyassoonasthelastsetofmultiplesisinsertedandhaspassedthefirstlevelofthetree.Implementationsofthissecondschemehaveyieldedaperformanceof180nanosecondsforafloatingpointmultiplyincludingpostshiftingandexponentupdating56bitfraction,Systems360format.Figure4,partsaandbarethreedimensionalrepresentationsofthesecondvariationofthemultiplyalgorithm.Figure4aisanisometricprojectionandFig.4bisacrosssectionorregularcuttingplaneandprofileview.Thisrepresentationwaschosentoillustratesomeofthedifficultiesofimplementinghighspeedsystemsingeneral.Itiswellknown1thatpropagationdelayandnonuniformtransmissionlineloadingaremajorfactorsintheswitchmgspeedofalogicstage.Onewould,therefore,desireaphysicalpackageconsistentwiththenaturalcommunicationpatternofthealgorithmsothattheimplementationcouldbeoptimized.Thedifficultiesofaplanarpackageareobvious.Communicationsbetweenregularcuttingplanesprofileview,Fig.4bmustbemadeaxiallyinthephysicalimplementation.Indeed,thereisnoassurancethatthecapacityofthephysicalpackageplanewillmatchtherequirementsoftheregularcuttingplaneseveralpackageplanesmayberequired.Noticethatthefirstvariationofthealgorithmhadmuchmoreextensiverequirementsperplane,thusitsimplementationwouldverylikelybelessefficientthanthesecondvariation.3FloatingPointDivide5,IO,lIHistorically,dividehasbeenlimitedbythefactthattheiterativeprocesswasdependentonpreviouspartialresults.Recently,attentionhasbeengiventotechniqueswhichdonothavethislimitation.ThesetechniquesincludeuseofNewtonRaphsoniterationsforseriesapproximationstoquotients.AssumeaQuotientbLet11b1X1xx23...fXf...whichcanberewrittenas1b1xlx21x4...1x2mThusQuotienta.Ix.lx2.1x4...1x2.Recallthatinbinarythefactor1xisrelatedto1xbyacomplementationoperation.Thus,thedenominatorisrewrittenas1x,complementedtoform1x,theproductis1x2whichiscomplementedtoform1x2,whichcontinuesthedevelopment.1906PROCEEDINGSOFTHEIEEEDECEMBERNoticethatthespeedissubstantiallyenhancedbythepresenceofahighspeedmultiplier.Infact,asmanyastwomultiplicationsmightproceedsimultaneously.oneforthequotientdevelopmentandoneforthenextdenominatorterm,ifthehardwarepermits.Inforcingquickconvergenceoftheseries,usuallyatablelookuparrangementisusedtoprovidethefirstapproximationforthequotient.Thereafter,subsequentiterationsdevelopdoubletheprecisionofthefirstapproximation.Floatingpointdividesunder700nanosecondshavebeenreportedusingthisscheme5.4AlgorithmsforAchievingMaximumEjiciencyintheConcurrentExecutionofIndependentUnitsOneofthepurposesinhavingindependentunitsdedicatedtoindividualinstructiontypesistoimprovetheexecutionofeachoftheinstructionclasses.Theotheradvantageisthattheseindependentunitsmayoperateconcurrentlyandservetoincreasetheoverallbandwidthoftheexecutionunit.Tomasulo4describesanelegantalgorithmtoallowtheachievementofmaximumefficiencyintheconcurrentexecutionofunitsFig.5,illustrationfromAmdahlI.Theoperationofthisalgorithmtakesadvantageofthelatenttimefollowingtheissuingoftherequesttoaccesstheoperandsfromstorageintothememoryintheexecutionareaeithervirtualoraddressablebuffers.Whenaninstructionisforwardedtotheexecutionarea,awordintheexecutionunitmemoryisreservedforitsoperand,buttheoperandhasnotyetarrived.Forexample,iftheinstructiontobeperformedconsistedofamultiply,thenitcouldbeimmediatelyforwardedtothemultiplicationunit.TheLOADR,,aMULTRI,pADDR,,R2STORERLOADRI,bFig.5.EffectofTomasuloconcurrencyalgorithm.aSequencewithconventionaldependency.bSequenceasperformedaftertagwasforwarded.multiplierwouldrequireatagrepresentingtheoperand.Theexecutionareaisprovidedwithacommondatabusandthetagrepresentingtheoperandisbroadcastonthebusonecycleearly,thentheexecutionfunctionalunitsexaminetheirqueuesofrequiredoperandsandgateinimmediatelyanappropriateonewithoutwaitingforittogotothebufferstorage.Thetagforwardingfreesthebufferstorageofhavinganyresponsibilityforthisparticularoperand.Ofcourse,atthesametime,aloadcouldthenbeexecutedintothatsameword.Thus,thesecondloadinstructionandthemultiplyinstructioncouldproceedconcurrentlyinafashionimpossiblebefore.Results,astheybecomeavailable,mightalsobeforwardedtounitstheadderinFig.5requestingactionbyasimilarmechanism,ratherthanproceedingdirectlytostorage,whethervirtualoraddressed.Thisavoidsintermediatestopsandhenceimprovesoverallefficiencyofexecutionbyimprovingtheoverlapintheconcurrencysystem.C.BranchingAssumeforthemomentthatmemorybandwidthandexecutionbandwidthhavenowbeenarrangedsothattheymorethansatisfytherequirementofoneinstructionprocessedperdecodecycle.WhatthenwouldlimittheperformanceofaSISDsystemIfitisassumedthatthereareafixednumberofdatadependentbranchpointsinagivenprogram,then,byoperatinginaconfluentorotherhighspeedmode,weessentiallybringthebranchpointsclosertogether.However,theresolutiontimeofthedatadependentbranchpointisbasicallyfixedforasystemwithagivenexecutionand/oraccessinglatency.Thebasicretrogressivefactoristhepresenceofbranchdependenciesintheinstructionstream.Amongthemanytypesofbranchinstructions,wehave2,71ExecuteTheoperandwhichisfetchedistobetreatedasaninstructionoraninstructioncounter.Thispresentssomeproblemssincetheaccesslatencyisinsertedintotheinstructionstream.Itisessentiallythesameproblemasindirectaddressingoradoubleoperandfetch.Onecouldanticipatesomeofthisdifficultybykeepingaheadintheinstructionstream.However,thisparticularinstructiondoesnotposeaseriesdegradationproblembecauseitsuseisnormallyrestrictedtolinkages.2BranchUnconditionalTheeffectiveaddresssogeneratedbecomesthecontentsoftheinstructioncounter.Thissubclassofinstructionpresentsthesameproblemasexecute.3BranchonIndexThecontentsofanindexregisterisdecrementedbyoneoneachiterationuntiltheregisteriszero,uponwhichthealternatepathistaken.Onlytheaccesslatencyisafactorsincezerocanbeanticipated.4DataDependentBranchThepathsofthebrancharedeterminedbytheconditionsign,bit,status,etc.ofsomedatacell.Invariably,thisconditionisdependentontheexecutionofapreviouslyissuedinstruction.1966FLYNNVERYHIGHSPEEDCOMPUTING1907Herethedegradationisseriousandunavoidable.Firsttheexecutionoftheconditiongeneratingdatamustbecompleted.Then,thetestismadeandthepathisselected.Now,theoperandscanbefetchedandconfluencycanberestored.Itispresumedthatbothinstructionpathswerepreviouslyfetchedbutnotethatthisisdoneonlyattheexpenseofgreatermemorybandwidthrequirements.Anotherslightimprovementcanbemadeiftheoperandsare.fetchedforonealternativepathsolongastheunresolvedbranchpathisnotexecutedtoavoidseriousrecoveryandreconstructionproblemsiftheguessproveswrong.Ofthetwoloopclosingorconditionalbranchinstructions,BranchonIndexhaslessdegradationduetoresolutionoflatencythanDataDependentBranch.Whenanoptionexistsasintheperformanceofaknownnumberofiterationstheprogrammershouldselecttheformer.AssumingthattheDataDependentBranchresolvinglatencyisaconstantforanyparticularmachine,wemayshowitsrelationshipFig.6toperformancebyassumingvariouspercentagesofoccurrenceinexecutedcodeofthistypeinstruction.Thelatencyrepresentsthesumofaverageexecutiontimeplusoperandaccesstimefrommemory.Figure6assumesthattheorganizationunderconsidcrationhasenoughconfluencytoperformoneinstruction/cycleontheaveragewithoutbranchresolutioninterruptions.Toresolvethebranch,itisgenerallynecessarytofullyexecuteandtesttheresultoftheprecedinginstruction.Duringthistime,fetchingofinstructionsanddatamayproceedforonebranchpathandpossiblyafewinstructionsmaybefetchedforthealternate.FetchingofbothpathsdoublesthebandwidthrequiredandincreasesthewaitingtimeFig.3.Thusthelatencyincludesanaverageexecutiontime,testtime,and.a.percentagewrongpathguessesoftheoperandaccesstime.Noticethatwhilewehavestudieddegradation.duetonNUMBEROFCYCLESOFEXECFTIONANDACCESSLATENCYII.oWWJvuZtUVE0.5UWuz0Fig.6.1020PIIXNTASEOFOCEURREEYGEffCONDlTlOULBRANUIINSTRUCTIOISINImmnowsTRumDegradationduetodatadependentbranchinstructions.latencyfortheSJSDorganization,multiplestreamorganizationsmayexhibitbranchinduceddegradationduetoeitherthssamephenomenonorananalogspatialinefficiencyinwhichonlyonepathactivatesordeterminestheoutcomeofadependency.CLASSESOFORGANIZATIONIntlussectionweshallconsidereachoftheorganizationalclasseslistedinthefirstsectionofthepaper.Wewillremarkonorillustraterepresentativesystemsthatfallintoeachclass.Thevariousconfigurationsarebynomeansexhaustive.A.ConJluentSISDTheconfluentSISDprocessorIBMSTRETCH7,CDC6600series8,IBM360/90series2l5achievesitspowerbyoverlappingthevarioussequentialdecisionprocesseswhichmakeuptheexecutionoftheinstructionFigs.1and2.Inspiteofthevariousschemesforachievingarbitrarilyhighmemorybandwidthandexecutionbandwidth,thereremainsanessentialconstraintinthistypeoforganization.Asweimpliedbefore,thisbottleneckisthedecodingofoneinstructioninaunittime,thus,nomorethanoneinstructioncanheretiredinthesametimequantum,.ontheaverage.Ifoneweretotrytoextend.thisorganizationbytakingtwo,three,orndifferentinstructionsinthesamedecodecycle,andnolimitationswereplacedoninstructioninterdependence,thenumberofinstructiontypestobeclassifiedwouldbeincreasedbythecombinatorialamountMdifferentinstructionstakennatatimerepresentsMdifferentoutcomesandthedecodingmechanismwouldbecorrespondinglyincreasedincomplexity.Ontheotherhand,onecouldplacerestrictionsontheoccurrenceofeitherspecifiedtypesofinstructionsorinstructiondependencies.This,inturn,narrowstheclassofproblemsforwhichthemachineissuitableand/ordemandsrestrictiveprogrammingpractices.Indeed,thisisacharacteristicofmultiplestreamorganizationssincethemultiplicityorparallelismimpliesindependentsimultaneousaction.B.SIMD1418SIMDtypestructureshavebeenproposedbyUnger14,SlotnikI51SOLOMON,ILLIACIV,CraneandGithens16,and,morerecently,byHellerman17.SOLOMONistheclassicSIMD.Therearenuniversalexecutionunitseachwithitsownaccesstooperandstorage.Thesingleinstructionstreamactssimultaneouslyonthenoperandswithoutusingconfluencetechniques.Increasedperformanceisgainedstrictlybyusingmoreunits.Communicationbetweenunitsisrestrictedtoapredeterminedneighborhoodpatternandmustalsoproceedinauniversal,uniformfashionFig.7a.NoteSOLOMONhasbeensupersededbyILLIACIVasasystembeingactivelydeveloped,whichisnolongercompletelySIMD.ThedifficultieswithSIMDare1LatencyintheinstructionstreamforSIMDbranchesisnowreplacedbylatencyinthedatastreamcausedbyPROCEEDINGSOFTHEIEEEDECEMBER19082operandcommunicationforwardingproblems.Presently,thenumberofclassesofproblemswhoseoperandstreamshavetherequiredcommunicationregularityisnotwellestablished.SIMDorganizationsareinconsistentwithstandardalgorithmictechniquesincluding,andespecially,compilertechniClUeS.3TheuniversalityoftheexecutionunitsdeprivethemofImaximumefficiency.C.MISDThesestructureshavereceivedmuchlessattention18,19.AnexampleofsuchastructureisshowninFig.7b.ItbasicallyemploysthehighbandwidthdedicatedexecutionunitasdescribedintheconfluentSISDsection.Thisunitisthensharedbynyirtualmachinesoperatingonprogramsequencesindependentofoneanother.Eachvirtualmachinehasaccesstotheexecutionhardwareoncepercycle.Eachvirtualmachine,ofcourse,hasitsownEXECUTION1UNITSUINSTRUCTIONUNITLIMITEDWMMUNICATION4BUSESARETIMESHAREDMULTIPLYBUSIUNIT9INSTRUCTIONIIIIDAlAdHlSTORAGEEXECUTIONUNITEXECUTIONUNITEXECUTIONSOURCEDATASTREAYDERIVEDDATISTREAYcFig.7.aSIMD.bMISD.ConvertstoMIMDifinstructionsanddataareprivatelymaintainedtogether.cMISD.privateinstructionmemoryandinteractionbetweeninstructionstreamsoccursonlyviathecommondatamemory.Presumably,ifthereareNinstructionunitsthenthebandwidthofthecommondatastoragemustbeNtimesgreaterthantheindividualinstructionstorage.ThisrequirementcouldbesubstantiallyreducedbyuseofamodifiedversionofTomasulostagforwardingalgorithmandaseparatecommonforwardingbus.AnotherversionofthisFig.7cwouldforceforwardingofoperands.ThusthedatastreampresentedtoExecutionUnit2istheresultantofExecutionUnit1operatingitsinstructiononthesourcedatastream.Theinstructionthatanyunitperformsmaybefixedspecializedsuchthattheinterconnectionorsetupofunitsmustbeflexible,semifixedsuchthatthefunctionofanyunitisfixedforonepassofadatafileorvariablesothatthestreamofinstructionsoperatesatanypointonthesingledatastream.Undersuchanarrangement,onlythefirstexecutionunitseesthesourcedatastreamandwhileitisprocessingtheithoperandtheithexecutionunitisprocessingtheithderivationofthefirstoperandofthesourcestream.D.MIMDIfwereconstructtheorganizationofFig.7bsothatthedataandinstructionstreamsaremaintainedtogetherinprivatememories,wehaveanexampleofMIMD.Thereisnointeractionorminimuminteractionallowedbetweenthesevirtualmachines.Solongasthelatenttimeforexecutionofanyoperationislessthanthememorycycle,noproblemsariseduetobranching.Thus,suchanarrangementallowsmaximumadvantageoftheallowablebandwidthsofexecutionunitandmemoryunit.Ofcourse,suchanapproachmightwellbecriticizedonthebasisthattherequirementofindependenceoftheinstructionsequencesdoesnotaddressitselftotherequirementsoflargescientificproblems.Itwouldbebettersuitedtotheneedsofthetimesharingand/orutilityenvironment.ThisrestrictedMIMDpointsuptheshortcomingofourorganizationaldefinitions.Thespecificationsfailtoincludeaclassdicationofhowthestreamsmayinteract,thusarestrictedMIMDmaybeorganizationallymuchsimplerthanaconfluentSISD.GeneralMIMDstructureshavebeenmorewidelydescribed20E23withlargescalemultiplicitybeingenvisionedbyHolland21andmorerestrictedimplementationsbeingundertakenbyBurroughsandUnivac23.Inhisoriginalproposal,Hollandconsidersanarrayofprocessorseachwithaonewordstorage.Themodulesareindependentandarecapableofconcurrentexecution.Theprocessorshavearithmeticabilitybutcommunicationislimitedbytheirneedtobuildapathtotheappropriateoperand.Inpathbuilding,thedisplacementanddirectionoftheoperandisspecifiedandtheinterveningmoduleprocessorsformavinculum.SuchanarrangementmightsolvesomeoftheessentialblockageproblemsofSEDandSIMD,sinceindependentprogramsegmentsproceedsimultaneously.Also,memorybandwidthisalwaysadequate.1966FLYNNVERYHIGHSPEEDCOMPUTING1909ThedifficultieswithsuchanarrangementofMIMDtime,andthismaybefurtherdegradedbytheoccurrenceofgenerallyincluder201conditionalbranchinstructions.1interconnectionsbetweenunits,whichposeserious2theuniversalnatureoftheindividualmodule,which3theclassofproblemswhichcouldutilizeMIMDinterferenceproblemslimitsitsefficiencyorganization,whchispresentlysmall.SYSTEMSREQUIREMENTSTheeffectivenessoftheoverallcomputingprocessesmustbemeasuredonabasismuchlargerthanperformancenanosecondsperinstructionalone.Evenonthisprimitivebasis,itisobviousthatdifferentinstructionrepertoiremayimplysubstantiallydifferenteffectivenesswiththesameaveragenanosecondperinstructionratio.Despiteouroriginalassumptionthattheverylargeproblemdoesnotemploysignificantinput/output,itis.soonrealizedthatthis.requirementisunduly.restrictive.Presently,highspeedsystemsremainsolimited.Itmaybesometimebeforebroadereffectiveutilizationmayberealizedthroughnewdevelopmentsininput/outputequipment.Ofcourse,theoverallmeasureofefficiencyoftheprocessoristhenumberofcorrectlycompletedcomputationsandprogramsdoneoveranextendedperiodoftime.Includedinthismeasureisthereliability.andthemaintainabilityofthesystem.Complexsystemswithverylargenumbersofcomponentsarenaturallyverydifficulttomaintain,unlessfeatureswhichprovidefaultlocationareincluded.Amajorcomponentofsuchfeatureswouldbecheckingorfaultdetection.ofalloperationsanddatatransfers.Ondetectionoferror,hardwareaideddiagnosticsshouldbeprovidedsothatservicingandmaintenancemightbereadilyandeasilyaccomplished.Ifcheckingisnotincludedinthehardware,thenitis,ofcourse,incumbentontheusertoprogramathoroughcheckofhisresults.This,ofcourse,representsanoverheadwhichpenalizestheeffectiveperformanceandutilizationoftheequipment.Noticethatsomeofthesuggestedorganizationscater.torestrictiveproblemsetsparticularlytheSIMD,wheremultipleoperandsareexecutedbythesameinstructionstream.Suchorganizationsareclearlylimitedforgeneralpurposeprocessingwheretheremaybemanyinteractionsbetweenoperandelements.Similarly,withtheMISDorganizations,intheorganizationshownonFig.7b,anumberofindependentinstructionstreamsareinvolvedandtheirveryindependenceprecludesMISDuseononelargeproblemcomposedofessentialdependenciesthis,ofcourse,istypicaloflargescalescientificprogramming.CONCLUSIONSConventionalSingleInstructionStreamSingleDataStreamSISDprocessorsmaybeenhancedbyconcurrencyconfluenceofinstructionhandling,operandacquisition,andexecution.Thereis,however,alimitationoftheorderoftheexecutionofoneinstructionperinstructiondecodingBymultiplexingtheinstructionstreamorthedatastreamorboth,newclassesofprocessorswhichdonotnecessarilysharethislimitaredeveloped.Theireffectivenessdependsonthenatureoftheproblemanditisanopenquestionastowhethernewalgorithmswhichwillservetoextendtheirusefulnesscanbedeveloped.ACKNOWLEDGMENTTheauthorisindebtedtoD.JacobsohnandR.AschenbrenneroftheArgonneNationalLaboratoryforseveralvaluablediscussionsonsomeofthematerial.REFERENCESIG.M.AmdahlandM.J.Flynn,Engineeringaspectsoflargehghspeedcomputerdesign,Proc.Symp.onMicroelectronicsandLargeSystems.Washington,D.C.Spartan,1965,pp.7795.2D.W.Anderson,F.J.Sparacio,andR.M.Tomasulo,TheModel91machinephilosophyandinstructionhandling,IBMJ.Res.andDev.,November1966.3L.J.Boland,G.D.Granito,A.U.Marcotte,B.M.Messina,andJ.W.Smith,TheModel91storagesystem,ibid.4R.M.Tomasulo,Anefficientalgorithmforautomaticexploitationofmultipleexecutionunits,ibid.5S.F.Anderson,J.Earle,R.E.Goldschmidt,andD.M.Powers,TheModel91executionunit,ibid.6L.W.Cotton,Circuitimplementationofhighspeedpipelinesystems,1965Proc.AFIPSFJCC,p.489.7W.Buchholz,.Ed.,PlanningaComputerSystem.NewYorkMcCrawHill,1962.8J.E.Thomton,,ParalleloperationintheControlData6600,Proc.AFIPS1964FJCC,pt.11,pp.334.191I.Flores,Derivationofawaitingtimefactorforamultiplebankmemory,J.ACM,vol.1I,pp.265282,July1964.loM.Lehman,D.Senzig,andJ.Lee,Serialarithmetictechniques,Proc.AFIPSI965FJCC,pp.715725.llR.E.Goldschmidt,Analgorithmforhighspeeddivision,M.S.thesis,Mass.Inst.Tech.,Cambridge,June1965.12C.S.Wallace,Asuggestionfor4fastmultiplier,IEEETrans.onElectronicComputers,vol.EC13,pp.1417,February1964.13D.Jacobsohn,Asuggestionforafastmultiplier,IEEETlans.onElectronicComputersCorrespondence,vol.EC13,p.754,December1964.14S.H.Unger,Acomputerorientedtowardspatialproblems,Proc.IRE,pp.174,October1958.I51D.L.Slotnick,W.C.Borch,andR.C.McReynolds,TheSolomonComputerapreliminaryreport,Proc.I962WorkshoponComputerOrganization.Washington,D.C.Spartan,1963,pp.692.161B.A.CraneandJ.A.Githens,BulkprocessingindistributedlogicMemory,IEEETrans.onElectronicComputers,vol.EC14,pp.186196,April1965.17H.Hellerman,Parallelprocessingofalgebraicexpressions,IEEETrans.onElectronicComputers,vol.EC15,pp.8291,February1966.I81D.N.SenzigandR.V.Smith,ComputerorganizationforarrayprocessingProc.AFIPSI965FJCC,pp.117129.I91R.AschenbrennerandG.Robinson,Intrinsicmultiprocessing,ArgonneNationalLaboratory,Argonne,Ill.,ANLTech.Memo.121,June1964.20W.Comfort,Highlypamllelmachines,Proc.I962WorkshoponComputerOrganization.Washington,D.C.Spartan,1963,pp.126155.21J.H.Holland,Auniversalcomputercapableofexecutinganarbitrarynumberofsubprogramssimultaneously,I959Proc.EJCC,pp.108113.22R.Reiter,Astudyofamodelforparallelcomputation,UniversityofMichigan.AnnArbor,Tech.Rept.,ISL654,July1965.23D.R.LewisandG.E.Mellen,StretchingLARCscapabilityby1anewmultiprocessorsystem,presentedatthe1964Symp.onMicroelectronicsandLargeSystems,Washington,D.C.
编号:201401051948326818    大小:959.11KB    格式:PDF    上传时间:2014-01-05
  【编辑】
5
关 键 词:
工业、机械、能源、设计、建模、模具、工学
温馨提示:
1: 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
2: 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
3.本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
5. 人人文库网仅提供交流平台,并不能对任何下载内容负责。
6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
  人人文库网所有资源均是用户自行上传分享,仅供网友学习交流,未经上传用户书面授权,请勿作他用。
0条评论

还可以输入200字符

暂无评论,赶快抢占沙发吧。

当前资源信息

4.0
 
(2人评价)
浏览:11次
baixue100上传于2014-01-05

官方联系方式

客服手机:13961746681   
2:不支持迅雷下载,请使用浏览器下载   
3:不支持QQ浏览器下载,请用其他浏览器   
4:下载后的文档和图纸-无水印   
5:文档经过压缩,下载后原文更清晰   

相关资源

相关资源

相关搜索

工业、机械、能源、设计、建模、模具、工学  
关于我们 - 网站声明 - 网站地图 - 友情链接 - 网站客服客服 - 联系我们
copyright@ 2015-2017 人人文库网网站版权所有
苏ICP备12009002号-5