Multicycle conclusion (courses.cs.washington.edu)
Multicycle conclusion
• My office hours: move to Mon or Wed?
• Plan: pipelining this week and next, maybe performance analysis.
• Today:
  - Microprogramming
  - Extending the multi-cycle datapath
  - Multi-cycle performance

The multicycle datapath
[Figure: the multicycle datapath, showing the memory, instruction register (fields 31-26, 25-21, 20-16, 15-11, 15-0), memory data register and ALUOut, with control signals PCWrite, MemRead, MemWrite, IRWrite, RegDst, RegWrite, MemToReg, ALUSrcA, ALUSrcB and PCSource.]

Finite-state machine for the control unit
[Figure: state diagram of the control unit. States: Instruction fetch and PC increment; Register fetch and branch computation; Branch completion (Op = BEQ); R-type execution, then R-type writeback (Op = R-type); Effective address computation (Op = LW/SW), then either Memory write, or Memory read followed by Register write.]

Implementing the FSM
• This can be translated into a state table; here are the first two states. The last twelve columns are the outputs (control signals).

  Current state | Input (Op) | Next state       | PCWrite | IorD | MemRead | MemWrite | IRWrite | RegDst | MemToReg | RegWrite | ALUSrcA | ALUSrcB | ALUOp | PCSource
  Instr Fetch   | X          | Reg Fetch        | 1       | 0    | 1       | 0        | 1       | X      | X        | 0        | 0       | 01      | 010   | 0
  Reg Fetch     | BEQ        | Branch compl     | 0       | X    | 0       | 0        | 0       | X      | X        | 0        | 0       | 11      | 010   | X
  Reg Fetch     | R-type     | R-type execute   | 0       | X    | 0       | 0        | 0       | X      | X        | 0        | 0       | 11      | 010   | X
  Reg Fetch     | LW/SW      | Compute eff addr | 0       | X    | 0       | 0        | 0       | X      | X        | 0        | 0       | 11      | 010   | X

• You can implement this the hard way.
  - Represent the current state using flip-flops or a register.
  - Find equations for the next state and (control signal) outputs in terms of the current state and input (instruction word).
• Or you can use the easy way.
  - Stick the whole state table into a memory, like a ROM.
  - This would be much easier, since you don't have to derive equations.

Pitfalls of state machines
• As we just saw, we could translate this state diagram into a state table, and then make a logic circuit or stick it into a ROM.
• This works pretty well for our small example, but designing a finite-state machine for a larger instruction set is much harder.
  - There could be many states in the machine. For example, some MIPS instructions need 20 stages to execute in some implementations, each of which would be represented by a separate state.
  - There could be many paths in the machine. For example, the DEC VAX from 1978 had nearly 300 opcodes: that's a lot of branching!
  - There could be many outputs. For instance, the Pentium Pro's integer datapath has 120 control signals, and the floating-point datapath has 285 control signals.
  - Implementing and maintaining the control unit for processors like these would be a nightmare. You'd have to work with large Boolean equations or a huge state table.

Motivation for microprogramming
• Think of the control unit's state diagram as a little program.
  - Each state represents a "command," or a set of control signals that tells the datapath what to do.
  - Several commands are executed sequentially.
  - "Branches" may be taken depending on the instruction opcode.
  - The state machine "loops" by returning to the initial state.
• Why don't we invent a special language for making the control unit?
  - We could devise a more readable, higher-level notation rather than dealing directly with binary control signals and state transitions.
  - We would design control units by writing "programs" in this language.
  - We will depend on a hardware or software translator to convert our programs into a circuit for the control unit.

A good notation is very useful
• Instead of specifying the exact binary values for each control signal, we will define a symbolic notation that's easier to work with.
• As a simple example, we might replace ALUSrcB = 01 with ALUSrcB = 4.
• We can also create symbols that combine several control signals together. Instead of
      IorD = 0
      MemRead = 1
      IRWrite = 1
  it would be nicer to just say something like
      Read PC

Microinstructions

  Label | ALU control | Src1 | Src2 | Register control | Memory | PCWrite control | Next

• For the MIPS multicycle we could define microinstructions with eight fields.
  - These fields will be filled in symbolically, instead of in binary.
  - They determine all the control signals for the datapath. There are only 8 fields because some of them specify more than one of the 12 actual control signals.
  - A microinstruction corresponds to one execution stage, or one cycle.
• You can see that in each microinstruction, we can do something with the ALU, register file, memory, and program counter units.

The first stage, the microprogramming way
• Below are two lines of microcode to implement the first two multicycle execution stages, instruction fetch and register fetch.
• The first line, labelled Fetch, involves several actions.
  - Read from memory address PC.
  - Use the ALU to compute PC + 4, and store it back in the PC.
  - Continue on to the next sequential microinstruction.

  Label | ALU control | Src1 | Src2     | Register control | Memory  | PCWrite control | Next
  Fetch | Add         | PC   | 4        |                  | Read PC | ALU             | Seq
        | Add         | PC   | Extshift | Read             |         |                 | Dispatch 1

  Label | ALU control | Src1 | Src2 | Register control | Memory | PCWrite control | Next
  BEQ1  | Sub         | A    | B    |                  |        | ALU-Zero        | Fetch

• Control would transfer to this microinstruction if the opcode was "beq".
  - Compute A-B, to set the ALU's Zero bit if A=B.
  - Update PC with ALUOut (which contains the branch target from the previous cycle) if Zero is set.
  - The beq is completed, so fetch the next instruction.
• The 1 in the label BEQ1 reminds us that we came here via the first branch point ("dispatch table 1"), from the second execution stage.

  Label  | ALU control | Src1 | Src2 | Register control | Memory | PCWrite control | Next
  Rtype1 | func        | A    | B    |                  |        |                 | Seq
         |             |      |      | Write ALU        |        |                 | Fetch

• What if the opcode indicates an R-type instruction?
  - The first cycle here performs an operation on registers A and B, based on the MIPS instruction's func field.
  - The next stage writes the ALU output to register "rd" from the MIPS instruction word.
• We can then go back to the Fetch microinstruction, to fetch and execute the next MIPS instruction.

  Label | ALU control | Src1 | Src2   | Register control | Memory    | PCWrite control | Next
  Mem1  | Add         | A    | Extend |                  |           |                 | Dispatch 2
  SW2   |             |      |        |                  | Write ALU |                 | Fetch
  LW2   |             |      |        |                  | Read ALU  |                 | Seq
        |             |      |        | Write MDR        |           |                 | Fetch

• For both sw and lw instructions, we should first compute the effective memory address, A + sign-extend(IR[15-0]).
• Another dispatch or branch distinguishes between stores and loads.
  - For sw, we store data (from B) to the effective memory address.
  - For lw we copy data from the effective memory address to register rt.
• In either case, we continue on to Fetch after we're done.

Microprogramming vs. programming
• Microinstructions correspond to control signals.
  - They describe what is done in a single clock cycle.
  - These are the most basic operations available in a processor.
• Microprograms implement higher-level MIPS instructions.
  - MIPS assembly language instructions are comparatively complex, each possibly requiring multiple clock cycles to execute.
  - But each complex MIPS instruction can be implemented with several simpler microinstructions.

Similarities with assembly language
• Microcode is intended to make control unit design easier.
  - We defined symbols like Read PC to replace binary control signals.
  - A translator can convert microinstructions into a real control unit.
  - The translation is straightforward, because each microinstruction corresponds to one set of control values.
• This sounds similar to MIPS assembly language!
  - We use mnemonics like lw instead of binary opcodes like 100011.
  - MIPS programs must be assembled to produce real machine code.
  - Each MIPS instruction corresponds to a 32-bit instruction word.

Managing complexity
• It looks like all we've done is devise a new notation that makes it easier to specify control signals.
• That's exactly right! It's all about managing complexity.
  - Control units are probably the most challenging part of CPU design.
  - Large instruction sets require large state machines with many states, branches and outputs.
  - Control units for multicycle processors are difficult to create and maintain.
• Applying programming ideas to hardware design is a useful technique.

• One disadvantage of microprograms is that looking up control signals in a ROM can be slower than generating them from simplified circuits.
• Sometimes complex instructions implemented in hardware are slower than equivalent assembly programs written using simpler instructions.
  - Complex instructions are usually very general, so they can be used more often. But this also means they can't be optimized for specific operands or situations.
  - Some microprograms just aren't written very efficiently. But since they're built into the CPU, people are stuck with them (at least until the next processor upgrade).

How microcode is used today
• Modern CISC processors (like x86) use a combination of hardwired logic and microcode to balance design effort with performance.
  - Control for many simple instructions can be implemented in hardwired logic, which can be faster than reading a microcode ROM.
  - Less-used or very complex instructions are microprogrammed to make the design easier and more flexible.
• In this way, designers observe the "first law of performance".
  - Make the common case fast!

The single-cycle datapath; what is the cycle time?
[Figure: the single-cycle datapath (instruction memory, registers, ALU, data memory, sign extend, shift left 2, adders; control signals RegDst, ALUOp, ALUSrc, PCSrc, MemToReg), with functional-unit delays of 3 ns, 3 ns, 2 ns and 3 ns marked.]
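
The question in this slide title can be worked through numerically. The following is a minimal sketch, not part of the original slides. It assumes the delays used in these notes (3 ns for a memory access, 3 ns for the ALU, 2 ns for the register file). Only the lw path is spelled out in the notes; the paths listed for sw, R-type and beq are the usual textbook ones and are an assumption here, as are all the names (`DELAY_NS`, `PATHS`, and the helper functions).

```python
# Assumed functional-unit delays in nanoseconds (values used in these notes).
DELAY_NS = {"memory": 3, "regfile": 2, "alu": 3}

# Units each instruction class passes through in the single-cycle datapath.
# Only the lw path is stated in the notes; the other three are assumptions.
PATHS = {
    "lw":     ["memory", "regfile", "alu", "memory", "regfile"],
    "sw":     ["memory", "regfile", "alu", "memory"],
    "r-type": ["memory", "regfile", "alu", "regfile"],
    "beq":    ["memory", "regfile", "alu"],
}


def path_delay_ns(units):
    """Total delay of one instruction: the sum of the units it uses."""
    return sum(DELAY_NS[u] for u in units)


def single_cycle_time_ns():
    """The single-cycle clock must fit the slowest instruction."""
    return max(path_delay_ns(units) for units in PATHS.values())


def clock_rate_mhz(cycle_ns):
    """Convert a clock period in ns to a frequency in MHz."""
    return 1000.0 / cycle_ns


if __name__ == "__main__":
    for name, units in PATHS.items():
        print(f"{name:7s} {path_delay_ns(units):2d} ns")
    cycle = single_cycle_time_ns()
    print(f"cycle time {cycle} ns, about {clock_rate_mhz(cycle):.0f} MHz")
```

Under these assumptions lw is the slowest path, so it alone sets the single-cycle clock period even though the other instruction classes would finish sooner.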

Performance of a multicycle implementation
• Let's assume the following delays for the major functional units.
[Figure: the multicycle datapath (PC, memory, instruction register, memory data register, registers), annotated with delays of 3 ns for the memory and 2 ns for the register file.]

Comparing cycle times
• The clock period has to be long enough to allow all of the required work to complete within the cycle.
• In the single-cycle datapath, the "required work" was just the complete execution of any instruction.
  - The longest instruction, lw, requires 13ns (3 + 2 + 3 + 3 + 2).
  - So the clock cycle time has to be 13ns, for a 77MHz clock rate.
• For the multicycle datapath, the "required work" is only a single stage.
  - The longest delay is 3ns, for both the ALU and the memory.
  - So our cycle time has to be 3ns, or a clock rate of 333MHz.
  - The register file needs only 2ns, but it must wait an extra 1ns to stay synchronized with the other functional units.
• The single-cycle cycle time is limited by the slowest instruction, whereas the multicycle cycle time is limited by the slowest functional unit.

Comparing instruction execution times
• In the single-cycle datapath, each instruction needs an entire clock cycle, or 13ns, to execute.
• With the multicycle CPU, different instructions need different numbers of clock cycles, and hence different amounts of time.
  - A branch needs 3 cycles, or 3 x 3ns = 9ns.
  - Arithmetic and sw instructions each require 4 cycles, or 12ns.
  - Finally, a lw takes 5 stages, or 15ns.
• We can make some observations about performance already.
  - Loads take longer with this multicycle implementation, while all other instructions are faster than before.
  - So if our program doesn't have too many loads, then we should see an increase in performance.

The gcc example
• Let's assume the gcc instruction mix.

  Instruction | Frequency
  Arithmetic  | 48%
  Loads       | 22%
  Stores      | 11%
  Branches    | 19%

• In a single-cycle datapath, all instructions take 13ns to execute.
• The average execution time for an instruction on the multicycle processor works out to 12.09ns.
      (48% x 12ns) + (22% x 15ns) + (11% x 12ns) + (19% x 9ns) = 12.09ns
• The multicycle implementation is faster in this case, but not by much. The speedup here is only 7.5%.
      13ns / 12.09ns = 1.075

This CPU is too simple
• Our example instruction set is too simple to see large gains.
  - All of our instructions need about the same number of cycles (3-5).
  - The bene...
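
The gcc arithmetic above can be reproduced with a few lines of code. This is a minimal sketch, not part of the original slides: the cycle counts, the 3 ns and 13 ns clock periods and the instruction mix come from the notes, while the names (`CYCLES`, `MIX`, `average_cpi`, and so on) and the use of "CPI" for average cycles per instruction are illustrative choices.

```python
MULTICYCLE_NS = 3     # multicycle clock period, from the notes
SINGLE_CYCLE_NS = 13  # single-cycle clock period, from the notes

# Cycles per instruction class on the multicycle datapath (from the notes).
CYCLES = {"arithmetic": 4, "load": 5, "store": 4, "branch": 3}

# gcc instruction mix assumed in the notes.
MIX = {"arithmetic": 0.48, "load": 0.22, "store": 0.11, "branch": 0.19}


def average_cpi(cycles, mix):
    """Average cycles per instruction, weighted by instruction frequency."""
    return sum(mix[kind] * cycles[kind] for kind in mix)


def average_time_ns(cycles, mix, cycle_ns):
    """Average execution time of one instruction."""
    return average_cpi(cycles, mix) * cycle_ns


def speedup(old_ns, new_ns):
    """How many times faster the new design is than the old one."""
    return old_ns / new_ns


if __name__ == "__main__":
    avg = average_time_ns(CYCLES, MIX, MULTICYCLE_NS)
    print(f"average CPI:  {average_cpi(CYCLES, MIX):.2f}")
    print(f"average time: {avg:.2f} ns")
    print(f"speedup:      {speedup(SINGLE_CYCLE_NS, avg):.3f}")
```

Changing the entries of `MIX` shows how sensitive the result is to the share of loads, which are the only instructions that get slower on the multicycle design.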
