容错计算英文版课件:lec03-combin_第1页
容错计算英文版课件:lec03-combin_第2页
容错计算英文版课件:lec03-combin_第3页
容错计算英文版课件:lec03-combin_第4页
容错计算英文版课件:lec03-combin_第5页
已阅读5页,还剩14页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、Fault-Tolerant ComputingMotivation, Background, and ToolsOct. 20061Combinational ModelingAbout This PresentationEditionReleasedRevisedRevisedFirstOct. 2006This presentation has been prepared for the graduate course ECE 257A (Fault-Tolerant Computing) by Behrooz Parhami, Professor of Electrical and C

2、omputer Engineering at University of California, Santa Barbara. The material contained herein can be used freely in classroom teaching or any other educational setting. Unauthorized uses are prohibited. Behrooz ParhamiOct. 20062Combinational ModelingCombinational ModelingOct. 20063Combinational Mode

3、lingWhen model does not match reality.Oct. 20064Combinational ModelingA Motivating ExampleFive-site distributed computer systemData files to be stored on the sites so that they remain available despite site and link malfunctionsS = Site availability (aS in handout)L = Link availability (aL in handou

4、t)Some possible strategies: Duplication on home site and mirror site Triplication on home site and 2 backups Data dispersion through codingHere, we ignore the important problem of keeping the replicas consistent and do not worry about malfunction detection and attendant recovery actionsOct. 20065Com

5、binational ModelingData Availability with Home and Mirror SitesA = SL + (1 SL)SL = 2SL (SL)2 RequesterRDDHome Mirror Assume data file must be obtained directly from a site that holds it For example, S = 0.99, L = 0.95, A = 0.9965With no redundancy, A = 0.99 0.95 = 0.9405 Combinational modeling: Cons

6、ider all combinations of circumstances that lead to availability/success (unavailability/failure)RDDRDDRDDRD1 SLSLSL 1 L(1 S)LAnalysis by considering mutually exclusive subcasesOct. 20066Combinational ModelingData Availability with TriplicationA = SL + (1 SL)SL + (1 SL)2SL = 3SL 3(SL)2 + (SL)3 Reque

7、sterRDDDHome Backup 1 For example, S = 0.99, L = 0.95, A = 0.9998With duplication, A = 0.9965 With no redundancy, A = 0.9405RDDDRDDRDDDRDD1 SL 1 L(1 S)LBackup 2 RDDDRDDDRDDSL 1 L(1 S)L1 SL SL Can merge these two casesA = SL + (1 SL) SL + (1 SL)SLOct. 20067Combinational ModelingData Availability with

8、 File DispersionFile accessible if 2 out of 4 sites accessibleA = (SL)4 + 4(1 SL)(SL)3 + 6(1 SL)2(SL)2 = 6(SL)2 8(SL)3 + 3(SL)4 RequesterRddddPiece 2 Encode an l-bit file into 5l/3 bits (67% redund.) Break encoded file into 5 pieces of length l/3 Store each piece on one of the 5 sitesAny 3 of the 5

9、pieces can be used to reconstruct the original file Piece 1 Piece 4 Piece 3 Piece 5 For example, S = 0.99, L = 0.95, A = 0.9992, Redundancy = 67%With duplication, A = 0.9965, Redundancy = 100%With triplication, A = 0.9998, Redundancy = 200%With no redundancy, A = 0.9405Oct. 20068Combinational Modeli

10、ngSeries SystemA system composed of n units all of which must be healthy for the system to function properlyR = P RiExample: Redundant system ofvalves in series with regard to stuck-on-shut malfunctions (tolerates stuck-on-open valves)Example: Redundant system of valves in parallel with regard to to

11、 stuck-on-open malfunctions (tolerates stuck-on-shut valves)Oct. 20069Combinational ModelingSeries System: Implications to DesignAssume exponential reliability lawRi = exp li t R = P Ri = exp (Sli) t Given the reliability goal r, find the required value of Sli Assign a failure rate “budget” to each

12、unit and proceed with its designMay have to reallocate budgets if design proves impossible or costlyOct. 200610Combinational ModelingParallel SystemA system composed of n units, the health of one of which is enough for the system to function properly1 R = P (1 Ri )That is, the system fails only if a

13、ll units malfunctionExample: Redundant system of valves in series with regard to stuck-on-open malfunctions (tolerates stuck-on-open valves)Example: Redundant system ofvalves in parallel with regard to stuck-on-shut malfunctions (tolerates stuck-on-shut valves)Oct. 200611Combinational ModelingParall

14、el System: Implications to DesignAssume exponential reliability lawRi = exp li t 1 R = P (1 Ri )Given the reliability goal r, find the required value of P (1 Ri ) Assign a failure probability “budget” to each unit For example, with identical units, 1 Rm = n 1 r Assume r = 0.9999, n = 4 1 Rm = 0.1 (m

15、odule reliability must be 0.9)Conversely, for r = 0.9999 and Rm = 0.9, n = 4 is neededOct. 200612Combinational ModelingThe Concept of CoverageFor r = 0.9999 and Ri = 0.9, n = 4 is neededStandby sparing: One unit works; others are also active concurrently or they may be inactive (spares)When a malfun

16、ction of the main unit is detected, it is removed from service and an alternate unit is brought on-line; our analysis thus far assumes perfect malfunction detection and reconfigurationR = 1 (1 Rm)n = Rm 1 (1 Rm)n1 (1 Rm)Let the probability of correct malfunction detection and successful reconfigurat

17、ion be c (coverage factor)R = Rm See Siew92, p. 2881 cn(1 Rm)n1 c(1 Rm)Oct. 200613Combinational ModelingSeries-Parallel SystemThe system functions properly if a string of healthy units connect one side of the diagram to the other1 R = (1 R1 R2) (1 R3 R4)Example: Parallel connection of series pairs o

18、f valves (tolerates one stuck-on-shut and one stuck-on-open valve)3142Example: Series connection of parallel pairs of valves (tolerates one stuck-on-shut and one stuck-on-open valve)3142R = 1 (1 R1)(1 R3) 1 (1 R2)(1 R4)Oct. 200614Combinational ModelingMore Complex Series-Parallel SystemsThe system f

19、unctions properly if a string of healthy units connect one side of the diagram to the otherR = R2 prob(system OK | Unit 2 OK) + (1 R2) prob(system OK | Unit 2 not OK)R2OK = 1 (1 R1)(1 R6) 1 (1 R3)(1 R5) R4243156We can think of Unit 5 as being able to replace Units 2 and 34315643156R2OK = 1 (1 R1R5)(

20、1 R3R6) R4R2OKR2OKOct. 200615Combinational ModelingAnalysis Using Success PathsR 1 Pi (1 Rith success path)R 1 (1 R1R5R4) * (1 R1R2R3R4)(1 R6R3R4)243156This yields an upper bound on reliability because it assumes the paths to be independent2431564314With equal module reliabilities:R 1 (1 Rm3)2 (1 Rm

21、4) If we expand * by multiplying out, removing any power for the various reliabilities, we get an exact reliability expressionR = 1 (1 R1R4R5)(1 R3R4R6 R1R2R3R4 R1R2R3R4R6) = R3R4R6 + R1R2R3R4 + R1R2R3R4R6 + R1R4R5 R1R3R4R5R6 R1R2R3R4R5 R1R2R3R4R5R6 (Verify for the case of equal Rj ) Oct. 200616Comb

22、inational Modelingk-out-of-n SystemThere are n modules, any k of which are adequate for proper system functioningV231Example: System with 2-out-of-3 votingAssume perfect voterR = R1 R2 R3 + R1 R2 (1 R3) + R2 R3 (1 R1) + R3 R1 (1 R2) With all units having the same reliability Rm and imperfect voter:R

23、 = (3Rm2 2Rm3) RvTriple-modular redundancy (TMR)R = Sj = k to n ( )Rmj (1 Rm)nj k-out-of-n system in generalnjAssuming that any 2 malfunctions in TMR lead to failure is pessimisticWith binary outputs, we can model compensating errors (when two malfunctioning modules produce 0 and 1 outputs)Oct. 200617C

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

最新文档

评论

0/150

提交评论