《东北大学实时系统》PPT课件.ppt_第1页
《东北大学实时系统》PPT课件.ppt_第2页
《东北大学实时系统》PPT课件.ppt_第3页
《东北大学实时系统》PPT课件.ppt_第4页
《东北大学实时系统》PPT课件.ppt_第5页
已阅读5页,还剩58页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

Real-Time Systems,Nan Guan (关楠),Computing Evolution,Mainframe computing (60s-70s) Large computers to execute big data processing applications Desktop computing & Internet (80s-90s) One computer at every desk to do business/personal activities Embedded Systems (since 00s) Numerous computing devices in every place/person “Invisible” part of the environment Millions for desktops vs. billions for embedded processors,Embedded Systems,“Computer systems that are embedded into physical objectes ” 98% computers in the world are embedded systems Market share,3,Embedded Systems,Real-Time Systems,Real-Time Systems: “The correctness of the computation does not only depend on the logical results, but also the time when the results are produced.” Real-Time Systems are common in Embedded Systems Interaction between computers and the physical world,Real-Time Systems,Real-Time Systems,Real-Time Systems,Real-Time Systems,Real-Time Systems,We need to guarantee that all timing constraints are satisfied under any circumstance, BEFORE run-time How to do this? How about using a very fast processor? How about executing it and measuring the time? (Testing),11,The story about a sheep in Scotland,12,The story about the Pathfinder on Mars ,What do we learn?,Testing is inadequate,13,What do we learn?,Testing is inadequate, especially for real-time embedded systems Need to simulate the stimulus and feedback from the physical world,14,Even the NASA Lab fails, then how about your project?,What do we learn?,Difficult to fix the error, even if it is exposed in testing Difficult to reproduce Difficult to explain,15,Timing Analysis,How to rigorously verify the timing correctness? An extremely difficult problem ,16,Real-Time Computing Technology,Some leading countries USA Germany Sweden France Italy ,17,Research in Real-Time Systems,Conferences RTSS: IEEE Real-Time Systems Symposium RTAS: IEEE Real-Time and Technology and Applications Symposium ECRTS: Euromicro Conference on Real-time Systems Journal: Journal of Real-Time Systems,18,More details about Real-Time Systems in the following,19,Hard vs Soft Real-Time Systems,Definition based on functional criticality HARD: The failure to meet the timing constraints has disastrous consequences Braking system, collision detection, etc. SOFT: A few misses of deadlines do no serious harm, and only the systems overall performance degrades Video playback, etc. Problem: how serious is serious? - A design property,Hard vs Soft Real-Time Systems,Definition based on validation Validation: A demonstration by a provably correct, efficient procedure or by exhaustive simulation and testing HARD: The user require the validation that the system always meets the timing constraints SOFT: No validation is required, or only a demonstration that the job meet some statistical constraint suffices,Desirable Properties of RT Systems (1),Predictability the system behavior is known before it is put into operation e.g. Response times, deadlock freedom, etc. Analyzability easy to analyze if the system can meet all the deadlines Cost optimality e.g. Energy consumption, memory blocks, etc.,Desirable Properties of RT Systems (2),Maintainability modular structure to ease system modification Robustness must not collapse when subject to peak load, exception, manage all possible scenarios Fault tolerance hardware and software failures should not cause the system to crash,Why Achieving Predictability is Hard?,Challenges from hardware features Cache sharing, processor pipelines, multicores, DMA . Interrupt handling may introduce unbounded delays Resource sharing: priority inversion Memory management (static allocation may not be enough, dynamic data structures e.g. Queue), no virtual memory Communication delays in a distributed environment,Why Achieving Predictability is Hard?,Challenges from software features Difficult to calculate the worst case execution time for tasks (theoretically impossible, halting problem) Bounded loops e.g. For-loops only Complex synchronization patterns between tasks: potential deadlocks (formal verification) Multi-tasking, tasks that share resources,Misconceptions about RT Computing,Real-time computing is fast computing Real-time system research is performance engineering Real-time is time sharing,Overall Structure of a RT System,Hardware (CPU, I/O device etc) a clock! A real time OS (function as standard OS, with predictable behavior and well-defined functionality) A collection of RT tasks/processes (share recourses, communicate/synchronize,A Real-Time Control System,2019/8/4,28,A Real-Time Control System,Selection of the sampling period The digital world is discrete! The responsiveness of the overall system The dynamic behavior of the plant: keep oscillation small and the system under control,Further: Multirate Systems,The example of fly-by-wire avionics,Timing constraints Finish control law computation for each smallest sampling periods!,Further: Multirate Systems,For example The sampling rates of A, B, and C are 180Hz, 90Hz, 30Hz resp. The sampling periods are “harmonic”,Do the following in each 1 / 180 second cycle: Do A Do B once every 2 cycles Do C once every 6 cycles Output commands Wait until the beginning of the next cycle,Furthermore: A Car Controller,Activities of a car control system. Let C= worst case execution time T= (sampling) period D= deadline Control requirements,Construct a controller meeting all the deadlines!,The Overall System,Timing Analysis,Software system is complex,Timing Analysis,Usually performed bottom-up Program Level Analysis to bound the worst-case execution time (WCET) of each individual task assuming fully dedicated hardware System Level Analysis the WCET information of individual tasks is gathered, to investigate the contention of processing capacity among different tasks in a multitasking environment,Program Level,36,Execution Time of a Program,Execution times of programs are intrinsically variable!,Void signal_processing () read_signal( / complex error handling operations ,The Distribution of Execution Times,The Quality of Static Analysis,Safety The estimated upper bound should always enclose the WCET Tightness (Precision) The estimated upper bound should be as close as possible to the actual WCET Main objective: to minimize system over-provisioning Complexity There is always a trade-off between analysis efficiency and precision,System Level,40,Real-Time Scheduling,A Simple Task Model,Non periodic/Aperiodic Tasks Appear once only 3 parameters: A: arriving time C: computing time (resource requirement) D: deadline (relative deadline),42,Constraints on Tasks,Timing constraints: deadline for each task, Relative to arriving time Absolute deadline Precedence constraints Precedence graphs Input/output relation Resource constraints Mutual exclusion Resource access protocols,43,Scheduling Problems,Given a set of tasks (ready queue) Check if all deadlines can be met with a given schedule (schedulability check) Construct a ”feasible” schedule to meet all deadlines Construct an optimal schedule e.g. minimizing response times,44,Tasks with the Same Arrival Time,Assume a list of tasks (A, C1, D1)(A, C2, D2) .(A, Cn,Dn) that arrive at the same time i.e. A How to find a feasible schedule? (there may be many feasible schedules),EDF,EDF (Earliest-Deadline-First) order tasks with non-decreasing deadlines. Jackson 1955 Example: (0,1,10) (0,2,3) (0,3,5) Schedule: (0,2,3) (0,3,5) (0,1,10),46,EDF: Schedulability Test,Order tasks in increasing order of deadlines If C1+.+Ck =Dk for all k, the task set is schedulable Response time for task i, Ri =C1+.+Ci,47,EDF: Examples,(2, 4)(1,5)(6,10) is schedulable: Feasible schedule: (2,4)(1,5)(6,10) Note that (1,5)(2,4)(6,10) is also feasible (1,10)(3,3)(2,5) is schedulable The feasible schedule: (3,3)(2,5)(1,10) Why not shortest task first? (4,6)(1,10)(3,5) is not schedulable (3,5)(4,6)(1,10) is not feasible: 3+4 6!,48,EDF: Optimality,Assume that Ri is the finishing time of task i, i.e. response time. Let Li = Ri-Di (the lateness for task i) FACT1: EDF is optimal, minimizing the maximum lateness MAXi(Li) FACT2: EDF is optimal If EDF cannt find a feasible schedule for a task set, then no other algorithm can, which means that the task set is non schedulable. Note that even a task set is non schedulable, EDF may minimize the maximal lateness (minimizes e.g. the loss for soft tasks),49,Tasks with Different Arrival Times,Assume a list of tasks S= (A1, C1, D1)(A2,C2, D2) .(An,Cn,Dn) Preemptive EDF Horn 1974: Whenever new tasks arrive, sort the ready queue according to earlist deadlines Run the first task of the queue,50,Preemptive EDF: Schedulability Test,At any time, order the tasks according to EDF (A1, C1, D1) . (Ai,Ci,Di) If C1+.+Ck =Dk for all k=1,2.i, then the task set is schedulable at the moment If S is schedulable at all time points at which tasks arrive, S is schedulable Why?,51,Preemptive EDF: Example,Consider (1, 5, 11)(2,1,3)(3, 4,8) Deadlines are relative to arrival times At 1, (5,11) At 2, (1,3) (4,10) At 3, (4,8) (4,9),52,Preemptive EDF: Optimality,Assume that Ri is the finishing time/response time of task i. Let Li = Ri-Di (the lateness for task i) FACT1: preemptive EDF is optimal in minimizing the maximum lateness Lmax= MAXi(Li) FACT2: Preemptive EDF is optimal Dertouzos 1974 in finding feasible schedules.,53,Non Preemptive EDF,Run a task until its finished and then sort the queue according to EDF Easy to implement, less overhead (no more context switch than necessary) However it is not optimal, it may not find the feasible schedule even it exists.,54,Non-preemptive EDF: Example,55,“Work-conserving” Non-preemptive EDF,If we only consider non-idle algorithms (CPU waiting only if no tasks to run), is EDF is optimal? Unfortunately no! Example T1= (0, 10, 100) T2= (0, 1, 101) T3= (1,4,4) Run T1,T3,T2: the 3rd task will miss its deadline Run T2,T3,T1: it is a feasible schedule,56,Optimal Non-preemptive Schedule?,To construct “optimal” non-preemptive Schedule: “complete search, NP-hard” is needed The decision should be made according to all the parameters in the whole list of tasks The worst case is to test all possible combinations of n tasks (NP-hard, difficult for large n),57,Practical Methods,Backtrack whenever needed Search until a non-schedulable situation occur, then backtrack Bratleys algorithm Simple and easy to implement but may not find a schedule if n is too big (worst case),58,Example (Bratleys Algorithm),59,Better Heuristic Methods?,Discussion (assignment): Design a new heuristic method to construct good non-preemptive algorithms.,60,Outline of This Course,Program Level Timing Analysis Worst-case Execution Time (WCET) Analysis How to deal with the software complexity? How to deal with the hardware complexity? System Level Timing Analysis Real-Time Scheduling Theory Other Formal verification, Real-Time operating systems,61,What Happened on Mars?,62,The Mars Pathfinder mission was widely proclaimed as “flawless“ in the early days after its July 4th, 1997 landing on the Martian surface. Successes included its unconventional “landing“ - bouncing onto the Martian surface surrounded by airbags, deploying the Sojourner rover, and gathering and transmitting voluminous data back to Earth, including the panoramic pictures that were such a hit on the Web. But a few days into the mission, not long after Pathfinder started gathering meteorological data, the spacecraft began experiencing total system resets, each resulting in losses of data. The press reported these failures in terms such as “software glitches“ and “the computer was trying to do too many things at once“. Last night at the IEEE Real-Time Systems Symposium I heard a fascinating keynote address by David Wilner, Chief Technical Officer of Wind River Systems. Wind River makes VxWorks, the real-time embedded systems kernel that was used in the Mars Pathfinder mission. In his talk, he explained in detail the actual software problems that caused the total system resets of the Pathfinder spacecraft, how they were diagnosed, and how they were solved. I wanted to share his story with each of you. VxWorks provides preemptive priority scheduling of threads. Tasks on the Pathfinder spacecraft were executed as threads with priorities that were assigned in the usual manner reflecting the relative urgency of these tasks. Pathfinder contained an “information bus“, which you can think of as a shared memory area used for passing information between different components of the spacecraft. A bus management task ran frequently with high priority to move certain kinds of data in and out of the information bus. Access to the bus was synchronized with mutual exclusion locks (mutexes). The meteorological data gathering task ran as an infrequent, low priority thread, and used the information bus to publish its data. When publishing its data, it would acquire a mutex, do writes to the bus, and release the mutex. If an interrupt caused the information bus thread to be scheduled while this mutex was held, and if the information bus thread then attempted to acquire this same mutex in order to retrieve published data, this would cause it to block on the mutex, waiting until the meteorological thread released the mutex before it could continue. The spacecraft also contained a communications task that ran with medium priority. Most of the time this combination worked fine. However, very infrequently it was possible for an interrupt to occur that caused the (medium priority) communications task to be scheduled during the short interval while the (high priority) information bus thread was blocked waiting for the (low priority) meteorological data thread. In this case, the long-running communications task, having higher priority than the meteorological task, would prevent it from running, consequently preventing the blocked information bus task from running. After some time had passed, a watchdog timer would go off, notice that the data bus task had not been executed for some time, conclude that something had gone drastically wrong, and initiate a total system reset. This scenario is a classic case of priority inversion. How was this debugged? VxWorks can be run in a mode where it records a total trace of all interesting system events, including context switches, uses of synchronization objects, and interrupts. After the failure, JPL engineers spent hours and hours running the system on the exact spacecraft replica in their lab with tracing turned on, attempting to replicate the precise conditions under which they believed that the reset occurred. Early in the morning, after all but one engineer had gone home, the engineer finally reproduced a system reset on the replica. Analysis of the trace revealed the priority inversion. How was this problem corrected? When created, a VxWorks mutex object accepts a boolean parameter that indicates whether priority inheritance should be performed by the mutex. The mutex in question had been initialized with the parameter off; had it been on, the low-priority meteorological thread would have inherited the priority of the high-priority data bus thread blocked on it while it held the mutex, causing it be scheduled with higher priority than the medium-priority communications task, thus preventing the priority inversion. Once diagnosed, it was clear to the JPL engineers that using priority inheritance would prevent the resets they were seeing. VxWorks contains a C language interpreter intended to allow developers to type in C expressions and functions to be executed on the fly during system debugging. The JPL engineers fortuitously decided to launch the spacecraft with this feature still enabled. By coding convention, the initialization parameter for the mutex in question (and those for two others which could have caused the same problem) were stored in global variables, whose addresses were in symbol tables also included in the launch software, and available to the C interpreter. A short C program was uploaded to the spacecraft, which when interpreted, changed the values of these variables from FALSE to TRUE. No more system resets occurred. Analysis and Lessons First and foremost, diagnosing this problem as a black box would have been impossible. Only detailed traces of actual system behavior enabled the faulty execution sequence to be captured and identified. Secondly, leaving the “debugging“ facilities in the system saved the day. Without the ability to modify the system in the field, the problem could not have been corrected. Finally, the engineers initial analysis that “the data bus task executes very frequently and is time-critical - we shouldnt spend the extra time in it to perform priority inheritance“ was exactly wrong. It is precisely in such time critical and important situations where correctness is essential, even at some additional performance cost. Human nature, Deadline Pressures David told us that the JPL engineers later confessed that on

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

最新文档

评论

0/150

提交评论