Linux Kernel Crash Dumps

Matt D. Robinson and Tom Morano
Silicon Graphics Computer Systems

Contents

- Objectives
- LKCD Components
- Kernel Design Considerations
- Kernel Initiating Dumps
- Kernel Dumping Hooks/Execution
- Dump Initiation Code/Layout
- Dump Tunables
- Introduction to LCRASH

Objectives

- LKCD created for Linux customers, support personnel and Linux kernel engineers
- LKCD improves MTBF and MTTR statistics
- Kernel problems are resolved more quickly
- As the Linux kernel becomes more complex, the need for LKCD increases

LKCD Components

- Kernel changes to configure, catch kernel failures, and save crash dumps
- User-level scripts to configure the system and save system memory to a crash dump
- LCRASH, the kernel crash dump analyzer

Kernel Design Considerations

The biggest design considerations were:

- Dump Save Mechanism
- Raw I/O vs. Buffer Cache I/O
- Kernel Code Location
- Dump Storage

NOTE: Other crash dump products available for Linux may use different dumping methods than those described here.

Dump Save Mechanism

[Diagram labels: Crash; Kernel; PROM; Disk; Reset System; Save Memory to Swap Space in Kernel; Save Memory to Swap Space from PROM/BIOS]

Kernel save method chosen because:

- PROM/BIOS is too architecture-specific
- reset/power-off may clear memory
- kernel disk driver restrictions
- no disk to filesystem validation at PROM
- code can be modified in kernel; PROM code is difficult to change (backwards compatibility issues)

Raw I/O vs. Buffer Cache I/O

- Buffer cache locking prevents handling dump workaround without major performance hit on basic I/O
- Re-entry interrupt locking problem
- Raw I/O is not fully supported in Linux yet (in the kernel); kiobuf code needs more work
- IDE, RAID, etc., drivers need raw I/O hooks (current plan is to create driver layer above to avoid necessary locking)

Kernel Code Location

- Code changes are separated into generic and architecture-specific files:
  - kernel/vmdump.c
  - arch/kernel/vmdump.c
- Additional modifications made to linux/include/sysctl.h, kernel/sysctl.c, and kernel crash hook functions

Dump Storage

- Memory dumps are saved to swap space
- Swapping during boot-up is an issue
- Disk partition tables in memory: could this cause a data corruption problem?
- Cannot assume filesystem layer will be available during crash

Kernel Initiating Dumps

Initiating Dump Process

- Change to /proc/sys/vmdump/dumpdev calls dump_open() in kernel
- dump_open() checks to ensure:
  - the device specified is a block device
  - device points to a valid swap partition
  - device has valid character device file_operations table (currently SCSI only, due to lack of raw I/O capability for IDE disks)
- Errors in dump_open() are logged to system log buffer
- Changes needed for 2.3 (without devfs) due to mismatch between block and character major/minor pairs for the same disk device
- Success of dump_open() displays: dump_open(): dump device opened: 0x803 sd(8,3)

Kernel Dumping Hooks

Kernel Hooks for Executing Crash Dump

- panic() was modified to perform SMP freeze and to call dump_execute()
- die_if_kernel() or die() calls dump_execute() after KDB, GDB, and show_registers() are done
- NMI (Non-Maskable Interrupt) hooks still needed for systems that support the capability in hardware

Kernel Hooks and Parameters

- panic(): register state is not saved, panic string is saved
- die_if_kernel() or die(): registers are saved, panic string is generic (for now)
- Interrupt handlers vs. non I/O request lock dumping needs to be differentiated

Kernel Dump Execution

- dump_execute() checks to see if dumping is turned on
- If DUMP_NONE is set, it returns immediately
- _dump_execute(), which is architecture-specific, is called to save the dump
- Within _dump_execute(), dump header values are saved, memory pages are saved, and the function returns when complete

Kernel Dump Layout

[Diagram: Dump Header, Dump Page Headers, Dump Pages]

- Dump header is written out first; it contains basic information about dump
- Memory pages are written next, each with a page header containing:
  - virtual address of the page in memory
  - size of page (important if compressed)
  - page flags (compressed, raw, dump end)
- The last step is a re-write of the dump header which updates the total number of pages written
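
The dump layout described here (header first, then page records, then a header re-write) can be sketched in user space as follows. This is only an illustration of the ordering, not LKCD's on-disk format: the field widths, the magic value, the flag values, and the use of zlib as the compressor are all assumptions, since the slides name the page header fields but not their encoding.

```python
import io
import struct
import zlib

# Hypothetical encodings: the slides list the fields, not their widths or values.
DUMP_HEADER = struct.Struct("<8sI")   # magic, total number of pages written
PAGE_HEADER = struct.Struct("<QII")   # page virtual address, stored size, flags
FLAG_COMPRESSED, FLAG_RAW, FLAG_END = 1, 2, 4
MAGIC = b"SKETCH00"


def write_dump(out, pages, compress=True):
    """Write the dump header, the page records, then re-write the header."""
    out.seek(0)
    out.write(DUMP_HEADER.pack(MAGIC, 0))          # header first; page count unknown
    count = 0
    for address, data in pages:
        payload, flags = data, FLAG_RAW
        if compress:
            packed = zlib.compress(data)           # zlib is a stand-in compressor
            if len(packed) < len(data):            # store raw if compression does not help
                payload, flags = packed, FLAG_COMPRESSED
        # The stored size matters because compressed pages vary in length.
        out.write(PAGE_HEADER.pack(address, len(payload), flags))
        out.write(payload)
        count += 1
    out.write(PAGE_HEADER.pack(0, 0, FLAG_END))    # "dump end" page header
    end = out.tell()
    out.seek(0)
    out.write(DUMP_HEADER.pack(MAGIC, count))      # last step: update the page count
    out.seek(end)
    return count


def read_dump(src):
    """Read the records back, uncompressing pages flagged as compressed."""
    src.seek(0)
    _magic, count = DUMP_HEADER.unpack(src.read(DUMP_HEADER.size))
    pages = []
    while True:
        address, size, flags = PAGE_HEADER.unpack(src.read(PAGE_HEADER.size))
        if flags & FLAG_END:
            break
        payload = src.read(size)
        if flags & FLAG_COMPRESSED:
            payload = zlib.decompress(payload)
        pages.append((address, payload))
    return count, pages
```

Because the page count only reaches the header in the final step, a dump interrupted part-way through would carry a stale count in this sketch.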

Kernel Dump Limitations

- Current interrupt crashes will lock up with re-entry to disk driver function
- Dump header needs to be written out more often
- Raw I/O capabilities need to be added in kernel for more disk drivers (using kiobufs, scatter-gather lists, etc.)
- Page typing needed for ordered dumps
- More architectures need to be supported

Kernel Recovery of Crash Dump

Kernel Reboot After Crash

- During early boot-up, the system runs the /etc/rc.d/rc.sysinit script, which in turn runs /sbin/vmdump
- /sbin/vmdump runs with either the config or save option
- config sets all dump tunables and attempts to open the dump device
- save looks for a crash dump in dump device and saves it to disk (if requested)

Kernel /proc Tunables

- /proc/sys/vmdump contains all LKCD kernel tunables
- /proc/sys/kernel/panic is modified so that the system reboots after LKCD creates a crash dump
- dumpdev holds the name of the dump device
- dump_compress_pages determines if the memory pages should be compressed
- dump_level indicates which pages to dump to disk (only three levels currently supported)

Kernel Dump Tunables

/etc/sysconfig/vmdump holds all LKCD tunables (the /proc tunables are changed automatically):

- DUMP_ACTIVE=1
- DUMPDEV=/dev/vmdump
- DUMPDIR=/var/log/vmdump
- DUMP_SAVE=1
- DUMP_LEVEL=4
- DUMP_COMPRESS_PAGES=1
- PANIC_TIMEOUT=5

DUMP_ACTIVE: Determines if the crash dump scripts should perform any actions; the default value is 1 (active). Set to 0 to not save or configure system for crash dumps.

DUMPDEV: The name of the dump device; this typically is /dev/vmdump. NOTE: It is recommended to change what device /dev/vmdump points to rather than to change this value directly, as /dev/vmdump is normally a symbolic link.

DUMPDIR: The name of the directory to save dumps to; this typically is /var/log/vmdump.

DUMP_SAVE: Whether to save the crash dump to disk or not. The system will still be configured to save crash dumps regardless of the value of DUMP_SAVE.

DUMP_LEVEL: Determines how much memory (or not) should be saved in the crash dump. Default value is 4 (DUMP_ALL), although other values such as 0 (DUMP_NONE) and 1 (DUMP_HEADER) are also valid. This sets /proc/sys/vmdump/dump_level to the same value (/sbin/vmdump config).

DUMP_COMPRESS_PAGES: Determines whether to compress memory pages when saving memory image to disk. Defaults to 1 (compress). This sets /proc/sys/vmdump/dump_compress_pages to the same value (/sbin/vmdump config).

PANIC_TIMEOUT: Changes the amount of time to sleep before resetting the system after a software failure. Changes /proc/sys/kernel/panic to the same value (/sbin/vmdump config).
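
The mapping from /etc/sysconfig/vmdump to the /proc tunables can be illustrated with a small sketch. It is not the /sbin/vmdump script: the function names are hypothetical, the exact value written to dumpdev is an assumption, and the sketch only plans the writes rather than touching /proc. The paths, the three dump levels, and the non-zero PANIC_TIMEOUT recommendation come from the slides.

```python
# Dump levels named in the slides (only three are currently supported).
VALID_LEVELS = {0: "DUMP_NONE", 1: "DUMP_HEADER", 4: "DUMP_ALL"}

# sysconfig key -> /proc file. The dumpdev entry is assumed to take DUMPDEV as-is.
PROC_TARGETS = {
    "DUMPDEV": "/proc/sys/vmdump/dumpdev",
    "DUMP_LEVEL": "/proc/sys/vmdump/dump_level",
    "DUMP_COMPRESS_PAGES": "/proc/sys/vmdump/dump_compress_pages",
    "PANIC_TIMEOUT": "/proc/sys/kernel/panic",
}


def parse_sysconfig(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def plan_proc_writes(settings):
    """Return the (path, value) pairs a 'config' step would write, after validation."""
    if settings.get("DUMP_ACTIVE", "1") == "0":
        return []                                   # DUMP_ACTIVE=0: take no action
    level = int(settings.get("DUMP_LEVEL", "4"))    # default is 4 (DUMP_ALL)
    if level not in VALID_LEVELS:
        raise ValueError("unsupported DUMP_LEVEL: %d" % level)
    if "PANIC_TIMEOUT" in settings and int(settings["PANIC_TIMEOUT"]) == 0:
        # A zero timeout would leave the system spinning until reset by hand.
        raise ValueError("PANIC_TIMEOUT should be non-zero")
    return [(path, settings[key]) for key, path in PROC_TARGETS.items() if key in settings]
```

DUMPDIR and DUMP_SAVE do not appear in the result because, per the slides, they affect the save step rather than a /proc tunable.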

NOTE: PANIC_TIMEOUT should always be non-zero; if zero, the system will spin indefinitely until it is reset by hand.

Kernel Dump Files

- vmdump.N holds the crash dump data saved from DUMPDEV; it is a copy of the memory image at the time of the system crash
- map.N is a copy of /boot/System.map
- Both files are needed to perform crash analysis; addresses in map.N point to values in vmdump.N; if the files do not come from the same kernel build, crash analysis may be inaccurate

Introduction to LCRASH

Overview of LCRASH

- Linux system crash dump analysis tool
- Provides access to kernel data in LKCD crash dumps or live system memory
- Displays detailed information about a system crash
- Can be used interactively or to generate system crash dump reports

LCRASH Crash Dump Report

- General system information
- Type of crash
- Dump of the system log_buf
- List of active tasks
- Kernel stack trace showing the function calls leading up to the point of the crash

LCRASH Interactive Commands

- For a more detailed examination of the elements of a crash
- Kernel data displayed in a clear, easy-to-read manner
- Invoked via an ASCII command line user interface featuring command line editing and command history
- Command output can be piped to utilities such as more and grep

Examples of LCRASH commands

- stat: Displays pertinent system information and the contents of the log_buf array
- vtop: Displays virtual to physical address mappings
- ptype: Displays arbitrary kernel structures from the crash dump
- symbol: Displays kernel symbol information
- dump: Displays the contents of system memory in a variety of bases and data sizes
- task: Displays key information from selected tasks or all tasks running on the system
- trace: Displays a kernel stack trace for one or more tasks
- dis: Disassembles one or more machine instructions

The libklib Library

Library of low-level functions providing access to the system dump and kernel symbol table:

- Translate virtual addresses into physical addresses
- Determine the address of kernel symbols
- Access memory pages in the dump or live system memory
- Read in blocks of kernel data
- Access kernel data type information

Accessing Kernel Symbol Information

- The System.map file contains the virtual address of all kernel symbols (variables, functions, etc.)
- LCRASH parses the System.map file at startup and builds an internal table of kernel symbols
- Functions determine the address of a kernel symbol, or locate a symbol matching a particular address

Reading in Blocks of Data from a Dump

- LCRASH cannot access data in a system dump directly
- Functions read in blocks of data from a system dump or live system memory
- Kernel virtual addresses are translated into physical addresses
- Memory pages in the dump are uncompressed automatically
- The desired data is then copied into an LCRASH buffer

Accessing Kernel Type Information

Facilities provided for accessing extended information in the kernel symbol table (when built using the -gstabs compiler option):

- Kernel data type definitions, including type and size of kernel structure members
- Data types of global variables
- Function parameters
- Source code line numbers of kernel functions

Most production systems are not built with the -gstabs flag.

Generating Kernel Stack Traces

- LCRASH is able to generate kernel stack traces without using frame pointers
- Various heuristics are applied to each stack frame to determine what the PC, RA, SP, and frame pointer should be
- Derived values are sanity checked to ensure they are at least reasonable
- The entire stack trace is constructed before it is displayed
- Most x86 kernels do not use frame pointers

LCRASH commands for displaying kernel stack traces

- trace displays a stack trace for one or more active tasks
- strace displays an arbitrary stack trace using a PC, RA, and SP provided on the command line; or finds all valid sta
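
Returning to the symbol handling described under Accessing Kernel Symbol Information, the two lookups (name to address, and address to enclosing symbol) can be sketched as below. This is not libklib code: the function names and the sample addresses are hypothetical. It assumes only the usual System.map line shape of a hexadecimal address, a one-letter type, and a symbol name.

```python
import bisect


def parse_system_map(text):
    """Build a table of (address, name) pairs sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:                      # address, type letter, name
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def address_of(symbols, name):
    """Return the address of a named symbol, or None if it is not in the table."""
    for address, symbol in symbols:
        if symbol == name:
            return address
    return None


def symbol_for(symbols, address):
    """Return (name, offset) for the nearest symbol at or below address, or None."""
    addresses = [entry[0] for entry in symbols]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return None                              # address lies below every known symbol
    base, name = symbols[index]
    return name, address - base


# Hypothetical sample data; the addresses are made up for illustration.
SAMPLE_MAP = """\
c0100000 T _stext
c0105000 T panic
c0105400 T printk
c0200000 D log_buf
"""
```

The address-to-symbol direction is what lets a stack trace print a return address as a function name plus offset, for example `symbol_for(parse_system_map(SAMPLE_MAP), 0xc0105010)` resolving to `panic` at offset `0x10` in this sample table.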