
Linux Kernel Crash Dumps
Matt D. Robinson and Tom Morano
Silicon Graphics Computer Systems

Contents
  • Objectives
  • LKCD Components
  • Kernel Design Considerations
  • Kernel Initiating Dumps
  • Kernel Dumping Hooks/Execution
  • Dump Initiation Code/Layout
  • Dump Tunables
  • Introduction to LCRASH

Objectives
  • LKCD was created for Linux customers, support personnel, and Linux kernel engineers
  • LKCD improves MTBF and MTTR statistics
  • Kernel problems are resolved more quickly
  • As the Linux kernel becomes more complex, the need for LKCD increases

LKCD Components
  • Kernel changes to configure, catch kernel failures, and save crash dumps
  • User-level scripts to configure the system and save system memory to a crash dump
  • LCRASH, the kernel crash dump analyzer

Kernel Design Considerations
The biggest design considerations were:
  • Dump Save Mechanism
  • Raw I/O vs. Buffer Cache I/O
  • Kernel Code Location
  • Dump Storage
NOTE: Other crash dump products available for Linux may use different dumping methods than those described here.

Kernel Design Considerations: Dump Save Mechanism
[Diagram comparing two dump save paths after a crash, "Save Memory to Swap Space in Kernel" and "Save Memory to Swap Space from PROM/BIOS"; boxes labeled Kernel, PROM, Disk, Reset System]

Kernel save method chosen because:
  • PROM/BIOS is too architecture-specific
  • Reset/power-off may clear memory
  • Kernel disk driver restrictions
  • No disk to filesystem validation at PROM
  • Code can be modified in kernel; PROM code is difficult to change (backwards compatibility issues)

Kernel Design Considerations: Raw I/O vs. Buffer Cache I/O
  • Buffer cache locking prevents a dump workaround without a major performance hit on basic I/O
  • Re-entry interrupt locking problem
  • Raw I/O is not fully supported in Linux yet (in the kernel); kiobuf code needs more work
  • IDE, RAID, etc., drivers need raw I/O hooks (current plan is to create a driver layer above to avoid necessary locking)

Kernel Design Considerations: Kernel Code Location
  • Code changes are separated into generic and architecture-specific files: kernel/vmdump.c and arch/kernel/vmdump.c
  • Additional modifications made to include/linux/sysctl.h, kernel/sysctl.c, and kernel crash hook functions

Kernel Design Considerations: Dump Storage
  • Memory dumps are saved to swap space
  • Swapping during boot-up is an issue
  • Disk partition tables in memory: could this cause a data corruption problem?
  • Cannot assume the filesystem layer will be available during a crash

Kernel Initiating Dumps: Initiating Dump Process
  • A change to /proc/sys/vmdump/dumpdev calls dump_open() in the kernel
  • dump_open() checks to ensure that:
      - the device specified is a block device
      - the device points to a valid swap partition
      - the device has a valid character device file_operations table (currently SCSI only, due to lack of raw I/O capability for IDE disks)
  • Errors in dump_open() are logged to the system log buffer
  • Changes needed for 2.3 (without devfs) due to mismatch between block and character major/minor pairs for the same disk device
  • Success of dump_open() displays:
      dump_open(): dump device opened: 0x803 sd(8,3)

Kernel Dumping Hooks: Kernel Hooks for Executing Crash Dump
  • panic() was modified to perform SMP freeze and to call dump_execute()
  • die_if_kernel() or die() calls dump_execute() after KDB, GDB, and show_registers() are done
  • NMI (Non-Maskable Interrupt) hooks still needed for systems that support the capability in hardware

Kernel Dumping Hooks: Kernel Hooks and Parameters
  • panic(): register state is not saved, panic string is saved
  • die_if_kernel() or die(): registers are saved, panic string is generic (for now)
  • Interrupt handlers vs. non I/O request lock dumping needs to be differentiated

Kernel Dump Execution
  • dump_execute() checks to see if dumping is turned on
  • If DUMP_NONE is set, it returns immediately
  • _dump_execute(), which is architecture-specific, is called to save the dump
  • Within _dump_execute(), dump header values are saved, memory pages are saved, and the function returns when complete

Kernel Dump Layout
[Diagram: Dump Header, Dump Page Headers, Dump Pages]
  • Dump header is written out first; it contains basic information about the dump
  • Memory pages are written next, each with a page header containing:
      - virtual address of the page in memory
      - size of page (important if compressed)
      - page flags (compressed, raw, dump end)
  • The last step is a re-write of the dump header, which updates the total number of pages written
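The dump layout described above (header first, then page header and page pairs, then a header rewrite) can be sketched in Python as a toy model. The field widths, flag values, magic string, and the write_dump/read_dump helper names are all assumptions made for illustration; this is not the actual LKCD on-disk format.

```python
import io
import struct
import zlib

# Hypothetical layout, for illustration only (not the real LKCD format).
HEADER = struct.Struct("<8sI")       # magic, total number of pages written
PAGE_HEADER = struct.Struct("<QII")  # virtual address, payload size, flags

DUMP_COMPRESSED = 0x1  # payload is zlib-compressed (assumed flag value)
DUMP_RAW = 0x2         # payload is stored as-is (assumed flag value)
DUMP_END = 0x4         # marks the end of the dump (assumed flag value)

MAGIC = b"TOYDUMP\0"


def write_dump(out, pages, compress=True):
    """Write the header, then each page with its header, then fix the header."""
    out.write(HEADER.pack(MAGIC, 0))        # header first; page count unknown
    count = 0
    for vaddr, data in pages:
        payload, flags = data, DUMP_RAW
        if compress:
            packed = zlib.compress(data)
            if len(packed) < len(data):     # keep the raw page if no gain
                payload, flags = packed, DUMP_COMPRESSED
        out.write(PAGE_HEADER.pack(vaddr, len(payload), flags))
        out.write(payload)
        count += 1
    out.write(PAGE_HEADER.pack(0, 0, DUMP_END))
    out.seek(0)
    out.write(HEADER.pack(MAGIC, count))    # last step: rewrite the header


def read_dump(src):
    """Return (page_count, {vaddr: page bytes}) from a toy dump."""
    magic, count = HEADER.unpack(src.read(HEADER.size))
    if magic != MAGIC:
        raise ValueError("not a toy dump")
    pages = {}
    while True:
        vaddr, size, flags = PAGE_HEADER.unpack(src.read(PAGE_HEADER.size))
        if flags & DUMP_END:
            break
        payload = src.read(size)            # size matters when compressed
        if flags & DUMP_COMPRESSED:
            payload = zlib.decompress(payload)
        pages[vaddr] = payload
    return count, pages


# Round-trip one zero-filled 4 KiB page through an in-memory "dump device".
buf = io.BytesIO()
write_dump(buf, [(0xC0000000, b"\0" * 4096)])
buf.seek(0)
count, pages = read_dump(buf)
```

Storing the payload size in each page header is what lets a reader step over compressed pages of varying length, which is why the slide calls the size field important when compression is on.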

Kernel Dump Limitations
  • Current interrupt crashes will lock up with re-entry to disk driver function
  • Dump header needs to be written out more often
  • Raw I/O capabilities need to be added in kernel for more disk drivers (using kiobufs, scatter-gather lists, etc.)
  • Page typing needed for ordered dumps
  • More architectures need to be supported

Kernel Recovery of Crash Dump: Kernel Reboot After Crash
  • During early boot-up, the system runs the /etc/rc.d/rc.sysinit script, which in turn runs /sbin/vmdump
  • /sbin/vmdump runs with either the config or save option
  • config sets all dump tunables and attempts to open the dump device
  • save looks for a crash dump in the dump device and saves it to disk (if requested)

Kernel /proc Tunables
  • /proc/sys/vmdump contains all LKCD kernel tunables
  • /proc/sys/kernel/panic is modified so that the system reboots after LKCD creates a crash dump
  • dumpdev holds the name of the dump device
  • dump_compress_pages determines if the memory pages should be compressed
  • dump_level indicates which pages to dump to disk (only three levels currently supported)

Kernel Dump Tunables
/etc/sysconfig/vmdump holds all LKCD tunables (the /proc tunables are changed automatically):
    DUMP_ACTIVE=1
    DUMPDEV=/dev/vmdump
    DUMPDIR=/var/log/vmdump
    DUMP_SAVE=1
    DUMP_LEVEL=4
    DUMP_COMPRESS_PAGES=1
    PANIC_TIMEOUT=5

  • DUMP_ACTIVE: Determines if the crash dump scripts should perform any actions; the default value is 1 (active). Set to 0 to not save or configure the system for crash dumps.
  • DUMPDEV: The name of the dump device; this typically is /dev/vmdump. NOTE: It is recommended to change what device /dev/vmdump points to rather than to change this value directly, as /dev/vmdump is normally a symbolic link.
  • DUMPDIR: The name of the directory to save dumps to; this typically is /var/log/vmdump.
  • DUMP_SAVE: Whether to save the crash dump to disk or not. The system will still be configured to save crash dumps regardless of the value of DUMP_SAVE.
  • DUMP_LEVEL: Determines how much memory (or not) should be saved in the crash dump. Default value is 4 (DUMP_ALL), although other values such as 0 (DUMP_NONE) and 1 (DUMP_HEADER) are also valid. This sets /proc/sys/vmdump/dump_level to the same value (/sbin/vmdump config).
  • DUMP_COMPRESS_PAGES: Determines whether to compress memory pages when saving the memory image to disk. Defaults to 1 (compress). This sets /proc/sys/vmdump/dump_compress_pages to the same value (/sbin/vmdump config).
  • PANIC_TIMEOUT: Changes the amount of time to sleep before resetting the system after a software failure. Changes /proc/sys/kernel/panic to the same value (/sbin/vmdump config).
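To make the relationship between /etc/sysconfig/vmdump and the /proc tunables concrete, here is a minimal Python sketch. It only computes which values would be written where, based on the mappings named in the slides; the parse_sysconfig and plan_proc_writes helpers are hypothetical, and how /sbin/vmdump config actually applies the settings is not shown in the source.

```python
# Mapping taken from the slides; the helper functions are hypothetical.
PROC_TARGETS = {
    "DUMPDEV": "/proc/sys/vmdump/dumpdev",
    "DUMP_LEVEL": "/proc/sys/vmdump/dump_level",
    "DUMP_COMPRESS_PAGES": "/proc/sys/vmdump/dump_compress_pages",
    "PANIC_TIMEOUT": "/proc/sys/kernel/panic",
}


def parse_sysconfig(text):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings


def plan_proc_writes(settings):
    """Return (path, value) pairs to apply; nothing if DUMP_ACTIVE is 0."""
    if settings.get("DUMP_ACTIVE", "1") == "0":
        return []
    return [(path, settings[key])
            for key, path in PROC_TARGETS.items() if key in settings]


EXAMPLE = """\
DUMP_ACTIVE=1
DUMPDEV=/dev/vmdump
DUMPDIR=/var/log/vmdump
DUMP_SAVE=1
DUMP_LEVEL=4
DUMP_COMPRESS_PAGES=1
PANIC_TIMEOUT=5
"""

plan = plan_proc_writes(parse_sysconfig(EXAMPLE))
for path, value in plan:
    print(f"{value} -> {path}")
```

DUMPDIR and DUMP_SAVE have no /proc counterpart in this sketch because the slides describe them as affecting only the save step of the user-level script.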

NOTE: PANIC_TIMEOUT should always be non-zero; if zero, the system will spin indefinitely until it is reset by hand.

Kernel Dump Files
  • vmdump.N holds the crash dump data saved from DUMPDEV; it is a copy of the memory image at the time of the system crash
  • map.N is a copy of /boot/System.map
  • Both files are needed to perform crash analysis; addresses in map.N point to values in vmdump.N; if the files do not come from the same kernel build, crash analysis may be inaccurate

Introduction to LCRASH: Overview of LCRASH
  • Linux system crash dump analysis tool
  • Provides access to kernel data in LKCD crash dumps or live system memory
  • Displays detailed information about a system crash
  • Can be used interactively or to generate system crash dump reports

Introduction to LCRASH: LCRASH Crash Dump Report
  • General system information
  • Type of crash
  • Dump of the system log_buf
  • List of active tasks
  • Kernel stack trace showing the function calls leading up to the point of the crash

Introduction to LCRASH: LCRASH Interactive Commands
  • For a more detailed examination of the elements of a crash
  • Kernel data displayed in a clear, easy-to-read manner
  • Invoked via an ASCII command line user interface featuring command line editing and command history
  • Command output can be piped to utilities such as more and grep

Introduction to LCRASH: Examples of LCRASH commands
    stat    Displays pertinent system information and the contents of the log_buf array
    vtop    Displays virtual to physical address mappings
    ptype   Displays arbitrary kernel structures from the crash dump
    symbol  Displays kernel symbol information
    dump    Displays the contents of system memory in a variety of bases and data sizes
    task    Displays key information from selected tasks or all tasks running on the system
    trace   Displays a kernel stack trace for one or more tasks
    dis     Disassembles one or more machine instructions

Introduction to LCRASH: The libklib Library
Library of low-level functions providing access to the system dump and kernel symbol table:
  • Translate virtual addresses into physical addresses
  • Determine the address of kernel symbols
  • Access memory pages in the dump or live system memory
  • Read in blocks of kernel data
  • Access kernel data type information

Introduction to LCRASH: Accessing Kernel Symbol Information
  • The System.map file contains the virtual address of all kernel symbols (variables, functions, etc.)
  • LCRASH parses the System.map file at startup and builds an internal table of kernel symbols
  • Functions determine the address of a kernel symbol, or locate a symbol matching a particular address

Introduction to LCRASH: Reading in Blocks of Data from a Dump
  • LCRASH can't access data in a system dump directly
  • Functions read in blocks of data from a system dump or live system memory
  • Kernel virtual addresses are translated into physical addresses
  • Memory pages in the dump are uncompressed automatically
  • The desired data is then copied into an LCRASH buffer

Introduction to LCRASH: Accessing Kernel Type Information
Facilities provided for accessing extended information in the kernel symbol table (when built using the -gstabs compiler option):
  • Kernel data type definitions, including type and size of kernel structure members
  • Data types of global variables
  • Function parameters
  • Source code line numbers of kernel functions
Most production systems are not built with the -gstabs flag.

Introduction to LCRASH: Generating Kernel Stack Traces
  • LCRASH is able to generate kernel stack traces without using frame pointers
  • Various heuristics are applied to each stack frame to determine what the PC, RA, SP, and frame pointer should be
  • Derived values are sanity checked to ensure they are at least reasonable
  • The entire stack trace is constructed before it is displayed
  • Most x86 kernels do not use frame pointers

Introduction to LCRASH: LCRASH commands for displaying kernel stack traces
  • trace displays a stack trace for one or more active tasks
  • strace displays an arbitrary stack trace using a PC, RA, and SP provided on the command line; or finds all valid sta
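As a rough illustration of the symbol handling described under Accessing Kernel Symbol Information, the following Python sketch parses System.map-style text (one "address type name" entry per line) and supports both lookups: name to address, and address to the nearest preceding symbol. The SymbolTable class and the sample addresses are made up for this example; this is not LCRASH or libklib code.

```python
import bisect

# Made-up sample in System.map format: hex address, type letter, name.
SAMPLE_MAP = """\
c0100000 T _stext
c0105000 T do_fork
c0110000 T panic
c0200000 D jiffies
"""


class SymbolTable:
    """Hypothetical in-memory symbol table built from System.map text."""

    def __init__(self, text):
        entries = []
        for line in text.splitlines():
            parts = line.split()
            if len(parts) != 3:
                continue                      # skip malformed lines
            addr, kind, name = parts
            entries.append((int(addr, 16), name, kind))
        entries.sort()
        self._entries = entries
        self._addrs = [entry[0] for entry in entries]
        self._by_name = {name: addr for addr, name, _ in entries}

    def address_of(self, name):
        """Return the virtual address of a symbol (KeyError if unknown)."""
        return self._by_name[name]

    def symbol_at(self, addr):
        """Return (name, offset) for the closest symbol at or below addr."""
        index = bisect.bisect_right(self._addrs, addr) - 1
        if index < 0:
            return None                       # addr is below every symbol
        base, name, _ = self._entries[index]
        return name, addr - base


table = SymbolTable(SAMPLE_MAP)
name, offset = table.symbol_at(0xC0105010)
print(f"{name}+{offset:#x}")
```

Resolving an address to a symbol plus offset in this way is the kind of lookup a stack trace display needs in order to turn raw program counter values into function names.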
