2024Mogdb数据库巡检脚本_第1页
2024Mogdb数据库巡检脚本_第2页
2024Mogdb数据库巡检脚本_第3页
2024Mogdb数据库巡检脚本_第4页
2024Mogdb数据库巡检脚本_第5页
已阅读5页,还剩9页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

Mogdb-数据库巡检脚本 目录1. 报告概述 52. 巡检报告解析 52.1.巡检脚本 52.2.巡检脚本执行 52.3.数据库节点1巡检结果解析 52.3.1.操作系统空间使用率 52.3.2.操作系统内存使用率 62.3.3.数据库进程及从库状态 62.3.4.CMVIPCheck 72.3.5.主机Ping 82.3.6.数据库SQL结果,一定程度的问题诊断,正常可忽略 92.3.7.数据库归档日志及Pg_xlog日志展示 102.4.数据库节点2巡检结果解析 112.4.1.数据库进程及从库状态 112.4.2.CMVIPCheck 133. 结论及建议 14

报告概述本篇文档对Mogdb数据库巡检脚本,进行巡检重点内容解析。巡检报告解析巡检脚本#!/bin/sh#mogdb_xunjian_v1.0_20240624.shV_DATA_TMP=$(date+'%Y-%m-%d_%H_%M_%S')V_FILE_TMP=/data/mogdbxunjian/Mogdb_day_xunjian_"${V_DATA_TMP}".txtecho"checkmogdb_xunjian"_$V_FILE_TMP>>"${V_FILE_TMP}"echo'Nextcheckosdf-h=>'>>"${V_FILE_TMP}"df-h>>"${V_FILE_TMP}"echo'Nextcheckosfree-m=>'>>"${V_FILE_TMP}"free-m>>"${V_FILE_TMP}"#echo'Nextcheckoscpusar-p=>'>>"${V_FILE_TMP}"#sar-p>>"${V_FILE_TMP}"echo'Nextcheckdbclusterprocessandslavestatus=>'>>"${V_FILE_TMP}"#/soft/ptk/ptkcluster-nats_csstatus--detail>>"${V_FILE_TMP}"echo'NextcheckdbVIPstatus=>'>>"${V_FILE_TMP}"su-omm-c"cm_ctlshow">>"${V_FILE_TMP}"echo'NextcheckpingVIPNetwork'>>"${V_FILE_TMP}"ping10.xx-c3>>"${V_FILE_TMP}"ping10.xx-c3>>"${V_FILE_TMP}"echo'NextcheckpingANetwork'>>"${V_FILE_TMP}"ping10.xx-c3>>"${V_FILE_TMP}"ping10.xx-c3>>"${V_FILE_TMP}"echo'NextcheckpingBNetwork'>>"${V_FILE_TMP}"ping10.xx-c3>>"${V_FILE_TMP}"ping10.1xx-c3>>"${V_FILE_TMP}"echo'NextcheckpingStreamreplicationNetwork'>>"${V_FILE_TMP}"ping192xx.100-c3>>"${V_FILE_TMP}"ping192xx.101-c3>>"${V_FILE_TMP}"echo'NextcheckDBviewSQLput'>>"${V_FILE_TMP}"echo'NextcheckDBpg_replication_slots'>>"${V_FILE_TMP}"echo'NextcheckDBselectcontextname,pg_size_pretty(sum(totalsize)),pg_size_pretty(sum(freesize))fromgs_session_memory_detailgroupbycontextnameorderbysum(totalsize)desclimit10'>>"${V_FILE_TMP}"echo'NextcheckDBgs_total_memory_detail'>>"${V_FILE_TMP}"echo'NextcheckDBselectstate,count(*)frompg_stat_activitygroupbystate'>>"${V_FILE_TMP}"su-omm-c'gsql<<EOF\x\l+select*frompg_replication_slots;selectcontextname,pg_size_pretty(sum(totalsize)),pg_size_pretty(sum(freesize))fromgs_session_memory_detailgroupbycontextnameorderbysum(totalsize)desclimit10;select*fromgs_total_memory_detail;selectstate,count(*)frompg_stat_activitygroupbystate;\qEOF'>>"${V_FILE_TMP}"echo'Nextcheckarchivefiledu-sm'>>"${V_FILE_TMP}"du-sm/data/mogdb/data/pg_xlog/archive_status>>"${V_FILE_TMP}"echo'Nextcheckarchivefilels-lrt|wc-l'>>"${V_FILE_TMP}"ls-l/data/mogdb/data/pg_xlog/archive_status|wc-l>>"${V_FILE_TMP}"echo'Nextcheckpg_xlogfiledu-sm'>>"${V_FILE_TMP}"du-sm/data/mogdb/data/pg_xlog|grep-varchive_status>>"${V_FILE_TMP}"echo'Nextcheckpg_xlogfilels-lrt|wc-l'>>"${V_FILE_TMP}"ls-l/data/mogdb/data/pg_xlog|grep-varchive_status|wc-l>>"${V_FILE_TMP}"echo'Nextcheckarchivefilels-n200'>>"${V_FILE_TMP}"ls-lrt/data/mogdb/data/pg_xlog/archive_status|sort-n|head-n200>>"${V_FILE_TMP}"echo'Nextcheckpg_xlogfilels-n200'>>"${V_FILE_TMP}"ls-lrt/data/mogdb/data/pg_xlog|grep-varchive_status|sort-n|head-n200>>"${V_FILE_TMP}"echo'Nextcheckpg_logfiledu-sm'>>"${V_FILE_TMP}"du-sm/data/mogdb/log/pg_log>>"${V_FILE_TMP}"--每分钟执行一次**/1***.~/.bashrc;.~/.ptk_mogdb_env;nohup/opt/mogdb506/app/bin/om_monitor-L/opt/mogdb506/log/cm/om_monitor>/dev/null2>&1&--每天凌晨1点05分执行一次51***nohupsh/data/mogdbxunjian/mogdb_xunjian_v1.0_20240624.sh>/dev/null2>&1&chmod+x/data/mogdbxunjian/mogdb_xunjian_v1.0_20240624.sh--生产环境,主备建议间隔10分钟执行;--每天凌晨1点15分执行一次151***巡检脚本执行Root登录后#sh/data/mogdbxunjian/mogdb_xunjian_v1.0_20240624.sh#ls-l/data/mogdbxunjian/会根据执行时的年月日,时分秒输出*.txt文件,cat即可。每天会自动执行一次,建议1年删除一次文件,删除历史的手工清理。数据库节点1巡检结果解析#重点会进行标红操作系统空间使用率使用率>80%,就建议及时关注分析,是数据占用的空间高,还是其它日志文件占用高;[root@mogdb1mogdbxunjian]#catMogdb_day_xunjian_2024-06-27_01_35_01.txtcheckmogdb_xunjian_/data/mogdbxunjian/Mogdb_day_xunjian_2024-06-27_01_35_01.txtNextcheckosdf-h=>FilesystemSizeUsedAvailUse%Mountedondevtmpfs15G015G0%/devtmpfs15G24K15G1%/dev/shmtmpfs15G9.6M15G1%/runtmpfs15G015G0%/sys/fs/cgroup/dev/mapper/klas-root500G40G460G8%/tmpfs15G20K15G1%/tmp/dev/sdb11.0T12G1012G2%/data/dev/sda21014M160M855M16%/boot/dev/sda1599M7.7M592M2%/boot/efitmpfs3.0G03.0G0%/run/user/1002tmpfs3.0G03.0G0%/run/user/0操作系统内存使用率当出现swap占用>0,且越来越多,需要持续关注,说明系统内存可能不够用了Nextcheckosfree-m=>totalusedfreesharedbuff/cacheavailableMem:30400131225536954355121796Swap:32767032767数据库进程及从库状态Ptk工具检测:1.数据库normal说明正常2.当前10.xxxxx.5mogdb1是主库角色,所以看10.xxxxx.5:28000(dn_6001)区域,sync_percent100%同步,主备同步正常。如果dn_6001区域内容是null说明主备网络不通。如果10.xxxxx.7Mogdb2是主库,则主备链路看10.xxxxx.7:28000(dn_6002)的栏位。[root@mogdb1mogdbxunjian]#ptkcluster-nats_csstatus--detail[ClusterState]cluster_name:ats_cscluster_state:Normaldatabase_version:MogDB5.0.6(build8b0a6ca8)cm_version:5.0.6(build3d9c5208)active_vip:10.xxxxx.6,10.xxxxx.6[CMServerState]id|ip|port|hostname|role++++1|10.xxxxx.5,10.xxxxx.5|15300|mogdb1|primary2|10.xxxxx.7,10.xxxxx.7|15300|mogdb2|standby[DatanodeState]cluster_name|id|ip|port|user|nodename|db_role|state|uptime|upstream+++++++++-ats_cs|6001|10.xxxxx.5|28000|omm|dn_6001|primary|Normal|00:33:43|-||10.xxxxx.5||||||||6002|10.xxxxx.7|28000|omm|dn_6002|standby|Normal|00:25:53|-||10.xxxxx.7|||||||[DataNodeDetail]10.xxxxx.5:28000(dn_6001)role:primarydata_dir:/data/mogdb/dataaz_name:AZ1[SendersInfo]:sender_pid:427047local_role:Primarypeer_role:Standbypeer_state:Normalstate:Streamingsender_sent_location:0/1C86F898sender_write_location:0/1C86F898sender_flush_location:0/1C86F898sender_replay_location:0/1C86F898receiver_received_location:0/1C86F898receiver_write_location:0/1C86F898receiver_flush_location:0/1C86F898receiver_replay_location:0/1C86F898sync_percent:100%sync_state:Syncsync_priority:1sync_most_available:Onchannel:192.168.1.100:28001-->192.168.1.101:5754810.xxxxx.7:28000(dn_6002)role:standbydata_dir:/data/mogdb/dataaz_name:AZ1[ReceiverInfo]:receiver_pid:415968local_role:Standbypeer_role:Primarypeer_state:Normalstate:Normalsender_sent_location:0/1C86F898sender_write_location:0/1C86F898sender_flush_location:0/1C86F898sender_replay_location:0/1C86F898receiver_received_location:0/1C86F898receiver_write_location:0/1C86F898receiver_flush_location:0/1C86F898receiver_replay_location:0/1C86F898sync_percent:100%channel:192.168.1.101:57548<--192.168.1.100:28001CMVIPCheck数据库VIP会挂载再数据库节点1或者节点2上面,如下VIP再节点1不存在,正常现象。如果节点1、节点2checkvip都不存在则存在异常!!!Nextcheckoscpusar-p=>Nextcheckdbclusterprocessandslavestatus=>NextcheckdbVIPstatus=>[NetworkConnectState]Networktimeout:6sCurrentCMServertime:2024-06-2701:35:02Networkstat('Y'meansconnected,otherwise'N'):|\|Y||Y|\|[NodeDiskHBState]Nodediskhbtimeout:200sCurrentCMServertime:2024-06-2701:35:03Nodediskhbstat('Y'meansconnected,otherwise'N'):|N|N|主机Ping正常情况下,pingA\B,流复制、VIP网络,都应该通,不通说明网络存在一定问题!!!NextcheckpingVIPNetworkPING10.xxxxx.6(10.xxxxx.6)56(84)bytesofdata.From10.xxxxx.5icmp_seq=1DestinationHostUnreachableFrom10.xxxxx.5icmp_seq=2DestinationHostUnreachableFrom10.xxxxx.5icmp_seq=3DestinationHostUnreachable10.xxxxx.6pingstatistics3packetstransmitted,0received,+3errors,100%packetloss,time2041mspipe3PING10.xxxxx.6(10.xxxxx.6)56(84)bytesofdata.From10.xxxxx.5icmp_seq=1DestinationHostUnreachableFrom10.xxxxx.5icmp_seq=2DestinationHostUnreachableFrom10.xxxxx.5icmp_seq=3DestinationHostUnreachable10.xxxxx.6pingstatistics3packetstransmitted,0received,+3errors,100%packetloss,time2044mspipe3NextcheckpingANetworkPING10.xxxxx.5(10.xxxxx.5)56(84)bytesofdata.64bytesfrom10.xxxxx.5:icmp_seq=1ttl=64time=0.038ms64bytesfrom10.xxxxx.5:icmp_seq=2ttl=64time=0.036ms64bytesfrom10.xxxxx.5:icmp_seq=3ttl=64time=0.038ms10.xxxxx.5pingstatistics3packetstransmitted,3received,0%packetloss,time2043msrttmin/avg/max/mdev=0.036/0.037/0.038/0.000msPING10.xxxxx.7(10.xxxxx.7)56(84)bytesofdata.64bytesfrom10.xxxxx.7:icmp_seq=1ttl=64time=0.181ms64bytesfrom10.xxxxx.7:icmp_seq=2ttl=64time=0.200ms64bytesfrom10.xxxxx.7:icmp_seq=3ttl=64time=0.251ms10.xxxxx.7pingstatistics3packetstransmitted,3received,0%packetloss,time2044msrttmin/avg/max/mdev=0.181/0.210/0.251/0.029msNextcheckpingBNetworkPING10.xxxxx.5(10.xxxxx.5)56(84)bytesofdata.64bytesfrom10.xxxxx.5:icmp_seq=1ttl=64time=0.025ms64bytesfrom10.xxxxx.5:icmp_seq=2ttl=64time=0.035ms64bytesfrom10.xxxxx.5:icmp_seq=3ttl=64time=0.034ms10.xxxxx.5pingstatistics3packetstransmitted,3received,0%packetloss,time2043msrttmin/avg/max/mdev=0.025/0.031/0.035/0.004msPING10.xxxxx.7(10.xxxxx.7)56(84)bytesofdata.64bytesfrom10.xxxxx.7:icmp_seq=1ttl=64time=0.240ms64bytesfrom10.xxxxx.7:icmp_seq=2ttl=64time=0.255ms64bytesfrom10.xxxxx.7:icmp_seq=3ttl=64time=0.257ms10.xxxxx.7pingstatistics3packetstransmitted,3received,0%packetloss,time2044msrttmin/avg/max/mdev=0.240/0.250/0.257/0.007msNextcheckpingStreamreplicationNetworkPING192.168.1.100(192.168.1.100)56(84)bytesofdata.64bytesfrom192.168.1.100:icmp_seq=1ttl=64time=0.025ms64bytesfrom192.168.1.100:icmp_seq=2ttl=64time=0.035ms64bytesfrom192.168.1.100:icmp_seq=3ttl=64time=0.035ms192.168.1.100pingstatistics3packetstransmitted,3received,0%packetloss,time2043msrttmin/avg/max/mdev=0.025/0.031/0.035/0.004msPING192.168.1.101(192.168.1.101)56(84)bytesofdata.64bytesfrom192.168.1.101:icmp_seq=1ttl=64time=0.291ms64bytesfrom192.168.1.101:icmp_seq=2ttl=64time=0.308ms64bytesfrom192.168.1.101:icmp_seq=3ttl=64time=0.286ms192.168.1.101pingstatistics3packetstransmitted,3received,0%packetloss,time2044msrttmin/avg/max/mdev=0.286/0.295/0.308/0.009ms数据库SQL结果,一定程度的问题诊断,正常可忽略NextcheckDBviewSQLputNextcheckDBpg_replication_slotsNextcheckDBselectcontextname,pg_size_pretty(sum(totalsize)),pg_size_pretty(sum(freesize))fromgs_session_memory_detailgroupbycontextnameorderbysum(totalsize)desclimit10NextcheckDBgs_total_memory_detailNextcheckDBselectstate,count(*)frompg_stat_activitygroupbystate数据库归档日志及Pg_xlog日志展示正常可忽略,但是当空间使用率比较高,可以分析是否是归档日志或者pg_xlog日志导致的问题Nextcheckarchivefiledu-sm1/data/mogdb/data/pg_xlog/archive_statusNextcheckarchivefilels-lrt|wc-l12Nextcheckpg_xlogfiledu-sm577/data/mogdb/data/pg_xlogNextcheckpg_xlogfilels-lrt|wc-l37Nextcheckarchivefilels-n200-rw1ommomm0Jun1001:40000000010000000000000010.done-rw1ommomm0Jun1103:41000000010000000000000011.done-rw1ommomm0Jun1203:39000000010000000000000012.done-rw1ommomm0Jun1701:34000000010000000000000014.done-rw1ommomm0Jun1802:13000000010000000000000015.done-rw1ommomm0Jun2301:19000000010000000000000019.done-rw1ommomm0Jun2401:3200000001000000000000001A.done-rw1ommomm0Jun301:4000000001000000000000000B.done-rw1ommomm0Jun402:4200000001000000000000000C.done-rw1ommomm0May2602:59000000010000000000000005.done-rw1ommomm0May2801:17000000010000000000000007.donetotal0Nextcheckpg_xlogfilels-n200-rw1ommomm16777216Jun1001:40000000010000000000000010-rw1ommomm16777216Jun103:1400000001000000000000000A-rw1ommomm16777216Jun1103:41000000010000000000000011-rw1ommomm16777216Jun1203:39000000010000000000000012-rw1ommomm16777216Jun1601:38000000010000000000000013-rw1ommomm16777216Jun1701:34000000010000000000000014-rw1ommomm16777216Jun1802:13000000010000000000000015-rw1ommomm16777216Jun1903:27000000010000000000000016-rw1ommomm16777216Jun2300:48000000010000000000000017-rw1ommomm16777216Jun2300:48000000010000000000000018-rw1ommomm16777216Jun2301:19000000010000000000000019-rw1ommomm16777216Jun2401:3200000001000000000000001A-rw1ommomm16777216Jun2700:4600000001000000000000001B-rw1ommomm16777216Jun2700:4600000001000000000000001D-rw1ommomm16777216Jun2700:4600000001000000000000001E-rw1ommomm16777216Jun2700:4600000001000000000000001F-rw1ommomm16777216Jun2700:46000000010000000000000020-rw1ommomm16777216Jun2700:46000000010000000000000021-rw1ommomm16777216Jun2700:46000000010000000000000022-rw1ommomm16777216Jun2700:46000000010000000000000023-rw1ommomm16777216Jun2700:46000000010000000000000024-rw1ommomm16777216Jun2700:46000000010000000000000025-rw1ommomm16777216Jun2700:46000000010000000000000026-rw1ommomm16777216Jun2701:3500000001000000000000001C-rw1ommomm16777216Jun301:4000000001000000000000000B-rw1ommomm16777216Jun402:4200000001000000000000000C-rw1ommomm16777216Jun503:3800000001000000000000000D-rw1ommomm16777216Jun603:4100000001000000000000000E-rw1ommomm16777216Jun803:3700000001000000000000000F-rw1ommomm16777216May2602:59000000010000000000000003-rw1ommomm16777216May2602:59000000010000000000000004-rw1ommomm16777216May2602:59000000010000000000000005-rw1ommomm16777216May2603:46000000010000000000000006-rw1ommomm16777216May2801:17000000010000000000000007-rw1ommomm16777216May2802:33000000010000000000000008-rw1ommomm16777216May3101:42000000010000000000000009total589828数据库节点2巡检结果解析备注:其它指标与节点1一样,如下列举不同情况下,输出结果不同。节点1检测数据库是否正常是ptk工具,节点2使用cm工具,避免ptk工具异常,巡检都存在异常结果!数据库进程及从库状态如下,表示节点1cm集群规划是主库,当前是备库角色,实例normal正常;节点2cm集群规划是备库,当前角色是主库,实例normal正常Nextcheckdbclusterprocessandslavestatus=>cluster_state:Normalredistributing:Nobalanced:Nonode:1node_name:mogdb1node:1instance_id:1node_ip:10.xxxxx.5data_path:/data/mogdb/cm/cm_servertype:CMServerinstance_state:Primarynode:1instance_id:6001node_ip:10.xxxxx.5data_path:/data/mogdb/datatype:Datanodeinstance_state:Standbydcf_role:FOLLOWERstatic_connections:2HA_state:Normalreason:Normalsender_sent_location:0/1D774450sender_write_location:0/1D774450sender_flush_location:0/1D774450sender_replay_location:0/1D774450receiver_received_location:0/1D774450receiver_write_location:0/1D774450receiver_flush_location:0/1D774450receiver_replay_location:0/1D774450sync_state:Asyncnode:1node_name:mogdb1node:1instance_id:1node_ip:10.xxxxx.5data_path:/data/mogdb/cm/cm_servertype:CMServerinstance_state:Primarynode:1node_ip:10.xxxxx.5type:FencedUDFstate:Normalnode:2node_name:mogdb2node:2instance_id:2node_ip:10.xxxxx.7data_path:/data/mogdb/cm/cm_servertype:CMServerinstance_state:Standbynode:2instance_id:6002node_ip:10.xxxxx.7data_path:/data/mogdb/datatype:Datanodeinstance_state:Primarystatic_connections:2HA_state

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论