Introduction to High Availability Systems (MC/ServiceGuard)
HP minicomputer training

HA (High Availability) definition

A system is highly available if a single component or resource failure interrupts the system for only a brief time.

What causes a system to go down

Planned reasons:
- reconfigure the kernel
- apply patches
- perform hardware and software upgrades
- perform full system backups
- perform system maintenance

Unplanned reasons:
- hardware failures: CPU, memory, disk drives, LAN card, cable, disk controller cards, etc.
- system panics
- application errors
- power failures
- user errors

[Chart: % of Failures. Only the category label "Hardware" survives in this copy.]

High Availability Terms

Downtime: any amount of time when the application is unavailable (planned or unplanned).
- planned: the customer plans to bring down the system
- unplanned: due to an unplanned event or outage

Highly available: a system that can recover quickly from all or most resource failures. The application may become unavailable, but only for a short period of time.

  Downtime      Availability
  5 min         99.999%
  50 min        99.99%
  8.8 hours     99.9%
  12 hours      99.86%
  24 hours      99.73%
  3.6 days      99.0%
  7.2 days      98%
  10.8 days     97%

The downtime figures correspond to one year of operation.

Outage: an occurrence that renders an application unavailable when it is expected to be available (hardware, software, user, or environmental problem).

Availability: the time that an application can be used during the times when it is expected to be usable. Availability ignores planned or scheduled downtime and is expressed as a percentage.

Fault tolerant: these systems protect against hardware failures by providing totally redundant hardware in a single system.

Standard reliability: a system that relies only on basic hardware; no additional precautions are taken to protect against an outage (97-98%).

SPOF (Single Points of Failure)

  SPOF               Solution
  CPU, Memory        Cluster
  Disk               Mirror and RAID
  Interface Cards    Mirror and PV Links
  LAN, NICs          Redundant LANs and LANIC
  Power              UPS

[Figure: single points of failure in one system. Labels: SPU, LAN, Power, CPU, Memory, NIC, Disk, SCSI Controller, root, root mirror.]

High Availability Solution

- Continuously available systems: future HP products
- Highly available system: MC/ServiceGuard, MC/LockManager, OnLine JFS, Process Resource Manager, ClusterView
- Protected data: MirrorDisk/UX, HP DiskArray/EMC DiskArray, JFS
- Reliable system: HP9000 systems, HP peripherals, HP-UX

Cluster

A cluster is a networked group of nodes (hosts) which monitor each other in order to ensure that interruptions to the availability of applications running on these nodes are kept small.
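The downtime and availability pairs listed under High Availability Terms can be checked with simple arithmetic. The sketch below assumes the downtime values are per year and uses a 365-day year; the function names are illustrative and do not come from the slides.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; assumes a non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Hours of downtime per year implied by an availability percentage."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


def availability_percent(downtime_hours: float) -> float:
    """Availability percentage implied by a yearly downtime in hours."""
    return (1 - downtime_hours / HOURS_PER_YEAR) * 100


if __name__ == "__main__":
    # Availability levels taken from the slide's table.
    for pct in (99.999, 99.99, 99.9, 99.0, 98.0, 97.0):
        hours = downtime_hours_per_year(pct)
        print(f"{pct}% -> {hours * 60:.1f} min = {hours:.2f} h = {hours / 24:.2f} days")
```

Under these assumptions 99.9% works out to about 8.76 hours per year, which the slide rounds to 8.8 hours, and 12 hours of downtime works out to roughly 99.86%.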
[Figure: Sample cluster (two nodes). Labels: Node 1, Node 2, Pkg A, Pkg B, root disks, Primary LAN Card, Standby LAN Cards, Pkg A Disks, Pkg B Disks, Dedicated Heartbeat LAN, Primary LAN: Heartbeat/Data, Standby LAN: Heartbeat/Data, client, cmcld.]

Package concept

Package: an application along with its programs and resources (volume group, target node, network address, control script, and services).

Floating IP: the application IP address (attached to a host NIC). Clients connect to the host through the floating IP.

Original node and adoptive node: a package can have several adoptive nodes.

LVM

PV links: dual links (hardware paths) to the same disk, such that if one link fails, LVM automatically reroutes the I/O to an alternate path.

MC/SG VG: if a VG is part of an MC/SG cluster, only one node is allowed to access the VG at a time.

Exclusive mode activation: in general, you must provide at least one volume group for each package.

Sample cluster (8-node cluster) (max 16 nodes)

[Figure: 8-node cluster. Labels: WAN, client, DiskArray, standby, EMC Symmetrix, HP XP256.]

Cluster reformation

[Figure: cluster reformation. Labels: System A leave, System A join, System B (Pkg 3), System C (Pkg 4); the cluster re-forms after each event.]

Lock disk concept

The cluster lock is a disk located in a volume group shared by all nodes in the cluster.
- required for 2-node clusters
- optional for 3-node or 4-node clusters
- not supported for clusters of 5 or more nodes

[Figure: two-node cluster with a lock disk. Labels: Node 1, Node 2, Pkg A, Pkg B, root disks, Primary LAN Card, Standby LAN Cards, Pkg A Disks, Pkg B Disks, Dedicated Heartbeat LAN, Lock Disk; a failure is marked X.]

- Disk arrays such as Model 10, Model 20, Model 30, and FC60 need a separate, additionally configured lock disk.
- AutoRAID 12H: one of its physical volumes can be used as the lock disk, so no separate lock disk is needed.

Failure types handled by MC

- Node (host) failover: SPU (CPU, memory, disk I/O, power)
- LAN failover: LAN card, LAN link
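The lock disk rule stated under the lock disk concept (required for 2 nodes, optional for 3 or 4, not supported for 5 or more) can be written as a small lookup. This is only a sketch of the rule as the slides state it: the function name is hypothetical, the 16-node ceiling is taken from the sample-cluster slide, and nothing here calls ServiceGuard.

```python
MAX_NODES = 16  # upper bound taken from the "max 16 nodes" sample-cluster slide


def cluster_lock_requirement(node_count: int) -> str:
    """Return the lock disk status the slides give for a cluster of this size."""
    if node_count < 2:
        # The slides state no rule for fewer than two nodes.
        raise ValueError("no lock disk rule is stated for fewer than 2 nodes")
    if node_count > MAX_NODES:
        raise ValueError(f"the slides give a maximum of {MAX_NODES} nodes")
    if node_count == 2:
        return "required"
    if node_count in (3, 4):
        return "optional"
    return "not supported"


if __name__ == "__main__":
    for n in (2, 3, 4, 5, 8, 16):
        print(n, cluster_lock_requirement(n))
```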
[Figure: Application Switch Demo (SPU Failure), two frames. Labels: Node 1, Node 2, Pkg A (float IP_A), Pkg B (float IP_B), root disks, Primary LAN Card, Standby LAN Cards, Pkg A Disks, Pkg B Disks, Dedicated Heartbeat LAN, Primary LAN: Heartbeat/Data, Standby LAN: Heartbeat/Data, Client (Pkg A client); the failed component is marked X.]

[Figure: Application Switch Demo (LAN Failure). Same cluster layout; the failed LAN component is marked X.]

Application switch time

[Figure: two-node cluster during a package switch of Pkg A; the failure is marked X.]

The slide lists the package control script steps that contribute to the switch time:
- Package halt: halt_services, customer_defined_halt_cmds, remove_ip_address, umount_fs, deactivate_volume_group
- Package start: activate_volume_group, check_and_mount, add_ip_address, customer_defined_run_cmds, start_services

MC administration commands (1): cluster startup

1. Automatic: set AUTOSTART_CMCLD=1 in /etc/rc.config.d/cmcluster
2. Manual: cmruncl
3. Single node: cmruncl -n hostname

MC administration commands (2): cluster view

  cmviewcl

  CLUSTER      STATUS
  cluster1     up

    NODE       STATUS   STATE
    systemA    up       running

      PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
      pkg_A    up       running   enabled      systemA
      pkg_B    up       running   enabled      systemB

    NODE       STATUS   STATE
    systemB    up       running

MC administration commands (3): cluster stop

  cmhaltcl -f      (forcibly closes the database and application)
  cmviewcl

  CLUSTER      STATUS
  cluster1     down

MC administration commands (4): node stop and join

Node stop:

  cmhaltnode -f -n systemB

  CLUSTER      STATUS
  cluster1     up

    NODE       STATUS   STATE
    systemA    up       running

      PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
      pkg_A    up       running   enabled      systemA
      pkg_B    up       running   enabled      systemA

    NODE       STATUS   STATE
    systemB    down     halted

Node start:

  cmrunnode systemB

  CLUSTER      STATUS
  cluster1     up

    NODE       STATUS   STATE
    systemA    up       running

      PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
      pkg_A    up       running   enabled      systemA
      pkg_B    up       running   enabled      systemA

    NODE       STATUS   STATE
    systemB    up       running

MC administration commands (5): package stop

Before:

  PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
  pkg_A    up       running   enabled      systemA
  pkg_B    up       running   enabled      systemB

  cmhaltpkg pkg_B

After:

  PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
  pkg_A    up       running   enabled      systemA
  pkg_B    down     unowned   disabled     unowned

MC administration commands (6): package status change and start

Starting state:

  PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
  pkg_A    up       running   enabled      systemA
  pkg_B    down     unowned   disabled     unowned

  cmrunpkg -n systemB pkg_B      (not successful)
  cmrunnode systemB
  cmrunpkg -n systemA pkg_B

  PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
  pkg_A    up       running   enabled      systemA
  pkg_B    up       running   disabled     systemA

  cmmodpkg -e pkg_B

  PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
  pkg_A    up       running   enabled      systemA
  pkg_B    up       running   enabled      systemA
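The package tables shown for the administration commands all share the columns PACKAGE, STATUS, STATE, PKG_SWITCH, and NODE, so a script can flag packages that are down or have switching disabled. The sketch below parses a hard-coded sample shaped like the slide output; the exact whitespace layout, the `PackageStatus` type, and the helper names are assumptions, and real cmviewcl output may differ between ServiceGuard releases.

```python
from typing import List, NamedTuple


class PackageStatus(NamedTuple):
    name: str
    status: str
    state: str
    pkg_switch: str
    node: str


# Sample text modelled on the slide after "cmhaltpkg pkg_B"; layout is assumed.
SAMPLE = """\
CLUSTER      STATUS
cluster1     up

  NODE       STATUS   STATE
  systemA    up       running

    PACKAGE  STATUS   STATE     PKG_SWITCH   NODE
    pkg_A    up       running   enabled      systemA
    pkg_B    down     unowned   disabled     unowned
"""


def parse_packages(text: str) -> List[PackageStatus]:
    """Collect the rows that follow each PACKAGE header line."""
    packages = []
    in_table = False
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            in_table = False  # a blank line ends the current table
        elif fields[0] == "PACKAGE":
            in_table = True   # header row; data rows follow
        elif fields[0] in ("NODE", "CLUSTER"):
            in_table = False  # a different table starts here
        elif in_table and len(fields) == 5:
            packages.append(PackageStatus(*fields))
    return packages


def needs_attention(packages: List[PackageStatus]) -> List[str]:
    """Names of packages that are not up or have switching disabled."""
    return [p.name for p in packages
            if p.status != "up" or p.pkg_switch != "enabled"]


if __name__ == "__main__":
    print(needs_attention(parse_packages(SAMPLE)))
```

A package reported here would typically be handled with the cmrunpkg and cmmodpkg -e steps shown in the slides.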
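MC test methods

- MC/ServiceGuard software installation: swlist (expected entry: B3935BA B.11.00 MC/ServiceGuard)
- MC/ServiceGuard running: cmruncl, then cmviewcl
- Manual package switch: cmhaltpkg pkg_name, then cmrunpkg pkg_name
- Manual node halt: cmhaltnode -f node_name
- Operating system failure: shutdown -r -y 0

Notes: power connection

[Figure: power wiring. Labels: UPS, input side, power box, dedicated ground wire; terminals N, L, G; 220 V; 1.0 V; resistance less than 1 ohm; three 15 A ratings.]

G: ground wire. N: neutral wire. L: live wire.
- The neutral wire and the ground wire must not be connected together.
- The ground wire must be connected directly to earth.

Notes: heartbeat network connection (switch)

[Figure: heartbeat LAN wiring between Node 1 and Node 2, shown twice. Labels: Primary LAN Card, Standby LAN Card, HeartBeat LAN Cards, Pkg A, Pkg B, root disks, Pkg A Disks, Pkg B Disks; 8-pin connectors numbered 1 to 8 with pin mapping 1-3, 2-6 for "Direct connect"; a component is labeled SPOF.]

Notes

1. Application stability: MC cannot protect against defects in the application itself, OS bugs, and so on. Configure the MC system only after the application runs stably on a single machine.
2. Data reliability: MC cannot guarantee data availability. Use a suitable disk technology to protect the data.
3. Overall reliability of the application system: MC only ensures high reliability of the host system. The reliability of the whole application system requires considering single points of failure (SPOF) in every area, for example by using reliable networks, middleware products, and client programs.
4. Host processing capacity: consider whether one host can run multiple applications after an MC failover.
5. Application design considerations:
- Split applications to balance the load (use active/active mode; avoid active/standby mode).
- One volume group per application (partition the disk array space by application).
- Client programs should connect through the floating IP, not a fixed host address.
- Data consistency: keep the MC volume groups synchronized across all nodes (vgexport and vgimport commands).
- Do not change the MC configuration: /.rhosts, /etc/hosts, /etc/cmcluster/cmclnodelist, /etc/cmcluster/*, network services.

Measures after an MC failover

Assume a 2-node cluster with host names host1 and host2, where host1 has failed.

Confirm that the application has switched over and is available. On host2, run:

  cmviewcl      (the status of pkg_name should be running)
  ps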