HDS Active-Active Data Center Solution: Introduction to the GAD Solution, v4

Document summary

HDS Active-Active Data Center Solution: Introduction to the GAD Solution
Presenter: Xie Yong, consultant, HDS Disaster Recovery Solutions Department

Agenda
  • GAD technical discussion in detail, such as the I/O processing principle
  • Success experience sharing
  • Competitive analysis
  • Resource sharing
  • Q/A

HDS GAD active-active data center solution
Diagram labels: Clustering; VSP G1000 M-DKC; VSP G1000 R-DKC; host/application (x2); GAD.

Concept
  • GAD stands for Global Active Device.
  • It is an active-active storage solution for local or remote (100 km) deployment.
  • It is configured with two high-end VSP G1000 storage systems.
  • The two storage systems are connected to each other over FC and keep their data consistent.
  • The hosts use cluster software to achieve high availability.

Customer benefits
  • Active-active storage with load balancing improves storage capacity and efficiency.
  • When any node fails, switchover is automatic and the business is not affected.
  • I/O is read and written locally, which gives the best application performance.
  • The architecture is simple, with few points of failure and no additional software or tools.
  • It integrates smoothly with remote disaster recovery solutions to build a two-site, three-data-center configuration.

Diagram labels: Active / Active at 100 km; global storage virtualization; both arrays present Virtual Storage Identity 123456; physical LDEVs 10:01, 10:02 and 20:01, 20:02 are presented as virtual LDEVs 10:01, 10:02; servers with apps requiring high availability; simultaneous access from multiple applications; read locally; write to multiple copies.

GAD technical principles
Diagram labels: Quorum; GAD implementation architecture.

HDS active-active data center I/O mechanism
Diagram labels: VSP G1000-1; VSP G1000-2; active-active volume pair (100 km); QRM (quorum storage); production host 1; production host 2; Cluster / Extended RAC; write I/O; read I/O.
  • Both the primary and the secondary storage accept write I/O.
  • Every write I/O is replicated to both storage systems.
  • Local reads are supported.

GAD write I/O principle: write I/O flow to the primary storage
Diagram labels: MDKC; RDKC; HA mirroring; QRM (Quorum DKC); Prod. Server 1 (Active); Prod. Server 2 (Active); App/DBMS; App/DBMS clustering.
  1. Write I/O
  2. Set exclusive lock
  3. Write to MDKC
  4. Write to RDKC
  5. Release exclusive lock
  6. Return good

GAD write I/O principle: write I/O flow to the secondary storage
  1. Write I/O
  2. Set exclusive lock
  3. Write to MDKC
  4. Write to RDKC
  5. Return good
  6. Release exclusive lock

When a write I/O is received from the host, data is written to both the active and the standby DKCs in order to synchronize the data between the two DKCs. Regardless of whether the write command from the host is received in the MDKC or the RDKC, the data is always written to the MDKC first and then to the RDKC. All locks are controlled in the MDKC, in PM on the MPB, so no quorum access is required. All locking is done at the extent level. (The write path to the MDKC uses the TC path.)

GAD Quorum Disk
  • The quorum disk is used to monitor the GAD pair volumes.
  • The quorum disk acts as a heartbeat for the GAD pair.
  • Quorum disk health check:
      - Frequency is every 500 ms.
      - 256k data transfer, read of the pair bitmap table.
      - 15-second timeout, same as standard UVM.
  • A SIM is generated when the quorum disk is blocked.
  • A SIM is generated when the quorum disk is recovered.

GAD state handling mechanism
States: Initial State; Duplicating (copying); Duplicated (synchronized); Suspended; Blocked.
Diagram labels: Preparing Quorum Disk; Synchronizing P-Vol and S-Vol; P-Vol has the latest data; S-Vol has the latest data.
Transitions: Create Pair; Pair Resync (P-Vol); Pair Resync (S-Vol); Suspend (P-Vol) / Error; Suspend (S-Vol) / Error; Error; Pair Delete; Force Pair Delete.

GAD Status and I/O Mode
Table columns: GAD PAIR Status; GAD VOLUME Status; GAD I/O MODE. The GAD status can be derived from the pair status and the I/O mode.
(*) Pair status and I/O mode cannot be retrieved due to failure.

Failure point analysis and recovery
Diagram labels: VSP G1000-1; VSP G1000-2; GAD Pair; QRM (quorum storage); Clustering; production server 1 (Active); production server 2 (Active); App/DBMS; failure points numbered 1 to 15; primary site failure; secondary site failure.

Single-link failures
  • Host-side single-link failure: behavior depends on the system settings. On AIX, for example, the path switches after a 15-second retry; dyntrk and fast_fail introduce a 15-second delay. The default host R/W timeout is 60 seconds.
  • Storage-side single-link failure: if the switch has RSCN=YES, the path switches after a 15-second retry; if RSCN=NO, it switches after a 60-second retry.
  • Single-link failure on the inter-storage replication links: I/O is distributed round-robin across all links; after a 3-second retry, I/O moves to another link.

Server1 to VSP1 link failure (failure point 1)
  • Failover: HDLM detects the failure and, after retrying, switches automatically to the Non-Preferred path. Server1 accesses the VSP2 storage.
  • Recovery: once the link is repaired, HDLM detects it automatically and Server1 returns to the Preferred path.
  • Impact: Server1 pauses briefly; Server2 is not affected.

Server2 to VSP2 link failure (failure point 2)
  • Failover: HDLM detects the failure and, after retrying, switches automatically to the Non-Preferred path. Server2 accesses the VSP1 storage.
  • Recovery: once the link is repaired, HDLM detects it automatically and Server2 returns to the Preferred path.
  • Impact: Server2 pauses briefly; Server1 is not affected.

Non-Preferred link failure (failure points 9, 10)
  • Failover: equivalent to a single-link failure; no switchover takes place.
  • Recovery: once the link is repaired, HDLM detects it automatically.
  • Impact: neither Server1 nor Server2 is affected.

Single-point failures and recovery

VSP1 failure (failure point 3)
  • Failover: VSP2 accesses the QD and checks data consistency. The GAD pair is split, with VSP2 holding the latest data. Server1 and Server2 read and write to VSP2.
  • Recovery: repair VSP1, resynchronize in the reverse direction, then fail back. Server1 returns to the Preferred path automatically.
  • Impact: Server1 and Server2 both pause briefly.

VSP2 failure (failure point 4)
  • Failover: VSP1 accesses the QD and checks data consistency. The GAD pair is split, with VSP1 holding the latest data. Server1 and Server2 read and write to VSP1.
  • Recovery: after the failed array is repaired and resynchronized, GAD returns to normal. Server2 returns to the Preferred path automatically.
  • Impact: Server1 and Server2 both pause briefly.

Inter-storage link failure, one direction, M to R (failure point 5)
  • Failover: VSP1 accesses the QD and checks data consistency. The GAD pair is split, with VSP1 holding the latest data. Server1 and Server2 read and write to the surviving VSP1.
  • Recovery: repair the inter-storage link and resynchronize GAD incrementally. Server2 returns to the Preferred path automatically.
  • Impact: Server1 and Server2 both pause briefly.

Inter-storage link failure, one direction, R to M (failure point 5)
  • Failover: VSP2 accesses the QD and checks data consistency. The GAD pair is split, with VSP2 holding the latest data. Server1 and Server2 read and write to the surviving VSP2.
  • Recovery: repair the inter-storage link and resynchronize GAD in the reverse direction. Server2 returns to the Preferred path automatically.
  • Impact: Server1 and Server2 both pause briefly.

Inter-storage link failure, both directions (failure point 5)
  • Failover: VSP1 and VSP2 both access the QD and check data consistency. The GAD pair is split according to the latest state. Server1 and Server2 read and write to the surviving VSP that holds the latest data.
  • Recovery: repair the inter-storage links and resynchronize GAD incrementally. Server2 returns to the Preferred path automatically.
  • Impact: Server1 and Server2 both pause briefly.

Quorum failure, link or storage (failure points 6, 7, 8)
  • Failover: access to the QD from VSP1 or VSP2 times out and GAD is split. VSP1 holds the latest data and VSP2 goes to PSUE. Server1 and Server2 read and write to the surviving VSP1.
  • Recovery: repair the quorum link or quorum storage and resynchronize GAD incrementally. Server2 returns to the Preferred path automatically.
  • Impact: Server1 and Server2 both pause briefly.

Server1 node failure (failure point 11)
  • Failover: Server1 fails over to Server2 automatically.
  • Recovery: repair Server1; Server1 returns to the Preferred path automatically.
  • Impact: Server2 pauses briefly.

Server2 node failure (failure point 12)
  • Failover: Server2 fails over to Server1 automatically.
  • Recovery: repair Server2; Server2 returns to the Preferred path automatically.
  • Impact: Server1 pauses briefly.

Primary site failure (Server1, VSP1, link; failure points 1, 3, 5, 6, 9, 10, 11)
  • Failover: VSP2 accesses the QD and checks data consistency. The GAD pair is split. Server1 accesses the VSP2 storage.
  • Recovery: repair Server1 and fail RAC back; repair VSP1 and resynchronize in the reverse direction.
  • Impact: Server1 switches over to Server2.

Combined scenarios: multi-point failures and recovery

Two nodes failing at the same time
  • Primary and secondary storage failure: Server1 and Server2 stop I/O; manual recovery.
  • Primary storage and QD failure: Server1 and Server2 stop I/O; manual recovery.
  • Secondary storage and QD failure: Server1 and Server2 stop I/O; manual recovery.
  • Primary-secondary link and QD failure: Server1 and Server2 stop I/O; manual recovery.

Replication link plus M-QD link failure
  • Failover: VSP1 cannot access the QD and is blocked. VSP2 accesses the QD and checks data consistency. The GAD pair is split, with VSP2 holding the latest data. Server1 and Server2 read and write to VSP2.
  • Recovery: repair the links, resynchronize in the reverse direction, then fail back. Server1 returns to the Preferred path automatically.
  • Impact: Server1 and Server2 both pause briefly.

Replication link plus R-QD link failure
  • Failover: VSP1 accesses the QD and checks data consistency. VSP2 cannot access the QD and is blocked. The GAD pair is split, with VSP1 holding the latest data. Server1 and Server2 read and write to VSP1.
  • Recovery: repair the links; after resynchronization, GAD returns to normal. Server2 returns to the Preferred path automatically.
  • Impact: Server1 and Server2 both pause briefly.

Physical site deployment modes: analysis and comparison
Quorum behavior shown in the diagrams:
  • When it fails to communicate with the Primary Storage System, the Secondary Storage System takes the SVOL lock on the quorum disk.
  • When it fails to communicate with the Secondary Storage System, the Primary Storage System takes the PVOL lock on the quorum disk.
  • When it fails to communicate with the Quorum, the Primary Storage System changes the SVOL to blockage.
  • Due to the failure of the quorum disk, the Secondary Storage System cannot take the SVOL lock, causing I/O failure on the Primary Storage System.
  • When it fails to communicate with the Secondary Storage System, the Primary Storage System takes the PVOL lock on the quorum disk and then accepts read/write I/O.
  • All the components on the site failed.

Deployment options
  • Three physical sites (local site, remote site, quorum site): the primary, secondary, and quorum storage sit in three separate sites. This gives the best availability and business protection.
  • Two physical sites (local site, remote site): the primary and quorum storage sit in the primary data center, and the secondary storage sits in a same-city center. This gives good availability and business protection.
  • One physical site (local site): the primary, secondary, and quorum storage all sit in one center. This gives device-level availability and business protection.

Cross-connected versus non-cross-connected modes
  • Cross-connected architecture: when the primary storage fails, the primary server does not fail over. After the storage switches, the primary server accesses storage remotely. HDLM software is required.
  • Non-cross-connected architecture: when the primary storage fails, the primary server fails over. After the storage switches, there is no remote access. HDLM software is not mandatory.

Combination with disaster recovery technology

Deployment and implementation plan
Network connection diagram labels: Dark Fiber; ports T, I, E; VSP G1000-1 (P-VOL); VSP G1000-2 (S-VOL); FCSW-1; FCSW-2; RAC-1; RAC-2; GESW-1; GESW-2; DRSW-1; DRSW-2; QD (HDS HUS); IP service network.

Performance analysis model
Application model
  • Random I/O (online): one transaction is 5 I/Os; client response time is 2 s.
  • Sequential I/O (batch): one batch is 1 million I/Os.
  • Read/write ratio: 70% reads, 30% writes.
  • I/O response time: 5 ms on average.

Single system (baseline)
  • I/O time for one random transaction: 5 ms x 5 = 25 ms.
  • WAN transmission time: 2 s.
  • Client response time: 2 s + 25 ms ≈ 2 s.
  • Batch processing time: 1,000,000 x 5 ms = 5000 s ≈ 1.4 hours.

With each write issued twice
  • Write I/O response time: 5 ms x 2 = 10 ms.
  • I/O time for one random transaction: 5 ms x 5 x 0.7 + 10 ms x 5 x 0.3 = 32.5 ms.
  • Client response time: 2 s + 32.5 ms ≈ 2 s.
  • Batch processing time: 1,000,000 x (0.7 x 5 ms + 0.3 x 10 ms) = 6500 s ≈ 1.8 hours.
  • Result: online transactions (random I/O) see almost no impact; batch processing (sequential I/O) takes 30% longer.

With link latency added
  • Added by the link: 0.5 ms (50 km round trip) + 1 ms (4 signal conversions) = 1.5 ms.
  • Write I/O response time: 5 ms x 2 + 1.5 ms = 11.5 ms.
  • I/O time for one random transaction: 5 ms x 5 x 0.7 + 11.5 ms x 5 x 0.3 = 34.75 ms.
  • Client response time: 2 s + 34.75 ms ≈ 2 s.
  • Batch processing time: 1,000,000 x (0.7 x 5 ms + 0.3 x 11.5 ms) = 6950 s ≈ 1.93 hours.
  • Result: online transactions (random I/O) see almost no impact; batch processing (sequential I/O) increases by 38
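The arithmetic in the performance analysis model can be reproduced with a few lines of Python. This is only a sketch of the slide's back-of-the-envelope model, not an HDS tool. The function names and the scenario labels (`baseline`, `mirrored`, `mirrored_50km`) are made up for illustration; the constants (5 ms per I/O, 70/30 read/write mix, 5 I/Os per transaction, 1 million I/Os per batch, 1.5 ms link overhead) are the figures quoted above.

```python
# Constants taken from the performance analysis model in the slides.
IO_MS = 5.0                # average response time of one local I/O
READ_RATIO = 0.7           # share of reads
WRITE_RATIO = 0.3          # share of writes
IOS_PER_TXN = 5            # I/Os per online transaction
IOS_PER_BATCH = 1_000_000  # I/Os per batch job


def write_ms(mirrored: bool, link_ms: float = 0.0) -> float:
    """Write latency: one local write, doubled when mirrored, plus link delay."""
    return IO_MS * (2 if mirrored else 1) + link_ms


def avg_io_ms(mirrored: bool, link_ms: float = 0.0) -> float:
    """Average latency per I/O for the 70/30 read/write mix."""
    return READ_RATIO * IO_MS + WRITE_RATIO * write_ms(mirrored, link_ms)


def transaction_ms(mirrored: bool, link_ms: float = 0.0) -> float:
    """I/O time of one online transaction (5 I/Os)."""
    return IOS_PER_TXN * avg_io_ms(mirrored, link_ms)


def batch_seconds(mirrored: bool, link_ms: float = 0.0) -> float:
    """Elapsed time of one batch job (1 million serial I/Os), in seconds."""
    return IOS_PER_BATCH * avg_io_ms(mirrored, link_ms) / 1000.0


# Hypothetical scenario labels: (mirrored?, extra link latency in ms).
SCENARIOS = {
    "baseline": (False, 0.0),
    "mirrored": (True, 0.0),
    "mirrored_50km": (True, 0.5 + 1.0),  # 50 km round trip + 4 conversions
}

if __name__ == "__main__":
    base = batch_seconds(*SCENARIOS["baseline"])
    for name, args in SCENARIOS.items():
        batch = batch_seconds(*args)
        print(f"{name}: txn={transaction_ms(*args):.2f} ms, "
              f"batch={batch:.0f} s ({batch / 3600:.2f} h), "
              f"batch overhead={100 * (batch / base - 1):.0f}%")
```

The model treats every I/O as serial, which is why the batch job absorbs the full write penalty while the online transaction, dominated by the 2 s wide-area transmission time, barely changes.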

