




Active-Active Data Center and Disaster Recovery Solutions (Technical Part)
臧铁军, VMware GCH COE Cloud Architect

Agenda

Overview: Business Continuity Solutions Based on Virtualization
- Local site and disaster recovery site
- Disaster recovery: asynchronous replication at the virtualization layer; synchronous and asynchronous replication at the storage-array level; automated application failover management; metro clusters
- Local high availability: application-aware high availability; zero-downtime protection for critical applications; live migration of virtual machines; dynamic allocation of compute and storage resources (vMotion and Storage vMotion)
- Data protection: efficient data backup and recovery; operations can be automated with schedules and scripts
- Solution characteristics: independent of application and operating system; independent of hardware; comprehensive protection; simple and economical

Agenda

An active-active data center safeguards availability at every level.

Active-Active Data Center: Overall Architecture
- A stretched vSphere cluster spans Site A and Site B on top of an active-active storage cluster
- Behaves like a single vSphere cluster; the stretch distance is at most 200 km and typically less than 50 km
- Automatic DR protection is provided by VMware HA and vMotion
- Requires an active-active storage cluster, such as EMC VPLEX or NetApp MetroCluster

Compute Resource Design

Making an Application Service Highly Available
- vSphere HA and vSphere App HA

vSphere App HA
- Policy-based protection for off-the-shelf applications, for example VMware vFabric tc Server

Fault Tolerance vs. High Availability
- Fault tolerance: the ability to continue operating through the loss of a component (for example, a hard-drive failure)
- High availability: the ability to recover from a failure by restarting the affected workload

Multi-vCPU Fault Tolerance
- Instantaneous failover between a 4-vCPU primary and a 4-vCPU secondary VM, kept in sync with fast checkpointing

Long-Distance vMotion
- vSphere 6.0 supports vMotion across Layer 3 networks and across vCenter Servers

vCenter Availability
- Run the vCenter Server application in a VM; run the vCenter Server database in a VM (or both in the same VM)
- Protect with vSphere HA: set the restart priority of the vCenter and database VMs to High; enable guest OS and application monitoring; App HA can protect a SQL Server database
- Back up the vCenter Server VM and the database: image-level backup for the vCenter Server VM, agent-based application-level backup for the database

Network Resource Design

Active-Active Data Center Network Architecture

NSX vSphere Multi-Site Use Cases
- NSX for vSphere supports three multi-site deployment models: VXLAN with stretched clusters (vSphere Metro Storage Cluster), VXLAN with separate clusters, and L2 VPN
- All three provide L2 extension over an L3 network, enabling workload and IP mobility without stretching VLANs
- Local egress is supported, but it adds complexity
- The appropriate deployment model depends on customer requirements and the customer's environment

NSX Uses Overlay Networking to Enable the Active-Active Data Center
[Diagram: an active-active vSphere metro storage cluster with Datastore 1 and Datastore 2, a single vCenter Server, and an L3 network connecting Site A and Site B]

VMware NSX Multi-Site Single VC, Stretched Cluster: Solution Detail
- Requires a supported vSphere Metro Storage Cluster (vMSC) configuration; in a vMSC deployment storage is active/active and spans both sites (examples: EMC VPLEX, NetApp MetroCluster; see the VMware HCL for details)
- Stretched clusters support live vMotion of workloads
- Use L3 for all VMkernel networks: management, vMotion, IP storage
- All management components (vCenter Server, NSX Manager, NSX Controllers) are located in Site A
- Latency and bandwidth requirements are dictated by the vMSC storage vendor, e.g. 10 ms RTT for VPLEX, which also aligns with vMotion on Enterprise Plus licensing
- vMSC enables disaster avoidance and basic disaster recovery (without the orchestration or testing capabilities of SRM)
- Loss of either the NSX components or the data-center interconnect results in a fallback to data-plane learning using existing network state, so data forwarding continues without outage; without vCenter Server, however, no VM provisioning or migration operations are possible
- NSX and vMSC are complementary technologies and a sweet spot for NSX (single vCenter Server)

VMware NSX Multi-Site Single VC, Stretched Cluster: Cluster Configuration
- vMSC enables clusters stretched across two physical sites; in an NSX deployment the Management, Edge and Workload clusters are all stretched
- Under normal conditions all management components run in Site A, protected by vSphere HA, and are automatically restarted at Site B in the event of a site outage
- The management network is not stretched and must be enabled on Site B as part of the recovery run book
- Depending on the design, NSX Edge Services Gateways are active either in both sites or in a single site, and can also leverage Edge HA
- VMs in the Workload clusters are recovered automatically

VMware NSX Multi-Site Single VC, Stretched Cluster: DRS Configuration
- In a vMSC environment, DRS is used to balance resource utilization, provide site affinity, improve availability and ensure optimal traffic flow
- Use "should" rules rather than "must" rules, so that vSphere HA can take precedence
- Example DRS groups, rules and settings for the NSX Edges are illustrated in the sketch below
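The deck references example DRS groups and rules for keeping NSX Edge VMs at their home site but the table itself did not survive extraction. Below is a minimal pyVmomi sketch of the same idea, assuming placeholder vCenter credentials and cluster, host and VM names: it creates a host group for Site A, a VM group for the Site A Edges, and a non-mandatory ("should run on hosts in group") affinity rule.

```python
# Sketch: create DRS groups and a "should" VM-Host affinity rule for NSX Edge VMs.
# Assumes pyVmomi is installed; the cluster, host and VM names below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_obj(content, vimtype, name):
    """Return the first managed object of the given type with the given name."""
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    return next(o for o in view.view if o.name == name)

ctx = ssl._create_unverified_context()  # lab only; use proper certificates in production
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
content = si.RetrieveContent()

cluster = find_obj(content, vim.ClusterComputeResource, "Stretched-Cluster")
site_a_hosts = [find_obj(content, vim.HostSystem, n)
                for n in ("esx-a-01.example.local", "esx-a-02.example.local")]
edge_vms = [find_obj(content, vim.VirtualMachine, n)
            for n in ("NSX-Edge-A-0", "NSX-Edge-A-1")]

spec = vim.cluster.ConfigSpecEx(
    groupSpec=[
        vim.cluster.GroupSpec(operation="add", info=vim.cluster.HostGroup(
            name="SiteA-Hosts", host=site_a_hosts)),
        vim.cluster.GroupSpec(operation="add", info=vim.cluster.VmGroup(
            name="SiteA-Edges", vm=edge_vms)),
    ],
    rulesSpec=[
        vim.cluster.RuleSpec(operation="add", info=vim.cluster.VmHostRuleInfo(
            name="SiteA-Edges-should-run-on-SiteA-Hosts",
            enabled=True,
            mandatory=False,            # "should" rule, so vSphere HA can override it
            vmGroupName="SiteA-Edges",
            affineHostGroupName="SiteA-Hosts")),
    ],
)
task = cluster.ReconfigureComputeResource_Task(spec=spec, modify=True)
print("Submitted DRS group/rule reconfiguration:", task)
Disconnect(si)
```

The same pattern, with a second pair of groups and a rule, covers the Site B Edges.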
VMware NSX Multi-Site Single VC, Stretched Cluster: NSX Configuration (Option 1, Preferred)
- The transport zone spans both sites, and VXLAN logical switches provide L2 connectivity to the VMs
- Distributed logical routing is used for all VMs, providing a consistent default-gateway vMAC
- Local egress is provided by using separate uplink LIFs and Edge gateways per site: hosts in Site A have their DLR default gateway set to the Site A Edge gateway via the net-vdr CLI, while the Site B DLR default gateway points at the Site B Edge gateway
- Caveat: dynamic routing cannot be enabled on the DLR, nor can a static route be set via NSX Manager; the NSX Edge gateways carry static routes for any networks directly connected to the DLR, and consistent IP addressing simplifies routing by allowing a supernet to be used
- The distributed firewall (DFW) enforces policy at the vNIC, independent of where the VM runs
[Diagram: Web, App and DB logical switches (/24 each) span Site A and Site B and connect to a Distributed Logical Router (internal LIFs .1); per-site NSX Edge gateways attach via Uplink Net A (/29, Uplink A LIF) and Uplink Net B (/29, Uplink B LIF)]

VMware NSX Multi-Site Single VC, Stretched Cluster: NSX Configuration (Option 2)
- As in Option 1, the transport zone spans both sites and VXLAN logical switches provide L2 connectivity for the VMs
- NSX Edge gateways are deployed per site with the same internal IP address
- NSX DFW L2 Ethernet rules, defined with MAC sets, block ARP to the remote gateway; this provides local egress because only the site-local Edge gateway is learned (a future enhancement is planned to support an ESXi-host object in DFW)
- Caveats: traffic flow between application tiers may be asymmetric if the tiers are split across sites and DRS rules aren't used; the design does not leverage distributed logical routing and is limited to 10 vNICs per Edge; vMotion causes a brief network interruption while the VM's ARP cache entry for the site-specific gateway times out
- Use Option 2 when Option 1 isn't a fit (for example, when dynamic routing or vSphere 5.1 support is required)
[Diagram: a single logical switch (/24) with VMs spanning Site A and Site B]

VMware NSX Multi-Site Single VC, Separate Clusters
[Diagram: Site A and Site B each with local clusters and datastores (Datastore 1, Datastore 2), a single vCenter Server and an L3 network between sites; Storage vMotion is required for VM mobility]

VMware NSX Multi-Site Single VC, Separate Clusters: Solution Detail
- Separate vSphere clusters are used at each site, so DRS rules and groups are not required
- Storage is local to each site; Enhanced vMotion (simultaneous vMotion and Storage vMotion) can live-migrate VMs without shared storage
- Use L3 for all VMkernel networks: management, vMotion, IP storage
- All management components (vCenter Server, NSX Manager, NSX Controllers) are located in Site A
- The supported latency for Enhanced vMotion is 100 ms RTT (vSphere 6), and vMotion requires 250 Mbps of bandwidth per concurrent vMotion (see the sizing sketch after this slide)
- This model provides disaster avoidance where live vMotion is supported, by letting workloads be moved proactively between sites; it does not provide automated disaster recovery
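The 250 Mbps-per-vMotion figure above lends itself to quick sizing arithmetic. The short Python sketch below uses illustrative numbers (link size, reserved share, VM memory footprint) that are not from the deck, and estimates how many concurrent vMotions an inter-site link can sustain and roughly how long one memory copy takes at the minimum rate.

```python
# Back-of-the-envelope vMotion sizing based on the 250 Mbps per-vMotion requirement.
# Link capacity and VM memory size below are illustrative assumptions, not values from the deck.

MBPS_PER_VMOTION = 250            # minimum bandwidth per concurrent vMotion (from the deck)

def max_concurrent_vmotions(link_gbps: float, reserved_fraction: float = 0.5) -> int:
    """How many concurrent vMotions fit if only part of the inter-site link is reserved for them."""
    usable_mbps = link_gbps * 1000 * reserved_fraction
    return int(usable_mbps // MBPS_PER_VMOTION)

def memory_copy_minutes(vm_memory_gb: float, mbps: float = MBPS_PER_VMOTION) -> float:
    """Rough time to copy a VM's memory once at the given rate (ignores dirty-page re-copy passes)."""
    megabits = vm_memory_gb * 8 * 1000   # GB -> megabits (decimal units, close enough for an estimate)
    return megabits / mbps / 60

if __name__ == "__main__":
    print("10 Gbps link, 50% reserved for vMotion:",
          max_concurrent_vmotions(10.0), "concurrent vMotions")
    print("64 GB VM at 250 Mbps: ~%.0f minutes per memory pass" % memory_copy_minutes(64))
```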
VMware NSX Multi-Site Single VC, Separate Clusters: Cluster Configuration
- Clusters do not span beyond a physical site
- All management components run in Site A and will not be automatically recovered in the event of a site outage; storage replication to a standby cluster in Site B plus a manual recovery process could be implemented
- Separate Edge and Workload clusters are used per site; NSX Edge Services Gateways are active in a single site, with Edge HA local to that site
- Workloads are active across both sites and can optionally support live vMotion; DRS affinity rules for workloads are not required

VMware NSX Multi-Site Single VC, Separate Clusters: NSX Configuration
- Option 1 with distributed logical routing is unchanged from the stretched-cluster configuration and is still recommended
- For Option 2, because vCenter objects are not shared, NSX DFW L2 Ethernet rules scoped to the datacenter can provide local egress, since only the site-local Edge gateway is learned; no enhancements are required
- The Option 2 caveats listed for stretched clusters also apply
[Diagram: a single logical switch (/24) with VMs spanning Site A and Site B]

To Local Egress/Ingress, or Not?
- As a first step, ask the customer whether they have stateful services for traffic entering and exiting the data center; this is usually the case, and if so they will need a solution that provides local ingress for their applications, for example NAT, GSLB, Anycast, LISP or RHI
- If they can address local ingress, a multi-site NSX solution providing local egress is a good fit
- If they cannot, ask two further questions: do they have high bandwidth between sites, and is reducing operational complexity a goal?
- An active NSX Edge gateway at one site, with failover to the secondary site, may meet the customer's requirements and is much simpler than providing local egress and ingress
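In the separate-clusters model described above, moving a workload between sites means relocating compute and storage in one operation (Enhanced vMotion), since there is no shared datastore. Below is a minimal pyVmomi sketch of that combined relocation under a single vCenter; the vCenter address and the VM, host, cluster and datastore names are placeholders.

```python
# Sketch: cross-site "Enhanced vMotion" (compute + storage relocation) under a single vCenter.
# Object names below are placeholders; the calls are standard vSphere API methods via pyVmomi.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

def find_obj(content, vimtype, name):
    view = content.viewManager.CreateContainerView(content.rootFolder, [vimtype], True)
    return next(o for o in view.view if o.name == name)

ctx = ssl._create_unverified_context()  # lab only
si = SmartConnect(host="vcenter.example.local", user="administrator@vsphere.local",
                  pwd="***", sslContext=ctx)
content = si.RetrieveContent()

vm = find_obj(content, vim.VirtualMachine, "web-01")
dest_host = find_obj(content, vim.HostSystem, "esx-b-01.example.local")      # host in Site B
dest_ds = find_obj(content, vim.Datastore, "Datastore-2")                    # datastore local to Site B
dest_cluster = find_obj(content, vim.ClusterComputeResource, "SiteB-Workload")

spec = vim.vm.RelocateSpec(
    host=dest_host,                    # new compute location
    pool=dest_cluster.resourcePool,    # resource pool of the Site B cluster
    datastore=dest_ds,                 # new storage location (simultaneous Storage vMotion)
)
task = vm.RelocateVM_Task(spec=spec, priority=vim.VirtualMachine.MovePriority.highPriority)
print("Relocation task started:", task)
Disconnect(si)
```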
VMware NSX Multi-Site: L2 VPN
[Diagram: Site A (or on-premises) and Site B (or off-premises), each with its own vCenter Server and local datastore (Datastore 1, Datastore 2), connected by SSL-secured L2 VPN]

Storage Resource Design

Storage Requirements
[Diagram: Site A and Site B connected by dark fiber over DWDM at a distance of up to 200 km, forming a metro cluster with mirrored aggregates/plexes]
- Latency requirements: vSphere requires an RTT of no more than 100 ms; synchronous storage replication requires an RTT of no more than 5 ms

Metro Storage: Two Implementation Approaches, Uniform and Non-Uniform

How vSphere Metro Storage Cluster Works
- A vSphere HA cluster is stretched across a campus or metro area on top of vMSC-certified storage; the metro cluster uses array-based synchronous replication between the two plexes
- Standard vMotion of virtual machines works across the stretched cluster
- A site can be shut down for maintenance; once maintenance is complete and the site is restored, the storage resynchronizes automatically
- Example platform: NetApp MetroCluster

Storage Device Selection
- Compatibility guide: /comp_guide2/search.php
- Six categories of Metro Cluster storage: 1. iSCSI, 2. FC, 3. NFS, 4. iSCSI-SVD, 5. FC-SVD, 6. NFS-SVD

EMC VPLEX for Stretched Metro Clusters
- An established VPLEX active-active solution: instant vMotion across distance; VMware HA automatically restarts VMs at either site on system or site failure; workloads are balanced across both sites with VMware DRS; VMware FT is supported out of the box
- Additional flexibility of VPLEX Metro: does not require FC cross-connect; choose IP or FC connectivity between sites (up to 10 ms RTT); a third site provides IP connectivity to the Witness VM
- No single point of failure: losing a director does not cause loss of access at any site
[Diagram: a stretched vSphere cluster across Site A (active) and Site B (active) on VPLEX distributed virtual volumes, with dual-site DRS and HA, instant vMotion, and an optional Witness at Site C; roadmap item]

Stretched Storage with IBM SAN Volume Controller
- A single system image across two sites gives single-pane-of-glass management for day-to-day storage administration, simplifying management while deploying active-active storage
- Based on a rich, mature platform: Real-time Compression, Easy Tier, non-disruptive migrations and long-distance replication; 40,000 engines installed worldwide with 11 years of field experience; 250+ supported back-end storage devices, so existing storage investments are retained and flexibility is kept for the future
- An active quorum device at a third site enables automatic failover with no external management software, prevents split-brain, and supports recovery from a full unplanned site failure
[Diagram: SVC stretched cluster with Storage Pool 1 at Site 1, Storage Pool 2 at Site 2, and a quorum device at Site 3]

Reference Guides from Storage Vendors
- Implementing VMware vSphere Metro Storage Cluster with HP LeftHand Multi-Site storage: /V2/GetPDF.aspx%2F4AA4-0955ENW.pdf
- Implementing vSphere Metro Storage Cluster using HP 3PAR Peer Persistence: /V2/GetPDF.aspx%2F4AA4-7734ENW.pdf
- Deploy VMware vSphere Metro Storage Cluster on Hitachi Virtual Storage Platform: /assets/pdf/deploy-vmware-vsphere-metro-storage-cluster-on-hitachi-vsp.pdf
- IBM SAN and SVC Stretched Cluster and VMware Solution Implementation: /redbooks/pdfs/sg248072.pdf
- VMware vSphere 5.5 vMotion on EMC VPLEX Metro: /files/pdf/techpaper/vplex-metro-vmotion-vsphere55-perf.pdf

VSAN for Metro Cluster (planned for Q3 2015)
[Diagram: a Virtual SAN cluster with three fault domains: Fault Domain A at Site A and Fault Domain B at Site B holding vmdk replicas, and Fault Domain C at Site C holding the witness components]
- Upgrades Virtual SAN from rack awareness to site awareness: (1) a small fault-tolerant site is dedicated to the witness; (2) reads are served preferentially from the local site to improve performance
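The storage section above quotes several round-trip-time budgets (100 ms for vSphere, 10 ms for VPLEX Metro, 5 ms for synchronous replication). A quick way to sanity-check a candidate inter-site link is to sample TCP connection setup times; the sketch below does that against a placeholder address and port, and is only a rough proxy for the latency of the actual storage path.

```python
# Rough inter-site RTT probe using TCP connect times. This is a proxy only, not a substitute
# for the storage vendor's own latency qualification tools. Target address/port are placeholders.
import socket
import statistics
import time

def tcp_rtt_samples(host: str, port: int, count: int = 10, timeout: float = 2.0):
    """Time TCP three-way handshakes to approximate the network RTT in milliseconds."""
    samples = []
    for _ in range(count):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=timeout):
            pass
        samples.append((time.perf_counter() - start) * 1000.0)
        time.sleep(0.2)
    return samples

if __name__ == "__main__":
    rtts = tcp_rtt_samples("array-site-b.example.local", 3260)  # e.g. an iSCSI portal at the remote site
    print("min/median/max RTT (ms): %.2f / %.2f / %.2f"
          % (min(rtts), statistics.median(rtts), max(rtts)))
    for name, budget_ms in (("synchronous replication", 5.0),
                            ("VPLEX Metro", 10.0),
                            ("vSphere/vMotion", 100.0)):
        verdict = "within" if statistics.median(rtts) <= budget_ms else "exceeds"
        print("  %s budget (%.0f ms): %s" % (name, budget_ms, verdict))
```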
Agenda

RTO, RPO, and MTD
- Recovery Time Objective (RTO): how long recovery should take
- Recovery Point Objective (RPO): how much data loss can be incurred
- Maximum Tolerable Downtime (MTD): how much downtime can occur before significant loss is incurred (for example, financial or reputational loss)

The Three Building Blocks for Disaster Recovery
[Diagram: on a foundation of compute (vSphere) and storage (Virtual SAN, ecosystem/external storage), the three building blocks are backup and recovery (VDP Advanced, backup copies), replication (vSphere Replication from VMware, or array-based), and DR orchestration (Site Recovery Manager)]

Remote (or Metro) Disaster Recovery Solution: Overall Architecture

Remote (or Metro) Disaster Recovery Solution: Site Mapping Options
- Active/standby failover (Production to Recovery): the most common scenario, but relatively expensive
- Active-active failover (Production to Recovery): the DR infrastructure also hosts non-production workloads such as test, development and training, which effectively reduces cost
- Bidirectional failover (Production to Production): both sites run production applications and each site provides DR for the other
- Active-active data center (Site 1 and Site 2): applications can move freely across sites, with zero downtime for planned events, but this is limited to metro distances

Network Resource Design
[Diagram: a "Protected" site and a "Recovery" site, each with VMFS/NFS storage, linked by storage replication; SRM with NSX for vSphere carries firewall rules and security groups across sites]

SRM with NSX for vSphere: What Has Been Validated
- SRM can map VMs from one VXLAN logical switch on the primary site to a different logical switch on the recovery site; these logical switches can be connected to pre-created NSX Distributed Logical Routers or NSX Edge Services Gateways
- Placeholder VMs can be added to Security Groups, so that in a DR event, when those VMs become active, they are already protected by the distributed firewall
- Dynamic routing can be used to advertise networks on the primary site; using metric/weight, the same networks can be re-advertised from the recovery site after a site failover
- This maps very closely to the vCAC deployment model of pre-created networks used for production workloads; test/dev workloads using on-demand networking do not typically require DR
- Currently being tested: automating synchronization of the NSX distributed firewall rule set and Security Groups between the two NSX Managers (a minimal sketch of the idea follows), and tying into SRM so that when VMs are added to a protection group their placeholder VMs are automatically added to the appropriate Security Groups
- VMware is working closely with EMC, as part of EMC's Enterprise Private Cloud reference architecture project, to turn this into a productized solution including vCAC
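The deck notes that synchronization of the distributed firewall rule set between the two NSX Managers was still being automated (later via VCO workflows). Purely to illustrate the idea, here is a minimal Python sketch that copies the DFW configuration from the primary NSX Manager to the recovery one over the NSX for vSphere REST API; the endpoint paths, hostnames and credentials are assumptions, and in practice rules referencing site-local object IDs will not resolve on the other manager, which is exactly the challenge called out under "Current Challenges" below.

```python
# Sketch: one-way copy of the NSX-v distributed firewall configuration from the primary
# NSX Manager to the recovery NSX Manager. Hostnames and credentials are placeholders, and
# the /api/4.0/firewall/globalroot-0/config path reflects the NSX-v 6.x REST API as assumed here.
# Note: rules that reference site-local object IDs (security groups, VDS port-groups) will not
# resolve on the recovery manager without remapping.
import requests

PRIMARY = "https://nsxmgr-a.example.local"
RECOVERY = "https://nsxmgr-b.example.local"
AUTH = ("admin", "***")
VERIFY_TLS = False  # lab only

def get_dfw_config(manager: str):
    """Fetch the full DFW config XML and its ETag (generation number) from an NSX Manager."""
    r = requests.get(f"{manager}/api/4.0/firewall/globalroot-0/config",
                     auth=AUTH, verify=VERIFY_TLS)
    r.raise_for_status()
    return r.text, r.headers.get("ETag")

def put_dfw_config(manager: str, config_xml: str, etag: str):
    """Replace the DFW config on a manager; If-Match must carry that manager's current ETag."""
    r = requests.put(f"{manager}/api/4.0/firewall/globalroot-0/config",
                     data=config_xml.encode("utf-8"),
                     headers={"Content-Type": "application/xml", "If-Match": etag},
                     auth=AUTH, verify=VERIFY_TLS)
    r.raise_for_status()
    return r.status_code

if __name__ == "__main__":
    primary_cfg, _ = get_dfw_config(PRIMARY)
    _, recovery_etag = get_dfw_config(RECOVERY)   # ETag of the config being overwritten
    status = put_dfw_config(RECOVERY, primary_cfg, recovery_etag)
    print("Recovery site DFW config replaced, HTTP status:", status)
```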
SRM with NSX for vSphere: Logical Architecture View
[Diagram: a "Protected" site and a "Recovery" site, each running vCenter + SRM; VXLAN logical switches (/24, /28) sit behind Distributed Logical Routers that speak dynamic routing (OSPF, BGP) toward the VLAN-backed physical network; primary VMs run at the protected site and placeholder VMs stand by on VMFS at the recovery site; no network re-addressing is needed because routes are advertised dynamically]

SRM with NSX for vSphere: Security Group Mapping
[Diagram: the same two-site layout, with security groups SG-Prod-01 and SG-Dev-01 pre-created on both sites and a Site B NSX Edge gateway at the recovery site]

SRM with NSX for vSphere: Current Challenges and Future Development
- Current challenges:
  - Both the primary and recovery sites need to be prepared for NSX, and networks need to be created on both sites for the SRM mapping
  - Dynamically created networks and port-groups on the protected site are not created on the recovery site automatically and must be created by the administrator
  - DR-site configuration verification and remediation processes are needed to keep the configuration current and matching on both sites
  - Service Composer features depend on UUIDs and VDS unique identities (currently not supported)
  - No out-of-the-box support from SRM yet
- Future development scope:
  - Site preparation is an acceptable step and should be part of a run book
  - Automate SRM port-group mapping via VCO
  - Automate synchronization of the NSX distributed firewall rule set and Security Groups between the two NSX Managers using VCO workflows and the NSX-VCO plug-in; enhance VCO to support two-way synchronization
  - Investigate removing the Service Composer VDS thumbprint dependencies
  - Tie into SRM so that when VMs are added to a protection group, the placeholder VMs are automatically added to the appropriate Security Groups (Q2 2015)

NSX Workflow for Manual Synchronization
- When an object is created on the primary site, the same object needs to be deployed on the recovery site to maintain consistency
- Duplicate network segments and subnets can exist on both the protected and the recovery site in order to preserve IP addresses
[Diagram: "Protected" and "Recovery" sites, each with VMFS/NFS storage]

NSX VCO Workflow for Synchronization
[Diagram: the same two-site layout, with the synchronization steps driven by VCO workflows]

Storage Resource Design

The Unique Value of Software-Defined Storage in BC/DR Solutions
- Storage-policy-based management and virtual volume management (Virtual Datastores)
- A low-cost storage solution with fast, application-oriented provisioning
- Independent of the underlying hardware, which makes asymmetric designs easy to adopt
- No need to manage LUNs or volumes; policies can be changed online without data migration

Data Protection Design

Data Protection Techniques
[Chart: techniques plotted by potential for data loss (RPO, low to high) against time to recover (RTO, fast to slow): synchronous replication, asynchronous replication, snapshots, disk backup, tape backup]
- The two questions to answer: how much data can you afford to lose, and how long can you afford to be without the data or service?

Data Protection Use Cases

Choosing the Right Data Protection Solution
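The RPO/RTO chart above positions the techniques only qualitatively. As a closing illustration, the sketch below encodes that ordering with purely illustrative numeric thresholds (not figures from the deck) and suggests which techniques could satisfy a given recovery objective.

```python
# Illustrative mapping from required RPO/RTO to the candidate techniques in the chart above.
# The numeric thresholds are assumptions made for this example, not figures from the deck.
from dataclasses import dataclass

@dataclass
class Technique:
    name: str
    typical_rpo_min: float   # achievable data-loss window, in minutes (illustrative)
    typical_rto_min: float   # achievable time to recover, in minutes (illustrative)

TECHNIQUES = [
    Technique("Synchronous replication", 0, 15),
    Technique("Asynchronous replication", 15, 60),
    Technique("Snapshots", 60, 120),
    Technique("Disk backup", 24 * 60, 8 * 60),
    Technique("Tape backup", 24 * 60, 48 * 60),
]

def candidates(required_rpo_min: float, required_rto_min: float):
    """Return techniques whose typical RPO and RTO both fit within the stated requirement."""
    return [t.name for t in TECHNIQUES
            if t.typical_rpo_min <= required_rpo_min and t.typical_rto_min <= required_rto_min]

if __name__ == "__main__":
    # Example: an application that tolerates 30 minutes of data loss and 2 hours of downtime.
    print(candidates(required_rpo_min=30, required_rto_min=120))
    # -> ['Synchronous replication', 'Asynchronous replication']
```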