Introduction to HPE 3PAR Adaptive Data Reduction
3PAR Adaptive Data Reduction

Agenda
- What is Adaptive Data Reduction
- Deduplication
- Compression
- Deduplication and Compression
- Data Reduction Sizing
- Data Reduction Best Practice

HPE 3PAR Adaptive Data Reduction
Why move away from Thin Technologies?
- Thin Provisioning is considered old-fashioned and is base-level functionality for modern storage arrays
- Too often, our Thin Technologies are mistaken for Thin Provisioning only
- "Thin Deduplication" and "Thin Compression" cause confusion as to what makes our technologies thin

Why Adaptive Data Reduction?
- Data Reduction: the de facto standard term for compression and deduplication combined, providing a simple message for customers
- Adaptive: select the Data Reduction technologies to suit the application characteristics

HPE 3PAR StoreServ deduplication
Advanced inline, in-memory deduplication
- Host writes are held in cache pages to increase write performance
- Duplicates are removed; only unique data is flushed to the SSDs, reducing writes
- The 3PAR ASIC, paired with Express Index lookup tables, provides high-performance, low-latency inline deduplication
- The 3PAR ASIC checks the dedup lookup table to see whether incoming pages are duplicates of existing pages
- Potential duplicates are confirmed with a bit-for-bit check (see the sketch below)
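A minimal Python sketch of this hash-then-verify pattern. On the array the work is done by the 3PAR ASIC and Express Index lookup tables; the dictionaries and function below are illustrative stand-ins only, not the actual implementation.

    import hashlib

    # Minimal sketch of the hash-then-verify dedup pattern described above.
    PAGE_SIZE = 16 * 1024                 # dedup granularity: 16 KiB pages
    dedup_table: dict[bytes, int] = {}    # hash signature -> location of the unique page
    page_store: dict[int, bytes] = {}     # location -> stored page (stand-in for SSD space)

    def write_page(page: bytes) -> int:
        """Store a 16 KiB page, returning the location the write should reference."""
        assert len(page) == PAGE_SIZE
        signature = hashlib.sha256(page).digest()
        location = dedup_table.get(signature)
        if location is not None and page_store[location] == page:
            # Potential duplicate confirmed with a bit-for-bit compare:
            # nothing new is flushed, the write simply references the existing page.
            return location
        # Unique page (or a hash collision): flush it and index its signature.
        location = len(page_store)
        dedup_table[signature] = location
        page_store[location] = page
        return location

    # Two identical 16 KiB writes end up referencing a single stored page.
    first = write_page(b"ABABABAB" * 2048)
    second = write_page(b"ABABABAB" * 2048)
    assert first == second and len(page_store) == 1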

New with HPE 3PAR OS 3.3.1: Compression
Advanced inline, in-memory compression
- HPE 3PAR StoreServ arrays leverage Express Scan technology to prevent wasted CPU cycles
- Host-written data is held in cache pages to increase write performance
- Cache pages are compressed using the CPU
- Compressed pages are written to SSD for permanent storage

HPE 3PAR Adaptive Data Reduction
The HPE 3PAR StoreServ data reduction story
When used together, the Adaptive Data Reduction technologies operate in this order:
1. Deduplication - prevent storing duplicate data
2. Compression - reduce the data footprint
3. Data Packing - pack odd-sized data together
4. Zero Detect - remove zeros inline

Deduplication

TDVV3: A new deduplication format
Why TDVV3? Challenges with TDVV1/TDVV2:
- Space management
  - Analysis of field data showed that most of the data written to a TDVV was not dedupable, yet it was stored in the Dedup Store (DDS)
  - Overwrites generated unreferenced pages in the DDS which needed to be reclaimed
- Performance
  - Reclaiming large amounts of unreferenced pages impacted the performance of all VVs
  - Sparse access patterns on the DDS caused frequent paging of metadata in and out of cache
  - As a result, systems with all TDVVs could be significantly slower than TPVVs
- A new dedup format was needed to address these issues

TDVV3 Enhancements
Space management
- Single-referenced data is kept in the Dedup Client (DDC) store: new writes go to the DDC first, and when a second reference to the data is detected, the data is moved into the DDS
- The hash signature size is increased, allowing quicker detection of collisions
- Solves the problem of SSD preconditioning using sequential overwrites with 1:1 dedup; the EMC PoC Toolkit will no longer run the system out of space
- Significant DDS defrag enhancements; DDS defrag will also be available for 7000 and 10000 systems

A single DDS goes further than ever before
- On average, duplicate data accounts for 10% of capacity consumed (DDS 10%, DDC 90%; maximum DDS size: 64 TiB)
- Given that 10% of stored data is duplicate and 90% is unique, a single 64 TiB TDVV3 DDS is equivalent to a 640 TiB TDVV2 DDS, with 576 TiB of total DDC space alongside the 64 TiB DDS (worked through below)
- As the DDS fills, the probability of a hash collision for data stored in the DDS increases; customers with large TDVV2 DDSs (50 TB and above) saw increased collision rates, which impacted the dedup ratios achieved
- Since single-referenced data is stored in the DDCs with TDVV3, it no longer creates hash collisions that impact dedup ratios
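A short worked restatement of these sizing figures, showing where the 640 TiB and 576 TiB numbers come from; the inputs are the slide's own figures.

    # Worked restatement of the sizing figures above (inputs from the slide).
    dds_max_tib = 64           # maximum DDS size with TDVV3
    duplicate_fraction = 0.10  # share of stored data that is duplicate, on average

    # If only the duplicate 10% has to live in the DDS, one 64 TiB DDS covers:
    total_equivalent_tib = dds_max_tib / duplicate_fraction  # 640.0 TiB of volume data
    ddc_unique_tib = total_equivalent_tib - dds_max_tib      # 576.0 TiB of unique data in DDCs
    print(total_equivalent_tib, ddc_unique_tib)              # 640.0 576.0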

TDVV3 Enhancements
Performance
- Less data in the DDS means less garbage, so less work for reclaim and therefore less impact on I/Os
- Single-referenced data is located in the DDCs, so read performance is much better thanks to local access
- A new metadata caching scheme adapts to the sparse access patterns of DDS metadata
- A new metadata layout reduces the dynamic space allocation overheads

What does the new implementation mean?
Advantages of the new deduplication implementation
- Up to 8x better scalability: better deduplication scalability with more efficient use of the DDS
- Improved savings: storing only duplicate data in the DDS means more chances to deduplicate data
- Simplified management: improved scalability means fewer CPGs for more deduplicated volumes
- Improved performance: increased IOPS and bandwidth, and reduced latency, on all platforms

Dedup I/O flows (diagram summaries)
- TDVV2 Data Dedup example: 16 KiB writes of "ABABABAB" to LBA 0x10FF83 on VV1 and LBA 0x10AC22 on VV2 both hash to 23AC16; the data is compared bit for bit and each VV's L1/L2/L3 tables point at the single shared page in DDS SD space
- TDVV3 Write example: a 16 KiB write of "ABABABAB" to LBA 0x10FF83 is hashed and the data is placed in the VV's own DDC (TPVV SD space) first
- TDVV3 Data Dedup example: a second 16 KiB write of "ABABABAB", to LBA 0x10AC22 on VV2, produces the same hash; after a bit-for-bit compare the page is moved into the DDS and both VVs reference it through their L1/L2/L3 tables
- DDC Read example: an 8 KiB read from LBA 0x10FF93 is served at the appropriate offset from the page held in the VV's DDC, and the data is returned to the host
- DDS Read example: a 16 KiB read from LBA 0x10FF93 follows the L1/L2/L3 tables to the shared DDS page, and the data is returned to the host
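An illustrative Python sketch of the TDVV3 flow summarised in these examples: the first write of a page lands in the volume's private DDC, and a second matching write (hash match confirmed bit for bit) promotes the page to the shared DDS so both volumes reference it. The structures and names below are assumptions for illustration, not the array's implementation.

    import hashlib

    dds: dict[int, bytes] = {}           # shared Dedup Store pages
    hash_index: dict[bytes, tuple] = {}  # signature -> ("ddc", vv, lba) or ("dds", key)

    def tdvv3_write(vv: dict, lba: int, page: bytes) -> None:
        sig = hashlib.sha256(page).digest()
        seen = hash_index.get(sig)
        if seen is None:
            vv[lba] = ("ddc", page)               # first reference: keep it in this VV's DDC
            hash_index[sig] = ("ddc", vv, lba)
            return
        if seen[0] == "ddc":
            owner_vv, owner_lba = seen[1], seen[2]
            _, existing = owner_vv[owner_lba]
            if existing == page:                  # bit-for-bit confirmation
                key = len(dds)                    # second reference: promote the page to the DDS
                dds[key] = page
                owner_vv[owner_lba] = ("dds", key)
                vv[lba] = ("dds", key)
                hash_index[sig] = ("dds", key)
                return
        elif seen[0] == "dds" and dds[seen[1]] == page:
            vv[lba] = ("dds", seen[1])            # further references just point at the DDS page
            return
        vv[lba] = ("ddc", page)                   # hash collision: treat as unique

    vv1, vv2 = {}, {}
    tdvv3_write(vv1, 0x10FF83, b"ABABABAB" * 2048)  # lands in VV1's DDC
    tdvv3_write(vv2, 0x10AC22, b"ABABABAB" * 2048)  # duplicate: promoted to the DDS
    assert vv1[0x10FF83][0] == vv2[0x10AC22][0] == "dds"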

Compression

Compression: data reduction by compressing data
- Compression algorithms work by inspecting data in blocks and removing redundant information
- Within each block there will be repeated data, and often padding around the real data
- Compression removes the repeated data and the padding space to reduce the capacity required to store the data
(Diagram: data segments shown as bit patterns before and after compression)

3PAR Compression Overview
- Supported on 8000 and 20000 (Gen5) systems only; technical limitations prevent compression on Gen4 systems
- Only supported on SSDs
- Compression is enabled on a per-VV basis for thin and dedup volumes
- Uses the existing metadata structure (Express Indexing technology)
- Existing volumes can be compressed with Dynamic Optimization
- Compression is prevented for VVs in an AO configuration
- The minimum size of a compressed VV is 16 GiB

3PAR Compression Overview (continued)
- Compression occurs when pages are flushed from cache to the back-end SSDs
- Only pages belonging to a single VV are compressed together; the pages do not need to belong to contiguous addresses
- Pages belonging to different VVs are not compressed together, and neither are pages belonging to different snapshots of the same base VV
- When data is re-written to a compressed virtual page, the system tries to recompress the data into the existing compressed SD page (a refit); if the new compressed virtual page does not fit into the original page, the virtual page is written to a new compressed or uncompressed SD page
- Up to eight 16 KiB CMPs can be compressed into a single 16 KiB compressed page; the actual number depends on how compressible the data is

Compression algorithm and block size
Compression ratio on proof-of-concept Oracle test data:

Algorithm      | Full file (n:1) | 64 KiB blocks (n:1) | 16 KiB blocks (n:1) | Compression (MB/s per core) | Decompression (MB/s per core)
LZ4            | 3.74            | 3.69                | 3.45                | 441                         | 1460
LZO            | 3.77            | 3.75                | 3.58                | 404                         | 610
DEFLATE (gzip) | 4.73            | 3.64                | 3.47                | 246                         | 133

- Compression ratios are directly related to block size; a 16 KiB block size is a good choice for compression
- LZ4 offers superior performance for only a small reduction in compressibility
- Compressing data with small page sizes is inefficient; for example, when EMC introduced compression in XtremIO 3.0 they were forced to change their block size from 4 KiB to 8 KiB

Storing Compressed I/Os
How does it work? Compressed data page format
- The control buffer header is 256 bytes and contains pointers to the compressed pages
- Each 16 KiB data page can hold up to 8 compressed pages (limited by the available page table entry space)
- Layout: Control Buffer Header (256 B) | Compressed Data 0 | Compressed Data 1 | ... | Compressed Data 7
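A minimal sketch of this page layout: up to eight compressed CMPs packed behind a 256-byte control buffer header into one 16 KiB data page. zlib is used purely as a stand-in compressor, and the function and zeroed header are illustrative assumptions, not the array's on-disk format.

    import zlib

    PAGE_SIZE = 16 * 1024
    HEADER_SIZE = 256       # control buffer header with pointers to the compressed pages
    MAX_ENTRIES = 8         # at most eight compressed pages per 16 KiB data page

    def pack_page(cmps: list[bytes]) -> tuple[bytes, list[tuple[int, int]]]:
        """Pack as many compressed CMPs as fit behind the header; return the page and pointers."""
        payload = b""
        pointers = []                                  # (offset, length) per packed CMP
        for cmp_data in cmps[:MAX_ENTRIES]:
            compressed = zlib.compress(cmp_data)
            if HEADER_SIZE + len(payload) + len(compressed) > PAGE_SIZE:
                break                                  # no room left in this data page
            pointers.append((HEADER_SIZE + len(payload), len(compressed)))
            payload += compressed
        page = bytes(HEADER_SIZE) + payload            # zeroed header stands in for real metadata
        return page.ljust(PAGE_SIZE, b"\x00"), pointers

    cmps = [bytes(16 * 1024) for _ in range(8)]        # eight highly compressible 16 KiB CMPs
    page, pointers = pack_page(cmps)
    print(len(page), len(pointers))                    # 16384 8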

How does it work? Data Packing
- Several 16 KiB uncompressed CMPs are compressed into a 16 KiB compression buffer and packed, behind the control buffer header, into a single 16 KiB data page
(Diagram: three 16 KiB uncompressed CMPs packed as compressed data into one 16 KiB data page)

How does it work? Overwrites
- When an overwrite occurs, the system attempts to refit the new compressed data in place (see the sketch after this list)
- If the new data, once compressed, is smaller than before, it is written to the old location and the compressed virtual pages that come after it are shuffled forwards so there is no gap
- If the new data, once compressed, is larger than before, the compressed virtual pages that come after it are shuffled backwards to make room so it can be written to the old location
- If there is not enough spare space in the CMP to refit the new data, it is written to a new page
- The page can become an uncompressed page if it holds only one virtual page and that page is rewritten with incompressible data
- A new virtual page will never be added to an existing compressed page, even if there is spare room available in the compressed page
(Diagrams: a 16 KiB compressed CMP holding 3 KiB, 2 KiB and 8 KiB entries; rewriting the 2 KiB entry as 1 KiB or 4 KiB of new compressed data refits it in place and shuffles the following entries)
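A sketch of these refit rules, modelling a compressed 16 KiB data page as an ordered list of compressed entry sizes behind a 256-byte header. The function and the decisions it returns are illustrative assumptions only.

    PAGE_SIZE = 16 * 1024
    HEADER_SIZE = 256

    def refit(entry_sizes: list[int], index: int, new_size: int) -> str:
        """Overwrite entry `index` with `new_size` bytes of newly compressed data."""
        used_without = sum(s for i, s in enumerate(entry_sizes) if i != index)
        if HEADER_SIZE + used_without + new_size <= PAGE_SIZE:
            # Fits: write at the old location; later entries are shuffled forwards
            # (new data smaller) or backwards (new data larger) to keep the page packed.
            entry_sizes[index] = new_size
            return "refit in place"
        # Not enough spare space in the CMP: the rewritten virtual page goes to a new
        # compressed (or uncompressed) SD page instead.
        entry_sizes.pop(index)
        return "written to a new page"

    page = [3 * 1024, 2 * 1024, 8 * 1024]   # 3 KiB + 2 KiB + 8 KiB entries, as in the slide
    print(refit(page, 1, 1 * 1024))         # 2 KiB entry rewritten as 1 KiB -> refit in place
    print(refit(page, 1, 4 * 1024))         # rewritten again as 4 KiB       -> refit in place
    print(refit(page, 2, 16 * 1024))        # incompressible rewrite         -> written to a new page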

How does it work? Express Scan
- Express Scan technology is used to prevent wasted CPU cycles when an incompressible page is written to a compressed volume
- The traditional approach is to compress the page and then check the compressed size
- With Express Scan, compression of the page is started but is aborted if the output size exceeds a threshold (currently 51% of 16 KiB); the uncompressed page is then written to the backend (illustrated below)
- This partial compression saves CPU cycles on data with a low compression ratio
(Diagram: a 16 KiB uncompressed CMP feeding a 16 KiB compression buffer, with the 51% abort threshold marked)
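A minimal sketch of the Express Scan idea, using zlib's streaming compressor as a stand-in for the array's algorithm: compression is abandoned as soon as the output exceeds the threshold (51% of a 16 KiB page, per the slide), and the page is stored uncompressed instead. The function name and chunk size are illustrative assumptions.

    import os
    import zlib

    PAGE_SIZE = 16 * 1024
    THRESHOLD = int(PAGE_SIZE * 0.51)   # abort once compressed output passes 51% of 16 KiB
    CHUNK = 2 * 1024

    def express_scan_compress(page: bytes) -> tuple[bytes, bool]:
        """Return (stored_bytes, compressed_flag), giving up early on poorly compressing data."""
        compressor = zlib.compressobj()
        out = b""
        for start in range(0, len(page), CHUNK):
            out += compressor.compress(page[start:start + CHUNK])
            if len(out) > THRESHOLD:
                return page, False      # abort: write the uncompressed page to the backend
        out += compressor.flush()
        if len(out) > THRESHOLD:
            return page, False          # result is still too large: store the page uncompressed
        return out, True                # worth it: store the compressed page

    print(express_scan_compress(b"ABABABAB" * 2048)[1])     # True: stored compressed
    print(express_scan_compress(os.urandom(PAGE_SIZE))[1])  # False: stored uncompressed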

Deduplication and Compression

Deduplication compared with compression
A simpler way to understand the differences and target use cases:
- Deduplication works across datasets
- Compression works within datasets

Deduplication and compression (DECO)
- Compression is also an option for dedup volumes; such volumes are called DECO volumes
- Only data in the DDC volumes is compressed; data in the DDS is uncompressed
- DECO is only applicable to dedup volumes with the TDVV3 format
- TDVV1/TDVV2 keep only sub-16 KiB I/Os and hash collisions in the DDC; all other data, even if it is non-dedupable, goes into the DDS
- TDVV3 keeps all non-deduped data in the DDC, so there can be more savings from compression

Deduplication and compression: the DECO write flow
(Diagram: private space (DDC) holds 90% of the data at a 2:1 compression ratio; shared space (DDS) holds 10% of the data with 10-20 references per page on average; CPG dedup ratio 2:1)
- When new pages are received, a hash is calculated; if a page is unique, it is compressed and written to the DDC
- When a duplicate page is detected, it is written uncompressed to the DDS and a pointer in the L3 exception table points to that location; the existing page's L3 exception is also updated with the DDS location
- The original, compressed page is then marked as invalid and collected during the next GC run
- Only 10% of data is stored in the DDS; that data is referenced between 10 and 20 times on average, giving a 10-20:1 ratio within the DDS, so compressing it would offer limited savings
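An illustrative Python sketch of this DECO write flow: unique pages are compressed into the volume's DDC, a detected duplicate is written uncompressed to the shared DDS, pointers are updated, and the old compressed DDC copy is invalidated for the next GC run. zlib stands in for the real compressor and every structure below is an assumption for illustration.

    import hashlib
    import zlib

    ddc: dict[tuple, bytes] = {}            # (vv, lba) -> compressed page (private space)
    dds: dict[bytes, bytes] = {}            # hash signature -> uncompressed page (shared space)
    l3_exceptions: dict[tuple, bytes] = {}  # (vv, lba) -> DDS signature it points at
    seen: dict[bytes, tuple] = {}           # signature -> first (vv, lba) that wrote the data
    invalidated: list[tuple] = []           # DDC pages awaiting garbage collection

    def deco_write(vv: str, lba: int, page: bytes) -> None:
        sig = hashlib.sha256(page).digest()
        if sig in dds:                            # already shared: just add a pointer
            l3_exceptions[(vv, lba)] = sig
            return
        if sig in seen:                           # duplicate detected
            dds[sig] = page                       # stored uncompressed in the DDS
            first = seen[sig]
            l3_exceptions[first] = sig            # update the existing page's L3 exception
            l3_exceptions[(vv, lba)] = sig        # and point the new write at the DDS too
            invalidated.append(first)             # old compressed copy collected on the next GC run
            ddc.pop(first, None)
            return
        seen[sig] = (vv, lba)                     # unique page: compress it into the DDC
        ddc[(vv, lba)] = zlib.compress(page)

    deco_write("vv1", 0x10, b"ABABABAB" * 2048)   # unique: compressed into vv1's DDC
    deco_write("vv2", 0x20, b"ABABABAB" * 2048)   # duplicate: moved uncompressed to the DDS
    print(len(ddc), len(dds), len(invalidated))   # 0 1 1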

Data Reduction Sizing and Best Practice

Data Reduction Sizing
An equation for Dr Beha: 2:1 deduplication + 2:1 compression = what data reduction ratio? It depends.

Why does it depend?
Example of storing data that is both 2:1 dedupable and 2:1 compressible (worked through below):
- Deduplication only: data written = 10 blocks, data stored = 5 blocks, data reduction ratio = 2:1
- Deduplication + compression: data written = 10 blocks, data stored = 3 blocks, data reduction ratio = 3.3:1
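A worked example reproducing the figures above under one plausible breakdown of the workload: ten 16 KiB blocks are written, six of them copies of a single block and four unique (so 2:1 dedupable), and everything compresses 2:1. The breakdown itself is an assumption chosen to match the slide's numbers, since with DECO the duplicated data sits uncompressed in the DDS while only the unique DDC data is compressed.

    blocks_written = 10
    unique_blocks = 5        # 1 duplicated block + 4 unique blocks -> 2:1 dedup

    # Dedup only: all 5 unique blocks are stored as-is.
    print(blocks_written / unique_blocks)          # 2.0

    # DECO: the duplicated block lives uncompressed in the DDS,
    # while the 4 unique blocks compress 2:1 in the DDC.
    stored = 1 + 4 / 2                             # 3 blocks
    print(round(blocks_written / stored, 1))       # 3.3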

3PAR Data Reduction Calculator
Dedup and Compression Tool: Dedupcrawler

Estimating Data Reduction Savings
Volume type                 | Command line
Compression                 | checkvv -compr_dryrun
Deduplication               | checkvv -dedup_dryrun
Deduplication + Compression | checkvv -dedup_compr_dryrun
Note: there is no Deduplication + Compression estimation available from SSMC 3.1

Data Reduction Best Practice

Volume Type Positioning
(Diagram: provisioning types - Full, Thin, Deduplicated, Compressed, Deduplicated + Compressed - positioned against performance and space savings)

Selective Adaptive Data Reduction
Allowing more efficient use of system resources
- Different data types have different requirements; for each data type, enable the technologies that provide benefits and disable the technologies that don't
- Oracle database: Compressed (2:1)
- Exchange server: Deduplicated + Compressed (1.5:1)
- Compressed video: Thin Provisioned
- VDI environment: Deduplicated + Compressed (2:1+)

When to use what: fully provisioned volumes
Good for:
- Maximum performance
- Customers who don't want to overprovision storage (or manage utilization)
- Host-compressed data
- Host-encrypted data

When to use what: thin provisioned volumes
- Should still be considered the default volume type
Good for:
- Host-compressed data
- Host-encrypted data

When to use what: compressed volumes
Compression is ideal for data that does not have a high level of block redundancy. Data sets that are good candidates for compression include:
- Databases - most databases do not contain redundant data blocks but do have redundant data within blocks, so they can benefit from compression
- Virtual Machine (VM) images - VMs where the application data size far exceeds the size of the operating system binaries may not yield significant deduplication savings but can benefit from compression of the application data
- Virtual Desktop Infrastructure (VDI) - client virtualization environments with hosted non-persistent desktops can achieve excellent compression ratios

When to use what: deduplicated volumes
Deduplication is ideal for data that has a high level of redundancy. Data sets that are good candidates for deduplication include:
- Virtual Machine (VM) images - the operating system binaries from multiple VMs can be reduced to a single copy by deduplication; note that the application data within the VMs may be unique and will therefore not benefit from storage deduplication
- Virtual Desktop Infrastructure (VDI) - client virtualization environments with hosted persistent desktops can achieve excellent deduplication ratios
- Home directory and file shares - users often store copies of the same file in their private workspaces, so storage deduplication can offer significant space savings

Data with a low level of redundancy should not be stored on deduplicated volumes. Data sets that are not good candidates for deduplication include:
- Databases - most databases do not contain redundant data blocks
- Deduplicated data - data that has already been deduplicated on the host will not be compacted further by storage deduplication
- Compressed data - compression creates a stream of unique data that will not benefit from storage deduplication
- Encrypted data - the use of host or SAN encryption also results in a stream of unique data that will not benefit from storage deduplication

When to use what: deduplicated-compressed volumes
Deduplication and compression
