ParallelDBMS_第1页
ParallelDBMS_第2页
ParallelDBMS_第3页
ParallelDBMS_第4页
ParallelDBMS_第5页
已阅读5页,还剩18页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、1Parallel DBMSSlides adapted from textbook; from Joe Hellerstein; and from Jim Gray, Microsoft Research. DBMS Textbook Chapter 222Why Parallel Access To Data?1 Terabyte10 MB/s At 10 MB/s1.2 days to scan 1 Terabyte1,000 x parallel1.5 minute to scan.Parallelism: divide a big problem into many smaller

2、ones to be solved in parallel. Bandwidth3Parallel DBMS: IntroductionnParallelism natural to DBMS processingPipeline parallelism: many machines each doing one step in a multi-step process. Partition parallelism: many machines doing the same thing to different pieces of data.Both are natural in DBMS!P

3、ipelinePartitionAny Sequential ProgramAny Sequential ProgramSequentialSequentialSequentialSequentialAny Sequential ProgramAny Sequential Programoutputs split N ways, inputs merge M ways4DBMS: The | Success StorynDBMSs are among most (only?) successful application of parallelism.Teradata, Tandem vs.

4、Thinking Machines, Every major DBMS vendor has some | serverWorkstation manufacturers depend on | DB server sales.nReasons for success:Bulk-processing (= partitioned |-ism).Natural pipelining.Inexpensive hardware can do the trick!Users/app-prog. dont need to think in |5Some | TerminologynSpeed-UpMor

5、e resources means proportionally less time for given amount of data.nScale-UpIf resources increased in proportion to increase in data size, time is constant.degree of |-ismXact/sec.(throughput)Idealdegree of |-ismsec./Xact(response time)Ideal6Architecture Issue: Shared What?Shared Memory (SMP)Shared

6、 DiskShared Nothing (network)CLIENTSCLIENTSCLIENTSMemoryProcessorsEasy to programExpensive to buildDifficult to scaleupHard to programCheap to buildEasy to scaleupSequent, SGI, SunVMSclusterTandem, Teradata7What Systems Work This WayShared NothingTeradata: 400 nodesTandem: 110 nodesIBM / SP2 / DB2:

7、128 nodesInformix/SP2 48 nodesATT & Sybase ? nodesShared DiskOracle170 nodesDEC Rdb 24 nodesShared MemoryInformix 9 nodes RedBrick ? nodesCLIENTSMemoryProcessorsCLIENTSCLIENTS(as of 9/1995)8Different Types of DBMS |-ismnIntra-operator parallelismget all machines working to compute a given operat

8、ion (scan, sort, join)nInter-operator parallelismeach operator may run concurrently on a different site (exploits pipelining)nInter-query parallelismdifferent queries run on different site Well focus on intra-operator |-ism 9Automatic Data PartitioningPartitioning a table:RangeHashRound RobinShared-

9、disk and -memory less sensitive to partitioning, Shared nothing benefits from good partitioning !A.E F.J K.N O.S T.ZA.E F.J K.N O.S T.ZA.E F.J K.N O.S T.ZGood for group-by,range queries, andalso equip-joinGood for equijoins Good to spread load;Most flexible - but10Parallel ScansnScan in parallel, an

10、d merge.nSelection may not require access of all sites for range or hash partitioning.nIndexes can be built at each partition.nQuestion: How do indexes differ in the different schemes?Think about both lookups and inserts!11Parallel SortingnNew records again and again:8.5 Gb/minute, shared-nothing; D

11、atamation benchmark in 2.41 secs (UCB students)nIdea: Scan in parallel, and range-partition as you go.As tuples come in, begin “local” sorting on eachResulting data is sorted, and range-partitioned.Problem: skew!Solution: “sample” data at start to determine partition points. 12Jim Gray & Gordon

12、Bell: VLDB 95 Parallel Database Systems SurveyParallel AggregatesA.EF.JK.NO.ST.ZA TableCountCountCountCountCountCountnFor each aggregate function, need a decomposition:count(S) = S S count(s(i), ditto for sum()avg(S) = (S S sum(s(i) / S S count(s(i)nFor groups:Sub-aggregate groups close to source.Pa

13、ss each sub-aggregate to its groups site.13Parallel JoinsnNested loop:Each outer tuple must be compared with each inner tuple that might join.Easy for range partitioning on join cols, hard otherwise!nSort-Merge (or plain Merge-Join):Sorting gives range-partitioning.Merging partitioned tables is loca

14、l.14Parallel Hash JoinnIn first phase, partitions get distributed to different sites:A good hash function distributes work evenlynDo second phase at each site.nAlmost always the winner for equi-join.Original Relations(R then S)OUTPUT2B main memory buffersDiskDiskINPUT1hashfunctionhB-1Partitions12B-1

15、. . .Phase 115Dataflow Network for | JoinnGood use of split/merge makes it easier to build parallel versions of sequential join code.16Complex Parallel Query Plans nComplex Queries: Inter-Operator parallelismPipelining between operators: nsort and phase 1 of hash-join block the pipeline!Bushy TreesA

16、BRSSites 1-4Sites 5-8Sites 1-817NM-way ParallelismA.EF.JK.NO.ST.ZMergeJoinSortJoinSortJoinSortJoinSortJoinSortMergeMergeN inputs, M outputs, no bottlenecks.Partitioned DataPartitioned and Pipelined Data Flows18ObservationsnIt is relatively easy to build a fast parallel query executornIt is hard to w

17、rite a robust high-quality parallel query optimizer:There are many tricks.One quickly hits the complexity barrier.19Parallel Query OptimizationnCommon approach: 2 phasesPick best sequential plan (System R algo)Pick degree of parallelism based on current system parameters.n“Bind” operators to process

18、orsTake query tree, “decorate” with assigned processor20nBest serial plan != Best | plan ! nWhy?nTrivial counter-example:Table partitioned with local secondary index at two nodesRange query: all data of node 1 and 1% of node 2.Node 1 should do a scan of its partition.Node 2 should use secondary index. SELECT * FROM telephone_book WHERE name “NoGood”;Whats Wrong With That?N.ZTableScanA.MIndex Scan21Parallel DBMS Summaryn|-ism natural to query processing:Both pipeline and partition |-ism!nS

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论