版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、COMP231Query OptimizationPrepared by Raymond WongPresented by Raymond WongCOMP2311OutlineIntroductionMaterialization and On-the-flyJoin over Two TablesJoin over Multiple TablesCOMP2312IntroductionAfter users write SQL statements, They want to obtain the results quicklyE.g., student (sid, sname, age)
2、 take (sid, cid) select sname from student S, take T where S.sid = T.sid and T.cid = “231” and S.age 20However, there are many ways to obtain the resultsFind the name of all students with age greater than 20 who take 231. COMP2313SQL select sname from student S, take T where S.sid = T.sid and T.cid
3、= “231” and S.age 20Translate the query to relational algebra operations sname(cid=231age20(Take Take.sid=Student.sid Student)sname(cid=231age20(Take Take.sid=Student.sid Student)COMP2314Evaluation Plan 1Take TStudent ST.sid=S.sidcid=231 age 20snameEvaluation planNodes are relational operators or ta
4、blesEdges point to where the input comes fromsname(cid=231age20(Take Take.sid=Student.sid Student)COMP2315Evaluation Plan 2Take TStudent ST.sid=S.sidcid=231 snameage 20sname(cid=231age20(Take Take.sid=Student.sid Student)COMP2316Evaluation Plan 3Student STake TT.sid=S.sidcid=231 snameage 20sname(cid
5、=231age20(Take Take.sid=Student.sid Student)COMP2317Evaluation Plan 4Take TStudent Ssid=sidcid=231 snameage 20sname(cid=231age20(Take Take.sid=Student.sid Student)COMP2318DBMSs job is to optimize the users query byConverting the query to a number of evaluation plansEstimate the cost of each evaluati
6、on planFind the evaluation plan with the lowest costsname(cid=231age20(Take Take.sid=Student.sid Student)COMP2319OutlineIntroductionMaterialization and On-the-flyJoin over Two TablesJoin over Multiple TablesCOMP23110Materialization and On-the-flyLet op1 and op2 be two relational algebra operations,
7、and op1 is performed on the result of op2The evaluation of op1 is on-the-fly if the result of op2 is directly sent to op1 (i.e, not stored in a temporary file) It is materialized if that result is stored in a temporary file firstOn-the-fly evaluation is called pipelined evaluationOn-the-fly can be m
8、ore efficient than materialized sname(cid=231age20(Take Take.sid=Student.sid Student)COMP23111Materialization and On-the-flyMaterializationOn-the-flysname(cid=231age20(Take Take.sid=Student.sid Student)COMP23112Materializationsname(cid=231age20(Take Take.sid=Student.sid Student)Disksidsnameage100Ray
9、mond20200Peter21sidcid100231200231200170StudentTakeMemoryReadingsidsnameage100Raymond20200Peter21sidcid100231200231200170Joinsidsnameagecid100Raymond20231200Peter21231200Peter21170Writingsidsnameagecid100Raymond20231200Peter21231200Peter21170MaterializationCOMP23113Materializationsname(cid=231age20(
10、Take Take.sid=Student.sid Student)Disksidsnameage100Raymond20200Peter21sidcid100231200231200170StudentTakeMemorysidsnameagecid100Raymond20231200Peter21231200Peter21170COMP23114Materializationsname(cid=231age20(Take Take.sid=Student.sid Student)Disksidsnameage100Raymond20200Peter21sidcid1002312002312
11、00170StudentTakeMemorysidsnameagecid100Raymond20231200Peter21231200Peter21170Readingsidsnameagecid100Raymond20231200Peter21231200Peter21170sidsnameagecid200Peter21231snamePetercid=231age20snameOutput to the screenCOMP23115Materialization and On-the-flyMaterializationOn-the-flysname(cid=231age20(Take
12、 Take.sid=Student.sid Student)COMP23116On-the-flysname(cid=231age20(Take Take.sid=Student.sid Student)Disksidsnameage100Raymond20200Peter21sidcid100231200231200170StudentTakeMemoryReadingsidsnameage100Raymond20200Peter21sidcid100231200231200170Joinsidsnameagecid100Raymond20231200Peter21231200Peter21
13、170COMP23117On-the-flysname(cid=231age20(Take Take.sid=Student.sid Student)Disksidsnameage100Raymond20200Peter21sidcid100231200231200170StudentTakeMemorysidsnameagecid100Raymond20231200Peter21231200Peter21170sidsnameagecid200Peter21231snamePetercid=231age20snameOutput to the screenCOMP23118OutlineIn
14、troductionMaterialization and On-the-flyJoin over Two TablesJoin over Multiple TablesCOMP23119Example SchemaAssuming the following table sizes:Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tupleAssume that there are 200 courses in table TakeStudent: 50
15、0 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table StudentCOMP23120Example SchemaJoinSimple-nested loop JoinBlock-nested loop JoinSort-Merge JoinIndex-Nested loop JoinStudent: 500 pages, 80 tuples/p
16、age, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table StudentCOMP23121Evaluation Plan 1Take TStudent ST.sid=S.sidcid=231 age 20sname(File scan)(File scan)(On-the-fly)(On-the-fly)(Simple-NestedLoop Join)Student: 500 pages,
17、 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tupleCost = 500+500*1000 =500500 pagesWe do NOT calculate the cost of writing the output. Why?- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table StudentSuppose that Student is the outer relationCOMP23122Evaluati
18、on Plan 1Take TStudent ST.sid=S.sidcid=231 age 20sname(File scan)(File scan)(On-the-fly)(On-the-fly)(Simple-NestedLoop Join)Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tupleCost = 1000+1000*500 =501000 pages- 200 courses in table Take- 1, 2, , 40 in
19、attribute Ageof table StudentSuppose that Take is the outer relationIf Student is the outer relation,then the cost is 500500 pagesIf Take is the outer relation,then the cost is 501000 pagesWhich one should we choose?COMP23123Evaluation Plan 2Take TStudent ST.sid=S.sidcid=231 snameage 20(File scan)(F
20、ile scan)(Write to Temp T1)(Write to Temp T2)(Simple-NestedLoop Join)(On-the-fly)Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table StudentCOMP23124Evaluation Plan 2Take TStudent ST.sid=S
21、.sidcid=231 snameage 20(File scan)(File scan)(Write to Temp T1)(Write to Temp T2)(Simple-NestedLoop Join)(On-the-fly)Cost of Reading T = 1000 pagesStudent: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tupleSize of T1 = 1000/200 = 5 pages- 200 courses in table
22、Take- 1, 2, , 40 in attribute Ageof table StudentCost of Writing T1 = 5 pagesCost of Reading S = 500 pagesSize of T2 = 500/2 = 250 pagesSuppose that T1 is theouter relationCost of Join = 5 + 5*250= 1255 pagesTotal cost = 1000 + 5 + 500 + 250 + 1255 = 3010 pagesCost of Writing T2 = 250 pagesCOMP23125
23、Evaluation Plan 2Take TStudent ST.sid=S.sidcid=231 snameage 20(File scan)(File scan)(Write to Temp T1)(Write to Temp T2)(Simple-NestedLoop Join)(On-the-fly)Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in at
24、tribute Ageof table StudentCOMP23126Which evaluation plan is better?Plan 1 or Plan 2?Why?COMP23127Evaluation Plan 3Student STake TT.sid=S.sidcid=231 snameage 20Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 i
25、n attribute Ageof table Student(File Scan)(Write to temp T1)(File Scan)(Simple-NestedLoop Join)(On-the-fly)(On-the-fly)Similar derivations can be done.COMP23128Evaluation Plan 3Student STake TT.sid=S.sidcid=231 snameage 20Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples
26、/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table Student(File Scan)(Write to temp T1)(File Scan)(Simple-NestedLoop Join)(On-the-fly)(On-the-fly)COMP23129Evaluation Plan 4Take TStudent Ssid=sidcid=231 snameage 20Student: 500 pages, 80 tuples/page, 50 bytes/tupleTa
27、ke: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table Student(File Scan)(Write to temp T1)(File Scan)(Simple-NestedLoop Join)(On-the-fly)(On-the-fly)Similar derivations can be done.COMP23130Example SchemaJoinSimple-nested loop JoinBlock-neste
28、d loop JoinSort-Merge JoinIndex-Nested loop JoinStudent: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table StudentCOMP23131Evaluation Plan 2Take TStudent ST.sid=S.sidcid=231 snameage 20(File scan
29、)(File scan)(Write to Temp T1)(Write to Temp T2)(Sort-Merge Join)(On-the-fly)Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table StudentCost of Sorting = 2N * (# of passes)where no. of pas
30、ses=Size of the buffer = 3 pagesCOMP23132Evaluation Plan 2Take TStudent ST.sid=S.sidcid=231 snameage 20(File scan)(File scan)(Write to Temp T1)(Write to Temp T2)(Sort-Merge Join)(On-the-fly)Cost of Reading T = 1000 pagesStudent: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/p
31、age, 40 bytes/tupleSize of T1 = 1000/200 = 5 pages- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table StudentCost of Writing T1 = 5 pagesCost of Reading S = 500 pagesSize of T2 = 500/2 = 250 pagesTotal cost = 1000 + 5 + 500 + 250 + 20 + 4000 + 255 = 6030 pagesSize of the buffer = 3 page
32、sCost of Sorting T1 = 2*5*( )= 20 pagesCost of Sorting T2 = 2*250*( )= 4000 pagesCost of Merging = cost of reading T1 and T2= 5 + 250 = 255 pagesCost of Reading S = 500 pagesCost of Writing T2 = 250 pagesCOMP23133Evaluation Plan 3Take TStudent ST.sid=S.sidage 20 snamecid=231Student: 500 pages, 80 tu
33、ples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table Student(File Scan)(Write to temp T1)(Sort-MergeJoin)(On-the-fly)(On-the-fly)Size of the buffer = 3 pagesAssume that the tuples in Table Student are sorted accord
34、ing to sidCOMP23134Evaluation Plan 3Take TStudent ST.sid=S.sidage 20 snamecid=231Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table Student(File Scan)(Write to temp T1)(Sort-MergeJoin)(On
35、-the-fly)(On-the-fly)Size of the buffer = 3 pagesCost of Reading T = 1000 pagesSize of T1 = 1000/200 = 5 pagesCost of Writing T1 = 5 pagesTotal cost = 1000 + 5 + 20 + 505 = 1530 pagesCost of Sorting T1 = 2*5*( )= 20 pagesCost of Merging = Cost of Reading T1 and S= 5 + 500 = 505 pagesSince S is sorte
36、d according to sid,we do NOT need to sort SCOMP23135Example SchemaJoinSimple-nested loop JoinBlock-nested loop JoinSort-Merge JoinIndex-Nested loop JoinStudent: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attrib
37、ute Ageof table StudentSize of the buffer = 3 pagesCOMP23136Evaluation Plan 4Take TStudent Ssid=sidcid=231 snameage 20Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in attribute Ageof table Student(Use Hash I
38、ndex on cid; Do not write Result to temp)(Hash Index on sid)(Index Nested LoopJoin with Pipelining)(On-the-fly)(On-the-fly)Size of the buffer = 3 pagesAssume the following.T is sorted according to cidT has a clustered hash index on cidS has an unclustered hash index on sidAssume that the (average) c
39、ost of reading a hashindex is 1.2 pagesCOMP23137(Use Hash Index on cid; Do not write Result to temp)Evaluation Plan 4Take TStudent Ssid=sidcid=231 snameage 20Student: 500 pages, 80 tuples/page, 50 bytes/tupleTake: 1000 pages, 100 tuples/page, 40 bytes/tuple- 200 courses in table Take- 1, 2, , 40 in
40、attribute Ageof table Student(Hash Index on sid)(Index Nested LoopJoin with Pipelining)(On-the-fly)(On-the-fly)Size of the buffer = 3 pagesCost of Reading T with Hash Index = Cost of Reading Hash Index + Cost of Reading T containing tuples with cid = 231= 1.2 + 5 = 6.2 pagesSize of the temp result =
41、 1000/200 = 5 pagesTotal cost = 6.2 + 1100 = 1106.2 pagesCost of Index Nested Loop Join= Cost of Reading S with Hash Index= 500*(1.2+1)= 1100 pagesNo. of tuples in T with cid = 231 = 100*5 = 500 COMP23138OutlineIntroductionMaterialization and On-the-flyJoin over Two TablesJoin over Multiple TablesCOMP23139Join over Multiple TablesRelational Algebra EquivalencesLeft-Deep PlanCOMP23140Allow us to choose dif
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2026年酒店服务质量评价与改进标准题
- 2026年抢答模式哲学思想问题解答集
- 2026年乡镇道路清扫保洁质量题库
- 2026年四川单招资源环境类题库
- 2026年计算机基础理论知识宝典试题
- 2026年各乡镇林区火源管控题库
- 2026年公司质量检测与管理操作流程解析
- 2026四川创锦发展控股集团有限公司招聘项目负责人的1人备考题库附答案详解(a卷)
- 2026陕西省定向延安“优师计划地方专项”师范毕业生招聘备考题库(30人)附答案详解(满分必刷)
- 2026安徽马鞍山市教育系统部分中小学校园招聘20人备考题库(南京师范大学考点)及参考答案详解一套
- 三角龙科普教学课件
- 酒后上岗安全培训课件教学
- 古诗词诵读《锦瑟》课件+2025-2026学年统编版高二语文选择性必修中册
- 2025年及未来5年市场数据中国化纤行业发展趋势及投资前景预测报告
- 《接纳不完美的我拥抱自我自信成长》班会课活动
- 2025年海南辅警招聘考试真题附答案详解(考试直接用)
- 精神病人肇事警情处置规范
- 2026年河南工业职业技术学院单招职业倾向性测试必刷测试卷新版
- 潮汕功夫茶课件
- 安全工器具管理制度
- 图书馆物业管理服务采购项目方案投标文件(技术方案)
评论
0/150
提交评论