版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、PART 4,DATA STORAGE AND QUERY,Chapter 12 Indexing and Hashing,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,3,Contents in This Chapter,Basic concepts about and classification of indexing ordered indices, hash indices Properties/types of ordered indices primary/clustering indi
2、ces, secondary/non-clustering indices dense indices, sparse indices single-level indices, multi-level indices (e.g. B+-tree, B-tree) Hash indices hash functions static hash, dynamic hash,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,4,12.1 Basic Concepts,How to locate records
3、 in DB file quickly? Indexing (索引技术) mechanisms used to speed up access to desired data e.g. for relation account(account-number, branch-name, balance) shown in Fig.12.1, the index branch-name physical address of record (i.e.tuple) in DB file account Search Key attributes or a set of attributes used
4、 to look up the records in a file e.g. branch-name,A-217,A-110,A-101,A-215,A-201,A-218,A-102,A-222,A-305,Fig. 12.1 DB indexed file account and its index file,Note: the file account is logically a sequential file, but its records may be stored non-contiguously or non-ordered on the disk,logical file
5、account,physical file account,index file,(account-number, branch-name, balance),indexed file,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,6,12.1 Basic Concepts (cont.),Indexing mapping from search-key to storage locations of the file records, i.e. search-key storage location
6、s of the records in disks,搜索键 (search key),索引文件 (index file),数据文件 (主文件、被索引文件indexed file),s1 s2 s3 . . si . . sj . sn,散列 函数H,s1 s2 s3 . . si . . sj . sn,a3,a2,a1,ai,aj,am,an-1,an,b3,b2,b1,bi,bj,bm,bn-1,bn,s3,s2,s1,si,sj,sm,sn-1,sn,d3,d2,d1,di,dj,dm,dn-1,dn,R (A, B, S, . , , D),h(si),h(sn),h(s1),索引项
7、index entry,图 12.0.1 索引技术(indexing)及其分类,ordered indices,hash indices,搜索键 (search key),May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,8,Two basic kinds of indices: ordered indices: the index file is used to store the index entries in which the search key of the records and the
8、address of the records are stored in sorted order hash indices: the “hash function” is used to map the the search key of the records to the address of the records the records are stored in the “buckets”, the number of the bucket is as the address of the records and is determined by the hash function
9、,12.1 Basic Concepts (cont.),May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,9,DBS file with indexing mechanism include two parts : indexed file, in which data records are stored index file, in which index entries are included e.g. Fig.12.1 The indexed file can be organized as
10、sequential file heap file hash file clustering file,12.2 Ordered Indices,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,10,12.1 Basic Concepts (cont.),Index file set of index entries of the form Index files are typically much smaller than the original file,May 2011,Database Sy
11、stem Concepts - Chapter 12 Indexing and Hashing -,11,Ordered index (in index file) the index entries in the index file are stored in some sorted order, for instance, in accordance with the order of the search key /* 索引项的排列顺序与搜索键的排列顺序一致 e.g. in Fig.12.1( ) , the index file is sorted by branch-name,12
12、.2 Ordered Indices (cont.),May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,12,Primary/clustering index (in index file) considering a index file and its corresponding indexed file. if the indexed file is a sequential file, and the search key in index file also specifies the sequ
13、ential order of the indexed file /* 索引文件的搜索键所规定的顺序与被索引的顺序文件中的纪录顺序一致 e.g. in Fig.12.1 , (branch-name, address of records) defines the same orders of the records as that in sequential indexed file account note: also called clustering index,12.2-I Primary and Secondary Indices,May 2011,Database System
14、Concepts - Chapter 12 Indexing and Hashing -,13,The search key of a primary index is usually but not necessarily the primary key e.g. in Fig. 12.1, branch-name is not the primary key of account,12.2-I Primary and Secondary Indices (cont.),May 2011,Database System Concepts - Chapter 12 Indexing and H
15、ashing -,14,在实际数据库系统中,如SQL Server、DB2等, 聚集索引是指索引文件的搜索键所规定的顺序与被索引的顺序文件中的纪录顺序一致 主索引是指建立在主键(主属性上)的索引 当一个表定义了主键后,系统会自动为该表在主键上建立聚集索引,该索引同时又是主索引 一个表上只能建立一个聚集索引,但可以建立多个非聚集索引 如果一个表上没有定义主键,则不会有主索引;但可以为该表建立一个聚集索引,12.2-I Primary and Secondary Indices (cont.),May 2011,Database System Concepts - Chapter 12 Index
16、ing and Hashing -,15,Secondary index an index whose search key specifies an order different from the sequential order of the file. also called non-clustering index e.g. Fig.12.5 , secondary index on balance field of account Secondary indices have to be dense indices Index-sequential file (索引顺序文件) or
17、dered sequential file with a primary/clustering index on the search key e.g. Fig.12.1,12.2-I Primary and Secondary Indices (cont.),May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,16,Fig.12.5 Secondary Index on balance field of account,12.2-I Primary and Secondary Indices (cont.
18、),May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,17,Dense index the index record of the index file appears for every search-key value in the indexed file each value of search-key in the indexed file corresponds to an index entry in the index file e.g. Fig.12.1 In a dense prima
19、ry index, the index record contains the search-key value and a pointer to the first record with that search-key value the rest of the records with the same search-key value would be stored sequentially after the first record e.g. Fig.12.1, search-key value “Perryridge” corresponds to three records i
20、n the file,12.2-II Dense and Sparse Indices,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,18,Sparse Index index file contains index entries for only some search-key values in the indexed file e.g. Fig.12.3 With respect to the sparse index, to locate a file record with search-
21、key value K find index entry with largest search-key value K search file sequentially starting at the record to which this index entry points,12.2-II Dense and Sparse Indices (cont.),May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,19,Fig.12.3 Sparse index for the file account,i
22、ndex file,indexed file,12.2-II Dense and Sparse Indices,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,20,The indices in Fig.12.1, Fig.12.3, and Fig.12.5 are single-level indices The index file may be very large, and cannot be entirely kept in memory To access the index file q
23、uickly, the index file is stored on disk as a sequential file and construct a sparse index on this sequential index file outer index a sparse index of primary index file inner index the primary index file If even outer index is too large to fit in main memory, yet another level of index can be creat
24、ed, and so on.,12.2-III Single-level and Multi-level Indices,Fig. 12.4 Two-level sparse index,indexed file,index file,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,22,B+-tree indices and B-tree indices are two types of efficient multi-level indices, and widely used in DBS,.,1
25、2.2-III Single-level and Multi-level Indices,Fig. An Example of B+-tree index,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,23,The file records are stored in a set of buckets a bucket is a unit of storage containing one or more records (a bucket is typically a disk block). Ha
26、sh function h a function from the set of all search-key values K in a file to the set of all bucket addresses, i.e. the set of the addresses of file records typical hash functions perform computation on the internal binary representation of the search-key. Hash file organization obtaining the bucket
27、 of a record directly from its search-key value using a hash function,12.5 Hashing Index Files,Fig.12.21 Hash file organization of account file, with branch-name as key,Hash function:returns the sum of the binary representations of the characters modulo 10,May 2011,Database System Concepts - Chapter
28、 12 Indexing and Hashing -,25,Handling of Bucket Overflows,Bucket overflow (溢出) can occur because of insufficient buckets skew in distribution of records. This can occur due to two reasons: multiple records have same search-key value chosen hash function produces non-uniform distribution of key valu
29、es Although the probability of bucket overflow can be reduced, it cannot be eliminated; it is handled by using overflow buckets.,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,26,Handling of Bucket Overflows (cont.),Overflow chaining the overflow buckets of a given bucket are chained together in a linked list. Above scheme is called closed hashing.,Fig. 12.22 Overflows chaining in an hash structure,May 2011,Database System Concepts - Chapter 12 Indexing and Hashing -,27,Hash Indices,Hashing can be used not only for file organization, but also f
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 江西省九江市名校2026届初三2月模拟(一)语文试题含解析
- 新疆维吾尔自治区阿克苏地区沙雅县2026届初三“停课不停学”阶段性检测试题数学试题含解析
- 江西省鹰潭市名校2026届初三(高补班)上-期中考试语文试题试卷含解析
- 湖南省怀化市会同一中学、溆浦一中学2026年初三第一次诊断考试语文试题含解析
- 黑龙江省大庆市肇源市级名校2026年初三第一次检测试题语文试题(慢班)含解析
- MT-T 146-2025 树脂锚杆树脂锚杆
- 2026年喜茶跨界联名与潮流文化营销案例解析
- 2026年互联网公司新员工入职培训方案
- 2026年机关干部法制教育报告总结
- 江西版美术四年级下册教案
- 2026学校防范电信网络诈骗“无诈校园”建设工作方案(完整版)
- 北京化工集团招聘26人笔试备考试题及答案解析
- 急性脑卒中绿色通道急救规程
- 纯电动汽车原理与检修-宝骏E100
- 2026年及未来5年中国石墨碳素行业市场需求预测及投资战略规划报告
- 2025年四川大学mba面试题库及答案
- T/CECS 10143-2021高分子量高密度聚乙烯(HMWHDPE)双波峰缠绕结构壁排水管
- DL∕T 1616-2016 火力发电机组性能试验导则
- 玻璃钢化粪池施工方案(化粪池)
- 2023年黑龙江省学位英语历年考试真题
- 安全生产考试中心工作制度
评论
0/150
提交评论