已阅读5页,还剩55页未读, 继续免费阅读
版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
Introduction to Algorithms,III Data Structures,2,Dynamic Sets,Dynamic Sets: Different from mathematical set, the sets manipulated by algorithms can grow, shrink, or otherwise change over time. Algorithms may require several different types of operations to be performed on sets. Dictionary: A dynamic set that only supports the ability to insert elements into, delete elements from, and test membership in a set.,3,Elements of a dynamic set,Typically, each element is represented by an object. An object = Key +satellite data Some dynamic sets presuppose that the keys are drawn from a totally ordered set,4,Operations on dynamic sets,Queries: simply return information about the set SEARCH(S, k), returns a pointer x to an element in S such that keyx = k, or NIL if no such element belongs to S. MINIMUM(S), returns a pointer to the element of S with the smallest key. MAXIMUM(S), returns a pointer to the element of S with the largest key. SUCCESSOR(S, x), returns a pointer to the next larger element in S, or NIL if x is the maximum element. PREDECESSOR(S, x), returns a pointer to the next smaller element in S, or NIL if x is the minimum element. Modifying operations: change the set. INSERT(S, x), augments the set S with the element pointed to by x. DELETE(S, x), given a pointer x to an element in the set S, removes x from S.,5,Overview,The time taken to execute a set operation is usually measured in terms of the size of the set. Next few lectures will focus on data structures rather than straight algorithms Chapter 10 presents the essentials of working with simple data structures such as stacks, queues, linked lists, and rooted trees. Chapter 11 introduces hash tables, which support the dictionary operations INSERT, DELETE, and SEARCH. Chapter 12 Binary search trees Chapter 13 Red-black trees, a variant of binary search trees Chapter 14, augment red-black trees to support operations other than the basic ones,11 Hash Table,7,Hash Tables,Motivation: symbol tables A compiler uses a symbol table to relate symbols to associated data Symbols: variable names, procedure names, etc. Associated data: memory location, call graph, etc. For a symbol table (also called a dictionary), we care about search, insertion, and deletion Typically dont care about sorted order,8,Symbol-table problem,The structure we will use is a hash table Supports all the above in O(1) expected time!,9,Hashing: Keys,consider all keys to be (possibly large) natural numbers How can we convert floats to natural numbers for hashing purposes? How can we convert ASCII strings to natural numbers for hashing purposes?,10,Review,Section 11.1 discusses direct addressing Section 11.2 presents the main ideas, focusing on “chaining“ as a way to handle “collisions“ Section 11.3 describes how array indices can be computed from keys using hash functions Section 11.4 looks at “open addressing,“ which is another way to deal with collisions Section 11.5 introduces “perfect hashing“,11.1 Direct-address tables,12,Direct Addressing,Suppose: The range of keys is 0m-1 Keys are distinct The idea: Set up an array T0m-1 in which Ti = x if x T and keyx = i Ti = NULL otherwise This is called a direct-address table Operations take O(1) time!,13,Direct Addressing,14,Direct Addressing,DIRECT-ADDRESS-SEARCH(T, k) return T k DIRECT-ADDRESS-INSERT(T, x) Tkeyx x DIRECT-ADDRESS-DELETE(T, x) Tkeyx NIL Each of these operations is fast: only O(1) time is required.,15,The Problem With Direct Addressing,Direct addressing works well when the range m of keys is relatively small But what if the keys are 32-bit integers? Problem 1: direct-address table will have 232 entries, more than 4 billion Problem 2: even if memory is not an issue, the time to initialize the elements to NULL may be Solution: map keys to smaller range 0m-1 This mapping is called a hash function,11.2 Hash tables,17,Hash tables,18,Hash tables,19,Hashing: Collisions,20,Hash tables,a well-designed, “random“-looking hash function can minimize the number of collisions a method for resolving the collisions that do occur.,21,Resolving Collisions,How can we solve the problem of collisions? Solution 1: chaining Solution 2: open addressing,22,Chaining,Chaining puts elements that hash to the same slot in a linked list:,T,k4,k2,k3,k1,k5,U (universe of keys),K (actual keys),k6,k8,k7,k1,k4,k5,k2,k3,k8,k6,k7,23,Chaining,How do we insert an element?,T,k4,k2,k3,k1,k5,U (universe of keys),K (actual keys),k6,k8,k7,k1,k4,k5,k2,k3,k8,k6,k7,24,Chaining,T,k4,k2,k3,k1,k5,U (universe of keys),K (actual keys),k6,k8,k7,k1,k4,k5,k2,k3,k8,k6,k7,How do we delete an element? Do we need a doubly-linked list for efficient delete?,25,Chaining,How do we search for a element with a given key?,T,k4,k2,k3,k1,k5,U (universe of keys),K (actual keys),k6,k8,k7,k1,k4,k5,k2,k3,k8,k6,k7,26,Chaining,CHAINED-HASH-INSERT(T, x) insert x at the head of list Th(keyx) CHAINED-HASH-SEARCH(T, k) search for an element with key k in list Th(k) CHAINED-HASH-DELETE(T, x) delete x from the list Th(keyx),27,Analysis of Chaining,Assume simple uniform hashing: each key in table is equally likely to be hashed to any slot Given n keys and m slots in the table: load factor = n/m = average # keys per slot What will be the average cost of an unsuccessful search for a key?,28,Analysis of Chaining,Assume simple uniform hashing: each key in table is equally likely to be hashed to any slot Given n keys and m slots in the table, load factor = n/m = average # keys per slot What will be the average cost of an unsuccessful search for a key? A: O(1+),29,Analysis of Chaining,Assume simple uniform hashing: each key in table is equally likely to be hashed to any slot Given n keys and m slots in the table, load factor = n/m = average # keys per slot What will be the average cost of an unsuccessful search for a key? A: O(1+) What will be the average cost of a successful search?,30,Analysis of Chaining,Assume simple uniform hashing: each key in table is equally likely to be hashed to any slot Given n keys and m slots in the table, the load factor = n/m = average # keys per slot What will be the average cost of an unsuccessful search for a key? A: O(1+) What will be the average cost of a successful search? A: O(1 + /2) = O(1 + ),31,Analysis of Chaining,32,Analysis of Chaining,33,Analysis of Chaining,34,Analysis of Chaining,35,Analysis of Chaining,Thus, the total time required for a successful search (including the time for computing the hash function) is (2 + /2 - /2n) = (1 + ).,36,Analysis of Chaining,So the cost of searching = O(1 + ) If the number of keys n is proportional to the number of slots in the table, what is ? A: = O(1) In other words, we can make the expected cost of searching constant if we make constant,11.3 Hash functions,38,Choosing A Hash Function,Clearly choosing the hash function well is crucial What will a worst-case hash function do? What will be the time to search in this case? What are desirable features of the hash function? Should distribute keys uniformly into slots Should not depend on patterns in the data,39,The Division Method,h(k) = k mod m In words: hash k into a table with m slots using the slot given by the remainder of k divided by m What happens to elements with adjacent values of k? What happens if m is a power of 2 (say 2P)? What if m is a power of 10? Upshot: pick table size m = prime number not too close to a power of 2 (or 10),40,The Multiplication Method,For a constant A, 0 A 1: h(k) = m (kA - kA) ,41,The Multiplication Method,For a constant A, 0 A 1: h(k) = m (kA - kA) Choose m = 2P Choose A not too close to 0 or 1 Knuth: Good choice for A = (5 - 1)/2,Fractional part of kA,42,The Multiplication Method,43,The Multiplication Method,11.4 Open addressing,45,Open Addressing,Basic idea : The hash function: h : U 0, 1, ., m - 1 0, 1, ., m - 1. For every key k, the probe sequence : h(k,0),h(k,1), ., h(k,m - 1),46,Open Addressing,Basic idea : To insert: if slot is full, try another slot, , until an open slot is found (probing) HASH-INSERT(T, k) 1 i 0 2 repeat j h(k, i) 3 if Tj = NIL 4 then Tj k 5 return j 6 else i i + 1 7 until i = m 8 error “hash table overflow“,47,Open Addressing,Basic idea : To search, follow same sequence of probes as would be used when inserting the element If reach element with correct key, return it If reach a NULL pointer, element is not in table HASH-SEARCH(T, k) 1 i 0 2 repeat j h(k, i) 3 if Tj = k 4 then return j 5 i i + 1 6 until Tj = NIL or i = m 7 return NIL,48,Open Addressing,Good for fixed sets (adding but no deletion) Example: spell checking Table neednt be much bigger than n Three techniques are commonly used to compute the probe sequences required for open ddressing: Linear Probing Quadratic Probing Double Hashing,49,Linear Probing,Linear probing : Auxiliary hash function: h : U 0, 1, ., m - 1 The hash function: h(k, i) = (h(k) + i) mod m Disadvantage: primary clustering. Long runs of occupied slots build up, increasing the average search time. Clusters arise since an empty slot preceded by i full slots gets filled next with probability (i + 1)/m.,50,Quadratic Probing,Quadratic probing the hash function: h(k, i) = (h(k) +c1 i+c2i2) mod m Disadvantage: h(k1, 0) = h(k2, 0) implies h(k1, i) = h(k2, i). This property leads to a milder form of clustering, called secondary clustering.,51,Double Hashing,Double Hashing the hash function: h(k, i) = (h1(k) + ih2(k) mod m (m2) probe sequences are used
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2026年甘肃省陇南市卫健口事业单位第一批高层次人才和急需紧缺专业技术人才引进101人考试备考试题及答案解析
- 配电列头柜安装施工方案
- 2026年及未来5年市场数据中国三维葡磷钙咀嚼片行业市场发展数据监测及投资战略规划报告
- 芳香烃生产工岗前实操评估考核试卷含答案
- 电工合金冷变形工岗前工艺分析考核试卷含答案
- 纸面石膏板制备工岗前绩效评估考核试卷含答案
- 2026年云南省烟草专卖局招聘(第二批585人)笔试备考试题及答案解析
- 记号笔制造工安全演练模拟考核试卷含答案
- 2026中国电子科技集团公司第十二研究所校园招聘考试参考题库及答案解析
- 2026四川九洲电器集团有限责任公司招聘系统工程师(系统集成解决岗)等岗位5人考试参考题库及答案解析
- YS/T 337-2009硫精矿
- GB/T 25146-2010工业设备化学清洗质量验收规范
- GB/T 13083-2018饲料中氟的测定离子选择性电极法
- 2023年图书资料中级考试题库
- 中学生物学教学论试题库
- 2023年盐城市大数据集团有限公司招聘笔试题库及答案解析
- 国家开放大学《西方行政学说》形考任务1-4参考答案
- 心脏体格检查血管检查电子教案课件
- 应用文写作:申请书课件
- 临床流行病学的研究设计类型
- 《高等数学(下)》课程教学大纲
评论
0/150
提交评论