数据库系统教学课件:第9讲-关系数据库规范化理论_第1页
数据库系统教学课件:第9讲-关系数据库规范化理论_第2页
数据库系统教学课件:第9讲-关系数据库规范化理论_第3页
数据库系统教学课件:第9讲-关系数据库规范化理论_第4页
数据库系统教学课件:第9讲-关系数据库规范化理论_第5页
已阅读5页,还剩86页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、第11讲: (第8章) 关系数据库规范化理论 重庆大学计算机学院 课程名称: 数据库系统 -1Consider relation obtained (call it SNLRHW) Hourly_Emps(ssn, name, lot, rating, hrly_wage, hrs_worked)What if we know rating determines hrly_wage?2Whats the ProblemCan we change W in just the 1st tuple of SNLRWH? modification anomalyWhat if we want to i

2、nsert an employee and dont know the hourly wage for his rating? Insertion anomalyWhat if we delete all employees with rating 5? We lose the information about the wage for rating 5! Deletion anomaly3Whats the problem, againWhen part of data can be derived from other parts, we say redundancy exists. E

3、xample: the hrly_wage of Smiley can be derived from the hrly_wage of Attishoo because they have the same rating and we know rating determines hrly_wage.Redundancy exists because of the existence of integrity constraints (e.g., FD: R W).4RedundancySince constraints, in particular functional dependenc

4、ies, cause problems, we need to study them, and understand when and how they cause redundancy.When redundancy exists, refinement is needed. Main refinement technique: decomposition (replacing ABCD with, say, AB and BCD, or ACD and ABD). Decomposition should be used judiciously: Is there reason to de

5、compose a relation? What problems (if any) does the decomposition cause?5What do we do?6What do we do? DecompositionA functional dependency (FD) has the form: XY, where X and Y are two sets of attributes. Examples: ratinghrly_wage, AB CThe FD XY is satisfied by a relation instance r if: for each pai

6、r of tuples t1 and t2 in r: t1.X = t2.X implies t1.Y =t2.Y i.e., given any two tuples in r, if the X values agree, then the Y values must also agree. (X and Y are sets of attributes.) Convention: X, Y, Z etc denote sets of attributes, and A, B, C, etc denote attributes.7Functional Dependencies (FDs)

7、The FD holds over relation name R if, for every allowable instance r of R, r satisfies the FD.An FD, as an integrity constraint, is a statement about all allowable relation instances. Must be identified based on semantics of application. Given some instance r1 of R, we can check if it violates some

8、FD f or not But we cannot tell if f holds over R by looking at an instance!Cannot prove non-existence (of violation) out of ignorance This is the same for all integrity constraints!8Functional Dependencies (FDs) The FD XY is NOT satisfied by a relation instance r if: There exists a pair of tuples t1

9、 and t2 in r such that t1.X = t2.X but t1.Y t2.Y i.e., we can find two tuples in r, such that X values agree, but Y values dont.9Violation of FD by a relation Consider relation obtained from Hourly_Emps: Hourly_Emps (ssn, name, lot, rating, hrly_wage, hrs_worked) Notation: We will denote this relati

10、on schema by listing the attributes: SNLRWH This is really the set of attributes S,N,L,R,W,H. Sometimes, we will refer to all attributes of a relation by using the relation name. (e.g., Hourly_Emps for SNLRWH) Some FDs on Hourly_Emps: ssn is the key: S SNLRWH rating determines hrly_wage: R W10Exampl

11、e: Constraints on Entity Set11One more exampleHow many possibleFDs totally on thisrelation instance?12Some other FDsGiven R(A, B, C). AABC means that A is a key. In general,X R means X is a (super)key. How about key constraint?ssn did13Relationship between FDs and Keys Given some FDs, we can usually

12、 infer additional FDs:ssn did, did lot implies ssn lotA BC implies A B An FD f is logically implied by a set of FDs F if f holds whenever all FDs in F hold. F+ = closure of F is the set of all FDs that are implied by F.14Reasoning About FDsHow do we get all the FDs that are logically implied by a gi

13、ven set of FDs? Armstrongs Axioms (X, Y, Z are sets of attributes): Reflexivity (自反律): If X Y, then X Y Augmentation (增广律):If X Y, then XZ YZ for any Z Transitivity (传递律):If X Y and Y Z, then X Z15Reasoning about FDs Armstrongs axioms are sound and complete inference rules for FDs! A = f | 可用Armstro

14、ng公理从F中导出的函数依赖fB = f | 被F所逻辑蕴涵的函数依赖fSound正确性:all the derived FDs (by using the axioms) are those logically implied by the given set.用Armstrong公理从F中导出的函数依赖必为F所蕴涵A BComplete完备性:all the logically implied (by the given set) FDs can be derived by using the axioms. F所蕴涵的函数依赖都能用Armstrong公理从F中导出B A16Armstro

15、ngs axiomsArmstrongs axioms关于Armstrong公理正确性按函数依赖定义r是R上的任一关系,t,s rtX = sXYXtY = sYX Y自反律Armstrongs axiomstXZ = sXZtX = sXXYtY = sYtXZ = sXZtZ = sZtYZ = sYZXZYZ增广律自反律Armstrongs axiomsX YtX = sXtY = sYY ZtZ = sZX Z传递律Armstrongs axioms由Armstrong公理导出的推理规则合并律(union rule)若X Y,X Z,则X YZ分解律(decomposition rul

16、e)若X YZ ,则X Y,X Z伪传递律(pseudotransitivity rule)若X Y,WY Z,则WX ZShow that If X Y and X Z, then X YZ21Derive UnionX Y增广律X XY合并律X Z增广律XY YZ传递律X YZShow that If X YZ, then X Y and X Z22Derive DecompositionZ YZ自反律ZY Z分解律X YZ传递律X Z23Derive pseudotransitivityX Y增广律WX WYWY Z传递律WX Z伪传递律Show that If X Y, WY Z,th

17、en WX Z R = (A, B, C, G, H, I) F = A B; A C; CG H; CG I; B H some members of F+ (how to derive them?) A HAG ICG HI24Derivation ExampleTo compute the closure of a set of functional dependencies F:NOTE: We shall see an alternative procedure for this task later25Procedure for Computing F+F + = Frepeatf

18、or each functional dependency f in F+ apply reflexivity and augmentation rules on f add the resulting functional dependencies to F +for each pair of functional dependencies f1and f2 in F + if f1 and f2 can be combined using transitivity then add the resulting functional dependency to F +until F + do

19、es not change any further F = A B, B C, C D E Step 1: For each f in F, apply reflexivity rule We get: CD C; CD D Add them to F: F = A B, B C, C D E; CD C; CD D Step 2: For each f in F, apply augmentation rule From A B we get: A AB; AB B; AC BC; AD BD; ABC BC; ABD BD; ACD BCD From B C we get: AB AC;

20、BC C; BD CD;ABC AC; ABD ACD, etc.Step 3: Apply transitivity on pairs of fsKeep repeating You get the idea26Example on Computing F+ Computing the closure of a set of FDs can be expensive.(Size of closure is exponential in # of attrs!)Typically, we just want to check if a given FD X Y is in the closur

21、e of a set of FDs F. An efficient check: Compute attribute closure of X (denoted X+) wrt F:Set of all attributes Z such that X Z is in F+There is a linear time algorithm to compute this. Check if Y is in X+ Does F = A B, B C, C D E imply A E? i.e, is A E in the closure F+? Equivalently, is E in A+?2

22、7Reasoning About FDs (Contd.) Input F (a set of FDs), and X (a set of attributes) Output: Result=X+ (under F) Method: Step 1: Result :=X; Step 2: Take Y Z in F, and Y is in Result, do:Result := Result Z Repeat step 2 until Result cannot be changed and then output Result.28Computing X+Does F = A B, B

23、 C, C D E imply A E? i.e, is A E in the closure F+? Equivalently, is E in A+?Step 1: Result = AStep 2: Consider A B, Result = AB Consider B C, Result = ABC Consider CD E, CD is not in ABC, so stopStep 3: A+ = ABCE is NOT in A+, so A E is NOT in F+29Example of Attribute Closure X+F = A B, AC D, AB C?

24、What is X+ for X = A? (i.e. what is the attribute closure for A?)Answer: A+ = ABCD30Example of computing X+R = (A, B, C, G, H, I)F = A B; A C; CG H; CG I; B H (AG)+ = ?Answer: ABCGHIIs AG a candidate key? This question involves two parts:1. Is AG a super key? Does AG R? = Is (AG)+ R2. Is any subset

25、of AG a superkey? Does A R? = Is (A)+ R Does G R? = Is (G)+ R31Example of Attribute ClosureThere are several uses of the attribute closure algorithm: Testing for superkey: To test if X is a superkey, we compute X+, and check if X+ contains all attributes of R.Testing functional dependencies To check

26、 if a functional dependency X Y holds (or, in other words, is in F+), just check if Y X+. That is, we compute X+ by using attribute closure, and then check if it contains Y. Is a simple and cheap test, and very useful Computing closure of F32Uses of Attribute ClosureGiven F= A B, B C. Compute F+ (wi

27、th attributes A, B, C).33Computing F+Step 1: Construct an empty matrix, with all possible combinations of attributes in the rows and columnsStep 2: Compute the attributeclosures for all attribute/combination of attributesStep 3: Fill in the matrix using the results from Step 2Given F= A B, B C. Comp

28、ute F+ (with attributes A, B, C).Well do an example on A+.Step 1: Result = AStep 2: Consider A B, Result = A B = ABConsider B C, Result = AB C = ABCStep 3: A+ = ABC34Computing F+Given F= A B, B C. Compute F+ (with attributes A, B, C).35Computing F+Step 1: Construct an empty matrix, with all possible

29、 combinations of attributes in the rows and columnsStep 2: Compute the attributeclosures for all attribute/combination of attributesStep 3: Fill in the matrix using the results from Step 2.We have A+=ABC. Now fill in the row for A. Consider the first column. Is A part of A+? Yes, so check it. Is B p

30、art of A+? Yes, so check it and so on.Given F= A B, B C. Compute F+ (with attributes A, B, C).36Computing F+An entry with means FD (the row) (the column) is in F+.An entry gets when (the column) is in (the row)+Given F= A B, B C. Compute F+ (with attributes A, B, C).37Computing F+An entry with means

31、 FD (the row) (the column) is in F+.An entry gets when (the column) is in (the row)+ABCTwo sets of FDs are equivalent if they logically imply the same set of FDs. i.e., if F1+ = F2+, then they are equivalent. For example, F1=A B, A C is equivalent to F2=A BCHow to test? Two steps: Every FD in F1 is

32、in F2+ Every FD in F2 is in F1+These two steps can use the algorithm (many times) for X+38Check if two sets of FDs are equivalent Constraints give rise to redundancy Three anomaliesFD is a “popular” type of constraint Satisfaction & violation Logical implication Reasoning Armstrongs Axioms FD infere

33、nce/derivation Computing the closure of FDs (F+) Check for existence of an FD By computing the Attribute closure39SummaryThe first question: Is any refinement needed? Normal forms: If a relation is in a certain normal form (BCNF, 3NF etc.), it is known that certain kinds of problems are avoided/ min

34、imized. This can be used to help us decide whether decomposing the relation will help.Role of FDs in detecting redundancy: Consider a relation R with 3 attributes, ABC.No FDs hold: There is no redundancy here.Given A B: Several tuples could have the same A value, and if so, theyll all have the same

35、B value!40Normal FormsFirst normal form (1NF) Every field must contain atomic values, i.e. no sets or lists. Essentially all relations are in this normal form Second normal form (2NF) Any relation in 2NF is also in 1NF All the non-key attributes must depend upon the WHOLE of the candidate key rather

36、 than just a part of it.It is only relevant when the key is composite, i.e., consists of several fields. e.g. Consider a relation:Inventory(part, warehouse, quantity, warehouse_address). Suppose part, warehouse is a candidate key. warehouse_address depends upon warehouse alone - 2NF violation Soluti

37、on: decompose41Normal FormsBoyce-Codd Normal Form (BCNF) Any relation in BCNF is also in 2NF Third normal form (3NF) Any relation in BCNF is also in 3NF42Normal Forms Reln R with FDs F is in BCNF if for each nontrivial FD X A in F , X is a super key for R(i.e., X R in F+). An FD X A is said to be “t

38、rivial” if A X. In other words, R is in BCNF if the only non-trivial FDs that hold over R are key constraints. If BCNF: No “data” in R can be predicted using FDs alone. Why: Because X is a (super)key, we cant have two different tuples that agree on the X value43Boyce-Codd Normal Form (BCNF)Suppose w

39、e know that this instance satisfies X A. This situation cannot arise if the relation is in BCNF. If R has only two attributes, then it is in BCNFIf F only uses attributes in R, then: R is in BCNF if and only if for each X Y in F (not F+!), X is a superkey of R, i.e., X R is in F+ (not F!).44How do w

40、e know R is in BCNF?List all non-trivial FDs Ensure that left hand side of each FD is a superkey We have to first find all the keys!45Checking for BCNF ViolationsIs Courses(course_num, dept_name, course_name, classroom, enrollment, student_name, address) in BCNF? FDs are: course_num, dept_name cours

41、e_name course_num, dept_name classroom course_num, dept_name enrollment What is (course_num, dept_name)+? course_num, dept_name, course_name, classroom, enrollment Therefore, the key iscourse_num, dept_name, course_name, classroom, enrollment, student_name, address The relation is not in BCNF46Check

42、ing for BCNF ViolationsWhen a relation schema is not in BCNF: decompose. Suppose that relation R contains attributes A1 . An. A decomposition of R consists of replacing R by two or more relations such that: Each new relation schema contains a subset of the attributes of R(and no attributes that do n

43、ot appear in R), and Every attribute of R appears as an attribute of at least one of the new relations.Intuitively, decomposing R means we will store instances of the relation schemas produced by the decomposition, instead of instances of R.47Decomposition of a Relation Schema48Decomposition example

44、 There are three potential problems to consider: Some queries become more expensive.e.g., How much did Attishoo earn? (earn = W*H)Given instances of the decomposed relations, we may not be able to reconstruct the corresponding instance of the original relation! Checking some dependencies may require

45、 joining the instances of the decomposed relations.Tradeoff: Must consider these issues vs. redundancy.49Problems with Decompositions50Example of problem 2Decomposition of R into R1 and R2 is lossless joinw.r.t. a set of FDs F if, for every instance r that satisfies F, we have:It is always true that

46、In general, the other direction does not hold!If it does, the decomposition is lossless-join.51Lossless Join Decompositions52Example (lossy decomposition)53Example (lossless join decomposition)The decomposition of R into R1 and R2 is lossless-join wrt F if and only if F+ contains: R1 R2 R1, or R1 R2

47、 R2 In particular, the decomposition of R into(UV) and (R-V) is lossless-join if U V holds on R54Lossless Join Decomposition Definition extended to decomposition into 3 or more relations in a straightforward way. It is essential that all decompositions used to deal with redundancy be lossless! (Avoi

48、ds Problem (2)55DecompositionConsider relation R with FDs F. If X A in F over R violates BCNF, then: decompose R into R - A and XA. Repeated application of this idea will give us a collection of relations that are in BCNF; lossless join decomposition, and guaranteed to terminate.56Decomposition into

49、 BCNFR = (A, B, C )F = A B; B CKey = A R is not in BCNF (B C but B is not a superkey) DecompositionR1 = (B, C)R2 = (A, B)57BCNF Decomposition ExampleIn general, several dependencies may cause violation of BCNF. The order in which we “deal with” them could lead to very different sets of relations!58B

50、CNF Decomposition R = (A, B, C)F = A B, B C) Can be decomposed in two different waysR1 = (A, B), R2 = (B, C) Lossless-join decomposition:R1 R2 = B and B BC Dependency preserving R1 = (A, B), R2 = (A, C) Lossless-join decomposition:R1 R2 = A and A AB Not dependency preserving(cannot check B C without

51、 computing R1 R2)59Example Regarding Dependency PreservationIn general, there may not be a dependency preserving decomposition into BCNF. Example: schema CSZ (city, street_name, zip_code) with FDs: CS Z, Z C(city, street_name) zip_codezip_code cityCant decompose while preserving CS Z, but CSZ is not

52、 in BCNF.60BCNF and Dependency PreservationDependency preserving decomposition(Intuitive): If R is decomposed into X, Y and Z, and weenforce the FDs that hold on X, on Y and on Z,then all FDs that were given to hold on R mustalso hold. (Avoids Problem (3)61Dependency Preserving DecompositionProjecti

53、on of set of FDs F: If R is decomposed into X, . the projection of F onto X (denoted FX ) is the set of FDs U V in F+ (closure of F ) such that U, V are in X.62What FD on a decomposition?Decomposition of R into X and Y is dependency preserving if (FX union FY ) + = F + i.e., if we consider only depe

54、ndencies in the closure F + that can be checked in X without considering Y, and in Y without considering X, these imply all dependencies in F +.Important to consider F +, not F, in this definitionDependency preserving does not imply lossless join And vice-versa!63Dependency Preserving Decompositions

55、(Contd.) If a relation is in BCNF, it is free of redundancies that can be detected using FDs. Thus, trying to ensure that all relations are in BCNF is a good heuristic. If a relation is not in BCNF, we can try to decompose it into a collection of BCNF relations. It is always possible to decompose a

56、relation into a set of relations that are in BCNF such that: the decomposition is lossless it may not be possible to preserve dependencies.64Summary of BCNFThere are some situations where BCNF is not dependency preserving, and efficient checking for FD violation on updates is importantSolution: defi

57、ne a weaker normal form, called Third Normal Form (3NF) Allows some redundancy But functional dependencies can be checked on individual relations without computing a join. There is always a lossless-join, dependency-preserving decomposition into 3NF.65Next: Third Normal Form If R is in BCNF, obvious

58、ly in 3NF. If R is in 3NF, some redundancy is possible.It is a compromise, used when BCNF not achievable (e.g., no “good” decomposition, or performance considerations). Lossless-join, dependency-preserving decomposition of R into a collection of 3NF relations always possible.66 Third Normal Form (3N

59、F) Relation R with FDs F is in 3NF if, for each FD X A (X R and A R) in F, one of the following statements is true: A X (trivial FD), or X is a superkey, or A is part of some key for R673NFIf one of these two is satisfied for ALL FDs, then R is in BCNF Obviously, the algorithm for lossless join deco

60、mp into BCNF can be used to obtain a lossless join decomp into 3NF (typically, can stop earlier).To ensure dependency preservation, one idea: If X Y is not preserved, add relation XY. Problem is that XY may violate 3NF! Refinement: Instead of the given set of FDs F, use a minimal cover for F.68Decom

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论