




已阅读5页,还剩36页未读, 继续免费阅读
版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
第11讲: (第10章) XML基础知识 重庆大学计算机学院,课程名称: 数据库系统 -,第11讲:XML基础知识,项目驱动目标: 如何采用XML描述应用数据及结构: 一、 什么是XML 二、 XML基础知识 三、 XML数据查询与转换 主要讨论问题: 1. 什么是XML 2. XML有什么优缺点 3. 属性和子元素有何本质不同 4. 如何用DTD说明XML数据的结构 5. 使用DTD有无局限性 6. 什么是XML模式,与DTD有和不同 7. 有哪些工具可以实现XML数据查询与转换 8. XML数据有结构可言,Exercise 11,XML,XML(Extensible Markup Language): Defined by the WWW Consortium联盟 (W3C) Derived from SGML (Standard Generalized Markup Language标准通用标记语言), but simpler to use than SGML 注释:SGML是在Web出现之前于1986年出版发布的一种定义电子文档结构和描述其内容的国际标准语言(ISO 8879) XML与HTML的区别: Documents have tags giving extra information about sections of the document E.g. XML Introduction Extensible, unlike HTML Users can add new tags, and separately specify how the tag should be handled for display XML没有固定的标签集,每个应用可以选择自己的标签集。,1-1 什么是XML?,一 什么是XML,问题1答案,XML tag-标签及用途,The ability to specify new tags, and to create nested tag structures, make XML a great way to exchange data, not just documents. Much of the use of XML has been in data exchange applications, not as a replacement for HTML 在一个应用程序必须与另一个应用程序通信(交换数据)时、或在整合数据时,XML作为一种数据格式特别有用! Tags make data (relatively) self-documenting 自描述 E.g. 教材 P.262 A-101 Downtown 500 Hayes Main Harrison A-101 Johnson ,1-2 XML标签有何用途?,一 什么是XML,返回,XML Motivation,Data interchange is critical in todays networked world Each application area has its own set of standards for representing information XML has become the basis for all new generation data interchange formats Each XML based standard defines what are valid elements, using XML type specification languages to specify the syntax DTD (Document Type Descriptors) XML Schema Plus textual descriptions of the semantics XML allows new tags to be defined as required However, this may be constrained by DTDs A wide variety of tools is available for parsing, browsing and querying XML documents/data,1-3 为什么要引入XML?,一 什么是XML,Comparison with Relational Data,XML不足Inefficient: tags, which in effect represent schema information, are repeated XML优点:Better than relational tuples as a data-exchange format Unlike relational tuples, XML data is self-documenting due to presence of tags Non-rigid format非刚性格式: tags can be added Allows nested structures Wide acceptance, not only in database systems, but also in browsers, tools, and applications,1-4 相比于关系数据, XML有何优点?,一 什么是XML,问题2答案,Structure of XML Data,标签Tag: labels for a section of data 元素Element: section of data beginning with and ending with matching 元素是XML数据的基本结构 允许嵌套:Elements must be properly nested恰当地嵌套 Proper nesting . Improper nesting . Formally: every start tag must have a unique matching end tag, that is in the context上下文 of the same parent element. Every document must have a single top-level element 混合Mixture of text with sub-elements is legal合法的 in XML.,2-1 如何用XML描述数据的结构?,二 XML基础知识,example,example,2.1 XML数据结构,Example of Nested Elements, _ top-level element 教材p.264 Hayes Main Harrison A-102 Perryridge 400 . . 注:前面的bank 文件元素未嵌套,这里的bank-1 文件包含嵌套元素account,2.1 XML数据结构,返回,Example of Mixing text and sub-elements,Mixture of text with sub-elements is legal in XML. Example: This account is seldom used any more. A-102 Perryridge 400 Useful for document markup, but discouraged for data representation,2.1 XML数据结构,9,Motivation for Nesting,Nesting is supported in object-relational databases Nesting is not supported, or discouraged, in relational databases With multiple orders, customer name and address are stored redundantly normalization replaces nested structures in each order by foreign key into table storing customer name and address information But nesting is appropriate when transferring data External application does not have direct access to data referenced by a foreign key Nesting of data is useful in data transfer Example: elements representing customer_id, customer_name, and address nested within an order element,2-2 为什在XML中要引入嵌套?,2.1 XML数据结构,Attributes,Elements can have attributes A-102 Perryridge 400 Attributes are specified by name=value pairs inside the starting tag of an element An element may have several attributes, but each attribute name can only occur once ,2-3 什么是XML属性,如何说明?,2.1 XML数据结构,Attributes vs. Subelements,Distinction between subelement and attribute 文档结构角度看:In the context of documents, attributes are part of markup, while subelement contents are part of the basic document contents 数据表示角度看:In the context of data representation, the difference is unclear and may be confusing Same information can be represented in two ways . A-101 提倡用法Suggestion: use attributes for identifiers of elements, and use subelements for contents,2-4 属性和子元素有何本质不同?,2.1 XML数据结构,问题3答案,Namespaces,问题: XML data has to be exchanged between organizations Same tag name may have different meaning in different organizations, causing confusion on exchanged documents 土办法:Specifying a unique string as an element name avoids confusion XML解决方案 Better solution: use unique-name:element-name Avoid using long unique names all over document by using XML Namespaces Downtown Brooklyn ,2-5 标签名在不同单位的含义不同,如何消歧?,2.1 XML数据结构,XML Document Schema,Database schemas constrain what information can be stored and the data types of stored values XML documents are not required to have an associated schema However, schemas are very important for XML data exchange Otherwise, a site cannot automatically interpret data received from another site Two mechanisms for specifying XML schema Document Type Definition (DTD) Widely used XML Schema Newer, increasing use,2-6 XML数据是否也需要模式?,2-7 如何说明XML数据的模式?,2.2 XML文档模式,Document Type Definition(DTD),The type of an XML document can be specified using a DTD DTD constraints structure of XML data What elements can occur What attributes can/must an element have What subelements can/must occur inside each element, and how many times. DTD does not constrain data types All values represented as strings in XML DTD syntax ,2-8 如何用DTD说明XML数据的结构?,2.3 DTD模式,问题4答案,Element Specification in DTD,Subelements can be specified as names of elements, or #PCDATA (parsed character data), i.e., character strings, or EMPTY (no subelements) or ANY (anything can be a subelement) Example Subelement specification may have regular expressions Notation: “|” - alternatives 或者 “+” - 1 or more occurrences 1个或多个 “*” - 0 or more occurrences 0个或多个,2-9 DTD中子元素如何定义?,example,2.3 DTD模式,储户,Bank DTD, ,2.3 DTD模式,17,Attribute Specification in DTD,Attribute specification: for each attribute 属性申明方式(包括三个部分) Name 属性名称 Type of attribute 属性类型 CDATA ID (identifier) or IDREF (ID reference) or IDREFS (multiple IDREFs) more on this later Whether 属性值默认申明 mandatory (#REQUIRED) 该属性必须指定值 has a default value (value), 属性的默认值 or neither (#IMPLIED) 该属性无默认值,可以被忽略 属性说明例子Examples: ,2-10 DTD中也可说明属性?,2.3 DTD模式,IDs and IDREFs,ID属性说明: An element can have at most one attribute of type ID The ID attribute value of each element in an XML document must be distinct Thus the ID attribute value is an object identifier IDREF& IDREFS属性说明: An attribute of type IDREF must contain the ID value of an element in the same document 一个ID值 An attribute of type IDREFS contains a set of (0 or more) ID values. Each ID value must contain the ID value of an element in the same document 一组ID值,2-11 ID和IDREF属性有何区别?,example,2.3 DTD模式,Bank DTD-with Attributes,Bank DTD with ID and IDREF attribute types. declarations for branch, balance, customer_name, customer_street and customer_city ,2.3 DTD模式,Bank XML data-with ID and IDREF attributes, Downtown 500 Perryridge 900 Joe Monroe Madison Lisa Mountain Murray Hill Mary Erin Newark 注:前面的bank-1 文件元素无属性,这里的bank-2 文件包含属性描述,返回,2.3 DTD模式,Limitations of DTDs,No typing of text elements and attributes All values are strings, no integers, reals, etc. Difficult to specify unordered sets of subelements Order is usually irrelevant in databases (unlike in the document-layout environment from which XML evolved) (A | B)* allows specification of an unordered set, but Cannot ensure that each of A and B occurs only once IDs and IDREFs are untyped 并无类型区分 The owners attribute of an account may contain a reference to another account, which is meaningless owners attribute should ideally be constrained to refer to customer elements,2-12 使用DTD有无局限性?,问题5答案,2.3 DTD模式,XML Schema,象DTD一样,XML Schema 也用于描述 XML 文档的结构。 XML Schema is a more sophisticated精致的/复杂的 schema language which addresses the drawbacks缺点 of DTDs. Supports Typing of values E.g. integer, string, etc Also, constraints on min/max values User-defined, comlex types Many more features, including uniqueness and foreign key constraints, inheritance XML Schema is itself specified in XML syntax, unlike DTDs More-standard representation, but verbose 累赘的/复杂的 XML Scheme is integrated with namespaces BUT: XML Schema is significantly more complicated than DTDs.,2-13 什么是XML模式,与DTD有和不同?,example,2.4 XML模式,问题6答案,XML Schema Version-of Bank DTD, definitions of customer (省略,详见书p.268) ,2.4 XML模式,部分符号说明,返回,定义的新类型被使用,XML Schema 部分符号说明,Choice of “xs:” was ours - any other namespace prefix could be chosen (比如xsd:) Element “bank” has type “BankType”, which is defined separately xs:complexType is used later to create the named complex type “BankType” Element “account” has its type defined in-line (内嵌的复杂类型) 同xs:complexType一样,xs:squence也是一种复杂类型的构造器 Attributes specified by xs:attribute tag: adding the attribute use = “required” means value must be specified,2.4 XML模式,More features of XML Schema,Key constraint: “account numbers form a key for account elements under the root bank element: 指明键元素位置 指明键元素名 Foreign key constraint from depositor to account: 指明参照元素位置 指明被参照元素名 ,2-14 XMLSchema中,也有完整性约束?,2.4 XML模式,查看XML模式,Querying and Transforming XML Data,Translation of information from one XML schema to another Querying on XML data Above two are closely related, and handled by the same tools Standard XML querying/translation languages XPath Simple language consisting of path expressions XQuery An XML query language with a rich set of features *XSLT 虽然被设计用于格式化控制显示/打印,也可生成XML作为输出 Simple language designed for translation from XML to XML and XML to HTML (本课程不讲,有兴趣者参教材p.276),3-1 有哪些工具可以实现XML数据查询与转换?,三 XML数据查询与转换,问题7答案,3.1 XML数据的树结构,Tree Model of XML Data-一种半结构化数据,Query and transformation 转换 languages are based on a tree model of XML data An XML document is modeled as a tree, with nodes corresponding to elements and attributes Element nodes have child nodes, which can be attributes or subelements Text in an element is modeled as a text node child of the element Children of a node are ordered according to their order in the XML document Element and attribute nodes (except for the root node) have a single parent, which is an element node The root node has a single child, which is the root element of the document,3-2 XML数据有结构可言?,3.1 XML数据的树结构,example,问题8答案,Example of Tree Model of XML Data -representing a movie and stars,p.174,may not a tree,3.1 XML数据的树结构,相应XML数据,p.185,XML Data -representing a movie and stars,3.1 XML数据的树结构,XPath,核心思路:XPath is used to address (select) parts of documents using path expressions (路径表达式!) A path expression is a sequence of steps separated by “/” Think of file names in a directory hierarchy 查找结果Result of path expression: set of values that along with their containing elements/attributes match the specified path 例子 E.g. /bank-2/customer/customer_name 取出元素! it evaluated on the bank-2 data we saw earlier returns Joe Lisa Mary E.g. /bank-2/customer/customer_name/text( ) 取出元素的内容! it returns the same names, but without the enclosing tags Joe Lisa Mary,3-3 XPath实现数据查找的核心思路是什么?,3.2 XPath语言,XPath Selection Predicates -选择谓词,The initial第1个 “/” denotes root of the XML document (above the top-level tag) Path expressions are evaluated left to right Each step operates on the set of instances produced by the previous step Selection predicates may follow any step in a path, in E.g. /bank-2/account balance 400 returns account elements with a balance value greater than 400 /bank-2/accountbalance returns account elements containing a balance subelement Attributes are accessed using “” E.g./bank-2/accountbalance 400/account_number returns the account_numbers of accounts with balance 400 注:默认条件下,并不考虑对IDREF链接的引用。 (链接用法在稍后介绍),3-4 XPath支持条件查询?,3.2 XPath语言,相应XML数据,Functions in XPath,XPath provides several functions The function count() at the end of a path counts the number of elements in the set generated by the path E.g. /bank-2/accountcount(./customer) 2 returns accounts with 2 customers Also function for testing position (1, 2, ) of node with respect to siblings 关于兄弟姐们 Boolean connectives and and or and function not() can be used in predicates谓词(约束条件) IDREFs can be referenced using function id() 链接的用法 id() can also be applied to sets of references such as IDREFS and even to strings containing multiple references separated by blanks E.g. /bank-2/account/id(owner) returns all customers (而非account) referred to from the owners attribute of account elements.,3-5 XPath查询支持函数?,3.2 XPath语言,相应XML数据,More XPath Features,Operator “|” used to implement union E.g. /bank-2/account/id(owner) | /bank-2/loan/id(borrower) Gives customers with either accounts or loans However, “|” cannot be nested inside other operators. “/” can be used to skip multiple levels of nodes E.g. /bank-2 / customer_name finds any customer_name element anywhere under the /bank-2 element, regardless of the element in which it is contained. A step in the path can go to parents, siblings, ancestors and descendants of the nodes generated by the previous step, not just to the children “/”, described above, is a short from for specifying “all descendants” “” specifies the parent. doc(name) returns the root of a named document,3-6 XPath还提供什么操作?,3.2 XPath语言,XQuery,XQuery is a general purpose query language for XML data Currently being standardized by the World Wide Web Consortium (W3C) The textbook description is based on a January 2005 draft of the standard. The final version may differ, but major features likely to stay unchanged. XQuery is derived from the Quilt query language, which itself borrows from SQL, XQL and XML-QL XQuery uses a syntax of (包含5个部分,缩写为: FLWOR expression) for let where order by result 各部分的作用与SQL对比: for SQL from where SQL where order by SQL order by result SQL select let allows temporary variables, and has no equivalent in SQL,3-8 XQuery与SQL有哪些相似之处?,3.3 XQuery语言,3-7 你知道XQuery语言的由来?,FLWOR Syntax in XQuery,Simple FLWOR expression in XQuery find all accounts with balance 400, and with each result enclosed in an tag for $x in /bank-2/account let $acctno := $x/account_number where $x/balance 400 return $acctno For clause uses XPath expressions, and (其后的变量) (指定搜索范围) variable $x in for clause ranges over values in the set returned by XPath Where clause gives the constraints as the where clause in SQL (指定约束条件) Items in the return clause are XML texts unless enclosed in , in which case they are evaluated 可能有多个XML元素! (构造查询返回结果) Let clause not really needed in this query, and selection can be done In XPath. Query can be written as: (设置临时变量,以简化表达式) for $x in /bank-2/accountbalance400 return $x/account_number ,3-9 FLWOR表达式各部分有何作用?,3.3 XQuery语言,Joins -元素连接,Joins are specified in a manner very similar to SQL,例如: for $a in /bank/account, $c in /bank/customer, $d in /bank/depositor where $a/account_number = $d/account_number 连接条件 and $c/customer_name = $d/customer_name 连接条件 return $c $a 输出结果为:将同一客户的客户信息customer和账户信息account(存款余额)作为一复合XML元素cust_acct依序输出 (注:原来XML数据中并不是按照每个用户列出信息, 而是先列所有用户的account,然后是所有用户的customer, 然后是所有用户的depositor) *等效表达式: The same query can be expressed with the selections specified as XPath selections: for $a in /bank/account $c in /bank/customer $d in /bank/depositor account_number = $a/account_number and customer_name = $c/customer_name return $c $a ,3-10 XQuery支持连接操作?,3.3 XQuery语言,相应XML模式,相应XML数据,37,Nested Queries,The following query converts data from the flat structure for (相应XML数据) bank information into the nested structure used in bank-1 (相应XML数据) for $c in /bank/customer return $c/* 返回原来customer下所有子元素 for $d in /bank/depositorcustomer_name = $c/customer_name, $a in /bank/accountacco
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 二手房公司行为管理制度
- 智慧停车平台管理制度
- 晋城煤矿班组管理制度
- 汽车课件9-0章节
- 我们的祖国讲课件
- 搞笑视频班会课件大全
- 专题-汉译英-单词或短语(含答案)六年级英语下学期小升初高频考点培优(内蒙古专版)
- 单片机应用教程课件项目三 定时计数器
- 多媒体远程教学设备管理维护日常经验总结
- 妄想症治疗讲课件
- 2025年北方华创招聘笔试参考题库含答案解析
- 期末综合试题 2024-2025学年下期初中英语人教版七年级下册(新教材)
- 2025年甘肃高考真题化学试题(解析版)
- 恶臭的测定作业指导书
- 中国政法大学《中国政治制度史》2023-2024学年第二学期期末试卷
- 2024年上海浦东新区公办学校储备教师教辅招聘真题
- 2025年高考历史全国卷试题评析-教育部教育考试院
- 贵州省贵阳市2023−2024学年度第二学期期末监测试卷高一 数学试题(含解析)
- 井冈山的故事试题及答案
- 公共组织绩效评估-形考任务三(占10%)-国开(ZJ)-参考资料
- 2025年广东高中学业水平合格性考试化学试卷试题(含答案解析)
评论
0/150
提交评论