




已阅读5页,还剩12页未读, 继续免费阅读
版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
学校代码: 10128学 号:8 本科毕业设计外文文献翻译(英文题目:Software Database An Object-Oriented Perspective.中文题目:软件数据库的面向对象的视角学生姓名:宋兰兰学 院:信息工程学院系 别:软件工程系专 业:软件工程班 级:软件09-1指导教师:关玉欣 讲师二一三 年 六 月A HISTORICAL PERSPECTIVEFrom the earliest days of computers, storing and manipulating data have been a major application focus. The first general-purpose DBMS was designed by Charles Bachman at General Electric in the early 1960s and was called the Integrated Data Store. It formed the basis for the network data model, which was standardized by the Conference on Data Systems Languages (CODASYL) and strongly influenced database systems through the 1960s. Bachman was the first recipient of ACMs Turing Award (the computer science equivalent of a Nobel prize) for work in the database area; he received the award in 1973. In the late 1960s, IBM developed the Information Management System (IMS) DBMS, used even today in many major installations. IMS formed the basis for an alternative data representation framework called the hierarchical data model. The SABRE system for making airline reservations was jointly developed by American Airlines and IBM around the same time, and it allowed several people to access the same data through computer network. Interestingly, today the same SABRE system is used to power popular Web-based travel services such as Travelocity!In 1970, Edgar Codd, at IBMs San Jose Research Laboratory, proposed a new data representation framework called the relational data model. This proved to be a watershed in the development of database systems: it sparked rapid development of several DBMSs based on the relational model, along with a rich body of theoretical results that placed the field on a firm foundation. Codd won the 1981 Turing Award for his seminal work. Database systems matured as an academic discipline, and the popularity of relational DBMSs changed the commercial landscape. Their benefits were widely recognized, and the use of DBMSs for managing corporate data became standard practice.In the 1980s, the relational model consolidated its position as the dominant DBMS paradigm, and database systems continued to gain widespread use. The SQL query language for relational databases, developed as part of IBMs System R project, is now the standard query language. SQL was standardized in the late 1980s, and the current standard, SQL-92, was adopted by the American National Standards Institute (ANSI) and International Standards Organization (ISO). Arguably, the most widely used form of concurrent programming is the concurrent execution of database programs (called transactions). Users write programs as if they are to be run by themselves, and the responsibility for running them concurrently is given to the DBMS. James Gray won the 1999 Turing award for his contributions to the field of transaction management in a DBMS.In the late 1980s and the 1990s, advances have been made in many areas of database systems. Considerable research has been carried out into more powerful query languages and richer data models, and there has been a big emphasis on supporting complex analysis of data from all parts of an enterprise. Several vendors (e.g., IBMs DB2, Oracle 8, Informix UDS) have extended their systems with the ability to store new data types such as images and text, and with the ability to ask more complex queries. Specialized systems have been developed by numerous vendors for creating data warehouses, consolidating data from several databases, and for carrying out specialized analysis.An interesting phenomenon is the emergence of several enterprise resource planning(ERP) and management resource planning (MRP) packages, which add a substantial layer of application-oriented features on top of a DBMS. Widely used packages include systems from Baan, Oracle, PeopleSoft, SAP, and Siebel. These packages identify a set of common tasks (e.g., inventory management, human resources planning, financial analysis) encountered by a large number of organizations and provide a general application layer to carry out these tasks. The data is stored in a relational DBMS, and the application layer can be customized to different companies, leading to lower Introduction to Database Systems overall costs for the companies, compared to the cost of building the application layer from scratch. Most significantly, perhaps, DBMSs have entered the Internet Age. While the first generation of Web sites stored their data exclusively in operating systems files, the use of a DBMS to store data that is accessed through a Web browser is becoming widespread. Queries are generated through Web-accessible forms and answers are formatted using a markup language such as HTML, in order to be easily displayed in a browser. All the database vendors are adding features to their DBMS aimed at making it more suitable for deployment over the Internet. Database management continues to gain importance as more and more data is brought on-line, and made ever more accessible through computer networking. Today the field is being driven by exciting visions such as multimedia databases, interactive video, digital libraries, a host of scientific projects such as the human genome mapping effort and NASAs Earth Observation System project, and the desire of companies to consolidate their decision-making processes and mine their data repositories for useful information about their businesses. Commercially, database manage- ment systems represent one of the largest and most vigorous market segments. Thusthes- tudy of database systems could prove to be richly rewarding in more ways than one!INTRODUCTION TO PHYSICAL DATABASE DESIGNLike all other aspects of database design, physical design must be guided by the nature of the data and its intended use. In particular, it is important to understand the typical workload that the database must support; the workload consists of a mix of queries and updates. Users also have certain requirements about how fast certain queries or updates must run or how many transactions must be processed per second. The workload description and users performance requirements are the basis on which a number of decisions have to be made during physical database design.To create a good physical database design and to tune the system for performance in response to evolving user requirements, the designer needs to understand the workings of a DBMS, especially the indexing and query processing techniques supported by the DBMS. If the database is expected to be accessed concurrently by many users, or is a distributed database, the task becomes more complicated, and other features of a DBMS come into play. DATABASE WORKLOADSThe key to good physical design is arriving at an accurate description of the expected workload. A workload description includes the following elements: 1. A list of queries and their frequencies, as a fraction of all queries and updates. 2. A list of updates and their frequencies. 3. Performance goals for each type of query and update.For each query in the workload, we must identify:Which relations are accessed.Which attributes are retained (in the SELECT clause).Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be. Similarly, for each update in the workload, we must identify:Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be.The type of update (INSERT, DELETE, or UPDATE) and the updated relation.For UPDATE commands, the fields that are modified by the update.Remember that queries and updates typically have parameters, for example, a debit or credit operation involves a particular account number. The values of these parameters determine selectivity of selection and join conditions.Updates have a query component that is used to find the target tuples. This component can benefit from a good physical design and the presence of indexes. On the other hand, updates typically require additional work to maintain indexes on the attributes that they modify. Thus, while queries can only benefit from the presence of an index, an index may either speed up or slow down a given update. Designers should keep this trade-offer in mind when creating indexes.NEED FOR DATABASE TUNINGAccurate, detailed workload information may be hard to come by while doing the initial design of the system. Consequently, tuning a database after it has been designed and deployed is importantwe must refine the initial design in the light of actual usage patterns to obtain the best possible performance.The distinction between database design and database tuning is somewhat arbitrary.We could consider the design process to be over once an initial conceptual schema is designed and a set of indexing and clustering decisions is made. Any subsequent changes to the conceptual schema or the indexes, say, would then be regarded as a tuning activity. Alternatively, we could consider some refinement of the conceptual schema (and physical design decisions affected by this refinement) to be part of the physical design process.Where we draw the line between design and tuning is not very important.OVERVIEW OF DATABASE TUNINGAfter the initial phase of database design, actual use of the database provides a valuable source of detailed information that can be used to refine the initial design. Many of the original assumptions about the expected workload can be replaced by observed usage patterns; in general, some of the initial workload specification will be validated, and some of it will turn out to be wrong. Initial guesses about the size of data can be replaced with actual statistics from the system catalogs (although this information will keep changing as the system evolves). Careful monitoring of queries can reveal unexpected problems; for example, the optimizer may not be using some indexes as intended to produce good plans.Continued database tuning is important to get the best possible performance. TUNING THE CONCEPTUAL SCHEMAIn the course of database design, we may realize that our current choice of relation schemas does not enable us meet our performance objectives for the given workload with any (feasible) set of physical design choices. If so, we may have to redesign our conceptual schema (and re-examine physical design decisions that are affected by the changes that we make).We may realize that a redesign is necessary during the initial design process or later, after the system has been in use for a while. Once a database has been designed and populated with data, changing the conceptual schema requires a significant effort in terms of mapping the contents of relations that are affected. Nonetheless, it may sometimes be necessary to revise the conceptual schema in light of experience with the system. We now consider the issues involved in conceptual schema (re)design from the point of view of performance.Several options must be considered while tuning the conceptual schema:We may decide to settle for a 3NF design instead of a BCNF design.If there are two ways to decompose a given schema into 3NF or BCNF, our choice should be guided by the workload.Sometimes we might decide to further decompose a relation that is already in BCNF.In other situations we might denormalize. That is, we might choose to replace a collection of relations obtained by a decomposition from a larger relation with the original (larger) relation, even though it suffers from some redundancy problems. Alternatively, we might choose to add some fields to certain relations to speed up some important queries, even if this leads to a redundant storage of some information (and consequently, a schema that is in neither 3NF nor BCNF).This discussion of normalization has concentrated on the technique of decomposition, which amounts to vertical partitioning of a relation. Another technique to consider is horizontal partitioning of a relation, which would lead to our having two relations with identical schemas. Note that we are not talking about physically partitioning the cuples of a single relation; rather, we want to create two distinct relations (possibly with different constraints and indexes on each).Incidentally, when we redesign the conceptual schema, especially if we are tuning an existing database schema, it is worth considering whether we should create views to mask these changes from users for whom the original schema is more natural. TUNING QUERIES AND VIEWSIf we notice that a query is running much slower than we expected, we have to examine the query carefully to end the problem. Some rewriting of the query, perhaps in conjunction with some index tuning, can often ?x the problem. Similar tuning may be called for if queries on some view run slower than expected. When tuning a query, the first thing to verify is that the system is using the plan that you expect it to use. It may be that the system is not finding the best plan for a variety of reasons. Some common situations that are not handled efficiently by many optimizers follow:A selection condition involving null values.Selection conditions involving arithmetic or string expressions or conditions using the or connective. For example, if we have a condition E.age = 2*D.age in the WHERE clause, the optimizer may correctly utilize an available index on E.age but fail to utilize an available index on D.age. Replacing the condition by E.age/2=D.age would reverse the situation.Inability to recognize a sophisticated plan such as an index-only scan for an aggregation query involving a GROUP BY clause. If the optimizer is not smart enough to and the best plan (using access methods and evaluation strategies supported by the DBMS), some systems allow users to guide the choice of a plan by providing hints to the optimizer; for example, users might be able to force the use of a particular index or choose the join order and join method. A user who wishes to guide optimization in this manner should have a thorough understanding of both optimization and the capabilities of the given DBMS.(8)OTHER TOPICSMOBILE DATABASESThe availability of portable computers and wireless communications has created a new breed of nomadic database users. At one level these users are simply accessing a database through a network, which is similar to distributed DBMSs. At another level the network as well as data and user characteristics now have several novel properties, which affect basic assumptions in many components of a DBMS, including the query engine, transaction manager, and recovery manager.Users are connected through a wireless link whose bandwidth is ten times less than Ethernet and 100 times less than ATM networks. Communication costs are therefore significantly higher in proportion to I/O and CPU costs.Users locations are constantly changing, and mobile computers have a limited battery life. Therefore, the true communication costs is connection time and battery usage in addition to bytes transferred, and change constantly depending on location. Data is frequently replicated to minimize the cost of accessing it from different locations.As a user moves around, data could be accessed from multiple database servers within a single transaction. The likelihood of losing connections is also much greater than in a traditional network. Centralized transaction management may therefore be impractical, especially if some data is resident at the mobile computers. We may in fact have to give up on ACID transactions and develop alternative notions of consistency for user programs.MAIN MEMORY DATABASESThe price of main memory is now low enough that we can buy enough main memory to hold the entire database for many applications; with 64-bit addressing, modern CPUs also have very large address spaces. Some commercial systems now have several gigabytes of main memory. This shift prompts a reexamination of some basic DBMS design decisions, since disk accesses no longer dominate processing time for a memory-resident database:Main memory does not survive system crashes, and so we still have to implement logging and recovery to ensure transaction atomicity and durability. Log records must be written to stable storage at commit time, and this process could become a bottleneck. To minimize this problem, rather than commit each transaction as it completes, we can collect completed transactions and commit them in batches; this is called group commit. Recovery algorithms can also be optimized since pages rarely have to be written out to make room for other pages.The implementation of in-memory operations has to be optimized carefully since disk accesses are no longer the limiting factor for performance.A new criterion must be considered while optimizing queries, namely the amount of space required to execute a plan. It is important to minimize the space overhead because exceeding available physical memory would lead to swapping pages to disk (through the operating systems virtual memory mechanisms), greatly slowing down execution.Page-oriented data structures become less important (since pages are no longer the unit of data retrieval), and clustering is not important (since the cost of accessing any region of main memory is uniform).(一)从历史的角度回顾从数据库的早期开始,存储和操纵数据就一直是主要的应用焦点。第一个通用的DBMS是由Charles Bechman于20世纪60年代早期在通用电器公司设计的,称为集成数据存储(Integrated Data Store).它奠定了网状数据模型的基础。网状数据模型由数据系统语言协会(CODASYL)标准化,并在整个20世纪60年代对数据库系统产生了巨大的影响。由于Bachman在数据库领域的贡献,他成为第一个ACM图灵奖(相当于计算机科学界的诺贝尔奖)的获得者,并于1973年接受了这一奖励。20世纪60年代末期,IBM成功开发了信息管理系统(IMS)DBMS。直至今天,它还在许多系统中使用。IMS奠定了另一个数据表达框架层次数据模型的基础。同时,美国航空公司和IBM联合开发出用于飞机订票的SABRE系统,它允许多个用户通过计算机网络存取相同数据。有趣的是,今天SABRE系统
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 神经内科设备培训
- 校园宿舍闲置空地的利用设计
- 车辆借用与租赁车辆保险理赔责任合同范本
- 商业地产项目场地承包经营合作协议书
- 餐饮企业员工劳动合同范本及培训考核合同
- 特色主题餐厅经营合作协议
- 党建联学共建项目合作协议书
- 车辆抵押担保汽车维修担保服务合同
- 汽车抵押典当贷款业务合作协议
- 车棚租赁与停车诱导系统合作协议
- 2009-2022历年河北省公安厅高速交警总队招聘考试真题含答案2022-2023上岸必备带详解版4
- 六年级信息技术下册《走进人工智能》优质课获奖课件
- 工程开工报告表
- 劳动法课件(完整版)
- 营运车辆智能视频监控系统管理制度范本及动态监控管理制度
- 完整版:美制螺纹尺寸对照表(牙数、牙高、螺距、小径、中径外径、钻孔)
- 偏头痛PPT课件(PPT 43页)
- (完整版)入河排污口设置论证基本要求
- 10kV架空线路施工方案
- 2022年人教版小学数学一年级下册期中测试卷二(含答案)
- 关于恒温恒湿项目装修方案及装修细部做法
评论
0/150
提交评论