




Transportation: Refreshing Warehouse Data

Overview

Objectives
After completing this lesson, you should be able to do the following:
- Describe methods for capturing changed data
- Explain techniques for applying the changes
- Discuss techniques for purging and archiving data
- Outline final tasks, such as publishing the data, controlling access, and automating processes
- List tools for transporting data into the warehouse

Developing a Refresh Strategy for Capturing Changed Data
- Consider load window
- Identify data volumes
- Identify cycle
- Know the technical infrastructure
- Plan a staging area
- Determine how to detect changes
[Diagram: T1, T2, T3; operational databases]

User Requirements and Assistance
- Users define the refresh cycle
- IT balances requirements against technical issues
- Document all tasks and processes
- Employ user skills
[Diagram: T1, T2, T3; operational databases]

Load Window
- Time available for entire ETT process
- Plan
- Test
- Prove
- Monitor
[Diagram: 24-hour timeline (0, 3 am, 6, 9, 12 pm, 3, 6, 9, 12) showing the user access period and the load windows]

Load Window
- Plan and build processes according to a strategy.
- Consider volumes of data.
- Identify technical infrastructure.
- Ensure currency of data.
- Consider user access requirements first.
- High availability requirements may mean a small load window.
[Diagram: 24-hour timeline (0, 3 am, 6, 9, 12 pm, 3, 6, 9, 12) showing the user access period]

Scheduling the Load Window (timeline: 0 to 3 am; diagram steps 1 to 4)
- Receive data (File 1, File 2) by FTP
- Control file: file names, file types, number of files, number of loads, first-time load or refresh, date of file, date range, records in file (counts), totals (amounts)
- Control process
- Open and read files to verify and analyze
- Requirements
- Load cycle

Scheduling the Load Window (timeline: 3 am to 9 am; diagram steps 5 to 9)
- Load into warehouse (File 1, File 2)
- Verify, analyze, reapply
- Create summaries
- Index data
- Update metadata
- Parallel load

Scheduling the Load Window (timeline: 6 am to 9 am; diagram steps 10 to 13)
- Create views for specialized tools
- Back up warehouse
- Users access summary data
- Publish
- User access

Capturing Changed Data for Refresh
- Capture new fact data
- Capture changed dimension data
- Determine method for capture of each
- Methods:
  - Wholesale data replacement
  - Comparison of database instances
  - Time stamping
  - Database triggers
  - Database log
  - Hybrid techniques

Wholesale Data Replacement
- Expensive
- Limited historical data, if any
- Data mart implementations
- Time period replacement

Comparison of Database Instances
- Simple to perform, but expensive in time and processing
- Delta file:
  - Changes to operational data since last refresh
  - Used by various techniques
[Diagram: yesterday's operational database; today's operational database; database comparison; delta file holds changed data]

Time and Date Stamping
- Fast scanning for records changed since last extraction
- Date Updated field
- No detection of deleted data
[Diagram: operational data; delta file holds changed data]

Database Triggers
- Changed data intercepted at the server level
- Extra I/O required
- Maintenance overhead
[Diagram: operational server (DBMS); triggers on server; operational data; delta file holds changed data]

Using a Database Log
- Contains before and after images
- Requires system checkpoint
- Common technique
[Diagram: operational server (DBMS); log; log analysis and data extraction]

Verdict
- Consider each method on merit.
- Consider a hybrid approach if one approach is not suitable.
- Consider current technical, existing operational, and current application issues.

Applying the Changes to Data
You have a choice of techniques:
- Overwrite a record
- Add a record
- Add a field
- Maintain history
- Add version numbers

Overwriting a Record
  Before: Customer Id | John Doe | Single
  After:  Customer Id | John Doe | Married
- Easy to implement
- Loses all history
- Not recommended

Adding a New Record
  1 | Customer Id | John Doe | Single
- History is preserved; dimensions grow.
- Time constraints are not required.
- Generalized key is created.
- Metadata tracks usage of keys.

Adding a Current Field
  Before: Customer Id | John Doe | Single
  After:  Customer Id | John Doe | Single | Married | 01-JAN-96
- Maintains some history
- Loses intermediate values
- Is enhanced by adding an Effective Date field

Limitations of Methods for Applying Changes
- Complete history impossible
- Dimensions may grow large
- Maintenance overhead

Maintaining History
- One-to-many relationship
- Always retain current record
- Consistently able to refer to record history
[Diagram: Product, Time, Sales, HIST_CUST, CUSTOMER]

History Preserved
- History enables realistic analysis.
- History retains context of data.
- History provides for realistic historical analysis.
- Model must be able to:
  - Reflect business changes
  - Maintain context between fact and dimension data
  - Retain sufficient data to relate old to new

Version Numbering
- Avoid double counting
- Facts hold version number

  Customer
  CustId  Version  Customer Name
  1234    1        Comer
  1234    2        Comer

  Sales
  CustId  Version  Sales Facts
  1234    1        11,000
  1234    2        12,000

[Diagram: Customer, Sales, Product, Time]

Purging and Archiving Data
- As data ages, its value depreciates.
- Remove old data from the warehouse:
  - Archive for later use
  - Purge without copy

Techniques for Purging Data
- TRUNCATE: Retains no rollback
- DELETE: Retains redo and rollback
- ALTER TABLE: Removes a partition
- PL/SQL: Uses database triggers

Techniques for Archiving Data
- Export to dump file from tables
- Import to tables from dump file
- ALTER TABLE EXCHANGE partitions
[Diagram: EXP, .dmp, IMP]

Verdict
- Defined by business requirements
- Must be managed

Final Tasks
- Update metadata
  - ETT
  - User
- Publish data
  - Availability
  - Changes
  - Subject area basis
- Use database roles to prevent and allow access
[Diagram: Sources, Extract, Stage, Transform, Rules, Load, Publish, Query]

Publishing Data
- Control access using database roles
- 24-hour operation may be requested
- Compromise between load and access
- Consider:
  - Staggering updates
  - Using temporary tables
  - Using separate tables

ETT Tool Selection Criteria
- Overlap with existing tools
- Availability of meta model
- Supported data sources
- Ease of modification and maintenance
- Required fine tuning of code
- Ease of change control
- Power of transformation logic
- Level of modularization
- Power of error, exception, resubmission features
- Intuitive documentation
- Performance of code
- Activity scheduling and sophistication
- Metadata generation
- Learning curve
- Flexibility
- Supported operating systems
- Cost

Transportation Tools
- Informatica OpenBridge
- Oracle SQL*Loader
- Gateways
- PL/SQL
- Precompilers
- Platinum Technology InfoPump
- Platinum Info Transpo
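
Two of the capture methods listed under Capturing Changed Data for Refresh can be contrasted in a small Python sketch. It is only an illustration under stated assumptions: snapshots are held as in-memory dictionaries keyed by a primary key, and the names `build_delta`, `changed_since`, and `date_updated`, along with the sample rows other than John Doe, are hypothetical and not taken from the slides.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Row = Dict[str, object]


def build_delta(yesterday: Dict[int, Row],
                today: Dict[int, Row]) -> List[Tuple[str, int, Row]]:
    """Comparison of database instances: diff two snapshots into a delta."""
    delta = []
    for key, row in today.items():
        if key not in yesterday:
            delta.append(("INSERT", key, row))
        elif row != yesterday[key]:
            delta.append(("UPDATE", key, row))
    # Rows present yesterday but missing today were deleted.
    for key, row in yesterday.items():
        if key not in today:
            delta.append(("DELETE", key, row))
    return delta


def changed_since(rows: List[Row], last_extraction: datetime) -> List[Row]:
    """Time stamping: keep rows whose Date Updated field is newer than the
    last extraction. Deleted rows no longer exist, so they are never seen."""
    return [r for r in rows if r["date_updated"] > last_extraction]


# Hypothetical snapshots of one operational table, keyed by customer id.
yesterday = {
    1: {"name": "John Doe", "status": "Single"},
    2: {"name": "Comer", "status": "Married"},
}
today = {
    1: {"name": "John Doe", "status": "Married"},
    3: {"name": "Jane Roe", "status": "Single"},
}
delta = build_delta(yesterday, today)

# Hypothetical time-stamped rows as they exist in today's database.
stamped_rows = [
    {"id": 1, "status": "Married", "date_updated": datetime(1996, 1, 1)},
    {"id": 3, "status": "Single", "date_updated": datetime(1995, 12, 1)},
]
recent = changed_since(stamped_rows, datetime(1995, 12, 15))

for change in delta:
    print(change)
print([r["id"] for r in recent])
```

The full comparison reads every row of both snapshots, which matches the "expensive in time and processing" caveat, but it does surface the deletion of customer 2. The time-stamp scan touches only the current table and cannot report that deletion.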