




Data Warehouse

Why Data Warehouse
The most common issue companies face when looking at data mining is that the information is not in one place. The biggest challenge business analysts face in using data mining is how to extract, integrate, cleanse, and prepare data to solve their most pressing business problems.

What is Data Warehouse
The idea of a data warehouse is to put a wide range of operational data from internal and external sources into one place so it can be better utilized by executives, line-of-business managers and other business analysts. Once the information is gathered, OLAP (on-line analytical processing) software comes into play by providing the desktop analysis tools for querying, manipulating and reporting the data from the data warehouse.

Data Warehouse environment
- the source systems from which data is extracted
- the tools used to extract data for loading the data warehouse
- the data warehouse database itself, where the data is stored
- the desktop query and reporting tools used for decision support

Data Warehousing Process Overview

Operational Vs. Multidimensional View Of Sales

Creating A Data Warehouse

The Data Warehouse
The Data Warehouse is an integrated, subject-oriented, time-variant, non-volatile database that provides support for decision making.
- Integrated: The Data Warehouse is a centralized, consolidated database that integrates data retrieved from the entire organization.
- Subject-Oriented: The Data Warehouse data is arranged and optimized to provide answers to questions coming from diverse functional areas within a company.
- Time-Variant: The Warehouse data represent the flow of data through time. It can even contain projected data.
- Non-Volatile: Once data enter the Data Warehouse, they are never removed. The Data Warehouse is always growing.

Operational Database vs. Data Warehouse
Operational DB
- Similar data can have different representations or meanings
- Functional or process orientation
- Current transactions
- Frequent updating
Data Warehouse
- Unified view of all data elements
- Subject orientation for decision support
- Historical information with time dimension
- Data are added without change

Data Mart
A data mart is a small, single-subject data warehouse subset that provides decision support to a small group of people.
Data Marts can serve as a test vehicle for companies exploring the potential benefits of Data Warehouses. Data Marts address local or departmental problems, while a Data Warehouse involves a company-wide effort to support decision making at all levels in the organization.

Enterprise Data Warehouse (EDW)
A large-scale data warehouse that is used across the enterprise for decision support. EDWs are used to provide data for many types of DSS, including CRM, SCM, BPM, BAM, PLM, and KMS.
- BPM: Business performance management
- BAM: Business activity monitoring
- PLM: Product lifecycle management
- KMS: Knowledge management systems

Metadata
Metadata is the data about data. In a data warehouse, metadata describe the contents of a data warehouse and the manner of its use. Good metadata is essential to the effective operation of a data warehouse, and it is used in data acquisition/collection, data transformation, and data access.

The Need for Technical Metadata
The use of data warehousing and decision processing often involves a wide range of different products, and creating and maintaining the metadata for these products is time-consuming and error prone. Automating the metadata management process and enabling the sharing of this so-called technical metadata between products can reduce both costs and errors.

The Need for Business Metadata
Business users need to have a good understanding of what information exists in a data warehouse. They need to understand what the information means from a business viewpoint, how it was derived, from what source systems it comes, when it was created, what pre-built reports and analyses exist for manipulating the information, and so forth.

Metadata in a data warehouse
Kimball lists the following types of metadata in a data warehouse:
- Source system metadata
- Data staging metadata
- DBMS metadata
Ralph Kimball, The Data Warehouse Lifecycle Toolkit, Wiley, 1998, ISBN 0-471-25547-5

Source system metadata
- source specifications, such as repositories and source logical schemas
- source descriptive information, such as ownership descriptions, update frequencies and access methods
- process information, such as job schedules and extraction code

Data staging metadata
- data acquisition information, such as data transmission scheduling and results, and file usage
- dimension table management, such as definitions of dimensions and surrogate key assignments
- transformation and aggregation, such as data enhancement and mapping, DBMS load scripts, and aggregate definitions
- audit, job logs and documentation, such as data lineage records and data transform logs

Star Schema
The star schema is a data modeling technique used to map multidimensional decision support into a relational database. Star schemas yield an easily implemented model for multidimensional data analysis while still preserving the relational structure of the operational database.
Four Components:
- Facts
- Dimensions
- Attributes
- Attribute hierarchies

Figure 13.14 A Three-Dimensional View of Sales

Figure 13.17 Attribute Hierarchies in Multidimensional Analysis

Facts
- Numeric measurements that represent a specific business aspect or activity
- Normally stored in a fact table that is the center of the star schema
- The fact table contains facts linked through their dimensions
- Metrics are facts computed at run time

Dimensions
- Qualifying characteristics that provide additional perspectives to a given fact
- Decision support data is almost always viewed in relation to other data
- Study facts via dimensions
- Dimensions are stored in dimension tables

Attributes
- Dimensions provide descriptions of facts through their attributes
- No mathematical limit to the number of dimensions
- Used to search, filter, and classify facts
- Slice and dice: focus on slices of the data cube for more detailed analysis

Attribute Hierarchies
- Provide top-down data organization
- Two purposes: aggregation, and drill-down/roll-up data analysis
- Determine how the data are extracted and represented
- Stored in the DBMS's data dictionary
- Used by the OLAP tool to access the warehouse properly

Star Schema
A star schema consists of fact tables and dimension tables. Fact tables contain the quantitative or factual data about a business: the information being queried. This information is often numerical, additive measurements and can consist of many columns and millions or billions of rows. Dimension tables are usually smaller and hold descriptive data that reflects the dimensions, or attributes, of a business.

Figure 13.17 Star Schema For Sales

Star Schema Representation
Facts and dimensions are normally represented by physical tables in the data warehouse database. The fact table is related to each dimension table in a many-to-one (M:1) relationship. Fact and dimension tables are related by foreign keys and are subject to the primary/foreign key constraints.

Figure 13.18 Orders Star Schema

Star Schema
Performance-Improving Techniques
- Normalization of dimensional tables
- Multiple fact tables representing different aggregation levels
- Denormalization of fact tables
- Table partitioning and replication

Figure 13.19 Normalized Dimension Tables

Multiple Fact Tables

Practice
How to design a star schema for an auto insurance company to do risk analysis?
- What is the objective?
- What are the facts?
- What are the dimensions?
- What are the attributes?
- What are the attribute hierarchies?

Auto insurance DW star schema

Data Warehouse Design
- Grain: A definition of the highest level of detail that is supported in a data warehouse
- Drill-down: The process of probing beyond a summarized value to investigate each of the detail transactions that comprise the summary

Data Warehouse Implementation
- The Data Warehouse as an Active Decision Support Network
- A Company-Wide Effort that Requires User Involvement and Commitment at All Levels
- Satisfy the Trilogy: Data, Analysis, and Users
- Apply Database Design Procedures
- Implementing a data warehouse is generally a massive effort that must be planned and executed according to established methods
- There are many facets to the project lifecycle, and no single person can be an expert in each area

Data Warehouse Implementation Road Map

Data Integration and the Extraction, Transformation, and Load (ETL) Process
Data integration comprises three major processes:
- data access (the ability to access and extract data from any data source)
- data federation (the integration of business views across multiple data stores)
- change capture (the identification, capture, and delivery of the changes made to enterprise data sources)
Extraction, transformation, and load (ETL):
- Extraction: reading data from a database
- Transformation: converting the extracted data from its previous form into the form that can be placed into a data warehouse
- Load: putting the data into the data warehouse

Data Cleanse
Data cleansing or data scrubbing is the act of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database. Used mainly in databases, the term refers to identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying or deleting this dirty data.

ETL tools
A good ETL tool must be able to communicate with the many different relational databases and read the various file formats used throughout an organization. ETL tools have started to migrate into Enterprise Application Integration, or even Enterprise Service Bus, systems that now cover much more than just the extraction, transformation and loading of data. Many ETL vendors now have data profiling, data quality and metadata capabilities.

On-Line Analytical Processing
On-Line Analytical Processing (OLAP) is an advanced data analysis environment that supports decision making, business modeling, and operations research activities.
Four Main Characteristics of OLAP
- Use multidimensional data analysis techniques.
- Provide advanced database support.
- Provide easy-to-use end user interfaces.
- Support client/server architecture.
Additional Functions of Multidimensional Data Analysis Techniques
- Advanced data presentation functions
- Advanced data aggregation, consolidation, and classification functions
- Advanced computational functions
- Advanced data modeling functions

Integration Of OLAP With A Spreadsheet Program

Figure 13.7 OLAP Server Arrangement

SAP's Business Information Warehouse: an Enterprise-Wide Information Hub
- An end-to-end enterprise-wide information hub to support planning and decision-making.
- A central data repository of SAP, non-SAP, current, and historical business transactions and metadata.
- Timely information to all levels and roles, from analyst to executive.
- Years of SAP financial, logistic, and human resource information systems experience wedded with modern data warehouse methodologies.

A Sample Of Current Data Warehousing And Data Mining Vendors
Table 13.10

Success Stories at Pepsi
"Using the data warehouse, we've been able to identify important items, find national suppliers for them, and leverage those relationships to reduce costs." Thanks to the warehouse, Pepsi can monitor purchasing compliance at the user level, an ability that has boosted price and product compliance well over 90 percent. "The warehouse also helps ensure 100 percent sales tax compliance," says Bridgman. Since going online in 1995, the warehouse has helped generate procurement savings in excess of $100 million.

Levels of DW Support for Enterprise Decision Making

The need for real-time data
- A business often cannot afford to wait a whole day for its operational data to load into the data warehouse for analysis
- Provides incremental real-time data showing every state change and almost analogous patterns over time
- Maintaining metadata in sync is possible
- Less costly to develop, maintain, and secure one huge data warehouse so that data are centralized for BI/BA tools
- An EAI with real-time data collection can reduce or eliminate the nightly batch processes

Real-Time/Active Data Warehouse (RDW/ADW)
- Loading and providing data via the data warehouse as they become available.
- Expand traditional data warehouse functions into the realm of tactical decision making.
- Empower decision making when interacting directly with customers and suppliers.

Real-Time Data Warehousing

Data Warehouse Administration
Due to its huge size and its intrinsic nature,
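
The ETL process, data cleansing, and star schema covered in this deck can be illustrated together with a small sketch. The example below is not taken from the slides: it uses Python's standard `sqlite3` module with an in-memory database, and every table name, column name, and sample record (`RAW_SALES`, `fact_sales`, `dim_time`, `dim_product`, `dim_location`) is a hypothetical choice made only for illustration. It extracts a few operational records, cleanses them (dropping rows whose amount is missing or not numeric, normalizing region names), loads them into one fact table related M:1 to three dimension tables through surrogate keys, and then runs a roll-up query and a drill-down query.

```python
import sqlite3

# Hypothetical operational records; a real extract step would read a source system.
RAW_SALES = [
    {"date": "2024-01-15", "product": "Widget", "region": " east ", "amount": "100.0"},
    {"date": "2024-01-20", "product": "Widget", "region": "East", "amount": "50.0"},
    {"date": "2024-02-03", "product": "Gadget", "region": "WEST", "amount": "75.5"},
    {"date": "2024-02-05", "product": "Gadget", "region": "east", "amount": "25"},
    {"date": "2024-02-10", "product": "Gadget", "region": "West", "amount": None},   # incomplete
    {"date": "2024-02-11", "product": "Widget", "region": "West", "amount": "abc"},  # incorrect
]

# Star schema: one fact table with a foreign key to each dimension table.
SCHEMA = """
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER,
                       UNIQUE (year, month));
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, region TEXT UNIQUE);
CREATE TABLE fact_sales (
    time_id INTEGER NOT NULL REFERENCES dim_time (time_id),
    product_id INTEGER NOT NULL REFERENCES dim_product (product_id),
    location_id INTEGER NOT NULL REFERENCES dim_location (location_id),
    amount REAL NOT NULL
);
"""


def transform(rows):
    """Cleanse: drop rows with missing or non-numeric amounts, normalize text."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except (TypeError, ValueError):
            continue  # dirty record, removed rather than repaired in this sketch
        year, month, _day = row["date"].split("-")
        clean.append({
            "year": int(year),
            "month": int(month),
            "product": row["product"].strip(),
            "region": row["region"].strip().title(),
            "amount": amount,
        })
    return clean


def surrogate_key(conn, table, key_col, values):
    """Return the surrogate key of a dimension row, inserting the row if new."""
    cols = ", ".join(values)
    marks = ", ".join("?" for _ in values)
    params = tuple(values.values())
    conn.execute(f"INSERT OR IGNORE INTO {table} ({cols}) VALUES ({marks})", params)
    where = " AND ".join(f"{col} = ?" for col in values)
    return conn.execute(
        f"SELECT {key_col} FROM {table} WHERE {where}", params
    ).fetchone()[0]


def load(conn, rows):
    """Load cleansed rows into the dimension tables and the fact table."""
    for r in rows:
        time_id = surrogate_key(conn, "dim_time", "time_id",
                                {"year": r["year"], "month": r["month"]})
        product_id = surrogate_key(conn, "dim_product", "product_id",
                                   {"name": r["product"]})
        location_id = surrogate_key(conn, "dim_location", "location_id",
                                    {"region": r["region"]})
        conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                     (time_id, product_id, location_id, r["amount"]))
    conn.commit()


def sales_by_region(conn):
    """Roll-up: total sales per region."""
    return conn.execute(
        "SELECT l.region, SUM(f.amount) FROM fact_sales f "
        "JOIN dim_location l ON l.location_id = f.location_id "
        "GROUP BY l.region ORDER BY l.region"
    ).fetchall()


def sales_by_region_month(conn):
    """Drill-down: the same totals broken out by year and month."""
    return conn.execute(
        "SELECT l.region, t.year, t.month, SUM(f.amount) FROM fact_sales f "
        "JOIN dim_location l ON l.location_id = f.location_id "
        "JOIN dim_time t ON t.time_id = f.time_id "
        "GROUP BY l.region, t.year, t.month ORDER BY l.region, t.year, t.month"
    ).fetchall()


def build_warehouse():
    """Run the whole extract, transform, load sequence on an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    load(conn, transform(RAW_SALES))
    return conn


if __name__ == "__main__":
    warehouse = build_warehouse()
    print(sales_by_region(warehouse))
    print(sales_by_region_month(warehouse))
```

The grain of this toy fact table is one row per source sale, which is what makes the drill-down from region totals to monthly totals possible. A production warehouse would more likely use a dedicated ETL tool and a full date dimension, so treat the layout above as one possible arrangement, not a recommended design.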