29.05.2020, Data Mining: Concepts and Techniques

Data Mining: Concepts and Techniques
Chapter 11: Applications and Trends in Data Mining
Additional Theme: Visual Data Mining
Jiawei Han and Micheline Kamber
Department of Computer Science, University of Illinois at Urbana-C/hanj
2006 Jiawei Han and Micheline Kamber. All rights reserved.

Visual Data Mining: An Overview
- What is Visual Data Mining?
- Survey of techniques
  - Data Visualization
  - Visualizing Data Mining Results
  - Visual Data Mining

What Is Visual Data Mining?
- Visual data mining "discovers implicit and useful knowledge from large data sets using data and/or knowledge visualization techniques"
- Data visualization + data mining techniques

Why Visual Data Mining?
- Advantages of the human visual system
  - Highly parallel processor
  - Sophisticated reasoning engine
  - Large knowledge base
- Can be used to comprehend data distributions, patterns, clusters, and outliers

Why Not Only Visual Data Mining?
- Disadvantages of the human visual system
  - Needs training
  - Not automated
  - Intrinsic bias
  - Limit of about 10^6 or 10^7 observations (Wegman 1995)
- Power of integration with analytical methods

Scope of Visual Data Mining
- Visualization: use of computer graphics to create visual images which aid in the understanding of complex, often massive representations of data
- Visual data mining: the process of discovering implicit but useful knowledge from large data sets using visualization techniques
- Diagram labels: Computer Graphics, High Performance Computing, Pattern Recognition, Human Computer Interfaces, Multimedia Systems

Purpose of Visualization
- Gain insight into an information space by mapping data onto graphical primitives
- Provide a qualitative overview of large data sets
- Search for patterns, trends, structure, irregularities, and relationships among data
- Help find interesting regions and suitable parameters for further quantitative analysis
- Provide a visual proof of computer representations derived

Visual Data Mining & Data Visualization
- Integration of visualization and data mining
  - Data visualization
  - Data mining result visualization
  - Data mining process visualization
  - Interactive visual data mining
- Data visualization
  - Data in a database or data warehouse can be viewed at different levels of abstraction and as different combinations of attributes or dimensions
  - Data can be presented in various visual forms

[Figure: Abilities of Humans and Computers]

Visual Mining vs. Scientific Vis. & Graphics
- Scientific visualization: often visualizes a physical model; low dimensionality
- Graphics: more concerned with how to render (draw) than with what to render

Data Visualization
- View data in a database or data warehouse
- User may control
  - Different levels of detail
  - Subset of attributes
- Drawn using boxplots, histograms, polylines, etc.

Historical Overview of Exploratory Data Visualization Techniques (cf. [WB95])
- Pioneering works of Tufte [Tuf83, Tuf90] and Bertin [Ber81] focus on
  - Visualization of data with inherent 2D/3D semantics
  - General rules for layout, color composition, attribute mapping, etc.
- Development of visualization techniques for different types of data with an underlying physical model
  - Geographic data, CAD data, flow data, image data, voxel data, etc.
- Development of visualization techniques for arbitrary multidimensional data (without an underlying physical model)
  - Applicable to databases and other information resources

[Figure: Dimensions of Exploratory Data Visualization]

Classification of Data Visualization Techniques
- Geometric techniques: scatterplots, landscapes, projection pursuit, prosection views, hyperslice, parallel coordinates, ...
- Icon-based techniques: Chernoff faces, stick figures, shape coding, color icons, TileBars, ...
- Pixel-oriented techniques: recursive pattern technique, circle segments technique, spiral and axes techniques, ...
- Hierarchical techniques: dimensional stacking, Worlds-within-Worlds, treemap, cone trees, InfoCube, ...
- Graph-based techniques
  - Basic graphs (straight-line, polyline, curved-line, ...)
  - Specific graphs (e.g., DAG, symmetric, cluster, ...)
  - Systems (e.g., Tom Sawyer, Hy+, SeeNet, Narcissus, ...)
- Hybrid techniques: arbitrary combinations of the above

Distortion & Dynamic/Interaction Techniques
- Distortion techniques
  - Simple distortion (e.g., Perspective Wall, Bifocal Lenses, TableLens, Graphical Fisheye Views, ...)
  - Complex distortion (e.g., Hyperbolic Repr., Hyperbox, ...)
- Dynamic/interaction techniques
  - Data-to-visualization mapping (e.g., AutoVisual, S-Plus, XGobi, IVEE, ...)
  - Projections (e.g., Grand Tour, S-Plus, XGobi, ...)
  - Filtering (selection, querying) (e.g., Magic Lens, Filter/Flow Queries, InfoCrystal, ...)
  - Linking & brushing (e.g., Xmdv-Tool, XGobi, DataDesk, ...)
  - Zooming (e.g., PAD+, IVEE, DataSpace, ...)
  - Detail on demand (e.g., IVEE, TableLens, Magic Lens, VisDB, ...)

Visual Survey
- Data visualization techniques
  - Scatterplot matrices, landscapes, parallel coordinates
  - Icon-based, dimensional stacking, treemaps

Direct Visualization
- [Figure: Ribbons with twists based on vorticity]

Geometric Techniques
- Basic idea: visualization of geometric transformations and projections of the data
- Methods
  - Landscapes [Wis95]
  - Projection pursuit techniques [Hub85] (techniques for finding meaningful projections of multidimensional data)
  - Scatterplot matrices [And72, Cle93]
  - Prosection views [FB94, STDS95]
  - Hyperslice [WL93]
  - Parallel coordinates [Ins85, ID90]

Scatterplot Matrices [Cleveland 93]
- Matrix of scatterplots (x-y diagrams) of the k-dimensional data
- A total of (k^2 - k)/2 distinct scatterplots, one per unordered pair of attributes
- [Figure used by permission of M. Ward, Worcester Polytechnic Institute]

Landscapes [Wis95]
- Visualization of the data as a perspective landscape
- The data needs to be transformed into a (possibly artificial) 2D spatial representation which preserves the characteristics of the data
- [Figure: news articles visualized as a landscape. Used by permission of B. Wright, Visible Decisions Inc.]

Parallel Coordinates [Ins85, ID90]
- n equidistant axes which are parallel to one of the screen axes and correspond to the attributes
- The axes are scaled to the [minimum, maximum] range of the corresponding attribute
- Every data item corresponds to a polygonal line which intersects each of the axes at the point which corresponds to the value for the attribute
- [Figure: Parallel Coordinates]

Icon-Based Techniques
- Basic idea: visualization of the data values as features of icons
- Overview
  - Chernoff faces [Che73, Tuf83]
  - Stick figures [Pic70, PG88]
  - Shape coding [Bed90]
  - Color icons [Lev91, KK94]
  - TileBars [Hea95] (use of small icons representing the relevance feature vectors in document retrieval)

Stick Figures
- [Figure: census data showing age, income, sex, education, etc. Used by permission of G. Grinstein, University of Massachusetts at Lowell]

Hierarchical Techniques
- Basic idea: visualization of the data using a hierarchical partitioning into subspaces
- Overview
  - Dimensional stacking [LWW90]
  - Worlds-within-Worlds [FB90a/b]
  - Treemap [Shn92, Joh93]
  - Cone trees [RMC91]
  - InfoCube [RG93]

Dimensional Stacking [LWW90]
- Partitioning of the n-dimensional attribute space into 2-dimensional subspaces which are stacked into each other
- Partitioning of the attribute value ranges into classes; the important attributes should be used on the outer levels
- Adequate especially for data with ordinal attributes of low cardinality
- [Figure: visualization of oil mining data with longitude and latitude mapped to the outer x- and y-axes, and ore grade and depth mapped to the inner x- and y-axes. Used by permission of M. Ward, Worcester Polytechnic Institute]
- Disadvantages
  - Difficult to display more than nine dimensions
  - Important to map dimensions appropriately
  - May be difficult to understand visualizations at first

Treemap [JS91, Shn92, Joh93]
- Screen-filling method which uses a hierarchical partitioning of the screen into regions depending on the attribute values
- The x- and y-dimensions of the screen are partitioned alternately according to the attribute values (classes)
- [Figure: MSR Netscan image]
- [Figure: Treemap of a file system (Shneiderman)]

Treemaps
- The attributes used for the partitioning and their ordering are user-defined (the most important attributes should be used first)
- The color of the regions may correspond to an additional attribute
- Suitable for getting an overview of large amounts of hierarchical data (e.g., a file system) and for data with multiple ordinal attributes (e.g., census data)

Data Mining Result Visualization
- Presentation of the results or knowledge obtained from data mining in visual forms
- Examples
  - Scatter plots and boxplots (obtained from descriptive data mining)
  - Decision trees
  - Association rules
  - Clusters
  - Outliers
  - Generalized rules
  - Text mining
- [Figure: Boxplots from Statsoft: multiple variable combinations]
- [Figure: Visualization of data mining results in SAS Enterprise Miner: scatter plots]
- [Figure: Visualization of association rules in SGI/MineSet 3.0]
- [Figure: Visualization of a decision tree in SGI/MineSet 3.0]
- [Figure: Visualization of decision trees]
- [Figure: Visualization of cluster grouping in IBM Intelligent Miner]

Association Rules (MineSet)
- LHS and RHS items are mapped to the x- and y-axes
- Confidence and support correspond to the height of the bar or disc, respectively
- Interestingness is mapped to color
- [Figure: MineSet: Association Rules]

Association Ball Graph (DBMiner)
- Items are visualized as balls
- Arrows indicate rule implication
- Size represents support

Classification (SAS EM) [SAS01]
- Tree Viewer
- Color corresponds to the relative frequency of a class in a node
- Branch line thickness is proportional to the square root of the number of objects

Cluster Analysis (H-BLOB: Hierarchical BLOB) [SBG00]
- Cluster
- Form ellipsoids
- Form blobs (implicit surfaces)
- [Figure: H-BLOB]

Text Mining (ThemeRiver) [WCF+00]
- Visualization of thematic changes in documents
- Vertical distance indicates the collective strength of the themes

Data Mining Process Visualization
- Presentation of the various processes of data mining in visual forms so that users can see the flow of data cleaning, integration, preprocessing, and mining
  - Data extraction process
  - Where the data is extracted
  - How the data is cleaned, integrated, preprocessed, and mined
  - Method selected for data mining
  - Where the results are stored
  - How they may be viewed

Visualization of Data Mining Processes by Clementine
- [Figure: "Understand variations with visualized data"]
- [Figure: "See your solution discovery process clearly"]

Interactive Visual Data Mining
- Using visualization tools in the data mining process to help users make smart data mining decisions
- Example
  - Display the data distribution in a set of attributes using colored sectors or columns (depending on whether the whole space is represented by a circle or a set of columns)
  - Use the display to decide which sector should be selected first for classification and where a good split point for this sector may be

Visual Data Mining
- Projection pursuits
- (Class) tours [Dhillon et al. 98]
- Visual classification [Ankerst et al. KDD 99]

Projection Pursuits
- Exploratory projection pursuit
  - Goal: reduce dimensionality
  - Define an "interestingness" index for each possible projection of a data set
  - Maximize this index, project linearly
  - Not always possible/useful

Class Tours
- "Visualizing Class Structure of Multidimensional Data" by Dhillon et al. 1998
- Problem: visualize multidimensional data categorized into classes
- Solution: project data into 2D while preserving distances between class means
- Class-preserving projection: preserves distances between projected means

Tours
- Tours are animated and interpolated sequences of 2D projections [Asimov 1985]
- Class tours: sequences of class-preserving 2-dimensional projections
- Captures "inter-class structure of complex, multi-dimensional data"

[Figure: Interactive Visual Mining by Perception-Based Classification (PBC)]

Visual Classification
- "Visual Classification: An Interactive Approach to Decision Tree Construction" by Ankerst
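
The class-preserving projection described in the Class Tours slides can be illustrated with a small sketch. This is not the implementation from Dhillon et al.; it is a minimal example that assumes exactly three classes, in which case the plane through the three class means preserves the distances between those means exactly. The helper names (`class_means`, `class_preserving_basis`, `project`) and the toy data are hypothetical and chosen only for illustration.

```python
import math

def sub(a, b):
    """Component-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def dot(a, b):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(a, b))

def class_means(points, labels):
    """Return {label: mean vector} for labelled points."""
    sums, counts = {}, {}
    for p, c in zip(points, labels):
        if c not in sums:
            sums[c] = [0.0] * len(p)
            counts[c] = 0
        sums[c] = [s + x for s, x in zip(sums[c], p)]
        counts[c] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def class_preserving_basis(m1, m2, m3):
    """Orthonormal basis (u, v) of the plane through three class means,
    obtained by Gram-Schmidt on (m2 - m1) and (m3 - m1)."""
    a, b = sub(m2, m1), sub(m3, m1)
    na = math.sqrt(dot(a, a))
    if na < 1e-12:
        raise ValueError("two class means coincide")
    u = [x / na for x in a]
    along_u = dot(b, u)
    w = [x - along_u * y for x, y in zip(b, u)]  # part of b orthogonal to u
    nw = math.sqrt(dot(w, w))
    if nw < 1e-12:
        raise ValueError("class means are collinear")
    v = [x / nw for x in w]
    return u, v

def project(p, origin, u, v):
    """2D coordinates of p in the plane spanned by (u, v) through origin."""
    d = sub(p, origin)
    return (dot(d, u), dot(d, v))

# Toy 4-dimensional data set with three classes (made up for illustration).
points = [[0, 0, 0, 0], [2, 0, 0, 0],
          [0, 4, 0, 2], [0, 4, 2, 0],
          [3, 3, 3, 3], [5, 5, 5, 5]]
labels = ["a", "a", "b", "b", "c", "c"]

means = class_means(points, labels)
u, v = class_preserving_basis(means["a"], means["b"], means["c"])
projected_points = [project(p, means["a"], u, v) for p in points]
projected_means = {c: project(m, means["a"], u, v) for c, m in means.items()}
```

Because the three means lie in the projection plane, the pairwise distances between the projected means equal the original distances between the class means; individual data points are generally distorted, as with any linear projection to 2D. With more than three classes the distances can only be approximated in a single 2D view, which is one reason a tour over several such projections is useful.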