Chapter 3: Data Analysis Methods
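Chapter 4 The Methods of Data Analysis

6.1 Data normalization

Data normalization is the basis for comparing experiments within a large series when the experimental conditions may not be identical. Normalization ensures that the experimental quality of the data is comparable and that sound mathematical algorithms have been employed. It includes various options to standardize data, to adjust background levels, and to correct gradients. The commonly used normalization functions are as follows:

Linear normalization:
$$x_i' = X_{\min} + \frac{(x_i - x_{\min})\,(X_{\max} - X_{\min})}{x_{\max} - x_{\min}} \tag{6.1}$$

Ratio normalization:
$$x_i' = \frac{x_i}{\sum_{j=1}^{n} x_j} \tag{6.2}$$

Z-score normalization:
$$x_i' = \frac{x_i - \bar{x}}{\sigma} \tag{6.3}$$

Here $x_{\min}$ and $x_{\max}$ are the smallest and largest observed values, $X_{\min}$ and $X_{\max}$ are the bounds of the target range, $\bar{x}$ is the mean, and $\sigma$ is the population standard deviation.

Generally, linear normalization is recommended (if $X_{\max} = 1$ and $X_{\min} = 0$, formula (6.1) expresses $x_i$ as a fraction of the observed range, i.e. as a percentage). After ratio normalization, the sum of the normalized variables equals 1. Z-score normalization assumes that $x_i$ obeys a Gaussian distribution. If $x_i$ has a different distribution, the normalization will distort the pattern (the variance will be far from the standard deviation) and lead to incorrect pattern recognition. In general, $\sigma$ can be approximated by the sample standard deviation ($S$).

6.2 Simple Linear Regression

Learning Objectives:
1. Describe the linear regression model
2. Explain ordinary least squares
3. Compute regression coefficients
4. Evaluate the linear regression model
5. Predict the response variable

6.2.1 Describe the Linear Regression Model

Regression Models:
1. Answer "What is the relationship between the variables?"
2. Equation used:
   1 numerical dependent (response) variable
   1 or more numerical or categorical independent (explanatory) variables
3. Used mainly for prediction and estimation

Types of Regression Models

Regression Models
   Simple (1 explanatory variable)
      Linear
      Non-linear
   Multiple (2+ explanatory variables)
      Linear
      Non-linear

[Figure: Linear Equations]

Linear Regression Model

Assumes that the relationship between the variables is a linear function:
$$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$$
where $Y_i$ is the dependent (response) variable (e.g., properties), $X_i$ is the independent (explanatory) variable (e.g., structure representation), $\beta_1$ is the population slope, $\beta_0$ is the population Y-intercept, and $\varepsilon_i$ is the random error.

[Figure: Linear Regression Model, showing the observed values and the random error $\varepsilon_i$]

[Figure: Simple Linear Regression Model, showing an observed value, an unsampled observation, and the random error $\varepsilon_i$]

6.2.2 Explain Ordinary Least Squares

Scatter Graph:
1. Plot of all $(X_i, Y_i)$ pairs
2. Suggests how well the model will fit

[Figure: scatter graph of Y against X, both axes running from 0 to 60]

Thinking Challenge: How would you draw a line through the points? How do you determine which line fits best?

Least Squares (LS)
1. "Best fit" means that the differences between the actual Y values and the predicted Y values are a minimum. But positive differences offset negative ones.
2. LS therefore minimizes the sum of the squared differences (SSE).

[Figure: Least Squares Graphically]

6.2.3 Compute Regression Coefficients

Goal: minimize the squared error:
$$SSE = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 \tag{6.4}$$
$$b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2} \tag{6.5}$$
$$b_0 = \bar{Y} - b_1 \bar{X} \tag{6.6}$$
Where:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$$

6.2.4 Predict Response Variable

Computation Table 1

Computation Table 2

Coefficient Equations

Sample slope:
$$b_1 = \frac{\sum X_i Y_i - \frac{(\sum X_i)(\sum Y_i)}{n}}{\sum X_i^2 - \frac{(\sum X_i)^2}{n}}$$
Sample Y-intercept:
$$b_0 = \bar{Y} - b_1 \bar{X}$$
Regression equation:
$$\hat{Y}_i = b_0 + b_1 X_i$$

Example 1:

1. The hot-water resistance of vinylon fiber can be gauged by the index "degree of acetalization": the higher this index, the better the hot-water resistance. The table below gives a set of measured data. Compute the regression coefficients and the correlation coefficient.

Polyvinyl alcohol is dissolved in water and made into synthetic fiber by dry spinning or wet spinning. Polyvinyl alcohol fiber treated with formaldehyde gives polyvinyl formal fiber, commonly known as vinylon.

The acetalization of polyvinyl alcohol yields important polymer products:
   Formal: vinylon
   Butyral: a good glass adhesive

Table 1:

Table 2:

6.2.5 Evaluate the linear regression model

Measures of Variation in Regression
1. Total sum of squares ($SS_{yy}$): measures the variation of the observed $Y_i$ around the mean $\bar{Y}$
2. Explained variation (SSR): variation due to the relationship between X and Y
3. Unexplained variation (SSE): variation due to other factors

Variation Measures
   Total sum of squares: $\sum (Y_i - \bar{Y})^2$
   Unexplained sum of squares: $\sum (Y_i - \hat{Y}_i)^2$
   Explained sum of squares: $\sum (\hat{Y}_i - \bar{Y})^2$

1. Coefficient of Determination

Proportion of variation explained by the relationship between X and Y:
$$r^2 = \frac{\text{explained variation}}{\text{total variation}} = \frac{SSR}{SS_{yy}}, \qquad 0 \le r^2 \le 1$$
For Example 1, $r^2 = 0.9211$.

[Figure: Coefficient of Determination Examples, with panels $r^2 = 1$, $r^2 = 1$, $r^2 = 0.8$, $r^2 = 0$]

2. Simple Coefficient of Correlation

Pearson product-moment coefficient of correlation, $r$:
$$r = \pm\sqrt{r^2}$$
with the sign of the slope $b_1$. For Example 1, $r = 0.9597$.

Coefficient of Correlation Values
   $r = -1.0$: perfect negative correlation
   $-1.0 < r < 0$: increasing degree of negative correlation as $r$ approaches $-1.0$
   $r = 0$: no correlation
   $0 < r < +1.0$: increasing degree of positive correlation as $r$ approaches $+1.0$
   $r = +1.0$: perfect positive correlation

[Figure: Coefficient of Correlation Examples, with panels $r = 1$, $r = -1$, $r = 0.89$, $r = 0$]

6.2.6 Introduction to Non-linear regression

Basic concepts; non-linear models and how to linearize them

Non-linear regression
1. The dependent variable $y$ is not linearly related to $x$.
2. The relationship can be converted into a linear one by variable substitution.
3. The parameter estimates are then obtained by the method of least squares.
4. Not every non-linear model can be converted into a linear model.

Some common non-linear models

Exponential function
   Basic form: $y = \alpha e^{\beta x}$
   Linearization: taking logarithms of both sides gives $\ln y = \ln\alpha + \beta x$. Let $y' = \ln y$; then $y' = \ln\alpha + \beta x$.
   [Figure: graph of the function]

Power function
   Basic form: $y = \alpha x^{\beta}$
   Linearization: taking logarithms of both sides gives $\lg y = \lg\alpha + \beta \lg x$. Let $y' = \lg y$ and $x' = \lg x$; then $y' = \lg\alpha + \beta x'$.
   [Figure: graph of the function]

Hyperbolic function
   Basic form: $\frac{1}{y} = \alpha + \frac{\beta}{x}$
   Linearization: let $y' = 1/y$ and $x' = 1/x$; then $y' = \alpha + \beta x'$.
   [Figure: graph of the function]

Logarithmic function
   Basic form: $y = \alpha + \beta \lg x$
   Linearization: let $x' = \lg x$; then $y = \alpha + \beta x'$.
   [Figure: graph of the function]

S-shaped curve
   Basic form: $y = \frac{1}{\alpha + \beta e^{-x}}$
   Linearization: let $y' = 1/y$ and $x' = e^{-x}$; then $y' = \alpha + \beta x'$.
   [Figure: graph of the function]

Non-linear regression (worked example)

Example: to study the relationship between production rate and scrap rate, the data in the table below were recorded. Fit an appropriate model.

[Figure: scatter plot of production rate against scrap rate]

1. Linear model $y = \beta_0 + \beta_1 x + \varepsilon$: the fit is $\hat{y} = 2.671 + 0.0018x$.
2. Exponential model $y = \alpha\beta^{x}$: the fit is $\hat{y} = 4.05\,(1.0002)^{x}$.
3. Exponential model: $\hat{y} = 4.003\,e^{0.000219x}$.
4. Comparison.
5. Residual sum of squares of the straight line: 5.3371 …

… Average $\lambda_j = 1$ (correlation matrix)

[Figure: Principal Components Analysis: Eigenvalues $\lambda_1$, $\lambda_2$]

PCA: Terminology
   The jth principal component is the jth eigenvector of the correlation/covariance matrix.
   The coefficients, $a_{jk}$, are elements of the eigenvectors and relate the original variables (standardized if using the correlation matrix) to the components.
   Scores are the values of the units on the components (produced using the coefficients).
   The amount of variance accounted for by a component is given by its eigenvalue, $\lambda_j$.
   The proportion of variance accounted for by a component is given by $\lambda_j / \sum_j \lambda_j$.
   The loading of the kth original variable on the jth component is given by $a_{jk}\sqrt{\lambda_j}$, the correlation between variable and component.

How many components to use?

If $\lambda_j$ … 1, along the singular directions $u_i$. For the OLS solution the filter factors equal 1 for $i = 1, \dots, p$, i.e. all the directions $u_i$ contribute equally.

PCR

Use the linear combinations $z_m = X v_m$ as new features. $v_j$ is the principal component direction (column of $V$) corresponding to the jth largest element of $D$, i.e. the directions of maximal sample variance. For some $M \le p$, form the derived input vectors $[z_1 \dots z_M] = [X v_1 \dots X v_M]$. Regressing $y$ on $z_1, \dots, z_M$ gives the solution
$$\hat{y}^{pcr} = \bar{y}\mathbf{1} + \sum_{m=1}^{M} \hat{\theta}_m z_m, \qquad \text{where } \hat{\theta}_m = \frac{\langle z_m, y\rangle}{\langle z_m, z_m\rangle}$$

PCR continued

The mth principal component direction $v_m$ solves:
$$\max_{\alpha}\ \mathrm{Var}(X\alpha) \quad \text{subject to } \lVert\alpha\rVert = 1,\ \alpha^{T} S v_l = 0,\ l = 1, \dots, m-1$$
where $S$ is the sample covariance matrix of the inputs. The filter factors become 1 for $i \le M$ and 0 for $i > M$, i.e. PCR discards the $p - M$ smallest eigenvalue components from the OLS solution. If $p = M$ it gives the OLS solution.

6.6 Introduction to Origin

6.6.1 Introduction

Origin is professional graphing and data analysis software for scientists and engineers. It has been growing in popularity among scientists and engineers as a serious data analysis and graphing package since 1991. Origin is used in hundreds of large corporations and around a thousand colleges and universities worldwide. Across its various versions, OriginLab has remained committed to the mission of making Origin the best scientific graphing and data analysis software. Along with its easy-to-use graphical interface, Origin offers intuitive, yet powerful, research tools for the daily needs of the researcher. The latest version is Origin 7.5.

Menus and Menu Commands

Origin's menu bar provides commands to perform operations on the active window and to perform general operations such as opening a Help file or turning on the display of a toolbar. The menu bar changes as you change the active window. For example, the following figures compare the worksheet and graph menu bars.

[Figure: worksheet and graph menu bars]

Origin is graphics visualization and data analysis software developed by OriginLab Corporation of the United States (formerly Microcal). It is an advanced data analysis and plotting tool commonly used by researchers and engineers. Since its release in 1991, its ease of operation and open functionality quickly made it one of the internationally popular analysis packages, and it is recognized as fast, flexible, and easy-to-learn engineering plotting software. Its use in China is also growing steadily, and the highest version at present is Origin 7.5. Other popular graphics visualization and data analysis packages include Matlab, Mathematica, and Maple. These packages are powerful and meet many needs of scientific work, but using them requires some knowledge of computer programming and matrices, as well as familiarity with a large number of functions and commands. Using Origin, by contrast, is as simple as using Excel or Word: most tasks can be completed, with satisfactory results, just by clicking the mouse and choosing menu commands.

Like Excel and Word, Origin is a multiple-document-interface application. It saves all work in a Project (*.OPJ) file. This file can contain several child windows, such as Worksheet, Graph, Matrix, and Excel. The child windows are linked to one another, so data are updated immediately. Child windows can be saved together with the Project file, or saved separately so that other programs can use them. Origin has two main functions: data plotting and data analysis. Plotting is mainly template-based, with more than 50 2D and 3D graph templates provided. Users can plot with these templates or set up their own as needed. Data analysis includes powerful tools for sorting, calculation, statistics, smoothing, fitting, and spectrum analysis. Using these tools likewise only requires clicking a toolbar button or choosing a menu command.

Building on Origin 7.0, OriginLab developed OriginPro and add-on modules. Users can build the special tools they need in OriginPro. OriginPro's flexible interface is quick and convenient to use, so users can concentrate on analyzing the data in a graph rather than on handling the graph itself. The add-on modules add special advanced data analysis functions to Origin and OriginPro, which can make up for the shortcomings of Origin 7.0 relative to Matlab and Mathematica. Users can define their own mathematical functions and graph templates, add menu commands and command buttons, and call Origin C and NAG functions.

[Figure: the Origin interface]

6.6.2 Origin as a data analysis and plotting tool

1 Overview
2 Creating data files
3 Editing data
4 Plotting graphs
5 Editing and formatting graphs
6 Using the Tools toolbar
7 Data analysis: curve fitting
8 Exporting Origin graph files

1.1 Main functions of Origin

Plotting from data or from functions
Fitting of graphs

1.2 The Origin workspace

[Figure: the Origin workspace, showing the worksheet window, a child window, the Project Explorer, and the graph window]

1. Title bar
2. Menu bar
3. Toolbars (see "Opening toolbars" below)
4. Child windows (see "Types of child window" below)
5. Project Explorer: the Project Explorer is a tool to help you organize your Origin projects
6. Status bar

Opening toolbars: select View: Toolbars from the Origin menu bar. When a workbook (Excel) is active, select Window: Origin Toolbars.

[Figure: Toolbars dialog box]

Types of child window:
   The Worksheet Window
   The Excel Workbook Window
   The Graph Window
   The Function Graph Window
   The Layout Page Window
Attention: each child window has its own menu structure, which is displayed when the window is active.

1.3 Basic operation methods

1. Use the corresponding menu command
2. Use a toolbar button
3. Right-click and choose the corresponding command from the shortcut menu
4. Select an object and double-click it to open a dialog box

1.4 Starting and exiting Origin

1. Starting Origin
   Use the desktop shortcut icon, or
   "Start" → "Programs" → the "Microcal Origin 6.0" shortcut icon
2. Exiting Origin, in either of two ways:
   Click the close button in the upper-right corner, or
   choose "File" → "Exit" from the Origin window menu
Note: distinguish between exiting Origin and closing a child window.

1.5 Saving Origin files

Save a project
Save a child window separately as a file
Save a template as a file
File types and file extensions that can be saved

File type         File extension   Note
Project           OPJ              Cannot be saved as a template
Graph
Worksheet         OGW              Template extension is OTW
Excel Workbook    XLS              Cannot be saved as a template
Layout Page       OTP              Cannot be saved as a file
Matrix            OGM              Template extension is OTM
Function Graph    OGG              Template extension is OTP
Notes             TXT              Cannot be saved as a template

2 Creating data files

Origin provides several ways to add data to the worksheet:
1. Entering data using the keyboard.
2. Importing a file.
3. Pasting data from another application using the Clipboard.
4. Pasting data from another (or the same) Origin worksheet using the Clipboard.
5. Using an Excel Workbook window (opening or creating an Excel worksheet).
6. Using a function to set column values.

3 Editing data

3.1 About worksheets
3.2 Selecting in the worksheet
3.3 Editing and modifying data
3.4 Inserting, deleting, and rearranging columns
3.5 Inserting and deleting rows
3.6 Deleting a worksheet
3.7 Formatting the worksheet

3.1 About worksheets

The worksheet window is organized into vertical columns and horizontal rows.
At the intersection of each column and row is a cell.
Each cell can contain a single numeric, text, numeric and text, date, or time value.
Origin projects can contain multiple worksheets.

3.2 Selecting in the worksheet

1. Selecting cells: click and drag to select the cells.
2. Selecting rows: select the row heading and drag the mouse, or select a row heading and then click another while holding the SHIFT key.
3. Selecting adjacent columns: select the first column heading and click and drag, or select a column heading and then click another while holding the SHIFT key.
4. Selecting nonadjacent columns: select a column heading, then click the others while holding the CTRL key.

3.3 Editing and modifying data

1. Modifying data
   To change a value, select the cell and type the correct value. (Origin automatically overwrites the value in the cell.)
   To edit a cell value, select the cell and press F2, or click at the desired position within the cell.
   Delete: deletes one value to the right of the cursor, or deletes all highlighted values.

2. Inserting data within a column
   To insert a cell in a column, select the cell that is directly below where you want to insert the new cell.
   Then select Edit: Insert, or right-click and select Insert from the shortcut menu.
   The new cell is inserted above the selected cell.

3. Deleting data
   To clear the entire contents of a worksheet, select Edit: Clear Worksheet.
   To delete the contents of a range of cells from the worksheet, select Edit: Clear. The contents are removed and the cells remain.
   To delete a range of cells from the worksheet, select Edit: Delete. The contents and the cells are both removed.
   Attention: (1) Edit: Delete deletes the selected values and cells. (2) The Delete key on the keyboard deletes the worksheet values only.

3.4 Inserting, deleting, and rearranging columns

Select first, then operate.

Adding columns: perform one of the following operations:
   Select Column: Add New Columns.
   Click the Add New Columns button on the Standard toolbar.
   Right-click inside the worksheet window but to the right of the worksheet grid, and select Add New Column from the shortcut menu.
Inserting columns: select Edit: Insert, or right-click and select Insert from the shortcut menu.
Deleting columns: select Edit: Delete, or right-click and select Delete.
   Note: to clear the column values but keep the columns, select Edit: Clear.
Moving columns: select Column: Move to First or Column: Move to Last.

3.5 Inserting and deleting rows

Select first, then operate.

Inserting rows: select Edit: Insert, or right-click and select Insert.
Deleting rows: select Edit: Delete, or right-click and select Delete.

3.6 Deleting a worksheet from a project

To delete a worksheet from the project, perform one of the following operations:
   Click the Close Window button in the upper-right corner of the worksheet.
   Right-click on the worksheet window icon in Project Explorer and select Delete Window from the shortcut menu.
   Click on the worksheet window icon in Project Explorer and then press Delete.

3.7 Formatting the worksheet

Purpose:
1. Change the column name (Column Name)
2. Change the column designation (Plot Designation)
3. Change the data type (Display)
4. Change the number format (Format)
5. Change the numeric display format (Numeric Display)
6. Change the column width (Column Width)
7. Add a description to the column label (Column Label)

Method: double-click the column heading in the worksheet to open the Worksheet Column Format dialog box.

Number format (Format) options:
   Decimal format
   Scientific notation
   Engineering notation
   Decimal format with a thousands separator

Numeric display format (Numeric Display) options:
   Default decimal display
   Set the position of the decimal point
   Set the number of significant digits

4 Plotting graphs

4.1 Basic terminology of the graph window
4.2 Plotting methods

4.1 Basic terminology of the graph window

Page: each graph window contains a single editable page. The page serves as a backdrop.
Layer: a movable, sizeable unit.
Frame: the frame is a rectangular box.
Graph: a graph includes at least three elements: a set of X, Y, Z coordinate axes, one or more data plots, and associated text and graphic labels.
Data Plot: the data plot is the visual display of one or more data sets in a graph window.

4.2 Plotting methods

1. Plotting when a worksheet window is active
2. Plotting when a graph window is active
3. Plotting several curves on the same graph
4. Plotting graphs with double X and Y axes

1. Plotting when a worksheet window is active

   Select the columns or the data range, then select Plot: Graph. The graph is drawn automatically.
   To assign the columns to the axes yourself, do not select a data range. Select Plot: Graph directly and make the settings in the dialog box that opens.

2. Plotting when a graph window is active

   Adding data using the Layer n dialog box (right-click the layer icon). This method respects the worksheet column plotting designations.
   Adding data using the Select Columns for Plotting dialog box: select the data columns in the plotting dialog box to add …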
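The three normalization functions of Section 6.1 can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names, the target-range parameters `new_min` and `new_max`, and the sample data are assumptions made for the example, and the Z-score uses the sample standard deviation S in place of the population value, as the text allows.

```python
from math import sqrt


def linear_normalize(xs, new_min=0.0, new_max=1.0):
    """Map xs linearly so the smallest value becomes new_min and the largest new_max."""
    lo, hi = min(xs), max(xs)
    if hi == lo:
        raise ValueError("all values are equal, so the range is zero")
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (x - lo) * scale for x in xs]


def ratio_normalize(xs):
    """Divide every value by the total, so the results sum to 1."""
    total = sum(xs)
    if total == 0:
        raise ValueError("the values sum to zero")
    return [x / total for x in xs]


def zscore_normalize(xs):
    """Subtract the mean and divide by the sample standard deviation S."""
    n = len(xs)
    mean = sum(xs) / n
    s = sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return [(x - mean) / s for x in xs]


# Hypothetical measurements, used only to exercise the functions.
data = [2.0, 4.0, 6.0, 8.0]
print("linear :", linear_normalize(data))
print("ratio  :", ratio_normalize(data))
print("z-score:", zscore_normalize(data))
```

With the default target range of 0 to 1, the linear version returns each value as a fraction of the observed range, which is the percentage reading mentioned in the text.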
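The least-squares coefficients of Sections 6.2.3 and 6.2.4 (sample slope, sample Y-intercept, and the regression equation used for prediction) can be computed directly from their definitions. The sketch below assumes the usual closed-form solution of the least-squares problem. The names `fit_line` and `predict` and the five data points are hypothetical and are not the vinylon data of Example 1.

```python
def fit_line(xs, ys):
    """Return (b0, b1) minimizing the sum of squared differences SSE."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    if ss_xx == 0:
        raise ValueError("all x values are equal, so the slope is undefined")
    b1 = ss_xy / ss_xx        # sample slope
    b0 = y_bar - b1 * x_bar   # sample Y-intercept
    return b0, b1


def predict(b0, b1, x):
    """Evaluate the regression equation at x."""
    return b0 + b1 * x


# Made-up observations for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]
b0, b1 = fit_line(xs, ys)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"prediction at x = 6: {predict(b0, b1, 6.0):.4f}")
```

Centering the sums on the means is algebraically the same as the computation-table form of the slope, and it tends to lose less precision when the X values are large.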
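The variation measures of Section 6.2.5 follow from the fitted line: the total sum of squares splits into an explained and an unexplained part, and their ratio gives the coefficient of determination. The following sketch illustrates this on made-up data. The function name `variation_measures` and the dictionary keys are assumptions for the example, and r is given the sign of the slope.

```python
from math import copysign, sqrt


def variation_measures(xs, ys):
    """Fit a least-squares line and return SSyy, SSR, SSE, r2 and r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]

    ss_yy = sum((y - y_bar) ** 2 for y in ys)              # total variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)            # explained variation
    sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))    # unexplained variation
    if ss_yy == 0:
        raise ValueError("all y values are equal, so r2 is undefined")
    r2 = ssr / ss_yy
    r = copysign(sqrt(r2), b1)  # r carries the sign of the slope
    return {"SSyy": ss_yy, "SSR": ssr, "SSE": sse, "r2": r2, "r": r}


# Made-up observations for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.2, 4.1, 6.2, 7.9, 10.1]
for name, value in variation_measures(xs, ys).items():
    print(f"{name} = {value:.6f}")
```

The decomposition SSyy = SSR + SSE holds for a least-squares line with an intercept, so r2 always falls between 0 and 1.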
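The linearization idea of Section 6.2.6 can be illustrated with the exponential model: take logarithms of y, fit a straight line, and transform the intercept back. The sketch below is a minimal example under stated assumptions. The name `fit_exponential` is hypothetical, and the data are generated from a known curve rather than taken from the production-rate example.

```python
from math import exp, log


def fit_exponential(xs, ys):
    """Fit y = alpha * exp(beta * x) by least squares on (x, ln y)."""
    if any(y <= 0 for y in ys):
        raise ValueError("the log transform needs strictly positive y values")
    log_ys = [log(y) for y in ys]          # y' = ln y
    n = len(xs)
    x_bar = sum(xs) / n
    ly_bar = sum(log_ys) / n
    ss_xx = sum((x - x_bar) ** 2 for x in xs)
    ss_xy = sum((x - x_bar) * (ly - ly_bar) for x, ly in zip(xs, log_ys))
    beta = ss_xy / ss_xx                   # slope of the linearized model
    alpha = exp(ly_bar - beta * x_bar)     # intercept is ln(alpha)
    return alpha, beta


# Synthetic data generated from y = 4 * exp(0.5 * x), for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [4.0 * exp(0.5 * x) for x in xs]
alpha, beta = fit_exponential(xs, ys)
print(f"alpha = {alpha:.6f}, beta = {beta:.6f}")
```

Least squares on the transformed data minimizes the squared error in ln y rather than in y, so on noisy data the result need not coincide with a direct non-linear fit of the original model.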
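The PCA terminology listed above (eigenvalues of the correlation matrix, proportion of variance, loadings) can be made concrete on a small data set. The sketch below is an illustration only. It finds the eigenvectors by power iteration with deflation, a technique chosen here to keep the example free of external libraries and not one named in the slides. All function names and the two data columns are hypothetical.

```python
from math import sqrt


def correlation_matrix(columns):
    """columns: list of p variables, each a list of n observations."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        standardized.append([(v - mean) / sd for v in col])
    return [[sum(a * b for a, b in zip(zj, zk)) / (n - 1) for zk in standardized]
            for zj in standardized]


def mat_vec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def dominant_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    p = len(matrix)
    # Asymmetric start vector, to avoid starting orthogonal to the answer
    # in simple cases. This is a heuristic, not a guarantee.
    vec = [float(i + 1) for i in range(p)]
    norm = sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iterations):
        w = mat_vec(matrix, vec)
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # the matrix is numerically zero along vec
            break
        vec = [v / norm for v in w]
    eigenvalue = sum(v * w for v, w in zip(vec, mat_vec(matrix, vec)))
    return eigenvalue, vec


def principal_components(columns):
    """Return [(eigenvalue, eigenvector), ...] in decreasing eigenvalue order."""
    matrix = correlation_matrix(columns)
    p = len(matrix)
    pairs = []
    for _ in range(p):
        lam, vec = dominant_eigenpair(matrix)
        pairs.append((lam, vec))
        # Deflation: subtract the component just found.
        matrix = [[matrix[i][j] - lam * vec[i] * vec[j] for j in range(p)]
                  for i in range(p)]
    return pairs


# Two made-up variables measured on five units.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
pairs = principal_components([x1, x2])
total = sum(lam for lam, _ in pairs)
for j, (lam, vec) in enumerate(pairs, start=1):
    loadings = [a * sqrt(max(lam, 0.0)) for a in vec]
    print(f"component {j}: eigenvalue = {lam:.4f}, "
          f"proportion = {lam / total:.4f}, loadings = {loadings}")
```

Because the correlation matrix is used, the eigenvalues sum to the number of variables, which is why their average is 1. Power iteration can converge slowly when two eigenvalues are nearly equal, so a library eigen-solver would be the usual choice for real data.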
