Applied Multivariate Statistical Analysis 1.ppt

Preface to the 1st Edition

Most of the observable phenomena in the empirical sciences are of a multivariate nature. In financial studies, assets in stock markets are observed simultaneously and their joint development is analyzed to better understand general tendencies and to track indices. The underlying theoretical structure of these and many other quantitative studies of applied sciences is multivariate. This book on Applied Multivariate Statistical Analysis presents the tools and concepts of multivariate data analysis with a strong focus on applications.

The aim of the book is to present multivariate data analysis in a way that is understandable for non-mathematicians and practitioners who are confronted by statistical data analysis. This is achieved by focusing on the practical relevance and through the e-book character of this text. All practical examples may be recalculated and modified by the reader using a standard web browser and without reference or application of any specific software.

The book is divided into three main parts. The first part is devoted to graphical techniques describing the distributions of the variables involved. The second part deals with multivariate random variables and presents, from a theoretical point of view, distributions, estimators and tests for various practical situations. The last part is on multivariate techniques and introduces the reader to the wide selection of tools available for multivariate data analysis. All data sets are given in the appendix and are downloadable from www.md-. The text contains a wide variety of exercises, the solutions of which are given in a separate textbook. In addition, a full set of transparencies on www.md- is provided, making it easier for an instructor to present the materials in this book. All transparencies contain hyperlinks to the statistical web service so that students and instructors alike may recompute all examples via a standard web browser.

Weeks 1-2

UNIT-I Descriptive Techniques
1 Comparison of Batches
1.1 Boxplots
1.2 Histograms
1.3 Scatterplots
1.4 Data Set - Boston Housing

1 Comparison of Batches

Multivariate statistical analysis is concerned with analyzing and understanding data in high dimensions. We suppose that we are given a set $\{x_i\}_{i=1}^{n}$ of $n$ observations of a variable vector $X$ in $\mathbb{R}^p$. That is, we suppose that each observation $x_i$ has $p$ dimensions, $x_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$, and that it is an observed value of a variable vector $X \in \mathbb{R}^p$. Therefore, $X$ is composed of $p$ random variables, $X = (X_1, X_2, \ldots, X_p)$, where $X_j$, for $j = 1, \ldots, p$, is a one-dimensional random variable.

How do we begin to analyze this kind of data? Before we investigate questions on what inferences we can reach from the data, we should think about how to look at the data. This involves descriptive techniques. Questions that we could answer by descriptive techniques are:
Are there components of $X$ that are more spread out than others?
Are there some elements of $X$ that indicate subgroups of the data?
Are there outliers in the components of $X$?
How "normal" is the distribution of the data?

1.1 Boxplots

Summary

The median and mean bars are measures of location. The relative location of the median (and the mean) in the box is a measure of skewness. The length of the box and whiskers is a measure of spread. The length of the whiskers indicates the tail length of the distribution. The outlying points are marked with one of two distinct symbols, depending on whether they lie outside $F_{UL} \pm 1.5\, d_F$ or outside $F_{UL} \pm 3\, d_F$, respectively (here $F_U$ and $F_L$ are the upper and lower fourths and $d_F = F_U - F_L$ is their distance). Boxplots do not indicate multimodality or clusters. If we compare the relative size and location of the boxes, we are comparing distributions.

1.2 Histograms

Histograms are density estimates. A density estimate gives a good impression of the distribution of the data. In contrast to boxplots, density estimates show possible multimodality of the data. The idea is to locally represent the data density by counting the number of observations in a sequence of consecutive intervals (bins) with origin $x_0$. Let $B_j(x_0, h)$ denote the bin of length $h$ which is the element of a bin grid starting at $x_0$:

$$B_j(x_0, h) = [x_0 + (j-1)h,\; x_0 + jh), \quad j \in \mathbb{Z},$$

where $[\cdot, \cdot)$ denotes a left-closed and right-open interval.

If $\{x_i\}_{i=1}^{n}$ is an i.i.d. sample with density $f$, the histogram is defined as follows:

$$\hat{f}_h(x) = n^{-1} h^{-1} \sum_{j \in \mathbb{Z}} \sum_{i=1}^{n} I\{x_i \in B_j(x_0, h)\}\, I\{x \in B_j(x_0, h)\}. \quad (1.7)$$

In sum (1.7) the first indicator function $I\{x_i \in B_j(x_0, h)\}$ counts the number of observations falling into bin $B_j(x_0, h)$. The second indicator function $I\{x \in B_j(x_0, h)\}$ is responsible for "localizing" the counts around $x$. The parameter $h$ is a smoothing or localizing parameter and controls the width of the histogram bins. An $h$ that is too large leads to very big blocks and thus to a very unstructured histogram. On the other hand, an $h$ that is too small gives a very variable estimate with many unimportant peaks.

[Figure 1.6: histograms of the diagonal of the counterfeit bank notes; panels h = 0.1, h = 0.2, h = 0.3, h = 0.4.]

The effect of $h$ is given in detail in Figure 1.6. It contains the histogram (upper left) for the diagonal of the counterfeit bank notes for $x_0 = 137.8$ (the minimum of these observations) and $h = 0.1$. Increasing $h$ to $h = 0.2$ and using the same origin, $x_0 = 137.8$, results in the histogram shown in the lower left of the figure. This density histogram is somewhat smoother due to the larger $h$. The bin width is next set to $h = 0.3$ (upper right). From this histogram, one has the impression that the distribution of the diagonal is bimodal with peaks at about 138.5 and 139.9. The detection of modes requires a fine tuning of the bin width. Using methods from smoothing methodology one can find an "optimal" bin width $h$ for $n$ observations:

$$h_{opt} = \left( \frac{24 \sqrt{\pi}}{n} \right)^{1/3}.$$

In Figure 1.7, we show histograms with $x_0 = 137.65$ (upper left), $x_0 = 137.75$ (lower left), $x_0 = 137.85$ (upper right), and $x_0 = 137.95$ (lower right). All the graphs have been scaled equally on the y-axis to allow comparison. One sees that, despite the fixed bin width $h$, the interpretation is not facilitated. The shift of the origin $x_0$ (to 4 different locations) created 4 different histograms. This property of histograms strongly contradicts the goal of presenting data features.

Summary

Modes of the density are detected with a histogram. Modes correspond to strong peaks in the histogram. Histograms with the same $h$ need not be identical. They also depend on the origin $x_0$ of the grid. The influence of the origin $x_0$ is drastic. Changing $x_0$ creates different looking histograms. The consequence of an $h$ that is too large is an unstructured histogram that is too flat. A bin width $h$ that is too small results in an unstable histogram. There is an "optimal" $h = (24\sqrt{\pi}/n)^{1/3}$. It is recommended to use averaged histograms. They are kernel densities.

1.4 Scatterplots

Scatterplots are bivariate or trivariate plots of variables against each other. They help us understand relationships among the variables of a data set. A downward-sloping scatter indicates that as we increase the variable on the horizontal axis, the variable on the vertical axis decreases. An analogous statement can be made for upward-sloping scatters.

Figure 1.12 plots the 5th column (upper inner frame) of the bank data against the 6th column (diagonal). The scatter is downward-sloping.
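Returning to the histogram of Section 1.2, the role of the origin and of the bin width can be made concrete with a short sketch. The Python code below is an illustration under stated assumptions, not the code behind the figures. The function names `histogram_density` and `reference_bin_width` and the four toy observations are hypothetical, and the bin width rule is taken here to be $(24\sqrt{\pi}/n)^{1/3}$, which is derived for standard normal data, so data on another scale would need rescaling first.

```python
import math
import numpy as np

def histogram_density(x, sample, x0, h):
    """Histogram estimate at the points x, using bins
    B_j = [x0 + (j-1)h, x0 + jh) on a grid with origin x0 and width h."""
    x = np.asarray(x, dtype=float)
    sample = np.asarray(sample, dtype=float)
    # Bin index j of every evaluation point and of every observation.
    jx = np.floor((x - x0) / h).astype(int) + 1
    js = np.floor((sample - x0) / h).astype(int) + 1
    # For each evaluation point, count the observations in the same bin.
    counts = (js[None, :] == jx[:, None]).sum(axis=1)
    # Divide by n*h so that the estimate integrates to one.
    return counts / (len(sample) * h)

def reference_bin_width(n):
    """Bin width (24*sqrt(pi)/n)^(1/3); assumes roughly standard normal data."""
    return (24.0 * math.sqrt(math.pi) / n) ** (1.0 / 3.0)

# Toy observations (hypothetical, not the bank note data).
toy = [0.05, 0.15, 0.25, 0.95]

# Same bin width, two origins: the estimate at x = 0.7 changes.
at_origin_0 = histogram_density([0.2, 0.7], toy, x0=0.0, h=0.5)
at_origin_shifted = histogram_density([0.2, 0.7], toy, x0=-0.2, h=0.5)
print(at_origin_0)        # densities 1.5 and 0.5
print(at_origin_shifted)  # densities 1.5 and 0.0

print(reference_bin_width(100))
```

With the origin at 0 the observation 0.95 shares a bin with the point 0.7, while with the origin at -0.2 it does not, so the two estimates disagree even though $h$ is unchanged. This is the dependence on the origin that the text above describes.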
