




Jianwei Gou

Slide 1: Principal Components Analysis

Objectives:
- Understand the principles of principal components analysis (PCA)
- Recognize conditions under which PCA may be useful
- Use R procedure PRINCOMP to perform a principal components analysis
- Interpret PRINCOMP output

Slide 2: Typical Form of Data

A data set in an 8 x 3 matrix. The rows could be species and the columns sampling sites.

    X =
        100   97   99
         96   90   90
         80   75   60
         75   85   95
         62   40   28
         77   80   78
         92   91   80
         75   85  100

A matrix is often referred to as an n x p matrix (n for the number of rows and p for the number of columns). Our matrix has 8 rows and 3 columns, and is an 8 x 3 matrix.

Slide 3: What are Principal Components?

Principal components are linear combinations of the observed variables:

    Y = b1 X1 + b2 X2 + ... + bn Xn

The coefficients of these principal components are chosen to meet three criteria. What are the three criteria?

Slide 4: What are Principal Components?

The three criteria:
- There are exactly p principal components (PCs), each being a linear combination of the observed variables;
- The PCs are mutually orthogonal (i.e., perpendicular and uncorrelated);
- The components are extracted in order of decreasing variance.

Slide 5: A Simple Data Set

Correlation matrix:

          X      Y
    X     1      1
    Y     1      1

Covariance matrix:

          X      Y
    X     1      1.414
    Y     1.414  2

Slide 6: General Patterns

- The total variance is 3 (= 1 + 2).
- The two variables, X and Y, are perfectly correlated, with all points falling on the regression line.
- The spatial relationship among the 5 points can therefore be represented by a single dimension.
- PCA is a dimension-reduction technique. What would happen if we apply PCA to the data?

Slide 7: Graphic PCA

[Figure: graphic illustration of PCA]

Slide 8: R Program

    # Principal Components Analysis
    # entering raw data and extracting PCs
    # from the correlation matrix
    x = c(-1.264911064, -0.632455532, 0, 0.632455532, 1.264911064)
    y = c(-1.788854382, -0.894427191, 0, 0.894427191, 1.788854382)
    mydata = cbind(x, y)
    fit <- princomp(mydata, cor=TRUE)
    summary(fit)              # print variance accounted for
    loadings(fit)             # pc loadings
    plot(fit, type="lines")   # scree plot
    fit$scores                # the principal components
    biplot(fit)

Slide 9: Steps in a PCA

- Have at least two variables
- Generate a correlation or variance-covariance matrix
- Obtain eigenvalues and eigenvectors (this is called an eigenvalue problem, and will be illustrated with a simple numerical example)
- Generate principal component (PC) scores
- Plot the PC scores in the space with reduced dimensions

All of these can be automated by using R.

Slide 10: Covariance or Correlation Matrix?

[Figure: abundance (axis from 0 to 40) of Sp1 and Sp2]

Slide 11 (Xuhua Xia): Covariance or Correlation Matrix?

Slide 12 (Xuhua Xia): Covariance or Correlation Matrix?

Slide 13: The Eigenvalue Problem

- A is the covariance matrix.
- The eigenvalues are the set of values λ that satisfy the condition |A - λI| = 0.
- Solving this condition gives the eigenvalues (there are n eigenvalues for n variables).
- The sum of the eigenvalues is equal to the sum of the variances in the covariance matrix.

Finding the eigenvalues and eigenvectors is called an eigenvalue problem (or a characteristic value problem).

Slide 14: Get the Eigenvectors

An eigenvector is a vector x that satisfies the following condition:

    A x = λ x

In our case A is a variance-covariance matrix of order 2, and x is a vector specified by x1 and x2.

Slide 15: Get the Eigenvectors

We want to find an eigenvector of unit length, i.e., x1^2 + x2^2 = 1. Substituting the relation between x1 and x2 from the previous slide into this constraint, we can solve for x1.

The first eigenvector is the one associated with the largest eigenvalue.

Slide 16: Get the PC Scores

The PC scores (first PC score and second PC score) are obtained from the original data (x and y) and the eigenvectors. The original data in a two-dimensional space is reduced to one dimension.

Slide 17: What Are Principal Components?

Principal components are a new set of variables, which are linear combinations of the observed ones, with these properties:
- Because of the decreasing variance property, much of the variance (the information in the original set of p variables) tends to be concentrated in the first few PCs. This implies that we can drop the last few PCs without losing much information. PCA is therefore considered a dimension-reduction technique.
- Because PCs are orthogonal, they can be used instead of the original variables in situations where having orthogonal variables is desirable (e.g., regression).

Slide 18: Index of hidden variables

- The ranking of Asian universities by the Asian Week
- HKU is ranked second in financial resources, but seventh in academic research
- How did HKU get ranked third?
- Is there a more objective way of ranking?
- An illustrative example:

Slide 19: A Simple Data Set

- School 5 is clearly the best school
- School 1 is clearly the worst school

Slide 20: Graphic PCA

[Figure: values shown are -1.7889, -0.8944, 0, 0.8944, 1.7889]

Slide 21: Crime Data in 50 States

    STATE         MURDER   RAPE   ROBBE   ASSAU   BURGLA   LARCEN    AUTO
    ALABAMA         14.2   25.2    96.8   278.3   1135.5   1881.9   280.7
    ALASKA          10.8   51.6    96.8   284.0   1331.7   3369.8   753.3
    ARIZONA          9.5   34.2   138.2   312.3   2346.1   4467.4   439.5
    ARKANSAS         8.8   27.6    83.2   203.4    972.6   1862.1   183.4
    CALIFORNIA      11.5   49.4   287.0   358.0   2139.4   3499.8   663.5
    COLORADO         6.3   42.0   170.7   292.9   1935.2   3903.2   477.1
    CONNECTICUT      4.2   16.8   129.5   131.8   1346.0   2620.7   593.2
    DELAWARE         6.0   24.9   157.0   194.2   1682.6   3678.4   467.0
    FLORIDA         10.2   39.6   187.9   449.1   1859.9   3840.5   351.4
    GEORGIA         11.7   31.1   140.5   256.5   1351.1   2170.2   297.9
    HAWAII           7.2   25.5   128.0    64.1   1911.5   3920.4   489.4
    IDAHO            5.5   19.4    39.6   172.5   1050.8   2599.6   237.6
    ILLINOIS         9.9   21.8   211.3   209.0   1085.0   2828.5   528.6
    .

    PROC PRINCOMP OUT=CRIMCOMP;

The full SAS program:

    DATA CRIME;
    TITLE 'CRIME RATES PER 100,000 POP BY STATE';
    INPUT STATENAME $1-15 MURDER RAPE ROBBERY ASSAULT BURGLARY LARCENY AUTO;
    CARDS;
    Alabama         14.2 25.2  96.8 278.3 1135.5 1881.9  280.7
    Alaska          10.8 51.6  96.8 284.0 1331.7 3369.8  753.3
    Arizona          9.5 34.2 138.2 312.3 2346.1 4467.4  439.5
    Arkansas         8.8 27.6  83.2 203.4  972.6 1862.1  183.4
    California      11.5 49.4 287.0 358.0 2139.4 3499.8  663.5
    Colorado         6.3 42.0 170.7 292.9 1935.2 3903.2  477.1
    Connecticut      4.2 16.8 129.5 131.8 1346.0 2620.7  593.2
    Delaware         6.0 24.9 157.0 194.2 1682.6 3678.4  467.0
    Florida         10.2 39.6 187.9 449.1 1859.9 3840.5  351.4
    Georgia         11.7 31.1 140.5 256.5 1351.1 2170.2  297.9
    Hawaii           7.2 25.5 128.0  64.1 1911.5 3920.4  489.4
    Idaho            5.5 19.4  39.6 172.5 1050.8 2599.6  237.6
    Illinois         9.9 21.8 211.3 209.0 1085.0 2828.5  528.6
    Indiana          7.4 26.5 123.2 153.5 1086.2 2498.7  377.4
    Iowa             2.3 10.6  41.2  89.8  812.5 2685.1  219.9
    Kansas           6.6 22.0 100.7 180.5 1270.4 2739.3  244.3
    Kentucky          .    .     .  123.3  872.2 1662.1  245.4
    Louisiana       15.5 30.9 142.9 335.5 1165.5 2469.9  337.7
    Maine            2.4 13.5  38.7 170.0 1253.1 2350.7  246.9
    Maryland         8.0 34.8 292.1 358.9 1400.0 3177.7  428.5
    Massachusetts    3.1 20.8 169.1 231.6 1532.2 2311.3 1140.1
    Michigan         9.3 38.9 261.9 274.6 1522.7 3159.0  545.5
    Minnesota        2.7 19.5  85.9  85.8 1134.7 2559.3  343.1
    Mississippi     14.3 19.6  65.7 189.1  915.6 1239.9  144.4
    Missouri         9.6 28.3 189.0 233.5 1318.3 2424.2  378.4
    Montana          5.4 16.7  39.2 156.8  804.9 2773.2  309.2
    Nebraska         3.9 18.1  64.7 112.7  760.0 2316.1  249.1
    Nevada          15.8 49.1 323.1 355.0 2453.1 4212.6  559.2
    New Hampshire    3.2 10.7  23.2  76.0 1041.7 2343.9  293.4
    New Jersey       5.6 21.0 180.4 185.1 1435.8 2774.5  511.5
    New Mexico       8.8 39.1 109.6 343.4 1418.7 3008.6  259.5
    New York        10.7 29.4 472.6 319.1 1728.0 2782.0  745.8
    North Carolina  10.6 17.0  61.3 318.3 1154.1 2037.8  192.1
    North Dakota     0.9  9.0  13.3  43.8  446.1 1843.0  144.7
    Ohio             7.8 27.3 190.5 181.1 1216.0 2696.8  400.4
    Oklahoma         8.6 29.2  73.8 205.0 1288.2 2228.1  326.8
    Oregon           4.9 39.9 124.1 286.9 1636.4 3506.1  388.9
    Pennsylvania     5.6 19.0 130.3 128.0  877.5 1624.1  333.2
    Rhode Island     3.6 10.5  86.5 201.0 1489.5 2844.1  791.4
    South Carolina  11.9 33.0 105.9 485.3 1613.6 2342.4  245.1
    South Dakota     2.0 13.5  17.9 155.7  570.5 1704.4  147.5
    Tennessee       10.1 29.7 145.8 203.9 1259.7 1776.5  314.0
    Texas           13.3 33.8 152.4 208.2 1603.1 2988.7  397.6
    Utah             3.5 20.3  68.8 147.3 1171.6 3004.6  334.5
    Vermont          1.4 15.9  30.8 101.2 1348.2 2201.0  265.2
    Virginia         9.0 23.3  92.1 165.7  986.2 2521.2  226.7
    Washington       4.3 39.6 106.2 224.8 1605.6 3386.9  360.3
    West Virginia    6.0 13.2  42.2  90.9  597.4 1341.7  163.3
    Wisconsin        2.8 12.9  52.2  63.7  846.9 2614.2  220.7
    Wyoming          5.4 21.9  39.7 173.9  811.6 2772.2  282.0
    ;
    /* PCA on the correlation matrix (the default) */
    PROC PRINCOMP out=crimcomp;
    run;
    PROC PRINT;
      ID STATENAME;
      VAR PRIN1 PRIN2 MURDER RAPE ROBBERY ASSAULT BURGLARY LARCENY AUTO;
    run;
    PROC GPLOT;
      PLOT PRIN2*PRIN1=STATENAME;
      TITLE2 'PLOT OF THE FIRST TWO PRINCIPAL COMPONENTS';
    run;
    /* PCA on the covariance matrix */
    PROC PRINCOMP data=CRIME COV OUT=crimcomp;
    run;
    PROC PRINT;
      ID STATENAME;
      VAR PRIN1 PRIN2 MURDER RAPE ROBBERY ASSAULT BURGLARY LARCENY AUTO;
    run;

    /* Add to have a map view */
    proc sort data=crimcomp out=crimcomp;
      by STATENAME;
    run;
    proc sort data=maps.us2 out=mymap;
      by STATENAME;
    run;
    data both;
      merge mymap crimcomp;
      by STATENAME;
    run;
    proc gmap data=both;
      id _map_geometry_;
      choro PRIN1 PRIN2 / levels=15;
      /* choro PRIN1 / discrete; */
    run;

Slide 24: Correlation Matrix

                MURDER    RAPE  ROBBERY  ASSAULT  BURGLARY  LARCENY    AUTO
    MURDER      1.0000  0.6012   0.4837   0.6486    0.3858   0.1019  0.0688
    RAPE        0.6012  1.0000   0.5919   0.7403    0.7121   0.6140  0.3489
    ROBBERY     0.4837  0.5919   1.0000   0.5571    0.6372   0.4467  0.5907
    ASSAULT     0.6486  0.7403   0.5571   1.0000    0.6229   0.4044  0.2758
    BURGLARY    0.3858  0.7121   0.6372   0.6229    1.0000   0.7921  0.5580
    LARCENY     0.1019  0.6140   0.4467   0.4044    0.7921   1.0000  0.4442
    AUTO        0.0688  0.3489   0.5907   0.2758    0.5580   0.4442  1.0000

If the variables are not correlated, there would be no point in doing PCA. The correlation matrix is symmetric, so we only need to inspect either the upper or the lower triangular matrix.

Slide 25

             Eigenvalue  Difference  Proportion  Cumulative
    PRIN1       4.11496     2.87624    0.587851     0.58785
    PRIN2       1.23872     0.51291    0.176960     0.76481
    PRIN3       0.72582     0.40938    0.103688     0.86850
    PRIN4       0.31643     0.05846    0.045205     0.91370
    PRIN5       0.25797     0.03593    0.036853     0.95056
    PRIN6       0.22204     0.09798    0.03
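The columns of the eigenvalue table are tied together by simple arithmetic, which can help when reading PRINCOMP output. The check below is a sketch under one assumption that the slide does not state explicitly: the eigenvalues come from the correlation matrix of the seven crime variables (the PROC PRINCOMP default when COV is not given), so they sum to p = 7. The symbol \lambda_j for the j-th eigenvalue is notation introduced here for illustration.

```latex
\begin{align*}
\text{Difference}_1 &= \lambda_1 - \lambda_2 = 4.11496 - 1.23872 = 2.87624 \\
\text{Proportion}_1 &= \frac{\lambda_1}{\sum_{j=1}^{7} \lambda_j}
                     = \frac{4.11496}{7} \approx 0.58785 \\
\text{Proportion}_2 &= \frac{\lambda_2}{7} = \frac{1.23872}{7} \approx 0.17696 \\
\text{Cumulative}_2 &= \text{Proportion}_1 + \text{Proportion}_2
                     \approx 0.58785 + 0.17696 = 0.76481
\end{align*}
```

Read this way, the first two principal components together account for roughly 76% of the total (standardized) variance in the seven crime rates.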