
Applied Regression Analysis
Lecture Notes

Xu Ke
School of Statistics and Management
Shanghai University of Finance and Economics
xuke
August 10, 2011

Homework
HW / Due / Problems

Outline
1. R notes
2. Matrices and Vectors
3. Multiple Linear Regression
4. Transformation
5. Residual Analysis and Model Selection

Introduction to R
At its most basic level, R can be viewed as a fancy calculator. R is case sensitive. An object can be created using an assignment operator (<- or =). All objects have two intrinsic attributes: mode and length. There are three main modes: numeric, character, and logical. Missing values are represented by NA. The main objects are vectors, factors, matrices, lists, and data frames. To list the objects in the workspace, use ls(). To delete objects from memory, use rm(), e.g. rm(x). To access help files, use help(), e.g. help(read.csv). To exit from R, type q().

Vectors
To create a vector, use c(), e.g. c(2, 5, 7), c("red", "blue"), c(TRUE, FALSE). Patterned vectors can be created using seq() as well as rep(), e.g. seq(1, 100, by = 2), rep(c("red", "blue"), 3). To extract elements from a vector, use square brackets, e.g. x[2], x[c(1, 3)], x[1:2], x[x > 10]. Arithmetic can be done on R vectors; for example, we can multiply all elements of x by 3 with x * 3.

Logical Vectors and Relational Operators
Comparison operators: <, <=, >, >=, ==, !=. Logical operators: & (and), | (or), ! (not).

Orthogonal Decomposition
Theorem. If $M \subset N$ are subspaces and $x \in N$, then $x$ can be written uniquely as $x = x_0 + x_1$ with $x_0 \in M$ and $x_1 \in N \cap M^{\perp}$. The ranks of these spaces satisfy $r(N) = r(M) + r(N \cap M^{\perp})$.
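The decomposition above can be checked numerically. This is a minimal NumPy sketch (the course itself uses R; the basis B for the subspace M is a hypothetical example, not from the notes):

```python
import numpy as np

# M = C(B), a 2-dimensional subspace of N = R^3 (hypothetical basis).
B = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
P_M = B @ np.linalg.inv(B.T @ B) @ B.T   # orthogonal projection onto M

x = np.array([2.0, 3.0, 7.0])
x0 = P_M @ x          # component of x in M
x1 = x - x0           # component of x in M-perp

assert np.allclose(x0 + x1, x)       # x = x0 + x1, uniquely
assert np.allclose(B.T @ x1, 0.0)    # x1 is orthogonal to M
# rank relation: r(R^3) = r(M) + r(M-perp) = 2 + 1
assert np.linalg.matrix_rank(P_M) == 2
assert np.linalg.matrix_rank(np.eye(3) - P_M) == 1
```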
Miscellaneous Results
Theorem. For any matrix $X$, $C(XX') = C(X)$.
Corollary. For any matrix $X$, $r(XX') = r(X)$. If $X_{n \times p}$ has $r(X) = p$, then the $p \times p$ matrix $X'X$ is nonsingular.
Theorem. If $B$ is nonsingular, $C(XB) = C(X)$.

Singular Value Decomposition
Theorem. If $A$ is an $n \times n$ symmetric matrix, then there exists an orthogonal matrix $P$ such that $P'AP = \mathrm{Diag}(\lambda_i)$, where $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $A$.
Corollary (Singular Value Decomposition). $A = P \, \mathrm{Diag}(\lambda_i) \, P'$.

Projection Matrix
Definition. $P$ is a projection matrix onto $C(X)$ if and only if (1) $v \in C(X)$ implies $Pv = v$, and (2) $w \perp C(X)$ implies $Pw = 0$.

Theorem.
1. If $P$ is a projection matrix onto $C(X)$, then $C(P) = C(X)$.
2. $P$ is a projection matrix onto $C(P)$ if and only if $PP = P$ and $P^T = P$.
3. The projection matrix onto $C(X)$ is unique.
4. $X(X^TX)^{-1}X^T$ is the projection matrix onto $C(X)$.

Theorem. Let $P_1$ and $P_2$ be projection matrices. $P_1 - P_2$ is the perpendicular projection matrix onto $C(P_1 - P_2)$ if and only if $C(P_2) \subset C(P_1)$.

Theorem. If $P$ is symmetric, then $P$ is idempotent and of rank $r$ if and only if it has $r$ eigenvalues equal to unity and $n - r$ eigenvalues equal to zero.

Theorem. If $P$ is a projection matrix, then $\mathrm{tr}(P) = \mathrm{rank}(P)$.
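The projection-matrix facts above can be verified for a concrete design matrix. A minimal NumPy sketch (the $X$ here is simulated and hypothetical):

```python
import numpy as np

# A hypothetical full-rank 10 x 3 matrix X; P projects onto C(X).
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 3))
P = X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(P, P.T)       # symmetric
assert np.allclose(P @ P, P)     # idempotent
assert np.allclose(P @ X, X)     # Pv = v for every v in C(X)
# tr(P) = rank(P): eigenvalues are r ones and n - r zeros
assert round(np.trace(P)) == 3
assert np.linalg.matrix_rank(P) == 3
```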
Quadratic Form
Definition. Let $x$ be an $n$-dimensional random vector and let $A$ be an $n \times n$ symmetric matrix. A quadratic form is a random variable defined by $x^TAx$.

Mean of Quadratic Forms
Theorem. Let $x$ be an $n \times 1$ random vector with mean $\mu$ and covariance matrix $\Sigma$, and let $A$ be an $n \times n$ symmetric matrix. Then $E(x^TAx) = \mathrm{tr}(A\Sigma) + \mu^TA\mu$.

Distribution of Quadratic Forms
Theorem. Let $y \sim N(0, I_n)$ and let $A$ be a symmetric matrix. Then $y^TAy \sim \chi^2(r)$ if and only if $A$ is idempotent of rank $r$.

Theorem. Suppose that $y \sim N_n(\mu, \Sigma)$, where $\Sigma$ is positive definite. Then $Q = (y - \mu)^T\Sigma^{-1}(y - \mu) \sim \chi^2_n$.

Multiple Linear Regression
$E(Y \mid X) = \beta_0 + \beta_1 X_1 + \cdots + \beta_p X_p$

Least Squares Estimation
Theorem. Suppose that $X$ is $n \times (p+1)$ of rank $p+1$, so that $P = X(X^TX)^{-1}X^T$. Then the following hold:
1. $P$ and $I_n - P$ are symmetric and idempotent.
2. $\mathrm{rank}(I_n - P) = \mathrm{tr}(I_n - P) = n - p - 1$.
3. $PX = X$.

Distribution Theory
Theorem. If $y \sim N(X\beta, \sigma^2 I_n)$, where $X$ is $n \times (p+1)$ of rank $p+1$, then
1. $\hat\beta \sim N(\beta, \sigma^2 (X^TX)^{-1})$;
2. $(\hat\beta - \beta)^T X^TX (\hat\beta - \beta) / \sigma^2 \sim \chi^2_{p+1}$;
3. $\hat\beta$ is independent of $\hat\sigma^2$;
4. $\mathrm{RSS}/\sigma^2 \sim \chi^2_{n-p-1}$.
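The least squares facts above can be illustrated on simulated data. A minimal NumPy sketch (design and coefficients are hypothetical; the notes use R):

```python
import numpy as np

# Hypothetical data: n = 50 cases, p = 3 predictors plus an intercept.
rng = np.random.default_rng(1)
n, p = 50, 3
X = np.column_stack([np.ones(n), rng.standard_normal((n, p))])  # n x (p+1)
beta = np.array([1.0, 2.0, -1.0, 0.5])
y = X @ beta + rng.standard_normal(n)

P = X @ np.linalg.inv(X.T @ X) @ X.T            # hat matrix
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)    # (X'X)^{-1} X'y
y_hat = P @ y
rss = float(np.sum((y - y_hat) ** 2))

# residual degrees of freedom: rank(I - P) = tr(I - P) = n - p - 1
assert round(np.trace(np.eye(n) - P)) == n - p - 1
assert np.allclose(X @ beta_hat, y_hat)         # fitted values agree
```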
Estimability
Definition. A vector-valued linear function of $\beta$, say $A\beta$, is estimable if $A = P'X$ for some matrix $P$.

Gauss–Markov Theorem
Theorem. Consider the linear model $y = X\beta + e$, $E(e) = 0$, $\mathrm{Cov}(e) = \sigma^2 I$. If $A\beta$ is estimable, then the least squares estimate of $A\beta$ is a BLUE of $A\beta$.

Estimation with Linear Restrictions
Theorem. Let $y = X\beta + e$. Suppose that we wish to find the minimum of $e^Te$ subject to the linear restrictions $A\beta = c$, where $A$ is a known $q \times (p+1)$ matrix of rank $q$ and $c$ is a known $q \times 1$ vector. Then
$\hat\beta_H = \hat\beta + (X^TX)^{-1}A^T[A(X^TX)^{-1}A^T]^{-1}(c - A\hat\beta)$.

F Test
Theorem. Suppose we want to test $H_0: A\beta = c$.
1. $\mathrm{RSS}_H - \mathrm{RSS} = \|\hat y - \hat y_H\|^2 = (A\hat\beta - c)^T[A(X^TX)^{-1}A^T]^{-1}(A\hat\beta - c)$.
2. When $H_0$ is true,
$F = \dfrac{(\mathrm{RSS}_H - \mathrm{RSS})/q}{\mathrm{RSS}/(n-p-1)} = \dfrac{(A\hat\beta - c)^T[A(X^TX)^{-1}A^T]^{-1}(A\hat\beta - c)/q}{\mathrm{RSS}/(n-p-1)}$
is distributed as $F_{q,\,n-p-1}$.

Added Variable Plot
Let $\hat\rho_j^2$ be the squared correlation between the quantities in the added variable plot. It can be shown that
$\hat\rho_j^2 = \dfrac{F_j}{n - p - 1 + F_j}$,
where $F_j$ is the F statistic for testing $\beta_j = 0$.

Extra Sum of Squares
An extra sum of squares measures the marginal reduction in the error sum of squares when one or several predictor variables are added to the model, given that the other predictor variables are already in the model. We define
$\mathrm{SSR}(X_2 \mid X_1) = \mathrm{RSS}(X_1) - \mathrm{RSS}(X_1, X_2)$,
or equivalently
$\mathrm{SSR}(X_2 \mid X_1) = \mathrm{SSR}(X_1, X_2) - \mathrm{SSR}(X_1)$.
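The restricted estimator and the F-test identity above can be checked numerically. A minimal NumPy sketch on hypothetical simulated data (the restriction $\beta_1 = \beta_2$ is an example of my choosing):

```python
import numpy as np

# Hypothetical data with p + 1 = 3 columns; test H0: beta_1 = beta_2.
rng = np.random.default_rng(2)
n = 40
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
y = X @ np.array([1.0, 2.0, 2.0]) + rng.standard_normal(n)

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y

A = np.array([[0.0, 1.0, -1.0]])   # q = 1 restriction: A beta = c
c = np.array([0.0])
M = np.linalg.inv(A @ XtX_inv @ A.T)
beta_H = beta_hat + XtX_inv @ A.T @ M @ (c - A @ beta_hat)

rss = float(np.sum((y - X @ beta_hat) ** 2))
rss_H = float(np.sum((y - X @ beta_H) ** 2))
quad = float((A @ beta_hat - c) @ M @ (A @ beta_hat - c))

assert np.allclose(A @ beta_H, c)        # restriction holds exactly
assert np.isclose(rss_H - rss, quad)     # RSS_H - RSS identity
F = (quad / 1) / (rss / (n - 3))         # compare with F(1, n - 3)
```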
Decomposition of SSR
$\mathrm{SSR}(X_1, X_2, X_3) = \mathrm{SSR}(X_1) + \mathrm{SSR}(X_2 \mid X_1) + \mathrm{SSR}(X_3 \mid X_1, X_2)$

Centering
$E(Y \mid X) = \alpha_0 + \beta_1 (X_1 - \bar X_1) + \cdots + \beta_p (X_p - \bar X_p)$, where $\alpha_0 = \beta_0 + \beta_1 \bar X_1 + \cdots + \beta_p \bar X_p$.

Logarithms
Dependent variable | Independent variable | Interpretation of $\beta_1$
$y$ | $x$ | $\Delta y = \beta_1 \Delta x$
$y$ | $\log x$ | $\Delta y = (\beta_1/100)\,\%\Delta x$
$\log y$ | $x$ | $\%\Delta y = 100\,\beta_1\,\Delta x$
$\log y$ | $\log x$ | $\%\Delta y = \beta_1\,\%\Delta x$

Predictors Measured with Error
Suppose we have the so-called errors-in-variables model
$Y_i = \beta_0 + \beta_1 u_i + \varepsilon_i$, $X_i = u_i + \delta_i$,
where $\varepsilon_i$ and $\delta_i$ are independently distributed with zero means and respective unknown variances $\sigma^2_\varepsilon$ and $\sigma^2_\delta$. It can be shown that the least squares estimate $\hat\beta_1$ is biased. However, the bias will be small if $\sigma^2_\delta$ is small relative to $\sum_i (X_i - \bar X)^2$. Furthermore, if $n$ is large and $\sigma^2_\delta$ is small, the standard error of $\hat\beta_1$ is also approximately unbiased.

Data with Repeated Observations
Suppose we have repeated observations on $Y$:
$Y_{ir} = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + e_{ir}$,
where $r = 1, 2, \dots, R_i$ and $i = 1, 2, \dots, n$. Let $y_{ir} = \mu_i + e_{ir}$, so that $y = W\mu + e$, where
$W = \begin{pmatrix} 1_{R_1} & 0 & \cdots & 0 \\ 0 & 1_{R_2} & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1_{R_n} \end{pmatrix}$, $\quad \mu = (\mu_1, \mu_2, \dots, \mu_n)^T$.

Lack of Fit
Consider the test $H: \mu_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip}$, $i = 1, 2, \dots, n$, or $H: \mu = X\beta$.
Theorem. $\mu \in C(X)$ if and only if $A\mu = 0$ for some $(n - p - 1) \times n$ matrix $A$ of rank $n - p - 1$.

Factor
The factor rule: a factor with $d$ levels can be represented by at most $d$ dummy variables. If the intercept is in the mean function, at most $d - 1$ of the dummy variables can be used in the mean function.
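The SSR decomposition above can be illustrated numerically. A minimal NumPy sketch on hypothetical simulated data (the notes use R; the helper rss() is mine):

```python
import numpy as np

# Hypothetical data with three predictors.
rng = np.random.default_rng(3)
n = 60
x1, x2, x3 = rng.standard_normal((3, n))
y = 1 + x1 + 0.5 * x2 - 0.3 * x3 + rng.standard_normal(n)

def rss(*cols):
    """RSS from regressing y on an intercept plus the given predictors."""
    X = np.column_stack([np.ones(n), *cols])
    resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return float(resid @ resid)

tss = rss()                                         # intercept-only model
ssr_full = tss - rss(x1, x2, x3)                    # SSR(X1, X2, X3)
ssr_x1 = tss - rss(x1)                              # SSR(X1)
ssr_x2_given_x1 = rss(x1) - rss(x1, x2)             # SSR(X2 | X1)
ssr_x3_given_x1x2 = rss(x1, x2) - rss(x1, x2, x3)   # SSR(X3 | X1, X2)

assert np.isclose(ssr_full, ssr_x1 + ssr_x2_given_x1 + ssr_x3_given_x1x2)
```

The identity is a telescoping sum of RSS differences, so it holds for any ordering of the predictors (with different component values).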
Transformation: Empirical Rules
The log rule: if the values of a variable range over more than one order of magnitude and the variable is strictly positive, then replacing the variable by its logarithm is likely to be helpful.
The range rule: if the range of a variable is considerably less than one order of magnitude, then any transformation of that variable is unlikely to be helpful.

Power Transformation
The power family of transformations for a strictly positive variable $U$ is defined by $\psi(U, \lambda) = U^\lambda$. The usual values of $\lambda$ that are considered are in the range from $-2$ to $2$, but values in the range from $-1$ to $1$ are ordinarily selected. We will interpret the value $\lambda = 0$ to be a log transformation. It is convenient to introduce scaled power transformations, defined for strictly positive $X$ by
$\psi_S(X, \lambda) = \begin{cases} (X^\lambda - 1)/\lambda & \text{if } \lambda \neq 0 \\ \log X & \text{if } \lambda = 0 \end{cases}$

Box–Cox Transformation
Box and Cox (1964) introduced the modified power family for strictly positive $Y$:
$\psi_M(Y, \lambda) = \psi_S(Y, \lambda) \cdot \mathrm{gm}(Y)^{1-\lambda} = \begin{cases} \mathrm{gm}(Y)^{1-\lambda}(Y^\lambda - 1)/\lambda & \text{if } \lambda \neq 0 \\ \mathrm{gm}(Y) \log Y & \text{if } \lambda = 0 \end{cases}$
where $\mathrm{gm}(Y)$ is the geometric mean of the untransformed variable.

Lowess Method
The name lowess stands for locally weighted regression scatterplot smoothing. The lowess method estimates $E(Y \mid X = x_g)$ by $\hat y_g$ at the point $x_g$ via a weighted least squares simple regression, giving more weight to points close to $x_g$ than to points distant from $x_g$. We need to choose a smoothing parameter $f$, a number between 0 and 1. Remarkably, for many problems $f = 2/3$ is a good choice.
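The scaled power and Box–Cox transformations can be sketched as follows (a minimal NumPy illustration, with the $\lambda \to 0$ limit checked numerically; the notes themselves use R):

```python
import numpy as np

def psi_S(x, lam):
    """Scaled power transform: (x^lam - 1)/lam, with log x as the lam = 0 case."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if lam == 0 else (x ** lam - 1.0) / lam

def psi_M(y, lam):
    """Box-Cox modified power transform, scaled by the geometric mean gm(y)."""
    y = np.asarray(y, dtype=float)
    gm = np.exp(np.mean(np.log(y)))          # geometric mean of y
    if lam == 0:
        return gm * np.log(y)
    return gm ** (1.0 - lam) * (y ** lam - 1.0) / lam

y = np.array([0.5, 1.0, 2.0, 8.0])
# lam = 1 reduces both transforms to a shift of the original variable:
assert np.allclose(psi_S(y, 1.0), y - 1.0)
assert np.allclose(psi_M(y, 1.0), y - 1.0)
# near lam = 0 the scaled power transform approaches the log:
assert np.allclose(psi_S(y, 1e-8), np.log(y), atol=1e-6)
```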
Yeo–Johnson Transformation
Yeo and Johnson (2000) proposed a family of transformations that can be used without restrictions on $U$ and that has many of the good properties of the Box–Cox power family:
$\psi_{YJ}(U, \lambda) = \begin{cases} \psi_M(U + 1, \lambda) & \text{if } U \geq 0 \\ -\psi_M(-U + 1, 2 - \lambda) & \text{if } U < 0 \end{cases}$

Mahalanobis Distance
Definition. Let $E(x) = \mu$ and $\mathrm{Cov}(x) = \Sigma$. The squared Mahalanobis distance is
$D^2(x, \mu) = (x - \mu)^T \Sigma^{-1} (x - \mu)$.
The estimated squared Mahalanobis distance for the $i$th case in a sample of vectors $x_1, \dots, x_n$ is
$D_i^2 = (x_i - \bar x)^T S^{-1} (x_i - \bar x)$, where $S = \frac{1}{n-1}\sum_i (x_i - \bar x)(x_i - \bar x)^T$.

Leverage
Theorem. Consider the linear model with intercept $y = X\beta + e$, where $X$ is a full rank matrix. Then
$h_{ii} = \frac{1}{n} + \frac{D_i^2}{n-1}$,
where $h_{ii}$ is the $i$th diagonal element of $H = X(X^TX)^{-1}X^T$.

Breusch–Pagan / Cook–Weisberg Score Test for Heteroskedasticity
This test was independently developed by Breusch and Pagan (1979) and Cook and Weisberg (1983). Assume $\mathrm{Var}(Y \mid X, Z = z) = \sigma^2 \exp(\lambda^T z)$. Any form that depends on the linear combination $\lambda^T z$ would lead to very similar inference. Assume the errors are normally distributed; White (1980) has proposed a closely related test that does not depend as crucially on normality.

The test proceeds as follows:
1. Compute the OLS fit with mean function $E(Y \mid X = x) = \beta^T x$ as if $\mathrm{Var}(Y \mid X) = \sigma^2$. Save the residuals $e_i$.
2. Let $u_i = e_i^2/\tilde\sigma^2$, where $\tilde\sigma^2 = \sum e_i^2 / n$.
3. Compute the regression with mean function $E(U \mid Z = z) = \lambda_0 + \lambda^T z$. Obtain $\mathrm{SS}_{reg}$.
4. Compute the score test $S = \mathrm{SS}_{reg}/2$. Under the hypothesis $\lambda = 0$, $S$ has an asymptotic $\chi^2_q$ distribution, where $q$ is the number of components in $Z$.
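The four score-test steps above can be sketched in NumPy (simulated, hypothetical heteroskedastic data of my choosing; the notes use R, and no assertion is made here about the test's power on this particular draw):

```python
import numpy as np

# Hypothetical data where the error variance depends on z = x.
rng = np.random.default_rng(4)
n = 200
x = rng.standard_normal(n)
z = x
y = 1 + 2 * x + np.exp(0.3 * z) * rng.standard_normal(n)

# Step 1: OLS fit of the mean function; save the residuals e_i.
X = np.column_stack([np.ones(n), x])
e = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# Step 2: u_i = e_i^2 / sigma~^2 with sigma~^2 = sum(e_i^2)/n.
u = e ** 2 / (np.sum(e ** 2) / n)

# Step 3: regress U on Z (with intercept); take the regression sum of squares.
Z = np.column_stack([np.ones(n), z])
u_hat = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
ss_reg = float(np.sum((u_hat - u.mean()) ** 2))

# Step 4: score statistic; under lambda = 0 it is asymptotically chi^2(q), q = 1.
S = ss_reg / 2.0
```

A large value of S relative to the $\chi^2_1$ distribution would be evidence against constant variance.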
Marginal Model Plots
1. Plot $Y$ versus $U$, which can be any function of $X$. Fitting a smoother estimates $E(Y \mid U)$ without any assumptions.
2. Under the model, $E(Y \mid U) = E(E(Y \mid X) \mid U)$, so $E(Y \mid U) = E(\hat Y \mid U)$. We can estimate $E(\hat Y \mid U)$ by smoothing the scatterplot with $U$ on the horizontal axis and the fitted values $\hat Y$ on the vertical axis. If the model is correct, this smooth should agree with the smooth in step 1.

Standardized Residual
$r_i = \dfrac{e_i}{\hat\sigma\sqrt{1 - h_{ii}}}$

PRESS Residual
The PRESS residual is defined as $e_{(i)} = y_i - \hat y_{(i)}$.
Theorem. $e_{(i)} = \dfrac{e_i}{1 - h_{ii}}$ and $\mathrm{Var}(e_{(i)}) = \dfrac{\sigma^2}{1 - h_{ii}}$.

Studentized Residual
$t_i = \dfrac{e_i}{\hat\sigma_{(i)}\sqrt{1 - h_{ii}}}$
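The PRESS identity $e_{(i)} = e_i/(1 - h_{ii})$ can be verified by actually refitting without case $i$. A minimal NumPy sketch on hypothetical simulated data:

```python
import numpy as np

# Hypothetical simple regression data.
rng = np.random.default_rng(5)
n = 30
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
y = X @ np.array([1.0, 2.0]) + rng.standard_normal(n)

H = X @ np.linalg.inv(X.T @ X) @ X.T   # hat matrix; h_ii on its diagonal
e = y - H @ y                          # ordinary residuals

i = 0
keep = np.arange(n) != i
beta_minus_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]
press_i = y[i] - X[i] @ beta_minus_i   # y_i minus its leave-one-out fit

assert np.isclose(press_i, e[i] / (1 - H[i, i]))
```

This identity is what makes PRESS (and the studentized residuals built from it) computable from a single fit, without n separate leave-one-out regressions.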
