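Class 7: Path analysis and multicollinearity

I. Standardized Coefficients

Transformations

Suppose the true model is

$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i(p-1)} + \varepsilon_i \qquad (1)$$

and we make the following transformation:

$$y_i^* = \frac{y_i - \bar{Y}}{s_y}, \qquad x_{ik}^* = \frac{x_{ik} - \bar{X}_k}{s_{x_k}},$$

where $s_y$ and $s_{x_k}$ are the sample standard deviations of $y$ and $x_k$ respectively.

Thus standardization does two things: centering and rescaling. Centering normalizes the location of a variable so that it has a mean of zero. Rescaling normalizes a variable to have a variance of unity.

Location of a measurement: where is zero?
Scale of a measurement: how big is one unit?

Both the location and the scale of a variable can be arbitrary to begin with and need to be normalized. Examples: temperature, IQ, emotion. Some other variables have a natural location and scale, such as the number of children and the number of days.

Standardized regression: a regression with all variables standardized.

$$y_i^* = \beta_1^* x_{i1}^* + \cdots + \beta_{p-1}^* x_{i(p-1)}^* + \varepsilon_i^* \qquad (2)$$

Relationship between (1) and (2):

Average equation (1) and then take the difference between (1) and the averaged (1). This is equivalent to centering the variables in (1) (note that $\bar{\varepsilon} = 0$):

$$y_i - \bar{Y} = \beta_1 (x_{i1} - \bar{X}_1) + \cdots + \beta_{p-1} (x_{i(p-1)} - \bar{X}_{p-1}) + \varepsilon_i \qquad (3)$$

Note: $\bar{Y} = \beta_0 + \beta_1 \bar{X}_1 + \cdots + \beta_{p-1} \bar{X}_{p-1}$.

Divide (3) by $s_y$:

$$\begin{aligned}
\frac{y_i - \bar{Y}}{s_y} &= \frac{\beta_1}{s_y}(x_{i1} - \bar{X}_1) + \cdots + \frac{\beta_{p-1}}{s_y}(x_{i(p-1)} - \bar{X}_{p-1}) + \frac{\varepsilon_i}{s_y} \\
&= \beta_1 \frac{s_{x_1}}{s_y} \cdot \frac{x_{i1} - \bar{X}_1}{s_{x_1}} + \cdots + \beta_{p-1} \frac{s_{x_{p-1}}}{s_y} \cdot \frac{x_{i(p-1)} - \bar{X}_{p-1}}{s_{x_{p-1}}} + \frac{\varepsilon_i}{s_y} \\
&= \beta_1^* x_{i1}^* + \cdots + \beta_{p-1}^* x_{i(p-1)}^* + \varepsilon_i^*
\end{aligned}$$

That is,

$$\beta_k^* = \beta_k \frac{s_{x_k}}{s_y}.$$

When all variables are standardized, we have (up to a common factor of $n-1$, which cancels in $b$)

$$X'X = r_{xx}, \qquad X'y = r_{xy},$$

so that

$$b = (X'X)^{-1} X'y = r_{xx}^{-1} r_{xy}.$$

In the older days of sociology (1960s and 1970s), many studies published correlation matrices so that their regression results could be easily replicated. This is possible because correlation matrices contain all the sufficient statistics for path analysis.

II. Why Standardized Coefficients?

A. Ease of Computation
B. Boundaries of Estimates: -1 to 1
C. Standardized Scale in Comparison

Which is better: standardized or unstandardized? Unstandardized coefficients are generally better because they tell you more about the data and about changes in real units.

Rule of Thumb:

A. Usually it is not a good idea to report standardized coefficients.
B. Almost always report unstandardized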
coefficients if you can.
C. Read standardized coefficients on your own.
D. You can interpret unstandardized coefficients in terms of standard deviations (homework).
E. If only a correlation matrix is available, then only standardized coefficients can be estimated (LISREL).
F. In an analysis comparing multiple populations, whether to use standardized or unstandardized coefficients is consequential. In this case, theoretical and conceptual considerations should dictate the decision.

III. Decomposition of Total Effects

A. Difference between reduced form equations and structural equations

Everything I am now discussing is about systems of equations. What are systems of equations? Systems of equations are equations with different dependent variables. For example, we talked about auxiliary regressions, in which one independent variable is turned into the new dependent variable.

1. Exogenous variables
Exogenous variables are variables that are used only as independent variables in all equations.

2. Endogenous variables
Endogenous variables are variables that are used as dependent variables in some equations and may be used as independent variables in other equations.

B. Structural Equations versus Reduced Forms

1. Structural Equations
Structural equations are theoretically derived equations that often have endogenous variables as independent variables.

2. Reduced Forms
Reduced form equations are equations in which all independent variables are exogenous variables. In other words, in reduced form equations we purposely ignore intermediate (or relevant) variables.

C. Types of Effects

Total effects can be decomposed into two parts: direct effects and indirect effects. A famous example is drawn from Blau and Duncan's model of status attainment.

1. Total Effect
A total effect can be defined as the effect in the reduced form equations. In the example, what are the total effects of father's education and father's occupation on son's occupation? You run a regression of son's occupation on father's education and father's occupation. The
estimated coefficients are total effects.

2. Direct Effect
Direct effects can be defined as the effects in the structural equations. In our example, the direct effect of father's education is zero by assumption, which is subject to testing. The direct effect of father's occupation on son's occupation is estimated in the model regressing son's occupation on son's education and father's occupation.

3. Indirect Effect
The indirect effect works through an intermediate variable. It is usually the product of two coefficients. In our example, the indirect effect of father's education on son's occupation is the product of the effect of father's education on son's education and the effect of son's education on son's occupation. This is the same as the auxiliary regression before.

The total effect is the sum of the direct effect and the indirect effect. This result is consistent with our earlier discussion of omitted variables. How do we calculate the total effect? It should be the direct effect plus the indirect effect. It has the same formula as the one we discussed in connection with auxiliary regressions.

[Figure: Blau and Duncan path diagram of status attainment. Variables: father's education (V), father's occupation (X), respondent's education (U), first job (W), occupation in 1962 (Y). Coefficients shown in the diagram: .516, .310, .394, .859, .818, .753, .115, .440, .281, .279, .224.]

Total effect = Direct effect + Indirect effect

IV. Problem of Multicollinearity

A. Assumption about the singularity of $X'X$

Recall that the first assumption for the least squares estimator is that $X'X$ is nonsingular. What is meant by that is that none of the columns in the $X$ matrix is a linear combination of other columns in $X$. Why do we need the assumption? Because without the assumption we cannot take the inverse of $X'X$ for

$$b = (X'X)^{-1} X'y.$$

Why do we use the word "multicollinearity" instead of "collinearity"? (Joke: "multi" is a trendy word, as in multimillionaires, multi-national, and multiculturalism.) Answer: the problem concerns linear combinations of several variables. We cannot determine whether there is a problem of collinearity from pairwise correlations alone.

B. Examples of Perfect Multicollinearity

1. If
$X$ includes $1$ (the column of ones), we cannot include other variables that do not change across all observations.
2. We cannot include parents' education after we include mother's and father's education in the model separately.

C. Identification Problem

Contrary to common misunderstandings, multicollinearity does not cause bias. It is an identification problem.

D. Empirical Under-identification

Even though the model is identified theoretically, the data may be so thin that it is under-identified empirically. Rather than being a yes/no question, under-identification is a matter of degree. Thus we would like to have a way to quantify the degree of under-identification.

Root of the problem: less information. The empirical under-identification problem can often be overcome by collecting more data.

Under-identification = less efficiency = reduction in the effective number of cases. Thus, an increase in sample size compensates for under-identification.

E. Consequences of Multicollinearity

In the presence of multicollinearity, the estimates are not biased. Rather, they are "unstable", that is, they have large standard errors. If, though, the computer output gives you small standard errors for the estimates, do not worry about the multicollinearity problem. This is important but often misunderstood.

V. Variance Inflation Factor

Review of partial regression estimation

True regression:

$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_{p-1} x_{i(p-1)} + \varepsilon_i$$

In matrix form:

$$y = X\beta + \varepsilon$$

This model can always be written as

$$y = X_1 \beta_1 + X_2 \beta_2 + \varepsilon \qquad (1)$$

where now $X = [X_1 \; X_2]$; $X_1$ and $X_2$ are matrices of dimensions $n \times p_1$ and $n \times p_2$ ($p_1 + p_2 = p$), and $\beta_1$, $\beta_2$ are parameter vectors of dimensions $p_1$ and $p_2$.

We first want to prove that regression equation (1) is equivalent to the following procedure:

1. Regress $y$ on $X_1$, obtain residuals $y^*$.
2. Regress $X_2$ on $X_1$, obtain residuals $X_2^*$.
3. Then regress $y^*$ on $X_2^*$, and obtain the correct least squares estimates of $\beta_2$ ($= b_2$, the same as those from the one-step method).

Without loss of generality, say that the last independent variable is singled out. That is, make $X_2 = x_{p-1}$ and make $X_1 = [1, x_1, \ldots, x_{p-2}]$. From the above result, we can estimate $\beta_{p-1}$ from

$$y^* = \beta_{p-1} x_{p-1}^* + \varepsilon^*$$

where $y^*$ and
$x_{p-1}^*$ are respectively the residuals of the regressions of $y$ and $x_{p-1}$ on $[1, x_1, \ldots, x_{p-2}]$ (both $y^*$ and $x_{p-1}^*$ have zero means). There is no intercept term because $1$ was contained in $X_1$, so that $y^*$ and $x_{p-1}^*$ are centered around zero.

From the f
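
The three-step partial regression procedure described in this section can be checked numerically. The sketch below is only an illustration under stated assumptions: the data are simulated, and the names `x1`, `x2`, and the helper `ols` are hypothetical, not part of the original notes. It compares the coefficient on the singled-out regressor from the one-step regression with the one obtained by regressing residuals on residuals without an intercept.

```python
import numpy as np

# Simulated (hypothetical) data: two correlated regressors and an outcome.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)


def ols(X, y):
    """Least squares coefficients, i.e. the solution of (X'X) b = X'y."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


X1 = np.column_stack([np.ones(n), x1])  # X_1: column of ones plus the other regressor
X2 = x2.reshape(-1, 1)                  # X_2: the singled-out last regressor

# One-step method: regress y on [X_1, X_2] together.
b_full = ols(np.column_stack([X1, X2]), y)

# Step 1: regress y on X_1 and keep the residuals y*.
y_star = y - X1 @ ols(X1, y)
# Step 2: regress X_2 on X_1 and keep the residuals X_2*.
x2_star = X2 - X1 @ ols(X1, X2)
# Step 3: regress y* on X_2* with no intercept term.
b2_partial = ols(x2_star, y_star)

print("one-step estimate for x2:  ", b_full[-1])
print("three-step estimate for x2:", b2_partial[0])
print("means of y* and x2*:       ", y_star.mean(), x2_star.mean())
```

The two estimates should agree up to floating-point error, and both residual series should have means of essentially zero, which is why step 3 needs no intercept.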