多元线性回归模型:假设检验ppt课件_第1页
多元线性回归模型:假设检验ppt课件_第2页
多元线性回归模型:假设检验ppt课件_第3页
多元线性回归模型:假设检验ppt课件_第4页
多元线性回归模型:假设检验ppt课件_第5页
已阅读5页,还剩35页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

1、Multiple Regression Analysisw y = b0 + b1x1 + b2x2 + . . . bkxk + uw 2. InferenceAssumptions of the Classical Linear Model (CLM) So far, we know that given the Gauss-Markov assumptions, OLS is BLUE, In order to do classical hypothesis testing, we need to add another assumption (beyond the Gauss-Mark

2、ov assumptions) Assume that u is independent of x1, x2, xk and u is normally distributed with zero mean and variance s2: u Normal(0,s2)CLM Assumptions (cont) Under CLM, OLS is not only BLUE, but is the minimum variance unbiased estimator We can summarize the population assumptions of CLM as follows

3、y|x Normal(b0 + b1x1 + bkxk, s2) While for now we just assume normality, clear that sometimes not the case Large samples will let us drop normality.x1x2The homoskedastic normal distribution with a single explanatory variableE(y|x) = b0 + b1xyf(y|x)NormaldistributionsNormal Sampling Distributions err

4、ors theofn combinatiolinear a isit becausenormally ddistribute is 0,1Normal thatso ,Normal st variableindependen theof valuessample theon lconditiona s,assumption CLM Under thejjjjjjjsdVarThe t Test 1:freedom of degrees theNoteby estimate tohave webecausenormal) (vson distributi a is thisNote sassum

5、ption CLM Under the221jknttseknjjThe t Test (cont) Knowing the sampling distribution for the standardized estimator allows us to carry out hypothesis tests Start with a null hypothesis For example, H0: bj=0 If accept null, then accept that xj has no effect on y, controlling for other xsThe t Test (c

6、ont) 0jH ,hypothesis null accept theo whether tdetermine torulerejection a withalong statistic our use then willWe : for statistic theform toneedfirst eour test w perform Totsettjjjt Test: One-Sided Alternatives Besides our null, H0, we need an alternative hypothesis, H1, and a significance level H1

7、 may be one-sided, or two-sided H1: bj 0 and H1: bj 0c0a1 aOne-Sided Alternatives (cont)Fail to rejectrejectExamples 1wHourly Wage EquationwH0: bexper = 0 H1: bexper 0 316. 0526)003. 0()0017. 0()007. 0()104. 0(022. 0exp0041. 0092. 0284. 0)log(2RntenureereducgeawOne-sided vs Two-sided Because the t d

8、istribution is symmetric, testing H1: bj 0 is straightforward. The critical value is just the negative of before We can reject the null if the t statistic than c then we fail to reject the null For a two-sided test, we set the critical value based on a/2 and reject H1: bj 0 if the absolute value of

9、the t statistic cyi = b0 + b1Xi1 + + bkXik + uiH0: bj = 0 H1: bj 0c0a/21 a-ca/2Two-Sided Alternativesrejectrejectfail to rejectSummary for H0: bj = 0 Unless otherwise stated, the alternative is assumed to be two-sided If we reject the null, we typically say “xj is statistically significant at the a

10、% level” If we fail to reject the null, we typically say “xj is statistically insignificant at the a % level”Examples 2wDeterminants of College GPAw colGPAcollege GPA(great point average), hsGPAhigh school GPAw skippedaverage numbers of letures missed per week.234. 0141)026. 0()011. 0()094. 0()33. 0

11、(083. 0015. 0412. 039. 12RnskippedACThsGPAAPcolGTesting other hypotheseswA more general form of the t statistic recognizes that we may want to test something like H0: bj = aj wIn this case, the appropriate t statistic is teststandard for the 0 where,jjjjaseatExamples 3wCampus Crime and EnrollmentwH0

12、: benroll = 1 H1: benroll 1585. 0)11. 0()03. 1 (97)log(27. 163. 6)log(2Rnenrollmeicruenrollcrime)log()log(10Examples 4wHousing Prices and Air PollutionwH0: blog(nox) = -1 H1: blog(nox) - 1581. 0506)006. 0()019. 0()043. 0()117. 0()32. 0(052. 0255. 0)log(134. 0)log(954. 008.11)log(2Rnstratioroomsdistn

13、oxpriceustratioroomsdistnoxprice43210)log()log()log(Confidence Intervalsw Another way to use classical statistical testing is to construct a confidence interval using the same critical value as was used for a two-sided testw A (1 - a) % confidence interval is defined as ondistributi ain percentile 2

14、-1 theis c where,1knjjtsecaComputing p-values for t tests An alternative to the classical approach is to ask, “what is the smallest significance level at which the null would be rejected?” So, compute the t statistic, and then look up what percentile it is in the appropriate t distribution this is t

15、he p-value p-value is the probability we would observe the t statistic we did, if the null were true Most computer packages will compute the p-value for you, assuming a two-sided test If you really want a one-sided alternative, just divide the two-sided p-value by 2Many software,such as Stata or Evi

16、ews provides the t statistic, p-value, and 95% confidence interval for H0: bj = 0 for youTesting a Linear Combinationw Suppose instead of testing whether b1 is equal to a constant, you want to test if it is equal to another parameter, that is H0 : b1 = b2w Use same basic procedure for forming a t st

17、atistic 2121setTesting Linear Combo (cont) 211221122221212121212121, of estimatean is where2,2 then,SinceCovssseseseCovVarVarVarVarseTesting a Linear Combo (cont) So, to use formula, need s12, which standard output does not have Many packages will have an option to get it, or will just perform the t

18、est for youMore generally, you can always restate the problem to get the test you wantExamples 5 Suppose you are interested in the effect of campaign expenditures on outcomes Model is voteA = b0 + b1log(expendA) + b2log(expendB) + b3prtystrA + u H0: b1 = - b2, or H0: q1 = b1 + b2 = 0 b1 = q1 b2, so

19、substitute in and rearrange voteA = b0 + q1log(expendA) + b2log(expendB - expendA) + b3prtystrA + uExample (cont): This is the same model as originally, but now you get a standard error for b1 b2 = q1 directly from the basic regression Any linear combination of parameters could be tested in a simila

20、r manner Other examples of hypotheses about a single linear combination of parameters:b1 = 1 + b2 ; b1 = 5b2 ; b1 = -1/2b2 ; etc Multiple Linear Restrictions Everything weve done so far has involved testing a single linear restriction, (e.g. b1 = 0 or b1 = b2 ) However, we may want to jointly test m

21、ultiple hypotheses about our parameters A typical example is testing “exclusion restrictions” we want to know if a group of parameters are all equal to zeroTesting Exclusion Restrictions Now the null hypothesis might be something like H0: bk-q+1 = 0, . , bk = 0 The alternative is just H1: H0 is not

22、true Cant just check each t statistic separately, because we want to know if the q parameters are jointly significant at a given level it is possible for none to be individually significant at that levelExclusion Restrictions (cont)w To do the test we need to estimate the “restricted model” without

23、xk-q+1, , xk included, as well as the “unrestricted model” with all xs includedw Intuitively, we want to know if the change in SSR is big enough to warrant inclusion of xk-q+1, , xk edunrestrict isur and restricted isr where,1knSSRqSSRSSRFururrThe F statistic The F statistic is always positive, sinc

24、e the SSR from the restricted model cant be less than the SSR from the unrestricted Essentially the F statistic is measuring the relative increase in SSR when moving from the unrestricted to restricted model q = number of restrictions, or dfr dfur n k 1 = dfurThe F statistic (cont) To decide if the

25、increase in SSR when we move to a restricted model is “big enough” to reject the exclusions, we need to know about the sampling distribution of our F stat Not surprisingly, F Fq,n-k-1, where q is referred to as the numerator degrees of freedom and n k 1 as the denominator degrees of freedom 0ca1 af(

26、F)FThe F statistic (cont)rejectfail to rejectReject H0 at a significance level if F cExample:Major League Baseball Players Salaryurbisyrhrunsyrbavggamesyryearssalary543210)log(6278. 0186.183353)0072. 0()0161. 0()00110. 0(0108. 00144. 000098. 0)0026. 0()0121. 0()29. 0(0126. 00689. 010.11)log(2RSSRnrb

27、isyrhrunsyrbavggamesyryearssalaryugamesyryearssalary210)log(5971. 0311.198353)0013. 0()0125. 0()11. 0(0202. 00713. 022.11)log(2RSSRngamesyryearssalaryRelationship between F and t StatThe F statistic is intended to detect whether any combination of a set of coefficients is different from zero, The t

28、test is best suited for testing a single hypothesis.Group a bunch of insignificant varialbes with a significant variable, it is possible conclude that the entire set of variables is jointly insignificant. Often, when a variable is very statistically significant and it is tested jointly with another

29、set of variables, the set will be jointly significant. The R2 form of the F statisticw Because the SSRs may be large and unwieldy, an alternative form of the formula is usefulw We use the fact that SSR = SST(1 R2) for any regression, so can substitute in for SSRu and SSRuredunrestrict isur and restricted isr again where,11222knRqRRFurrurOverall Significancew A special case of e

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论