2.10 (iii) From (2.57), $\mathrm{Var}(\hat{\beta}_1) = \sigma^2 / \sum_{i=1}^{n} (x_i - \bar{x})^2$. From the hint, $\sum_{i=1}^{n} x_i^2 \ge \sum_{i=1}^{n} (x_i - \bar{x})^2$, and so $\mathrm{Var}(\tilde{\beta}_1) \le \mathrm{Var}(\hat{\beta}_1)$. A more direct way to see this is to write $\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$, which is less than $\sum_{i=1}^{n} x_i^2$ unless $\bar{x} = 0$.

(iv) For a given sample size, the bias in $\tilde{\beta}_1$ grows with $\bar{x}$, but as $\bar{x}$ increases, the variance of $\hat{\beta}_1$ also increases relative to $\mathrm{Var}(\tilde{\beta}_1)$. The bias in $\tilde{\beta}_1$ is also small when $\beta_0$ is small. Therefore, whether we prefer $\tilde{\beta}_1$ or $\hat{\beta}_1$ on a mean squared error basis depends on the sizes of $\beta_0$, $\bar{x}$, and $n$ (in addition to the size of $\sum_{i=1}^{n} x_i^2$).

3.7 We can use Table 3.2. By definition, $\beta_2 > 0$, and by assumption, $\mathrm{Corr}(x_1, x_2) < 0$. Therefore, there is a negative bias in $\tilde{\beta}_1$: $E(\tilde{\beta}_1) < \beta_1$.

3.8 Only (ii), omitting an important variable, can cause bias, and this is true only when the omitted variable is correlated with the included explanatory variables. The homoskedasticity assumption, MLR.5, played no role in showing that the OLS estimators are unbiased. (Homoskedasticity was used to obtain the usual variance formulas for the $\hat{\beta}_j$.) Further, the degree of collinearity between the explanatory variables in the sample, even if it is reflected in a correlation as high as .95, does not affect the Gauss-Markov assumptions. MLR.3 is violated only if there is a perfect linear relationship among two or more explanatory variables.
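The direction of the bias in 3.7 can be checked with a small simulation. The sketch below is only an illustration under assumed values: the coefficients (`beta1 = 1`, `beta2 = 2`), the data-generating process, and the helper `slope` are hypothetical and do not come from the problem. It regresses y on x1 alone when x2 is omitted, with $\beta_2 > 0$ and $\mathrm{Corr}(x_1, x_2) < 0$.

```python
import random


def slope(x, y):
    """OLS slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    return sxy / sxx


random.seed(0)
n = 20000
beta1, beta2 = 1.0, 2.0  # hypothetical true coefficients, beta2 > 0

# x1 is built to be negatively correlated with x2 (assumed design).
x2 = [random.gauss(0, 1) for _ in range(n)]
x1 = [-0.5 * v + random.gauss(0, 1) for v in x2]
y = [beta1 * a + beta2 * b + random.gauss(0, 1) for a, b in zip(x1, x2)]

# Simple regression that omits x2: the slope estimate falls below beta1.
b1_tilde = slope(x1, y)
print(f"true beta1 = {beta1}, simple regression slope = {b1_tilde:.3f}")
```

Under this design the population value of the simple regression slope is $\beta_1 + \beta_2 \, \mathrm{Cov}(x_1, x_2)/\mathrm{Var}(x_1) = 1 + 2(-0.5)/1.25 = 0.2$, so the estimate should land well below the true value of 1.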
3.9 (i) Because $x_1$ is highly correlated with $x_2$ and $x_3$, and these latter variables have large partial effects on y, the simple and multiple regression coefficients on $x_1$ can differ by large amounts. We have not done this case explicitly, but given equation (3.46) and the discussion with a single omitted variable, the intuition is pretty straightforward.

(ii) Here we would expect $\tilde{\beta}_1$ and $\hat{\beta}_1$ to be similar (subject, of course, to what we mean by "almost uncorrelated"). The amount of correlation between $x_2$ and $x_3$ does not directly affect the multiple regression estimate on $x_1$ if $x_1$ is essentially uncorrelated with $x_2$ and $x_3$.

(iii) In this case we are (unnecessarily) introducing multicollinearity into the regression: $x_2$ and $x_3$ have small partial effects on y, and yet $x_2$ and $x_3$ are highly correlated with $x_1$. Adding $x_2$ and $x_3$ likely increases the standard error of the coefficient on $x_1$ substantially, so $\mathrm{se}(\hat{\beta}_1)$ is likely to be much larger than $\mathrm{se}(\tilde{\beta}_1)$.

(iv) In this case, adding $x_2$ and $x_3$ will decrease the residual variance without causing much collinearity (because $x_1$ is almost uncorrelated with $x_2$ and $x_3$), so we should see $\mathrm{se}(\hat{\beta}_1)$ smaller than $\mathrm{se}(\tilde{\beta}_1)$. The amount of correlation between $x_2$ and $x_3$ does not directly affect $\mathrm{se}(\hat{\beta}_1)$.

3.11 (i) $\beta_1 < 0$ because more pollution can be expected to lower housing values; note that $\beta_1$ is the elasticity of price with respect to nox. $\beta_2$ is probably positive because rooms roughly measures the size of a house. (However, it does not allow us to distinguish homes where each room is large from homes where each room is small.)

(ii) If $\beta_2 > 0$ and $\mathrm{Corr}(x_1, x_2) < 0$, the simple regression estimator $\tilde{\beta}_1$ has a downward bias. But because $\beta_1 < 0$, a downward bias means that the simple regression estimate tends to be more negative than $\beta_1$, that is, larger in magnitude.

(iii) This is what we expect from the typical sample based on our analysis in part (ii). The simple regression estimate, -1.043, is more negative (larger in magnitude) than the multiple regression estimate, -0.718. As those estimates are only for one sample, we can never know which is closer to $\beta_1$. But if this is a "typical" sample, $\beta_1$ is closer to -0.718.
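The standard-error comparisons in 3.9 (iii) and (iv) can be read off the usual OLS sampling-variance expressions. The block below is a sketch in assumed notation: $\mathrm{SST}_1$ is the total sample variation in $x_1$, $R_1^2$ is the R-squared from regressing $x_1$ on $x_2$ and $x_3$, and $\tilde{\sigma}$ and $\hat{\sigma}$ are the estimated error standard deviations from the simple and multiple regressions. None of these symbols appear in the solution text itself.

```latex
\[
\mathrm{se}(\tilde{\beta}_1) = \frac{\tilde{\sigma}}{\sqrt{\mathrm{SST}_1}},
\qquad
\mathrm{se}(\hat{\beta}_1) = \frac{\hat{\sigma}}{\sqrt{\mathrm{SST}_1\,(1 - R_1^2)}},
\qquad
\mathrm{SST}_1 = \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2 .
\]
```

In case (iii), $R_1^2$ is close to one while $\hat{\sigma}$ is barely smaller than $\tilde{\sigma}$, so the denominator shrinks and $\mathrm{se}(\hat{\beta}_1)$ rises. In case (iv), $R_1^2$ is close to zero while $\hat{\sigma}$ falls noticeably, so $\mathrm{se}(\hat{\beta}_1)$ falls.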
6.4 (i) The answer is not entirely obvious, but one must properly interpret the coefficient on alcohol in either case. If we include attend, then we are measuring the effect of alcohol consumption on college GPA, holding attendance fixed. Because attendance is likely to be an important mechanism through which drinking affects performance, we probably do not want to hold it fixed in the analysis. If we do include attend, then we interpret the estimated coefficient on alcohol as measuring those effects on colGPA that are not due to attending class. (For example, we could be measuring the effects that drinking alcohol has on study time.) To get a total effect of alcohol consumption, we would leave attend out.
(ii) We would want to include SAT and hsGPA as controls, as these measure student abilities and motivation. Drinking behavior in college could be correlated with one's performance in high school and on standardized tests. Other factors, such as family background, would also be good controls.

6.6 The second equation is clearly preferred, as its adjusted R-squared is notably larger than that in the other two equations. The second equation contains the same number of estimated parameters as the first, and one fewer than the third. The second equation is also easier to interpret than the third.

7.3 (i) The t statistic on $hsize^2$ is over four in absolute value, so there is very strong evidence that it belongs in the equation. The optimal high school size is found at the turnaround point; this is the value of hsize that maximizes predicted sat (other things fixed): $19.3/(2 \cdot 2.19) \approx 4.41$. Because hsize is measured in hundreds, the optimal size of graduating class is about 441.
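The turnaround calculation in 7.3 (i) can be reproduced in a few lines. This is a minimal sketch: the function name `turnaround_point` is hypothetical, and the sign of the quadratic coefficient (-2.19) is an assumption inferred from the text, which describes a maximum.

```python
def turnaround_point(b_linear, b_quadratic):
    """Value of x where b_linear*x + b_quadratic*x**2 stops rising (or falling)."""
    return -b_linear / (2.0 * b_quadratic)


# Coefficients on hsize and hsize^2 as quoted in 7.3 (i); the negative sign on
# the quadratic term is assumed, since the solution describes a maximum.
hsize_star = turnaround_point(19.3, -2.19)

# hsize is measured in hundreds of students.
class_size = round(100 * hsize_star)

print(f"turnaround at hsize = {hsize_star:.2f}, about {class_size} students")
```

Setting the derivative $b_1 + 2 b_2 x$ to zero gives $x^* = -b_1/(2 b_2)$, which is the same ratio used in the solution.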
(ii) This is given by the coefficient on female (since black = 0): nonblack females have SAT scores about 45 points lower than nonblack males. The t statistic is about -10.51, so the difference is very statistically significant. (The very large sample size certainly contributes to the statistical significance.)

(iii) Because female = 0, the coefficient on black implies that a black male has an estimated SAT score almost 170 points less than a comparable nonblack male. The t statistic is over 13 in absolute value, so we easily reject the hypothesis that there is no ceteris paribus difference.

(iv) We plug in black = 1, female = 1 for black females and black = 0, female = 1 for nonblack females. The difference is therefore -169.81 + 62.31 = -107.50. Because the estimate depends on two coefficients, we cannot construct a t statistic from the information given. The easiest approach is to define dummy variables for three of the four race/gender categories and choose nonblack females as the base group. We can then obtain the t statistic we want as the t statistic on the black female dummy variable.

7.4 (i) The approximate difference is just the coefficient on utility times 100, or -28.3%. The t statistic is $-.283/.099 \approx -2.86$, which is very statistically significant.

(ii) $100 \cdot [\exp(-.283) - 1] \approx -24.7\%$, and so the estimate is somewhat smaller in magnitude.
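The gap between the approximate and exact percentage differences in 7.4 (i) and (ii) is easy to verify numerically. The sketch below uses only the coefficient (-.283) and standard error (.099) quoted in the solution; the function name `exact_pct_change` is hypothetical.

```python
import math


def exact_pct_change(coef):
    """Exact percentage difference implied by a dummy coefficient in a log-level model."""
    return 100.0 * (math.exp(coef) - 1.0)


coef, se = -0.283, 0.099  # coefficient on utility and its standard error, from 7.4 (i)

approx_pct = 100.0 * coef           # approximation: coefficient times 100
exact_pct = exact_pct_change(coef)  # exact: 100 * [exp(coef) - 1]
t_stat = coef / se                  # t statistic for H0: coefficient = 0

print(f"approximate: {approx_pct:.1f}%, exact: {exact_pct:.1f}%, t = {t_stat:.2f}")
```

The approximation $100 \cdot \hat{\beta}$ works well for small coefficients and drifts away from the exact figure as the coefficient grows in magnitude, which is why the two numbers differ by several percentage points here.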
(iii) The proportionate difference is .181 - .158 = .023, or about 2.3%. One equation that can be estimated to obtain the standard error of this difference is

$\log(salary) = \beta_0 + \beta_1 \log(sales) + \beta_2\, roe + \delta_1\, consprod + \delta_2\, utility + \delta_3\, trans + u$,

where trans is a dummy variable for the transportation industry. Now, the base group is finance, and so the coefficient $\delta_1$ directly measures the difference between the consumer products and finance industries, and we can use the t statistic on consprod.
8.1 Parts (ii) and (iii). The homoskedasticity assumption played no role in Chapter 5 in showing that OLS is consistent. But we know that heteroskedasticity causes statistical inference based on the usual t and F statistics to be invalid, even in large samples. As heteroskedasticity is a violation of the Gauss-Markov assumptions, OLS is no longer BLUE.

8.3 False. The unbiasedness of WLS and OLS hinges crucially on Assumption MLR.4, and, as we know from Chapter 4, this assumption is often violated when an important variable is omitted. When MLR.4 does not hold, both WLS and OLS are biased. Without specific information on how the omitted variable is correlated with the included explanatory variables, it is not possible to determine which estimator has the smaller bias. WLS could have more bias than OLS, or less. Because we cannot know, we should not claim to use WLS in order to solve "biases" associated with OLS.
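The point in 8.1, that the usual standard errors are unreliable under heteroskedasticity even though the OLS slope itself is still fine, can be illustrated with a simulation. The sketch below is an assumption-laden example: the data-generating process, the true coefficients, and the function name `simple_ols_with_ses` are hypothetical, and the robust standard error is the basic White-type formula for a simple regression with no degrees-of-freedom adjustment.

```python
import math
import random


def simple_ols_with_ses(x, y):
    """Simple regression slope with the usual and a heteroskedasticity-robust standard error."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    dx = [xi - xbar for xi in x]
    sst_x = sum(d * d for d in dx)
    b1 = sum(d * (yi - ybar) for d, yi in zip(dx, y)) / sst_x
    b0 = ybar - b1 * xbar
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]
    # Usual formula: assumes a constant error variance.
    sigma2_hat = sum(r * r for r in resid) / (n - 2)
    se_usual = math.sqrt(sigma2_hat / sst_x)
    # Robust formula: weights each squared residual by (x_i - xbar)^2.
    se_robust = math.sqrt(sum(d * d * r * r for d, r in zip(dx, resid))) / sst_x
    return b1, se_usual, se_robust


random.seed(1)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
# Hypothetical model: the error standard deviation is proportional to |x|.
y = [1.0 + 2.0 * xi + xi * random.gauss(0, 1) for xi in x]

b1, se_usual, se_robust = simple_ols_with_ses(x, y)
print(f"slope = {b1:.3f}, usual se = {se_usual:.4f}, robust se = {se_robust:.4f}")
```

In this design the slope estimate stays close to the true value of 2, while the usual standard error understates the sampling variation relative to the robust one.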
8.4 (i) These coefficients have the anticipated signs. If a student takes courses where grades are, on average, higher (as reflected by higher crsgpa), then his/her grades will be higher. The better the student has been in the past (as measured by cumgpa), the better the student does (on average) in the current semester. Finally, tothrs is a measure of experience, and its coefficient indicates an increasing return to experience.
The t statistic for crsgpa is very large, over five using the usual standard error (which is the larger of the two). Using the robust standard error for cumgpa, its t statistic is about 2.61, which is also significant at the 5% level. The t statistic for tothrs is only about 1.17 using either standard error, so it is not significant at the 5% level.

(ii) This is easiest to see without other explanatory variables in the model. If crsgpa were the only explanatory variable, $H_0\colon \beta_{crsgpa} = 1$ means that, without any information about the student, the best predictor of term GPA is the average GPA in the student's courses; this holds essentially by definition. (The intercept would be zero in this case.) With additional explanatory variables it is not necessarily true that $\beta_{crsgpa} = 1$ because crsgpa could be correlated with characteristics of the student. (For example, perhaps the courses students take are influenced by ability, as measured by test scores, and past college performance.) But it is still interesting to test this hypothesis.

The t statistic using the usual standard error is $t = (.900 - 1)/.175 \approx -.57$; using the heteroskedasticity-robust standard error gives $t \approx -.60$. In either case we fail to reject $H_0\colon \beta_{crsgpa} = 1$ at any reasonable significance level, certainly including 5%.

(iii) The in-season effect is given by the coefficient on season, which implies that, other things equal, an athlete's GPA is about .16 points lower when his/her sport is competing. The t statistic using the usual standard error is about -1.60, while that using the robust standard error is about -1.96. Against a two-sided alternative, the t statistic using the robust standard error is just significant at the 5% level (the standard normal critical value is 1.96), while using the usual standard error, the t statistic is not quite significant at the 10% level (critical value $\approx$ 1.65). So the standard error used makes a difference in this case. This example is somewhat unusual, as the robust standard error is more often the larger of the two.
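The significance statements in 8.4 (ii) and (iii) can be checked against the standard normal distribution. The sketch below uses only the figures quoted in the solution and the standard library's `statistics.NormalDist`; the function name `two_sided_p` is hypothetical, and using the normal rather than the t distribution is an assumption that matches the large-sample critical values cited in the text.

```python
from statistics import NormalDist


def two_sided_p(t_stat):
    """Two-sided p-value for a t statistic, using the standard normal approximation."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


# Part (ii): test of H0: coefficient on crsgpa equals 1, usual standard error.
t_h0 = (0.900 - 1.0) / 0.175

# Part (iii): t statistics on season under the two standard errors.
p_usual = two_sided_p(-1.60)
p_robust = two_sided_p(-1.96)

print(f"t for H0 (beta = 1): {t_h0:.2f}, p = {two_sided_p(t_h0):.3f}")
print(f"season, usual se:  p = {p_usual:.3f}")
print(f"season, robust se: p = {p_robust:.3f}")
```

The p-value for the usual standard error sits just above 0.10 and the one for the robust standard error sits essentially at 0.05, which matches the "not quite significant at 10%" and "just significant at 5%" readings.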
8.5 (i) No. For each coefficient, the usual standard errors and the heteroskedasticity-robust ones are very similar in practice.

(ii) The effect is -.029(4) = -.116, so the probability of smoking falls by about .116.

(iii) As usual, we compute the turning point in the quadratic: $.020/[2(.00026)] \approx 38.46$, so about 38 and one-half years.

(iv) Holding other factors in the equation fixed, a person in a state with restaurant smoking restrictions has a .101 lower chance of smoking. This is similar to the effect of having four more years of education.

(v) We just plug the values of the independent variables into the OLS regression line. The resulting estimated probability of smoking for this person is close to zero. (In fact, this person is not a smoker, so the equation predicts well for this particular observation.)

8.7 (i) This follows from the simple fact that, for uncorrelated random variables, the variance of the sum is the sum of the variances.

(ii) We compute the covariance between any t