




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、chapter 1 solutions to problems 1.1 (i) ideally, we could randomly assign students to classes of different sizes. that is, each student is assigned a different class size without regard to any student characteristics such as ability and family background. for reasons we will see in chapter 2, we wou
2、ld like substantial variation in class sizes (subject, of course, to ethical considerations and resource constraints).(ii) a negative correlation means that larger class size is associated with lower performance. we might find a negative correlation because larger class size actually hurts performan
3、ce.however, with observational data, there are other reasons we might find a negative relationship. for example, children from more affluent families might be more likely to attend schools with smaller class sizes, and affluent children generally score better on standardized tests. another possibili
4、ty is that, within a school, a principal might assign the better students to smaller classes. or, some parents might insist their children are in the smaller classes, and these same parents tend to be more involved in their childrens education. (iii) given the potential for confounding factors some
5、of which are listed in (ii) finding a negative correlation would not be strong evidence that smaller class sizes actually lead to better performance. some way of controlling for the confounding factors is needed, and this is the subject of multiple regression analysis. 1.2 (i) here is one way to pos
6、e the question: if two firms, say a and b, are identical in allrespects except that firm a supplies job training one hour per worker more than firm b, by how much would firm as output differ from firm bs? (ii) firms are likely to choose job training depending on the characteristics of workers. some
7、observed characteristics are years of schooling, years in the workforce, and experience in a particular job. firms might even discriminate based on age, gender, or race. perhaps firms choose to offer training to more or less able workers, where ability might be difficult toquantify but where a manag
8、er has some idea about the relative abilities of different employees. moreover, different kinds of workers might be attracted to firms that offer more job training on average, and this might not be evident to employers. (iii) the amount of capital and technology available to workers would also affec
9、t output. so, two firms with exactly the same kinds of employees would generally have different outputs if they use different amounts of capital or technology. the quality of managers would also have an effect. (iv) no, unless the amount of training is randomly assigned. the many factors listed in p
10、arts (ii) and (iii) can contribute to finding a positive correlation between output and training even if job training does not improve worker productivity. 1.3 it does not make sense to pose the question in terms of causality. economists would assume that students choose a mix of studying and workin
11、g (and other activities, such as attending class,1leisure, and sleeping) based on rational behavior, such as maximizing utility subject to the constraint that there are only 168 hours in a week. we can then use statistical methods to measure the association between studying and working, including re
12、gression analysis that we cover starting in chapter 2. but we would not be claiming that one variable causes the other. they are both choice variables of the student. chapter 2solutions to problems 2.1 (i) income, age, and family background (such as number of siblings) are just a fewpossibilities. i
13、t seems that each of these could be correlated with years of education. (income and education are probably positively correlated; age and education may be negatively correlated because women in more recent cohorts have, on average, more education; and number of siblings and education are probably ne
14、gatively correlated.) (ii) not if the factors we listed in part (i) are correlated with educ. because we would like to hold these factors fixed, they are part of the error term. but if u is correlated with educ then e(u|educ) 0, and so slr.4 fails. 2.2 in the equation y = b0 + b1x + u, add and subtr
15、act a0 from the right hand side to get y = (a0 + b0) + b1x + (u - a0). call the new error e = u - a0, so that e(e) = 0. the new intercept is a0 + b0, but the slope is still b1. 2.3 (i) let yi = gpai, xi = acti, and n = 8. then = 25.875, = 3.2125, (xi )(yi ) = i=1n= 5.8125, and (xi )2 = 56.875. from
16、equation (2.9), we obtain the slope as b1i=1n = 5.8125/56.875 .1022, rounded to four places after the decimal. from (2.17), b0 3.2125 (.1022)25.875 .5681. so we can write b1 = .5681 + .1022 act gpan = 8. the intercept does not have a useful interpretation because act is not close to zero for the inc
17、reases by .1022(5) = .511. population of interest. if act is 5 points higher, gpa (ii) the fitted values and residuals rounded to four decimal places are given along with the observation number i and gpa in the following table: 2 you can verify that the residuals, as reported in the table, sum to -.
18、0002, which is pretty close to zero given the inherent rounding error. = .5681 + .1022(20) 2.61. (iii) when act = 20, gpa i2, is about .4347 (rounded to four decimal places), (iv) the sum of squared residuals, ui=1nnand the total sum of squares, (yi )2, is about 1.0288. so the r-squared from thei=1r
19、egression is r2 = 1 ssr/sst 1 (.4347/1.0288) .577. therefore, about 57.7% of the variation in gpa is explained by act in this small sample of students. 2.4 (i) when cigs = 0, predicted birth weight is 119.77 ounces. when cigs = 20, bwght = 109.49.this is about an 8.6% drop. (ii) not necessarily. the
20、re are many other factors that can affect birth weight, particularly overall health of the mother and quality of prenatal care. these could be correlated withcigarette smoking during birth. also, something such as caffeine consumption can affect birth weight, and might also be correlated with cigare
21、tte smoking. (iii) if we want a predicted bwght of 125, then cigs = (125 119.77)/( .524) 10.18, or about 10 cigarettes! this is nonsense, of course, and it shows what happens when we are trying to predict something as complicated as birth weight with only a single explanatory variable. the largest p
22、redicted birth weight is necessarily 119.77. yet almost 700 of the births in the sample had a birth weight higher than 119.77. 3 (iv) 1,176 out of 1,388 women did not smoke while pregnant, or about 84.7%. because we are using only cigs to explain birth weight, we have only one predicted birth weight
23、 at cigs = 0. the predicted birth weight is necessarily roughly in the middle of the observed birth weights at cigs = 0, and so we will under predict high birth rates. 2.5 (i) the intercept implies that when inc = 0, cons is predicted to be negative $124.84. this, of course, cannot be true, and refl
24、ects that fact that this consumption function might be a poor predictor of consumption at very low-income levels. on the other hand, on an annual basis, $124.84 is not so far from zero. = 124.84 + .853(30,000) = 25,465.16 dollars. (ii) just plug 30,000 into the equation: cons (iii) the mpc and the a
25、pc are shown in the following graph. even though the intercept is negative, the smallest apc in the sample is positive. the graph starts at an annual income levelincreases housing prices. (ii) if the city chose to locate the incinerator in an area away from more expensiveneighborhoods, then log(dist
26、) is positively correlated with housing quality. this would violate slr.4, and ols estimation is biased. (iii) size of the house, number of bathrooms, size of the lot, age of the home, and quality of the neighborhood (including school quality), are just a handful of factors. as mentioned in part (ii
27、), these could certainly be correlated with dist and log(dist).4 2.7 (i) when we condition on ince(u|ince|inc) = e(e|inc0 because e(e|inc) = e(e) = 0. (ii) again, when we condition on incvar(u|ince|inc2var(e|inc) = se2inc because var(e|inc) = se2. (iii) families with low incomes do not have much dis
28、cretion about spending; typically, a low-income family must spend on food, clothing, housing, and other necessities. higher income people have more discretion, and some might choose more consumption while others more saving. this discretion suggests wider variability in saving among higher income fa
29、milies. 2.8 (i) from equation (2.66), nn2%b1 = xiyi / xi. i=1i=1 plugging in yi = b0 + b1xi + ui gives nn2%b1 = xi(b0+b1xi+ui)/ xi. i=1i=1 after standard algebra, the numerator can be written as b0xi+b1x+xiui. 2nnni=1i=1ii=1% as putting this over the denominator shows we can write b1 nn2nn2%b1 = b0x
30、i/ xi + b1 + xiui/ xi. i=1i=1i=1i=1 conditional on the xi, we have nn2%e(b1) = b0xi/ xi + b1 i=1i=1% is given by the first term in this equation. because e(ui) = 0 for all i. therefore, the bias in b1this bias is obviously zero when b0 = 0. it is also zero when xi = 0, which is the same asi=1n= 0. i
31、n the latter case, regression through the origin is identical to regression with an intercept.5 %in part (i) we have, conditional on the xi, (ii) from the last expression for b1%) var(b1n2nn2n2= xivarxiui = xixivar(ui) i=1i=1i=1i=1-2-2-2 nn22n22= xisxi = s/ xi2. i=1i=1i=1nnn222(iii) from (2.57), var
32、(b1) = s/(xi-). from the hint, xi (xi-)2, and soi=1i=1i=1nn%) var(b). a more direct way to see this is to write (x-)2 = x2-n()2, which var(bii11i=1i=1is less than xi2 unless = 0.i=1n % increases as increases (holding the sum of the (iv) for a given sample size, the bias in b1increases relative to va
33、r(b%). the bias in b% x2 fixed). but as increases, the variance of bi111% or b on a mean squared error is also small when b0 is small. therefore, whether we prefer b11basis depends on the sizes of b0, , and n (in addition to the size of xi2).i=1n 2.9 (i) we follow the hint, noting that c1y = c1 (the
34、 sample average of c1yi is c1 times the sample average of yi) and c2x = c2. when we regress c1yi on c2xi (including an intercept) we use equation (2.19) to obtain the slope:%=b1(cx-cx)(c-c)cc(x-)(y-)2in21i112niii=1nn=i=1(cx-c)2i2i=1nn2c(x-)22ii=12 c1=i=1c2(xi-)(yi-)=2i(x-)i=1c1b1.c2% = (c1) b%(c2) =
35、 (c1) (c1/c2)b(c2) = from (2.17), we obtain the intercept as b011) because the intercept from regressing yi on xi is ( b) = c1b). c1( b1 1 (ii) we use the same approach from part (i) along with the fact that (c1+y) = c1 + and(c2+x) = c2 + . therefore, (c1+yi)-(c1+y) = (c1 + yi) (c1 + ) = yi and (c2
36、+ xi) 6(c2+x) = xi . so c1 and c2 entirely drop out of the slope formula for the regression of (c1 + yi)% = (c+y) b% = b. the intercept is b%(c+x) = (c1 + ) b(c2 + ) = on (c2 + xi), and b1101121 + c1 c2b) + c1 c2b = b, which is what we wanted to show. (-b0111 (iii) we can simply apply part (ii) beca
37、use log(c1yi)=log(c1)+log(yi). in other words, replace c1 with log(c1), yi with log(yi), and set c2 = 0. (iv) again, we can apply part (ii) with c1 = 0 and replacing c2 with log(c2) and xi with log(xi). and b are the original intercept and slope, then b%=b and b%=b-log(c)b. if b01110021 2.10 (i) thi
38、s derivation is essentially done in equation (2.52), once (1/sstx) is brought inside the summation (which is valid because sstx does not depend on i). then, just definewi=di/sstx. ,)=e(b-b) , we show that the latter is zero. but, from part (i), (ii) because cov(b111n=nwe(u). because the u are pairwi
39、se uncorrelated -b) =ee(bwui11ii=1iii=1i(they are independent), e(ui)=e(ui2/n)=s2/n (because e(uiuh)=0, ih). therefore, ()i=1wie(ui)=i=1wi(s2/n)=(s2/n)i=1wi=0. nnn=-b and, plugging in =b+b+ (iii) the formula for the ols intercept isb010=(b+b+)-b=b+-(b-b) gives b0011011 and are uncorrelated, (iv) bec
40、ause b1)=var()+var(b)2=s2/n+(s2/sst)2=s2/n+s22/sst, var(b01xxwhich is what we wanted to show. )=s2(sst/n)+2/sst (v) using the hint and substitution gives var(b0xx-12222-12=s2nx-+/sst=snx/sstx. xi=1ii=1i 2.11 (i) we would want to randomly assign the number of hours in the preparation course so that h
41、ours is independent of other factors that affect performance on the sat. then, we would collect information on sat score for each student in the experiment, yielding a data set (sati,hoursi):i=1,.,n, where n is the number of students we can afford to have in the study. from equation (2.7), we should
42、 try to get as much variation in hoursi as is feasible.nn()()7(ii) here are three factors: innate ability, family income, and general health on the day of the exam. if we think students with higher native intelligence think they do not need to prepare for the sat, then ability and hours will be nega
43、tively correlated. family income would probably be positively correlated with hours, because higher income families can more easily affordpreparation courses. ruling out chronic health problems, health on the day of the exam should be roughly uncorrelated with hours spent in a preparation course. (i
44、ii) if preparation courses are effective,b1 should be positive: other factors equal, anincrease in hours should increase sat. (iv) the intercept, b0, has a useful interpretation in this example: because e(u) = 0, b0 is the average sat score for students in the population with hours = 0. chapter 3sol
45、utions to problems 3.1 (i) hsperc is defined so that the smaller it is, the lower the students standing in high school. everything else equal, the worse the students standing in high school, the lower is his/her expected college gpa. (ii) just plug these values into the equation: = 1.392 - .0135(20)
46、 + .00148(1050) = 2.676. colgpa (iii) the difference between a and b is simply 140 times the coefficient on sat, because hsperc is the same for both students. so a is predicted to have a score .00148(140) .207 higher. (iv) with hsperc fixed, dcolgpa = .00148dsat. now, we want to find dsat such that
47、= .5, so .5 = .00148(dsat) or dsat = .5/(.00148) 338. perhaps not surprisingly, a dcolgpalarge ceteris paribus difference in sat score almost two and one-half standard deviations is needed to obtain a predicted difference in college gpa or a half a point. 3.2 (i) yes. because of budget constraints,
48、it makes sense that, the more siblings there are in a family, the less education any one child in the family has. to find the increase in the number of siblings that reduces predicted education by one year, we solve 1 = .094(dsibs), so dsibs = 1/.094 10.6. (ii) holding sibs and feduc fixed, one more
49、 year of mothers education implies .131 years more of predicted education. so if a mother has four more years of education, her son is predicted to have about a half a year (.524) more years of education.8 (iii) since the number of siblings is the same, but meduc and feduc are both different, the co
50、efficients on meduc and feduc both need to be accounted for. the predicted difference in education between b and a is .131(4) + .210(4) = 1.364. 3.3 (i) if adults trade off sleep for work, more work implies less sleep (other things equal), so b1 < 0. (ii) the signs of b2 and b3 are not obvious, a
51、t least to me. one could argue that moreeducated people like to get more out of life, and so, other things equal, they sleep less (b2 < 0). the relationship between sleeping and age is more complicated than this model suggests, and economists are not in the best position to judge such things. (ii
52、i) since totwrk is in minutes, we must convert five hours into minutes: dtotwrk = 5(60) = 300. then sleep is predicted to fall by .148(300) = 44.4 minutes. for a week, 45 minutes less sleep is not an overwhelming change. (iv) more education implies less predicted time sleeping, but the effect is qui
53、te small. if we assume the difference between college and high school is four years, the college graduate sleeps about 45 minutes less per week, other things equal. (v) not surprisingly, the three explanatory variables explain only about 11.3% of the variation in sleep. one important factor in the e
54、rror term is general health. another is marital status, and whether the person has children. health (however we measure that), marital status, and number and ages of children would generally be correlated with totwrk. (for example, less healthy people would tend to work less.) 3.4 (i) a larger rank
55、for a law school means that the school has less prestige; this lowers starting salaries. for example, a rank of 100 means there are 99 schools thought to be better.(ii) b1 > 0, b2 > 0. both lsat and gpa are measures of the quality of the entering class. no matter where better students attend l
56、aw school, we expect them to earn more, on average. b3, b4 > 0. the number of volumes in the law library and the tuition cost are both measures of the school quality. (cost is less obvious than library volumes, but should reflect quality of the faculty, physical plant, and so on.) (iii) this is just the coefficient on gpa, multiplied by 100: 24.8%. (iv) this is an elasticity: a one percent increase in library volumes implies a .095% increase in
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2025江苏扬州工业职业技术学院博士专项招聘16人笔试备考试题带答案详解
- 2025杭州临安区河桥镇公开招聘村务工作者6人笔试备考题库及答案详解1套
- 陕西省西安市2023-2024学年高二下学期4月联考(选课分科摸底考)物理试题(解析版)
- 辽宁省沈阳市2024-2025学年高二上学期第三次质量监物理试卷(解析版)
- 山东省日照市2024-2025学年高一上学期11月期中校际联合考试数学试题(解析版)
- 脑动脉瘤夹闭手术实况
- 护理实践中的卫生保健与健康促进
- 土木工程中的可再生能源利用技术
- 数学 期末综合复习解答压轴题专题提升训练+2024-2025学年人教版七年级数学下册
- 肯德基的营销渠道布局
- 咨询实施顾问能力素质模型
- 2025年上海市数学高考一轮复习:立体几何(Ⅰ)(考点练+模拟练)含详解
- 情侣协议书电子版简单模板
- 建筑水电安装工程规范要求中的水泵安装要求
- 植物蛋白功能性修饰优化
- 【usmile电动牙刷的营销策略探析8600字(论文)】
- 小儿麻醉术后并发症
- 2019青鸟消防JBF6481编码器使用指南
- 2024-2030年中国生物基化学品行业市场发展趋势与前景展望战略分析报告
- 《机械设计基础》试题库(主观题及答案)
- JGJ100-2015 车库建筑设计规范
评论
0/150
提交评论