
Econometric Analysis, 7th Edition, William Greene
CHAPTER 2 The Linear Regression Model
Lectured by 元惠萍

2.1 Introduction
2.2 The Linear Regression Model
2.3 Assumptions of the Linear Regression Model
2.4 Summary and Conclusions

2.1 Introduction

The statement of a “model” typically begins with an observation or a proposition that one variable “is caused by” another, or “varies with another,” or some qualitative statement about a relationship between a variable and one or more covariates that are expected to be related to the interesting one in question. The econometrician's task is to translate that idea into a set of equations and to answer interesting questions about the variable of interest.

Individuals' usage of the health care system depends on, for example, perceived health status, demographics such as income, age, and education, and the amount and type of insurance they have. How does health care system utilization depend on insurance coverage? Specifically, is the relationship “positive” (all else equal, is an insured consumer more likely to “demand more health care”), or is it “negative”? And, ultimately, one might be interested in a more precise statement: “how much more (or less)”?

From a purely statistical point of view, the researcher might have in mind a variable, y, broadly “demand for health care, H,” a vector of covariates, x (= income, I, insurance, T), and a joint probability distribution of the three, p(H, I, T). The joint distribution factors as

$$p(H, I, T) = p(H \mid I, T)\, p(I, T).$$

The researcher is usually interested not in the joint variation of all the variables in the model, but in the conditional variation of one of the variables related to the others. The idea of the conditional distribution provides a useful starting point for thinking about a relationship between a variable of interest, a “y,” and a set of variables, “x,” that we think might bear some relationship to it.

What feature of the conditional distribution is of interest? The usual answer is $E[y \mid x]$, the conditional mean, known as the regression function. If we were studying incomes, I, however, which often have a highly skewed distribution, then the mean might not be particularly interesting. Rather, the conditional median, for given ages, $M[I \mid x]$, might be a more interesting statistic. Quantiles, such as the 20th percentile, or a poverty line defined as, say, the 5th percentile, might be more interesting yet. If the variable of interest is asset returns, then in at least some contexts means are not interesting at all; it is conditional variances that are most interesting.

The linear regression model is a useful departure point for studying other features, such as quantiles and variances.
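The contrast between the mean and the median of a skewed variable can be seen in a small simulation. The sketch below is only an illustration: the lognormal distribution, its parameters, and the variable names are assumptions chosen for the example, not data from the lecture.

```python
import numpy as np

# Hypothetical incomes: lognormal draws stand in for a highly skewed variable.
rng = np.random.default_rng(0)
incomes = rng.lognormal(mean=10.0, sigma=1.0, size=100_000)

mean_income = incomes.mean()
median_income = np.median(incomes)
# Lower quantiles of the same distribution (20th and 5th percentiles).
p20, p5 = np.percentile(incomes, [20, 5])

print(f"mean   : {mean_income:,.0f}")
print(f"median : {median_income:,.0f}")
print(f"20th percentile: {p20:,.0f}")
print(f"5th percentile : {p5:,.0f}")
```

With a long right tail the mean is pulled well above the median, which is why the median or a lower quantile can be the more informative summary for such a variable.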

2.2 The Linear Regression Model

The generic form of the linear regression model is

$$y = f(x_1, x_2, \ldots, x_K) + \varepsilon = x_1\beta_1 + x_2\beta_2 + \cdots + x_K\beta_K + \varepsilon \qquad \text{(2-1)}$$

y is the dependent or explained variable, or regressand. $x_1, \ldots, x_K$ are the independent or explanatory variables, or regressors, or covariates. One can conceive of movement of the independent variables outside the relationships defined by the model, while movement of the dependent variable is considered in response to some independent or exogenous stimulus.

The function $f(x_1, \ldots, x_K)$ is commonly called the population regression equation of y on $x_1, \ldots, x_K$. The underlying theory will specify the dependent and independent variables as well as the function in the model.

The random disturbance, $\varepsilon$, arises for several reasons.

Omitted factors. We cannot hope to capture every influence on an economic variable in a model, no matter how elaborate. The net effect, which can be positive or negative, of these omitted factors is captured in the disturbance. It is important not to view $\varepsilon$ as a catchall for the inadequacies of the model.

Errors of measurement. For example, the difficulty of obtaining reasonable measures of profits, interest rates, capital stocks, or, worse yet, flows of services from capital stocks, is a recurrent theme in the empirical literature. At the extreme, there may be no observable counterpart to the theoretical variable. The literature on the permanent income model of consumption [e.g., Friedman (1957)] provides an interesting example.

Example 2.1 Keynes's Consumption Function

$$C = \alpha + \beta X$$
$$C = \alpha + \beta X + \varepsilon$$
$$C = \alpha + \beta X + d_{\text{waryears}}\,\delta_w + \varepsilon$$

Example 2.2 Earnings and Education

$$\text{earnings} = \beta_1 + \beta_2\,\text{education} + \varepsilon$$
$$\text{earnings} = \beta_1 + \beta_2\,\text{education} + \beta_3\,\text{age} + \varepsilon$$
$$\text{earnings} = \beta_1 + \beta_2\,\text{education} + \beta_3\,\text{age} + \beta_4\,\text{age}^2 + \varepsilon$$

2.3 Assumptions of the Linear Regression Model

2.3.1 LINEARITY OF THE REGRESSION MODEL
2.3.2 FULL RANK
2.3.3 ZERO CONDITIONAL MEAN
2.3.4 SPHERICAL DISTURBANCES
2.3.5 DATA GENERATING PROCESS FOR THE REGRESSORS
2.3.6 NORMALITY
2.3.7 NOTION OF INDEPENDENCE
TABLE 2.1 Assumptions of the Linear Regression Model
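An earnings equation with education, age, and age squared, as in Example 2.2, remains linear in the parameters even though it is quadratic in age, so it can be estimated by least squares. The sketch below assumes simulated data: the coefficient values, sample size, and disturbance scale are hypothetical, and `numpy.linalg.lstsq` is used only as one convenient way to compute the least squares coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Hypothetical regressors: years of education and age.
education = rng.integers(8, 21, size=n).astype(float)
age = rng.uniform(20, 65, size=n)

# Hypothetical "true" coefficients for (constant, education, age, age^2).
beta_true = np.array([1.0, 0.08, 0.07, -0.0007])

# The model is linear in beta even though age enters as a quadratic.
X = np.column_stack([np.ones(n), education, age, age**2])
eps = rng.normal(0.0, 0.3, size=n)
y = X @ beta_true + eps

# Least squares coefficients.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
for name, est, true in zip(["const", "education", "age", "age^2"], b, beta_true):
    print(f"{name:10s} estimate={est: .5f}  true={true: .5f}")
```

The estimates differ from the assumed values only by sampling error, which shrinks as the sample grows.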

TABLE 2.1 Assumptions of the Linear Regression Model

A linear regression model: $y = x_1\beta_1 + \cdots + x_K\beta_K + \varepsilon$. Given a sample, for an observation $i$: $y_i = x_{i1}\beta_1 + \cdots + x_{iK}\beta_K + \varepsilon_i$.

A1. Linearity: The model specifies a linear relationship between y and $x_1, \ldots, x_K$.

A2. Full rank: There is no exact linear relationship among any of the independent variables in the model. This assumption will be necessary for estimation of the parameters of the model.

A3. Exogeneity of the independent variables: $E[\varepsilon_i \mid x_{j1}, x_{j2}, \ldots, x_{jK}] = 0$. This states that the expected value of the disturbance at observation i in the sample is not a function of the independent variables observed at any observation, including this one. This means that the independent variables will not carry useful information for prediction of $\varepsilon_i$.

A4. Homoscedasticity and nonautocorrelation: Each disturbance, $\varepsilon_i$, has the same finite variance, $\sigma^2$, and is uncorrelated with every other disturbance, $\varepsilon_j$. This assumption limits the generality of the model, and we will want to examine how to relax it.

A5. Data generation: The data in $(x_{j1}, x_{j2}, \ldots, x_{jK})$ may be any mixture of constants and random variables. The crucial elements for present purposes are the strict mean independence assumption A3 and the implicit variance independence assumption in A4. Analysis will be done conditionally on the observed X, so whether the elements in X are fixed constants or random draws from a stochastic process will not influence the results. In later, more advanced treatments, we will want to be more specific about the possible relationship between $\varepsilon_i$ and $x_j$.

A6. Normal distribution: The disturbances are normally distributed. Once again, this is a convenience that we will dispense with after some analysis of its implications.
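Assumption A2 can be checked numerically by computing the column rank of the regressor matrix. The sketch below is an illustration under assumed data: the income components and the coefficient values are hypothetical, and one column is deliberately constructed as the exact sum of two others so that the rank falls short.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

# Hypothetical income components; total income is an exact linear combination.
nonlabor = rng.uniform(0, 20, size=n)
salary = rng.uniform(20, 100, size=n)
total = nonlabor + salary

X_short = np.column_stack([np.ones(n), nonlabor, salary, total])  # 4 columns
X_full = X_short[:, :3]                                           # drop total

rank_short = np.linalg.matrix_rank(X_short)
rank_full = np.linalg.matrix_rank(X_full)
print("columns:", X_short.shape[1], "rank:", rank_short)
print("columns:", X_full.shape[1], "rank:", rank_full)

# With short rank, different coefficient vectors give identical fitted values:
# adding a to the nonlabor and salary coefficients and subtracting a from the
# total income coefficient leaves X @ beta unchanged.
beta = np.array([1.0, 0.5, 0.6, 0.1])          # hypothetical values
a = 3.0
beta_shifted = beta + a * np.array([0.0, 1.0, 1.0, -1.0])
same_fit = np.allclose(X_short @ beta, X_short @ beta_shifted)
print("identical right-hand side:", same_fit)
```

Because the two coefficient vectors cannot be told apart from the data, the parameters of the four-column model are not identified; dropping the redundant column restores full column rank.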

2.3.1 LINEARITY OF THE REGRESSION MODEL

The linearity assumption is not so narrow as it might first appear.

Example 2.3 The U.S. Gasoline Market (loglinear model, constant elasticity)

What should we expect for the sign of $\beta_4$, the coefficient on the (log) price of new cars? Cars and gasoline are complementary goods, so if the prices of new cars rise, ceteris paribus, gasoline consumption should fall. Or should it? If the prices of new cars rise, then consumers will buy fewer of them and keep their used cars longer. If older cars use more gasoline than newer ones, then the rise in the prices of new cars would lead to higher gasoline consumption than otherwise, not lower. We can use the multiple regression model and the gasoline data to attempt to answer the question.

Example 2.4 The Translog Model (flexible functional form)

Elasticities of substitution are functions of the second derivatives of production, cost, or utility functions. The linear model restricts elasticities of substitution to equal zero, whereas the loglinear model restricts the interesting elasticities to the uninteresting values of $-1$ or $+1$. The most popular flexible functional form is the translog model, which is often interpreted as a second-order approximation to an unknown functional form. It allows analysts to model these elasticities.

Write the unknown function as $\ln y = f(\ln x_1, \ldots, \ln x_K)$ and expand it in a second-order Taylor series around the point $\mathbf{x} = [1, 1, \ldots, 1]'$, so that $\ln x_k = 0$ at the expansion point. Collecting the derivatives into coefficients and adding a disturbance gives

$$\ln y = \beta_0 + \sum_{k=1}^{K} \beta_k \ln x_k + \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{K} \gamma_{kl} \ln x_k \ln x_l + \varepsilon.$$

This model is linear by our definition but can, in fact, mimic an impressive amount of curvature when it is used to approximate another function. An interesting feature of this formulation is that the loglinear model is a special case, $\gamma_{kl} = 0$.
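Because the translog model is linear in its coefficients, it can be fit by ordinary least squares once the log and cross-product regressors are constructed. The sketch below is a minimal illustration under stated assumptions: the helper `translog_design` is a hypothetical name, the data are simulated from a loglinear (Cobb-Douglas type) function with made-up coefficients, and the symmetry $\gamma_{kl} = \gamma_{lk}$ is imposed by including each cross product once.

```python
import numpy as np

def translog_design(x):
    """Translog regressors: constant, ln x_k, 0.5*(ln x_k)^2, and ln x_k * ln x_l for k < l."""
    lx = np.log(x)
    n, K = lx.shape
    cols = [np.ones(n)] + [lx[:, k] for k in range(K)]
    for k in range(K):
        for l in range(k, K):
            # With gamma_kl = gamma_lk, the double sum gives 0.5*gamma_kk on
            # squares and gamma_kl on each distinct cross product.
            scale = 0.5 if k == l else 1.0
            cols.append(scale * lx[:, k] * lx[:, l])
    return np.column_stack(cols)

rng = np.random.default_rng(4)
n = 20_000
# Hypothetical inputs with ln x ~ N(0, 0.5^2), so the data surround x = [1, 1].
x = np.exp(rng.normal(0.0, 0.5, size=(n, 2)))

# Simulated loglinear truth: all second-order terms are zero by construction.
ln_y = 0.5 + 0.3 * np.log(x[:, 0]) + 0.6 * np.log(x[:, 1]) + rng.normal(0.0, 0.05, size=n)

Z = translog_design(x)
coef, *_ = np.linalg.lstsq(Z, ln_y, rcond=None)
first_order = coef[1:3]   # estimates of beta_1, beta_2
second_order = coef[3:]   # estimates of gamma_11, gamma_12, gamma_22
print("first-order :", np.round(first_order, 4))
print("second-order:", np.round(second_order, 4))
```

When the data really are loglinear, the estimated second-order coefficients are close to zero, which is the nesting described above; nonzero estimates would indicate curvature that the loglinear form cannot capture.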

2.3.2 FULL RANK

There are no exact linear relationships among the variables. Hence, X has full column rank; the columns of X are linearly independent and there are at least K observations. This assumption is known as an identification condition.

If there are fewer than K observations, then X cannot have full rank. Hence, the assumption that n is at least as large as K is implied by the full rank condition and is redundant.

In a two-variable linear model with a constant term, the full rank assumption means that there must be variation in the regressor x. If there is no variation in x, it is a flaw in the data set. The possibility that this suggests is that we could have drawn a sample in which there was variation in x, but in this instance, we did not. Thus, the model still applies, but we cannot learn about it from the data set in hand.

Example 2.5 Short Rank (An Inestimable Model)

Suppose that consumption, C, relates to income as follows:

$$C = \beta_1 + \beta_2\,\text{nonlabor income} + \beta_3\,\text{salary} + \beta_4\,\text{total income} + \varepsilon,$$

where total income is exactly equal to salary plus nonlabor income. Clearly, there is an exact linear dependency in the model. Now let

$$\beta_2' = \beta_2 + a, \qquad \beta_3' = \beta_3 + a, \qquad \beta_4' = \beta_4 - a,$$

where a is any number. Then the exact same value appears on the right-hand side of C if we substitute $\beta_2'$, $\beta_3'$, and $\beta_4'$ for $\beta_2$, $\beta_3$, and $\beta_4$. Obviously, there is no way to estimate the parameters of this model.

2.3.3 ZERO CONDITIONAL MEAN

The mean of each $\varepsilon_i$ conditioned on all observations $x_i$ is zero: $E[\varepsilon_i \mid X] = 0$. This conditional mean assumption states, in words, that no observations on x convey information about the expected value of the disturbance. Later, when we extend the model, we will study the implications of dropping this assumption.

We will also assume that the disturbances convey no information about each other. That is, $E[\varepsilon_i \mid \varepsilon_1, \ldots, \varepsilon_{i-1}, \varepsilon_{i+1}, \ldots, \varepsilon_n] = 0$. In sum, at this point, we have assumed that the disturbances are purely random draws from some population.

2.3.4 SPHERICAL DISTURBANCES

As stated in Assumption A4, each disturbance has the same finite variance and is uncorrelated with every other disturbance: $\operatorname{Var}[\varepsilon_i \mid X] = \sigma^2$ for all i, and $\operatorname{Cov}[\varepsilon_i, \varepsilon_j \mid X] = 0$ for all $i \neq j$.

2.3.5 DATA GENERATING PROCESS FOR THE REGRESSORS

The assumption of nonstochastic regressors at this point would be a mathematical convenience. With it, we could use the results of elementary statistics to obtain our results by treating the vector $x_i$ simply as a known constant in the probability distribution of $y_i$. With this simplification, Assumptions A3 and A4 would be made unconditional and the counterparts would now simply state that the probability distribution of $\varepsilon_i$ involves none of the constants in X.

If $x_i$ is taken to be a random vector, then Assumptions 1 through 4 become a statement about the joint distribution of $y_i$ and $x_i$. The precise nature of the regressor and how we view the sampling process will be a major determinant of our derivation of the statistical properties of our estimators and test statistics. In the end, the crucial assumption is Assumption 3, the uncorrelatedness of X and $\varepsilon$.

2.3.6 NORMALITY

In view of our description of the source of $\varepsilon$, the conditions of the central limit theorem will generally apply, at least approximately, and the normality assumption will be reasonable in most settings. A useful implication of Assumption 6 is that observations on $\varepsilon_i$ are statistically independent as well as uncorrelated.

Normality is not necessary to obtain many of the results we use in multiple regression analysis, although it will enable us to obtain several exact statistical results. It does prove useful in constructing confidence intervals and test statistics. Later, it will be possible to relax this assumption and retain most of the statistical results we obtain here.
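The remark that normality is not needed for many regression results can be illustrated with a small Monte Carlo experiment. The sketch below assumes a hypothetical two-variable model with fixed regressors and disturbances drawn from a centered exponential distribution, which has mean zero and constant variance but is strongly skewed; all numerical values are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 2000
beta_true = np.array([1.0, 2.0])       # hypothetical intercept and slope

# Regressors are drawn once and held fixed across replications,
# so the analysis is conditional on X.
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])

slopes = np.empty(reps)
for r in range(reps):
    # Non-normal disturbances: exponential shifted to have mean zero.
    eps = rng.exponential(scale=1.0, size=n) - 1.0
    y = X @ beta_true + eps
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    slopes[r] = b[1]

mean_slope = slopes.mean()
print(f"average slope estimate over {reps} samples: {mean_slope:.4f}")
print(f"assumed true slope: {beta_true[1]:.4f}")
```

The average of the least squares slope across samples stays close to the assumed true value even though the disturbances are far from normal, consistent with the point that normality mainly matters for exact finite-sample distributions of test statistics and confidence intervals.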

2.3.7 NOTION OF INDEPENDENCE

Independent variables. Here, the notion of independence refers to the sources of variation. In the context of the model, the variation in the independent variables arises from sources that are outside of the process being described.

Mean and statistical independence. The condition $E[\varepsilon_i \mid X] = E[\varepsilon_i]$ (written as $E[\varepsilon_i \mid X] = 0$ in the book) is mean independence. Its implication is that variation in the disturbances in our data is not explained by variation in the independent variables. Conditional normality of the disturbances assu
