




已阅读5页,还剩97页未读, 继续免费阅读
版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
The Simple Regression Model (1) 简单二元回归,y = b0 + b1x + u,1,Chapter Outline 本章大纲,Definition of the Simple Regression Model 简单回归模型的定义-面临的三个问题(P22) Deriving the Ordinary Least Squares Estimates 普通最小二乘法的推导 Mechanics of OLS OLS的操作技巧 Units of Measurement and Functional Form 测量单位和函数形式 Expected Values and Variances of the OLS estimators OLS估计量的期望值和方差 Regression through the Origin 过原点回归,Intermediate Econometrics Yan Shen,2,Lecture Outline 讲义大纲,Some Terminology 一些术语的注解(P23) Functional relationship(P23) Difficult(P24) A Simple Assumption 一个简单假定(P25) Zero Conditional Mean Assumption 条件期望零值假定 What is Ordinary Least Squares 何为普通最小二乘法 Deriving OLS Estimates 普通最小二乘法的推导,Intermediate Econometrics Yan Shen,3,Some Terminology 术语注解,In the simple linear regression model, where y = b0 + b1x + u, we typically refer to y as the Dependent Variable, or Left-Hand Side Variable, or Explained Variable, or Regressand 在简单二元回归模型y = b0 + b1x + u中, y通常被称为因变量,左边变量,被解释变量,或回归子。,Intermediate Econometrics Yan Shen,4,Some Terminology 术语注解,In the simple linear regression of y on x, we typically refer to x as the Independent Variable, or Right-Hand Side Variable, or Explanatory Variable, or Regressor, or Covariate, or Control Variables 在y 对 x进行回归的简单二元回归模型中, x通常被称为自变量,右边变量,解释变量,回归元,协变量,或控制变量。,5,Some Terminology 术语注解,Equation y = b0 + b1x + u has only one nonconstant regressor x, it is called a simple linear regression model, or two-variables regression model, or bivariate linear regression model. 等式y = b0 + b1x + u只有一个非常数回归元。我们称之为简单回归模型, 两变量回归模型或双变量回归模型.,6,Some Terminology 术语注解,The coefficients b0 , b1 are called the regression coefficients. b0 is also called the constant term or the intercept term, or intercept parameter. b1 represents the marginal effects of the regressor, x. It is also called the slope parameter. b0 , b1被称为回归系数。 b0也被称为常数项或截矩项,或截矩参数。 b1代表了回归元x的边际效果,也被成为斜率参数。,7,Some Terminology 术语注解,The variable u is called the error term or disturbance in the relationship. It represents factors other than x that can affect y. u 为误差项或扰动项,它代表了除了x之外可以影响y的因素。,8,Some Terminology 术语注解,Meaning of linear: linear means linear in parameters, not necessarily mean that y and x must have a linear relationship. There are many cases that y and x have nonlinear relationship, but after some transformation, they are linear in parameters. For example, y=eb0+b1x+u . 线性的含义: y 和x 之间并不一定存在线性关系,但是,只要通过转换可以使y的转换形式和x的转换形式存在相对于参数的线性关系,该模型即称为线性模型。,9,Examples 简单二元回归模型例子,A simple wage equation wage= b0 + b1(years of education) + u b1 : if education increase by one year, how much more wage will one gain. 上述简单工资函数描述了受教育年限和工资之间的关系, b1 衡量了多接受一年教育工资可以增加多少.,10,A Simple Assumption 关于u的假定,The average value of u, the error term, in the population is 0. That is, E(u) = 0 (2.5) It it restrictive? 我们假定总体中误差项u的平均值为零. 该假定是否具有很大的限制性呢?,11,A Simple Assumption 关于u的假定,If for example, E(u)=5. Then y = (b0 +5)+ b1x + (u-5), therefore, E(u)=E(u-5)=0. This is not a restrictive assumption, since we can always use b0 to normalize E(u) to 0. 上述推导说明我们总可以通过调整常数项来实现误差项的均值为零, 因此该假定的限制性不大.,12,Zero Conditional Mean Assumption 条件期望零值假定,We need to make a crucial assumption about how u and x are related We want it to be the case that knowing something about x does not give us any information about u, so that they are completely unrelated. That is E(u|x) = E(u)。(P25)(阅读) P25FIG.2.1的阅读 我们需要对u和 x之间的关系做一个关键假定。理想状况是对x的了解并不增加对u的任何信息。换句话说,我们需要u和 x完全不相关。,13,Zero Conditional Mean Assumption 条件期望零值假定,Since we have assumed E(u) = 0, therefore, E(u|x) = E(u) = 0. (2.6) What does it mean? 由于我们已经假定了E(u) = 0,因此有E(u|x) = E(u) = 0。该假定是何含义?,14,Zero Conditional Mean Assumption 条件期望零值假定,In the example of education, suppose u represents innate ability, zero conditional mean assumption means E(ability|edu=6)=E(ability|edu=18)=0. The average level of ability is the same regardless of years of education. 在教育一例中,假定u 代表内在能力,条件期望零值假定说明不管解释教育的年限如何,该能力的平均值相同。,15,Zero Conditional Mean Assumption 条件期望零值假定,Question: Suppose that a score on a final exam, score, depends on classes attended (attend) and unobserved factors that affect exam performance (such as student ability). Then consider model score =b0 + b1attend +u When would you expect it satisfy (2.6)? 假设期末成绩分数取决于出勤次数和影响学生现场发挥的因素,如学生个人素质。那么上述模型中假设(2.6)何时能够成立?,16,Zero Conditional Mean Assumption 条件期望零值假定,(2.6) implies the population regression function, E(y|x) , satisfies E(y|x) = b0 + b1x. E(y|x) as a linear function of x, where for any x the distribution of y is centered about E(y|x). (2.6)说明总体回归函数应满足E(y|x) = b0 + b1x。该函数是x的线性函数,y的分布以它为中心。,17,18,.,.,x1=5,x2 =10,E(y|x) = b0 + b1x,y,f(y),给定x时y的 条件分布,Deriving the Ordinary Least Squares Estimates 普通最小二乘法的推导,Basic idea of regression is to estimate the population parameters from a sample Let (xi,yi): i=1, ,n denote a random sample of size n from the population For each observation in this sample, it will be the case that yi = b0 + b1xi + ui 回归的基本思想是从样本去估计总体参数。 我们用(xi,yi): i=1, ,n 来表示一个随机样本,并假定每一观测值满足yi = b0 + b1xi + ui。,19,20,.,.,.,.,y4,y1,y2,y3,x1,x2,x3,x4,u1,u2,u3,u4,x,y,Population regression line, sample data points and the associated error terms 总体回归线,样本观察点和相应误差,E(y|x) = b0 + b1x,Deriving OLS Estimates 普通最小二乘法的推导,To derive the OLS estimator we need to realize that our main assumption of E(u|x) = E(u) = 0 also implies that Cov(x,u) = E(xu) = 0 (P27) Why? Remember from basic probability that Cov(X,Y) = E(XY) E(X)E(Y) 由E(u|x) = E(u) = 0 可得Cov(x,u) = E(xu) = 0 。,21,Deriving OLS continued 普通最小二乘法的推导,We can write our 2 restrictions just in terms of x, y, b0 and b1 , since u = y b0 b1x E(y b0 b1x) = 0 Ex(y b0 b1x) = 0 These are called moment restrictions 可将u = y b0 b1x代入以得上述两个矩条件。,22,Deriving OLS using M.O.M. 使用矩方法推导普通最小二乘法,The method of moments approach to estimation implies imposing the population moment restrictions on the sample moments。 矩方法是将总体的矩限制应用于样本中。,23,Derivation of OLS 普通最小二乘法的推导,We want to choose values of the parameters that will ensure that the sample versions of our moment restrictions are true 目标是通过选择参数值,使得在样本中矩条件也可以成立。 The sample versions are as follows(P28),24,Derivation of OLS 普通最小二乘法的推导,Given the definition of a sample mean, and properties of summation, we can rewrite the first condition as follows 根据样本均值的定义以及加总的性质,可将第一个条件写为,25,Derivation of OLS 普通最小二乘法的推导,26,So the OLS estimated slope is 因此OLS估计出的斜率为(P29)-meaning,27,Summary of OLS slope estimate OLS斜率估计法总结(P29涵义阅读,P30图的理解),The slope estimate is the sample covariance between x and y divided by the sample variance of x. If x and y are positively correlated, the slope will be positive. If x and y are negatively correlated, the slope will be negative. Only need x to vary in our sample. 斜率估计量等于样本中x 和 y 的协方差除以x的方差。若x 和 y 正相关则斜率为正,反之为负。,28,More OLS 关于OLS的更多信息(P30阅读OLS),Intuitively, OLS is fitting a line through the sample points such that the sum of squared residuals is as small as possible, hence the term least squares。 The residual, , is an estimate of the error term, u, and is the difference between the fitted line (sample regression function) and the sample point。 OLS法是要找到一条直线,使残差平方和最小。 残差是对误差项的估计,因此,它是拟合直线(样本回归函数)和样本点之间的距离。,29,30,.,.,.,.,y4,y1,y2,y3,x1,x2,x3,x4,1,2,3,4,x,y,Sample regression line, sample data points and the associated estimated error terms 样本回归线,样本数据点和相关的误差估计项,Alternate approach to derivation 推导方法二(P30阅读,为什么不用别的方法?),Given the intuitive idea of fitting a line, we can set up a formal minimization problem That is, we want to choose our parameters such that we minimize the following: 正式解一个最小化问题,即通过选取参数而使下列值最小:,31,Alternate approach, continued 推导方法二(阅读P32),If one uses calculus to solve the minimization problem for the two parameters you obtain the following first order conditions, which are the same as we obtained before, multiplied by n 如果直接解上述方程我们得到下面两式,这两个式子等于前面两式乘以n,32,Lecture Summary 讲义总结,Introduce the simple linear regression model. Introduce the method of ordinary least squares to estimate the slope and intercept parameters using data from a random sample. 介绍简单线性回归模型 介绍通过随机样本的数据运用普通最小二乘法估计斜率和截距的参数值,33,The Simple Regression Model (2) 简单二元回归,y = b0 + b1x + u,34,Chapter Outline 本章大纲,Definition of the Simple Regression Model 简单回归模型的定义 Deriving the Ordinary Least Squares Estimates 推导普通最小二乘法的估计量 Mechanics of OLS OLS的操作技巧 Unites of Measurement and Functional Form 测量单位和回归方程形式 Expected Values and Variances of the OLS estimators OLS估计量的期望值和方差 Regression through the Origin 过原点的回归,35,Lecture Outline 讲义大纲(P38),Algebraic Properties of OLS OLS的代数特性 Goodness of fit 拟合优度Using Stata for OLS regression使用stata做OLS 回归 Effects of Changing Units in Measurement on OLS Statistics 改变测量单位对OLS统计量的效果,36,Intermediate Econometrics Yan Shen,37,Mechanics of OLS OLS的操作技巧 Example: CEO Salary and Return on Equity 例:CEO的薪水和资本权益报酬率,Example: CEO Salary and Return on Equity 例:CEO的薪水和资本权益报酬率,Salary: annual salary measured in $1000. In the 1990 data above, (min, mean, max)=(223, 1281, 14822). 变量salary衡量了已1000美元为单位的年薪,其最小值,均值和最大值分别如上。 Roe: net income/common equity, three-year average,(0.5, 17.18,56.3) Roe净收入/所有者权益,为三年平均值。 N=209. The estimated relation (estimated salary)=963.191 + 18.501roe.,38,Example: CEO Salary and Return on Equity 例:CEO的薪水和资本权益报酬率,Interpretation: 对估计量的解释: 963.19: The salary that the CEO will get when roe=0. 常数项的估计值衡量了当roe为零时CEO的薪水。 18.5: If ROE increases by one percentage point, then salary is going to increase by 18.5, i.e., $18,500. b1 的估计值反应了ROE若增加一个百分点工资将增加18500美元。 If roe=30, what is the estimated salary?,39,Algebraic Properties of OLS OLS的代数性质P38:3个特性阅读,P39,几个术语阅读,The sum of the OLS residuals is zero OLS 残差和为零 (p24) Thus, the sample average of the OLS residuals is zero as well 因此 OLS 的样本残差平均值也为零.,40,Algebraic Properties of OLS OLS的代数性质,The sample covariance between the regressors and the OLS residuals is zero 回归元(解释变量)和OLS残差之间的样本协方差为零 (p25),Intermediate Econometrics Yan Shen,41,Algebraic Properties of OLS OLS的代数性质,The OLS regression line always goes through the mean of the sample. OLS回归线总是通过样本的均值。,Intermediate Econometrics Yan Shen,42,Algebraic Properties of OLS OLS的代数性质,We can think of each observation as being made up of an explained part, and an unexplained part, 我们可把每一次观测看作由被解释部分和未解释部分构成. Then the fitted values and residuals are uncorrelated in the sample. 预测值和残差在样本中是不相关的,Intermediate Econometrics Yan Shen,43,Algebraic Properties of OLS OLS的代数性质,Intermediate Econometrics Yan Shen,44,More Terminology 更多术语,Define the total sum of square as 定义总平方和为,Intermediate Econometrics Yan Shen,45,More Terminology 更多术语,SST is a measure of the total sample variation in the ys; that is, it measures how spread out the ys are in the sample. 总平方和是对y在样本中所有变动的度量,即它度量了y在样本中的分散程度 If we divide SST by n-1, we obtain the sample variance of y. 将总平方和除以n-1,我们得到y的样本方差。,Intermediate Econometrics Yan Shen,46,More Terminology 更多术语,Explained Sum of Squares (SSE)is defined as 解释平方和定义为 It measures the sample variation in the predicted value of ys. 它度量了y的预测值的在样本中的变动,Intermediate Econometrics Yan Shen,47,More Terminology 更多术语,Residual Sum of Squares is defined as 残差平方和定义为 SSR measures the sample variation in the residuals. 残差平方和度量了残差的样本变异,Intermediate Econometrics Yan Shen,48,SST, SSR and SSE,The total variation in y can always be expressed as the sum of the explained variation SSE and the unexplained variation SSR, i.e. y 的总变动可以表示为已解释的变动SSE和 未解释的变动SSR之和,即 SST=SSE+SSR,Intermediate Econometrics Yan Shen,49,Proof that SST = SSE + SSR 证明 SST = SSE + SSR,Intermediate Econometrics Yan Shen,50,Proof that SST = SSE + SSR,Therefore, SST = SSE + SSR. We have used the fact that the fitted value and residuals are uncorrelated in the sample. 该证明中我们使用了一个事实, 即样本中因变量的拟合值和残差不相关.,Intermediate Econometrics Yan Shen,51,Goodness-of-Fit 拟合优度(P40定义)R2范围,How do we think about how well our sample regression line fits our sample data? 我们如何衡量样本回归线是否很好地拟合了样本数据呢? Can compute the fraction of the total sum of squares (SST) that is explained by the model, call this the R-squared of regression 可以计算模型解释的总平方和的比例,并把它定义为回归的R-平方 R2 = SSE/SST = 1 SSR/SST,Intermediate Econometrics Yan Shen,52,Goodness-of-Fit 拟合优度,R-squared is the ratio of the explained variation compared to the total variation. R-平方是已解释的变动占所有变动的比例 It is thus interpreted as the fraction of the sample variation in y that is explained by x. 它因此可被看作是y的样本变动中被可以被x解释的部分 The value of R-squared is always between zero and one. R-平方的值总是在0和1之间,Intermediate Econometrics Yan Shen,53,Goodness-of-Fit 拟合优度,In the social sciences, low R-squareds in regression equations are not uncommon, especially for cross-sectional analysis. 在社会科学中,特别是在截面数据分析中, 回归方程得到低的R-平方值并不罕见。 It is worth emphasizing that a seemingly low R-squared does not necessarily mean that an OLS regression equation is useless. 值得强调的是表面上低的R-平方值不一定说明 OLS回归方程是没有价值的,Intermediate Econometrics Yan Shen,54,Goodness-of-Fit 拟合优度,Example 2.8(阅读P41) CEO Salary and Return on Equity CEO薪水和净资产回报 Example 2.9 Voting outcomes and Campaign Expenditures 竞选结果和选举活动开支,Intermediate Econometrics Yan Shen,55,Using Stata for OLS regressions 使用 Stata 进行OLS回归,Now that weve derived the formula for calculating the OLS estimates of our parameters, youll be happy to know you dont have to compute them by hand 我们已经推导出公式计算参数的OLS估计值,所幸的是我们不必亲手去计算它们。 Regressions in Stata are very simple, to run the regression of y on x, just type 在Stata中进行回归非常简单,要让y对x进行回归,只需要输入 reg y x,Intermediate Econometrics Yan Shen,56,Units of Measurement 测量单位,Suppose the salary is measured in hundreds of dollars, rather than in thousands of dollars, say salarhun. 假定薪水的单位是美元,而不是千美元,salarys. What will be the OLS intercept and slope estimates in the regression of salarhun on roe? 在Salarys对roe进行回归时OLS截距和斜率的估计值是多少?,Intermediate Econometrics Yan Shen,57,Units of Measurement 测量单位,The estimated sample regression is changed from (estimated salarys)=963.191 + 18.501roe to (estimated salarys)=963191 + 18501roe In general, when the dependent variable is multiplied by the constant c, but nothing has changed for the independent variable, the OLS intercept and slope estimates are also multiplied by c. 一般而言,当因变量乘上常数c,而自变量不改变时,OLS的截距和斜率估计量也要乘上c。,Intermediate Econometrics Yan Shen,58,Units of Measurement 测量单位,If redefine roedec = roe/100, then the sample regression line will change from 如果定义 roedec = roe/100,那么样本回归线将会从 (estimated salary)=963.191 + 18.501roe to 改变到 (estimated salary)=963.191 + 1850.1roedec In general, if the independent variable is divided or multiplied by some nonzero constant, c, then the OLS slope coefficient is multiplied or divided by c, but the intercept will not change. 一般而言,如果自变量除以或乘上某个非零常数,c,那么OLS斜率将乘以或除以c,而截距则不改变。,Intermediate Econometrics Yan Shen,59,Incorporating Nonlinearrities in Simple Regression 在简单回归中加入非线性(P46线性与非线性的),Linear relationships are not general enough for all economic applications. 线性关系并不适合所有的经济学运用 However, it is rather easy to incorporate many nonlinearities into simple regression analysis by appropriately defining the dependent and independent variables. 然而,通过对因变量和自变量进行恰当的定义, 我们可以在简单回归分析中非常容易地处理许多y和x之间的非线性关系.,Intermediate Econometrics Yan Shen,60,The Natural Logarithm 自然对数,Intermediate Econometrics Yan Shen,61,In the wage-education example, now suppose the percentage increase in wage is the same, given one more year of education. 在工资-教育的例子中,假定每增加一年的教育,工资的百分比增长都是相同的 A model that gives a constant percentage effect is 能够给出不变的百分比效果的模型是 If , we have,Intermediate Econometrics Yan Shen,62,Example 2.10,A log Wage Equation 将对数工资方程 Compared to 和该方程相比,Intermediate Econometrics Yan Shen,63,Another important use of the natural log is in obtaining a constant elasticity model 自然对数的另一个重要用途是用于获得弹性为常数的模型 In the example of CEO Salary and Firm Sales, a constant elasticity model is 在CEO的薪水和企业销售额的例子中,常数弹性模型是,Intermediate Econometrics Yan Shen,64,Combination of functional forms available from using either the original variable or its natural log 变量的原始形式和其自然对数的不同组合,Intermediate Econometrics Yan Shen,65,The Simple Regression Model 简单二元回归 (3),y = b0 + b1x + u,Intermediate Econometrics Yan Shen,66,Chapter Outline 本章大纲,Definition of the Simple Regression Model 二元回归模型的定义 Deriving the Ordinary Least Squares Estimates 推导普通最小二乘法的估计量 Mechanics of OLS OLS的操作技巧 Unites of Measurement and Functional Form 测量单位和函数形式 Expected Values and Variances of the OLS estimators OLS估计量的期望值和方差(P47) Regression through the Origin 过原点回归,Intermediate Econometrics Yan Shen,67,Expected Values and Variances of the OLS Estimators OLS估计量的期望值和方差,We will study the properties of the distributions of OLS estimators over different random samples from the population. 从总体中抽取的不同的随机样本可得到不同的OLS估计量,我们将研究这些OLS估计量的分布。(P49,关注什么?参变量的性质) We begin by establishing the unbiasedness of OLS under a set of assumptions. 首先,我们在一些假定下证明OLS的无偏性。,Intermediate Econometrics Yan Shen,68,Assumption SLR.1 (Linear in Parameters): 假定SLR.1 (关于参数是线性的)(P49),In the population model, the dependent variable y is related to the independent variable x and the error u as 在总体模型中,因变量 y 和自变量 x 和误差 u 的关系可写作 y = b0 + b1x + u ,(没有帽) where b0 and b1 are the population intercept and slope parameters respectively. 其中 b0 和 b1 分别是总体的截距参数和斜率参数,Intermediate Econometrics Yan Shen,69,Assumption SLR.2 (Random Sampling): 假定SLR.2 (随机抽样):,Assume we can use a random sample of size n, (xi, yi): i=1, 2, , n, from the population model. Thus we can write the sample model yi = b0 + b1xi + ui 假定我们从总体模型随机抽取容量为n的样本, (xi, yi): i=1, 2, , n, 那么可以写出样本模型为 yi = b0 + b1xi + ui(没有帽),Intermediate Econometrics Yan Shen,70,Assumptions SLR.3 and SLR.4 假定 SLR.3 和 SLR.4,SLR.3 , Zero Conditional Mean: SLR.3, 零条件期望(P48阅读理解) Assume E(u|x) = 0 and thus in a random sample E(ui|xi) = 0 假定 E(u|x) = 0 . 那么在随机样本中我们有 E(ui|xi) = 0 SLR.4 (Sample Variation in the Independent Variable): In the sample, the independent variables xs are not all equal to the same constant. SLR.4 (自变量中的样本变动): 在样本中,自变量 x 并不等于一个不变常数。,Intermediate Econometrics Yan Shen,71,Theorem 2.1 (Unbiasedness of OLS) 定理2.1 ( OLS的无偏性),Using assumptions SLR.1 through SLR.4, we have the ex
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 中直企业用地管理办法
- 企业量化基础管理办法
- 信息保护薪酬管理办法
- 住院患者欠费管理办法
- 会所泊车部门管理办法
- 乡镇挂职人员管理办法
- 云南税种认定管理办法
- 公司参展物资管理办法
- 信息伦理案件管理办法
- 事业人员上班管理办法
- 网络安全运维认证试卷含答案
- 2025年江苏盐城市射阳县城市照明服务有限公司聘考试笔试试题(含答案)
- 2025年团委工作总结-循“荔枝之道”而行走稳青春育人之路
- 消防装备维护保养课件教学
- 设备安全培训
- 2025至2030中国角膜塑形镜行业产业运行态势及投资规划深度研究报告
- 艾梅乙反歧视培训课件
- 小学数学课堂教学实践与创新
- 妇幼保健院(2025-2025年)十五五发展规划
- 2025年03月四川成都农业科技中心公开招聘笔试历年典型考题(历年真题考点)解题思路附带答案详解
- 七年级期末考试数学质量分析
评论
0/150
提交评论