Lecture 7: Panel Data
Agenda

• Panel Data
• Panel Data DGPs
• Fixed Effects
• Random Effects
• Example: Production Functions
• The Hausman Test

Panel Data

Unobserved heterogeneity is a potential source of omitted variables bias. "Unobserved heterogeneity" refers to omitted variables that are fixed for an individual (at least over a long period of time). A person's upbringing, family characteristics, innate ability, and demographics (except age) do not change.

With cross-sectional data, there is no particular reason to differentiate between omitted variables that are fixed over time and omitted variables that are changing. However, when an omitted variable is fixed over time, panel data offers another tool for eliminating the bias.

Panel data is data in which we observe repeated cross-sections of the same individuals. Examples:

• Annual unemployment rates of each state over several years
• Quarterly sales of individual stores over several quarters
• Wages for the same worker, working at several different jobs

Some of the most valuable data sets in economics are panel data sets. Longitudinal surveys return year after year to the same individuals, tracking them over time.

What Is Panel Data?

Classical econometric models use either time series data or cross-section data. A panel data model is built by using time series and cross-section data together. Combining the two:

• increases the sample size
• increases the variation in the variables
• makes it possible to analyze differences between observed units
• makes it possible to analyze differences between time periods
• makes it possible to analyze causal relationships across periods (dynamic models)

In short, combining time series and cross-section data (pooled data) increases the information content of the sample. This improves the model estimates and also makes it possible to study questions that cannot be analyzed with time series data or cross-section data alone.

Sources of Panel Data

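In practice, a large share of publicly available statistical data combines time series and cross-sections:

• Annual statistics by administrative region
• Financial statements of listed companies
• The National Bureau of Statistics household income and expenditure survey (sample rotated periodically)
• Agricultural product cost survey data
• Fixed observation point survey data of the Research Center for Rural Economy, Ministry of Agriculture

Methods for Handling Combined Time Series and Cross-Section Data

There are two approaches:

• Pooled data model: treats the observations on different periods and different units as randomly drawn observations. This is the simplest approach, but it does not extract all the information in the sample.
• Panel data model: treats the observations on different periods and different units as phenomena associated with the period or the cross-section unit. This approach is more complex, but it uses the sample information more fully.

The pooled data model can be viewed as a special case of the panel data model.

In applied work, panel data can be divided into:

• Balanced panel data (sample size $= N \times T$)
• Unbalanced panel data (sample size $= \sum_{i=1}^{N} T_i$)

Advantages of Panel Data

Panel data control for individual heterogeneity.

• E.g., regressing cigarette consumption on lagged consumption, price, and income, where $Z_i$ stands for religion and education and $W_t$ stands for advertising.
• Deaton (1995): regress yield per unit of land on land, labor, fertilizer, farmers' education, and so on. Do small farms have higher yields? Possible explanations: under uncertainty, small farmers pursue higher yields; family farms have lower supervision costs; land quality differs, with small farmers holding higher-quality land.

Panel data provide more information, more variability, less collinearity, more degrees of freedom, and more efficiency.

Panel data are better suited to analyzing dynamic adjustment.

Panel data are better suited to analyzing effects that cannot be studied with time series or cross-section data alone, e.g., female labor force participation or satisfaction analysis.

Panel data allow better estimation and testing of complex behavioral models, e.g., technical efficiency.

Example 1: Supply behavior model. A long-standing question in production function analysis is how to separate the effects of economies of scale from those of technical progress. For rural China, this research helps answer the policy question of whether farm households should be encouraged to expand their scale of operation. Cross-section data reflect the effect of differences in scale but cannot account for technical progress. Time series data mix the two effects together, so they are hard to separate.

Example 2: Demand behavior model. A persistent difficulty in demand analysis is separating the effect of income changes from the effect of price changes. The two often move together over time and are highly correlated, so a model built on time series data faces severe multicollinearity. Panel data increase the variation in prices and income and reduce their correlation, which improves the parameter estimates. Panel samples are also larger, so more explanatory factors (for example, demographic variables) can be included. Panel model results help identify differences between observed units and how consumption behavior changes over time, and this information supports decision making.

Disadvantages of Panel Data

• Data collection problems
• Measurement error
• Selectivity bias, nonresponse, and attrition
• Short time series dimension

Panel Data Models

Consider the following model built on pooled data ($2N$ parameters to estimate):

$$Y_{it} = \alpha_i + \beta_i X_{it} + u_{it}, \quad i = 1, 2, \ldots, N; \; t = 1, 2, \ldots, T$$

This specification amounts to building a separate model for each cross-section unit. Is that feasible (sample size)? Is it necessary (research purpose)?

Simplifying assumption 1: a common slope ($N + 1$ parameters to estimate):

$$Y_{it} = \alpha_i + \beta X_{it} + u_{it}, \quad i = 1, 2, \ldots, N; \; t = 1, 2, \ldots, T$$

This case is the cross-section fixed effects model.

Simplifying assumption 2: a common constant and a common slope (2 parameters to estimate):

$$Y_{it} = \alpha + \beta X_{it} + u_{it}, \quad i = 1, 2, \ldots, N; \; t = 1, 2, \ldots, T$$

A Panel Data DGP

$$Y_{it} = \beta_{0i} + \beta_1 X_{1it} + \beta_2 X_{2i} + \beta_3 X_{3t} + \cdots + \beta_K X_{Kit} + \varepsilon_{it}, \quad i = 1, \ldots, n; \; t = 1, \ldots, T$$

$$E(\varepsilon_{it}) = 0, \qquad Var(\varepsilon_{it}) = \sigma^2$$

$$E(\varepsilon_{it}\varepsilon_{i't'}) = 0 \text{ if } i \neq i' \text{ OR } t \neq t'$$

$$E(X_{jit}\varepsilon_{it}) = 0 \text{ for all } j, i, t$$

Panel Data DGPs

Notice that when we have panel data, we index observations with both i and t. Pay close attention to the subscripts on variables. Some variables vary only across time or only across individuals.

• $X_{2i}$ varies only by individual and is fixed over time. $X_{2i}$ might be a variable such as race or gender.
• $X_{3t}$ varies only by time and is fixed across i. $X_{3t}$ might be national unemployment.
• $X_{1it}$ varies across BOTH individual and time. For example, $X_{1it}$ might refer to wages.

One of the key features of the DGP is that we allow each individual to have a distinct intercept, $\beta_{0i}$. This intercept includes ALL aspects of unobserved heterogeneity that are fixed over the length of the panel.

In this DGP, the $\beta_{0i}$ are fixed across samples. The unmeasured heterogeneity is the same in every sample. This DGP is called the "Distinct Intercepts" DGP. It is suitable for panels of states or countries, where the same individuals would be selected in each sample.

With longitudinal data on individual workers or consumers, we draw a different set of individuals from the population each time we collect a sample. Each individual has his or her own set of fixed omitted variables. We cannot fix each individual intercept.

Another Panel Data DGP

$$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2i} + \beta_3 X_{3t} + \cdots + \beta_K X_{Kit} + v_i + \varepsilon_{it}, \quad i = 1, \ldots, n; \; t = 1, \ldots, T$$

$$E(\varepsilon_{it}) = 0, \qquad Var(\varepsilon_{it}) = \sigma_\varepsilon^2, \qquad E(\varepsilon_{it}\varepsilon_{i't'}) = 0 \text{ if } i \neq i' \text{ OR } t \neq t'$$

$$E(v_i) = 0, \qquad Var(v_i) = \sigma_v^2, \qquad E(v_i v_{i'}) = 0 \text{ for } i \neq i'$$

$$E(v_i \varepsilon_{i't}) = 0 \text{ for all } i, i', t, \qquad E(X_{jit}\varepsilon_{it}) = 0 \text{ for all } j, i, t$$

$$\text{EITHER } E(X_{jit} v_i) = 0 \text{ for all } j, i, t \quad \text{OR} \quad E(X_{jit} v_i) \neq 0 \text{ for at least some } j, i, t$$

In this DGP, we return to a model with a single intercept for all data points, $\beta_0$. However, we break the error term into two components, $v_i + \varepsilon_{it}$. When we draw an individual i, we draw one $v_i$ that is fixed for that individual in all time periods. $v_i$ includes all fixed omitted variables.

In the Distinct Intercepts DGP, the unobserved heterogeneity is absorbed into the individual-specific intercept $\beta_{0i}$. In the second DGP, the unobserved heterogeneity is absorbed into the individual fixed component of the error term, $v_i$. This DGP is an "Error Components Model."

The Error Components DGP comes in two flavors, depending on $E(X_{jit} v_i)$.

• If $E(X_{jit} v_i) = 0$, then the unobserved heterogeneity is uncorrelated with the explanators. OLS is unbiased and consistent.
• If $E(X_{jit} v_i) \neq 0$, then the unobserved heterogeneity IS correlated with the explanators. OLS is BIASED and INCONSISTENT.

Panel data is most useful in the second Error Components case. When $E(X_{jit} v_i) \neq 0$, OLS is inconsistent. Using panel data, we can create a consistent estimator: Fixed Effects.

Fixed Effects

The Fixed Effects Estimator is used with EITHER the distinct intercepts DGP OR the error components DGP with $E(X_{jit} v_i) \neq 0$.

Basic idea: estimate a separate intercept for each individual.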
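The two flavors of the Error Components DGP can be illustrated with a small simulation. The following is a minimal sketch, assuming NumPy is available; the helper name `simulate_pooled_ols`, the sample sizes, and the coefficient values are hypothetical choices made for illustration and do not come from the lecture. It draws one $v_i$ per individual, optionally makes $X_{it}$ depend on $v_i$, and returns the pooled OLS slope.

```python
import numpy as np

def simulate_pooled_ols(correlated, n=500, T=5, beta1=2.0, seed=0):
    """Simulate Y_it = 1 + beta1*X_it + v_i + eps_it; return the pooled OLS slope."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=n)                  # fixed heterogeneity v_i, one draw per individual
    x = rng.normal(size=(n, T))
    if correlated:
        x = x + v[:, None]                  # makes E(X_it v_i) != 0
    eps = rng.normal(size=(n, T))
    y = 1.0 + beta1 * x + v[:, None] + eps
    # Pooled OLS: stack all n*T observations and fit one common intercept
    X = np.column_stack([np.ones(n * T), x.ravel()])
    coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
    return coef[1]

slope_uncorrelated = simulate_pooled_ols(correlated=False)
slope_correlated = simulate_pooled_ols(correlated=True)
print(slope_uncorrelated, slope_correlated)
```

Under these assumed settings the first slope should land near the true value of 2, while the second should be pushed upward, because part of the effect of $v_i$ is attributed to $X_{it}$.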

The simplest way to estimate separate intercepts for each individual is to use dummy variables. This method is called the least squares dummy variable (LSDV) estimator.

We have already seen that we can use dummy variables to estimate separate intercepts for different groups. With panel data, we have multiple observations for each individual, so we can group the observations by individual.

Least Squares Dummy Variable Estimator:

1. Create a set of n dummy variables, $D^j$, such that $D^j = 1$ if $i = j$ and 0 otherwise.
2. Regress $Y_{it}$ against all the dummies and the $X_t$ and $X_{it}$ variables. You must omit the $X_i$ variables and the constant, because they are perfectly collinear with the dummies.

The LSDV estimator is conceptually quite simple. In practice, the tricky parts are:

1. Creating the dummy variables
2. Entering the regression into the computer
3. Reporting results
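The two LSDV steps can be sketched in a few lines. This is an illustrative example on simulated data, assuming NumPy; the panel dimensions, the true slope, and all variable names are hypothetical and are not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T, beta1 = 50, 6, 1.5                      # hypothetical panel size and true slope

ids = np.repeat(np.arange(n), T)              # individual index i for each stacked row
intercepts = rng.normal(scale=2.0, size=n)    # distinct intercepts beta_0i
x = rng.normal(size=n * T) + intercepts[ids]  # X_it correlated with the intercepts
y = intercepts[ids] + beta1 * x + rng.normal(size=n * T)

# Step 1: n dummies, D[row, j] = 1 if the row belongs to individual j
D = (ids[:, None] == np.arange(n)[None, :]).astype(float)

# Step 2: regress Y on all the dummies and X_it, with no constant
design = np.column_stack([D, x])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
lsdv_intercepts, lsdv_slope = coef[:n], coef[n]
print(lsdv_slope)
```

The first n coefficients are the estimated individual intercepts, which are usually left out of a results table; the last coefficient is the slope of interest.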

Suppose we have a longitudinal dataset with 300 workers over 10 years, so n = 300. We must create 300 dummy variables and then specify a regression with 300+ explanators. How do we do this in our software package?

Our regression output includes 300 intercepts. Usually, we are not interested in the intercepts themselves. We include the dummy variables to control for heterogeneity.

In reporting your regression output, it is preferable to note that you have included "individual fixed effects" and then omit the dummy variable coefficients from your table of results.

At some point, n becomes too large for the computer to handle easily. Modern computers can implement LSDV for ever larger data sets, but eventually LSDV becomes computationally intractable.

Two-way fixed effects: include dummy variables for both the cross-section units and the time periods. Test whether the pooled model or the fixed effects model is appropriate.

A computationally convenient alternative is called the Fixed Effects Estimator. Technically, only this strategy is "Fixed Effects"; using dummy variables is LSDV. In practice, econometricians tend to refer to either method as Fixed Effects.

The initial insight for the Fixed Effects estimator: if we DIFFERENCE observations for the same individual, the $v_i$ cancels out.

$$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2i} + v_i + \varepsilon_{it}$$

$$Y_{it'} = \beta_0 + \beta_1 X_{1it'} + \beta_2 X_{2i} + v_i + \varepsilon_{it'}$$

$$Y_{it} - Y_{it'} = \beta_1 (X_{1it} - X_{1it'}) + (\varepsilon_{it} - \varepsilon_{it'})$$

When we difference, the heterogeneity term $v_i$ drops out. (In the distinct intercepts model, the $\beta_{0i}$ would drop out.) By assumption, the $\varepsilon_{it}$ are uncorrelated with the $X_{it}$, so OLS on the differenced data would be a consistent estimator of $\beta_1$.

If T = 2, then we have only 2 observations for each individual (as in the Gibbons and Katz example). Differencing the 2 observations is efficient. If T > 2, then differencing any 2 observations ignores valuable information in the other observations for each individual.

We can use all the observations for each individual if we subtract the individual-specific mean from each observation.

$$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2i} + v_i + \varepsilon_{it}$$

$$\bar{Y}_i = \beta_0 + \beta_1 \bar{X}_{1i} + \beta_2 X_{2i} + v_i + \bar{\varepsilon}_i$$

$$Y_{it} - \bar{Y}_i = \beta_1 (X_{1it} - \bar{X}_{1i}) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$

where $\bar{Y}_i = \frac{1}{T}\sum_{t=1}^{T} Y_{it}$. Note: because $v_i$ does not vary over t, its individual-specific mean is $v_i$ itself.

Fixed Effects:

1) Construct $y^{FE}_{it} = Y_{it} - \bar{Y}_i$ and $x^{FE}_{it} = X_{it} - \bar{X}_i$.
2) Regress $y^{FE}_{it}$ on $x^{FE}_{it}$.

The Fixed Effects and LSDV estimators provide exactly identical estimates. The $nT - k$ degrees-of-freedom term in the Fixed Effects estimated standard errors (e.s.e.s) must be adjusted to account for the extra n degrees of freedom that have been used. The computer can make this adjustment.

Demeaning each observation by the individual-specific mean eliminates the need to create n dummy variables. FE is computationally much simpler.
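The demeaning recipe and its equivalence with LSDV can be checked numerically. The sketch below assumes NumPy and uses simulated data with hypothetical sizes and coefficients; the symbol `k` here counts slope coefficients only, which is an assumption about notation rather than something the lecture specifies.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, beta1 = 40, 5, 0.8                      # hypothetical panel size and true slope

v = rng.normal(size=n)                        # fixed heterogeneity v_i
x = rng.normal(size=(n, T)) + v[:, None]      # X_it correlated with v_i
y = 1.0 + beta1 * x + v[:, None] + rng.normal(size=(n, T))

# 1) Construct deviations from the individual-specific means
y_fe = y - y.mean(axis=1, keepdims=True)
x_fe = x - x.mean(axis=1, keepdims=True)

# 2) Regress y_fe on x_fe (no constant: the demeaned data have mean zero)
fe_slope = (x_fe * y_fe).sum() / (x_fe ** 2).sum()

# Estimated standard error with the degrees-of-freedom adjustment:
# n individual means were estimated, so divide by nT - n - k
k = 1
resid = y_fe - fe_slope * x_fe
s2 = (resid ** 2).sum() / (n * T - n - k)
fe_ese = np.sqrt(s2 / (x_fe ** 2).sum())

# Check against LSDV: one dummy per individual plus X_it, no constant
ids = np.repeat(np.arange(n), T)
D = (ids[:, None] == np.arange(n)[None, :]).astype(float)
coef, *_ = np.linalg.lstsq(np.column_stack([D, x.ravel()]), y.ravel(), rcond=None)
lsdv_slope = coef[-1]
print(fe_slope, lsdv_slope, fe_ese)
```

The two slopes agree up to floating-point error, while the demeaned regression never builds the n dummy columns.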

Fixed Effects (however estimated) discards all variation between individuals. Fixed Effects uses only variation over time within an individual. FE is sometimes called the "within" estimator.

Fixed Effects discards a great deal of variation in the explanators (all variation between individuals) and uses n degrees of freedom. Fixed Effects is not efficient if $E(X_{it} v_i) = 0$. Could we use OLS?

Checking Understanding

$$Y_{it} = \beta_0 + \beta_1 X_{it} + v_i + \varepsilon_{it}$$

$$E(\varepsilon_{it}) = 0, \qquad Var(\varepsilon_{it}) = \sigma_\varepsilon^2, \qquad E(\varepsilon_{it}\varepsilon_{i't'}) = 0 \text{ if } i \neq i' \text{ OR } t \neq t'$$

$$E(v_i) = 0, \qquad Var(v_i) = \sigma_v^2, \qquad E(v_i v_{i'}) = 0 \text{ for } i \neq i'$$

$$E(X_{it} v_i) = 0 \text{ for all } i, t, \qquad E(v_i \varepsilon_{i't}) = 0 \text{ for all } i, i', t, \qquad E(X_{it}\varepsilon_{it}) = 0 \text{ for all } i, t$$

Question: is OLS consistent and efficient?

Because X is uncorrelated with both $v$ and $\varepsilon$, OLS is consistent in the uncorrelated version of the error components DGP. The error terms are homoskedastic:

$$Var(v_i + \varepsilon_{it}) = \sigma_v^2 + \sigma_\varepsilon^2$$

However, the covariance between disturbances for a given individual is not zero. For $t \neq t'$:

$$Cov(v_i + \varepsilon_{it},\, v_i + \varepsilon_{it'}) = E(v_i^2) + E(v_i\varepsilon_{it}) + E(v_i\varepsilon_{it'}) + E(\varepsilon_{it}\varepsilon_{it'}) = \sigma_v^2$$

$$Corr(v_i + \varepsilon_{it},\, v_i + \varepsilon_{it'}) = \frac{\sigma_v^2}{\sigma_v^2 + \sigma_\varepsilon^2}$$

In the presence of serial correlation, OLS is inefficient.

Random Effects

When unobserved heterogeneity is uncorrelated with the explanators, panel data techniques are not needed to produce a consistent estimator. However, we do need to correct for serial correlation between observations of the same individual.

When $E(X_{it} v_i) \neq 0$, panel data provides a valuable tool for eliminating omitted variables bias. We use Fixed Effects to gain the benefits of panel data. When $E(X_{it} v_i) = 0$, panel data does not offer special benefits. We use Random Effects to overcome the serial correlation of panel data.

The key idea of random effects:

• Estimate $\sigma_v^2$ and $\sigma_\varepsilon^2$.
• Use these estimates to construct efficient weights for the panel data observations (GLS).

The steps are:

1) Estimate the regression using Fixed Effects.
2) Construct the Fixed Effects residuals, $u_{it}$.
3) Estimate $\sigma_\varepsilon^2$ with $s_\varepsilon^2$, computed from the Fixed Effects residuals.
4) Estimate the regression using OLS.
5) Estimate $\sigma^2$ with $s^2$ from the OLS residuals, as usual.
6) Because $\sigma^2 = \sigma_v^2 + \sigma_\varepsilon^2$, estimate $\sigma_v^2$ with $s_v^2 = s^2 - s_\varepsilon^2$.

Once we have estimates of $\sigma_v^2$ and $\sigma_\varepsilon^2$, we can re-weight the observations optimally. These calculations are complicated, but most computer packages can implement them.

Example: Production Functions

We have data from 625 firms from 16 countries for 8 years. We wish to estimate a Cobb-Douglas production function:

$$Q_i = \beta_0 L_i^{\beta_1} K_i^{\beta_2} \varepsilon_i$$

Taking logs:

$$\ln(Q_i) = \ln(\beta_0) + \beta_1 \ln(L_i) + \beta_2 \ln(K_i) + \ln(\varepsilon_i)$$

We estimate using random effects.

TABLE 16.1 Random Effects Estimation of a Cobb-Douglas Production Function for a Sample of Manufacturing Firms
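To make the random effects procedure concrete, here is a minimal sketch that applies the six steps to a simulated log Cobb-Douglas panel, assuming NumPy. It does not use the 625-firm data set and does not reproduce Table 16.1. All sizes, parameter values, and helper names (`ols`, `demean`) are hypothetical. The degrees-of-freedom divisors and the quasi-demeaning weight `theta` follow the standard balanced-panel GLS transformation, which is an assumption brought in for illustration rather than a formula stated in the lecture.

```python
import numpy as np

rng = np.random.default_rng(3)
n, T = 200, 8                                 # hypothetical: 200 firms, 8 years
b0, b_labor, b_capital = 1.0, 0.7, 0.3        # hypothetical true parameters

ln_L = rng.normal(size=(n, T))
ln_K = rng.normal(size=(n, T))
v = rng.normal(scale=0.5, size=(n, 1))        # firm effect, uncorrelated with the inputs
eps = rng.normal(scale=0.3, size=(n, T))
ln_Q = b0 + b_labor * ln_L + b_capital * ln_K + v + eps

def ols(X, y):
    """Return OLS coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, y - X @ coef

def demean(a, theta=1.0):
    """Subtract theta times the firm-specific mean (theta=1 is the FE transform)."""
    return a - theta * a.mean(axis=1, keepdims=True)

k = 2                                         # number of slope coefficients

# Steps 1-3: Fixed Effects residuals give s_eps^2
X_fe = np.column_stack([demean(ln_L).ravel(), demean(ln_K).ravel()])
_, u_fe = ols(X_fe, demean(ln_Q).ravel())
s2_eps = (u_fe ** 2).sum() / (n * T - n - k)

# Steps 4-5: pooled OLS residuals give s^2, an estimate of sigma_v^2 + sigma_eps^2
X_ols = np.column_stack([np.ones(n * T), ln_L.ravel(), ln_K.ravel()])
_, u_ols = ols(X_ols, ln_Q.ravel())
s2 = (u_ols ** 2).sum() / (n * T - k - 1)

# Step 6: s_v^2 = s^2 - s_eps^2 (floored at zero in case of sampling noise)
s2_v = max(s2 - s2_eps, 0.0)

# Re-weighting: quasi-demean with theta, then run OLS on the transformed data
theta = 1.0 - np.sqrt(s2_eps / (s2_eps + T * s2_v))
X_re = np.column_stack([
    np.full(n * T, 1.0 - theta),
    demean(ln_L, theta).ravel(),
    demean(ln_K, theta).ravel(),
])
re_coef, _ = ols(X_re, demean(ln_Q, theta).ravel())
print(re_coef)
```

With `theta` equal to 0 the transformation reduces to pooled OLS, and with `theta` equal to 1 it reduces to Fixed Effects, so random effects sits between the two and keeps part of the between-firm variation.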

The estimated coefficients of 0.30 for capital and 0.69 for labor are similar to estimates using US data. We also get similar results using fixed effects estimation.

TABLE 16.2 Fixed Effects Estimation of a Cobb-Douglas Production Function for a Sample of Manufacturing Firms

We arrive at similar estimates using either random effects or fixed effects. Because only fixed effects controls for unobserved heterogeneity that is correlated with the explanators, the similarity between the two estimates suggests that unobserved heterogeneity is not creating a large bias in this sample.

The fixed effects estimator discards all variation between firms, and must use 624 more degrees of freedom than random effects. Moving from RE to FE increases the e.s.e. on capital from 0.0116 to 0.0145. The e.s.e. on labor moves from 0.0118 to 0.0132.

The RE estimator provides more precise estimates, so we would prefer to use RE instead of FE. However, RE might be inconsistent if $E(X_{it} v_i) \neq 0$. We need a test to help determine whether it is safe to use RE.

Fixed Effects Models versus Random Effects Models

Fixed effects model

• Advantages: there is no need to assume in advance that the fixed effects are uncorrelated with X; the estimated coefficients are consistent; the fixed effects may contain useful information.
• Disadvantage: the loss of degrees of freedom is large, so the model is not well suited to cases with many observed units.

Random effects model

• Advantage: the loss of degrees of freedom is small, so the model is suitable for cases with many observed units.
• Disadvantage: the assumption that the individual effects are uncorrelated with X may not hold. If it fails, important explanatory variables are effectively omitted, and the unbiasedness and consistency of the estimated coefficients cannot be guaranteed.

The Hausman Test

Hausman's specification test for error components DGPs provides guidance on whether $E(X_{it} v_i) = 0$.

The key idea: if $E(X_{it} v_i) \neq 0$, then the inconsistent RE estimator and the consistent FE estimator converge to different estimates.

If $E(X_{it} v_i) = 0$, then the unobserved heterogeneity is uncorrelated with X and does not create a bias. RE and FE are both consistent. For two consistent estimators to provide significantly different estimates
