Multiple Regression Analysis: Estimation (1)

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Chapter Outline
- Motivation for multiple regression
- Mechanics and interpretation of ordinary least squares
- The expected values of the OLS estimators
- The variance of the OLS estimators
- Efficiency of OLS: the Gauss-Markov theorem

Lecture Outline
- Motivation for multivariate analysis
- The model
- The estimation
- Properties of the OLS estimates
- The "partialling out" interpretation
- Simple versus multiple regression
- Goodness of fit

Motivation: Advantages
The primary drawback of simple regression analysis for empirical work is that it is very difficult to draw ceteris paribus conclusions about how x affects y. Whether the estimated ceteris paribus effect is reliable depends entirely on whether the zero conditional mean assumption is realistic: only if the other factors affecting y are uncorrelated with x does changing x leave u unchanged, so that the effect of x on y can be identified.

Multiple regression analysis is more amenable to ceteris paribus analysis because it allows us to explicitly control for many other factors that simultaneously affect the dependent variable. Multiple regression models can accommodate many explanatory variables, and those variables may be correlated with one another. This is important for drawing inferences about causal relations between y and the explanatory variables when using non-experimental data.

Multiple regression can also explain more of the variation in the dependent variable, and it can incorporate more general functional forms. The multiple regression model is the most widely used vehicle for empirical analysis.
Motivation: An Example
Consider a simple version of the wage equation for obtaining the effect of education on hourly wage:

wage = β0 + β1educ + β2exper + u

where exper is years of labor market experience. In this example, experience is explicitly taken out of the error term.

Motivation: Another Example
Consider a model in which family consumption is a quadratic function of family income:

cons = β0 + β1inc + β2inc² + u

Now the marginal propensity to consume is approximated by

MPC ≈ β1 + 2β2inc
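As a quick numeric sketch of the quadratic consumption model, the MPC can be evaluated directly from the derivative. The coefficient values below are purely illustrative assumptions, not estimates from any data set:

```python
# Hypothetical coefficients for cons = b0 + b1*inc + b2*inc^2 + u.
# These values are made up for illustration only.
b1, b2 = 0.8, -0.0005

def mpc(inc):
    # d(cons)/d(inc) = b1 + 2*b2*inc; with b2 < 0 the MPC falls as income rises.
    return b1 + 2 * b2 * inc

print(mpc(100))  # MPC at inc = 100
print(mpc(500))  # MPC at inc = 500 (smaller, since b2 < 0)
```

Note that, unlike in the simple linear model, the marginal effect of income here depends on the income level at which it is evaluated.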
The Model with k Independent Variables
The general multiple linear regression model can be written as

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Parallels with Simple Regression
- β0 is still the intercept.
- β1 through βk are all called slope parameters.
- u is still the error term (or disturbance).
- We still need a zero conditional mean assumption, which now reads E(u | x1, x2, ..., xk) = 0.
- We still minimize the sum of squared residuals, which now yields k+1 first order conditions.
Obtaining the OLS Estimates
The method of ordinary least squares chooses the estimates β̂0, β̂1, ..., β̂k to minimize the sum of squared residuals,

Σi (yi − β̂0 − β̂1xi1 − ... − β̂kxik)².

The k+1 first order conditions are

Σi (yi − β̂0 − β̂1xi1 − ... − β̂kxik) = 0
Σi xi1(yi − β̂0 − β̂1xi1 − ... − β̂kxik) = 0
...
Σi xik(yi − β̂0 − β̂1xi1 − ... − β̂kxik) = 0

These first order conditions are also the sample counterparts of the corresponding population moment conditions. After estimation we obtain the OLS regression line, or sample regression function (SRF):

ŷ = β̂0 + β̂1x1 + ... + β̂kxk
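The minimization above can be sketched in a few lines of Python (standard library only; the data are made up for illustration). The k+1 first order conditions are exactly the normal equations (X'X)b = X'y, solved here by Gauss-Jordan elimination:

```python
def ols(X, y):
    # X: list of rows [1, x_i1, ..., x_ik]; y: list of outcomes.
    # The k+1 first order conditions are the normal equations (X'X) b = X'y.
    n, k1 = len(X), len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k1)]
           for a in range(k1)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k1)]
    M = [XtX[a][:] + [Xty[a]] for a in range(k1)]   # augmented matrix
    for c in range(k1):                             # Gauss-Jordan with pivoting
        p = max(range(c, k1), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k1):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[a][k1] / M[a][a] for a in range(k1)]

# Made-up data built exactly as y = 1 + 2*x1 + 3*x2, so OLS must recover them.
X = [[1, 0, 0], [1, 1, 1], [1, 2, 0], [1, 3, 2]]
y = [1, 6, 5, 13]
print(ols(X, y))  # ≈ [1.0, 2.0, 3.0]
```

Because the outcomes here are constructed with no noise, the fitted coefficients reproduce the construction exactly (up to floating-point error).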
Interpreting Multiple Regression
From the SRF ŷ = β̂0 + β̂1x1 + β̂2x2 + ... + β̂kxk, we have

Δŷ = β̂1Δx1 + β̂2Δx2 + ... + β̂kΔxk,

so holding x2, ..., xk fixed implies

Δŷ = β̂1Δx1.

That is, each β̂j has a partial-effect, or ceteris paribus, interpretation.
Example: Determinants of College GPA
A two-independent-variable regression, where colGPA is college grade point average, hsGPA is high school GPA, ACT is an achievement test score, and pcolGPA denotes the predicted value of colGPA:

pcolGPA = 1.29 + 0.453 hsGPA + 0.0094 ACT

A one-independent-variable regression:

pcolGPA = 2.4 + 0.0271 ACT

The coefficient on ACT is almost three times as large here. If these two regressions were both true, they could be considered the results of two different experiments.
Holding Other Factors Fixed
The power of multiple regression analysis is that it allows us to do in non-experimental environments what natural scientists are able to do in a controlled laboratory setting: keep other factors fixed.
Properties of OLS
- The sample average of the residuals is zero.
- The sample covariance between each independent variable and the OLS residuals is zero.
- The point (x̄1, x̄2, ..., x̄k, ȳ) is always on the OLS regression line.
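These properties follow directly from the first order conditions, and can be checked numerically. The sketch below uses made-up data and a single regressor for brevity (the same three properties hold with k regressors):

```python
# Made-up data for a one-regressor illustration of the three OLS properties.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.5, 5.0, 8.0]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
     sum((xi - xbar) ** 2 for xi in x)
b0 = ybar - b1 * xbar
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(sum(resid))                                  # ≈ 0: residuals average to zero
print(sum(xi * ri for xi, ri in zip(x, resid)))    # ≈ 0: x uncorrelated with residuals
print(abs(b0 + b1 * xbar - ybar) < 1e-12)          # (x̄, ȳ) lies on the fitted line
```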
A "Partialling Out" Interpretation
Consider the regression line ŷ = β̂0 + β̂1x1 + β̂2x2. One way to express β̂1 is

β̂1 = Σi r̂i1 yi / Σi r̂i1²,

where r̂i1 is obtained in the following way. Regress the first independent variable x1 on the second independent variable x2,

x̂1 = δ̂0 + δ̂1x2,

and take the residuals r̂i1 = xi1 − x̂i1. In other words, r̂i1 is the residual from this auxiliary regression. Then a simple regression of y on r̂1 yields β̂1.

"Partialling Out" (continued)
The previous equation implies that regressing y on x1 and x2 gives the same coefficient on x1 as regressing y on the residuals from a regression of x1 on x2. This means that only the part of x1 that is uncorrelated with x2 is being related to y, so we are estimating the effect of x1 on y after x2 has been "partialled out".

In the general model with k explanatory variables, β̂1 can still be written as β̂1 = Σi r̂i1 yi / Σi r̂i1², but now the residual r̂i1 comes from the regression of x1 on x2, ..., xk. Thus β̂1 measures the effect of x1 on y after x2, ..., xk have been partialled out.
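The two-step recipe above can be sketched in Python (made-up data; the dependent variable is built exactly as y = 1 + 2x1 + 3x2, so the partialled-out slope on x1 must come back as 2):

```python
def simple_ols(x, y):
    # Intercept and slope of a simple regression of y on x.
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
    return ybar - b1 * xbar, b1

x1 = [0.0, 1.0, 2.0, 3.0]
x2 = [0.0, 1.0, 0.0, 2.0]
y  = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]  # exact, noiseless construction

# Step 1: regress x1 on x2 and keep the residuals r1.
d0, d1 = simple_ols(x2, x1)
r1 = [a - (d0 + d1 * b) for a, b in zip(x1, x2)]

# Step 2: a simple regression of y on r1 recovers the multiple-regression
# coefficient on x1 -- only the part of x1 uncorrelated with x2 matters.
_, beta1_hat = simple_ols(r1, y)
print(beta1_hat)  # ≈ 2.0
```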
Simple vs Multiple Regression Estimates
Compare the simple regression ỹ = β̃0 + β̃1x1 with the multiple regression ŷ = β̂0 + β̂1x1 + β̂2x2. Generally β̃1 ≠ β̂1, unless either

- β̂2 = 0 (i.e., x2 has no partial effect on y), or
- x1 and x2 are uncorrelated in the sample.

This is because there exists a simple relationship between the two estimates:

β̃1 = β̂1 + β̂2δ̃1,

where δ̃1 is the slope coefficient from the simple regression of x2 on x1. The proof is as follows. Because yi = β̂0 + β̂1xi1 + β̂2xi2 + ûi, we have yi − ȳ = β̂1(xi1 − x̄1) + β̂2(xi2 − x̄2) + ûi, and therefore

β̃1 = Σi(xi1 − x̄1)(yi − ȳ) / Σi(xi1 − x̄1)²
    = [β̂1Σi(xi1 − x̄1)² + β̂2Σi(xi1 − x̄1)(xi2 − x̄2)] / Σi(xi1 − x̄1)²
    = β̂1 + β̂2δ̃1,

where the residual term drops out because Σi(xi1 − x̄1)ûi = 0.

More generally, let β̂j, j = 0, 1, ..., k, be the OLS estimators from the regression using the full set of explanatory variables; let β̃j, j = 0, 1, ..., k−1, be the OLS estimators from the regression that leaves out xk; and let δ̃j be the slope coefficient on xj in the regression of xk on x1, ..., xk−1. Then

β̃j = β̂j + β̂kδ̃j.
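The identity relating the short and long regressions can be verified numerically (made-up data with two regressors; the multiple-regression slopes are obtained by partialling out, as in the earlier section):

```python
def simple_ols(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
         sum((a - xbar) ** 2 for a in x)
    return ybar - b1 * xbar, b1

def partial_slope(xj, xother, y):
    # Multiple-regression slope on xj via partialling xother out of xj.
    d0, d1 = simple_ols(xother, xj)
    r = [a - (d0 + d1 * b) for a, b in zip(xj, xother)]
    return simple_ols(r, y)[1]

x1 = [0.0, 1.0, 2.0, 3.0]
x2 = [0.0, 1.0, 0.0, 2.0]
y  = [1.0, 7.0, 6.0, 12.0]            # arbitrary made-up outcomes

beta1_hat = partial_slope(x1, x2, y)  # long-regression slope on x1
beta2_hat = partial_slope(x2, x1, y)  # long-regression slope on x2
_, beta1_tilde  = simple_ols(x1, y)   # short-regression slope on x1
_, delta1_tilde = simple_ols(x1, x2)  # slope of x2 on x1

# The identity holds for any data set, not just in expectation:
print(beta1_tilde, beta1_hat + beta2_hat * delta1_tilde)
```

The two printed numbers agree because the relationship β̃1 = β̂1 + β̂2δ̃1 is an algebraic identity of OLS, valid in every sample.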
Simple vs Multiple Regression Estimates (continued)
In the case with k independent variables, the simple regression of y on x1 and the multiple regression produce an identical estimate of the coefficient on x1 only if
(1) the OLS coefficients on x2 through xk are all zero, or
(2) x1 is uncorrelated with each of x2, ..., xk.
Summary
In this lecture we introduced multiple regression. Important concepts:
- Interpreting the meaning of the OLS estimates in multiple regression
- The partialling-out (ceteris paribus) effect
- Properties of OLS
- When the estimates from simple and multiple regression are identical
Multiple Regression Analysis: Estimation (2)

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Lecture Outline
- The MLR.1–MLR.4 assumptions
- The unbiasedness of the OLS estimates
- Over- or under-specification of models
- Omitted variable bias
- Sampling variance of the OLS slope estimates

The Expected Values of the OLS Estimators
We now turn to the statistical properties of OLS as an estimator of the parameters of an underlying population model. Statistical properties are properties of an estimator under repeated random sampling; they say nothing about how the estimator performs in any one specific sample.
Assumption MLR.1 (Linear in Parameters)
In the population model (the true model), the dependent variable y is related to the independent variables and the error u as

y = β0 + β1x1 + β2x2 + ... + βkxk + u,

where β1, ..., βk are the unknown parameters of interest and u is an unobservable random error, or random disturbance, term.

Assumption MLR.2 (Random Sampling)
We can use a random sample of size n from the population, {(xi1, xi2, ..., xik; yi): i = 1, ..., n}, where i indexes the observation and j = 1, ..., k indexes the j-th regressor. Sometimes we write the model as

yi = β0 + β1xi1 + β2xi2 + ... + βkxik + ui.

Assumption MLR.3 (Zero Conditional Mean)
E(u | xi1, xi2, ..., xik) = 0.
When this assumption holds, we say all of the explanatory variables are exogenous; when it fails, we say the explanatory variables are endogenous. We will pay particular attention to cases in which Assumption MLR.3 fails because of omitted variables.

Assumption MLR.4 (No Perfect Collinearity)
In the sample, none of the independent variables is constant, and there are no exact linear relationships among the independent variables. When one regressor is an exact linear combination of the other regressors, we say the model suffers from perfect collinearity.

Examples of perfect collinearity:
- y = β0 + β1x1 + β2x2 + β3x3 + u, with x2 = 3x3
- y = β0 + β1log(inc) + β2log(inc²) + u
- y = β0 + β1x1 + β2x2 + β3x3 + β4x4 + u, with x1 + x2 + x3 + x4 = 1

Perfect collinearity also arises when n < k + 1. Under perfect collinearity the denominator of the OLS estimator is zero, so the OLS estimates cannot be computed; you can check this in the formula for β̂2 in the section discussing the partialling-out interpretation.
Theorem 3.1 (Unbiasedness of OLS)
Under Assumptions MLR.1 through MLR.4, the OLS estimators are unbiased estimators of the population parameters; that is,

E(β̂j) = βj, j = 0, 1, ..., k.

Unbiasedness is a property of the estimator — the procedure that produces an estimate from any given sample — not a property of a particular estimate. It is what we use to evaluate the quality of the method. It is not correct to say "5 percent is an unbiased estimate of the return to education."
Too Many or Too Few Variables?
What happens if we include variables in our specification that do not belong? A model is overspecified when one or more of the independent variables is included in the model even though it has no partial effect on y in the population. Overspecification has no effect on the unbiasedness of the parameter estimates — OLS remains unbiased — but it can have undesirable effects on the variances of the OLS estimators.
What if we exclude a variable from our specification that does belong? If a variable that actually belongs in the true model is omitted, we say the model is underspecified, and OLS will usually be biased. Deriving the bias caused by omitting an important variable is an example of misspecification analysis.
Omitted Variable Bias
Suppose the true model is

y = β0 + β1x1 + β2x2 + u,

but we estimate the underspecified model y = β0 + β1x1 + u. Then the estimated slope is

β̃1 = Σi(xi1 − x̄1)yi / Σi(xi1 − x̄1)².

Recalling the true model, yi = β0 + β1xi1 + β2xi2 + ui, so the numerator becomes

β1Σi(xi1 − x̄1)² + β2Σi(xi1 − x̄1)xi2 + Σi(xi1 − x̄1)ui.

Since E(ui) = 0, taking expectations gives

E(β̃1) = β1 + β2 · Σi(xi1 − x̄1)xi2 / Σi(xi1 − x̄1)².

Now consider the regression of x2 on x1, x̃2 = δ̃0 + δ̃1x1, whose slope is δ̃1 = Σi(xi1 − x̄1)xi2 / Σi(xi1 − x̄1)². Therefore

E(β̃1) = β1 + β2δ̃1.

Omitted Variable Bias: Summary
There are two cases in which the bias is zero:
- β2 = 0, that is, x2 does not really belong in the model;
- x1 and x2 are uncorrelated in the sample.

If the correlation between x2 and x1 and the correlation between x2 and y have the same sign, the bias is positive; if they have opposite signs, the bias is negative.
When E(β̃1) > β1, we say that β̃1 has upward bias; when E(β̃1) < β1, we say that β̃1 has downward bias.

Summary of the Direction of Bias

           | Corr(x1, x2) > 0 | Corr(x1, x2) < 0
    β2 > 0 | positive bias    | negative bias
    β2 < 0 | negative bias    | positive bias

In general β2 is unknown, and when a variable is omitted it is usually because that variable is unobserved — in other words, we do not know the sign of Corr(x1, x2) either. What to do? We rely on economic theory and intuition to make an educated guess about the sign.

Example: An Hourly Wage Equation
Suppose the model log(wage) = β0 + β1educ + β2abil + u is estimated with abil omitted. What is the direction of the bias in β̃1? Since ability generally has a positive partial effect on the wage, and ability and years of education are positively correlated, we expect β̃1 to have an upward bias.
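The bias formula can be illustrated deterministically (all numbers below — the educ and abil values and the coefficients — are made up). Because the outcome is built with no noise, the short-regression slope equals β1 + β2δ̃1 exactly:

```python
def simple_slope(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    return sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
           sum((a - xbar) ** 2 for a in x)

educ = [8.0, 10.0, 12.0, 16.0]
abil = [1.0, 2.0, 3.0, 5.0]          # positively related to educ
b1, b2 = 0.08, 0.05                   # hypothetical true coefficients, b2 > 0
logwage = [0.5 + b1 * e + b2 * a for e, a in zip(educ, abil)]

short  = simple_slope(educ, logwage)  # slope when abil is omitted
delta1 = simple_slope(educ, abil)     # slope of abil on educ
print(short, b1 + b2 * delta1)        # equal: short slope = b1 + b2*delta1
print(short > b1)                     # upward bias, as the text predicts
```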
The More General Case
Technically, it is more difficult to derive the sign of the omitted variable bias when there are multiple regressors. But remember that if an omitted variable has a partial effect on y and is correlated with at least one of the included regressors, then the OLS estimators of all coefficients will generally be biased.

For example, suppose the true model is y = β0 + β1x1 + β2x2 + β3x3 + u, and x3 is omitted, with corr(x1, x3) ≠ 0 but corr(x2, x3) = 0. It is not difficult to believe that β̃1 is a biased estimator of β1 — but will β̃2 be unbiased? When corr(x1, x2) ≠ 0, β̃2 is in general also biased, even though x2 itself is uncorrelated with the omitted x3.
of
the
OLS
EstimatorsOLS估计量的方差Now
we
know
that
the
sampling
distribution
of
ourestimate
iscentered
around
the
true
parameter。现在
知道估计值的样本分布是以真实参数为中心的。Want
to
think
about
how
spreadout
this
distribution
is还想知道这一分布的分散状况。
Much
easier
to
think
about
this
variance
underanadditional
assumption,
so在一个新增假设下,度量这个方差就容易多了,有:53Assumption
MLR.5
(Homoskedasticity)假定MLR.5(同方差性)Assume
Homoskedasticity:同方差性假定:Var(u|x1,
x2,…,
xk)
=
2
.Means
that
the
variancein
the
errorterm,
u,conditionalontheexplanatorcombinations
ofbles,
is
the
same
for
alles
of
explanatory
variables.意思是,不管解释变量出现怎样的组合,误差项u的条件方差都是一样的。If
the
assumption
fails,
we
say
the
model
exhibitsheteroskedasticity.如果这个假定不成立, 说模型存在异方差性。54Variance
of
OLS
(cont)OLS估计量的方差(续)Let
x
standfor(x1,x2,…xk)
用x表示(x1,x2,…xk)
Assuming
that
Var(u|x)=2
also
implies
thatVar(y|
x)=2
假定Var(u|x)=2,也就意味着
Var(y|
x)=2
Assumption
MLR.1-5
are
collectively
known
asthe
Gauss-Markov
assumptions.假定MLR.1-5共同被称为
-
假定55
22ˆjj2jjjjij
jand
R2
is
the
R2from
regressing
xj
on
all
other
x's
2Var
,
whereSST
1
RSSTj
xij
xTheorem
3.2
(Sampling
Variances
of
the
OLS
SlopeEstimators)定理3.2(OLS斜率估计量的抽样方差)Given
the
Gauss-Markov
Assumptions给定
-
假定其中,SST
x
x
,R2是x
向所有其它x回归所得到的R2j
j56Interpreting
Theorem
3.2对定理3.2的解释57
Theorem
3.2
shows
that
the
variances
of
theestimated
slope
coefficients
are
influenced
by
threefactors:定理3.2显示:估计斜率系数的方差受到三个因素的影响:The
error
variance误差项的方差The
total
sample
variation总的样本变异Linear
relationships
among
the
independent
variables解释变量之间的线性相关关系Interpreting
Theorem
3.2:
The
Error
Variance对定理3.2的解释(1):误差项方差A
larger
2
implies
a
larger
variance
forthe
OLSestimators.更大的2意味着更大的OLS估计量方差。A
larger
2
means
more
noises
in
the
equation.更大的2意味着方程中的“噪音”越多。This
makes
it
more
difficult
toextract
theexact
partial
effectof
the
regressor
on
the
regressand.
这使得得到自变量对因变量的准确局部效应变得更加。Introducing
more
regressors
can
reduce
the
variance.
Butoften
this
is
not
possible,
neither
is
it
desirable.
引入 的解释变量可以减小方差。但这样做不仅不一定可能,而且也不一定总令人满意。2
does
not
depends
on
sample
size.
2
不依赖于样本大小58Interpreting
Theorem
3.2:
The
total
sample
variation对定理3.2的解释(2):总的样本变异A
larger
SSTj
implies
a
smaller
variance
for
the
estimators,andvice
versa.更大的SSTj意味着更小的估计量方差,反之亦然。Everything
else
being
equal,more
sample
variation
in
x
is
always
preferred.其它条件不变情况下,x的样本方差越大越好。One
way
to
gain
more
sample
variation
is
to
increase
thesample
size.
增加样本方差的
法是增加样本容量。This
components
of
parameter
variance
depends
on
thesample
size.
参数方差的这一组成部分依赖于样本容量。59Interpreting
Theorem
3.2:
multicollinearity对定理3.1的解释(3):多重共线性60A
larger
R
2
implies
a
larger
variance
for
the
estimatorsjj更大的R
2意味着更大的估计量方差。jA
large
R
2
meansother
regressors
can
explain
much
of
the2variationsin
xj.如果Rj
较大,就说明其它解释变量解释可以解释较大部分的该变量。j
jWhen
R
2
is
very
close
to
1,
x
is
highly
correlated
with
other2regressors,
this
is
called
multicollinearity.
当Rj
非常接近1时,xj与其它解释变量高度相关,被称为多重共线性。Severe
multicollinearity
means
the
variance
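The Theorem 3.2 formula can be sketched numerically (made-up data and an assumed σ² = 1). The factor 1/(1 − Rj²) shows how collinearity inflates Var(β̂j):

```python
def r_squared(x, y):
    # R^2 from a simple regression of y on x.
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / \
         sum((a - xbar) ** 2 for a in x)
    b0 = ybar - b1 * xbar
    ssr = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    sst = sum((b - ybar) ** 2 for b in y)
    return 1 - ssr / sst

sigma2 = 1.0                            # assumed error variance
x2 = [1.0, 2.0, 3.0, 4.0, 5.0]          # the other regressor
x1_low  = [2.0, 1.0, 4.0, 3.0, 5.0]     # moderately related to x2
x1_high = [1.1, 2.0, 2.9, 4.2, 5.0]     # nearly collinear with x2

for x1 in (x1_low, x1_high):
    m = sum(x1) / len(x1)
    sst1 = sum((a - m) ** 2 for a in x1)
    r2 = r_squared(x2, x1)              # R^2 of x1 on the other regressor
    var_beta1 = sigma2 / (sst1 * (1 - r2))
    print(round(r2, 3), round(var_beta1, 4))
# The nearly collinear case has R^2 close to 1 and a much larger variance.
```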