Intermediate Econometrics, Lecture 3

Multiple Regression Analysis: Estimation (1)

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Chapter Outline

- Motivation for Multiple Regression
- Mechanics and Interpretation of Ordinary Least Squares
- The Expected Values of the OLS Estimators
- The Variance of the OLS Estimators
- Efficiency of OLS: The Gauss-Markov Theorem

Lecture Outline

- Motivation for multivariate analysis
- The Model
- The Estimation
- Properties of the OLS estimates
- The Partialling Out Interpretation
- Simple versus multiple regressions
- Goodness of Fit

Motivation: Advantage

The primary drawback of simple regression analysis for empirical work is that it is very difficult to draw ceteris paribus conclusions about how x affects y.

Whether the estimated ceteris paribus effects are reliable depends entirely on whether the zero conditional mean assumption is realistic.

If the other factors that affect y are uncorrelated with x, then changing x ensures that u does not change, and the effect of x on y can be identified.

Motivation: Advantage

Multiple regression analysis is more amenable to ceteris paribus analysis because it allows us to explicitly control for many other factors that simultaneously affect the dependent variable.

Multiple regression models can accommodate many explanatory variables, and those variables may be correlated with one another.

This is important for drawing inferences about causal relations between y and the explanatory variables when using non-experimental data.

non-experimentaldata.在使用非实验数据时,多元回归模型对推断y与解释变量间的因果关系很重要。5Motivation

:

Advantage动因:优点It

can

explain

more

of

the

variation

in

thedependent

variable.它可以解释

It can incorporate more general functional forms.

The multiple regression model is the most widely used vehicle for empirical analysis.

Motivation: An Example

Consider a simple version of a wage equation for obtaining the effect of education on the hourly wage, where exper denotes years of labor market experience:

wage = β0 + β1educ + β2exper + u

In this example, experience is explicitly taken out of the error term.

Motivation: An Example

Consider a model in which family consumption is a quadratic function of family income:

cons = β0 + β1·inc + β2·inc² + u

Now the marginal propensity to consume is approximated by

MPC ≈ β1 + 2β2·inc
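As a quick numeric illustration of the approximation MPC ≈ β1 + 2β2·inc, here is a minimal sketch; the coefficient values are made up for illustration, not estimates from any data set:

```python
# Hypothetical coefficients for cons = b0 + b1*inc + b2*inc^2 + u
# (made-up numbers, chosen only to illustrate the formula)
b1, b2 = 0.60, -0.001

def mpc(inc):
    # Derivative of the fitted quadratic with respect to inc
    return b1 + 2 * b2 * inc

print(mpc(50.0))  # MPC at inc = 50: 0.60 - 0.10 = 0.50
```

Note that, unlike in the simple linear model, the marginal propensity to consume now depends on the income level at which it is evaluated.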

The Model with k Independent Variables

The general multiple linear regression model can be written as

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Parallels with Simple Regression

- β0 is still the intercept.
- β1 through βk are all called slope parameters.
- u is still the error term (or disturbance).
- We still need a zero conditional mean assumption, so now assume E(u | x1, x2, ..., xk) = 0.
- We still minimize the sum of squared residuals, which now yields k+1 first order conditions.

Obtaining the OLS Estimates

The method of ordinary least squares chooses the estimates β̂0, β̂1, ..., β̂k to minimize the sum of squared residuals

Σi (yi − β̂0 − β̂1xi1 − ... − β̂kxik)²

Obtaining the OLS Estimates (continued)

The k+1 first order conditions are

Σi (yi − β̂0 − β̂1xi1 − ... − β̂kxik) = 0
Σi xi1 (yi − β̂0 − β̂1xi1 − ... − β̂kxik) = 0
...
Σi xik (yi − β̂0 − β̂1xi1 − ... − β̂kxik) = 0
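The first order conditions above are equivalent to the normal equations X'Xβ̂ = X'y. A minimal numpy sketch with simulated data (all numbers are made-up simulation settings) that solves them and confirms each condition holds at the solution:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 3
# Design matrix: intercept column plus k simulated regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.5, -2.0, 3.0]) + rng.normal(size=n)

# Solve the k+1 normal equations X'X b = X'y implied by the first order conditions
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

resid = y - X @ beta_hat
# Each first order condition: the residuals sum to zero, both raw and
# weighted by each regressor
print(X.T @ resid)  # all entries ~ 0 up to floating-point error
```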

Obtaining the OLS Estimates (continued)

The first order conditions are also the sample counterparts of the corresponding population moment conditions.

After estimation we obtain the OLS regression line, or sample regression function (SRF):

ŷ = β̂0 + β̂1x1 + ... + β̂kxk

Interpreting Multiple Regression

Since ŷ = β̂0 + β̂1x1 + β̂2x2 + ... + β̂kxk,

Δŷ = β̂1Δx1 + β̂2Δx2 + ... + β̂kΔxk,

so holding x2, ..., xk fixed implies Δŷ = β̂1Δx1. That is, each β̂j has a partial-effect, or ceteris paribus, interpretation.

Example: Determinants of College GPA

A regression with two independent variables, where pcolGPA is the predicted college grade point average, hsGPA is high school GPA, and ACT is the achievement test score:

pcolGPA = 1.29 + 0.453 hsGPA + 0.0094 ACT

Example: Determinants of College GPA (continued)

A regression with one independent variable:

pcolGPA = 2.4 + 0.0271 ACT

The coefficient on ACT is here about three times larger. If these two regressions were both valid, they could be considered the results of two different experiments.

differentexperiments.如果这两个回归都是对的,它们可以被认为是两个不同实验的结果。Holding

other

factors

fixed“保持其它因素不变”的含义The

power

of

multiple

regression ysis

isthat

it

allowsus

to n

non-experimentalenvironments

what

natural

scientists

are

able

ton

a

controlled

laboratory

setting:

keep

otherfactors

fixed.多元回归分析的优势在于它使能在非实验环境中去做自然科学家在受控实验中所能做的事情:保持其它因素不变。17Properties

- The sample average of the residuals is zero.
- The sample covariance between each independent variable and the OLS residuals is zero.
- The point (x̄1, x̄2, ..., x̄k, ȳ) is always on the OLS regression line.
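These algebraic properties all follow from the first order conditions and can be checked numerically. A small numpy sketch with simulated data (the coefficients and sample size are arbitrary illustration choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 2 + 0.5 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b = np.linalg.lstsq(X, y, rcond=None)[0]
u_hat = y - X @ b

print(u_hat.mean())             # property 1: zero up to rounding
print(np.cov(x1, u_hat)[0, 1])  # property 2: zero (same holds for x2)
# property 3: the fitted line passes through the point of means
print(b[0] + b[1] * x1.mean() + b[2] * x2.mean() - y.mean())
```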

A "Partialling Out" Interpretation

Consider the regression line ŷ = β̂0 + β̂1x1 + β̂2x2. One way to express β̂1 is

β̂1 = Σi r̂i1 yi / Σi r̂i1²,

where the r̂i1 are obtained in the following way:

A "Partialling Out" Interpretation (continued)

Regress the first independent variable x1 on the second independent variable x2 and save the residuals r̂1; that is, r̂i1 = xi1 − x̂i1, where x̂i1 is the fitted value from the regression of x1 on x2.

Then do a simple regression of y on r̂1 to obtain β̂1.

"Partialling Out" (continued)

The previous equation implies that regressing y on x1 and x2 gives the same coefficient on x1 as regressing y on the residuals from a regression of x1 on x2.

This means that only the part of x1 that is uncorrelated with x2 is being related to y, so we are estimating the effect of x1 on y after x2 has been "partialled out."

"Partialling Out" (continued)

In the general model with k explanatory variables, β̂1 can still be written as

β̂1 = Σi r̂i1 yi / Σi r̂i1²,

but now the residual r̂i1 comes from the regression of x1 on x2, ..., xk.

Thus β̂1 measures the effect of x1 on y after x2, ..., xk have been partialled out.
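This partialling-out equivalence (the Frisch-Waugh result) can be demonstrated numerically. A sketch with simulated data, assuming nothing beyond numpy (the data-generating coefficients are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x2, x3 = rng.normal(size=n), rng.normal(size=n)
x1 = 0.6 * x2 - 0.3 * x3 + rng.normal(size=n)  # x1 correlated with the others
y = 1 + 2.0 * x1 + 1.5 * x2 - 0.8 * x3 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2, x3])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Partial out the intercept, x2 and x3 from x1, keeping the residuals r1
Z = np.column_stack([np.ones(n), x2, x3])
r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]

# A simple regression of y on r1 reproduces the multiple-regression slope on x1
beta1_fwl = (r1 @ y) / (r1 @ r1)
print(beta_full[1], beta1_fwl)  # the two numbers agree
```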

Simple vs Multiple Regression Estimates

Compare the simple regression ỹ = β̃0 + β̃1x1 with the multiple regression ŷ = β̂0 + β̂1x1 + β̂2x2.

Generally β̃1 ≠ β̂1, unless:
- β̂2 = 0 (i.e., x2 has no partial effect on y), or
- x1 and x2 are uncorrelated in the sample.

Simple vs Multiple Regression Estimates (continued)

This is because there exists a simple relationship

β̃1 = β̂1 + β̂2δ̃1,

where δ̃1 is the slope coefficient from the simple regression of x2 on x1. The proof is as follows.

Because ŷi = β̂0 + β̂1xi1 + β̂2xi2, the simple-regression slope is

β̃1 = Σi (xi1 − x̄1)(yi − ȳ) / Σi (xi1 − x̄1)².

Substituting yi = β̂0 + β̂1xi1 + β̂2xi2 + ûi and using the fact that the OLS residuals satisfy Σi (xi1 − x̄1)ûi = 0 gives

β̃1 = β̂1 + β̂2 · [Σi (xi1 − x̄1)(xi2 − x̄2) / Σi (xi1 − x̄1)²] = β̂1 + β̂2δ̃1,

since the bracketed ratio is exactly the slope δ̃1 from regressing x2 on x1.

Simple vs Multiple Regression Estimates (continued)

Let β̂j, j = 0, 1, ..., k, be the OLS estimators from the regression using the full set of explanatory variables, and let β̃j, j = 0, 1, ..., k−1, be the OLS estimators from the regression that leaves out xk.

Let δ̃j be the slope coefficient on xj in the regression of xk on x1, ..., xk−1. Then

β̃j = β̂j + β̂kδ̃j.
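The relationship β̃j = β̂j + β̂kδ̃j is an exact algebraic identity in any sample, which a short numpy sketch can confirm (here for k = 2, comparing β̃1 with β̂1 + β̂2δ̃1; the simulated coefficients are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)       # x2 correlated with x1
y = 1 + 2.0 * x1 + 1.0 * x2 + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_full = ols(np.column_stack([ones, x1, x2]), y)   # beta-hats
b_short = ols(np.column_stack([ones, x1]), y)      # beta-tildes (x2 left out)
delta = ols(np.column_stack([ones, x1]), x2)       # regression of x2 on x1

# beta-tilde_1 equals beta-hat_1 + beta-hat_2 * delta-tilde_1, exactly
print(b_short[1], b_full[1] + b_full[2] * delta[1])
```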

Simple vs Multiple Regression Estimates (continued)

In the case with k independent variables, the simple regression of y on x1 and the multiple regression produce an identical estimate for x1 only if:
(1) the OLS coefficients on x2 through xk are all zero, or
(2) x1 is uncorrelated with each of x2, ..., xk.

Summary

In this lecture we introduced multiple regression. Important concepts:
- Interpreting the meaning of the OLS estimates in multiple regression
- The partialling-out (partial-effect, or ceteris paribus) interpretation
- The algebraic properties of OLS
- When the estimates from simple and multiple regression are identical

Multiple Regression Analysis: Estimation (2)

y = β0 + β1x1 + β2x2 + ... + βkxk + u

Chapter Outline

- Motivation for Multiple Regression
- Mechanics and Interpretation of Ordinary Least Squares
- The Expected Values of the OLS Estimators
- The Variance of the OLS Estimators
- Efficiency of OLS: The Gauss-Markov Theorem

Lecture Outline

- The MLR.1–MLR.4 assumptions
- The unbiasedness of the OLS estimates
- Over- or under-specification of models
- Omitted variable bias
- The sampling variance of the OLS slope estimates

The Expected Value of the OLS Estimators

We now turn to the statistical properties of OLS for estimating the parameters of an underlying population model.

Statistical properties are properties of estimators under repeated random sampling; we do not care about how an estimator performs in one specific sample.

Assumption MLR.1 (Linear in Parameters)

In the population model (the true model), the dependent variable y is related to the independent variables and the error u as

y = β0 + β1x1 + β2x2 + ... + βkxk + u,

where β1, β2, ..., βk are the unknown parameters of interest and u is an unobservable random error, or random disturbance, term.

Assumption MLR.2 (Random Sampling)

We can use a random sample of size n from the population, {(xi1, xi2, ..., xik; yi): i = 1, ..., n}, where i indexes the observations and j = 1, ..., k indexes the jth regressor. Sometimes we write the model as

yi = β0 + β1xi1 + β2xi2 + ... + βkxik + ui

Assumption MLR.3 (Zero Conditional Mean)

E(u | xi1, xi2, ..., xik) = 0.

When this assumption holds, we say all of the explanatory variables are exogenous; when it fails, we say the explanatory variables are endogenous.

We will pay particular attention to the case in which Assumption MLR.3 fails because of omitted variables.

Assumption MLR.4 (No Perfect Collinearity)

MLR.4

假定MLR.4MLR.4(No

perfect

collinearity)

(不存在完全共线性)

:In

the

sample,none

of

the

independent

variables

is

constant,

and

there

are

noexactlinearrelationshipsamongtheindependentvariables.在样本中,没有一个自变量是常数,自变量之间也不存在严格的线性关系。When

one

regressor

is

an

exact

linear

combination

of

the

other

regressor(s),wesaythemodelsuffersfromperfectcollinearity.当一个自变量是其它解释变量的严格线性组合时,说此模型有严格共线性。Examples

of

perfect

collinearity:完全共线性的例子:y=0+

1x1+

2x2+

3x3+u,

x2

=

3x3,y=

0+

1log(inc)+

2log(inc2

)+uy=

0+

1x1+

2x2+

3x3+

4x4

u,x1+x2

+x3+

x4

=1.Perfect

collinearity

also

happenswhen

y=0+1x1+2x2+3x3+u,n<(k+1).当y=0+1x1+2x2+3x3+u,n<(k+1)也发生完全共线性的情况。The

denominator

of

the

OLS

estimator

is

0

when

there

is

perfect

collinearity,hence

the

OLS

estimator

cannot

be

performed.You

can

check

this

by

looking

atthe

formula

of

the

estimator

for

2

in

the

session

discussing

the
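Under perfect collinearity the design matrix loses rank, so the normal equations have no unique solution. A small numpy sketch of the first example (x2 = 3x3), with made-up data:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 50
x1 = rng.normal(size=n)
x3 = rng.normal(size=n)
x2 = 3 * x3  # exact linear relationship, as in the first example above
X = np.column_stack([np.ones(n), x1, x2, x3])

# The columns are linearly dependent: rank 3 instead of 4
print(np.linalg.matrix_rank(X))

# X'X is (numerically) singular, so no unique solution to the normal
# equations exists; the coefficients on x2 and x3 are not separately identified
print(np.linalg.cond(X.T @ X) > 1e12)  # True
```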

Theorem 3.1 (Unbiasedness of OLS)

Under Assumptions MLR.1 through MLR.4, the OLS estimators are unbiased estimators of the population parameters; that is,

E(β̂j) = βj,  j = 0, 1, ..., k.

Theorem 3.1 (Unbiasedness of OLS, continued)

Unbiasedness is a property of an estimator, that is, of the procedure that produces an estimate from any given sample; it is not a property of a particular estimate. What we are evaluating is the method itself.

So it is not correct to say "5 percent is an unbiased estimate of the return to education."

Too Many or Too Few Variables

What happens if we include variables in our specification that do not belong?

A model is overspecified when one or more of the independent variables is included in the model even though it has no partial effect on y in the population.

Overspecification has no effect on the unbiasedness of our parameter estimates, but it can have undesirable effects on the variances of the OLS estimators.

Too Many or Too Few Variables (continued)

What if we exclude a variable from our specification that does belong?

If a variable that actually belongs in the true model is omitted, we say the model is underspecified, and OLS will usually be biased.

Deriving the bias caused by omitting an important variable is an example of misspecification analysis.

Omitted Variable Bias

Suppose the true model is

y = β0 + β1x1 + β2x2 + u,

but we estimate y = β0 + β1x1 + u, leaving out x2. Then the estimated slope is

β̃1 = Σi (xi1 − x̄1) yi / Σi (xi1 − x̄1)².

Omitted Variable Bias (continued)

Recalling the true model, yi = β0 + β1xi1 + β2xi2 + ui, the numerator becomes

Σi (xi1 − x̄1) yi = β1 Σi (xi1 − x̄1)² + β2 Σi (xi1 − x̄1) xi2 + Σi (xi1 − x̄1) ui

(the β0 term vanishes because Σi (xi1 − x̄1) = 0). Since E(ui) = 0, taking expectations we obtain

E(β̃1) = β1 + β2 · Σi (xi1 − x̄1) xi2 / Σi (xi1 − x̄1)².

Omitted Variable Bias (continued)

Consider the regression of x2 on x1, x̃2 = δ̃0 + δ̃1x1. Its slope is exactly δ̃1 = Σi (xi1 − x̄1) xi2 / Σi (xi1 − x̄1)², so

E(β̃1) = β1 + β2δ̃1.
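The bias formula E(β̃1) = β1 + β2δ̃1 can be illustrated with a small Monte Carlo sketch; the regressors are held fixed across replications so the expectation is conditional on them, and all numbers are made-up simulation settings:

```python
import numpy as np

rng = np.random.default_rng(4)
beta1, beta2 = 1.0, 2.0
n, reps = 200, 2000

# Fix the regressors; x2 is correlated with x1, so omitting it causes bias
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)
xc = x1 - x1.mean()
delta1 = np.sum(xc * x2) / np.sum(xc**2)  # slope of x2 on x1

est = []
for _ in range(reps):
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    est.append(np.sum(xc * y) / np.sum(xc**2))  # short-regression slope

# The average estimate is close to beta1 + beta2*delta1, not beta1
print(np.mean(est), beta1 + beta2 * delta1)
```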

Omitted Variable Bias: Summary

Two cases where the bias is zero:
- β2 = 0, that is, x2 does not really belong in the model;
- x1 and x2 are uncorrelated in the sample.

If the correlation between x2 and x1 and the correlation between x2 and y have the same sign, the bias will be positive. If they have opposite signs, the bias will be negative.

Omitted Variable Bias: Summary (continued)

When E(β̃1) > β1, we say that β̃1 has upward bias. When E(β̃1) < β1, we say that β̃1 has downward bias.

Summary of the Direction of Bias

          Corr(x1, x2) > 0    Corr(x1, x2) < 0
β2 > 0    positive bias       negative bias
β2 < 0    negative bias       positive bias

Omitted Variable Bias (continued)

In general, β2 is unknown; moreover, when a variable is omitted, it is usually because that variable is unobserved. In other words, we typically do not know the sign of Corr(x1, x2). What to do?

We rely on economic theory and intuition to make an educated guess about the sign.

Example: An Hourly Wage Equation

Suppose the model log(wage) = β0 + β1educ + β2abil + u is estimated with abil omitted. What is the direction of the bias in β̃1?

Since ability generally has a positive partial effect on the wage, and ability and years of education are positively correlated, we expect β̃1 to have an upward bias.

The More General Case

Technically, it is more difficult to derive the sign of the omitted variable bias when there are multiple regressors.

But remember: if an omitted variable has a partial effect on y and is correlated with at least one of the included regressors, then the OLS estimators of all of the coefficients will generally be biased.

The More General Case (continued)

Suppose the true model is y = β0 + β1x1 + β2x2 + β3x3 + u, but we estimate it leaving out x3, where corr(x1, x3) = 0 and corr(x2, x3) ≠ 0.

It is not difficult to believe that β̃2 is a biased estimator of β2. Will β̃1 be unbiased?

Generally not: when corr(x1, x2) ≠ 0, β̃1 is also a biased estimator of β1, even though x1 itself is uncorrelated with the omitted x3.

The Variance of the OLS Estimators

We now know that the sampling distribution of our estimators is centered around the true parameters. We also want to know how spread out this distribution is.

It is much easier to characterize this variance under an additional assumption:

Assumption MLR.5 (Homoskedasticity)

Var(u | x1, x2, ..., xk) = σ².

This means that the variance of the error term u, conditional on the explanatory variables, is the same for all combinations of values of the explanatory variables. If the assumption fails, we say the model exhibits heteroskedasticity.

The Variance of the OLS Estimators (continued)

Let x stand for (x1, x2, ..., xk). Assuming Var(u | x) = σ² also implies that Var(y | x) = σ².

Assumptions MLR.1 through MLR.5 are collectively known as the Gauss-Markov assumptions.

Theorem 3.2 (Sampling Variances of the OLS Slope Estimators)

Given the Gauss-Markov assumptions,

Var(β̂j) = σ² / [SSTj (1 − Rj²)],

where SSTj = Σi (xij − x̄j)² is the total sample variation in xj, and Rj² is the R-squared from regressing xj on all of the other x's.
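Theorem 3.2's formula agrees with the matrix form Var(β̂) = σ²(X'X)⁻¹. A numpy sketch with simulated regressors and an assumed σ² illustrates the equivalence for one slope (the data and σ² value are made up for the illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])
sigma2 = 1.7  # assumed (known) error variance, for illustration only

# Matrix form: Var(beta_hat) = sigma^2 (X'X)^{-1}; pick the x1 entry
var_matrix = sigma2 * np.linalg.inv(X.T @ X)

# Theorem 3.2 form: sigma^2 / (SST_1 * (1 - R_1^2)), with R_1^2 from
# regressing x1 on the other regressors (here: intercept and x2)
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.lstsq(Z, x1, rcond=None)[0]
sst1 = np.sum((x1 - x1.mean())**2)
R2_1 = 1 - np.sum(r1**2) / sst1

print(var_matrix[1, 1], sigma2 / (sst1 * (1 - R2_1)))  # identical
```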

Interpreting Theorem 3.2

Theorem 3.2 shows that the variances of the estimated slope coefficients are influenced by three factors:
- the error variance,
- the total sample variation,
- linear relationships among the independent variables.

Interpreting Theorem 3.2 (1): The Error Variance

A larger σ² implies a larger variance for the OLS estimators. A larger σ² means more noise in the equation, which makes it more difficult to extract the exact partial effect of a regressor on the regressand.

Introducing more regressors can reduce the error variance, but this is often not possible, nor is it always desirable.

σ² does not depend on the sample size.

Interpreting Theorem 3.2 (2): The Total Sample Variation

A larger SSTj implies a smaller variance for the estimators, and vice versa. Everything else being equal, more sample variation in xj is always preferred.

One way to gain more sample variation is to increase the sample size; this component of the parameter variance does depend on the sample size.

Interpreting Theorem 3.2 (3): Multicollinearity

A larger Rj² implies a larger variance for the estimators. A large Rj² means that the other regressors can explain much of the variation in xj.

When Rj² is very close to 1, xj is highly correlated with the other regressors; this is called multicollinearity. Severe multicollinearity means the variance of β̂j can become very large.
