SPSS Data Statistical Analysis and Practice
Chapter 15: Weighted Least Squares

Lecturer: Zhou Tao (周涛), Associate Professor, College of Resources, Beijing Normal University
Date: 2007-12-4
Course website: /4>>.ires>.cn/courses/spss

Chapter contents

1. Applications of weighted least squares
  • Deliberately changing the weights of observations as needed
  • Remedial measures for unequal error variances
2. WLS procedures provided by SPSS
  • Linear Regression procedure (with a weight variable)
  • Weight Estimation procedure
3. Comparison of the output
  • OLS versus WLS
  • The two WLS methods provided by SPSS

Application of weighted least squares (1): deliberately changing the weights of observations

Example

An experiment collected 15 pairs of data. Each pair is the mean result measured after mixing n samples, but n differs from pair to pair. Find the linear equation relating y to x.

Data source: Guo Zuchao (郭祖超), 医用数理统计方法 (Methods of Medical Mathematical Statistics), 3rd ed., p. 249.

Method 1: If the differences in the number of samples mixed are ignored, this is a very simple linear regression problem and the regression equation can be fitted directly. The results are as follows.

Model Summary (predictors: (Constant), x; dependent variable: y)

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .987   .975       .973                .11330

Coefficients (dependent variable: y; B and Std. Error are unstandardized, Beta is standardized)

Model 1      B       Std. Error   Beta    t         Sig.
(Constant)   7.454   .173                 43.143    .000
x            -.015   .001         -.987   -22.468   .000

$$\hat{y} = 7.45 - 0.015\,x \qquad (R^2 = 0.98)$$

Method 2: Each pair of measurements is the result of mixing n samples, so the more samples are mixed, the more stable the measured result, that is, the smaller its variability. Fitting the equation directly treats all measurements alike: a result from 1 sample counts the same as a result from 15 mixed samples, which is clearly unreasonable. The number of samples n can therefore be used as a weight variable in the analysis, so that observations with larger n receive more weight and have more influence on the equation. In other words, the regression equation is fitted by weighted least squares.

SPSS steps:
  • Analyze → Regression → Linear
  • Dependent: y
  • Independent: x
  • WLS Weight: n (WLS: weighted least squares)

WLS output

Model Summary (predictors: (Constant), x; dependent variable: y; weighted least squares regression, weighted by n)

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .982   .965       .962                .29365

Coefficients (dependent variable: y; weighted least squares regression, weighted by n)

Model 1      B       Std. Error   Beta    t         Sig.
(Constant)   7.190   .188                 38.316    .000
x            -.014   .001         -.982   -18.816   .000

$$\hat{y} = 7.19 - 0.014\,x \qquad (R^2 = 0.97)$$

Comparison of WLS and OLS output

1. In OLS the coefficient of determination is 0.975, while in WLS it drops to 0.965.
2. Because the coefficient of determination is computed on the ordinary least squares criterion, the value for the weighted equation is necessarily smaller than the OLS value. The coefficient of determination therefore cannot be used here to judge which model is better.
3. Plotting the OLS and WLS regression lines together shows that the WLS line lies closer to the measurements in the middle of the range, where the number of mixed samples n is larger, and farther than the OLS line from the measurements at the two ends, where n is smaller. These measurements clearly differ in how much they influence the fitted equation.

[Figure: OLS and WLS regression lines]

Another way to carry out WLS

In fact, if the SPSS Weight Cases procedure is used with n specified as the frequency variable, and an ordinary linear regression is then run, the results are exactly the same as those of the weighted least squares analysis above.

Steps:
(1) Call the Weight Cases procedure (Data → Weight Cases).
(2) Call the linear regression procedure (using OLS).

Output: the Model Summary and Coefficients tables repeat the WLS output above (R = .982, R Square = .965, Adjusted R Square = .962, Std. Error of the Estimate = .29365; constant 7.190, slope -.014), giving again

$$\hat{y} = 7.19 - 0.014\,x \qquad (R^2 = 0.97)$$

Application of weighted least squares (2): unequal error variances

Remedial measures: weighted least squares

Variation of errors around the regression line

1. Y values are normally distributed around the regression line.
2. For each X value, the "spread" or variance around the regression line is the same.

[Figure: error densities f(e) at X_1 and X_2 around the regression line; axes X and Y]

Equal error variances

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i \qquad (1)$$

where
  • $\beta_0, \beta_1, \ldots, \beta_{p-1}$ are parameters,
  • $X_{i1}, X_{i2}, \ldots, X_{i,p-1}$ are known constants,
  • $\varepsilon_i$ are independent $N(0, \sigma^2)$ (equal error variance),
  • $i = 1, \ldots, n$.

$$\underset{p \times 1}{\mathbf{b}} = \underset{p \times p}{(\mathbf{X}'\mathbf{X})^{-1}}\; \underset{p \times 1}{\mathbf{X}'\mathbf{Y}} \qquad (2)$$

Unequal error variances

$$Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_{p-1} X_{i,p-1} + \varepsilon_i \qquad (3)$$

where
  • $\beta_0, \beta_1, \ldots, \beta_{p-1}$ are parameters,
  • $X_{i1}, X_{i2}, \ldots, X_{i,p-1}$ are known constants,
  • $\varepsilon_i$ are independent $N(0, \sigma_i^2)$ (unequal error variance),
  • $i = 1, \ldots, n$.

$$\underset{n \times n}{\boldsymbol{\sigma}^2\{\boldsymbol{\varepsilon}\}} = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma_n^2 \end{bmatrix}$$

  • The regression coefficients in the generalized model (3) could be estimated with the estimators in (2) for regression model (1) with equal error variances. These estimators are still unbiased and consistent for model (3), but they no longer have minimum variance.
  • To obtain unbiased estimators with minimum variance, we must take into account that the different Y observations for the n cases no longer have the same reliability.
  • Observations with small variances provide more reliable information about the regression function than those with large variances.

Error variances known

We first consider the estimation of the regression coefficients when the error variances $\sigma_i^2$ are known. This case is usually unrealistic, but it provides guidance on how to proceed when the error variances are not known.

When the error variances $\sigma_i^2$ are known, we can use the method of maximum likelihood to obtain estimators of the regression coefficients in (3):

$$L(\boldsymbol{\beta}) = \prod_{i=1}^{n} \frac{1}{(2\pi\sigma_i^2)^{1/2}} \exp\left[ -\frac{1}{2\sigma_i^2} \left( Y_i - \beta_0 - \beta_1 X_{i1} - \cdots - \beta_{p-1} X_{i,p-1} \right)^2 \right] \qquad (4)$$

where $\boldsymbol{\beta}$ denotes the vector of the regression coefficients. We define the reciprocal of the variance $\sigma_i^2$ as the weight $w_i$:

$$w_i = \frac{1}{\sigma_i^2} \qquad (5)$$

Substituting (5) into (4) gives

$$L(\boldsymbol{\beta}) = \left[ \prod_{i=1}^{n} \left( \frac{w_i}{2\pi} \right)^{1/2} \right] \exp\left[ -\frac{1}{2} \sum_{i=1}^{n} w_i \left( Y_i - \beta_0 - \beta_1 X_{i1} - \cdots - \beta_{p-1} X_{i,p-1} \right)^2 \right] \qquad (6)$$

We find the maximum likelihood estimators of the regression coefficients by maximizing $L(\boldsymbol{\beta})$ in formula (6) with respect to $\beta_0, \beta_1, \ldots, \beta_{p-1}$.

Since the error variances $\sigma_i^2$, and hence the weights $w_i$, are assumed to be known, maximizing $L(\boldsymbol{\beta})$ with respect to the regression coefficients is equivalent to minimizing the sum in the exponential term:

$$Q_w = \sum_{i=1}^{n} w_i \left( Y_i - \beta_0 - \beta_1 X_{i1} - \cdots - \beta_{p-1} X_{i,p-1} \right)^2 \qquad (7)$$

$Q_w$ is the weighted least squares criterion.

  • Since the weight $w_i$ is inversely related to the variance $\sigma_i^2$, it reflects the amount of information contained in the observation $Y_i$.
  • Thus, an observation $Y_i$ that has a large variance receives less weight than another observation that has a small variance.
  • Intuitively, this is reasonable. The more precise $Y_i$ is (i.e., the smaller $\sigma_i^2$ is), the more information $Y_i$ provides about $E\{Y_i\}$, and therefore the more weight it should receive in fitting the regression function.

Error variances known: regression coefficients

Let the matrix $\mathbf{W}$ be a diagonal matrix containing the weights $w_i$:

$$\underset{n \times n}{\mathbf{W}} = \begin{bmatrix} w_1 & 0 & \cdots & 0 \\ 0 & w_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & w_n \end{bmatrix} \qquad (8)$$

The normal equations can then be expressed as follows:

$$(\mathbf{X}'\mathbf{W}\mathbf{X})\,\mathbf{b}_w = \mathbf{X}'\mathbf{W}\mathbf{Y} \qquad (9)$$

and the weighted least squares and maximum likelihood estimators of the regression coefficients are

$$\mathbf{b}_w = (\mathbf{X}'\mathbf{W}\mathbf{X})^{-1}\,\mathbf{X}'\mathbf{W}\mathbf{Y} \qquad (10)$$

where $\mathbf{b}_w$ is the vector of the estimated regression coefficients obtained by weighted least squares.

  • The weighted least squares and maximum likelihood estimators of the regression coefficients in formula (10) are unbiased, consistent, and have minimum variance among unbiased linear estimators.
  • Thus, when the weights are known, $\mathbf{b}_w$ generally exhibits less variability than the ordinary least squares estimator $\mathbf{b}$.

Error variances unknown

  • If the variances of the errors were known, the use of weighted least squares with weights $w_i$ would be straightforward.
  • Unfortunately, one rarely has knowledge of the variances of the errors. We are then forced to use estimates of the variances.

Estimation of the variance function

  • The magnitudes of the error variances often vary in a regular fashion with one or several predictor variables $X_k$ or with the mean response $E\{Y_i\}$.

[Figure: residuals e plotted against X]

  • Such a relationship between $\sigma_i^2$ and one or several predictors can be estimated because the squared residual $e_i^2$ obtained from an ordinary least squares regression fit is an estimate of $\sigma_i^2$, provided that the regression function is appropriate:

$$\hat{\sigma}_i^2 = e_i^2$$

Some possible variance functions:

1. A residual plot against $X_1$ exhibits a megaphone shape.

   Regress the absolute residuals against $X_1$.
2. A residual plot against $\hat{Y}$ exhibits a megaphone shape. Regress the absolute residuals against $\hat{Y}$.
3. A plot of the squared residuals against $X_3$ exhibits an upward tendency. Regress the squared residuals against $X_3$.
4. A plot of the residuals against $X_2$ suggests that the variance increases rapidly with increases in $X_2$ up to a point and then increases more slowly. Regress the absolute residuals against $X_2$ and $X_2^2$.

Summary of the WLS estimation process

1. Fit the regression model by unweighted least squares and analyze the residuals.
2. Estimate the variance function by regressing either the squared residuals or the absolute residuals on the appropriate predictor(s).
3. Use the fitted values from the estimated variance function to obtain the weights $w_i$.
4. Estimate the regression coefficients using these weights.

SPSS example of WLS

Example

  • A health researcher, interested in studying the relationship between diastolic blood pressure and age among healthy adult women 20 to 60 years old, collected data on 54 subjects.

Step 1: preliminary analyses

$$\text{blood\_pressure} = 56.157 + 0.58003 \times \text{age}$$

The residual plot indicates nonconstant error variance.

Step 2: estimating the variance function

The plot of the absolute residuals against X suggests that a linear relation between the error standard deviation and X may be reasonable:

$$\hat{s} = -1.549 + 0.198 \times \text{age} \qquad (11)$$

where $\hat{s}$ denotes the estimated expected standard deviation.

To obtain the weights $w_i$, the analyst obtained the fitted values from the standard deviation function in (11). For case 1 ($X_1 = 27$):

$$\hat{s}_1 = -1.549 + 0.198 \times 27 = 3.801, \qquad w_1 = \frac{1}{(\hat{s}_1)^2} = \frac{1}{(3.801)^2} = 0.0692$$

The value 3.801 comes from the unrounded coefficients; the rounded coefficients shown in (11) give 3.797.

The working table for this step has the columns $i$, $X_i$, $Y_i$, $e_i$, $|e_i|$, $\hat{s}_i$ and $w_i$, with

$$w_i = \frac{1}{(\hat{s}_i)^2}$$

Step 3: weighted least squares

Coefficients (dependent variable: blood_pressure; weighted least squares regression, weighted by wi; B and Std. Error are unstandardized, Beta is standardized)

Model 1      B        Std. Error   Beta   t        Sig.
(Constant)   55.566   2.521               22.042   .000
age          .596     .079         .722   7.526    .000

$$\hat{Y} = 55.566 + 0.596\,X \qquad (12)$$

Step 4: comparison of WLS and OLS

Coefficients (dependent variable: blood_pressure; the WLS rows are weighted by wi)

Method   Term         B        Std. Error   Beta   t        Sig.
OLS      (Constant)   56.157   3.994               14.061   .000
OLS      age          .580     .097         .639   5.983    .000
WLS      (Constant)   55.566   2.521               22.042   .000
WLS      age          .596     .079         .722   7.526    .000

It is interesting to note that the standard error of the age coefficient under WLS (.079) is somewhat smaller than the standard error of the estimate obtained by OLS (.097). The reduction of about 18% is the result of the recognition of unequal error variances when using WLS. This agrees with the general property that $\mathbf{b}_w$ exhibits less variability than the ordinary least squares estimator $\mathbf{b}$.

Since the regression coefficients changed only a little, the analyst concluded that there was no need to re-estimate the standard deviation function and the weights based on the residuals for the weighted regression in formula (12).

If the estimated coefficients differ substantially from the estimated regression coefficients obtained by OLS, it is usually advisable to iterate the WLS process by using the residuals from the WLS fit to re-estimate the variance or standard deviation function and then obtain revised weights. This iteration process is often called iteratively reweighted least squares.

SPSS Weight Estimation procedure

  • The SPSS Linear Regression procedure (with a manually specified WLS Weight variable) can be used for WLS, but two factors make it cumbersome:
    1. The values of the WLS Weight variable are sometimes hard to determine.
    2. The estimation of the variance function described above can supply the values of the WLS Weight variable, but the process becomes tedious when many iterations are needed.
  • For this reason, SPSS provides a separate "Weight Estimation" procedure.
  • The Weight Estimation procedure tests a range of weight transformations and indicates which will give the best fit to the data.

Weight Estimation procedure

  • Basic invocation:
    1. Analyze → Regression → Weight Estimation
    2. Dependent box: y
    3. Independent box: x
    4. Weight Variable box: the weight variable
  • Options:
    1. Weight function: power range
    2. Save best weight as new variable
    3. Display ANOVA and estimates

Weight function: power range

  • Weight variable. The data are weighted by the reciprocal of this variable raised to a power. The regression equation is calculated for each of a specified range of power values, and the procedure indicates the power that maximizes the log-likelihood function. The weight function is 1 / (weight var) ** power.
  • Power range. This is used in conjunction with the weight variable to compute weights. Several regression equations will be fit, one for each value in the power range. The values entered in the Power range text box and the through text box must be between -6.5 and 7.5, inclusive. The power values range from the low to the high value, in increments determined by the value specified. The total number of values in the power range is limited to 150.

Weight estimation example

  • The example uses the same data as above: diastolic blood pressure and age for 54 healthy adult women 20 to 60 years old.

Step 1: use the default power range.
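The power search described above can be illustrated with a minimal Python/NumPy sketch. It is not SPSS's implementation: the function names `wls_fit`, `log_likelihood` and `best_power` are hypothetical, the data are synthetic (not the blood pressure data), the error variance is assumed to be sigma^2 / w_i with w_i = 1 / x_i ** power, and the grid from -2 to 2 in steps of 0.5 is chosen only for illustration. The fit itself uses the normal equations of formula (10).

```python
import numpy as np

def wls_fit(X, y, w):
    """Weighted least squares estimate b_w = (X'WX)^{-1} X'Wy, W = diag(w)."""
    Xw = X * w[:, None]                          # WX: each row of X scaled by its weight
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)   # solve (X'WX) b = X'Wy

def log_likelihood(X, y, w):
    """Normal log-likelihood assuming Var(e_i) = sigma^2 / w_i,
    with sigma^2 replaced by its maximum likelihood estimate."""
    n = len(y)
    resid = y - X @ wls_fit(X, y, w)
    sigma2 = np.sum(w * resid**2) / n
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1) + 0.5 * np.sum(np.log(w))

def best_power(x, y, powers):
    """Try weights 1 / x**power for each power; return the power with the
    largest log-likelihood together with all log-likelihood values."""
    X = np.column_stack([np.ones_like(x), x])    # intercept column plus predictor
    scores = [log_likelihood(X, y, 1.0 / x**p) for p in powers]
    return powers[int(np.argmax(scores))], scores

# Synthetic illustration: error standard deviation grows in proportion to x.
rng = np.random.default_rng(0)
x = rng.uniform(20, 60, size=54)
y = 56 + 0.58 * x + rng.normal(size=54) * 0.15 * x

powers = np.arange(-2.0, 2.5, 0.5)               # illustrative grid of candidate powers
power, scores = best_power(x, y, powers)
X = np.column_stack([np.ones_like(x), x])
print("selected power:", power)
print("WLS coefficients:", wls_fit(X, y, 1.0 / x**power))
```

With weights set to 1 for every case, `wls_fit` reduces to ordinary least squares, which is one quick way to check the sketch against formula (2).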
