




12 Simple Linear Regression and Correlation

Introduction

Regression analysis is the part of statistics that deals with investigating the relationship between two or more variables that are related in a nondeterministic fashion. In this chapter, we generalize the linear relation $y = \beta_0 + \beta_1 x$ to a linear probabilistic relationship, develop procedures for making inferences about the parameters of the model, and obtain a quantitative measure of the extent to which the two variables are related.

12.1 The Simple Linear Regression Model

The variable whose value is fixed by the experimenter is denoted by $x$ and is called the independent, predictor, or explanatory variable. For fixed $x$, the second variable, $y$, is random; it is called the dependent, response, or explained variable.

A correlation relationship between variables: let $x$ = the age of a child and $y$ = the size of the child's vocabulary. For a fixed age such as $x = 5$, $y$ can take many values, for example 100, 200, 300, 400, 500, or 1000. The relationship is therefore described through the mean, $E(Y) = \beta_0 + \beta_1 x$, or equivalently $Y = \beta_0 + \beta_1 x + \varepsilon$.

Example 12.1 Visual and musculoskeletal problems associated with the use of visual display terminals (VDTs) have become rather common in recent years. Some researchers have focused on vertical gaze direction as a source of eye strain and irritation. This direction is known to be closely related to ocular surface area (OSA), so a method of measuring OSA is needed. The accompanying representative data on $y$ = OSA (cm²) and $x$ = width of the palpebral fissure is from the article "Analysis of Ocular Surface Area for Comfortable VDT Workstation Layout". The order in which observations were obtained was not given, so for convenience they are listed in increasing order of $x$ values.

| $i$ | $x_i$ | $y_i$ | $i$ | $x_i$ | $y_i$ |
|---|---|---|---|---|---|
| 1 | 0.4 | 1.02 | 16 | 1.15 | 3.18 |
| 2 | 0.42 | 1.21 | 17 | 1.2 | 3.76 |
| 3 | 0.48 | 0.88 | 18 | 1.25 | 3.68 |
| 4 | 0.51 | 0.98 | 19 | 1.25 | 3.82 |
| 5 | 0.57 | 1.52 | 20 | 1.28 | 3.21 |
| 6 | 0.6 | 1.83 | 21 | 1.3 | 4.27 |
| 7 | 0.7 | 1.5 | 22 | 1.34 | 3.12 |
| 8 | 0.75 | 1.8 | 23 | 1.37 | 3.99 |
| 9 | 0.75 | 1.74 | 24 | 1.4 | 3.75 |
| 10 | 0.78 | 1.63 | 25 | 1.43 | 4.1 |
| 11 | 0.84 | 2 | 26 | 1.46 | 4.18 |
| 12 | 0.95 | 2.8 | 27 | 1.49 | 3.77 |
| 13 | 0.99 | 2.48 | 28 | 1.55 | 4.34 |
| 14 | 1.03 | 2.47 | 29 | 1.58 | 4.21 |
| 15 | 1.12 | 3.05 | 30 | 1.6 | 4.92 |

The first step in a regression analysis is to construct a scatter plot of the data.

Example 12.2 Forest growth and decline phenomena throughout the world have attracted considerable public and scientific interest. The article "Relationship among crown condition, growth, and stand nutrition in seven northern sugarbushes" included a scatter plot of $y$ = mean crown dieback (%), one indicator of growth retardation, and $x$ = soil pH (higher pH corresponds to less acidic soil), from which the following observations were taken:

| $x$ | $y$ | $x$ | $y$ |
|---|---|---|---|
| 3.3 | 7.3 | 3.9 | 6.6 |
| 3.4 | 10.8 | 4 | 10 |
| 3.4 | 13.1 | 4.1 | 9.2 |
| 3.5 | 10.4 | 4.2 | 12.4 |
| 3.6 | 5.8 | 4.3 | 2.3 |
| 3.6 | 9.3 | 4.4 | 4.3 |
| 3.7 | 12.4 | 4.5 | 3 |
| 3.7 | 14.9 | 5 | 1.6 |
| 3.8 | 11.2 | 5.1 | 1 |
| 3.8 | 8 | | |

[Figure: scatter plot of $y$ versus $x$]

A Linear Probabilistic Model

For the deterministic model $y = \beta_0 + \beta_1 x$, the actual observed value of $y$ is a
linear function of $x$. The appropriate generalization of this to a probabilistic model assumes that the expected value of $Y$ is a linear function of $x$, but that for fixed $x$, the variable $Y$ differs from its expected value by a random amount.

The Simple Linear Regression Model

There exist parameters $\beta_0$, $\beta_1$,
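and $\sigma^2$ such that for any fixed value of the independent variable $x$, the dependent variable is related to $x$ through the model equation

$$Y = \beta_0 + \beta_1 x + \varepsilon \qquad (12.1)$$

The quantity $\varepsilon$ in the model equation is a random variable, assumed to be normally distributed with $E(\varepsilon) = 0$ and $V(\varepsilon) = \sigma^2$.

[Figure: an observed point $(x_i, y_i)$ and the line $y = \beta_0 + \beta_1 x$]

12.2 Estimating Model Parameters

Principle of Least Squares

The vertical deviation of the point $(x_i, y_i)$ from the line $y = b_0 + b_1 x$ is

height of point − height of line $= y_i - (b_0 + b_1 x_i)$.

The sum of squared vertical deviations from the points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ to the line is then

$$f(b_0, b_1) = \sum_{i=1}^{n} \left[ y_i - (b_0 + b_1 x_i) \right]^2 .$$

The point estimates of $\beta_0$ and $\beta_1$, denoted by $\hat\beta_0$ and $\hat\beta_1$ and called the least squares estimates, are those values that minimize $f(b_0, b_1)$. That is, $\hat\beta_0$ and $\hat\beta_1$ are such that $f(\hat\beta_0, \hat\beta_1) \le f(b_0, b_1)$ for any $b_0$, $b_1$. The estimated regression line or least squares line is then the line whose equation is $y = \hat\beta_0 + \hat\beta_1 x$.

Writing $Q = f(b_0, b_1) = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2$ and setting both partial derivatives to zero gives

$$\frac{\partial Q}{\partial b_0} = -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) = 0, \qquad \frac{\partial Q}{\partial b_1} = -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)\, x_i = 0 .$$

The normal equations:

$$n b_0 + b_1 \sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i, \qquad b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} x_i y_i .$$

Solving them gives

$$\hat\beta_1 = \frac{\sum x_i y_i - \frac{1}{n} \sum x_i \sum y_i}{\sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2}, \qquad \hat\beta_0 = \frac{1}{n} \sum y_i - \hat\beta_1 \cdot \frac{1}{n} \sum x_i .$$

Let

$$\bar x = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad \bar y = \frac{1}{n} \sum_{i=1}^{n} y_i,$$

$$S_{xx} = \sum (x_i - \bar x)^2 = \sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2, \qquad S_{yy} = \sum (y_i - \bar y)^2 = \sum y_i^2 - \frac{1}{n} \left( \sum y_i \right)^2,$$

$$S_{xy} = \sum (x_i - \bar x)(y_i - \bar y) = \sum x_i y_i - \frac{1}{n} \sum x_i \sum y_i .$$

The least squares estimates of the coefficients $\beta_0$ and $\beta_1$ of the true regression line are

$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}}, \qquad \hat\beta_0 = \bar y - \hat\beta_1 \bar x .$$

The regression equation is $\hat y = \hat\beta_0 + \hat\beta_1 x$.

Example 12.4 No-fines concrete, made from a uniformly graded coarse aggregate and a cement-water paste, is beneficial in areas prone to excessive rainfall because of its excellent drainage properties. Consider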
the following representative data, displayed in a tabular format convenient for calculating the values of the summary statistics.

| Obs. | $x$ | $y$ | $x^2$ | $xy$ | $y^2$ |
|---|---|---|---|---|---|
| 1 | 99 | 28.8 | 9801 | 2851.2 | 829.44 |
| 2 | 101.1 | 27.9 | 10221.21 | 2820.69 | 778.41 |
| 3 | 102.7 | 27 | 10547.29 | 2772.9 | 729 |
| 4 | 103 | 25.2 | 10609 | 2595.6 | 635.04 |
| 5 | 105.4 | 22.8 | 11109.16 | 2403.12 | 519.84 |
| 6 | 107 | 21.5 | 11449 | 2300.5 | 462.25 |
| 7 | 108.7 | 20.9 | 11815.69 | 2271.83 | 436.81 |
| 8 | 110.8 | 19.6 | 12276.64 | 2171.68 | 384.16 |
| 9 | 112.1 | 17.1 | 12566.41 | 1916.91 | 292.41 |
| 10 | 112.4 | 18.9 | 12633.76 | 2124.36 | 357.21 |
| 11 | 113.6 | 16 | 12904.96 | 1817.6 | 256 |
| 12 | 113.8 | 16.7 | 12950.44 | 1900.46 | 278.89 |
| 13 | 115.1 | 13 | 13248.01 | 1496.3 | 169 |
| 14 | 115.4 | 13.6 | 13317.16 | 1569.44 | 184.96 |
| 15 | 120 | 10.8 | 14400 | 1296 | 116.64 |
| Sum | 1640.1 | 299.8 | 179849.73 | 32308.59 | 6430.06 |

Solution:

$$\sum x_i = 1640.1, \quad \sum y_i = 299.8, \quad \sum x_i^2 = 179849.73, \quad \sum x_i y_i = 32308.59, \quad \sum y_i^2 = 6430.06,$$

$$\bar x = \frac{1}{n} \sum x_i = 109.34, \qquad \bar y = \frac{1}{n} \sum y_i = 19.986667 .$$

$$S_{xx} = \sum x_i^2 - \frac{1}{n} \left( \sum x_i \right)^2 = 179849.73 - \frac{(1640.1)^2}{15} = 521.196$$

$$S_{xy} = \sum x_i y_i - \frac{1}{n} \sum x_i \sum y_i = 32308.59 - \frac{(1640.1)(299.8)}{15} = -471.542$$

$$\hat\beta_1 = \frac{S_{xy}}{S_{xx}} = \frac{-471.542}{521.196} = -0.90473066 \approx -0.905$$

$$\hat\beta_0 = \bar y - \hat\beta_1 \bar x = 19.986667 - (-0.90473066)(109.34) = 118.91$$

The equation of the estimated regression line is $\hat y = 118.91 - 0.905x$.

Estimating $\sigma^2$

Definition: The fitted (or predicted) values are
obtained by successively substituting $x_1, x_2, \ldots, x_n$ into the equation of the estimated regression line: $\hat y_1 = \hat\beta_0 + \hat\beta_1 x_1$, $\hat y_2 = \hat\beta_0 + \hat\beta_1 x_2$, $\ldots$, $\hat y_n = \hat\beta_0 + \hat\beta_1 x_n$. The residuals are the vertical deviations $y_1 - \hat y_1, \ldots, y_n - \hat y_n$ from the estimated line.

The error sum of squares, denoted by SSE, is

$$SSE = \sum (y_i - \hat y_i)^2 = \sum \left[ y_i - (\hat\beta_0 + \hat\beta_1 x_i) \right]^2 .$$

The regression sum of squares, denoted by SSR, is

$$SSR = \sum (\hat y_i - \bar y)^2 .$$

The total sum of squares, denoted by SST, is

$$SST = \sum (y_i - \bar y)^2 .$$

The estimate of $\sigma^2$ is

$$\hat\sigma^2 = s^2 = \frac{SSE}{n - 2} = \frac{\sum (y_i - \hat y_i)^2}{n - 2} .$$

The three sums of squares satisfy

$$SST = SSE + SSR .$$

Definition: The coefficient of determination, denoted by $r^2$, is given by

$$r^2 = 1 - \frac{SSE}{SST} .$$

It is interpreted as the proportion of observed $y$ variation that can be explained by the simple linear regression model (attributed to an approximate linear relationship between $y$ and $x$).

The sample correlation coefficient $r$

Definition: The sample correlation coefficient for the $n$ pairs $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ is

$$r = \frac{S_{xy}}{\sqrt{S_{xx}} \sqrt{S_{yy}}} = \frac{\sum (x_i - \bar x)(y_i - \bar y)}{\sqrt{\sum (x_i - \bar x)^2} \sqrt{\sum (y_i - \bar y)^2}} .$$

Since $\hat y_i - \bar y = \hat\beta_1 (x_i - \bar x)$, the square of $r$ agrees with the coefficient of determination:

$$r^2 = 1 - \frac{SSE}{SST} = \frac{SSR}{SST} = \frac{\sum (\hat y_i - \bar y)^2}{S_{yy}} = \frac{\hat\beta_1^2 \sum (x_i - \bar x)^2}{S_{yy}} = \frac{\hat\beta_1^2 S_{xx}}{S_{yy}} = \frac{S_{xy}^2}{S_{xx} S_{yy}} .$$

Properties of $r$

The most important properties of $r$ are as follows:
1) The value of $r$ does not depend on which of the two variables under study is labeled $x$ and which is labeled $y$.
2) The value of $r$ is independent of the units in which $x$ and $y$ are measured.
3) $-1 \le r \le 1$.
4) $r = 1$ if and only if all $(x_i, y_i)$ pairs lie on a straight line with positive slope, and $r = -1$ if and only if all $(x_i, y_i)$ pairs lie on a straight line with negative slope.
5) The square of the sample correlation coefficient gives the value of the coefficient of determination that would result from fitting the simple linear regression model; in symbols, $(r)^2 = r^2$.

Example 12.9 The scatter plot of the no-fines concrete data in Figure
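12.8 certainly portends a very high $r^2$ value.

Solution:

$$\hat\beta_0 = 118.909917, \quad \hat\beta_1 = -0.90473066, \quad \sum y_i = 299.8, \quad \sum x_i y_i = 32308.59, \quad \sum y_i^2 = 6430.06$$

$$SST = \sum y_i^2 - \frac{1}{n} \left( \sum y_i \right)^2 = 6430.06 - \frac{(299.8)^2}{15} = 438.057333 \approx 438.06$$

$$SSE = \sum y_i^2 - \hat\beta_0 \sum y_i - \hat\beta_1 \sum x_i y_i = 6430.06 - (118.909917)(299.8) - (-0.90473066)(32308.59) = 11.4388 \approx 11.44$$

So the coefficient of determination is

$$r^2 = 1 - \frac{11.44}{438.06} = 1 - 0.026 = 0.974 .$$

Exercise 18 (p. 509) The following summary statistics were obtained from a study that used regression analysis to investigate the relationship between pavement deflection and surface temperature of the pavement at various locations on a state highway. Here $x$ = temperature (°F) and $y$ = deflection adjustment factor ($y \ge 0$):

$$n = 15, \quad \sum x_i = 1425, \quad \sum y_i = 10.68, \quad \sum x_i^2 = 139037.25, \quad \sum x_i y_i = 987.645, \quad \sum y_i^2 = 7.8518$$

a. Compute $\hat\beta_1$, $\hat\beta_0$, and the equation of the estimated regression line. Graph the estimated line.
b. What is the estimate of expected change in the deflecti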
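
The hand calculations in Examples 12.4 and 12.9 need only the sample size and five sums, so they lend themselves to a short script. The following Python sketch is one possible way to carry them out; the helper name `fit_from_sums` and its return keys are hypothetical and not part of the original notes. It uses the shortcut formulas for $S_{xx}$, $S_{xy}$, and $S_{yy}$ and the computational formula for SSE that appears in Example 12.9.

```python
from math import sqrt


def fit_from_sums(n, sum_x, sum_y, sum_x2, sum_xy, sum_y2):
    """Simple linear regression from summary sums (hypothetical helper)."""
    s_xx = sum_x2 - sum_x ** 2 / n       # S_xx
    s_xy = sum_xy - sum_x * sum_y / n    # S_xy
    s_yy = sum_y2 - sum_y ** 2 / n       # S_yy, which equals SST

    b1 = s_xy / s_xx                     # slope estimate
    b0 = sum_y / n - b1 * sum_x / n      # intercept estimate: ybar - b1 * xbar

    # Computational formula for SSE, as used in Example 12.9
    sse = sum_y2 - b0 * sum_y - b1 * sum_xy

    return {
        "b0": b0,
        "b1": b1,
        "SSE": sse,
        "SST": s_yy,
        "s2": sse / (n - 2),             # estimate of sigma^2
        "r2": 1 - sse / s_yy,            # coefficient of determination
        "r": s_xy / sqrt(s_xx * s_yy),   # sample correlation coefficient
    }


# Summary sums of the no-fines concrete data (Example 12.4)
fit = fit_from_sums(15, 1640.1, 299.8, 179849.73, 32308.59, 6430.06)
for name, value in fit.items():
    print(f"{name}: {value:.4f}")
```

With the Example 12.4 sums, this should give a slope of about -0.905, an intercept of about 118.91, and $r^2$ of about 0.974, in line with the hand calculations. The computational formula for SSE is sensitive to rounding in the two estimates, which is why the script keeps them at full precision instead of rounding them first. The same call with the summary statistics of Exercise 18 would be one way to approach part a.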