
© 2011 Pearson Education, Inc

Statistics for Business and Economics
Chapter 10: Simple Linear Regression

Contents
10.1 Probabilistic Models
10.2 Fitting the Model: The Least Squares Approach
10.3 Model Assumptions
10.4 Assessing the Utility of the Model: Making Inferences about the Slope \( \beta_1 \)
10.5 The Coefficients of Correlation and Determination
10.6 Using the Model for Estimation and Prediction
10.7 A Complete Example

Learning Objectives
- Introduce the straight-line (simple linear regression) model as a means of relating one quantitative variable to another quantitative variable
- Introduce the correlation coefficient as a means of relating one quantitative variable to another quantitative variable
- Assess how well the simple linear regression model fits the sample data
- Employ the simple linear regression model for predicting the value of one variable from a specified value of another variable

10.1 Probabilistic Models

Models
- Representation of some phenomenon
- A mathematical model is a mathematical expression of some phenomenon
- Often describe relationships between variables
- Types: deterministic models and probabilistic models

Deterministic Models
- Hypothesize exact relationships
- Suitable when prediction error is negligible
- Example: force is exactly mass times acceleration, \( F = ma \)

Probabilistic Models
- Hypothesize two components: deterministic and random error
- Example: sales volume (y) is 10 times advertising spending (x) plus random error, \( y = 10x + \varepsilon \)
- Random error may be due to factors other than advertising
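To make the two components of the sales example concrete, the sketch below simulates \( y = 10x + \varepsilon \) in Python. The normal error distribution, its standard deviation (`error_sd`), and the fixed `seed` are illustrative assumptions; the slides only say that the random error has mean 0.

```python
import random


def simulate_sales(ad_spending, error_sd=2.0, seed=0):
    """Simulate y = 10x + error for each advertising level x.

    error_sd and seed are illustrative assumptions, not values from the slides.
    Returns (deterministic, observed): the exact 10x values and the noisy ones.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    deterministic = [10 * x for x in ad_spending]
    # The random error has mean 0, so E(y) equals the deterministic component.
    observed = [mean + rng.gauss(0.0, error_sd) for mean in deterministic]
    return deterministic, observed


levels = [1, 2, 3, 4, 5]
deterministic, observed = simulate_sales(levels)
for x, mean, y in zip(levels, deterministic, observed):
    print(f"x={x}  E(y)={mean}  simulated y={y:.2f}")
```

Setting `error_sd=0.0` removes the random component, so the observed values coincide with the deterministic ones; that is the deterministic-model special case.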

General Form of Probabilistic Models
y = Deterministic component + Random error
where y is the variable of interest. We always assume that the mean value of the random error equals 0. This is equivalent to assuming that the mean value of y, E(y), equals the deterministic component of the model; that is,
E(y) = Deterministic component

A First-Order (Straight-Line) Probabilistic Model
\[ y = \beta_0 + \beta_1 x + \varepsilon \]
where
- y = dependent or response variable (variable to be modeled)
- x = independent or predictor variable (variable used as a predictor of y)
- \( E(y) = \beta_0 + \beta_1 x \) = deterministic component
- \( \varepsilon \) (epsilon) = random error component
- \( \beta_0 \) (beta zero) = y-intercept of the line, that is, the point at which the line intercepts or cuts through the y-axis
- \( \beta_1 \) (beta one) = slope of the line, that is, the change (amount of increase or decrease) in the deterministic component of y for every 1-unit increase in x

Note: A positive slope implies that E(y) increases by the amount \( \beta_1 \) for each unit increase in x. A negative slope implies that E(y) decreases by the amount \( |\beta_1| \) for each unit increase in x.

Five-Step Procedure
Step 1: Hypothesize the deterministic component of the model that relates the mean, E(y), to the independent variable x.
Step 2: Use the sample data to estimate unknown parameters in the model.
Step 3: Specify the probability distribution of the random error term and estimate the standard deviation of this distribution.
Step 4: Statistically evaluate the usefulness of the model.
Step 5: When satisfied that the model is useful, use it for prediction, estimation, and other purposes.

10.2 Fitting the Model: The Least Squares Approach

Scattergram
- Plot of all \( (x_i, y_i) \) pairs
- Suggests how well the model will fit

Thinking Challenge
How would you draw a line through the points? How do you determine which line fits best?

Least Squares Line
The least squares line is one that has the following two properties:
1. The sum of the errors equals 0, i.e., mean error = 0.
2. The sum of squared errors (SSE) is smaller than for any other straight-line model, i.e., the error variance is minimum.

Formula for the Least Squares Estimates
Slope: \( \hat{\beta}_1 = SS_{xy} / SS_{xx} \)
y-intercept: \( \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \)
where
\( SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - (\sum x_i)(\sum y_i)/n \)
\( SS_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - (\sum x_i)^2/n \)
n = sample size

Interpreting the Estimates of \( \beta_0 \) and \( \beta_1 \) in Simple Linear Regression
y-intercept: \( \hat{\beta}_0 \) represents the predicted value of y when x = 0. (Caution: This value will not be meaningful if the value x = 0 is nonsensical or outside the range of the sample data.)
slope: \( \hat{\beta}_1 \) represents the increase (or decrease) in y for every 1-unit increase in x. (Caution: This interpretation is valid only for x-values within the range of the sample data.)

[Figure: Least Squares Graphically; axes x and y, errors labeled \( e_1 \), \( e_2 \), \( e_3 \), \( e_4 \)]

Least Squares Example
You're a marketing analyst for Hasbro Toys. You gather the following data:

Ad Expenditure ($100s)   Sales (Units)
1                        1
2                        1
3                        2
4                        2
5                        4

Find the least squares line relating sales and advertising.

[Figure: Scattergram, Sales vs. Advertising; axes: Sales, Advertising]

Parameter Estimation Solution
\( \hat{y} = -0.1 + 0.7x \)

Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   T for H0: Param=0   Prob > |T|
INTERCEP   1    -0.1000              0.6350           -0.157              0.8849
ADVERT     1     0.7000              0.1914            3.656              0.0354
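The estimates on the printout can be reproduced by hand from the five data pairs. The sketch below is one minimal way to do that in plain Python, using the deviation form of the sums of squares (slope = SSxy / SSxx, intercept = y-bar minus slope times x-bar); the function name `least_squares` is a hypothetical helper, not part of the slides.

```python
def least_squares(x, y):
    """Return (intercept, slope) of the least squares line for paired data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares in deviation form
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = ss_xy / ss_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Advertising example: ad expenditure in $100s and sales in units
ad = [1, 2, 3, 4, 5]
sales = [1, 1, 2, 2, 4]

b0, b1 = least_squares(ad, sales)
print(f"y-hat = {b0:.4f} + {b1:.4f} x")
```

For these data SSxy = 7 and SSxx = 10, so the slope is 0.7 and the intercept is 2 - 0.7(3) = -0.1, which agrees with the INTERCEP and ADVERT rows of the Parameter Estimates table.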

Parameter Estimation Computer Output
In the Parameter Estimates table above, the Parameter Estimate column gives \( \hat{\beta}_0 \) (INTERCEP row) and \( \hat{\beta}_1 \) (ADVERT row).

Coefficient Interpretation Solution

[Figure: Regression Line Fitted to the Data; axes: Sales, Advertising]

Least Squares Thinking Challenge
You're an economist for the county cooperative. You gather the following data:

Fertilizer (lb.)   Yield (lb.)
 4                 3.0
 6                 5.5
10                 6.5
12                 9.0

Find the least squares line relating crop yield and fertilizer.
(Clip art © 1984-1994 T/Maker Co.)

[Figure: Scattergram, Crop Yield vs. Fertilizer*; axes: Yield (lb.), Fertilizer (lb.)]

Parameter Estimation Solution*

Coefficient Interpretation Solution*

[Figure: Regression Line Fitted to the Data; axes: Yield (lb.), Fertilizer (lb.)]

10.3 Model Assumptions

Basic Assumptions of the Probability Distribution
Assumption 1: The mean of the probability distribution of \( \varepsilon \) is 0; that is, the average of the values of \( \varepsilon \) over an infinitely long series of experiments is 0 for each setting of the independent variable x. This assumption implies that the mean value of y, E(y), for a given value of x is \( E(y) = \beta_0 + \beta_1 x \).
Assumption 2: The variance of the probability distribution of \( \varepsilon \) is constant for all settings of the independent variable x. For our straight-line model, this assumption means that the variance of \( \varepsilon \) is equal to a constant, say \( \sigma^2 \), for all values of x.
Assumption 3: The probability distribution of \( \varepsilon \) is normal.
Assumption 4: The values of \( \varepsilon \) associated with any two observed values of y are independent; that is, the value of \( \varepsilon \) associated with one value of y has no effect on the values of \( \varepsilon \) associated with other y values.

Estimation of \( \sigma^2 \) for a (First-Order) Straight-Line Model
\[ s^2 = \frac{SSE}{n - 2}, \qquad SSE = \sum (y_i - \hat{y}_i)^2 \]
To estimate the standard deviation \( \sigma \) of \( \varepsilon \), we calculate
\[ s = \sqrt{s^2} = \sqrt{\frac{SSE}{n - 2}} \]
We will refer to s as the estimated standard error of the regression model.

Calculating SSE, \( s^2 \), s Example
You're a marketing analyst for Hasbro Toys. You gather the following data:

Ad Expenditure ($100s)   Sales (Units)
1                        1
2                        1
3                        2
4                        2
5                        4

Find SSE, \( s^2 \), and s.

Calculating \( s^2 \) and s Solution

10.4 Assessing the Utility of the Model: Making Inferences about the Slope \( \beta_1 \)

Sampling Distribution of \( \hat{\beta}_1 \)
If we make the four assumptions about \( \varepsilon \), the sampling distribution of the least squares estimator \( \hat{\beta}_1 \) of the slope will be normal with mean \( \beta_1 \) (the true slope) and standard deviation
\[ \sigma_{\hat{\beta}_1} = \frac{\sigma}{\sqrt{SS_{xx}}}, \qquad SS_{xx} = \sum (x_i - \bar{x})^2 \]
We estimate \( \sigma_{\hat{\beta}_1} \) by \( s_{\hat{\beta}_1} = s / \sqrt{SS_{xx}} \) and refer to this quantity as the estimated standard error of the least squares slope \( \hat{\beta}_1 \).
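As a numerical check on these quantities, the sketch below computes SSE, s², s, and the estimated standard error of the slope for the advertising data (x = 1, 2, 3, 4, 5; y = 1, 1, 2, 2, 4). It assumes the standard definitions SSE = sum of squared residuals, s² = SSE / (n - 2), and standard error of the slope = s / sqrt(SSxx); the function name and the returned dictionary keys are hypothetical.

```python
import math


def regression_error_summary(x, y):
    """Return SSE, s^2, s and the slope's estimated standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least squares estimates
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar

    # Sum of squared residuals around the fitted line
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)          # n - 2 degrees of freedom
    s = math.sqrt(s2)
    se_slope = s / math.sqrt(ss_xx)
    return {"SSE": sse, "s2": s2, "s": s, "se_slope": se_slope}


ad = [1, 2, 3, 4, 5]
sales = [1, 1, 2, 2, 4]
for name, value in regression_error_summary(ad, sales).items():
    print(f"{name} = {value:.5f}")
```

For these data the residuals are 0.4, -0.3, 0, -0.7, and 0.6, so SSE = 1.10, s² is about 0.3667, and s is about 0.6055, matching the Root MSE of 0.60553 on the printout. The slope's standard error comes out at about 0.191, in line with the Standard Error reported for ADVERT.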

A Test of Model Utility: Simple Linear Regression
Test statistic: \( t = \hat{\beta}_1 / s_{\hat{\beta}_1} \)

One-Tailed Test
\( H_0 \): \( \beta_1 = 0 \)
\( H_a \): \( \beta_1 < 0 \) (or \( H_a \): \( \beta_1 > 0 \))
Rejection region: \( t < -t_\alpha \) (or \( t > t_\alpha \) when \( H_a \): \( \beta_1 > 0 \))
where \( t_\alpha \) is based on (n - 2) degrees of freedom

Two-Tailed Test
\( H_0 \): \( \beta_1 = 0 \)
\( H_a \): \( \beta_1 \ne 0 \)
Rejection region: \( |t| > t_{\alpha/2} \)
where \( t_{\alpha/2} \) is based on (n - 2) degrees of freedom

Interpreting p-Values for \( \beta \) Coefficients in Regression
Almost all statistical computer software packages report a two-tailed p-value for each of the \( \beta \) parameters in the regression model. For example, in simple linear regression, the p-value for the two-tailed test \( H_0 \): \( \beta_1 = 0 \) versus \( H_a \): \( \beta_1 \ne 0 \) is given on the printout. If you want to conduct a one-tailed test of hypothesis, you will need to adjust the p-value reported on the printout as follows:
Upper-tailed test (\( H_a \): \( \beta_1 > 0 \)): p-value = p/2 if t > 0, and 1 - p/2 if t < 0
Lower-tailed test (\( H_a \): \( \beta_1 < 0 \)): p-value = p/2 if t < 0, and 1 - p/2 if t > 0
where p is the p-value reported on the printout and t is the value of the test statistic.

A 100(1 - \( \alpha \))% Confidence Interval for the Simple Linear Regression Slope \( \beta_1 \)
\[ \hat{\beta}_1 \pm t_{\alpha/2} \, s_{\hat{\beta}_1} \]
where the estimated standard error \( s_{\hat{\beta}_1} \) is calculated by \( s_{\hat{\beta}_1} = s / \sqrt{SS_{xx}} \) and \( t_{\alpha/2} \) is based on (n - 2) degrees of freedom.

Test of Slope Coefficient Example
You're a marketing analyst for Hasbro Toys. You find \( \hat{\beta}_0 = -.1 \), \( \hat{\beta}_1 = .7 \) and s = .6055.

Ad Expenditure ($100s)   Sales (Units)
1                        1
2                        1
3                        2
4                        2
5                        4

Is the relationship significant at the .05 level of significance?

Test of Slope Coefficient Solution
\( H_0 \): \( \beta_1 = 0 \)
\( H_a \): \( \beta_1 \ne 0 \)
\( \alpha \) = .05
df = 5 - 2 = 3
Critical Value(s): \( \pm 3.182 \)
Test Statistic: \( t = \hat{\beta}_1 / s_{\hat{\beta}_1} = .7 / .1914 \approx 3.656 \)
Decision: Reject \( H_0 \) at \( \alpha \) = .05
Conclusion: There is evidence of a relationship

Test of Slope Coefficient Computer Output

Parameter Estimates
Variable   DF   Parameter Estimate   Standard Error   T for H0: Param=0   Prob > |T|
INTERCEP   1    -0.1000              0.6350           -0.157              0.8849
ADVERT     1     0.7000              0.1914            3.656              0.0354

In the ADVERT row, Parameter Estimate is \( \hat{\beta}_1 \), Standard Error is \( s_{\hat{\beta}_1} \), the T column is \( t = \hat{\beta}_1 / s_{\hat{\beta}_1} \), and Prob > |T| is the two-tailed p-value.

10.5 The Coefficients of Correlation and Determination

Correlation Models
- Answers "How strong is the linear relationship between two variables?"
- Coefficient of correlation
  - Sample correlation coefficient denoted r
  - Values range from -1 to +1
  - Measures degree of association
  - Does not indicate cause-effect relationship

Coefficient of Correlation
\[ r = \frac{SS_{xy}}{\sqrt{SS_{xx} \, SS_{yy}}} \]
where
\( SS_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) \), \( SS_{xx} = \sum (x_i - \bar{x})^2 \), \( SS_{yy} = \sum (y_i - \bar{y})^2 \)

Coefficient of Correlation Example
You're a marketing analyst for Hasbro Toys.

Ad Expenditure ($100s)   Sales (Units)
1                        1
2                        1
3                        2
4                        2
5                        4

Calculate the coefficient of correlation.

Coefficient of Correlation Solution
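The slope test, the confidence interval for the slope, and the coefficient of correlation for the advertising data can all be checked with a few lines of Python. This is a sketch under stated assumptions: the critical value 3.182 is the tabled t value for an upper-tail area of .025 with 3 degrees of freedom and is passed in by hand, because the Python standard library has no t distribution. The names `slope_inference` and `one_tailed_p_value` are hypothetical helpers, and the second one implements the usual rule of halving the printout's two-tailed p-value when the sign of t agrees with the alternative hypothesis.

```python
import math


def slope_inference(x, y, t_crit):
    """Return (t, (ci_low, ci_high), r) for a simple linear regression.

    t_crit is the tabled t value for alpha/2 with n - 2 degrees of freedom.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_yy = sum((yi - y_bar) ** 2 for yi in y)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    b1 = ss_xy / ss_xx
    sse = ss_yy - b1 * ss_xy            # same value as summing squared residuals
    s = math.sqrt(sse / (n - 2))
    se_b1 = s / math.sqrt(ss_xx)

    t = b1 / se_b1                      # test statistic for H0: beta_1 = 0
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    r = ss_xy / math.sqrt(ss_xx * ss_yy)
    return t, ci, r


def one_tailed_p_value(p_two_tailed, t, upper=True):
    """Convert a printout's two-tailed p-value to a one-tailed p-value."""
    in_direction = t > 0 if upper else t < 0
    return p_two_tailed / 2 if in_direction else 1 - p_two_tailed / 2


ad = [1, 2, 3, 4, 5]
sales = [1, 1, 2, 2, 4]
t, ci, r = slope_inference(ad, sales, t_crit=3.182)
print(f"t = {t:.3f}, 95% CI for slope = ({ci[0]:.3f}, {ci[1]:.3f}), r = {r:.3f}")
print(f"upper-tailed p-value = {one_tailed_p_value(0.0354, t):.4f}")
```

The test statistic comes out near 3.656, which exceeds 3.182, so the null hypothesis of a zero slope is rejected at the .05 level, as on the slides. The interval excludes 0, and r is about .904, the value used later for the coefficient of determination.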

A Test for Linear Correlation
Test statistic: \( t = r \sqrt{n - 2} / \sqrt{1 - r^2} \)

One-Tailed Test
\( H_0 \): \( \rho = 0 \)
\( H_a \): \( \rho > 0 \) (or \( H_a \): \( \rho < 0 \))
Rejection region: \( t > t_\alpha \) (or \( t < -t_\alpha \))
where the distribution of t depends on (n - 2) degrees of freedom

Two-Tailed Test
\( H_0 \): \( \rho = 0 \)
\( H_a \): \( \rho \ne 0 \)
Rejection region: \( |t| > t_{\alpha/2} \)
where the distribution of t depends on (n - 2) degrees of freedom

Condition Required for a Valid Test of Correlation
The sample of (x, y) values is randomly selected from a normal population.

Coefficient of Correlation Thinking Challenge
You're an economist for the county cooperative. You gather the following data:

Fertilizer (lb.)   Yield (lb.)
 4                 3.0
 6                 5.5
10                 6.5
12                 9.0

Find the coefficient of correlation.

Coefficient of Correlation Solution

Coefficient of Determination
It represents the proportion of the total sample variability around \( \bar{y} \) that is explained by the linear relationship between y and x.
\( 0 \le r^2 \le 1 \)
\( r^2 \) = (coefficient of correlation)\(^2\)

Coefficient of Determination Example
You're a marketing analyst for Hasbro Toys. You know r = .904.

Ad Expenditure ($100s)   Sales (Units)
1                        1
2                        1
3                        2
4                        2
5                        4

Calculate and interpret the coefficient of determination.

Coefficient of Determination Solution
\( r^2 \) = (coefficient of correlation)\(^2\) = \( (.904)^2 \) = .817
Interpretation: About 81.7% of the sample variation in Sales (y) can be explained by using Ad $ (x) to predict Sales (y) in the linear model.

\( r^2 \) Computer Output

Root MSE   0.60553    R-square   0.8167
Dep Mean   2.00000    Adj R-sq   0.7556
C.V.      30.27650

R-square is \( r^2 \); Adj R-sq is \( r^2 \) adjusted for the number of explanatory variables and sample size.

... a sample of 15 recent fires in this suburb is selected. The amount of damage, y, and the distance between the fire and the nearest fire station, x, are recorded for each fire.

Example
Step 1: First, we hypothesize a model to relate fire damage, y, to the distance from the nearest fire station, x. We hypothesize a straight-line probabilistic model:
\[ y = \beta_0 + \beta_1 x + \varepsilon \]

Step 2: Use a statistical software package to estimate the unknown parameters in the deterministic component of the hypothesized model. The Excel printout for the simple linear regression analysis is shown on the next slide. The least squares estimates of the slope \( \beta_1 \) and intercept \( \beta_0 \), highlighted on the printout, are \( \hat{\beta}_1 = 4.919 \) and \( \hat{\beta}_0 = 10.278 \), so the least squares equation is (after rounding)
\[ \hat{y} = 10.278 + 4.919x \]
with y measured in thousands of dollars. This prediction equation is graphed in the Minitab scatterplot.

The least squares estimate of the slope, \( \hat{\beta}_1 = 4.919 \), implies that the estimated mean damage increases by $4,919 for each additional mile from the fire station. This interpretation is valid over the range of x, or from .7 to 6.1 miles from the station. The estimated y-intercept, \( \hat{\beta}_0 = 10.278 \), has the interpretation that a fire 0 miles from the fire station has an estimated mean damage of $10,278.

Step 3: Specify the probability distribution of the random error component \( \varepsilon \). The estimate of the standard deviation \( \sigma \) of \( \varepsilon \), highlighted on the Excel printout, is s = 2.31635. This implies that most of the observed fire damage (y) values will fall within approximately 2s = 4.63 thousand dollars of their respective predicted values when using the least squares line.

Step 4: First, test the null hypothesis that the slope \( \beta_1 \) is 0, that is, that there is no linear relationship between fire damage and the distance from the nearest fire station, against the alternative hypothesis that fire damage increases as the distance increases. We test
\( H_0 \): \( \beta_1 = 0 \)
\( H_a \): \( \beta_1 > 0 \)
The two-tailed observed significance level for testing is approximately 0.

The 95% confidence interval yields (4.070, 5.768). We estimate (with 95% confidence) that the interval from $4,070 to $5,768 encloses the mean increase (\( \beta_1 \)) in fire damage per additional mile distance from the fire station.

The coefficient of determination is \( r^2 = .9235 \), which implies that about 92% of the sample variation in fire damage (y) is explained by the distance (x) between the fire and the fire station.

The coefficient of correlation, r, that measures the strength of the linear relationship between y and x is not shown on the Excel printout and must be calculated. Because the estimated slope is positive, we take the positive square root and find
\[ r = +\sqrt{r^2} = \sqrt{.9235} \approx .96 \]
The high correlation confirms our conclusion that \( \beta_1 \) is greater than 0; it appears that fire damage and distance from the fire station are positively correlated. All signs point to a strong linear relationship between y and x.

Step 5: We are now prepared to use the least squares model. Suppose the insurance company wants to predict the fire damage if a major residential fire were to occur 3.5 miles from the nearest fire station. A 95% confidence interval for E(y) and prediction interva
