




Lecture 10: Dummy Variables

DUMMY VARIABLE CLASSIFICATION WITH TWO CATEGORIES
DUMMY CLASSIFICATION WITH MORE THAN TWO CATEGORIES
TWO SETS OF DUMMY VARIABLES
SLOPE DUMMY VARIABLES

DUMMY VARIABLE CLASSIFICATION WITH TWO CATEGORIES

This sequence explains how you can include qualitative explanatory variables in your regression model.

Suppose that you have data on the annual recurrent expenditure, COST, and the number of students enrolled, N, for a sample of secondary schools, of which there are two types: regular and occupational.

The occupational schools aim to provide skills for specific occupations and they tend to be relatively expensive to run because they need to maintain specialized workshops.

One way of dealing with the difference in the costs would be to run separate regressions for the two types of school. However, this would have the drawback that you would be running regressions with two small samples instead of one large one, with an adverse effect on the precision of the estimates of the coefficients.

Another way of handling the difference would be to hypothesize that the cost function for occupational schools has an intercept $b_1'$ that is greater than the intercept $b_1$ for regular schools.

Regular school: $COST = b_1 + b_2 N + u$
Occupational school: $COST = b_1' + b_2 N + u$

Effectively, we are hypothesizing that the annual overhead cost is different for the two types of school, but the marginal cost is the same. The marginal cost assumption is not very plausible and we will relax it in due course.

Let us define $d$ to be the difference in the intercepts: $d = b_1' - b_1$. Then $b_1' = b_1 + d$ and we can rewrite the cost function for occupational schools as

Occupational school: $COST = b_1 + d + b_2 N + u$

We can now combine the two cost functions by defining a dummy variable OCC that has value 0 for regular schools and 1 for occupational schools.

Combined equation: $COST = b_1 + d\,OCC + b_2 N + u$
OCC = 0, regular school: $COST = b_1 + b_2 N + u$
OCC = 1, occupational school: $COST = b_1 + d + b_2 N + u$

Dummy variables always have two values, 0 or 1. If OCC is equal to 0, the cost function becomes that for regular schools. If OCC is equal to 1, the cost function becomes that for occupational schools.

[Figure: two parallel cost functions, COST against N, with intercept $b_1$ for regular schools and $b_1 + d$ for occupational schools.]

We will now fit a function of this type using actual data for a sample of 74 secondary schools in Shanghai.

School  Type          COST     N    OCC
1       Occupational  345,000  623  1
2       Occupational  537,000  653  1
3       Regular       170,000  400  0
4       Occupational  526,000  663  1
5       Regular       100,000  563  0
6       Regular        28,000  236  0
7       Regular       160,000  307  0
8       Occupational   45,000  173  1
9       Occupational  120,000  146  1
10      Occupational   61,000   99  1

The table shows the data for the first 10 schools in the sample. The annual cost is measured in yuan, one yuan being worth about 20 cents U.S. at the time. N is the number of students in the school. OCC is the dummy variable for the type of school.

We now run the regression of COST on N and OCC, treating OCC just like any other explanatory variable, despite its artificial nature.

Dependent Variable: COST
Method: Least Squares
Date: 05/16/04  Time: 19:22
Sample: 1 74
Included observations: 74

Variable  Coefficient  Std. Error  t-Statistic  Prob.
C         -33612.55    23573.47    -1.425864    0.1583
N         331.4493     39.75844    8.336578     0.0000
OCC       133259.1     20827.59    6.398201     0.0000

R-squared           0.615637   Mean dependent var     187418.0
Adjusted R-squared  0.604810   S.D. dependent var     141969.9
S.E. of regression  89248.09   Akaike info criterion  25.67592
Sum squared resid   5.66E+11   Schwarz criterion      25.76933
Log likelihood      -947.0092  F-statistic            56.86072
Durbin-Watson stat  2.422989   Prob(F-statistic)      0.000000

The regression results can be rewritten in equation form, with the coefficients rounded:

$COST = -34,000 + 133,000\,OCC + 331N$

From this equation we can derive cost functions for the two types of school by setting OCC equal to 0 or 1.

Regular school (OCC = 0): $COST = -34,000 + 331N$

If OCC is equal to 0, we get the equation for regular schools. It implies that the marginal cost per student per year is 331 yuan and that the annual overhead cost is -34,000 yuan.

Obviously a negative overhead cost makes no sense, and it suggests that the model is misspecified in some way. We will come back to this later.

Occupational school (OCC = 1): $COST = -34,000 + 133,000 + 331N = 99,000 + 331N$

Putting OCC equal to 1, we estimate the annual overhead cost of an occupational school to be 99,000 yuan. The marginal cost is the same as for regular schools. It must be, given the model specification.

The scatter diagram shows the data and the two cost functions derived from the regression results.

We will perform a t test on the coefficient of the dummy variable. Our null hypothesis is $H_0: d = 0$, that is, that there is no difference in the overhead costs of the two types of school. The t statistic is 6.40, so the null hypothesis is rejected at the 0.1% significance level.
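As a rough check on the arithmetic in this sequence, the sketch below plugs the unrounded estimates from the regression output into the combined equation. The function and constant names (`occ_dummy`, `fitted_cost`, and so on) are illustrative assumptions, not part of the original lecture, and the code only re-evaluates reported numbers; it does not re-estimate the model.

```python
# Sketch only: the numbers are the estimates reported in the regression
# output above; all names are hypothetical and chosen for illustration.

INTERCEPT = -33612.55   # coefficient of C
SLOPE_N = 331.4493      # coefficient of N (marginal cost per student, yuan)
DELTA_OCC = 133259.1    # coefficient of OCC (extra overhead cost, yuan)
SE_OCC = 20827.59       # standard error of the OCC coefficient


def occ_dummy(school_type: str) -> int:
    """Return 1 for an occupational school and 0 for a regular school."""
    return 1 if school_type == "Occupational" else 0


def fitted_cost(n_students: int, school_type: str) -> float:
    """Fitted annual cost in yuan from the combined equation."""
    return INTERCEPT + DELTA_OCC * occ_dummy(school_type) + SLOPE_N * n_students


# Setting N = 0 isolates the intercept (overhead cost) of each cost function.
regular_intercept = fitted_cost(0, "Regular")
occupational_intercept = fitted_cost(0, "Occupational")

# t statistic for the null hypothesis that the OCC coefficient is zero.
t_occ = DELTA_OCC / SE_OCC

print(f"Regular overhead:      {regular_intercept:,.0f} yuan")
print(f"Occupational overhead: {occupational_intercept:,.0f} yuan")
print(f"t statistic for OCC:   {t_occ:.2f}")
```

Because the dummy only shifts the intercept, the fitted costs of the two school types differ by the OCC coefficient at every enrolment level, which is the parallel-lines assumption built into this specification.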
DUMMY CLASSIFICATION WITH MORE THAN TWO CATEGORIES

This sequence explains how to extend the dummy variable technique to handle a qualitative explanatory variable which has more than two categories.

$COST = b_1 + d_T\,TECH + d_W\,WORKER + d_V\,VOC + b_2 N + u$

In the previous sequence we used a dummy variable to differentiate between regular and occupational schools when fitting a cost function.

In actual fact there are two types of regular secondary school in Shanghai. There are general schools, which provide the usual academic education, and vocational schools. As their name implies, the vocational schools are meant to impart occupational skills as well as give an academic education. However, the vocational component of the curriculum is typically quite small and the schools are similar to the general schools. Often they are just general schools with a couple of workshops added.

Likewise there are two types of occupational school. There are technical schools training technicians and skilled workers' schools training craftsmen.

So now the qualitative variable has four categories. The standard procedure is to choose one category as the reference category and to define dummy variables for each of the others.

In general it is good practice to select the most normal or basic category as the reference category, if one category is in some sense more normal or basic than the others. In the Shanghai sample it is sensible to choose the general schools as the reference category. They are the most numerous and the other schools are variations of them.

Accordingly we will define dummy variables for the other three types. TECH will be the dummy for the technical schools: TECH is equal to 1 if the observation relates to a technical school, 0 otherwise. WORKER and VOC are defined in the same way for skilled workers' schools and vocational schools.

Each of the dummy variables will have a coefficient which represents the extra overhead costs of the schools, relative to the reference category.

Note that you do not include a dummy variable for the reference category, and that is the reason that the reference category is usually described as the omitted category.

If an observation relates to a general school, the dummy variables are all 0 and the regression model is reduced to its basic components. If an observation relates to a technical school, TECH will be equal to 1 and the other dummy variables will be 0. The regression model simplifies in a similar manner in the case of observations relating to skilled workers' schools and vocational schools.

General school (TECH = WORKER = VOC = 0): $COST = b_1 + b_2 N + u$
Technical school (TECH = 1; WORKER = VOC = 0): $COST = (b_1 + d_T) + b_2 N + u$
Skilled workers' school (WORKER = 1; TECH = VOC = 0): $COST = (b_1 + d_W) + b_2 N + u$
Vocational school (VOC = 1; TECH = WORKER = 0): $COST = (b_1 + d_V) + b_2 N + u$

[Figure: four parallel cost functions, COST against N, with intercepts $b_1$ (General), $b_1 + d_V$ (Vocational), $b_1 + d_W$ (Workers), and $b_1 + d_T$ (Technical).]

The diagram illustrates the model graphically. The $d$ coefficients are the extra overhead costs of running technical, skilled workers', and vocational schools, relative to the overhead cost of general schools.

Note that we do not make any prior assumption about the size, or even the sign, of the $d$ coefficients. They will be estimated from the sample data.

School  Type        COST     N    TECH  WORKER  VOC
1       Technical   345,000  623  1     0       0
2       Technical   537,000  653  1     0       0
3       General     170,000  400  0     0       0
4       Workers     526,000  663  0     1       0
5       General     100,000  563  0     0       0
6       Vocational   28,000  236  0     0       1
7       Vocational  160,000  307  0     0       1
8       Technical    45,000  173  1     0       0
9       Technical   120,000  146  1     0       0
10      Workers      61,000   99  0     1       0

Here are the data for the first 10 of the 74 schools. Note how the values of the dummy variables TECH, WORKER, and VOC are determined by the type of school in each observation.

The scatter diagram shows the data for the entire sample, differentiating by type of school.

Dependent Variable: COST
Method: Least Squares
Date: 05/16/04  Time: 20:32
Sample: 1 74
Included observations: 74

Variable  Coefficient  Std. Error  t-Statistic  Prob.
C         -54893.09    26673.08    -2.057996    0.0434
N         342.6335     40.21950    8.519090     0.0000
TECH      154110.9     26760.41    5.758915     0.0000
WORKER    143362.4     27852.80    5.147144     0.0000
VOC       53228.64     31061.65    1.713645     0.0911

R-squared           0.632050   Mean dependent var     187418.0
Adjusted R-squared  0.610719   S.D. dependent var     141969.9
S.E. of regression  88578.37   Akaike info criterion  25.68634
Sum squared resid   5.41E+11   Schwarz criterion      25.84202
Log likelihood      -945.3946  F-statistic            29.63132
Durbin-Watson stat  2.503728   Prob(F-statistic)      0.000000

The coefficient of N indicates that the marginal cost per student per year is 343 yuan.

The coefficients of TECH, WORKER, and VOC are 154,000, 143,000, and 53,000, respectively, and should be interpreted as the additional annual overhead costs, relative to those of general schools.

$COST = -55,000 + 154,000\,TECH + 143,000\,WORKER + 53,000\,VOC + 343N$

General school (TECH = WORKER = VOC = 0): $COST = -55,000 + 343N$
Technical school (TECH = 1; WORKER = VOC = 0): $COST = -55,000 + 154,000 + 343N = 99,000 + 343N$
Skilled workers' school (WORKER = 1; TECH = VOC = 0): $COST = -55,000 + 143,000 + 343N = 88,000 + 343N$
Vocational school (VOC = 1; TECH = WORKER = 0): $COST = -55,000 + 53,000 + 343N = -2,000 + 343N$

Note that in each case the annual marginal cost per student is estimated at 343 yuan. The model specification assumes that this figure does not differ according to type of school.

The four cost functions are illustrated graphically.

TWO SETS OF DUMMY VARIABLES

The explanatory variables in a regression model may include multiple sets of dummy variables. This sequence provides an example of a model with two such sets.

$COST = b_1 + d\,OCC + e\,RES + b_2 N + u$

We will continue with the school cost function model and extend it to take account of the fact that some of the schools are residential.

To model the higher overhead costs of residential schools, we introduce a dummy variable RES which is equal to 1 for them and 0 for non-residential schools. $e$ is the extra annual overhead cost of a residential school, relative to that of a non-residential one.

We will also make a distinction between occupational and regular schools, using the dummy variable OCC defined in the first sequence.

In the case of a non-residential occupational school, RES is 0 and OCC is 1, so the overhead cost increases by $d$. If the school is both occupational and residential, it increases by $(d + e)$.

Regular, non-residential (OCC = RES = 0): $COST = b_1 + b_2 N + u$
Regular, residential (OCC = 0; RES = 1): $COST = (b_1 + e) + b_2 N + u$
Occupational, non-residential (OCC = 1; RES = 0): $COST = (b_1 + d) + b_2 N + u$
Occupational, residential (OCC = RES = 1): $COST = (b_1 + d + e) + b_2 N + u$

[Figure: four parallel cost functions, COST against N, with intercepts $b_1$ (regular, non-residential), $b_1 + e$ (regular, residential), $b_1 + d$ (occupational, non-residential), and $b_1 + d + e$ (occupational, residential).]

The diagram illustrates the model graphically. Note that the effects of the different components of the model are assumed to be separate and additive in this specification.

Here are the data for the first 10 schools. Note how the values of the dummy variables vary according to the characteristics of the school.

School  Type          Residential?  COST     N    OCC  RES
1       Occupational  No            345,000  623  1    0
2       Occupational  Yes           537,000  653  1    1
3       Regular       No            170,000  400  0    0
4       Occupational  Yes           526,000  663  1    1
5       Regular       No            100,000  563  0    0
6       Regular       No             28,000  236  0    0
7       Regular       Yes           160,000  307  0    1
8       Occupational  No             45,000  173  1    0
9       Occupational  No            120,000  146  1    0
10      Occupational  No             61,000   99  1    0

Dependent Variable: COST
Method: Least Squares
Date: 05/16/04  Time: 21:06
Sample: 1 74
Included observations: 74

Variable  Coefficient  Std. Error  t-Statistic  Prob.
C         -29045.27    23291.54    -1.247031    0.2165
N         321.8330     39.40225    8.167884     0.0000
OCC       109564.6     24039.58    4.557674     0.0000
RES       57909.01     30821.31    1.878863     0.0644

R-squared           0.634090   Mean dependent var     187418.0
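The additive structure of the two-set model can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the coefficient values are copied from the regression output above, the three `schools` entries are the first three rows of the data table, and every name in the code (`COEF`, `overhead`, `dummies`, `intercepts`) is hypothetical. It evaluates the estimated overhead cost for each of the four combinations of OCC and RES.

```python
from itertools import product

# Estimates copied from the regression output above (yuan).
COEF = {"C": -29045.27, "N": 321.8330, "OCC": 109564.6, "RES": 57909.01}


def overhead(occ: int, res: int) -> float:
    """Estimated annual overhead cost for a given pair of dummy values."""
    return COEF["C"] + COEF["OCC"] * occ + COEF["RES"] * res


# First three schools in the table: (type, residential?) mapped to (OCC, RES).
schools = [("Occupational", "No"), ("Occupational", "Yes"), ("Regular", "No")]
dummies = [(int(kind == "Occupational"), int(resid == "Yes"))
           for kind, resid in schools]

# Intercept of the cost function for every combination of the two dummies.
intercepts = {(occ, res): overhead(occ, res)
              for occ, res in product((0, 1), repeat=2)}

for (occ, res), value in intercepts.items():
    print(f"OCC={occ} RES={res}: overhead = {value:,.2f} yuan")
```

In this specification the residential premium is the same for regular and occupational schools, so the gap between the residential and non-residential intercepts equals the RES coefficient in both cases.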