




已阅读5页,还剩13页未读, 继续免费阅读
版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
更多数学建模资料请关注微店店铺 数学建模学习交流 Team 52815Page 1 of 17 Mr Alpha Chiang We are writing to you regarding the investment strategy of the Goodgrant Foundation We have created a suffi cient model for dispersing the donations to colleges in a way that would enhance the Return on Investment RoI for Goodgrant and thus the benefi ts for the impacted students This model was developed due in large part to the analysis of data provided by the U S National Center on Education Statistics Considering the lack of an offi cial RoI measurement regarding the funding of schools we opted to create our own before model creation This measurement was a function of what we called Goodness obtained by and resources available to a given school Goodness was assessed as a school s contribution to the U S Gross Domestic Product GDP Resources were defi ned as a school s annual income from tuition and external funds This RoI was optimized by the model s simulations Jobs added by a given school were calculated by considering the unemployment rates for each major and the number of undergraduate degrees awarded by a school This was compared to the unemployment rate of those with bachelor s degrees in the respective majors across the United States A positive relationship was then found between the amount of jobs added and the change in GDP Therefore schools were assigned high goodness if they awarded a large amount of degrees in fi elds that were undersaturated in the U S economy The 500 000 000 that Goodgrant plans to donate is disbursed on a yearly basis in the analysis The simulations determine which schools to invest in and how much to invest in each school The time duration over which the money should be invested is also included These durations were a result of forecasted GDP and population data over the next fi ve years We fi rst computed an overall model by looking at the relationship between resources and goodness for all the schools on which we had data Then using this fi rst model as a starting point we developed individual models for each school That is to say we have an individual relationship between how much funding each school receives and how much it s able to contribute to the economy based on that income as well as a custom estimation for the rate at which decreasing marginal returns occur for each school Based on these data we estimated RoI over time by comparing the change in resources to the change in Goodness as a result of each respective investment Then we ran a number of simulations to allocate the individual investments that maximized the RoI These investments are allocated until the overall donation amount is exhausted leaving us with a plan to best disburse the grants every year After optimizing RoI 73 schools were given at least some sort of investment Most were small possibly underdeveloped schools that would benefi t greatly from a private contribution There are also a few highly developed schools that were recommended for investment such as Princeton The yearly investments in these schools are provided by the model The recommended investments output from the model optimize an acceptable metric for gauging the added value to the invested schools Multiple models were compared to one another to arrive at a concluded model for the recommended investments We are confi dent that this model will realize its full potential Sincerely MCM Team Members Team 52815Page 2 of 17 A Multilevel Bayesian Model based on Economic Contribution of University Alumni February 1 2016 Contents 1Introduction3 2Data Cleaning and Imputation Schema3 2 1Data Gathering Cleaning and Filtering 3 2 2Imputation Schema 4 3Model Components4 3 1Model Overview 5 3 2Contribution to GDP 5 3 3 Final Cross Sectional Model Specifi cation 8 3 4Funding Over Time 8 3 5Analysis of Linear Model Assumptions 8 4Solving the Optimization Problem10 4 1Minimal Equivalent Example 10 4 2Full Optimization Scheme 10 4 3Algorithm 11 5Problem Statement and Special Considerations11 6Model Analysis12 6 1Why Bayes 12 6 2Model Limitations 12 6 3Model Assumptions 12 6 4Model Strengths 13 6 5Potential Improvements 13 7Results from Model13 8Conclusions15 8 1Unconstrained Model 15 8 2Constrained Model 16 8 3Evaluation of Assumptions 16 Team 52815Page 3 of 17 1Introduction Starting in July 2016 the Goodgrant Foundation will provide an overall donation amount of 500 000 000 over the course of 5 years 100 000 000 per year to various schools This donation amount will be allotted to certain schools based on the school s return on investment RoI A school s projected RoI will determine whether or not the school will receive any investment whatsoever how much money the school will receive and the time duration over which the school will receive the money Solid investments will have a positive eff ect on overall student performance While it is acknowledged that in general investment in tertiary education has lower RoI than invest ment in primary education there are still important returns to expanding higher education especially in developed countries such as the United States 24 In particular we will show the impact that investment in universities can have on GDP Investing money in the development of skilled workers is not only benefi cial to the economy it is necessary for the US to compete globally We hope that this document will help to show how best that can be achieved Return on Investment was calculated as a function of the school s goodness and the school s resources Goodness and resources were determined using the provided data sets and external data sets Before review the data required various preparation techniques 2Data Cleaning and Imputation Schema Any serious statistical analysis begins with substantial work on the data The data provided had a panoply of issues ranging from unnatural variables structural and seemingly random missingness and impossible observations Their rectifi cation was prerequisite to model fi tting and analysis 2 1Data Gathering Cleaning and Filtering For our approach to evaluating RoI the gathering of unemployment measurements for each major that was referred to in the given data sets was required Multiple sources were referred to in collecting these measurements 6 12 19 25 26 It was also necessary to obtain average starting salary for each major 8 9 13 14 15 16 17 20 Initial data cleaning was performed in Python where we re organized variables into a more natural structure for example there were two tuition variables one for public and one for private schools as opposed to one tuition variable and one private public fl ag Certain schools were found to have negative average tuition We considered this to be measurement error and removed from consideration for funding any school reporting negative average tuition Next we analyzed the missingness structure in R The MI package short for Multiple Imputation 23 has many useful tools for dealing with missing data problems Among these is a heatmap of missing cases and observations clustered by missingness structure Figure 1 shows the initial missingness structure In Figure 1 the x axis represents cases the y variables the names of which are not intended to be legible There are a number of schools which are missing the majority of variables shown at the top of the missingness image These schools were not considered for funding due to the lack of information and were removed from the analysis There is a large block on the left representing the lack of SAT or ACT scores for a majority of schools These variables were not included in our analysis because of this lack of data for most schools On the right a strange missingness pattern is observed but this is due to the factors applying Team 52815Page 4 of 17 Figure 1 Initial Missingness Structure of Data either to public or private schools and is rectifi ed in the python script Any further missingness was handled by our imputation schema 2 2Imputation Schema We identifi ed two distinct kinds of missingness in the data and modeled it with two diff erent mecha nisms Firstly data simply missing from the model were assumed to be Missing Completely at Random MCAR BUGS treats missing values as parameters and conducts inference on them automatically As we are acting as Bayesians we must specify priors for them and choose to use a very disperse normal distribution centered at zero We were not comfortable making this same assumption with the Privacy Suppressed Data In reporting the data aggregators are required to censor data if they believe it may be possible to make reasonable inferences on personal data of any individual Oftentimes this is the case if the data represents an aggregation from very few people We believe that schools with few people may have diff erent characterisitcs than schools in general and so these were not assumed to be MCAR but Missing at Random MAR that is whether or not they are missing depends on other variables in our model These data were estimated using a multiple regression model depending on the complete data 21 3Model Components There were two means of assessing how contributions to a school would increase its goodness The fi rst was a measurement of how well the school is doing at satisfying the nation s needs with respect to the skills of its graduated students The second was simply a function of the median salary for that school Team 52815Page 5 of 17 The resources of a given school were calculated by estimating the annual tuition income and the annual funds added externally to the school 3 1Model Overview The data did not come with any kind of response there was no measurement of how deserving a school was of money As such we had to create our own which we will refer to as Goodness Return on Investment was defi ned as a change in Goodness We hypothesized that there would be a positive concave down relationship between the amount of resources available to a school and its Goodness The relationship was expected to be positive because in general a school with access to more money is expected to be able to produce better results and concave down due to the law of diminishing marginal returns Specifi cally resources are defi ned as the tuition per student as well as the total enrollment for the school and Goodness is a linear combination of a school s contribution to GDP defi ned below and the median salary of its alumni We fi rst computed an overall model Goodness ln Tuition Enrollment N 0 I Next we computed an individual model for each school The posterior for the model described above was given extra uncertainty variance and was used as a prior for each school 1 P Ms X L Ms Xs P Ms X This gave us an estimated Goodness or RoI Curve for each school Given these curves the idea was to maximize RoI through a change in per capita dollar amount theoretically added to each school s resources to change its eff ective tuition A more precise overview of how schools were chose based on this model may be found in the optimization section 3 2Contribution to GDP The model related the proportion of students in each course of study to GDP in a several step process The fi rst step was to identify how many jobs each school is expected to add to the economy This was calculated based on the unemployment in the fi elds which a university was preparing students to work in For example if there is higher than usual unemployment in Psychology we would rate colleges producing many Psychology majors as producing negative jobs which would lead to an increase in unemployment On the other hand if there is lower than average unemployment in Computer Science schools with many computer scientists would be rated as producing many jobs The calculations were conducted as follows DegreesAwarded is the proportion of degrees awarded Unemployment is the mean unemployment of all majors Unemployment is the standard deviation of employment for all majors Students DegreesAwarded Enrollment Team 52815Page 6 of 17 Unemployment Unemployment Unemployment Unemployment JobsAdded Students DegreesAwarded Unemployment If a school was found to have a net decrease in jobs it was removed from consideration for fund ing We next map jobs added to the economy to a change in the unemployment rate for each major This is simply defi ned as follows CurrentJobs LaborForce 1 Unemploymentinitial Unemploymentfinal 1 CurrentJobs JobsAdded LaborForce Unemployment Unemploymentfinal Unemploymentinitial It is well known that there is a relationship between real gross domestic product GDP and unemploy ment 4 and there exist complicated economic models relating the two 22 However we chose to fi t our own simple model to unemployment and GDP data spanning from 1948 to 2015 obtained from the Bureau of Economic Analysis 11 and the Bureau of Labor Statistics 2 respectively for use in our greater model We are not directly concerned with the relationship between GDP and unemployment but with the relationship between a change in GDP and a change in unemployment As such our model is as such ln GDP Unemployment N 0 I We use the natural log of a change in GDP because while reasonable values for GDP have greatly increased since 1948 reasonable values for unemployment have remained about the same So our model predicts not on an absolute change in GDP but a percent change in GDP when back transformed from log space ln GDP ln GDPfinal ln GDPinitial ln GDP ln GDPfinal GDPinitial eln GDP GDPfinal GDPinitial Team 52815Page 7 of 17 Figure 2 Regression of logGDP vs Unemployment But in our model we want an absolute change in GDP We obtain this using the following for mula GDP eMLEGDPinitial GDPinitial GDP eln GDPfinal GDPinitial GDPinitial GDPinitial GDP GDPfinal GDPinitial GDPinitial GDPinitial GDP GDPfinal GDPinitial The model fi t looks surprisingly good See in Figure 2 a scatterplot of the chart with regression line plotted This negative relationship was manipulated to evaluate the goodness that would be added by a possible Goodgrant contribution The jobs that are added to a particular fi eld decrease the overall unemployment rate and thus increase the GDP While keeping resources constant an increase in GDP leads to a positive ROI Team 52815Page 8 of 17 3 3 Final Cross Sectional Model Specifi cation Once the data were manipulated as desired in Python and R BUGS was called from within R to sample from the posterior distribution The model as specifi ed to BUGS was as such Goodness 0 1ln Price 2Enrollment 3ln Price Enrollment N 0 I N 0 0 00001 0 00001 0 00001 For Missing Completely at Random Data Salary N 0 0 00001 For Missing at Random Data Salary X Where X is a matrix of other covariates and is a vector of coeffi cients relating the two 3 4Funding Over Time In order to answer the question of when these funds should be dispersed we simply recomputed our model posteriors for diff erent values of economic data We used forecasted econometric 10 and population data 7 in our calculations We assumed that the percent of the US population making up the Labor Force would stay constant over the next fi ve years for our Job Added calculations We did not inject any other diff erent information into the model when computing the coeffi cients at diff erent times 3 5Analysis of Linear Model Assumptions We will in this section perform an analysis of the residuals of the regression model in order to determine whether any of the assumptions of linear models have been validated We fi rst consider a normal Quantile Quantile Plot Figure 3 It does not look perfectly normal contrary to the assumptions which lead to our posterior estimates but it is close enough This predicted versus residual plot is more worrisome Figure 4 We actually see a decrease in variance as the predicted value increases This plot shows that our model is not perfect The error of the prediction seems to increase as the predicted value increases Figure 5 This plot again serves as evidence against this model the most correct Team 52815Page 9 of 17 Figure 3 Q Q Plot for Main Model Figure 4 Residual vs Predicted Plot for Main Model Team 52815Page 10 of 17 Figure 5 Actual vs Predicted Plot for Main Model 4Solving the Optimization Problem Our model as described above is only part of our solution Given it we are able to calculate the most benefi cial distribution of available funds How this is done will be described here 4 1Minimal Equivalent Example Consider an example see Figure 6 in which we have only two schools competing for Goodgrant s funds School A has a coeffi cient of 1 to relate log of funding with Goodness while School B has a coeffi cient of 0 2 Both graphs above show what the change in Goodness is for a disbursement of 1 3 thousand dollars to each school would be Obviously the disbursement has a larger eff ect on the school on the left School A so we would choose to fund it over the school on the right School B This is essentially how our optimization problem works we iteratively check which school would most benefi t from a small disbursement until we have exhausted our funds The above example is just a simple case of our investments When considering our real model the problem becomes how much overall disbursement each school should be given from the Goodgrant Foundation to maximize RoI 4 2Full Optimization Scheme Our approach for the full optimization problem is simply to iteratively calculate the Goodness Resource slope for each school and choosing to fund the school which resulted in the largest increase in Goodness on Team 52815Page 11 of 17 Figure 6 2D Example of Optimizing Funding Distribution each iteration until all our funds are depleted 4 3Algorithm Below is the algorithm we have created for the optimization process We denotes r as resources s as number of students enrolled in a school and as the parameter matrix recall that this is matrix as opposed to a vector because each school has its own model The variable of interest Goodness is represented by g and constant money distributed denotes as m Algorithm 1 Initiate
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 不正当竞争行为的法律后果
- 算法交易对股市的影响研究
- 2025年编程原理试题及答案
- 奇葩语文拼音题目及答案
- 七年级数学下试卷及答案
- 2025年善意的谎言辩论反方资料
- CN222980409U 一种抗跌落型继电器 (四川宏发电声有限公司)
- CN120248362A 一种多孔配位聚合物及其制备方法与其在c3h6-c3h8混合气分离中的应用 (同济大学)
- 2025年职测题库及答案
- UPS服务安全培训课件
- GB/T 8948-1994聚氯乙烯人造革
- GB/T 11869-2018造船和海上结构物甲板机械远洋拖曳绞车
- 小学英语人教PEP六年级上册Unit3Myweekendplan击鼓传花小游戏
- 最新-骨髓炎-课件
- 主题班会《反对邪教-从我做起》
- 幕墙预埋件专项施工方案
- HDX8000系列安装配置操作指南
- 白虎汤分析课件
- 山东青年政治学院校徽校标
- 教学课件:《新能源材料技术》朱继平
- EDA课程第3~5章QuartusII Verilog HDL 数字电路设计实现
评论
0/150
提交评论