Data Analysis Method

The data analysis method most commonly used in management research is multiple regression analysis. However, multiple regression has two weaknesses (Li Huaizu, 2000):

(1) Management research can rarely avoid variables that cannot be observed directly, whereas multiple regression requires both the dependent and the independent variables to be measurable before the regression coefficients can be estimated. In this paper, for example, operation quality has to be described through accuracy, reliability, punctuality and so on. Structural equation modeling is widely used mainly because many of the variables involved in ordinary regression studies cannot be measured directly and accurately. Such variables are called latent variables (LV). Observable variables can be found to serve as “indicators” of these latent variables, so that the properties of the latent variables can be studied indirectly (Liu Jinlan, 2005).
Traditional statistical methods usually cannot handle problems involving latent variables effectively, whereas structural equation modeling is precisely a multivariate statistical method for testing the relationships between observed variables and latent variables, and among latent variables.

(2) Regression analysis has difficulty handling multicollinearity. Therefore, for systems with many independent variables, or with independent variables that are interrelated in complex ways, new data analysis methods need to be explored on the basis of multiple regression analysis. In current management research, especially when data are collected by questionnaire, structural equation modeling is a data analysis method that was developed to address these weaknesses of regression analysis and has already been widely applied (Li Huaizu, 2000). Given the characteristics of the relational model under study, this paper chooses structural equation modeling as its research tool.

The analysis and testing methods used in this paper include general linear correlation analysis, multiple regression analysis, principal component analysis and so on.
The statistical software used is SPSS 15, together with SmartPLS 2.0, a software package for partial least squares (PLS).

4.3.1 Introduction to Structural Equation Modeling

Structural equation modeling (SEM) is a linear statistical modeling technique proposed in the 1970s by the Swedish statistician Karl G. Joreskog, Dag Sorbom and other scholars. It integrates and improves on statistical methods such as exploratory factor analysis, confirmatory factor analysis, path analysis, multiple regression and analysis of variance. Over the past decade or so, SEM has become a very general and important linear statistical modeling technique, widely applied in economics, management, behavioral science and other fields. Familiar statistical methods such as multiple regression, factor analysis and path analysis are in fact just special cases of SEM. SEM remains a frontier research area in multivariate statistical analysis.

At present, there are two main classes of estimation techniques for solving structural equation models. One is covariance structure analysis based on maximum likelihood (ML) estimation, represented by the LISREL method (Anderson J.C., Rungtusanatham M., Schroeder R.G., et al., 1995).
The other is the analysis approach based on partial least squares, represented by the PLS method (Chin W.W., 1998). The former has been discussed extensively in China, but research on the latter is comparatively scarce. The analysis technique applied in this paper is partial least squares (PLS).

4.3.2 Introduction to PLS

Partial least squares (PLS) is regarded as a second-generation multivariate technique. It is a new type of multivariate statistical analysis technique and has become a common method for model parameter estimation in recent years (Herman Wold, 1992). PLS theory consists of two parts: PLS regression and PLS path modeling. Initially, PLS regression was applied mainly in chemical engineering. PLS path modeling is an extension of PLS regression, developed in the early 1980s by Herman Wold, Joreskog and others. Compared with PLS regression, PLS path modeling plays a more important role in econometrics, psychology, management behavior and related fields.

PLS regression has the advantages of being simple, subject to few restrictions and widely applicable. PLS methods based on component extraction are generally considered to have strong explanatory and predictive power. PLS is an iterative estimation procedure that combines principal component analysis with multiple regression.
It extracts principal components from the subset of manifest variables belonging to each latent variable, uses them in the system of regression models, and then adjusts the component weights to maximize the predictive power of the model. In addition, the greatest benefit of PLS is that it relies on nonparametric testing: for example, it can be used when normality and homoscedasticity of the variables cannot be guaranteed. PLS places no strict requirements on the distribution of the data and works with small samples. By contrast, the LISREL method, which is based on covariance fitting, places certain requirements on the data distribution, needs a sufficiently large sample, and relies on strict assumptions about the distribution of the variables.

PLS also has its own shortcomings. First, PLS is a “partial” least squares method, because each step of the estimation minimizes the residual variance of a subset of the parameters while the other parameters are held fixed. Although all residual variances are minimized jointly in the limit of convergence, PLS is still “partial” because it does not strictly minimize the total residual variance or any other overall optimality criterion. PLS computes latent variable scores by maximizing the reliability estimates of the measured variables and the R² of the latent variable regressions, which makes the PLS parameter estimates biased and greatly reduces the value of the latent variable scores.
In addition, because PLS underestimates the path coefficients between latent variables, it cannot reveal the relationships among latent variables very accurately (Dijkstra, 1983). The R² values of the exogenous latent variables obtained by the component-based algorithm (PLS) are smaller than those obtained by covariance-based SEM algorithms (Hsu, Sheng-Hsun et al., 2006). The PLS estimates of latent variable loadings tend to converge toward one another and are biased upward. PLS also provides no test of the model, and the structural relationship it gives between the explanatory variables and the dependent variable is too abstract and hard to interpret, so the exact quantitative relationship between them cannot be determined.

Nevertheless, because PLS imposes few restrictive assumptions and requires neither joint multivariate normality nor a large sample, it is considered suitable for the early stages of theory development. PLS has been applied successfully in marketing research, organizational behavior research and information systems research.

4.3.3 Why Choose PLS

PLS and LISREL are the two most widely used techniques in structural equation modeling, and opinions have long differed on which to choose. PLS is generally considered suitable in the following situations:

(1) The researcher is more concerned with predicting the latent variables from the measured variables than with the magnitude of the model's parameter estimates.
Although the PLS estimators are biased, they yield the optimal prediction of the latent variables from the measured variables.

(2) The data are skewed. PLS uses nonparametric inference methods (e.g. the jackknife) and does not require strict assumptions about the data (such as multivariate normality or homoscedasticity), whereas LISREL strictly assumes that the observations are independent and follow a multivariate normal distribution.

(3) The sample is small. PLS is a limited-information estimation method and needs a much smaller sample than LISREL, which is a full-information estimation method. A Monte Carlo simulation by Chin and Newsted (1999) showed that the sample size can be as small as 50.

(4) The structural equation model is large and complex. PLS converges very quickly and is computationally more efficient than LISREL.

(5) The structural model contains formative variables. LISREL can handle only reflective latent variables, not formative ones. In this paper, XX and XX are formative variables while XX and XX are reflective variables, so the model as a whole is a mixed model.
It therefore cannot be estimated with LISREL or AMOS. Most articles do not distinguish between formative and reflective variables, and such misuse leads to Type I and Type II errors in the statistics and thus invalidates the tests.

The main reasons for choosing PLS in this paper are:

(1) The main purpose of this paper is to identify the factors that influence outsourcing partnerships, so the focus is on explaining variance: on how well service quality explains the outsourcing relationship, rather than on how well the survey data fit the theoretical model.
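
As a rough illustration of the component-extraction idea described in section 4.3.2, the following Python sketch fits a single-response PLS regression with NumPy. It is only a minimal sketch of PLS regression, not the PLS path modeling carried out in SmartPLS for this paper; the function name pls1_fit, the simulated data and all parameter values are assumptions made for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS regression (PLS1) by successive deflation.

    Returns (coef, x_mean, y_mean) such that
    y_hat = (X - x_mean) @ coef + y_mean.
    Hypothetical helper written for this illustration only.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # work on centered data
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                        # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                     # nothing left for X to explain
            break
        w = w / norm
        t = Xk @ w                           # component scores
        tt = t @ t
        p = Xk.T @ t / tt                    # loadings of X on the component
        c = (yk @ t) / tt                    # regression of y on the component
        Xk = Xk - np.outer(t, p)             # deflate X
        yk = yk - c * t                      # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Map the component regressions back to coefficients on the centered X.
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean


# Simulated data (an assumption): three nearly collinear indicators
# of one unobserved factor, and a response driven by that factor.
rng = np.random.default_rng(0)
n = 40
latent = rng.normal(size=n)
X = np.column_stack([latent + 0.05 * rng.normal(size=n) for _ in range(3)])
y = 2.0 * latent + 0.1 * rng.normal(size=n)

coef, x_mean, y_mean = pls1_fit(X, y, n_components=1)
y_hat = (X - x_mean) @ coef + y_mean
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y_mean) ** 2)
print("coefficients:", np.round(coef, 3))
print("R^2 with one component:", round(r2, 3))
```

With three nearly collinear indicators of one underlying factor, a single extracted component should already explain most of the variance in y. This is the kind of setting in which ordinary regression coefficients become unstable because of multicollinearity.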
