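Test Theory
Fred Li, 2003

The Road to Science
The road to quantitative science: observation, experiment, measurement
- Define the psychological construct
- Decide on the unit of measurement
- Construct the measurement instrument
Basic condition: can the trait to be measured be quantified? That is, does it have order and additivity? (See Michell, 1990)
The line of thought in measurement:
- We dream before we think.
- We think before we point.
- We point before we count.
- We count before we rank.
- We rank before we define equal units.
- We define equal units before we seek a natural origin.

Test Theory: From Classical to Contemporary
Charles Spearman (1904) laid its foundation in a paper in which he introduced the decomposition of an observed score into a true score and an error and showed how to estimate the reliability of observed scores.
After more than 60 years of extension and derivation, Novick (1966) was finally able to set out a complete classical test theory (CTT).
Novick, M. R. (1966). The axioms and principal results of classical test theory. Journal of Mathematical Psychology, 3, 1-18.

Test Theory: CTT (1)
$$X = T + E$$
- $X$ = the observed test score
- $T$ = a hypothetical error-free true score
- $E$ = the random error associated with a true score
Further, items are assumed to be sampled from "universes" or "domains". Estimation of reliability and other parameters may be made using the algebra of linear sums.
See Nunnally (1979), pp. 190-224, and Suen (1990), pp. 27-39.

Test Theory: CTT (2)
Observed score = true score + error score: $X = T + E$
SEM for a sample:
$$s_e = s_x\sqrt{1 - r_{xx}} = s_x\sqrt{1 - r_{TX}^2}$$
SEM using population parameters:
$$\sigma_e = \sigma_X\sqrt{1 - \rho_{XX}} = \sigma_X\sqrt{1 - \rho_{TX}^2}$$
Note: $r$ is for a sample while $\rho$ is for a population. There is a tendency to use population parameters to denote reliability coefficients.
Confidence intervals (bands): 95% CI = $X \pm 1.96\,s_e$

Basic Assumption of CTT: Parallel Tests
If two tests have equal means ($\mu_A = \mu_B$) and equal variances ($\sigma_A^2 = \sigma_B^2$), equal correlations between observed scores and true scores ($r_{tA} = r_{tB}$), and zero covariance between their error scores ($\mathrm{Cov}(e_A, e_B) = 0$), then reliability can be estimated directly. Alternate-forms reliability and test-retest reliability are examples.
$$r_{AB} = \frac{\mathrm{Cov}(A, B)}{\sigma_A \sigma_B} = \frac{\mathrm{Cov}(t + e_A,\; t + e_B)}{\sigma_x^2} = \frac{\mathrm{Var}(t) + \mathrm{Cov}(t, e_A) + \mathrm{Cov}(t, e_B) + \mathrm{Cov}(e_A, e_B)}{\sigma_x^2} = \frac{\sigma_t^2}{\sigma_x^2}$$
The last step holds because true scores are uncorrelated with error scores and the two error scores are uncorrelated with each other.

The Meaning of Parallel Tests: CTT and IRT
In CTT, observed X IRT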
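The CTT quantities on the slides above (the SEM, the 95% band, and reliability as the correlation between parallel forms) can be checked numerically. The following is a minimal Python sketch, not part of the original slides. The function names, the score scale (mean 50), and the standard deviations used in the simulation are illustrative assumptions.

```python
import math
import random


def sem(sd_x, reliability):
    """Standard error of measurement: s_e = s_x * sqrt(1 - r_xx)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie in [0, 1]")
    return sd_x * math.sqrt(1.0 - reliability)


def ci95(observed, sd_x, reliability):
    """95% band around an observed score: X +/- 1.96 * SEM."""
    half_width = 1.96 * sem(sd_x, reliability)
    return observed - half_width, observed + half_width


def pearson_r(a, b):
    """Plain Pearson correlation between two equal-length score lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate_parallel_forms(n=20000, sd_true=8.0, sd_error=6.0, seed=0):
    """Simulate two parallel forms A = t + e_A and B = t + e_B.

    Both forms share the same true score t and have independent errors
    with equal variance, so r_AB should approximate
    sd_true**2 / (sd_true**2 + sd_error**2).
    The mean of 50 and the default SDs are assumptions for illustration.
    """
    rng = random.Random(seed)
    true_scores = [rng.gauss(50.0, sd_true) for _ in range(n)]
    form_a = [t + rng.gauss(0.0, sd_error) for t in true_scores]
    form_b = [t + rng.gauss(0.0, sd_error) for t in true_scores]
    return pearson_r(form_a, form_b)


if __name__ == "__main__":
    print("SEM:", sem(15.0, 0.91))
    print("95% band:", ci95(100.0, 15.0, 0.91))
    print("simulated r_AB:", simulate_parallel_forms())
```

With a standard deviation of 15 and a reliability of 0.91, the SEM is about 4.5, so the band around an observed score of 100 runs from roughly 91.2 to 108.8. The simulated correlation between the two forms should fall close to 64 / (64 + 36) = 0.64, which is the ratio of true-score variance to observed-score variance.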

does not. In IRT terminology, item/test bias is referred to as DIF/DTF.

Defining DIF and DTF
- DIF refers to a difference in the probability of endorsing an item for members of a reference group (e.g., US workers) and a focal group (e.g., Chinese workers) who have the same standing on theta.
- DTF refers to a difference in the test characteristic curves, obtained by summing the item response functions for each group.
- DTF is perhaps more important for selection because decisions are made based on test scores, not individual item responses.

DIF Examples
[Figure, two panels. x-axis: Theta (-3 to 3); y-axis: Prob. of Positive Response (0.0 to 1.0); curves: Reference, Focal]
- Uniform DIF Against Focal Group: reference group favored at all levels
- Nonuniform (Crossing) DIF: focal favored at low theta, reference favored at high theta

Testing for DIF/DTF
DIF
- Parametric: Lord's Chi-Square, Likelihood Ratio Test, Signed and Unsigned Area Methods
- Nonparametric: SIBTEST, Mantel-Haenszel
DTF
- Parametric: Raju's DFIT Method
- Nonparametric: SIBTEST

Lord's Chi-Square Test
$$\chi_i^2 = \mathbf{v}_i' \, \Sigma_i^{-1} \, \mathbf{v}_i$$
- $\mathbf{v}_i$ is a vector of the differences in the estimated item parameters for the $i$th item between the focal and reference groups.
- $\Sigma_i$ is the variance-covariance matrix for the differences in item parameter estimates.
- Lord's Chi-Square is sensitive to both uniform and nonuniform DIF.

Lord's Chi-Square Test (continued)
1. Estimate item parameters and covariances for focal and reference groups separately.
2. Obtain linking constants, A and K, for putting the focal and reference parameters on a common metric.
3. Compute Lord's chi-square to identify DIF items using the reference and transformed focal group parameters and their covariances.
4. Once the DIF items have been identified, re-equate the focal and reference group metrics using only the non-DIF items.
5. Repeat steps 2 through 4 until the same items are identified on consecutive trials.
This procedure is implemented in the program ITERLINK.

Using ITERLINK
ITERLINK is an interactive program that performs iterative linking for the 2PL and 3PL models using Lord's Chi-Square. It creates three output files:
- ITERLINK.DBG: DIF results and linking constants across iterations
- PAIRDIF.DBG: summary of DIF results
- User-named file: contains transformed focal parameters

ITERLINK.DBG
[Output listing. Surviving text: "ITEM", "P obs.", "DIF PRESENT using P .006"]
- On each run, the item with the largest DIF statistic is removed.
- DTF was eliminated after removing 10 items.

Detecting DIF/DTF Using SIBTEST
- Nonparametric method that can be used to examine individual items or groups of items
- Assumes only monotonicity
- Requires only item response data
- Works well with fairly small samples (250+)
- Several variations exist:
  - Original SIBTEST: uniform DIF
  - Crossing SIBTEST: nonuniform DIF
  - PolySIB: uniform DIF, polytomous data
  - MultiSIB: uniform DIF, multiple dimensions
- Discussed in Web Tutorial

Using SIBTEST
SIBTEST consists of two executable files:
- SIBIN.EXE: interactive, creates input file
- SIBTEST.EXE: performs DIF/DTF analyses
Choose "E" for either, "R" for reference, or "F" for focal group.
Detailed discussion of running SIBIN and SIBTEST is presented on
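The uniform and crossing DIF patterns and Lord's chi-square statistic described in the slides above can be illustrated with a small Python sketch. It is not ITERLINK's implementation and does not cover linking or the iterative purification steps. It makes several assumptions: a 2PL model with parameters ordered as (a, b), a scaling constant of 1.7, focal parameters already transformed to the reference metric, independent group samples (so the covariance matrix of the differences is the sum of the two groups' covariance matrices), and made-up parameter values. All function names are hypothetical.

```python
import math


def irf_2pl(theta, a, b, scale=1.7):
    """2PL item response function: P(positive response | theta).

    The scaling constant 1.7 is an assumption, not taken from the slides.
    """
    return 1.0 / (1.0 + math.exp(-scale * a * (theta - b)))


def dif_pattern(ref, focal, thetas):
    """Classify the reference-minus-focal probability gap over a theta grid.

    ref and focal are (a, b) tuples on a common metric.
    """
    gaps = [irf_2pl(t, *ref) - irf_2pl(t, *focal) for t in thetas]
    if all(abs(g) < 1e-9 for g in gaps):
        return "none"
    if all(g >= 0 for g in gaps) or all(g <= 0 for g in gaps):
        return "uniform"  # one group favored at every theta level
    return "nonuniform (crossing)"  # the favored group changes with theta


def lord_chi_square_2pl(ref, focal, cov_ref, cov_focal):
    """Lord's chi-square v' Sigma^-1 v for one 2PL item.

    v is the focal-minus-reference difference in (a, b).
    Sigma = cov_ref + cov_focal, assuming independent group samples.
    Returns (chi_square, p_value) with 2 degrees of freedom; for 2 df the
    chi-square survival function is exactly exp(-x / 2).
    """
    v0, v1 = focal[0] - ref[0], focal[1] - ref[1]
    s11 = cov_ref[0][0] + cov_focal[0][0]
    s12 = cov_ref[0][1] + cov_focal[0][1]
    s22 = cov_ref[1][1] + cov_focal[1][1]
    det = s11 * s22 - s12 * s12
    if det <= 0:
        raise ValueError("covariance matrix of differences is not positive definite")
    # Closed-form quadratic form using the inverse of a 2x2 symmetric matrix.
    chi2 = (s22 * v0 * v0 - 2.0 * s12 * v0 * v1 + s11 * v1 * v1) / det
    return chi2, math.exp(-chi2 / 2.0)


if __name__ == "__main__":
    grid = [x / 2 for x in range(-6, 7)]  # theta from -3 to 3 in steps of 0.5
    print(dif_pattern((1.0, 0.0), (1.0, 0.5), grid))  # same a, harder for focal
    print(dif_pattern((1.5, 0.0), (0.7, 0.0), grid))  # same b, different a
    cov = [[0.01, 0.0], [0.0, 0.01]]                   # illustrative values only
    print(lord_chi_square_2pl((1.0, 0.0), (1.0, 0.5), cov, cov))
```

In the first example the item has the same discrimination in both groups but is harder for the focal group, so the reference group is favored at every theta level. In the second the curves cross at theta = 0, with the focal group favored below it and the reference group above it, matching the two panels described on the DIF example slide.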
