




Reliability and Validity Designs
Evidence-based Chiropractic

Accurate and consistent measures are needed
In both research and clinical practice, it is very important to be able to measure patient characteristics accurately and consistently. Such measures are needed in clinical trials to assess differences between groups effectively, and in practice to help make clinical decisions and to track patients' progress.

Reliability
The ability of a test to provide consistent results when repeated, either by the same examiner or by more than one examiner testing the same attribute on the same group of subjects. Specific research designs are used to determine the degree to which tests are reliable.

Validity
The degree to which a test truly measures what it was intended to measure. With a valid test, when the characteristic being measured changes, corresponding changes occur in the test measurement. Tests with reduced validity do not reflect patient changes very well.

Measurement error
All measurements have some degree of error: Observed score = True score + Error.
In a group of subjects, scores vary because of individual differences between the subjects plus an error component. This results in a distribution of scores (hopefully normal).

Random errors
Errors that are attributable to the examiner, the subject, or the measuring instrument. They have little effect on the group's mean score because the errors are just as likely to be high as they are low. For example, blood pressure, which varies depending on a number of factors.

Systematic errors
Errors that cause scores to move in only one direction in response to a factor that has a constant effect on the measurement system. They are considered a form of bias. For example, a sphygmomanometer that is out of calibration and always generates high BP readings.

Error components
[Figure not recovered.]

Estimating reliability
Reliability is estimated as the true score variance divided by the observed score variance.
True score variance: real differences between subjects' scores due to biologically different people.
Observed score variance: the true score variance plus the error variance, which is the portion of variability that is due to faults in measurement.

Observed score variance
[Figure not recovered.]

The reliability coefficient
Becomes larger (increased reliability) as error variance gets smaller.
Equals 1.0 when error variance is 0.0.
Becomes smaller (decreased reliability) as error variance gets larger.

Interpretation of the reliability coefficient
A reliability coefficient of 0.75 means that 75% of the variance in the scores is due to the true variance of the trait being measured and 25% is due to error variance.
The coefficient ranges from 0.0 (no reliability) to 1.0 (perfect reliability).
Implications:
0.75 or greater: good reliability
0.5 to 0.75: moderate reliability
Below 0.5: poor reliability

Inter-examiner reliability
When 2 or more examiners test the same subjects for the same characteristic using the same measure, their scores should match. Inter-examiner reliability is the degree to which their findings agree.

Intra-examiner reliability
Scores should also match when the same examiner tests the same subjects on two or more occasions. Intra-examiner reliability is the degree to which the examiner agrees with himself or herself.

Quantifying inter-examiner and intra-examiner reliability
Correlation: there should be a high degree of correlation between the scores of 2 examiners testing the same group of subjects, or of 1 examiner testing the same group on 2 occasions.
However, it is possible to have good correlation together with poor agreement. This occurs when 1 examiner consistently scores subjects higher or lower than the other examiner.

Test-retest reliability
A test is administered to the same group of subjects on more than one occasion (stable conditions only). Test scores should be consistent when repeated and should correlate well. Test-retest reliability is used to assess self-administered questionnaires, which are not directly controlled by the examiner.

Parallel forms reliability (a.k.a. alternate forms reliability)
Two versions of a questionnaire or test that measure the same construct are compared. Both versions are administered to the same subjects, and the scores are compared to determine the level of correlation.

Internal consistency reliability
The degree to which each of the items in a questionnaire measures the targeted construct. All questions should measure various characteristics of the construct and nothing else.

2 X 2 contingency table to compare results of examiners
Useful for visualizing the results of two examiners who are evaluating the same group of patients. Inter-examiner reliability articles often present their findings in the form of a 2 X 2 contingency table. If not, one is fairly easy to create from the data presented in the article.
[Table not recovered; surviving labels: Rater 1; Agreements: a & d; Disagreements: b & c.]

The kappa statistic (κ)
Agreement between examiners evaluating the same patients can be represented by the percentage of agreement of paired ratings. However, percentage of agreement does not account for agreement that would be expected to occur by chance.
Even with unreliable measures, a few agreements are expected to occur just by chance. Only agreement that occurs beyond chance levels represents true agreement, and this is what the kappa statistic represents. It is appropriate for use with dichotomous or nominal data.
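
As a minimal sketch of chance-corrected agreement, the Python function below computes kappa for a 2 X 2 table of two examiners' dichotomous ratings. It assumes the standard Cohen's kappa definition, (PO - PC) / (1 - PC), with chance agreement PC taken from each rater's marginal totals. The function name, the cell layout, and the counts are hypothetical and do not come from the slides.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Cohen's kappa for two raters and a dichotomous rating.

    Cell layout (an assumption for this sketch):
        a = both raters positive, d = both raters negative (agreements)
        b, c = the raters disagree
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("table is empty")
    p_o = (a + d) / n  # observed agreement (PO)
    # Chance agreement (PC) from the raters' marginal proportions
    p_c = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    if p_c == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_c) / (1 - p_c)


# Hypothetical counts, for illustration only
kappa = cohen_kappa_2x2(a=20, b=5, c=10, d=15)
print(f"kappa = {kappa:.2f}")
```

With these made-up counts the raters agree on 70% of patients, but half of that agreement is expected by chance, so kappa is considerably lower than the raw percentage of agreement.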

The kappa statistic (cont.)
Observed agreement (PO) is the total proportion of observations on which the examiners agree. Chance agreement (PC) is the proportion of agreement expected by chance alone. The values of PO and PC are then used in the following formula to calculate the kappa statistic:
$\kappa = \frac{P_O - P_C}{1 - P_C}$
When the amount of observed agreement exceeds chance agreement, kappa is positive, and the strength of agreement is determined by the magnitude of kappa. If kappa is negative, agreement is less than would be expected by chance.

Interpretation of kappa values
[Table not recovered.]

Kappa example
Reliability of McKenzie classification of patients with cervical or lumbar pain:
50 spinal pain patients (25 lumbar and 25 cervical) were simultaneously assessed by 2 physical therapists (14 in total) to classify patients into syndromes and subsyndromes.
κ = 0.84 for syndrome classification
κ = 0.87 for subsyndrome classification

Intraclass Correlation Coefficient (ICC)
Another measure of inter-examiner reliability, for use with continuous variables. It can be used to evaluate 2 or more raters. Pearson's r can also be used, but ICC is preferred when the sample size is small (< 15) or more than two tests are involved.
There are three models of ICC, each of which may use one of two different forms. Thus there are 6 possible types of ICC, depending on how raters are chosen and how subjects are assigned. The type of ICC used should always be reported in research papers: the first number represents the ICC model and the second represents the form used.
For example, Clare et al. reported on the reliability of detection of lumbar lateral shift and found it to be moderate, with ICC(2,1) values (model 2, form 1) ranging from 0.48 to 0.64.

ICC is an index of reliability
It can range from below 0.0 to +1.0, with 0.0 indicating weak reliability and 1.0 strong reliability.
Suggested interpretation: [table not recovered.]
Some clinical measures require 0.90.

ICC is based on variance
ICC is the ratio of between-groups variance to total variance, where:
Between-groups variance is due to different subjects having test scores that truly differ.
Total variance also includes score differences resulting from inter-rater unreliability, when two or more examiners rate the same person.
Two-way ANOVA is used to calculate ICC.
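
To make the variance ratio concrete, here is a minimal Python sketch that derives ICC(2,1) from the mean squares of a two-way ANOVA. It assumes the usual Shrout and Fleiss formulation for model 2, form 1 (two-way random effects, absolute agreement, single rating). The function name and the scores are hypothetical, and the slides do not specify any implementation.

```python
def icc_2_1(scores):
    """ICC(2,1) from a complete table: one row per subject, one column per rater."""
    n = len(scores)      # number of subjects
    k = len(scores[0])   # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with >= 2 subjects and >= 2 raters")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    ss_err = ss_total - ss_rows - ss_cols                   # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Hypothetical data: rater 2 always scores 2 points higher than rater 1
offset_scores = [[1, 3], [2, 4], [3, 5]]
print(f"ICC(2,1) = {icc_2_1(offset_scores):.2f}")
```

In this made-up example the two raters' scores are perfectly correlated, yet the constant offset between them pulls ICC(2,1) well below 1.0, which illustrates how good correlation can coexist with poor agreement.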

Validity
The ability of tests and measurements to evaluate the traits that they were intended to evaluate. It is vital in research as well as in clinical practice. The extent of a test's validity depends on the degree to which systematic error has been controlled for.
The greater the validity, the more likely it is that test results reflect true differences between scores and not systematic error. Validity is a matter of degree, not black and white: it is technically incorrect to say a test is “valid” or “invalid”, and better to use categories such as highly valid or moderately valid.
Test validity depends on the test's intended purpose. For example, a hand-grip dynamometer is valid for measuring grip strength, but not for measuring the qualities of hand tremor.
An invalid test can still be reliable. For example, a test that used skull circumference to predict intelligence would probably have excellent reliability, but it would not be a valid predictor of intelligence. An unreliable test, however, can never be considered valid.

Methods to estimate the extent of test validity
These can be divided into 3 major categories:
Self-evident: does the test appear to measure what it is supposed to measure?
Pragmatic: does the test actually work as hypothesized?
Construct validity: does the test adequately measure the theoretical construct involved?

Self-evident methods
Face validity: simply deciding whether a test appears to have merit based on “face value”. For example, if a headache questionnaire asked about the location of head pain, it would have face validity; if it asked about hair color, it probably would not. This is the lowest level of test validation and is often assessed when researchers are first exploring a topic.
Content validity: the ability of a test to include or represent all of the content of a construct. By another definition, the content of a test is compared to the literature that is already available on the topic, and the test is said to have good content validity if it accurately reflects what is in the literature.

Pragmatic methods
Criterion-related validity: the degree to which a test corresponds with an external criterion that is an independent measure of the characteristic being tested. A criterion is the standard by which a measure is judged, and a valid test should correlate well with or predict some relevant criterion. Concurrent and predictive validity are subgroups of criterion-related validity.

Pragmatic methods (cont.)
Concurrent validity: the results of a new test are compared with those of an established test (the gold standard) to see whether they are well correlated. Both tests are given at the same time. For example, a study might compare a clinical test for detecting spondylolisthesis with x-ray findings.
Gold standard test (a.k.a. reference standard): a test that is generally acknowledged to be the best available. The value of a concurrent validity trial depends greatly on the quality of the gold standard that is used.

Pragmatic methods (cont.)
Construct validity: the extent to which a test effectively measures a theoretical construct, such as pain or disability. The characteristic is not observed directly. Rather, an abstraction of the characteristic that corresponds to the construct under consideration is observed, e.g., a pain scale or disability questionnaire.
Construct validity can be thought of as the accumulation of evidence that points to the ability of a test to actually measure what it claims to measure. That evidence is accumulated by establishing some of the other types of validity, and the validity of a test is supported if the results of these studies agree with one another.
Construct validity is determined by comparing a new test with other tests that measure a similar construct. Another way to evaluate it is to compare the new test with other tests that are different, but related, with which it should not correlate well.
Convergent validity: the degree of correlation that exists between a new test and another measure of the same or a similar construct. A test with good convergent validity correlates well with another measure of the same construct.
Discriminant validity: the opposite of convergent validity, where the new test is weakly related or unrelated to another measure from which it should in fact differ. A test with good discriminant validity should be able to separate patients into different groups, e.g., normal vs. abnormal.

The concept of validity and reliability
Validity and reliability can be compared with scores on a target.
Scores may be systematically off center. This results from bias: the test environment is faulty, causing all scores to be inaccurate, and scores miss the bull's eye in one direction.
Scores may be randomly off center, missing the bull's eye in any direction. This is caused by random error, which affects some subjects but not others.
Accurate tests are free from bias. Precise tests are free from random error.

Accuracy and precision
An accurate and precise test hits the bull's eye and is tightly grouped.
An inaccurate test systematically misses the bull's eye in one direction.
An imprecise test misses the bull's eye randomly.

Cutoff points
Test results involving ordinal or continuous measures are often converted to a dichotomous scale (dichotomized). This is achieved by establishing a cutoff point at a specified value: scores above the value are considered positive, and scores below it are negative.

The ideal diagnostic test
Would always correctly discriminate between those with and those without the condition: always positive for those with the condition, and always negative for those without it.

Real-world test
[Figure not recovered; surviving labels: False negatives, False positives.]

Sensitivity and Specificity
Commonly used to assess the validity of tests.
Sensitivity: the ability of a test to correctly identify people who have the target disorder.
Specificity: the ability of a test to correctly identify people who do not have the target disorder.

Implications of sensitivity & specificity
In tests with low sensitivity, people with the target disorder will be missed (false negatives).
In tests with low specificity, people who do not actually have the target disorder will be identified as having it (false positives).

Sensitivity and Specificity (cont.)
Both are expressed as a percentage: 0% represents no sensitivity or specificity, and 100% is perfect sensitivity or specificity. A 2 X 2 contingency table can be used to calculate these indices.
[Table and formulas not recovered; surviving label: Test Result.]

SnOUT (Sensitivity rules OUT)
In tests that have very high sensitivity, a negative test will rule out the condition under consideration. This is because there are very few false negatives in such tests: if a test with very high sensitivity is negative, it is very likely a true negative.

SpIN (Specificity rules IN)
In tests that have very high specificity, a positive test will rule in the condition under consideration. This is because there are very few false positives in such tests: if a test with very high specificity is positive, it is very likely a true positive.

The cutoff point influences a test's sensitivity & specificity
Higher scores point to a worsening condition.
If the cutoff point is raised, specificity increases, but there are more false negatives.
If the cutoff point is lowered, sensitivity increases, but there are more false positives.
Because increasing sensitivity will decrease specificity, and increasing specificity will decrease sensitivity, the cutoff point that is set depends on whether it is best to maximize sensitivity at the expense of specificity, or specificity at the expense of sensitivity.

Receiver Operating Characteristic (ROC) curves
An ROC curve graphically depicts the tradeoff between sensitivity and specificity. For accurate tests, the curve closely follows the left-hand border and the top border of the ROC space. For less accurate tests, the curve is closer to the 45-degree diagonal of the ROC space.
[Figure not recovered: ROC curves.]
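
The tradeoff traced by an ROC curve can be sketched in a few lines of Python. The example assumes the standard definitions (sensitivity = true positives / all subjects with the condition, specificity = true negatives / all subjects without it) and the convention stated above that scores above the cutoff count as positive. The function name and the data are hypothetical.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Dichotomize scores at `cutoff` and compare them with the reference standard.

    Assumes at least one subject with and one without the condition.
    """
    tp = fp = tn = fn = 0
    for score, diseased in zip(scores, has_condition):
        positive = score > cutoff  # scores above the cutoff are positive
        if positive and diseased:
            tp += 1
        elif positive and not diseased:
            fp += 1
        elif not positive and diseased:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # share of people with the condition detected
    specificity = tn / (tn + fp)  # share of people without it correctly cleared
    return sensitivity, specificity


# Hypothetical data; higher scores point to a worsening condition
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
has_condition = [False, False, False, True, False, True, False, True, True, True]

for cutoff in (2.5, 4.5, 6.5, 8.5):
    sens, spec = sensitivity_specificity(scores, has_condition, cutoff)
    # Each (1 - specificity, sensitivity) pair is one point on the ROC curve
    print(f"cutoff {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

With these made-up scores, each increase in the cutoff raises specificity and lowers sensitivity, and plotting sensitivity against 1 - specificity for every cutoff gives the points of the ROC curve.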