
1. Reliability and Validity Designs
Evidence-based Chiropractic

2. Accurate and consistent measures are needed
- In research and clinical practice, it is very important to be able to measure patient characteristics accurately and consistently
  - Needed in clinical trials to effectively assess differences between groups
  - Needed in practice to help make clinical decisions and to track patients' progress

3. Reliability
- The ability of a test to provide consistent results when repeated
  - By the same examiner
  - Or by more than one examiner testing the same attribute on the same group of subjects
- Specific research designs are used to determine the degree to which tests are reliable

4. Validity
- The degree to which a test truly measures what it was intended to measure
- Valid tests: when the characteristic being measured changes, changes occur in the test measurement
- Tests with reduced validity do not reflect patient changes very well

5. Measurement error
- All measurements have some degree of error
- Observed score = True score + Error
- In a group of subjects, variation of scores occurs because of
  - Individual differences of the subjects
  - Plus an error component
- This results in a distribution (hopefully normal)

6. Random errors
- Errors that are attributable to the examiner, the subject, or the measuring instrument
- Have little effect on the group's mean score because the errors are just as likely to be high as they are low
- For example, blood pressure, which varies depending on a number of factors

7. Systematic errors
- Errors that cause scores to move in only one direction in response to a factor that has a constant effect on the measurement system
- Considered to be a form of bias
- For example, a sphygmomanometer that is out of calibration and always generates high BP readings

8. Error components
- [graphic not preserved]

9. Estimating reliability
- Reliability is estimated as true score variance divided by observed score variance
- True score variance: real differences between subjects' scores due to biologically different people
- Observed score variance: true score variance plus error variance, the portion of variability that is due to faults in measurement

10. Observed score variance
- [graphic not preserved]

11. The reliability coefficient
- Becomes larger (increased reliability) as error variance gets smaller
  - Equals 1.0 when error variance is 0.0
- Becomes smaller (decreased reliability) as error variance gets larger

12. Interpretation of the reliability coefficient
- A reliability coefficient of 0.75 means that 75% of the variance in the scores is due to the true variance of the trait being measured and 25% is due to error variance

13. Interpretation of the reliability coefficient (cont.)
- Ranges from 0.0 to 1.0
  - 0.0 represents no reliability and 1.0 perfect reliability
- Implications
  - 0.75 or greater: good reliability
  - 0.5 to 0.75: moderate reliability
  - Below 0.5: poor reliability

14. Inter-examiner reliability
- When 2 or more examiners test the same subjects for the same characteristic using the same measure, scores should match
- Inter-examiner reliability is the degree to which their findings agree

15. Intra-examiner reliability
- Scores should also match when the same examiner tests the same subjects on two or more occasions
- Intra-examiner reliability is the degree to which the examiner agrees with himself or herself

16. Quantifying inter-examiner and intra-examiner reliability
- Correlation
  - There should be a high degree of correlation between the scores of 2 examiners testing the same group of subjects, or of 1 examiner testing the same group on 2 occasions
  - However, it is possible to have good correlation and, at the same time, poor agreement
  - This occurs when 1 examiner consistently scores subjects higher or lower than the other examiner

17. Test-retest reliability
- A test is administered to the same group of subjects on more than one occasion (stable conditions only)
  - Test scores should be consistent when repeated
  - Test scores should correlate well
- Test-retest reliability is used to assess self-administered questionnaires, which are not directly controlled by the examiner

18. Parallel forms reliability (a.k.a. alternate forms reliability)
- Two versions of a questionnaire or test that measure the same construct are compared
  - Both versions are administered to the same subjects
  - Scores are compared to determine the level of correlation

19. Internal consistency reliability
- The degree to which each of the items in a questionnaire measures the targeted construct
- All questions should measure various characteristics of the construct and nothing else

20. 2 X 2 contingency table to compare results of examiners
- Useful to visualize the results of two examiners who are evaluating the same group of patients
- Inter-examiner reliability articles often present their findings in the form of a 2 X 2 contingency table
  - If not, the table is fairly easy to create from the data presented in the article

21. 2 X 2 contingency table (cont.)
- [table graphic not preserved; surviving labels: Rater 1]
- Agreements: cells a and d
- Disagreements: cells b and c

22. The kappa statistic (κ)
- Agreement between examiners evaluating the same patients can be represented by the percentage of agreement of paired ratings
- However, percentage of agreement does not account for agreement that would be expected to occur by chance

23. The kappa statistic (cont.)
- Even using unreliable measures, a few agreements are expected to occur just by chance
- Only agreement that occurs beyond chance levels represents true agreement
  - This is what is represented by the kappa statistic
- It is appropriate for use with dichotomous or nominal data

24. The kappa statistic (cont.)
- Observed agreement (PO) is the total proportion of observations where there is agreement
  - In the 2 X 2 table, PO = (a + d) / (a + b + c + d)

25. The kappa statistic (cont.)
- The values of PO and PC (the proportion of agreement expected by chance) are then used in the following formula to calculate the kappa statistic
  - κ = (PO - PC) / (1 - PC)
- When the amount of observed agreement exceeds chance agreement, kappa will be positive
  - The strength of agreement is determined by the magnitude of kappa
- If kappa is negative, agreement is less than chance

26. Interpretation of kappa values
- [table graphic not preserved]

27. Kappa example
- Reliability of McKenzie classification of patients with cervical or lumbar pain
- 50 spinal pain patients (25 lumbar and 25 cervical) were simultaneously assessed by 2 physical therapists (14 in total) to classify patients into syndromes and subsyndromes
  - κ = 0.84 for syndrome classification
  - κ = 0.87 for subsyndrome classification

28. Intraclass Correlation Coefficient (ICC)
- Another measure of inter-examiner reliability, for use with continuous variables
- Can be used to evaluate 2 or more raters
- Pearson's r can be used
  - But ICC is preferred when sample size is small (< 15) or more than two tests are involved

29. ICC (cont.)
- There are three models of ICC, each of which may use one of two different forms
  - Thus, 6 possible types of ICC, depending on how raters are chosen and how subjects are assigned
- The type of ICC used should always be presented in research papers
  - The first number represents the ICC model
  - The second represents the form used

30. ICC (cont.)
- For example, Clare et al reported on the reliability of detection of lumbar lateral shift and found it to be moderate
  - ICC(2,1) values ranging from 0.48 to 0.64 (2 = model, 1 = form)

31. ICC is an index of reliability
- Can range from below 0.0 to +1.0
  - With 0.0 indicating weak reliability
  - And 1.0 indicating strong reliability
- Suggested interpretation
  - Some clinical measures require at least 0.90

32. ICC is based on variance
- ICC is the ratio of between-groups variance to total variance, where
  - Between-groups variance is due to different subjects having test scores that truly differ
  - Total variance also includes score differences resulting from inter-rater unreliability of two or more examiners rating the same person
- Two-way ANOVA is used to calculate ICC
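
As a rough illustration of the kappa and ICC slides, the sketch below computes both statistics in plain Python. It assumes the 2 X 2 cell labels used above (a and d are agreements, b and c are disagreements) and uses the two-way ANOVA mean-squares expression for ICC(2,1) that is commonly attributed to Shrout and Fleiss. The function names and all data values are made up for illustration and do not come from the slides.

```python
def cohen_kappa(a, b, c, d):
    """Kappa for a 2 x 2 table; a and d are agreements, b and c disagreements."""
    n = a + b + c + d
    p_o = (a + d) / n  # observed agreement (PO)
    # Chance agreement (PC), built from the row and column totals of the table.
    p_c = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_o - p_c) / (1 - p_c)


def icc_2_1(ratings):
    """ICC(2,1) from a list of rows, one row per subject, one score per rater."""
    n = len(ratings)      # number of subjects
    k = len(ratings[0])   # number of raters
    grand = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    rater_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares: subjects, raters, and residual error.
    ss_subjects = k * sum((m - grand) ** 2 for m in subject_means)
    ss_raters = n * sum((m - grand) ** 2 for m in rater_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_subjects - ss_raters

    bms = ss_subjects / (n - 1)           # between-subjects mean square
    jms = ss_raters / (k - 1)             # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Hypothetical table: 35 agreements out of 50 paired ratings.
print(cohen_kappa(20, 5, 10, 15))  # about 0.4

# Hypothetical scores: rater 2 always scores 2 points higher than rater 1.
offset = [[1, 3], [2, 4], [3, 5], [4, 6], [5, 7]]
print(icc_2_1(offset))  # about 0.56, even though the two raters correlate perfectly
```

The second call mirrors the point made on the correlation slide: a constant offset between examiners leaves the correlation perfect, but an agreement-based index such as ICC(2,1) drops well below 1.0.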

33. Validity
- The ability of tests and measurements to in fact evaluate the traits that they were intended to evaluate
- Vital in research, as well as in clinical practice
- The extent of a test's validity depends on the degree to which systematic error has been controlled for

34. Validity (cont.)
- The greater the validity, the more likely test results will reflect true differences between scores and not systematic error
- It's a matter of degree, not black-and-white
  - Technically incorrect to say a test is “valid” or “invalid”
  - Better to use categories like highly valid, moderately valid, etc.

35. Validity (cont.)
- Test validity depends on its intended purpose
- For example, a hand-grip dynamometer is valid to measure grip strength, but it is not valid to measure the qualities of hand tremor

36. Validity (cont.)
- An invalid test can still be reliable
  - For example, a test that used skull circumference to predict intelligence
  - Reliability would probably be excellent, but it would not be a valid predictor of intelligence
- But an unreliable test can never be considered valid

37. Methods to estimate the extent of test validity
- Can be divided into 3 major categories
  - Self-evident: does the test appear to measure what it is supposed to measure?
  - Pragmatic: does the test actually work as hypothesized?
  - Construct validity: does the test adequately measure the theoretical construct involved?

38. Self-evident methods
- Face validity
  - Simply deciding whether a test appears to have merit based on “face value”
  - e.g., if a headache questionnaire asked about the location of head pain, it would have face validity
  - If it asked about hair color, it probably would not
- The lowest level of test validation
- Often assessed when researchers are first exploring a topic

39. Self-evident methods (cont.)
- Content validity
  - The ability of a test to include or represent all of the content of a construct
- Another definition for content validity
  - The content of a test is compared to the literature that is already available on the topic
  - The test is said to have good content validity if it accurately reflects what is in the literature

40. Pragmatic methods
- Criterion-related validity
  - The degree to which a test corresponds with an external criterion that is an independent measure of the characteristic being tested
  - A criterion is the standard by which a measure is judged
  - A valid test should correlate well with or predict some relevant criterion
- Concurrent and predictive validity are subgroups of criterion-related validity

41. Pragmatic methods (cont.)
- Concurrent validity
  - The results of a new test are compared with an established test (gold standard) to see if they are well correlated
  - Both tests are given at the same time
  - For example, a study that compares a clinical test to detect spondylolisthesis with x-ray findings

42. Pragmatic methods (cont.)
- Gold standard test (a.k.a. reference standard)
  - A test that is generally acknowledged to be the best available
- The value of a concurrent validity trial depends greatly on the quality of the gold standard that is used

43. Pragmatic methods (cont.)
- Construct validity
  - The extent to which a test effectively measures a theoretical construct, like pain or disability
- The characteristic is not observed directly
  - Rather, an abstraction of the characteristic that corresponds to the construct under consideration is observed
  - e.g., a pain scale or disability questionnaire

44. Pragmatic methods (cont.)
- Construct validity can be thought of as the accumulation of evidence that points to the ability of a test to actually measure what it claims to measure
- It involves the accumulation of evidence by establishing some of the other types of validity
- The validity of a test is supported if the results of these studies agree with one another

45. Pragmatic methods (cont.)
- Construct validity is determined by comparing a new test with other tests that measure a similar construct
- Another way to evaluate construct validity is to compare the new test with other tests that are different, but related, which should not correlate well

46. Pragmatic methods (cont.)
- Convergent validity
  - Has to do with the degree of correlation that exists between a new test and another measure of the same or similar constructs
  - A test that has good convergent validity correlates well with another measure of the same construct

47. Pragmatic methods (cont.)
- Discriminant validity
  - The opposite of convergent validity, where the new test is weakly related or unrelated to another measure that it should in fact be different from
  - A test with good discriminant validity should be able to separate patients into different groups, e.g., normal vs. abnormal

49. The concept of validity and reliability
- Can be compared with scores on a target
- Scores may be systematically off center
  - Results from bias
  - The test environment is faulty, causing all scores to be inaccurate
  - Scores miss the bull's-eye in one direction
- Scores may be randomly off center
  - Scores miss the bull's-eye in any direction

50. The concept of validity and reliability (cont.)
- When test scores miss the bull's-eye in any direction, it is caused by random error
  - Some subjects are affected while others are not
- Accurate tests are free from bias
- Precise tests are free from random error

51. Accuracy and precision
- An accurate and precise test hits the bull's-eye and is tightly grouped
- An inaccurate test systematically misses the bull's-eye in one direction
- An imprecise test misses the bull's-eye randomly

52. Cutoff points
- Test results involving ordinal or continuous measures are often converted to a dichotomous scale (dichotomized)
- Achieved by establishing a cutoff point at a specified value
  - Scores above the specified value are considered positive
  - Scores below the value are negative

53. The ideal diagnostic test
- Would always correctly discriminate between those with and those without the condition
  - Always positive for those with the condition
  - Always negative for those without it

54. The ideal test
- [graphic not preserved; surviving labels below]
- Always positive for those with the condition
- Always negative for those without the condition

55. Real-world test
- [graphic not preserved; surviving labels: False negatives, False positives]

56. Sensitivity and Specificity
- Commonly used to assess the validity of tests
- Sensitivity: the ability of a test to correctly identify people who have the target disorder
- Specificity: the ability of a test to correctly identify people who do not have the target disorder

57. Implications of sensitivity & specificity
- In tests with low sensitivity
  - People with the target disorder will be missed (false negatives)
- In tests with low specificity
  - People who do not actually have the target disorder will be identified as having it (false positives)

58. Sensitivity and Specificity (cont.)
- Expressed as a percentage
  - 0% represents no sensitivity or specificity
  - 100% is perfect sensitivity or specificity
- A 2 X 2 contingency table can be used to calculate these indices

59. 2 X 2 contingency table
- [table graphic not preserved; surviving label: Test Result]

60. Sensitivity and Specificity (cont.)
- Sensitivity = true positives / (true positives + false negatives)
- Specificity = true negatives / (true negatives + false positives)

61. SnOUT (Sensitivity rules OUT)
- In tests that have very high sensitivity
  - A negative test will rule out the condition under consideration
- This is because there are very few false negatives in tests with very high sensitivity
  - If a test with very high sensitivity is negative, it is very likely a true negative

62. SpIN (SPecificity rules IN)
- In tests that have very high specificity
  - A positive test will rule in the condition under consideration
- This is because there are very few false positives in tests with very high specificity
  - If a test with very high specificity is positive, it is very likely a true positive

63. The cutoff point influences a test's sensitivity & specificity
- [graphic not preserved; surviving labels: False negatives, False positives]
- Higher scores point to a worsening condition
- If the cutoff point is raised, specificity increases, but there are more false negatives

64. The cutoff point and sensitivity & specificity (cont.)
- [graphic not preserved; surviving labels: False negatives, False positives]
- If the cutoff point is lowered, sensitivity increases, but there are more false positives

65. The cutoff point and sensitivity & specificity (cont.)
- Because increasing sensitivity will decrease specificity, and increasing specificity will decrease sensitivity, the cutoff point that is set depends on
  - Whether it is best to maximize sensitivity at the expense of specificity, or
  - Whether it is best to maximize specificity at the expense of sensitivity

66. Receiver Operating Characteristic (ROC) curves
- Graphically depict the tradeoff between sensitivity and specificity
- In accurate tests
  - The curve closely follows the left-hand border and the top border of the ROC space
- In less accurate tests
  - The curve is closer to the 45-degree diagonal of the ROC space

67. ROC curves
- [graphic not preserved]
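
The cutoff and ROC slides can be illustrated with a small Python sketch. It dichotomizes continuous scores at several cutoff points, computes sensitivity and specificity at each one, and lists the corresponding ROC points (1 - specificity, sensitivity). The scores, the function names, and the choice to count a score equal to the cutoff as positive are assumptions made for illustration only.

```python
def sensitivity_specificity(scores, has_condition, cutoff):
    """Dichotomize scores at `cutoff` and return (sensitivity, specificity)."""
    tp = fn = tn = fp = 0
    for score, diseased in zip(scores, has_condition):
        positive = score >= cutoff  # assumption: a score at the cutoff counts as positive
        if diseased and positive:
            tp += 1   # true positive
        elif diseased:
            fn += 1   # false negative: condition present but test negative
        elif positive:
            fp += 1   # false positive: condition absent but test positive
        else:
            tn += 1   # true negative
    return tp / (tp + fn), tn / (tn + fp)


def roc_points(scores, has_condition, cutoffs):
    """Return one (1 - specificity, sensitivity) point per cutoff."""
    points = []
    for cutoff in cutoffs:
        sens, spec = sensitivity_specificity(scores, has_condition, cutoff)
        points.append((1 - spec, sens))
    return points


# Hypothetical data: higher scores point to a worsening condition.
scores = [4, 6, 7, 8, 9, 1, 2, 3, 5, 6]
has_condition = [True] * 5 + [False] * 5

for cutoff in (3, 5, 7):
    sens, spec = sensitivity_specificity(scores, has_condition, cutoff)
    print(cutoff, sens, spec)
# cutoff 3: sensitivity 1.0, specificity 0.4
# cutoff 5: sensitivity 0.8, specificity 0.6
# cutoff 7: sensitivity 0.6, specificity 1.0

print(roc_points(scores, has_condition, (3, 5, 7)))
```

With this made-up data, raising the cutoff from 3 to 7 trades sensitivity for specificity, which is the pattern described on the cutoff slides; plotting the ROC points would trace that tradeoff.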
