Testing and Assessment Methods in English Language Teaching (courseware)

Testing & Assessment in ELT

Outline
1. A sketchy history
2. Types of language tests
3. Testing techniques
4. Criteria for good language tests
5. Constructing multiple-choice questions
6. Critical discussion

1. History of Language Testing and Assessment

A Sketchy History
1.1 Pre-scientific Stage
1.2 Psychometric-Structuralist Testing Stage
1.3 Integrative-Sociolinguistic Testing Stage
1.4 Pragmatic and Communicative Testing Stage

1.1 Pre-scientific Stage
- Before the 1940s
- Grammar-translation method
- Traditional testing approach

Traditional Testing Approach
- What: grammatical rules, word formation, word usage
- How: written tests, no oral test
- Question types: subjective, such as translation, composition and written questions

1.2 Psychometric-Structuralist Testing
- Since the 1950s
- Theoretical guidance
- Features
- Audiolingual method
- Discrete-point testing approach

Theoretical basis:
- The structuralist view of language in linguistics: language = phonology + vocabulary + sentence patterns + grammar, so language ability can be broken down into specific components and tested separately.
- The behaviourist theory of education in psychology: stimulus-response learning and repeated practice.
- Psychometrics: quantifying psychological traits and behaviours such as knowledge, ability, personality and attitude, following a given psychological theory and a set of operating procedures.

Features:
- Emphasizing reliability and validity
- Emphasizing objective and accurate assessment
- Objective questions dominate, particularly multiple choice
- Analyzing test results statistically
- Enjoying high reliability

Discrete-point Testing
- Each question assesses only one language point.
- Language level: testing is conducted at different levels of language structure.
- Skill level: language proficiency is assessed through listening, speaking, reading and writing.

1.3 Integrative-Sociolinguistic Testing
- Since the mid-1970s (a dynamic view of language)
- Integrative skill tests
- Assess a learner's ability to use many bits of language at the same time.
- Question types: cloze, composition, oral interview, etc.

1.4 Pragmatic and Communicative Testing Stage
- From the 1980s (a functional view of language)
- Communicative approach
- From language usage to language use
- Pragmatic approach
- Integrity of language / whole language
- Assessing with tasks
- Accuracy, fluency and appropriateness
- Communicative competence

2. Types of Language Tests
- Formative & summative tests
- Objective & subjective tests
- Criterion-referenced & norm-referenced tests
- Tests classified according to testing purposes
- Discrete-point & integrative tests
- High-stakes & low-stakes tests

2.1 Formative & Summative
Formative assessment:
- Carried out throughout the course
- Diagnostic purpose
- Assessment for learning
- Teacher- or learner-initiated
Summative assessment:
- Carried out at the end of a course
- Grading purpose: to assign a course grade
- Assessment of learning
- Teacher-initiated

2.2 Criterion-referenced & Norm-referenced
Criterion-referenced assessment:
- A way of measuring candidates against defined (and objective) criteria
- Relatively consistent
- Used to establish a person's competence
- Examples: driving tests, IELTS, TEM, etc.
Norm-referenced assessment:
- A way of comparing candidates to identify whether a test taker performed better or worse than other test takers
- Varies from year to year
- Used for selection
- Examples: CEE (gaokao), TOEFL

2.3 Objective & Subjective
Objective assessment:
- A single correct answer
- Objective scoring, with no judgment on the part of the scorer
- Examples: true/false, multiple choice, matching questions
Subjective assessment:
- More than one way of expressing the correct answer
- Subjective scoring, calling for judgment on the part of the scorer
- Examples: extended-response questions and essays

2.4 Tests Classified According to Testing Purposes
- Proficiency test
- Achievement test
- Placement test
- Aptitude test
- Diagnostic test

Proficiency Test
- Measures language proficiency.
- Content: not based on the content of a language course that people taking the test may have followed. It is based on a specification of what candidates have to be able to do in the language in order to be considered proficient.
- Examples: SAT, ACT, CEE, IELTS, PSC, PETS, BEC

Achievement Test
- Examines how successful a student, a teacher, a syllabus or a method is.
- Closely linked to the course material used in class.
- Final achievement tests are those administered at the end of a course of study.
- Progress achievement tests are intended to measure the progress that students are making.

Placement Test
- To identify the appropriate stage of a language course according to students' ability.
- To assign students to the appropriate level of classes they should take.

Aptitude Test
- Measures the extent to which an individual possesses specific language learning ability.
- Usually used for selection and diagnosis, and for prediction of language learning success.
- Components of language aptitude: phonetic coding ability (sound discrimination and memory); grammatical sensitivity (recognizing the grammatical function of words); rote learning ability for new sounds and meanings; inductive learning ability for language patterns.

Diagnostic Test
- To show students' strengths and weaknesses.

2.5 Discrete-point & Integrative
- Discrete-point test: multiple-choice questions
- Integrative test: dictation, translation, composition, etc.

Discrete-point example:
When he saw his mother, the little boy stopped _.
A. crying  B. cry  C. to cry  D. cried

Integrative example:
One day, the wife of a Chinese king sat watching a worm as it ate some mulberry leaves. Soon it stopped _. Then, as it slowly turned its head from side _ side, a very fine thread came out of its _. It wrapped the thread around and around itself until it was shut _ a little cocoon.

2.6 High- & Low-stakes Tests
- A relative concept
- High-stakes: a test with important consequences for the test taker. Examples: CEE, TEM
- Low-stakes: end-of-term exam

Summary of Test Types
Classification criterion: test categories
- Stage of learning: formative tests, summative tests
- Scoring method: objective tests, subjective tests
- Reference standard for score interpretation: criterion-referenced tests, norm-referenced tests
- Test purpose: proficiency / achievement / aptitude / placement / diagnostic tests
- Separation or integration of the language skills tested: discrete-point tests, integrative tests
- Size of the test's impact on test takers: low-stakes tests, high-stakes tests

3. Testing Techniques
- Multiple-choice (single or multiple answer)
- Gap-filling
- Matching
- Transformation
- Cloze (gap-fill or multiple-choice)
- True/False
- Error correction
- Dictation
- Open questions
- Short answer questions
- Essay writing
- Translation

3.1 Multiple-choice Questions
An example:
Noise made by a snake is called _.
A mew  B bark  C hiss  D quack
Structure: stem + choices/alternatives (the correct choice and distractors)
Advantages:
- Efficiency
- Neutrality
- Universality
- Response clarity
Disadvantages:
- Ambiguity
- No partial credit
- Guessing
- Time-consuming item construction

3.2 Gap-filling
An example:
Eating too much fast food is not _.
A hint may be given: a root word (health) or the first letter of the word (h_).
Advantages:
- Tests grammar or vocabulary
- Easy to grade
- Relatively easy to construct
Disadvantages:
- Ambiguity: there may be more than one possible correct answer, e.g.
  Parents owe their children a set of solid values _ which to build their lives. (around/on)

3.3 Matching
Match the word on the left to the word with the opposite meaning.
fat      old
young    tall
active   thin
short    quiet
The items could be individual words, words and definitions, pictures and words, etc.
Advantages:
- Tests vocabulary
- Easy to construct and grade
Disadvantages:
- Students may get the right answers without knowing all the words.

3.4 Transformation
- This is an interesting book. (Change into an exclamation.)
  What an interesting book this is!
- I went to bed after I finished my homework.
  I didn't go to bed until I finished my homework.
  It was not until I finished my homework that I went to bed.
  Not until I finished my homework did I go to bed.
A student has to rewrite a sentence based on an instruction or a key word given.
Advantages:
- Tests grammar and understanding of form
- Fairly easy to grade
Disadvantages:
- A student may rewrite sentences to a formula.

3.5 Cloze
Complete the text by adding a word to each gap.
One day, the wife of a Chinese king sat watching a worm as it ate some mulberry leaves. Soon it stopped _. Then, as it slowly turned its head from side _ side, a very fine thread came out of its _. It wrapped the thread around and around itself until it was shut _ a little cocoon.
Advantages:
- Much more integrative
- Effective for testing grammar, vocabulary and intensive reading
- A good indicator of overall language proficiency
Disadvantages:
- There may be multiple correct answers.

3.6 True/False
Decide if the statement is true or false.
England won the World Cup in 1966. T/F
The candidate must decide if a statement is true or false.
Advantages:
- Tests listening and reading comprehension
- Easy to grade
Disadvantages:
- Guessing can result in many correct answers.

3.7 Error Correction
Find the mistake in the sentence and correct it.
He don't know why Tom refused to speak to him.
Errors must be found and corrected in a sentence or passage. The error could be an extra word, a missing word, a mistake with a verb form, etc.
Advantages:
- Useful for testing grammar and vocabulary as well as reading and listening comprehension.
Disadvantages:
- Some errors can be corrected in more than one way.

An example passage (the number after each line is the item number):
The Internet is playing a important part in                 56
our daily life. On the net, we can learn about              57
news both home and abroad and some other                    58
informations as well. We can also make phone calls,         59
send messages by e-mails, go to net schools, and            60
learn foreign languages by ourselves. Beside, we            61
can enjoy music, watch sports matches, and play the         62
chess or cards. The net even help us do shopping,           63
make a chat with others and make friends with them.         64
In a word, the Internet has made our life more easier.      65

3.8 Dictation
- One of the oldest techniques known for the teaching and testing of foreign languages
- Closely related to the grammar-translation method
- Tests spelling, listening and recognition
Types of dictation:
- Standard dictation
- Partial dictation
- Dictation with competing noise
- Dictation-composition
- Elicited imitation

3.9 Open Questions
Answer the questions.
Why did John steal the money?
Here the candidate must answer simple questions after reading or listening, or as part of an oral interview.
Advantages:
- Useful for testing any of the four skills, but less useful for testing grammar or vocabulary.
Disadvantages:
- More difficult and time-consuming to grade
- An element of subjectivity is involved in judging how complete the answer is.

3.10 Short Answer Questions
- Require the learner to write a word, phrase, number or symbol
- Often based on a passage
- Sometimes with a word limit for each answer (3-5 words)

3.11 Essay Writing
- Widely used
- Often criticized for its lack of objectivity
Requirements for writing:
- Correct word choice
- Fluent sentences
- Sound structure
- Content that meets the task requirements
- Appropriate style (wording and register: formal vs. informal)
Examples of register:
- The two new senators have proved themselves exceptionally able (guys/men).
- Writing a letter to a close friend or writing a job application letter

Types of Essay Writing
Single-sentence writing:
He doesn't like dogs as much as his wife does.
His wife likes dogs better than him.
Ordering sentences into a passage:
( ) They are students.
( ) Mr and Mrs White have two sons.
( ) Now Ben and Jerry are playing football with their father.
( ) Alice is only three.
( ) The boys have a sister, Alice.
( ) Their names are Ben and Jerry.
( ) Alice is sitting on the grass with her mother.
Advantages:
- Integrative
Disadvantages:
- Difficult to score reliably and time-consuming to grade
- Often affected by handwriting, the presence of spelling errors, the grammar used and the subjective judgments of the grader
- Training of graders is time-consuming and needs to be repeated at frequent intervals throughout the grading.

3.12 Translation
- A method of testing used in both classroom assessment and formal tests
- Criteria of good translation vary

4. General Criteria of Language Testing

4.1 Practicality
Factors to consider:
- Financial limitations
- Time constraints
- Ease of administration
- Scoring
Examples:
- A test that is prohibitively expensive is impractical.
- A test that takes a student ten hours to complete is impractical.
- A test that requires individual one-to-one proctoring is impractical.
- A test that takes a few minutes for a student to take and several hours for an examiner to evaluate is impractical.
- A test that can be scored only by computer is impractical if the test takes place a thousand miles away from the nearest computer.

4.2 Reliability
A consistent measure of performance (dependability, stability).
Sources of unreliability: the test itself or the scoring of the test, that is, test reliability and rater reliability.

Test reliability: the consistency of results when the same test is given to the same subjects on two different occasions.
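
Test reliability as defined above is often estimated by correlating the two sets of scores. The following is a minimal sketch, assuming the Pearson correlation coefficient is used as the consistency measure (the slides do not name a specific statistic). The function name `pearson_r` and the score lists are hypothetical illustration data, not taken from the courseware.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary on both occasions")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of the same five students on two sittings of one test
first_sitting = [62, 75, 81, 90, 55]
second_sitting = [65, 72, 84, 88, 58]

# Assumption: a value close to 1 is read as high test-retest consistency
test_retest = pearson_r(first_sitting, second_sitting)
print(round(test_retest, 2))
```

A coefficient near 1 suggests that students were ranked in much the same way on both occasions; a low value points to an unreliable test or unstable testing conditions.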

Scoring or rater reliability: the consistency of scoring by two or more scorers, or by the same scorer on different occasions.

4.3 Validity
- The degree to which the test actually measures what it is intended to measure
- Test what is important to test, not what is easy to test
- The most complex and important criterion of a good test

Types of Validity
- Content validity: the extent to which a test is relevant to and representative of what it is used to measure. Is the content related to the test objectives? Is the content representative? Is the content suitable for the test takers?
- Construct validity: the degree to which a test measures what it claims to be measuring, based on theoretical guidance. Are the items grounded in a valid view of language? The "construct" refers to the theoretical basis of the whole test.
- Face validity: not what a test actually measures, but what it superficially appears to measure.
- Criterion validity: the extent to which the tests are related to concrete criteria in the real world.

How to improve the validity of a test:
- Specification of what is to be measured, based on the course syllabus
- Construction of the test items
- Review by experienced teachers and experts

4.4 Backwash
- Backwash: the effect of testing on teaching and learning.
- Backwash can be harmful (teaching to the test) or beneficial (diagnostic and promoting improvement).

4.5 Difficulty and Discrimination
- Index of difficulty
- Discrimination (the degree to which an item distinguishes between test takers of different ability)

5. Developing Multiple-Choice Questions

What to measure
- To measure knowledge recall as well as higher order thinking.
- Four types of content (facts, concepts, principles and procedures) and five types of cognitive behaviors (recalling, understanding, predicting, evaluating and problem solving).

Factual information
- True/False: The capital of Kentucky is Louisville.
- Multiple choice: Which city is the capital of Kentucky?
  A. Frankfort  B. Lexington  C. Louisville  D. Paducah

Higher order thinking
What is likely to happen to mortgage interest rates when interest rates on savings go up?
A. Increase  B. Decrease  C. No change  D. Unpredictable

True/False & Multiple Choice Questions
- Good multiple-choice items are more time-consuming for the teacher to construct than true/false or completion items.
- The difficulty lies in finding suitable distractors that are plausible.
- Plausible: the distractor must have the potential for being selected as the correct answer.
- Two distractors are as effective as three if one of the three is not plausible.
- The reading level and reading speed of the students must be considered when constructing the items.
- Ensure that one question or its distractors do not provide clues to the answer of another question.
- Best answer items (measuring understanding or interpretation) are usually more difficult than correct answer items.

Which one of the following was the most important consideration in locating cities during frontier times in America?
A. good farmland
B. access to waterways
C. moderate temperature
D. easy to defend against attack by Indians
- May test the ability to compare and evaluate, or may test knowledge or ability to recall.

A. The stem
- Be meaningful and provide a definite problem.
- Include as much of the item as possible.
1. The talk show host can _ the president brilliantly.
A take on  B take after  C take for  D
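
Once multiple-choice items have been administered, the difficulty and discrimination indices named in section 4.5 can be computed from the responses. The sketch below is an illustration under stated assumptions: difficulty is taken as the proportion of correct answers, and discrimination as the difference in that proportion between the top and bottom 27% of test takers ranked by total score. This upper-lower convention is one common choice; the slides do not specify a formula. All function names and data are hypothetical.

```python
def difficulty_index(item_scores):
    """Proportion of test takers answering the item correctly (1 = correct, 0 = wrong)."""
    if not item_scores:
        raise ValueError("no responses")
    return sum(item_scores) / len(item_scores)


def discrimination_index(item_scores, total_scores, fraction=0.27):
    """Upper-lower index: proportion correct in the top group minus the bottom group."""
    if len(item_scores) != len(total_scores):
        raise ValueError("item and total score lists must have equal length")
    n = len(total_scores)
    group_size = max(1, round(n * fraction))
    if 2 * group_size > n:
        raise ValueError("too few test takers for two separate groups")
    # Rank test takers by total test score, lowest first
    order = sorted(range(n), key=lambda i: total_scores[i])
    lower = order[:group_size]
    upper = order[-group_size:]
    p_upper = sum(item_scores[i] for i in upper) / group_size
    p_lower = sum(item_scores[i] for i in lower) / group_size
    return p_upper - p_lower


# Hypothetical data: ten test takers, one item, and their total test scores
item = [1, 1, 0, 1, 1, 0, 1, 0, 1, 0]
totals = [95, 90, 85, 80, 75, 60, 55, 50, 40, 30]

p = difficulty_index(item)              # share of correct answers
d = discrimination_index(item, totals)  # upper group minus lower group
print(p, round(d, 2))
```

Under this convention a higher difficulty value means an easier item, and a discrimination value near zero or below zero suggests that the item does not separate stronger from weaker test takers.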
