Chinese Learner English Corpus (CLEC), Gui Shichun and Yang Huizhong

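The Chinese Learner English Corpus (CLEC) contains more than one million words of learner English from five groups of students: secondary school students, College English Band 4 and Band 6 students, and lower-year and upper-year English majors. Errors in the texts are annotated. The aim is to observe the characteristics of each group's English and the errors each group makes, to describe Chinese learners' English with reasonable precision by quantitative and qualitative methods, and to provide useful feedback for English teaching in China.

Table 1. CLEC corpus distribution

| Type | Tokens |
|---|---|
| ST2 | 208088 |
| ST3 | 209043 |
| ST4 | 212855 |
| ST5 | 214510 |
| ST6 | 226106 |
| Total | 1070602 |

Error annotation

Principles

1. Simple, reasonable, and easy to apply systematically. Many people take part in the annotation, and an overly elaborate classification scheme would be hard to master. We use a two-level classification. The first level has 11 classes: word form (fm), verb phrase (vp), noun phrase (np), pronoun (pr), adjective phrase (aj), adverb (ad), prepositional phrase (pp), conjunction (cj), lexis (wd), collocation (cc), and sentence (sn). Each class is subdivided by number. For example, cc marks improper collocation: cc1 is a noun/noun collocation, cc2 a noun/verb collocation, cc3 a verb/noun collocation, and so on.

2. Moderate granularity. A scheme that is too coarse is easy to apply consistently but carries too little information to support the analysis of learners' errors. A scheme that is too fine is hard to apply consistently, and the same error easily ends up in different categories. Our current approach is to classify common errors finely (vp and np each have 9 subclasses) and rare errors coarsely (cj has only 2 subclasses). The current scheme has 61 error codes, which makes it a medium-sized classification.

3. Sufficient error information: the error itself, the error type, and the scope of the error. An error is marked with a tag in square brackets placed after it. For example:

    In the past, people are [vp6,4-] kind to each other.

   Here [vp6,4-] marks "are" as a vp (verb) error of type 6 (tense). "4-" is the scope of the error: "-" marks the position of the error, and "4" means that 4 words precede it. Only by reading these 4 words can one tell that "are" is used wrongly.

4. Openness. Researchers may add error types or subdivide existing ones as needed. For example, sn8 marks a structurally deficient sentence, and a researcher can split this error into several subtypes. This requires retrieving all sn8 errors and then defining third-level categories such as sn81, sn82, and so on.

5. Register and the source of an error are not annotated for now, because this requires considerable subjective judgment from the annotators and is even harder to apply consistently.

Error classification table (61 codes in total)

| Code | Word form | Code | Verb phrase | Code | Noun phrase | Code | Pronoun |
|---|---|---|---|---|---|---|---|
| fm1 | spelling | vp1 | pattern | np1 | pattern | pr1 | reference |
| fm2 | word building | vp2 | set phrase | np2 | set phrase | pr2 | anticipatory it |
| fm3 | capitalization | vp3 | agreement | np3 | agreement | pr3 | agreement |
| | | vp4 | finite/non-finite | np4 | case | pr4 | case |
| | | vp5 | non-finite | np5 | countability | pr5 | wh- |
| | | vp6 | tense | np6 | number | pr6 | indefinite |
| | | vp7 | voice | np7 | article | | |
| | | vp8 | mood | np8 | quantifiers | | |
| | | vp9 | modal/auxiliary | np9 | other determiners | | |

| Code | Adjective phrase | Code | Adverb | Code | Prepositional phrase | Code | Conjunction |
|---|---|---|---|---|---|---|---|
| aj1 | pattern | ad1 | order | pp1 | pattern | cj1 | pattern |
| aj2 | set phrase | ad2 | modification | pp2 | set phrase | cj2 | set phrase |
| aj3 | degree | ad3 | degree | | | | |
| aj4 | -ed/-ing confusion | | | | | | |
| aj5 | predicative/attributive | | | | | | |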

| Code | Lexis | Code | Collocation | Code | Sentence |
|---|---|---|---|---|---|
| wd1 | order | cc1 | noun/noun | sn1 | run-on sentence |
| wd2 | part of speech | cc2 | noun/verb | sn2 | sentence fragment |
| wd3 | substitution | cc3 | verb/noun | sn3 | dangling modifier |
| wd4 | absence | cc4 | adj/noun | sn4 | illogical comparison |
| wd5 | redundancy | cc5 | verb/adv | sn5 | topic prominence |
| wd6 | repetition | cc6 | adv/adj | sn6 | coordination |
| wd7 | ambiguity | | | sn7 | subordination |
| | | | | sn8 | structural deficiency |
| | | | | sn9 | punctuation |

Annotation notes

| Code | Class | Category | Description |
|---|---|---|---|
| fm1 | word | Spelling | spelling, coinage, abbreviation, apostrophe |
| fm2 | word | Word building | derivation, inflection, compounding, plurality (noun), irregularity (verb), 3rd person singular form (verb), syllabification, hyphenation, word division or fusion |
| fm3 | word | Capitalization | lower initial letter for upper initial letter or vice versa |
| vp1 | vb phr | Pattern (transitivity pattern) | error in transitivity (vi as vt or vice versa), transitive verb pattern/grammatical (cf. Oxford Advanced Learner's Dictionary of Current English, edited by A. S. Hornby) |
| vp2 | vb phr | Set phrase | phrasal verb and verbal phrase: error in form or use |
| vp3 | vb phr | Agreement (subject-verb) | number agreement with its subject (noun or pronoun) |
| vp4 | vb phr | Finite/non-finite | finite verb for non-finite verb or vice versa |
| vp5 | vb phr | Non-finite (infinitive) | infinitive error: form and use / infinitive for participle or vice versa / -ed participle for -ing participle or vice versa |
| vp6 | vb phr | Tense | error in tense use within a sentence / the sequence of tenses between sentences |
| vp7 | vb phr | Voice | error in the use of voice: active for passive or vice versa |
| vp8 | vb phr | Mood | error in the use of mood: imperative, subjunctive / improper structure of conditional sentences |
| vp9 | vb phr | Modal/auxiliary | misuse of modal/auxiliary verbs / wrong form of modal verb (or auxiliary verb) and verb combination (e.g. tense form, voice form, etc.) |
| np1 | nn phr | Pattern (noun pattern) | error in combination with other words/grammatical |
| np2 | nn phr | Set phrase | omission or replacement of a fixed element that goes after a certain noun |
| np3 | nn phr | Agreement | number agreement of a noun with its determiner or a word that refers to it |
| np4 | nn phr | Case | possessive case error: form or use |
| np5 | nn phr | Countability | uncountable noun used as countable noun |
| np6 | nn phr | Number | countable noun used with no determiner or -s / a or -s with plural noun |
| np7 | nn phr | Article | a/an confusion or definite/indefinite confusion |
| np8 | nn phr | Quantifiers | misuse or confusion between many/much, (a) few/(a) little, some/any, etc. |
| np9 | nn phr | Other determiners | misuse or confusion of demonstratives, wh- determiners, numerals, etc. |
| pr1 | pron | Reference | incorrect/ambiguous pronoun reference/anaphoric |
| pr2 | pron | Anticipatory it | improper or wrong use of anticipatory it / it replaced by a demonstrative, etc. |
| pr3 | pron | Agreement | number agreement with a noun it refers to |
| pr4 | pron | Case | case error of any personal pronoun |
| pr5 | pron | wh- (wh- pronouns) | misuse or confusion of interrogative, relative and conjunctive pronouns |
| pr6 | pron | Indefinite | misuse or confusion of indefinite pronouns such as all/both, few/little, some/any, either/neither, etc. |
| aj1 | adj | Pattern (adjective pattern) | error in the combination with other words/grammatical |
| aj2 | adj | Set phrase | error in the idiomatic use of an adjectival phrase / omission or replacement of a fixed element that goes after a certain adjective |
| aj3 | adj | Degree | adjective degree error: form and use |
| aj4 | adj | -ed/-ing confusion | -ed adjective for -ing adjective or vice versa |
| aj5 | adj | Predicative/attributive | predicative adjective used as attributive adjective |
| ad1 | adv | Order | improper adverb placement/wrong position |
| ad2 | adv | Modification | adjective modifier used as verb modifier / other kinds of confusion |
| ad3 | adv | Degree | adverb degree error: form and use |
| pp1 | prep | Pattern (preposition pattern) | unacceptable combination with other words/grammatical |
| pp2 | prep | Set phrase | error in the formation or use of an idiomatic prepositional phrase |
| cj1 | conj | Pattern (conjunction pattern) | unacceptable combination with other words/grammatical |
| cj2 | conj | Set phrase | error in the formation or use of a phrase functioning as a conjunction |
| wd1 | word | Order | misplacement of any word other than an adverb |
| wd2 | word | Part of speech | error in part of speech: right root but wrong word class |
| wd3 | word | Substitution | error in word choice: right word class but wrong selection (any part of speech) |
| wd4 | word | Absence | omission of a word (any part of speech) |
| wd5 | word | Redundancy | oversuppliance of a word (any part of speech) |
| wd6 | word | Repetition | unnecessary repeating of a word |
| wd7 | word | Ambiguity | unclear word meaning/semantic |
| cc1 | notional | n/n collocation | improper noun (phrase) and noun (phrase) combination/semantic |
| cc2 | notional | n/v collocation | improper noun (phrase) and verb (phrase) combination/semantic |
| cc3 | notional | v/n collocation | improper verb and noun (phrase) combination/semantic |
| cc4 | notional | a/n collocation | improper adjective and noun (phrase) combination/semantic |
| cc5 | notional | v/ad collocation | improper verb and adverb (or ad/v) combination/semantic |
| cc6 | notional | ad/a collocation | improper adverb and adjective combination/semantic |
| sn1 | sentence | Run-on sentence | improper addition of clauses/fused sentence |
| sn2 | sentence | Sentence fragment | subordinate clause as a sentence / any phrase as a sentence |
| sn3 | sentence | Dangling modifier | illogical adverbial modification of a clause |
| sn4 | sentence | Illogical comparison | error in the comparison of words or phrases in a sentence which cannot be compared |
| sn5 | sentence | Topic prominence | the co-occurrence of an initial noun phrase and its equivalent (usually a pronoun) in the same sentence |
| sn6 | sentence | Coordination | faulty parallelism of clauses (or words/phrases) in a sentence |
| sn7 | sentence | Subordination | faulty attachment of a subordinate clause to the main clause |
| sn8 | sentence | Structural deficiency | error in the grammatical construction of a sentence: improper splitting, pattern shifting, confusing structure, etc. |
| sn9 | sentence | Punctuation | overuse, absence, choice, apostrophe, comma splice, etc. |

Normalized frequencies and percentages of each error type

| Error type | ST2 | ST3 | ST4 | ST5 | ST6 | Total | Percent (%) |
|---|---|---|---|---|---|---|---|
| fm1 | 1928.8 | 2877.4 | 2112.6 | 1826.7 | 1686.7 | 10432.2 | 17.47 |
| fm2 | 349.3 | 448.9 | 438.9 | 226.9 | 328.7 | 1792.7 | 3 |
| fm3 | 1474.4 | 731.8 | 405.8 | 694.1 | 174.6 | 3480.7 | 5.83 |
| vp1 | 259.4 | 325.9 | 498.4 | 103.4 | 200.8 | 1387.9 | 2.32 |
| vp2 | 179 | 139.3 | 61.2 | 104.2 | 22.1 | 505.8 | 0.85 |
| vp3 | 374 | 524.6 | 785.2 | 273.1 | 327 | 2283.9 | 3.82 |
| vp4 | 140.8 | 159.1 | 110.8 | 63.9 | 51.6 | 526.2 | 0.88 |
| vp5 | 140 | 118.7 | 107.4 | 89.9 | 46.7 | 502.7 | 0.84 |
| vp6 | 1165.7 | 356 | 311.6 | 379.8 | 215.6 | 2428.7 | 4.07 |
| vp7 | 172.7 | 104.1 | 98.4 | 63.9 | 46.7 | 485.8 | 0.81 |
| vp8 | 27.1 | 16.3 | 8.3 | 25.2 | 11.5 | 88.4 | 0.15 |
| vp9 | 111.4 | 274.3 | 278.5 | 42.9 | 86.1 | 793.2 | 1.33 |
| np1 | 46.9 | 33.5 | 28.9 | 16.8 | 10.7 | 136.8 | 0.23 |
| np2 | 24.7 | 22.4 | 17.4 | 19.3 | 2.5 | 86.3 | 0.14 |
| np3 | 202.1 | 247.7 | 249.6 | 210.9 | 186 | 1096.3 | 1.84 |
| np4 | 66.8 | 55.9 | 26.4 | 22.7 | 21.3 | 193.1 | 0.32 |
| np5 | 58.9 | 98 | 71.9 | 60.5 | 84.4 | 373.7 | 0.63 |
| np6 | 374 | 654.4 | 481 | 358.8 | 354.1 | 2222.3 | 3.72 |
| np7 | 237.9 | 107.5 | 89.3 | 174.8 | 54.9 | 664.4 | 1.11 |
| np8 | 35 | 65.4 | 47.9 | 13.4 | 7.4 | 169.1 | 0.28 |
| np9 | 6.4 | 41.3 | 12.4 | 7.6 | 5.7 | 73.4 | 0.12 |
| pr1 | 82 | 236.5 | 205 | 89.9 | 18.9 | 632.3 | 1.06 |
| pr2 | 16.7 | 78.3 | 23.1 | 4.2 | 0 | 122.3 | 0.2 |
| pr3 | 52.5 | 54.2 | 172.7 | 28.6 | 60.6 | 368.6 | 0.62 |
| pr4 | 74.8 | 37 | 20.7 | 48.7 | 10.7 | 191.9 | 0.32 |
| pr5 | 26.3 | 53.3 | 14.1 | 7.6 | 10.7 | 112 | 0.19 |
| pr6 | 9.5 | 2.6 | 5 | 3.4 | 0 | 20.5 | 0.03 |
| aj1 | 6.4 | 18.9 | 15.7 | 5 | 9 | 55 | 0.09 |
| aj2 | 9.5 | 3.4 | 9.9 | 5.9 | 7.4 | 36.1 | 0.06 |
| aj3 | 38.2 | 39.6 | 32.2 | 43.7 | 97.5 | 251.2 | 0.42 |
| aj4 | 16.7 | 2.6 | 22.3 | 12.6 | 5.7 | 59.9 | 0.1 |
| aj5 | 0.8 | 3.4 | 7.4 | 1.7 | 0 | 13.3 | 0.02 |
| ad1 | 35.8 | 96.3 | 39.7 | 27.7 | 15.6 | 215.1 | 0.36 |
| ad2 | 42.2 | 37.8 | 12.4 | 9.2 | 4.9 | 106.5 | 0.18 |
| ad3 | 7.2 | 12 | 9.9 | 1.7 | 2.5 | 33.3 | 0.06 |
| pp1 | 136.1 | 98 | 43 | 169.7 | 28.7 | 475.5 | 0.8 |
| pp2 | 25.5 | 262.3 | 143.8 | 37 | 27.9 | 496.5 | 0.83 |
| cj1 | 27.8 | 20.6 | 18.2 | 21.8 | 12.3 | 100.7 | 0.17 |
| cj2 | 4 | 7.7 | 13.2 | 5.9 | 4.9 | 35.7 | 0.06 |
| wd1 | 43.8 | 151.3 | 114.1 | 25.2 | 37.7 | 372.1 | 0.62 |
| wd2 | 324.6 | 929.6 | 772.8 | 226.9 | 242.6 | 2496.5 | 4.18 |
| wd3 | 1102 | 1634.7 | 1815 | 757.1 | 359.8 | 5668.6 | 9.49 |
| wd4 | 585.6 | 829.8 | 443.8 | 403.3 | 427 | 2689.5 | 4.5 |
| wd5 | 410.6 | 613.1 | 518.2 | 265.5 | 171.3 | 1978.7 | 3.31 |
| wd6 | 27.1 | 37 | 22.3 | 34.5 | 29.5 | 150.4 | 0.25 |
| wd7 | 261.8 | 430.8 | 261.2 | 228.6 | 209.8 | 1392.2 | 2.33 |
| cc1 | 72.4 | 65.4 | 76 | 23.5 | 36.1 | 273.4 | 0.46 |
| cc2 | 35 | 177.1 | 49.6 | 6.7 | 21.3 | 289.7 | 0.49 |
| cc3 | 168.7 | 514.2 | 417.4 | 75.6 | 112.3 | 1288.2 | 2.16 |
| cc4 | 64.5 | 94.6 | 134.7 | 42 | 39.3 | 375.1 | 0.63 |
| cc5 | 23.9 | 40.4 | 29.8 | 5 | 4.1 | 103.2 | 0.17 |
| cc6 | 17.5 | 12 | 6.6 | 2.5 | 1.6 | 40.2 | 0.07 |
| sn1 | 419.3 | 596.8 | 576.9 | 118.5 | 42.6 | 1754.1 | 2.94 |
| sn2 | 424.9 | 389.6 | 303.3 | 132.8 | 76.2 | 1326.8 | 2.22 |
| sn3 | 10.3 | 20.6 | 17.4 | 2.5 | 10.7 | 61.5 | 0.1 |
| sn4 | 17.5 | 24.9 | 6.6 | 20.2 | 4.9 | 74.1 | 0.12 |
| sn5 | 9.5 | 14.6 | 17.4 | 2.5 | 4.9 | 48.9 | 0.08 |
| sn6 | 84.3 | 41.3 | 39.7 | 41.2 | 1.6 | 208.1 | 0.35 |
| sn7 | 49.3 | 55.9 | 63.6 | 23.5 | 3.3 | 195.6 | 0.33 |
| sn8 | 1103.6 | 446.3 | 862.1 | 493.2 | 231.9 | 3137.1 | 5.25 |
| sn9 | 861.7 | 573.6 | 337.2 | 649.5 | 322.9 | 2744.9 | 4.6 |
| Total | 14105.2 | 16160.6 | 13935.9 | 8883.4 | 6633.8 | 59718.9 | 100 |

Errors ranked by major category

| Category | ST2 | ST3 | ST4 | ST5 | ST6 | Total | Percent | Cumulative percent |
|---|---|---|---|---|---|---|---|---|
| Word form | 3752.5 | 4058.1 | 2957.3 | 2747.7 | 2190 | 15705.6 | 26.299 | 26.299 |
| Lexis | 2755.5 | 4626.3 | 3947.4 | 1941.1 | 1477.7 | 14748 | 24.696 | 50.995 |
| Syntax | 2980.4 | 2163.6 | 2224.2 | 1483.9 | 699 | 9551.1 | 15.993 | 66.988 |
| Verb | 2570.1 | 2018.3 | 2259.8 | 1146.3 | 1008.1 | 9002.6 | 15.075 | 82.063 |
| Noun | 1052.7 | 1326.1 | 1024.8 | 884.8 | 727 | 5015.4 | 8.398 | 90.461 |
| Collocation | 382 | 903.7 | 714.1 | 155.3 | 214.7 | 2369.8 | 3.968 | 94.429 |
| Pronoun | 261.8 | 461.9 | 440.6 | 182.4 | 100.9 | 1447.6 | 2.424 | 96.853 |
| Preposition | 161.6 | 360.3 | 186.8 | 206.7 | 56.6 | 972 | 1.628 | 98.481 |
| Adjective | 71.6 | 67.9 | 87.5 | 68.9 | 119.6 | 415.5 | 0.696 | 99.177 |
| Adverb | 85.2 | 146.1 | 62 | 38.6 | 23 | 354.9 | 0.594 | 99.771 |
| Conjunction | 31.8 | 28.3 | 31.4 | 27.7 | 17.2 | 136.4 | 0.228 | 99.999 |
| Total | 14105.2 | 16160.6 | 13935.9 | 8883.4 | 6633.8 | 59718.9 | 99.999 | |
| Proportion | 0.24 | 0.27 | 0.23 | 0.15 | 0.11 | | | |

Most common errors of Chinese learners

| Type | ST2 | ST3 | ST4 | ST5 | ST6 | Total | Percent |
|---|---|---|---|---|---|---|---|
| fm1 | 1928.8 | 2877.4 | 2112.6 | 1826.7 | 1686.7 | 10432.2 | 17.47 |
| wd3 | 1102 | 1634.7 | 1815 | 757.1 | 359.8 | 5668.6 | 9.49 |
| fm3 | 1474.4 | 731.8 | 405.8 | 694.1 | 174.6 | 3480.7 | 5.83 |
| sn8 | 1103.6 | 446.3 | 862.1 | 493.2 | 231.9 | 3137.1 | 5.25 |
| sn9 | 861.7 | 573.6 | 337.2 | 649.5 | 322.9 | 2744.9 | 4.6 |
| wd4 | 585.6 | 829.8 | 443.8 | 403.3 | 427 | 2689.5 | 4.5 |
| wd2 | 324.6 | 929.6 | 772.8 | 226.9 | 242.6 | 2496.5 | 4.18 |
| vp6 | 1165.7 | 356 | 311.6 | 379.8 | 215.6 | 2428.7 | 4.07 |
| vp3 | 374 | 524.6 | 785.2 | 273.1 | 327 | 2283.9 | 3.82 |
| np6 | 374 | 654.4 | 481 | 358.8 | 354.1 | 2222.3 | 3.72 |
| wd5 | 410.6 | 613.1 | 518.2 | 265.5 | 171.3 | 1978.7 | 3.31 |
| fm2 | 349.3 | 448.9 | 438.9 | 226.9 | 328.7 | 1792.7 | 3 |
| sn1 | 419.3 | 596.8 | 576.9 | 118.5 | 42.6 | 1754.1 | 2.94 |
| wd7 | 261.8 | 430.8 | 261.2 | 228.6 | 209.8 | 1392.2 | 2.33 |
| vp1 | 259.4 | 325.9 | 498.4 | 103.4 | 200.8 | 1387.9 | 2.32 |
| sn2 | 424.9 | 389.6 | 303.3 | 132.8 | 76.2 | 1326.8 | 2.22 |
| cc3 | 168.7 | 514.2 | 417.4 | 75.6 | 112.3 | 1288.2 | 2.16 |
| np3 | 202.1 | 247.7 | 249.6 | 210.9 | 186 | 1096.3 | 1.84 |
| vp9 | 111.4 | 274.3 | 278.5 | 42.9 | 86.1 | 793.2 | 1.33 |
| np7 | 237.9 | 107.5 | 89.3 | 174.8 | 54.9 | 664.4 | 1.11 |
| pr1 | 82 | 236.5 | 205 | 89.9 | 18.9 | 632.3 | 1.06 |

The table above shows the following:

1. All three word-form errors (spelling, word building, capitalization) are on the list. Spelling ranks first, accounting for 17.47% of all errors, and the three together account for 26.30%.
2. Five of the seven lexical errors (substitution, absence, part of speech, redundancy, ambiguity) are on the list, accounting for 23.81% of all errors.
3. Four of the nine syntactic errors (structural deficiency, punctuation, run-on sentence, sentence fragment) are on the list, accounting for 15.01% of all errors.
4. Four of the nine verb phrase errors (tense, agreement, transitivity pattern, modal/auxiliary) are on the list, accounting for 11.54% of all errors.
5. Three of the nine noun phrase errors (number, agreement, article) are on the list, accounting for 6.67%.
6. Other errors (verb/noun collocation, pronoun reference) account for 3.22%.

Most common spelling errors of Chinese learners

| Frequency | Word | Frequency | Word | Frequency | Word | Frequency | Word |
|---|---|---|---|---|---|---|---|
| 379 | MORTALITY | 23 | THEMSELVES | 15 | LIMITED | 12 | WRITING |
| 113 | KNOWLEDGE | 21 | FESTIVAL | 15 | NOTICE | 11 | ARTICLE |
| 78 | POLLUTION | 20 | BELIEVE | 15 | OURSELVES | 11 | CONTRARY |
| 7 | | | | | | | |
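The tag format described under annotation principle 3 can be read mechanically. The following is a minimal sketch in Python, assuming that tags are written inline in square brackets directly after the error word, as in the `[vp6,4-]` example. The regular expression, the `ErrorTag` type, and the function names are illustrative assumptions and are not part of any CLEC tooling; scope strings other than the `N-` shape shown in the text are kept as raw strings and not interpreted.

```python
import re
from typing import List, NamedTuple, Optional

# Assumed tag shape, inferred from the single example "[vp6,4-]":
# a two-letter class, a one-digit subtype, and an optional scope after a comma.
TAG_RE = re.compile(r"\[([a-z]{2})(\d)(?:,\s*([^\]\s]*))?\]")


class ErrorTag(NamedTuple):
    category: str          # first-level class, e.g. "vp"
    subtype: int           # second-level number, e.g. 6 (tense)
    scope: Optional[str]   # raw scope string, e.g. "4-"
    error_word: str        # word immediately before the tag
    context: List[str]     # words before the error that the scope points to


def parse_tags(text: str) -> List[ErrorTag]:
    """Collect every error tag in a tagged learner sentence."""
    tags = []
    for m in TAG_RE.finditer(text):
        # Words before this tag, with any earlier tags removed first.
        words = TAG_RE.sub(" ", text[:m.start()]).split()
        error_word = words[-1] if words else ""
        scope = m.group(3)
        # "4-" means 4 words precede the error; other shapes are not parsed.
        before = re.match(r"(\d+)-", scope or "")
        n = int(before.group(1)) if before else 0
        context = words[-(n + 1):-1] if n else []
        tags.append(ErrorTag(m.group(1), int(m.group(2)), scope, error_word, context))
    return tags


def strip_tags(text: str) -> str:
    """Return the learner's text with all error tags removed."""
    return " ".join(TAG_RE.sub(" ", text).split())


sample = "In the past, people are [vp6,4-] kind to each other"
tag = parse_tags(sample)[0]
print(tag.category, tag.subtype, tag.scope, tag.error_word, tag.context)
# vp 6 4- are ['In', 'the', 'past,', 'people']
```

Collecting tags this way also fits principle 4: filtering the parsed tags for `category == "sn"` and `subtype == 8` retrieves every sn8 error, which is the first step toward defining third-level categories.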
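The percentage and cumulative percentage columns in the table of errors by major category follow directly from the totals. As a rough check, the sketch below recomputes them in Python from the Total column of that table. The English dictionary keys and the function name `shares` are chosen here for illustration only.

```python
from typing import Dict, List, Tuple

# Totals copied from the "Total" column of the major-category table.
# The English labels are illustrative translations of the category names.
TOTALS: Dict[str, float] = {
    "word form": 15705.6,
    "lexis": 14748.0,
    "syntax": 9551.1,
    "verb": 9002.6,
    "noun": 5015.4,
    "collocation": 2369.8,
    "pronoun": 1447.6,
    "preposition": 972.0,
    "adjective": 415.5,
    "adverb": 354.9,
    "conjunction": 136.4,
}


def shares(totals: Dict[str, float]) -> List[Tuple[str, float, float]]:
    """Return (category, percent, cumulative percent), largest category first."""
    grand_total = sum(totals.values())
    rows = []
    cumulative = 0.0
    for name, value in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        percent = 100.0 * value / grand_total
        cumulative += percent
        rows.append((name, percent, cumulative))
    return rows


for name, percent, cumulative in shares(TOTALS):
    print(f"{name:<12} {percent:7.3f} {cumulative:8.3f}")
```

The category totals add up to the grand total of 59718.9 given in the table. The table's final cumulative figure of 99.999 is the sum of shares that were already rounded to three decimals; accumulating the unrounded shares, as this sketch does, ends at 100.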
