Thesis in Foreign Linguistics and Applied Linguistics: A Corpus-Based Study of Lexical Chunks in Academic Texts

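Abstract (translated from the Chinese original)

Stereotyped expression in language has long attracted the attention of linguists, yet linguists have not reached a consensus on it. Lexical chunks are a typical form of stereotyped expression. This thesis presents a corpus-based study of the lexical chunks found in academic texts. Drawing on ample concordance evidence from JDEST, a corpus of academic English, and on previous research, the thesis proposes a working definition of the lexical chunk and classifies chunks according to their formal and functional features. The pragmatic functions of lexical chunks in academic texts are then examined at both the micro and the macro level, with detailed examples. The study finds that lexical chunks are an indispensable part of academic texts, where they perform a range of discourse functions: defining, exemplifying, expressing cause and effect, stating findings, summarizing, inferring, comparing, classifying, providing evidence, initiating a topic, and shifting topic. A comparison of how native speakers of English and Chinese students use lexical chunks shows that Chinese students underuse or overuse certain chunks, which harms the idiomaticity of their language production. The thesis also discusses some issues for future research on lexical chunks.

Keywords: stereotyped expression in language, corpus, lexical chunk, formal features, discourse function, phrasal, clausal, academic texts

Abstract

Stereotyping in language has long attracted attention among linguists; however, linguists have not yet reached a consensus on it. Lexical chunks are characteristic of stereotyping in language. This thesis presents a corpus-based study of lexical chunks in academic texts. On the basis of ample evidence from the JDEST corpus and a critical review of previous studies, the present study works out a working definition for lexical chunks and categorizes them according to their formal and functional features. The pragmatic functions of lexical chunks in academic texts are then examined in detail at both a micro level and a macro level, and specific examples are given. It is found that lexical chunks are an integral part of academic texts, performing a wide range of discourse functions. Major functions include defining, exemplifying, indicating findings, summarizing, classifying, making inferences, initiating and shifting topic, making comparison and contrast, and providing evidence. Moreover, this paper presents a comparative study of the use of lexical chunks by native speakers and by Chinese students. It reveals that Chinese students tend to underuse or overuse certain lexical chunks, which exerts a negative influence on the idiomaticity of their language production. In addition, prospects for future study of lexical chunks are discussed.

Key words: stereotyping, corpus, lexical chunk, formal features, pragmatic function, phrasal, clausal, academic texts

Shanghai Jiao Tong University
Declaration of Originality

I solemnly declare that the thesis submitted is the result of research that I carried out independently under the guidance of my supervisor. Except for content already cited and marked in the text, this thesis contains no work published or written by any other individual or group. Individuals and groups who made important contributions to the research are clearly identified in the text. I am fully aware that I bear the legal consequences of this declaration.

Signature of the author: Sun Li (孙丽)
Date: 8 January 2004

Shanghai Jiao Tong University
Authorization for Use of the Thesis

The author of this thesis fully understands the university's regulations on retaining and using theses, and agrees that the university may retain the thesis and submit paper and electronic copies of it to the relevant state departments or institutions, and that the thesis may be consulted and borrowed. I authorize Shanghai Jiao Tong University to include all or part of this thesis in relevant databases for retrieval, and to preserve and compile it by photocopying, reduced-size printing, scanning, or other means of reproduction.

Classified theses: this authorization applies after declassification in ____ (year).
This thesis is: not classified. (Please tick the appropriate box above.)

Signature of the author: Sun Li (孙丽)
Signature of the supervisor: Wei Naixing (卫乃兴)
Date: 8 January 2004
Date: 8 January 2004

A Corpus-Based Study of Lexical Chunks in Academic Texts

Introduction

In recent years there has been a growing interest in the study of stereotyping in language. It is believed that stereotyping in language contributes considerably to the effectiveness and efficiency of communication. It can account for why native speakers are able to convey meanings by expressions that are not only grammatical but also natural and idiomatic, and why native speakers are able to produce fluent stretches of spontaneous connected discourse that exceed human capacities for encoding novel speech in advance. The study of stereotyping in language thus provides a profound insight into the learning of English as a second language.

Chomsky, however, maintained that internal grammatical rules play the major role in language production. In his view, whether or not a language production is acceptable must be determined by those internal grammatical rules. There is evidently a conflict between the two schools of thought, and the study of stereotyping in language poses a serious challenge to the traditional mainstream thinking. Although Chomsky's mainstream thinking has the upper hand in linguistic study, unexpected levels of stereotyping in language have long been recognized. Jespersen (1924: 85) observed that "a language would be a difficult thing to handle if its speakers had the burden imposed on them of remembering every little item separately".

The current problem with the study of stereotyping in language is that the study is so diversified that linguists have not yet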
reached a consensus on which notion should be used to describe stereotyping in language, while a multitude of terms exist in the literature, such as memorized sequences, lexicalized sentence stems, prefabricated chunks, semi-preconstructed phrases, formulaic language, and semi-fixed phrases.

Since opinions on stereotyping in language diverge so widely, and only a small number of theories and notions have been established, the topic has aroused the author's strong interest. In this paper, the term lexical chunk is used to describe stereotyping in language, and a systematic corpus-based study is conducted to look into the formal and functional categorization of lexical chunks in academic texts. Drawing on substantial data from the JDEST corpus, the research sets out to investigate the following issues:

1. What are the defining features of lexical chunks in academic texts?
2. What are the formal features and syntactic features of lexical chunks?
3. What pragmatic functions do lexical chunks perform in academic texts, and how are the different functions categorized?
4. What are the prospects for further empirical research on lexical chunks?

In this study a corpus-based method is adopted for the investigation of chunks. The corpus used, JDEST, is a corpus of academic texts that consists of nearly 4 million tokens and covers a wide range of fields, such as medicine, information technology, mechanics, physiology, environmental science, geology, automation, life science, and psychology. Both qualitative and quantitative analyses are employed in order to provide a fairly precise description of the use of lexical chunks in academic texts.

In order to conduct the research within a well-delimited framework, a working definition of the lexical chunk is developed on the basis of its formal and functional features, and lexical chunks are then categorized according to their formal features. It has been found that lexical chunks perform a
variety of pragmatic functions in the flow of academic texts and constitute an integral part of academic texts. These functions are discussed in detail in the paper and illustrated with typical examples.

A comparison between the writing of native speakers and that of Chinese students reveals significant differences in their use of lexical chunks. It is evident that native speakers use lexical chunks more frequently, more diversely, and more effectively than Chinese students do.

The present paper is made up of five chapters. Chapter one presents a review of the previous studies in relation to lexical chunks. Chapter two specifies the method adopted in the study of lexical chunks. Chapter three presents a working definition and a categorization of lexical chunks according to their formal features; defining features are also discussed in that chapter. Chapter four describes in detail the pragmatic functions of lexical chunks at both the micro level and the macro level, with typical examples. Chapter five summarizes the major points of the study and addresses some issues in connection with future studies.

Chapter 1 A review of the previous studies in relation to lexical chunks

Linguists have long noticed the existence of stereotyping in language production, and various attempts have been made to investigate this pervasive phenomenon. However, there exists a multitude of technical terms in the literature, which indicates that linguists have not yet reached a consensus on chunks but hold diverse linguistic positions. As Pawley [...]

(b) institutionalized expressions, (c) phrasal constraints, (d) sentence builders. Moreover, dozens of pragmatic functions that lexical phrases perform, such as greeting, parting, disapproval, denial, evaluator, relator, and closing, are discussed in detail. It can be seen that Nattinger and DeCarrico have conducted an extensive and
intensive study of lexical chunks. However, it is not sound to assert that a phrase performs a particular pragmatic function unless the phrase is placed in a specific context. For example:

(13) A: Would you like to go out for a walk with me?
     B: It's raining cats and dogs.

In example 13, the institutionalized statement "it's raining cats and dogs" performs the function of denial only when it is placed in the dialogue. As a result, treating pragmatic function as a distinguishing factor appears less persuasive than expected.

1.4 Naixing Wei's institutionalized collocation

Naixing Wei has done a great deal of research into the phenomenon of stereotyping in language and puts forward the notion of institutionalized collocation. In his view, institutionalized collocations are highly conventional, lexicalized sequences that are regarded as standard expressions by a speech community or a discourse community. He divides institutionalized collocations into two types. The first is institutionalized collocation adopted by a speech community, as in general language. Examples are as follows:

(14) in broad daylight
(15) to put it another way
(16) that's the point
(17) what is going on here
(Naixing Wei 2001: 86)

The other type is institutionalized collocation adopted by a discourse community, as in academic texts. Examples are listed below:

(18) a case in point is
(19) there is evidence to suggest that
(20) there is good reason to believe
(Naixing Wei 2001: 86)

According to Naixing Wei, institutionalized collocations in academic texts perform a number of discourse acts, such as defining, exemplifying, presenting results, comparing, contrasting, classifying, substantiating, reasoning and discussion, and summarizing and concluding. This indicates that institutionalized collocations, which exemplify stereotyping in language, play their own role in running academic text.

Naixing Wei depicts a picture of institutionalized collocation with its formal features
and discourse acts analyzed in detail. His study of institutionalized collocation provides a profound insight into research on chunks, since categorization and discourse acts are systematically investigated.

1.5 Jespersen's formula

Jespersen describes a formula as "a whole sentence or a group of words, or it may be one word, or it may be only part of a word, that is not important, but it must always be something which to the actual speech instinct is a unit which cannot be further analyzed or decomposed in the way a free combination can" (Jespersen 1983: 88). In this statement Jespersen gives a general account of fixedness in formulas. Examples are as follows:

(21) a rolling stone gathers no moss (a whole sentence)
(22) pull my leg (a group of words)
(23) nevertheless (one word)
(24) bf (part of a word, meaning "bloody fool")

Jespersen highlights one feature in particular: a formula cannot be further analyzed and has to be treated as a unit by the actual speech instinct. In fact, however, many fixed expressions that are treated as a unit are nevertheless analyzable and allow lexical variation. It is not strictly true that fixed language must be frozen in form. Moreover, the notion focuses only on form and ignores the pragmatic and semantic functions that a formula may perform in context.

1.6 Moon's multi-word item

Moon defines a multi-word item as "a vocabulary item which consists of a sequence of two or more words. This sequence of words semantically and/or syntactically forms a meaningful and inseparable unit. Multi-word items are the result of lexical processes of fossilization and word formation, rather than the results of the operation of grammatical rules" (Moon 1994: 43). Moon holds that three criteria help distinguish holistic multi-word items from other kinds of strings: institutionalization, fixedness, and non-compositionality. A multi-word item to some degree
is conventionalized in the language and frozen as a sequence of words. For example, "another kettle of fish" and "a different kettle of fish" are alternative forms, whereas "on the other hand" is not varied to "on another hand" or "on a different hand". The last criterion, non-compositionality, holds that a multi-word item cannot be interpreted on a word-by-word basis but has a specialized unitary meaning. This is typically associated with semantic non-compositionality. Take "kick the bucket", for example: interpreted literally, word by word, the phrase would not be understood as "die" but as doing something to a receptacle with one's foot.

Moon puts emphasis on meaning when describing multi-word items but pays little attention to the pragmatic functions they may perform in context. In Moon's view, multi-word items have to be inseparable sequences of words that do not result from grammatical rules. Consequently, in the following sentence,

(25) The bushes and the trees were blowing in the wind, but the rain had stopped.

"were blowing" and "had stopped" are verb groups or verb phrases, but they are not multi-word items.

The seven notions above approach the concept of stereotyping in language production from different perspectives. While the notions vary, and disagreement still exists, researchers appear to have very much the same phenomenon in mind. What the stereotyped expressions have in common is that they are all treated as a unit and are composed of more than one word. The notions of memorized sequence, sentence stem, semi-preconstructed phrase, and formula attach more importance to formal features. The notion of lexical phrase lays emphasis on pragmatic function and considers function an important distinguishing factor. The notion of institutionalized collocation requires that the expression be conventional and well accepted by a speech community or a discourse community. The notion of multi-word item highlights semantic value: the items must, above all, be meaningful.

Drawing on what researchers have done in exploring preconstructed phrases, the present study will, in the following chapters, examine lexical chunks in academic texts with the JDEST corpus as source text and provide a working definition of lexical chunks, aiming at a systematic and exhaustive study of lexical chunks.

Chapter 2 Research methodology

This chapter describes the corpus-based method adopted in the study of lexical chunks. The advantages of the corpus-based method are discussed briefly, and the way lexical chunks are identified with it is presented in detail. In addition, the statistical techniques employed to quantify lexical chunks are explained with examples.

2.1 Comments on the corpus-based method in the study of lexical chunks

The advantage of relying on computer searches for the identification of lexical chunks would seem enormous. As Sinclair and Renouf argue, the retrieval system, unlike human beings, misses nothing if properly instructed: no usage can be overlooked because it is too ordinary or too familiar. The statistical evidence is helpful too, because it distinguishes the commoner patterns of usage, which occur very frequently indeed, from the less common usage, which occurs far less frequently (Sinclair [...]

[...] $W$ is the total number of running words in the corpus, $f(a, b)$ is the frequency of co-occurrence of the two words, $f(a)$ is the frequency of occurrence of word form $a$ in the corpus, and $f(b)$ is that of word form $b$ in the corpus. The bigger the MI value, the greater the collocational strength between the two words, and the more safely the sequence can be called a chunk. Usually a threshold MI value of 3 is adopted to filter out casual collocations. Take the phrases "note that", "in summary", "for instance", and "given that", for example.

Word a    Word b      MI value
note      that        5.98
in        summary     3.38
for       instance    6.86
given     that        1.12

Table 2.2 MI values of "note that", "in summary" and "for instance"

The figures show that "note that", "in summary", and "for instance" demonstrate strong collocational strength, while "given that" does not. "Note that", "in summary", and "for instance" therefore qualify as chunks.

For sequences longer than two words, a statistical measure called cluster computing is adopted. In cluster computing, the ratio between observed frequency and expected frequency is the standard used to identify lexical chunks. For a two-word sequence the expected frequency is

$$E = \frac{f(a)\, f(b)}{W}$$

If a sequence consists of $n$ word forms, that is, $a_1, a_2, a_3, \ldots, a_n$, then

$$E = \frac{f(a_1)\, f(a_2) \cdots f(a_n)}{W^{\,n-1}}$$

In the formula, $f(a)$ refers to the observed frequency of the word form $a$, $f(b)$ to the observed frequency of the word form $b$, and $f(a_n)$ to the observed frequency of the word form $a_n$; $W$ stands for the corpus size. The formula gives the expected frequency with which the components co-occur in a corpus, from which the ratio of observed frequency to expected frequency can be worked out. If the ratio is large, say 10 or more, we can safely say that the sequence is a chunk. The ratio thus provides an objective criterion for identifying chunks. For example:

(2) It is clear that when it was observed in 1832, Biela was apparently still in one piece.
(3) There is no doubt that making castings is not as simple as it used to be.
(JDEST)

The following table shows the observed frequency/expected frequency ratio, obtained by cluster computing, of the sequences "it is clear that" and "there is no doubt that" in JDEST.

Cluster                   Frequency (per word form)        Expected frequency   Observed frequency   Ratio
it is clear that          15998, 16000, 789, 15995         3.23                 81                   25
there is no doubt that    6413, 16000, 4408, 215, 15995    1.56                 32                   20.5

Table 2.3 Ratios of "it is clear that" and "there is no doubt that"

As the table shows, the two ratios are both higher than 10. "It is clear that" and "there is no doubt that" [...]
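The two measures described in this chapter can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually run on JDEST. The MI formula itself did not survive in the extracted text, so the sketch assumes the standard pointwise form MI = log2(f(a, b) · W / (f(a) · f(b))), which matches the symbol definitions given above. The function names and the counts in the demo are hypothetical.

```python
import math
from typing import Sequence


def mutual_information(f_ab: int, f_a: int, f_b: int, corpus_size: int) -> float:
    """MI score of a two-word sequence.

    Assumed formula (not preserved in the source text):
    MI = log2(f(a, b) * W / (f(a) * f(b))).
    """
    return math.log2(f_ab * corpus_size / (f_a * f_b))


def expected_frequency(word_freqs: Sequence[int], corpus_size: int) -> float:
    """Expected co-occurrence frequency E = f(a1) * ... * f(an) / W**(n - 1)."""
    n = len(word_freqs)
    return math.prod(word_freqs) / corpus_size ** (n - 1)


def observed_expected_ratio(observed: int, word_freqs: Sequence[int],
                            corpus_size: int) -> float:
    """Ratio of observed to expected frequency for an n-word cluster."""
    return observed / expected_frequency(word_freqs, corpus_size)


def is_chunk_by_mi(mi_value: float, threshold: float = 3.0) -> bool:
    """Apply the MI threshold of 3 mentioned in the text."""
    return mi_value >= threshold


def is_chunk_by_ratio(ratio: float, threshold: float = 10.0) -> bool:
    """Apply the observed/expected threshold of 10 mentioned in the text."""
    return ratio >= threshold


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
W = 1_000_000
mi = mutual_information(f_ab=64, f_a=1000, f_b=2000, corpus_size=W)
ratio = observed_expected_ratio(observed=5, word_freqs=[1000, 2000, 500],
                                corpus_size=W)
```

With these invented counts the two-word pair clears the MI threshold and the three-word cluster clears the ratio threshold. The ratios reported in Table 2.3 are likewise the observed frequency divided by the expected frequency (81 / 3.23 and 32 / 1.56).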
