Readability Analyzer

Readability Analyzer is a tool designed to extract basic readability statistics from English texts. It was programmed by Yunlong Jia and designed by Jiajin Xu and Yunlong Jia. The tool computes a couple of classic readability scores, such as the Flesch Reading Ease score and the Flesch-Kincaid Grade Level, and a few other indices of the lexical complexity of texts, e.g. the type-token ratio (TTR) and the standardized type-token ratio (STTR). Descriptive statistics such as words/tokens, types, lemmata, sentences, average word length (AWL) and average sentence length (ASL) can also be read from the Results tab.

Settings

To get started with Readability Analyzer, load local English texts into the tool, whether in plain text format (*.txt), rich text format (*.rtf) or Word document format (*.doc). Readability Analyzer allows users to analyze multiple texts at a time. Please do not attempt to process texts containing Chinese or other full-width characters.

Also on the Settings tab, the default value for Set Basis for STTR is 100 words rather than the 1,000 words used in WordSmith, because Readability Analyzer is often used to analyze ELT textbook reading passages, learners' compositions, etc., which normally do not exceed 1,000 words. The basis can be customized from 100 to 4,000 words in 100-word steps. See also Standardized TTR (STTR) below.

The Save Frequency List(s) setting can be used to generate word lists from the imported texts. When Words is checked, a normal frequency list is created; when Lemmas is checked, a lemmatized list is saved. These are complementary functions of the tool and are not run by default. If the two options are checked, a window/folder containing the lists will pop up right after the texts are processed.

Once the target texts are chosen and all other settings are made, press Analyze at the upper/lower right corner of the Choose Text(s) box; the results will then be tabulated in the Results tab.

Filter allows users to select a subset of the browsed files in the Choose Text(s) box; regular expressions are supported in Filter for matching the filenames of certain texts. Clear Selection resets the file selection.

Understanding Readability Scores in the Results tab

The File column lists the filenames of the individual files loaded.

The following information is adapted from Microsoft Office online, /en-us/help/HP101485061033.aspx

Flesch Reading Ease score

This test rates text on a 100-point scale. The higher the score, the easier it is to understand the document. For most standard files, you want the score to be between 60 and 70.

Score mapping table

Flesch Reading Ease Score    Readability Level
0 - 29                       Very difficult
30 - 49                      Difficult
50 - 59                      Fairly difficult
60 - 69                      Standard
70 - 79                      Fairly easy
80 - 89                      Easy
90 - 100                     Very easy

The formula for the Flesch Reading Ease score is:

Flesch Reading Ease = 206.835 - (1.015 * ASL) - (84.6 * ASW)

where:
ASL = average sentence length (the number of words divided by the number of sentences)
ASW = average number of syllables per word (the number of syllables divided by the number of words)
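As a reading aid, the snippet below is a minimal sketch (not Readability Analyzer's actual code) of how the Flesch Reading Ease score follows from the formula above. The sentence/word tokenization and the vowel-group syllable counter are naive placeholders of our own, so its numbers will differ somewhat from those reported by Readability Analyzer or Word.

import re

def count_syllables(word):
    # Naive approximation: count groups of consecutive vowel letters.
    # Real readability tools use dictionaries or richer rules.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(text):
    # Very rough sentence and word tokenization, for illustration only.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    asl = len(words) / len(sentences)                           # average sentence length
    asw = sum(count_syllables(w) for w in words) / len(words)   # average syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw

print(round(flesch_reading_ease("The cat sat on the mat. It was a sunny day."), 2))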
Text Difficulty score

To facilitate the reading of the Flesch Reading Ease based score, we inverted the scale, so that 0 is the easiest text difficulty level and 100 the most difficult. We applied the following equation to get the text difficulty score:

(Flesch Reading Ease based) Text Difficulty = 100 - Flesch Reading Ease score

Score mapping table

Text Difficulty    Readability Level
0 - 29             Very easy
30 - 49            Easy
50 - 59            Fairly easy
60 - 69            Standard
70 - 79            Fairly difficult
80 - 89             Difficult
90 - 100           Very difficult

Flesch-Kincaid Grade Level score

This test rates text on a U.S. school grade level. For example, a score of 8.0 means that an eighth grader can understand the document. For most documents, aim for a score of approximately 7.0 to 8.0.

The formula for the Flesch-Kincaid Grade Level score is:

Flesch-Kincaid Grade Level = (0.39 * ASL) + (11.8 * ASW) - 15.59

where:
ASL = average sentence length (the number of words divided by the number of sentences)
ASW = average number of syllables per word (the number of syllables divided by the number of words)

The above information is adapted from Microsoft Office online, /en-us/help/HP101485061033.aspx

ASL = Average Sentence Length
AWL = Average Word Length
Tokens = the total number of all occurrences of alphanumeric symbols
Word Types = the total number of distinct words (e.g. 10 instances of do are counted as 1 type, do)
Lemma Types = the total number of base forms (e.g. do is the lemma for do, does, did, doing and done, if any)

Lemma/Word ratio, Word TTR (type/token ratio), Word STTR (standardized type/token ratio), Lemma TTR and Lemma STTR are all scores of lexical richness. Most often Word STTR is good enough to measure lexical richness; users can decide on the other scores for their particular research purposes.

Standardized TTR (STTR) (adapted from the WordSmith manual)

Because the raw type/token ratio is sensitive to text length, Wordlist in WordSmith uses a different strategy for computing it. The standardized type/token ratio (STTR) is computed every n words as Wordlist goes through each text file. By default, n = 1,000. In other words, the ratio is calculated for the first 1,000 running words, then calculated afresh for the next 1,000, and so on to the end of your text or corpus. A running average is computed, which means that you get an average type/token ratio based on consecutive 1,000-word chunks of text. Texts with fewer than 1,000 words (or whatever n is set to) get a standardized type/token ratio of 0.

Readability Analyzer follows WordSmith's method of computing STTR, except that the basis for STTR is adjustable and the TTR for the last, possibly incomplete, basis (e.g. the final chunk when the designated basis is 100 words) is calculated from the actual word types and tokens it contains. This is important, because we think the WordSmith way of dealing with the last basis for STTR, whether it holds 1 word or 999 words, is unfair.
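The following sketch is our own illustration of the chunked STTR just described, not Readability Analyzer's source code: TTR is computed for every consecutive chunk of basis tokens and the chunk ratios are averaged, with the final, possibly shorter, chunk scored on its actual types and tokens rather than being ignored as in WordSmith. The tokenization, case handling and lemmatization are assumptions of this sketch.

def sttr(tokens, basis=100):
    # Standardized type/token ratio over consecutive chunks of `basis` tokens.
    # Unlike WordSmith, which returns 0 for texts shorter than the basis and
    # discards a trailing partial chunk, the last chunk here is scored on its
    # actual types and tokens, as described above.
    if not tokens:
        return 0.0
    ratios = []
    for start in range(0, len(tokens), basis):
        chunk = tokens[start:start + basis]
        ratios.append(len(set(chunk)) / len(chunk))  # TTR of this chunk
    return sum(ratios) / len(ratios)

# Toy example with a 4-word basis: the chunks score 1.0, 0.75 and 1.0,
# giving an STTR of about 0.917.
print(round(sttr("the cat sat on the mat the cat ran".split(), basis=4), 3))

Whether the partial final chunk should be weighted equally with the full chunks is a design choice; this sketch weights all chunks equally.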
