




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、parsingparsingcalculate grammatical structure of program, like diagramming sentences, where: tokens = “words” programs = “sentences”for further information, read: aho, sethi, ullman, “compilers: principles, techniques, and tools” (a.k.a, the “dragon book”)outline of coveragelcontext-free grammarslpa
2、rsing tabular parsing methods one pass top-down bottom-upparser: extracts grammatical structure of programfunction-defnameargumentsstmt-listmainstmtexpressionoperatorexpressionexpressionvariablestringcout s | s.ide - e + t - e + t + t - t + t + t - id + t + t - id + t * id + t - id + id * id + t -id
3、 + id * id + idambiguitylambiguity is a function of the grammar rather than the languagelcertain ambiguous grammars may have equivalent unambiguous onesgrammar transformationslgrammars can be transformed without affecting the language generatedlthree transformations are discussed next: eliminating a
4、mbiguity eliminating left recursion (ductions of the form aa ) left factoringeliminating ambiguitylsometimes an ambiguous grammar can be rewritten to eliminate ambiguitylfor example, expressions involving additions and products can be written as follows: lthe language generated by this gramma
5、r is the same as that generated by the grammar on tranparency 11. both generate id(+id|*id)*lhowever, this grammar is not ambiguouseliminating ambiguity (cont.)lone advantage of this grammar is that it represents the precedence between operators. in the parsing tree, products appear nested within ad
6、ditionsetteid+*idtideliminating ambiguity (cont.)lan example of ambiguity in a programming language is the dangling elseelselconsider ifif thenthen elseelse | ifif thenthen | eliminating ambiguity (cont.)lwhen there are two nested ifs and only one else.sthenselsesif then ssthensthenselseseliminating
7、 ambiguity (cont.)lin most languages (including c+ and java), each elseelse is assumed to belong to the nearest ifif that is not already matched by an elseelse. this association is expressed in the following (unambiguous) grammar: s matched | unmatched matched ifif thenthen matched elseelse matched
8、| unmatched ifif thenthen s | ifif thenthen matched elseelse unmatchedeliminating ambiguity (cont.)lambiguity is a property of the grammarlit is undecidable whether a context free grammar is ambiguouslthe proof is done by reduction to posts correspondence problemlalthough there is no general algorit
9、hm, it is possible to isolate certain constructs in productions which lead to ambiguous grammarseliminating ambiguity (cont.)lfor example, a grammar containing the production would be ambiguous, because the substring has two parses:aaaaaaaaaalthis ambiguity disappears if we use the productions and o
10、r the productions and .eliminating ambiguity (cont.)lexamples of ambiguous productions:aa aa a | a anda a | a a an example of such a language isl=aibjcm | i=j or j=m which can be generated by the grammar: sab | dc aaa | e e ccc | e e bbbc | e e dadb | e eelimination of left recursionla grammar is le
11、ft recursive if it has a nonterminal a and a derivation + + for some string top-down parsing methods (to be discussed shortly) cannot handle left-recursive grammars, so a transformation to eliminate left recursion is needed.limmediate left recursion (productions of the form ) can be easily eliminate
12、d.lwe group the -productions as 1 | 2 | | m | 1| 2 | | n where no i begins with . then we replace the -productions by 1 | 2 | | n 1 | 2 | | m | e eelimination of left recursion (cont.)lthe previous transformation, however, does not eliminate left recursion involving two or more steps. for example, c
13、onsider the grammar saa | b aac| sd |e e s is left-recursive because a da, , but it is not immediately left recursiveelimination of left recursion (cont.)algorithm. for to fortoelimination of left recursion (cont.)lto show that the previous algorithm actually works all we need notice is that iterati
14、on i only changes productions with on the left-hand side. and m i in all productions of the form linduction proof: clearly true for i=1 if it is true for all ik, then when the outer loop is executed for i=k, the inner loop will remove all productions aiam with m ilso, at the end of the algorithm, al
15、l derivations of the form + + will have m i and therefore left recursion would not be possibleleft factoringlleft factoring helps transform a grammar for predictive parsinglfor example, if we have the two productions ifif thenthen elseelse | ifif thenthen lon seeing the input token ifif, we cannot i
16、mmediately tell which production to choose to expand lin general, if we have 1 | 2 and the input begins with , we do not know (without looking further) which production to use to expand left factoring (cont.)lhowever, we may defer the decision by expanding a to althen after seeing the input derived
17、from , we may expand a to 1 or to 2lleft-factored, the original productions become 1 | 2non-context-free language constructslexamples of non-context-free languages are: l1 = wcw | w is of the form (a|b)* l2 = anbmcndm | n 1 and m 1 l3 = anbncn | n 0 l1 = wcwr | w is of the form (a|b)* (wr stands for
18、 w reversed) this language is generated by the grammar s asa | bsb | c l2 = anbmcmdn | n 1 and m 1 this language is generated by the grammar s asd | aada bac | bcnon-context-free language constructs (cont.)ll”2=a b c d | ands aba aab | abb cbd | cds asb | ab non-context-free language constructs (con
19、t.)lsuppose we could construct a dfsm accepting l3. must have a finite number of states, say . lconsider the sequence of states , , , , entered by having read e e, a, aa, , ak. lsince only has states, two of the states in the sequence have to be equal. say, (i j). lfrom , a sequence of i bs leads to
20、 an accepting (final) state. therefore, the same sequence of i bs will also lead to an accepting state from . therefore d would accept ajbi which means that the language accepted by d is not identical to l3. a contradiction.parsingthe parsing problem is: given string of tokens , find a parse tree wh
21、ose frontier is . (equivalently, find a derivation from .)a for a grammar reads a list of tokens and finds a parse tree if they form a sentence (or reports an error otherwise)two classes of algorithms for parsing: top-down bottom-upparser generatorsla is a program that reads a grammar and produces a
22、 parserlthe best known parser generator is it produces bottom-up parserslmost parser generators - including yacc - do not work for every cfg; they accept a restricted class of cfgs that can be parsed efficiently using the method employed by that parser generatortop-down parsinglstarting from parse t
23、ree containing just , build tree down toward input. expand left-most non-terminal.lalgorithm: (next slide)top-down parsing (cont.)llet input = a1a2.ancurrent sentential form (csf) = sloop suppose csf = t1.tka if t1.tk a1.ak , its an error based on ak+1., choose production a csf becomes t1.tk top-dow
24、n parsing examplegrammar: : l e l | e e | input: parse tree sentential form inputl a;be;l a;bllel;lel;aa;la;btop-down parsing example (cont.)parse tree sentential form inputa;ea;blel;aelel;aeba;ba;bll(1) parsinglefficient form of top-down parsingluse only first symbol of remaining input (ak+1) to ch
25、oose next production. that is, employ a function m: n p in “choose production” step of algorithm.lwhen this works, grammar is called ll(1) exampleslexample 1: h: l e ; l | e e a | bgiven input a;b, so next symbol is a.which production to use? cant tell. h not ll(1)ll(1) exampleslexample 2: exp term
26、exp exp $ | + exp term id(use $ for “end-of-input” symbol.)grammar is ll(1): exp and term have only one production; exp has two productions but only one is applicable at any time. nonrecursive predictive parsinglit is possible to build a nonrecursive predictive parser by maintaining a stack explicit
27、ly, rather than implicitly via recursive callslthe key problem during predictive parsing is that of determining the production to be applied for a non-terminalnonrecursive predictive parsing algorithm. repeatifthenifthenelseelseifthenelse untilll(1) grammarsa a : if this production is chosen, parse
28、makes no progress.a | can fix by “left factoring”:a a | ll(1) grammars (cont.)precise definition requires that production to choose be unique (“choose” function m very hard to calculate otherwise)top-down parsinginput tokens: le0 e-nstart symbol androot of parse treeinput tokens: le0 e-n.from left t
29、o right,“grow” the parsetree downwardschecking ll(1)-nesslfor any sequence of grammar symbols , define set first( ) to befirst( ) = a | * a for some checking ll(1)-nessldefine: grammar g = (n, , p, s) is iff whenever there are two left-most derivations (in which the leftmost non-terminal is always e
30、xpanded first) =* = =* =* = =* such that first(x) = first(y), it follows that = in other words, given 1. a string in v* and 2. the first terminal symbol to be derived from say there is at most one production that can be applied to to yield a derivation of any terminal string beginning with lfirst se
31、ts can often be calculated by inspectionfirst setsexp term expexp $ | + exp term id(use $ for “end-of-input” symbol)first($) = $first(+ exp) = +first($) first(+ exp) = grammar is ll(1)first setsl e ; l | ee a | bfirst(e ; l) = a, b = first(e)first(e ; l) first(e) grammar not ll(1).computing first se
32、ts algorithm. foralldo foralldo foralldo repeatforall doforallifthen untilfirst sets of strings of symbolsfirst sets do not sufficelgiven the productions t x t y t wt e elt w should be applied when the next input token is w.lt e e should be applied whenever the next terminal (the one pointed to by )
33、 is either x or yfollow setslfor any nonterminal , define set follow( ) asfollow( ) = a | s * a computing the follow set algorithm. follow(s) =$ foralldo repeatforall do untilconstruction of a predictive parsing table algorithm. m:,: = foralldoforall do ifthenforall do erroranother definition of ll(1)define: grammar g is if for every a n with productions a 1 | | | | nfirst( i follow(a) first( j follow(a) ) = for all i, jregular languagesldefinition. a grammar is one whose productions are all of the type: a ab a a a r1 | r2 r1 r2 r*nondeterministic finite s
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 特殊需求宠物共同抚养权及费用分担合同
- 新能源汽车关键技术合作及知识产权共享协议
- 短视频电商直播商品选择与供应链解决方案协议
- 微信小程序电商供应链金融与仓储管理联合合同
- 《文学作品的深邃魅力:课件设计与展示》
- 企业业务流程管理
- 《小学课件:探索宇宙的奥秘》
- 电器知识培训教材
- 《Lily的产品摄影》课件
- 学生干部能力提升与班级建设专题培训
- 《民法》全册精讲课件
- 小学语文五年级知识竞赛课件
- 护理人员业务技术档案 模板
- 工艺管道仪表流程图PID基础知识入门级培训课件
- 人音版小学一年级音乐下册教案 全册
- 草皮铺种施工方案
- 中医养生穴位保健按摩课件
- 回旋镖运动轨迹的模拟
- 《康复医学》PPT课件(PPT 105页)
- (完整)高血压病历以及全套临床病历
- 标准溶液配制与标定原始记录(氢氧化钠)
评论
0/150
提交评论