




版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1、parsingparsingcalculate grammatical structure of program, like diagramming sentences, where: tokens = “words” programs = “sentences”for further information, read: aho, sethi, ullman, “compilers: principles, techniques, and tools” (a.k.a, the “dragon book”)outline of coveragelcontext-free grammarslpa
2、rsing tabular parsing methods one pass top-down bottom-upparser: extracts grammatical structure of programfunction-defnameargumentsstmt-listmainstmtexpressionoperatorexpressionexpressionvariablestringcout s | s.ide - e + t - e + t + t - t + t + t - id + t + t - id + t * id + t - id + id * id + t -id
3、 + id * id + idambiguitylambiguity is a function of the grammar rather than the languagelcertain ambiguous grammars may have equivalent unambiguous onesgrammar transformationslgrammars can be transformed without affecting the language generatedlthree transformations are discussed next: eliminating a
4、mbiguity eliminating left recursion (ductions of the form aa ) left factoringeliminating ambiguitylsometimes an ambiguous grammar can be rewritten to eliminate ambiguitylfor example, expressions involving additions and products can be written as follows: lthe language generated by this gramma
5、r is the same as that generated by the grammar on tranparency 11. both generate id(+id|*id)*lhowever, this grammar is not ambiguouseliminating ambiguity (cont.)lone advantage of this grammar is that it represents the precedence between operators. in the parsing tree, products appear nested within ad
6、ditionsetteid+*idtideliminating ambiguity (cont.)lan example of ambiguity in a programming language is the dangling elseelselconsider ifif thenthen elseelse | ifif thenthen | eliminating ambiguity (cont.)lwhen there are two nested ifs and only one else.sthenselsesif then ssthensthenselseseliminating
7、 ambiguity (cont.)lin most languages (including c+ and java), each elseelse is assumed to belong to the nearest ifif that is not already matched by an elseelse. this association is expressed in the following (unambiguous) grammar: s matched | unmatched matched ifif thenthen matched elseelse matched
8、| unmatched ifif thenthen s | ifif thenthen matched elseelse unmatchedeliminating ambiguity (cont.)lambiguity is a property of the grammarlit is undecidable whether a context free grammar is ambiguouslthe proof is done by reduction to posts correspondence problemlalthough there is no general algorit
9、hm, it is possible to isolate certain constructs in productions which lead to ambiguous grammarseliminating ambiguity (cont.)lfor example, a grammar containing the production would be ambiguous, because the substring has two parses:aaaaaaaaaalthis ambiguity disappears if we use the productions and o
10、r the productions and .eliminating ambiguity (cont.)lexamples of ambiguous productions:aa aa a | a anda a | a a an example of such a language isl=aibjcm | i=j or j=m which can be generated by the grammar: sab | dc aaa | e e ccc | e e bbbc | e e dadb | e eelimination of left recursionla grammar is le
11、ft recursive if it has a nonterminal a and a derivation + + for some string top-down parsing methods (to be discussed shortly) cannot handle left-recursive grammars, so a transformation to eliminate left recursion is needed.limmediate left recursion (productions of the form ) can be easily eliminate
12、d.lwe group the -productions as 1 | 2 | | m | 1| 2 | | n where no i begins with . then we replace the -productions by 1 | 2 | | n 1 | 2 | | m | e eelimination of left recursion (cont.)lthe previous transformation, however, does not eliminate left recursion involving two or more steps. for example, c
13、onsider the grammar saa | b aac| sd |e e s is left-recursive because a da, , but it is not immediately left recursiveelimination of left recursion (cont.)algorithm. for to fortoelimination of left recursion (cont.)lto show that the previous algorithm actually works all we need notice is that iterati
14、on i only changes productions with on the left-hand side. and m i in all productions of the form linduction proof: clearly true for i=1 if it is true for all ik, then when the outer loop is executed for i=k, the inner loop will remove all productions aiam with m ilso, at the end of the algorithm, al
15、l derivations of the form + + will have m i and therefore left recursion would not be possibleleft factoringlleft factoring helps transform a grammar for predictive parsinglfor example, if we have the two productions ifif thenthen elseelse | ifif thenthen lon seeing the input token ifif, we cannot i
16、mmediately tell which production to choose to expand lin general, if we have 1 | 2 and the input begins with , we do not know (without looking further) which production to use to expand left factoring (cont.)lhowever, we may defer the decision by expanding a to althen after seeing the input derived
17、from , we may expand a to 1 or to 2lleft-factored, the original productions become 1 | 2non-context-free language constructslexamples of non-context-free languages are: l1 = wcw | w is of the form (a|b)* l2 = anbmcndm | n 1 and m 1 l3 = anbncn | n 0 l1 = wcwr | w is of the form (a|b)* (wr stands for
18、 w reversed) this language is generated by the grammar s asa | bsb | c l2 = anbmcmdn | n 1 and m 1 this language is generated by the grammar s asd | aada bac | bcnon-context-free language constructs (cont.)ll”2=a b c d | ands aba aab | abb cbd | cds asb | ab non-context-free language constructs (con
19、t.)lsuppose we could construct a dfsm accepting l3. must have a finite number of states, say . lconsider the sequence of states , , , , entered by having read e e, a, aa, , ak. lsince only has states, two of the states in the sequence have to be equal. say, (i j). lfrom , a sequence of i bs leads to
20、 an accepting (final) state. therefore, the same sequence of i bs will also lead to an accepting state from . therefore d would accept ajbi which means that the language accepted by d is not identical to l3. a contradiction.parsingthe parsing problem is: given string of tokens , find a parse tree wh
21、ose frontier is . (equivalently, find a derivation from .)a for a grammar reads a list of tokens and finds a parse tree if they form a sentence (or reports an error otherwise)two classes of algorithms for parsing: top-down bottom-upparser generatorsla is a program that reads a grammar and produces a
22、 parserlthe best known parser generator is it produces bottom-up parserslmost parser generators - including yacc - do not work for every cfg; they accept a restricted class of cfgs that can be parsed efficiently using the method employed by that parser generatortop-down parsinglstarting from parse t
23、ree containing just , build tree down toward input. expand left-most non-terminal.lalgorithm: (next slide)top-down parsing (cont.)llet input = a1a2.ancurrent sentential form (csf) = sloop suppose csf = t1.tka if t1.tk a1.ak , its an error based on ak+1., choose production a csf becomes t1.tk top-dow
24、n parsing examplegrammar: : l e l | e e | input: parse tree sentential form inputl a;be;l a;bllel;lel;aa;la;btop-down parsing example (cont.)parse tree sentential form inputa;ea;blel;aelel;aeba;ba;bll(1) parsinglefficient form of top-down parsingluse only first symbol of remaining input (ak+1) to ch
25、oose next production. that is, employ a function m: n p in “choose production” step of algorithm.lwhen this works, grammar is called ll(1) exampleslexample 1: h: l e ; l | e e a | bgiven input a;b, so next symbol is a.which production to use? cant tell. h not ll(1)ll(1) exampleslexample 2: exp term
26、exp exp $ | + exp term id(use $ for “end-of-input” symbol.)grammar is ll(1): exp and term have only one production; exp has two productions but only one is applicable at any time. nonrecursive predictive parsinglit is possible to build a nonrecursive predictive parser by maintaining a stack explicit
27、ly, rather than implicitly via recursive callslthe key problem during predictive parsing is that of determining the production to be applied for a non-terminalnonrecursive predictive parsing algorithm. repeatifthenifthenelseelseifthenelse untilll(1) grammarsa a : if this production is chosen, parse
28、makes no progress.a | can fix by “left factoring”:a a | ll(1) grammars (cont.)precise definition requires that production to choose be unique (“choose” function m very hard to calculate otherwise)top-down parsinginput tokens: le0 e-nstart symbol androot of parse treeinput tokens: le0 e-n.from left t
29、o right,“grow” the parsetree downwardschecking ll(1)-nesslfor any sequence of grammar symbols , define set first( ) to befirst( ) = a | * a for some checking ll(1)-nessldefine: grammar g = (n, , p, s) is iff whenever there are two left-most derivations (in which the leftmost non-terminal is always e
30、xpanded first) =* = =* =* = =* such that first(x) = first(y), it follows that = in other words, given 1. a string in v* and 2. the first terminal symbol to be derived from say there is at most one production that can be applied to to yield a derivation of any terminal string beginning with lfirst se
31、ts can often be calculated by inspectionfirst setsexp term expexp $ | + exp term id(use $ for “end-of-input” symbol)first($) = $first(+ exp) = +first($) first(+ exp) = grammar is ll(1)first setsl e ; l | ee a | bfirst(e ; l) = a, b = first(e)first(e ; l) first(e) grammar not ll(1).computing first se
32、ts algorithm. foralldo foralldo foralldo repeatforall doforallifthen untilfirst sets of strings of symbolsfirst sets do not sufficelgiven the productions t x t y t wt e elt w should be applied when the next input token is w.lt e e should be applied whenever the next terminal (the one pointed to by )
33、 is either x or yfollow setslfor any nonterminal , define set follow( ) asfollow( ) = a | s * a computing the follow set algorithm. follow(s) =$ foralldo repeatforall do untilconstruction of a predictive parsing table algorithm. m:,: = foralldoforall do ifthenforall do erroranother definition of ll(1)define: grammar g is if for every a n with productions a 1 | | | | nfirst( i follow(a) first( j follow(a) ) = for all i, jregular languagesldefinition. a grammar is one whose productions are all of the type: a ab a a a r1 | r2 r1 r2 r*nondeterministic finite s
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 2025版矿产资源开采合同履约保证金协议
- 二零二五年度宾馆特色客房装饰设计采购合同范本
- 2025版场地使用权合作协议规范模板
- 2025年玻璃制品安装与环保性能评估承包合同
- 2025版金融衍生品销售合同范本三(附风险评估)
- 二零二五年度建筑安全及文明施工现场管理合同
- 2025版测绘仪器设备销售及市场分析合同
- 二零二五年度cfg桩基础施工项目变更与索赔合同
- 二零二五年度旅游并购居间服务合同范本
- 2025版SaaS企业协同办公解决方案服务合同
- 电子教程pdms中文培训手册详细
- 绿皮书拉片电影节拍表(借鉴材料)
- 专业技术职务聘任表(2017年版)
- GB/T 602-2002化学试剂杂质测定用标准溶液的制备
- GB/T 12706.1-2020额定电压1 kV(Um=1.2 kV)到35 kV(Um=40.5 kV)挤包绝缘电力电缆及附件第1部分:额定电压1 kV(Um=1.2 kV)和3 kV(Um=3.6 kV)电缆
- 新版有创血压监测ABP培训课件
- 重症医学科常用知情告知书
- 防溺水、防性侵、防欺凌安全教育家长会
- DB11-T1322-14-2017安全生产等级评定技术规范第14部分:汽车制造企业
- 养老机构安全检查表
- 企业员工上下班交通安全培训(简详共2份)
评论
0/150
提交评论