Compiler Principles: Overview

Chapter 1: Introduction

Instructor: Jianhui Yue
Software College @ SCU
Office: JC-B321

Outline
1. Concepts
2. Compiler Processes Overview
3. Suggestions on how to study this course

Compilers
- Compilers are computer programs that translate one language to another.
- A compiler is a very complex program, from 10,000 to 1,000,000 lines of code.
- Its input is a program written in its source language. Usually, the source language is a high-level language (C, C++, etc.).
- It produces an equivalent program written in its target language. The target language is object code (machine code) for the target machine.

Machine Language
- Initially, programs were written in machine language: numeric codes that represent the actual machine operations to be performed.
- Example: C7 06 0000 0002 moves the number 2 to location 0000.
- Writing such code is time-consuming and tedious.

Assembly Language
- In assembly language, instructions and memory locations are given symbolic forms.
- Example: MOV X, 2
- An assembler translates the symbolic codes and memory locations into the corresponding numeric codes.
- Advantages: great improvement in the speed and accuracy of writing programs; still used today.
- Disadvantages: still not easy to write, and difficult to read and understand; machine dependent.

High-level Languages
- Closely resemble mathematical notation or natural language.
- Independent of any particular machine.
- Capable of being translated into executable code.
- Example: X = 2
- Need a program that performs the translation. Is it possible? Is the generated object code efficient?

Theoretical Foundations
- Noam Chomsky's study of the structure of natural languages.
- Classification of languages according to the complexity of their grammars and the algorithms needed to recognize them.
- Context-free grammars are the most useful for programming languages.
- Study of the parsing problem, which became a standard part of compiler theory.
- Study of finite automata and regular expressions, which led to symbolic methods for expressing the structure of the words of a programming language.
- Optimization techniques
  (code improvement techniques).

Interpreters
- An interpreter is a language translator, like a compiler.
- The difference: the source program is executed immediately, rather than after translation is complete.
- A programming language can be either interpreted or compiled.
- Typically interpreted languages: BASIC, LISP, Java (whose bytecode runs on a virtual machine).
- Typically compiled languages: FORTRAN, C, C++.
- Interpreters share many operations with compilers.

Assemblers
- An assembler is a translator for the assembly language of a particular computer.
- Assembly language is a symbolic form of the machine language and is easy to translate.
- Sometimes a compiler generates assembly language as its target language; an assembler then finishes the translation into object code.

Linkers
- A linker collects code that was separately compiled or assembled in different object files into a final executable file.
- It also connects the code to standard library functions and to resources supplied by the OS (memory allocators, I/O devices).
- Linking was originally one of the principal activities of a compiler.

Loaders
- In object code, the primary memory references are made relative to an undetermined starting location that can be anywhere in memory.
- A loader resolves all relocatable addresses relative to a given starting address.
- Usually, the loading process is part of the OS.

Preprocessors
- A preprocessor is a separate program that is called by the compiler before translation begins.
- Preprocessors can:
  - Delete comments
  - Include other files
  - Perform macro substitutions
- A macro is a shorthand description of a repeated sequence of text.

Editors
- Source programs are written using an editor that produces a standard file (ASCII).
- Recently, compilers have been bundled with editors and other programs into an interactive development environment (IDE).
- Such editors may be oriented toward the format of the programming language.
- The programmer may be informed of errors as the program is written.
- The compiler can be called from within the editor.

Debuggers
- A debugger is a program used to find execution errors in a compiled program.
- It is also packaged in an IDE.
- The debugger keeps track of source code information such as line numbers and the names of variables and procedures.
- It can halt execution at breakpoints and provide information on called functions and the current values of variables.

Its Role in the Computer System
[Diagram: Hardware, OS, Compiler, High-Level Language, Other Facilities]

Translation Process
[Diagram: the phases of a compiler]
Source Code -> Scanner -> Tokens -> Parser -> Syntax Tree -> Semantic Analyzer -> Annotated Tree -> Source Code Optimizer -> Intermediate Code -> Code Generator -> Target Code -> Target Code Optimizer -> Target Code
Auxiliary components used by the phases: Literal Table, Symbol Table, Error Handler

Why Study It?
1. Obtain knowledge about compilers.
2. Write good-quality program code.
3. Obtain knowledge about system software construction to improve programming skills.
4. Learn how to apply theory to software construction (data structures, discrete math, formal languages and automata, etc.).
5. Lay a solid base for future research work in computer science. (Is it really dead?)

Its Application Cases
1. Embedded SQL.
2. Extensions to existing languages.
3. Developing special-purpose languages (display card driver generation language, game development language, AI language).
4. Constructing CASE tools (oriented to SE research), e.g. to check for software bugs.
5. Reducing the gap between software and hardware (oriented to CS research).

The Scanner
- Reads the source program (a stream of characters).
- Performs lexical analysis: collects sequences of characters into meaningful units called tokens.
- Example: a[index] = 4 + 2;

    a       identifier
    [       left bracket
    index   identifier
    ]       right bracket
    =       assignment
    4       number
    +       plus sign
    2       number

The Parser
- Receives the source in the form of tokens.
- Performs syntax analysis: determines the structure of the program, similar to performing grammatical analysis on a sentence in a natural language.
- The result is represented as a parse tree or a syntax tree.

Parse Tree

expression
  assign-expression
    expression
      subscript-expression
        expression
          identifier a
        [
        expression
          identifier index
        ]
    =
    expression
      additive-expression
        expression
          number 4
        +
        expression
          number 2

Abstract Syntax Tree
- An abstract syntax tree is a condensation of the information contained in a parse tree.

assign-expression
  subscript-expression
    identifier a
    identifier index
  additive-expression
    number 4
    number 2

The Semantic Analyzer
- The semantics of a program are its "meaning".
- The semantics of a program determine its runtime behavior.
- Most programming languages have features (called static semantics) that can be determined prior to execution.
- Typical static semantic features:
  - Declarations
  - Type checking
- The extra pieces of information computed by the semantic analyzer are called attributes. They are added to the tree as annotations, or "decorations".

Annotated Tree

assign-expression
  subscript-expression : integer
    identifier a : array of integer
    identifier index : integer
  additive-expression : integer
    number 4 : integer
    number 2 : integer

The Source Code Optimizer
- The earliest point at which optimization steps can be performed is just after semantic analysis.
- There may be optimization possibilities that depend only on the source code.
- Compilers exhibit wide variation in the kinds of optimization they perform and where they place them.
- The output of the source code optimizer is the intermediate representation (IR), or intermediate code.

Example

- 4 + 2 can be precomputed by the compiler. This optimization is known as constant folding.
- It can be performed on the annotated syntax tree by collapsing the right-hand subtree to its constant value.

assign-expression
  subscript-expression : integer
    identifier a : array of integer
    identifier index : integer
  number 6 : integer

The Code Generator
- The code generator takes the intermediate code or IR and generates code for the target machine.
- We will write target code in assembly language form, although most compilers generate object code directly.
- The properties of the target machine become important:
  - Use the instructions of the target machine.
  - Data representations: how many bytes or words integer and floating-point data types occupy in memory.

Example
- &a is the address of a (the base address of the array).
- *R1 means indirect register addressing.
- We assume that the machine performs byte addressing and that integers occupy two bytes of memory.

    MOV R0, index   ;; value of index -> R0
    MUL R0, 2       ;; double value in R0
    MOV R1, &a      ;; address of a -> R1
    ADD R1, R0      ;; add R0 to R1
    MOV *R1, 6      ;; constant 6 -> address in R1

The Target Code Optimizer
- Improvements include:
  - Choosing addressing modes to improve performance.
  - Replacing slow instructions by faster ones.
  - Eliminating redundant or unnecessary operations.

Example:

    MOV R0, index     ;; value of index -> R0
    SHL R0            ;; double the value in R0
    MOV &a[R0], 6     ;; constant 6 -> address a + R0

Major Data Structures in a Compiler
- There is a strong interaction between the algorithms used by the phases of a compiler and the data structures that support these phases.
- Algorithms need to be implemented in an efficient manner.
- The choice of data structures is important.

Tokens
- When a scanner collects characters into a token, it represents the token symbolically as a value of an enumerated data type representing the set of tokens of the source language.
- Sometimes it is necessary to preserve the character string itself or other information derived from it:
  - The name associated with an identifier token
  - The value of a number token
- In most languages the scanner needs to generate only one token at a time (single-symbol lookahead).
- A single global variable can be used to hold the token information.

The Syntax Tree
- The syntax tree is constructed as a standard pointer-based structure that is dynamically allocated as parsing proceeds.
- The tree can be kept as a single variable pointing to the root node.
- Each node is a record. Its fields represent the information collected by the parser and the semantic analyzer.
- Sometimes these fields are dynamically allocated.

The Symbol Table
- This data structure keeps information associated with identifiers: functions, variables, constants, and data types.
- The symbol table interacts with almost every phase of the compiler.
- The insertion, deletion, and access operations need to be efficient.
- A standard data structure for this purpose is the hash table.

The Literal Table
- Stores constants and strings used in the program.
- Quick insertion and lookup are essential.
- Need not allow deletions.

Intermediate Code
- Depending on the kind of intermediate code, it may be kept as:
  - An array of text strings
  - A temporary text file
  - A linked list of structures

Temporary Files
- Historically, computers did not possess enough memory for an entire program to be kept in memory during compilation.
- This was solved by using temporary files to hold the products of intermediate steps.
- Memory constraints are now a much smaller problem.
- Occasionally, compilers still generate intermediate files during some of the steps.

Passes
- A compiler often processes the entire source program several times before generating code. These repetitions are referred to as passes.
- Passes may or may not correspond to phases.
- Depending on the language, a compiler may be one-pass: efficient compilation, but less efficient target code. Examples: Pascal and C.
- Most compilers with optimizations use more than one pass:
  - Scanning and parsing
  - Semantic analysis and source-level optimization
  - Code generation and target code optimization

Language Definition
- The description of the lexical, syntactic, and semantic structure of a programming language is collected in a language reference manual, or language definition.
- With a new language, the language definition and a compiler are often developed together.
- A more common situation is that a compiler is written for a well-known language that has an existing language definition.

Error Handling
- One of the most important functions of a compiler.
- Errors can be detected during almost every phase of compilation.
- Errors reported by a compiler are static (or compile-time) errors.
- It is important to generate meaningful error messages.
- The error handler contains different operations for each specific compiler phase and situation.

Compiler Language
- The implementation (or host) language can be machine language; this was how the first compilers were written.
- Another approach is to write the compiler in another language for which a compiler already exists. We then need only compile the new compiler using the existing compiler to get a running program.
- What if the existing compiler runs on a machine different from the target machine? Compilation then produces a cross compiler: a compiler that generates target code for a machine different from the one on which it runs.

T-Diagram
- A compiler written in language H (host language) that translates language S (source language) into language T (target language) is drawn as the following T-diagram:

[T-diagram: S at top left, T at top right, H at the base]

- This is equivalent to saying that the compiler runs on "machine" H.
- Typically, we expect H = T: the compiler produces code for the same machine as the one on which it runs.

Case 1
- There are two compilers that run on the same machine H. One translates from language A to language B; the other translates from language B to language C.
- We can combine them by letting the output of the first be the input to the second.
- The result is a compiler from A to C on machine H.

[T-diagrams: (A -> B on H) combined with (B -> C on H) => (A -> C on H)]

Case 2
- We can use a compiler M from "machine" H to "machine" K to translate the implementation language of anothe

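Example sketch: the scanner

To make the scanner phase described in the slides concrete, here is a minimal Python sketch of how the example a[index] = 4 + 2 could be split into tokens. It is an illustration under stated assumptions, not the course's implementation: the names TokenType, TOKEN_PATTERNS, MASTER, and scan are hypothetical, and the trailing semicolon is left out because the token list in the slides does not name a token for it.

```python
import re
from enum import Enum, auto


class TokenType(Enum):
    """Enumerated type naming the kinds of tokens in the example (assumed names)."""
    IDENTIFIER = auto()
    NUMBER = auto()
    LEFT_BRACKET = auto()
    RIGHT_BRACKET = auto()
    ASSIGNMENT = auto()
    PLUS = auto()


# One (token type, regular expression) pair per kind of token.
TOKEN_PATTERNS = [
    (TokenType.IDENTIFIER, r"[A-Za-z_][A-Za-z0-9_]*"),
    (TokenType.NUMBER, r"[0-9]+"),
    (TokenType.LEFT_BRACKET, r"\["),
    (TokenType.RIGHT_BRACKET, r"\]"),
    (TokenType.ASSIGNMENT, r"="),
    (TokenType.PLUS, r"\+"),
]

# Combine the patterns into one expression with a named group per token type.
MASTER = re.compile("|".join(f"(?P<{t.name}>{p})" for t, p in TOKEN_PATTERNS))


def scan(source):
    """Yield (token type, lexeme) pairs one at a time, skipping whitespace."""
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        yield TokenType[match.lastgroup], match.group()
        pos = match.end()


tokens = list(scan("a[index] = 4 + 2"))
for token_type, lexeme in tokens:
    print(f"{lexeme:<6} {token_type.name}")
```

Because scan is a generator, a parser could pull one token at a time from it, which matches the single-symbol lookahead mentioned under Tokens.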

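Example sketch: constant folding on the syntax tree

The constant folding step from the Source Code Optimizer slides can be sketched as a small tree rewrite. The Python below is a rough illustration only: the node classes Number, Identifier, Subscript, Add, and Assign and the function fold are hypothetical names chosen for this sketch, and type annotations such as "integer" are omitted to keep it short.

```python
from dataclasses import dataclass


@dataclass
class Number:
    value: int


@dataclass
class Identifier:
    name: str


@dataclass
class Subscript:
    array: object
    index: object


@dataclass
class Add:
    left: object
    right: object


@dataclass
class Assign:
    target: object
    value: object


def fold(node):
    """Return a copy of the tree with additions of two constants precomputed."""
    if isinstance(node, Add):
        left, right = fold(node.left), fold(node.right)
        if isinstance(left, Number) and isinstance(right, Number):
            # Both operands are known at compile time: collapse the subtree.
            return Number(left.value + right.value)
        return Add(left, right)
    if isinstance(node, Subscript):
        return Subscript(fold(node.array), fold(node.index))
    if isinstance(node, Assign):
        return Assign(fold(node.target), fold(node.value))
    return node  # Number and Identifier are leaves and stay unchanged


# Abstract syntax tree for a[index] = 4 + 2
tree = Assign(Subscript(Identifier("a"), Identifier("index")), Add(Number(4), Number(2)))
folded = fold(tree)
print(folded)
```

Only the right-hand subtree changes: the additive expression becomes the single node Number(6), while the subscript expression is rebuilt unchanged because a and index are not constants.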

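Example sketch: combining T-diagrams (Case 1)

Case 1 of the T-diagram discussion can be modelled by treating each compiler as a (source, target, host) triple. The Python below is a sketch under that assumption; the names Compiler and compose are hypothetical and do not come from the slides. Case 2 is not modelled because its description is cut off in the source.

```python
from typing import NamedTuple


class Compiler(NamedTuple):
    """A T-diagram: translates `source` into `target` and runs on `host`."""
    source: str
    target: str
    host: str


def compose(first, second):
    """Feed the output of `first` into `second` (Case 1)."""
    if first.host != second.host:
        raise ValueError("both compilers must run on the same machine")
    if first.target != second.source:
        raise ValueError("output language of the first must be the input of the second")
    return Compiler(first.source, second.target, first.host)


a_to_b = Compiler(source="A", target="B", host="H")
b_to_c = Compiler(source="B", target="C", host="H")
a_to_c = compose(a_to_b, b_to_c)
print(a_to_c)
```

The two checks in compose encode the conditions stated in Case 1: the compilers share machine H, and the language produced by the first is the language accepted by the second.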