Translated Literature: Chapter 3

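Original Text

Java and XML

Chapter 3: Parsing XML

With two solid chapters of introduction behind us, we are ready to code! By now you have seen the numerous acronyms that make up the world of XML, you have delved into the language itself, and you should be familiar with an XML document. This chapter takes the next step, and the first on our path of Java programming, by demonstrating how an XML document is parsed and how we can access the parsed data from within Java code.

One of the first things you will have to do when dealing with XML programmatically is take an XML document and parse it. As the document is parsed, the data in the document becomes available to the application using the parser, and suddenly we are within an XML-aware application! If this all sounds a little too simple to be true, it almost is. In this chapter, we will look closely at how an XML document is parsed. Using a parser within an application and how to feed that parser your document's data will be covered. Then we will look at the various callbacks that are available within the parsing lifecycle. These events are the points where application-specific code can be inserted and data manipulation can occur.

In addition to looking at how parsers work, we will also begin our exploration of the Simple API for XML (SAX) in this chapter. SAX is what makes these parsing callbacks available. The interfaces provided in the SAX package will become an important part of our toolkit for handling XML. Even though the SAX classes are small and few in number, everything else in our discussions of XML is based on these classes. A solid understanding of how they help us access XML data is critical to effectively leveraging XML in your Java programs.

3.1 Getting Prepared

There are several items that we should take care of before beginning to code. First, you must obtain an XML parser. Writing a parser for XML is a serious task, and there are several efforts going on to provide excellent XML parsers. We are not going to detail the process of actually writing an XML parser here; rather, we will discuss the applications that wrap this parsing behavior, focusing on using existing tools to manipulate XML data. This results in better and faster programs, as we do not seek to reinvent what is already available. After selecting a parser, we must ensure that a copy of the SAX classes is on hand. These are easy to locate, and are key to our Java code being able to process XML. Finally, we will need an XML document to parse. Then, on to the code!

3.1.1 Obtaining a Parser

The first step in getting ready to code Java that uses XML is locating and obtaining the parser you want to use. We briefly talked about this process in Chapter 1, and listed various XML parsers that could be used. To ensure that your parser works with all of the examples in the book, you should verify your parser's compliance with the XML specification. Because of the variety of parsers available and the rapid pace of change within the XML community, all of the details about which parsers have what compliance levels are beyond the scope of this book. You should consult the parser's vendor and visit the web sites previously given for this information.

In the spirit of the open source community, all of the examples in this book will use the Apache Xerces parser. Freely available in binary and source form at http://xml.apache.org, this C- and Java-based parser is already one of the most widely contributed-to parsers available. In addition, using an open source parser such as Xerces allows you to send questions or bug reports to the parser's authors, resulting in a better product, as well as helping you use the software quickly and correctly. To subscribe to the general list and request help on the Xerces parser, send a blank email to xerces-dev-subscribe@xml.apache.org. The members of this list can help if you have questions or problems with a parser not specifically covered in this book. Of course, the examples in this book all run normally on any parser that uses the SAX implementation covered here.

Once you have selected and downloaded an XML parser, make sure that your Java environment, whether it be an IDE (Integrated Development Environment) or a command line, has the XML parser classes in its class path. This will be a basic requirement for all further examples.

3.1.2 Getting the SAX Classes and Interfaces

Once you have your parser, you need to locate the SAX classes. These classes are almost always included with a parser when downloaded, and Xerces is no exception. If this is the case with your parser, you should be sure not to download the SAX classes explicitly, as your parser is probably packaged with the latest version of SAX that is supported by the parser. At the time of this writing, SAX 2.0 had just gone final. The SAX 2.0 classes are used throughout this book, and should come bundled with the latest version of the Apache Xerces parser.

If you are not sure whether you have the SAX classes, look at the jar file or class structure used by your parser. The SAX classes are packaged in the org.xml.sax structure. The latest version of these includes 17 classes in this root directory, as well as 9 classes in org.xml.sax.helpers and 2 in org.xml.sax.ext. If you are missing any of these classes, you should try to contact your parser's vendor to see why the classes were not included with your distribution. It is possible that some classes may have been left out if they are not supported in whole. [1] These class counts are for SAX 2.0 as well; fewer classes may appear if only SAX 1.0 is supported.

[1] Supporting SAX in whole is a very important item for a parser. Although you are certainly welcome to use any parser you like, if your parser does not have complete SAX 2.0 support, many of the examples in this book will not work. In addition, your parser is not keeping up with the latest XML developments. For either or both reasons, you may want to consider at least trying the Xerces parser for the duration of this book.

Finally, you may want to either download or bookmark the SAX API Javadocs on the Web. This documentation is extremely helpful in using the SAX classes, and the Javadoc structure provides a standard, simple way to find out additional information about the classes and what they do. This documentation is located at http://www.megginson.com/SAX/SAX2/javadoc/index.html. You may also generate Javadoc from the SAX source if you wish, by using the source included with your parser, or by downloading the complete source from http://www.megginson.com/SAX/SAX2.

3.1.3 Have an XML Document on Hand

You should also make sure that you have an XML document to parse. The output shown in the examples is based on parsing the XML document we discussed in Chapter 2. Save this file as contents.xml somewhere on your local hard drive. We highly recommend that you follow what we're doing in this file; you can simply type the file in from the book, or you may download the XML file from the book's web site, http://www.oreilly.com/catalog/javaxml. You are encouraged to take the time to type in the example, though, as it will almost certainly familiarize you with XML syntax more than a quick download will.

In addition to downloading or creating the XML file, you need to make a couple of small modifications. Because we haven't covered or discussed how to constrain and transform documents, our programs only parse XML in this chapter. To prevent errors, we need to remove the references within the XML document to an external DTD, which constrains the XML, and the XSL stylesheets that transform it. You should comment out these two lines in the XML document, as well as the processing instruction to Cocoon requesting XSL transformation. Once these lines are commented, note the full path to the XML document. You will need to supply that path to our programs in this and later chapters.

Finally, we need to comment out our reference to the OReillyCopyright external entity that would be used to load a file from the filesystem with the needed copyright information. Without a DTD to define how to resolve this entity reference, we will receive unwanted errors. In the next chapter, we will look at how to resolve this reference for the XML document.

3.2 SAX Readers

Without spending any further time on the preliminaries, let's begin to code. Our first program will be able to take an XML file as a command-line parameter, and parse that file. We will build document callbacks into the parsing process so that we can display events in the parsing process as they occur, which will give us a better idea of what exactly is going on "under the hood."

The first thing we need to do is get an instance of a class that conforms to the SAX org.xml.sax.XMLReader interface. This interface defines parsing behavior and allows us to set features and properties, which we will look at in Chapter 5. For those of you familiar with SAX 1.0, this interface replaces the org.xml.sax.Parser interface.

3.2.1 Instantiating a Reader

SAX provides an interface that all SAX-compliant XML parsers should implement. This allows SAX to know exactly what methods are available for callback and use within an application. For example, the Xerces main SAX parser class, org.apache.xerces.parsers.SAXParser, implements the org.xml.sax.XMLReader interface. If you have access to the source of your parser, you should see the same interface implemented in your parser's main SAX parser class. Each XML parser must have one class (sometimes more) that implements this interface, and that is the class we need to instantiate to allow us to parse XML:

    XMLReader parser = new SAXParser();

    // Do something with the parser
    parser.parse(uri);

For those of you new to SAX entirely, it may be a bit confusing not to see the instance variable we used named reader or XMLReader. While that would be a normal convention, the SAX 1.0 classes defined the main parsing interface as Parser, and a lot of legacy code has variables named parser because of that naming. This interface was deprecated because of the large number of changes required for namespace and feature and properties support, but the naming convention is still a good one, as parser does indicate the purpose of the instance variable.

With that in mind, let's look at a small program to start up and instantiate a SAX parser. This program, shown in Example 3-1, won't actually parse a document, but sets up the skeleton within which we can work for the rest of the chapter. We will add the actual parsing behavior in the next section.

Example 3-1. SAX Parser Example

    import org.xml.sax.XMLReader;

    // Import your vendor's XMLReader implementation here
    import org.apache.xerces.parsers.SAXParser;

    /**
     * SAXParserDemo will take an XML file and parse it using SAX,
     *   displaying the callbacks in the parsing lifecycle.
     *
     * @author Brett McLaughlin
     * @version 1.0
     */
    public class SAXParserDemo {

        /**
         * This parses the file, using registered SAX handlers, and outputs
         *   the events in the parsing process cycle.
         *
         * @param uri String URI of file to parse.
         */
        public void performDemo(String uri) {
            System.out.println("Parsing XML File: " + uri + "\n\n");

            // Instantiate a parser
            XMLReader parser = new SAXParser();
        }

        /**
         * This provides a command-line entry point for this demo.
         */
        public static void main(String[] args) {
            if (args.length != 1) {
                System.out.println("Usage: java SAXParserDemo [XML URI]");
                System.exit(0);
            }

            String uri = args[0];

            SAXParserDemo parserDemo = new SAXParserDemo();
            parserDemo.performDemo(uri);
        }
    }

You should be able to load and compile this program if you made the preparations talked about earlier to ensure the SAX classes are in your class path. This simple program doesn't do much yet; in fact, if you run it and supply a bogus filename or URI as an argument, it should happily grind away and do nothing, other than print out the initial "Parsing XML file" message. That's because we have only instantiated a parser, not requested that our XML document be parsed.

If you have trouble compiling this source file, you most likely have problems with your IDE or system's class path. First, make sure you obtained the Apache Xerces parser (or your vendor's parser). For Xerces, this involves downloading a jar file. This archive can then be extracted, and will contain a xerces.jar file; it is this jar file that contains the compiled class files for the program. Add this archive to your class path. You should then be able to compile the source file listing.

3.2.2 Parsing the Document

Once a parser is loaded and ready for use, we can instruct it to parse our document. This is conveniently handled by the parse() method of org.xml.sax.XMLReader, and this method can accept either an org.xml.sax.InputSource, or a simple string URI. For now, we will defer talking about using an InputSource and look at passing in a simple URI. Although this URI could be a network-accessible address, we will use the full path to the XML document we prepared for this use earlier. If you did choose to use a URL for network-accessible XML documents, you should be aware that the application would have to resolve the URL before passing it to the parser (generally this requires only some form of network connectivity).

We need to add the parse() method to our program, as well as two exception handlers. Because the document must be loaded, either locally or remotely, a java.io.IOException can result, and must be caught. In addition, the org.xml.sax.SAXException can be thrown if problems occur while parsing the document. So we can add two more import statements and a few lines of code, and have an application that parses XML ready to use:

    import java.io.IOException;

    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;

    // Import your vendor's XMLReader implementation here
    import org.apache.xerces.parsers.SAXParser;

    ...

        /**
         * This parses the file, using registered SAX handlers, and outputs
         *   the events in the parsing process cycle.
         *
         * @param uri String URI of file to parse.
         */
        public void performDemo(String uri) {
            System.out.println("Parsing XML File: " + uri + "\n\n");

            try {
                // Instantiate a parser
                XMLReader parser = new SAXParser();

                // Parse the document
                parser.parse(uri);

            } catch (IOException e) {
                System.out.println("Error reading URI: " + e.getMessage());
            } catch (SAXException e) {
                System.out.println("Error in parsing: " + e.getMessage());
            }
        }

Compile these changes and you are ready to execute the parsing example. You should specify the full path to your file as the first argument to the program:

    D:\prod\JavaXML> java SAXParserDemo D:\prod\JavaXML\contents\contents.xml
    Parsing XML File: D:\prod\JavaXML\contents\contents.xml

This rather uninteresting output may make you doubt that anything has happened. However, if you lean nice and close, you may hear your hard drive spin briefly (or you can just have faith in our bytecode). In fact, the XML document is parsed, and if you pass in an invalid file URI, the parser will throw an exception letting you know it couldn't locate a file to parse. However, we have not set up any callbacks to tell SAX to take action during the parsing process and let us know what is going on. Without these callbacks, a document is parsed quietly and without application intervention. Of course, we want to intervene in that process, so we must next look at creating some parser callback methods. This intervention is the most important part of using SAX. Parser callbacks let us insert action into the program flow, and turn our rather boring, quiet parsing of an XML document into an application that can react to the data, elements, attributes, and structure of the document being parsed, as well as interact with other programs and clients along the way.

3.2.3 Using an InputSource

Instead of using a full URI, the parse() method may also be invoked with an org.xml.sax.InputSource as an argument. There is actually remarkably little to comment on in regard to this class; it is used as a helper and wrapper class more than anything else. An InputSource simply encapsulates information about a single object. While this isn't very helpful in our example, in situations where a system identifier, public identifier, or a stream may all be tied to one URI, using an InputSource for encapsulation can become very handy. The class has accessor and mutator methods for its system ID and public ID, a character encoding, a byte stream (java.io.InputStream), and a character stream (java.io.Reader). When an InputSource is passed as an argument to the parse() method, SAX also guarantees that the parser will never modify it. This ensures that the original input to a parser is still available unchanged after its use by a parser or XML-aware application. While we do not spend any further time looking at this utility class here, many of the applications we look at later in the book use the InputSource class as input to SAX parsers rather than a specific URI.

3.3 Content Handlers

In order to let our application do something useful with XML data as it is being parsed, we must register handlers with the SAX parser. A handler is nothing more than a set of callbacks that SAX defines to let us interject application code at important events within a document's parsing. Realize that these events will take place as the document is parsed, not after the parsing has occurred. This is one of the reasons that SAX is such a powerful interface: it allows a document to be handled sequentially, without having to first read the entire document into memory. We will later look at the Document Object Model (DOM), which has this limitation.

Translation

Chapter 3

After the introduction in the first two chapters, we can begin writing code. By now you have seen the many terms of the XML world, you have explored the language itself, and you should be fairly familiar with an XML document. This chapter discusses how an XML document is parsed and how the parsed data can be obtained from Java code, which is the first step on our path of Java programming.

When handling XML programmatically, one of the first things to do is take an XML document and parse it. Once the document is parsed, its data becomes available to the application using the parser, and we have an XML-aware application. All of this may sound a little too simple, but that is nearly how it is. In this chapter we look closely at how an XML document is parsed, and we cover how to use a parser within an application and how to feed your data to that parser. We then look at the callback events available during parsing; these events are the points where application-specific code can be inserted and data manipulation can occur.

Besides studying how parsers work, this chapter also begins our study of the Simple API for XML (SAX). SAX is what makes these parsing callbacks available, and the interfaces in the SAX package will become an important part of our toolkit for handling XML. Although the SAX classes are small and few, everything else we discuss about XML is built on them. A solid understanding of their role in accessing XML data is essential to using XML effectively in Java programs.

Getting Prepared

Several points need attention before we begin to code. First, you must obtain an XML parser. Writing a parser for XML is a demanding task, and several efforts are under way to provide excellent XML parsers. We will not describe the process of actually writing one here. Instead, we discuss the applications that wrap this parsing behavior and focus on using existing tools to manipulate XML data; because we do not try to reinvent what is already available, the result is better and faster programs. After choosing a parser, you must make sure you have a copy of the SAX classes on hand. They are easy to locate, and they are the key to Java code being able to process XML. Finally, you need an XML document to parse. Then we can start writing code.

Obtaining a Parser

The first step in preparing to write Java code that uses XML is to locate and obtain the parser you want to use. We discussed this process in Chapter 1 and listed the various XML parsers available. To make sure your parser works with all of the examples in this book, you should check its compliance with the XML specification. Because of the variety of parsers available and the rapid pace of change in the XML community, this book does not go into which parsers have what level of compliance. You should consult the parser's vendor and visit the web sites given earlier for that information.

In the spirit of open source, all of the examples in this book use the Apache Xerces parser. It is freely available in binary and source form at http://xml.apache.org, and this C- and Java-based parser has become one of the most widely contributed-to parsers. With an open source parser such as Xerces, you can also send questions or bug reports directly to the parser's authors, which both helps them build a better product and helps you use the software quickly and correctly. To subscribe to the general list and request help, send a blank email to xerces-dev-subscribe@xml.apache.org. The members of the list can help if you have questions or problems with a parser that this book does not cover. Of course, the examples in this book run normally on any parser that uses the SAX implementation discussed here.

Once you have selected and downloaded an XML parser, make sure that your Java environment, whether an IDE (Integrated Development Environment) or a command line, has the XML parser classes in its class path. This is a basic requirement for all of the later examples.

Getting the SAX Classes and Interfaces

Once you have a parser, you need to locate the SAX classes. These classes are usually included with the parser when it is downloaded, and Xerces is no exception. If that is the case for your parser, you do not need to download the SAX classes explicitly, because the parser is probably packaged with the latest version of SAX that it supports. At the time of writing, SAX 2.0 had just been finalized. This book uses SAX 2.0 throughout, and it comes bundled with the latest version of the Apache Xerces parser.

If you are not sure whether you have the SAX classes, look at the jar file or class structure used by your parser. The SAX classes are packaged in the org.xml.sax structure. The latest version has 17 classes in that directory, along with 9 in org.xml.sax.helpers and 2 in org.xml.sax.ext. If any of these classes are missing, you should contact the parser's vendor to find out why they were not included in the distribution. Some classes may have been left out because they are not fully supported. These counts apply to SAX 2.0; there will be fewer classes if only SAX 1.0 is supported.

Finally, you may want to download or bookmark the SAX API Javadocs. This documentation is extremely helpful when using the SAX classes, and the Javadoc structure provides a standard, simple way to look up additional information about the classes and their behavior. The documentation is available at http://www.megginson.com/SAX/SAX2/javadoc/index.html. You can also generate Javadoc from the SAX source, using either the source included with your parser or the complete source downloaded from http://www.megginson.com/SAX/SAX2.

Obtaining an XML Document

Make sure you have an XML document to parse. The parsing output in the examples is based on the XML document discussed in Chapter 2, "Creating XML." Save this file as contents.xml on your hard drive. We strongly recommend that you follow along with the example; you can type the file in from the book or download it from the book's web site, http://www.oreilly.com/catalog/javaxml. We hope you will take the time to type it in, however, because doing so will familiarize you with XML syntax more than a quick download will.

Besides downloading or creating the document, we must make a few small changes. Because we have not yet covered constraining and transforming documents, the programs in this chapter only parse XML. To avoid errors, we remove the references in the XML document to the external DTD, which constrains the XML, and to the XSL stylesheets that transform it. You should comment out these two lines in the XML document, as well as the processing instruction asking Cocoon to perform XSL transformation. Once the lines are commented out, note the full path to the XML document; you must supply that path to the programs in this and later chapters. Finally, we comment out the reference to the OReillyCopyright external entity, which would load a file containing the needed copyright information from the filesystem. Without a DTD to define how this entity reference is resolved, we would get unwanted errors. The next chapter shows how the reference is resolved for the XML document.

SAX Readers

Without further preliminaries, let us start programming. Our first program takes an XML file as a command-line parameter and parses that file. We build document callbacks into the parsing process to display the events that occur during parsing, which gives a better picture of what is happening under the hood.

First we must obtain an instance of a class that conforms to the SAX org.xml.sax.XMLReader interface. This interface defines parsing behavior and allows features and properties to be set, as we will see in Chapter 5, "Validating XML." For those familiar with SAX 1.0, this interface replaces the org.xml.sax.Parser interface.

Instantiating a Reader

SAX provides an interface that every SAX-compliant XML parser must implement, so that SAX knows which methods are available for callback and use within an application. For example, the main Xerces SAX parser class, org.apache.xerces.parsers.SAXParser, implements the org.xml.sax.XMLReader interface. If you examine the source of your parser, you will find the same interface implemented in its main SAX parser class. Every parser must have one class (sometimes more) that implements this interface, and that is the class we instantiate in order to parse an XML document:

    XMLReader parser = new SAXParser();

    // Do something with the parser
    parser.parse(uri);

Those entirely new to SAX may find it confusing that the instance variable is not named reader or XMLReader. The SAX 1.0 classes defined the main parsing interface as Parser, and because of that naming much legacy code has variables named parser. The interface was deprecated because namespace, feature, and property support required many changes, yet the naming convention remains a good one, since parser does indicate the purpose of the instance variable.

With that in mind, we begin with a small program that instantiates a SAX parser. It does not actually parse a document; it only provides a skeleton for the rest of the chapter. The actual parsing behavior is added in the next section.

Example 3-1. SAX Parser Example

    import org.xml.sax.XMLReader;

    // Import your vendor's XMLReader implementation here
    import org.apache.xerces.parsers.SAXParser;

    /**
     * SAXParserDemo will take an XML file and parse it using SAX,
     *   displaying the callbacks in the parsing lifecycle.
     *
     * @author Brett McLaughlin
     * @version 1.0
     */
    public class SAXParserDemo {

        /**
         * This parses the file, using registered SAX handlers, and outputs
         *   the events in the parsing process cycle.
         *
         * @param uri String URI of file to parse.
         */
        public void performDemo(String uri) {
            System.out.println("Parsing XML File: " + uri + "\n\n");

            // Instan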
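The flow this chapter describes (obtain an XMLReader, register a handler whose callbacks fire while the document is read, then call parse() with an InputSource) might be sketched as follows. This is a minimal illustration under stated assumptions, not the book's listing: the class names SaxCallbackSketch and RecordingHandler are hypothetical, and the reader is obtained through the JDK's javax.xml.parsers.SAXParserFactory instead of org.apache.xerces.parsers.SAXParser so that the sketch runs without the Xerces jar. The setContentHandler() call is part of the standard org.xml.sax.XMLReader interface, although this excerpt has not yet introduced it.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class SaxCallbackSketch {

    /** Hypothetical handler: records each callback as a string, in document order. */
    static final class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startDocument() {
            events.add("startDocument");
        }

        @Override
        public void endDocument() {
            events.add("endDocument");
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            text.setLength(0);
            events.add("start:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may deliver text in several chunks, so buffer it.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String content = text.toString().trim();
            if (!content.isEmpty()) {
                events.add("text:" + content);
            }
            text.setLength(0);
            events.add("end:" + qName);
        }
    }

    /** Parses the XML string and returns the recorded callback events. */
    static List<String> parse(String xml) {
        try {
            // Assumption: the JDK's bundled parser stands in for Xerces here.
            XMLReader parser = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            RecordingHandler handler = new RecordingHandler();
            parser.setContentHandler(handler);

            // An InputSource wraps a character stream instead of a URI.
            parser.parse(new InputSource(new StringReader(xml)));
            return handler.events;
        } catch (IOException | SAXException | ParserConfigurationException e) {
            throw new IllegalStateException("Parsing failed: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        parse("<book><title>Java and XML</title></book>").forEach(System.out::println);
    }
}
```

Because the events are recorded as the parser reaches them, the list reflects document order without the document ever being held in memory as a tree, which is the sequential-processing property the chapter attributes to SAX. Wrapping the checked exceptions in an unchecked one is a choice made only to keep the sketch short; the chapter's own listing catches IOException and SAXException separately.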
