




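Sequence Labeling

- Input: a sequence $x = x_1, x_2, \dots, x_L$. Output: a sequence $y = y_1, y_2, \dots, y_L$.

Application

- Named entity recognition: identifying names of people, places, organizations, etc. from a sentence.
- Example: "Harry Potter is a student of Hogwarts and lived on Privet Drive." Each word is tagged as people, organizations, places, or not a name entity.
- Can be difficult:
- 楊公再興之神 (Ref: https://www.ptt.cc/bbs/JinYong/M.1258625573.A.DC4.html)
- 馮氏埋香之塚 (Ref: https://www.ptt.cc/bbs/JinYong/M.1195128035.A.31A.html)

Example Task: POS tagging

- Annotate each word in a sentence with a part-of-speech.
- Useful for subsequent syntactic parsing, word sense disambiguation, etc.
- The problem cannot be solved without considering the sequence. In "John saw the saw.", the first "saw" is more likely to be a verb V than a noun N. However, the second "saw" is a noun N, because a noun N is more likely to follow a determiner.

HMM

- How do you generate a sentence such as "John saw the saw."? The two steps below are just the assumption of the HMM.
- Step 1: generate a POS sequence, here PN V D N, from a Markov chain. This is the grammar in your brain.
- (Figure: Markov chain over the states start, Det, Noun, PropNoun, Verb and end, with transition probabilities on the edges. Slide credit: Raymond J. Mooney)
- $P(\text{"PN V D N"}) = 0.4 \times 0.8 \times 0.25 \times 0.95 \times 0.1$
- Step 2: generate a word for each tag in the POS sequence.
- (Figure: each tag emits words from its own distribution, e.g. Noun: cat, dog, saw, pen, bed, apple; Det: a, the, that. Slide credit: Raymond J. Mooney)
- $P(\text{"John saw the saw"} \mid \text{"PN V D N"}) = 0.2 \times 0.17 \times 0.63 \times 0.17$

HMM: the joint probability

- $x$ is the word sequence and $y$ is the tag sequence: $P(x, y) = P(y)\,P(x \mid y)$.
- How about $P(x, y) = P(x)\,P(y \mid x)$?
- Step 1 (transition probability): $P(y) = P(y_1 \mid \text{start}) \times \prod_{l=1}^{L-1} P(y_{l+1} \mid y_l) \times P(\text{end} \mid y_L)$
- Step 2 (emission probability): $P(x \mid y) = \prod_{l=1}^{L} P(x_l \mid y_l)$

HMM: Estimating the probabilities

- How can I know $P(V \mid PN)$ or $P(\text{saw} \mid V)$? They are obtained from training data.
- Training data: $(x^1, \hat{y}^1), (x^2, \hat{y}^2), (x^3, \hat{y}^3), \dots$
- $P(x, y) = P(y_1 \mid \text{start}) \prod_{l=1}^{L-1} P(y_{l+1} \mid y_l)\, P(\text{end} \mid y_L) \prod_{l=1}^{L} P(x_l \mid y_l)$
- $P(y_{l+1} = s' \mid y_l = s) = \dfrac{\text{count}(s \to s')}{\text{count}(s)}$ ($s$ and $s'$ are tags)
- $P(x_l = t \mid y_l = s) = \dfrac{\text{count}(s \to t)}{\text{count}(s)}$ ($s$ is a tag and $t$ is a word)
- So simple. Different from what you learned in DSP? In speech recognition the states are latent information, whereas here the tags are given in the training data.
- Ref: .tw/tlkagk/courses/MLDS_2015_2/Lecture/Hidden%20(v7).ecm.mp4/index.html

HMM: How to do POS Tagging?

- Task: given $x$ (observed), find $y$ (to be found). We can compute $P(x, y)$.
- $y = \arg\max_{y \in Y} P(y \mid x) = \arg\max_{y \in Y} \dfrac{P(x, y)}{P(x)} = \arg\max_{y \in Y} P(x, y)$
- Enumerating all possible $y$ is too expensive: with $|S|$ tags and a sequence of length $L$, there are $|S|^L$ possible $y$.
- Viterbi algorithm: solves the above problem with complexity $O(L|S|^2)$.

HMM: Summary

- Evaluation: $F(x, y) = P(x, y) = P(y)\,P(x \mid y)$
- Inference: $\tilde{y} = \arg\max_{y \in Y} P(x, y)$, solved by the Viterbi algorithm.
- Training: $P(y)$ and $P(x \mid y)$ can be simply obtained from training data.

HMM: Drawbacks

- Inference: $\tilde{y} = \arg\max_{y} P(x, y)$.
- To obtain correct results, every training pair $(x, \hat{y})$ should satisfy $P(x, \hat{y}) > P(x, y)$ for all other $y$. Can HMM guarantee that? No: $P(x, y)$ for a wrong $y$ is not necessarily small.
- Transition probability: $P(V \mid N) = 9/10$, $P(D \mid N) = 1/10$.
- Emission probability: $P(a \mid V) = 1/2$, $P(a \mid D) = 1$.
- Given $y_{l-1} = N$ and $x_l = a$, what is $y_l$? HMM answers V, because $P(V \mid N)\,P(a \mid V) = 9/10 \times 1/2 = 0.45$ exceeds $P(D \mid N)\,P(a \mid D) = 1/10 \times 1 = 0.1$.
- (Figure: the training-data patterns behind these probabilities, two of them marked ×9.)
- The $(x, y)$ never seen in the training data can have large probability $P(x, y)$.
- Benefit: this helps when there is only little training data.
- A more complex model can deal with this problem. However, CRF can deal with this problem based on the same model.

CRF

- $P(x, y) \propto \exp(w \cdot \phi(x, y))$
- $\phi(x, y)$ is a feature vector. What does it look like?
- $w$ is a weight vector to be learned from training data.
- $\exp(w \cdot \phi(x, y))$ is always positive and can be larger than 1.
- $P(y \mid x) = \dfrac{P(x, y)}{\sum_{y'} P(x, y')} = \dfrac{\exp(w \cdot \phi(x, y))}{\sum_{y' \in Y} \exp(w \cdot \phi(x, y'))}$

P(x, y) for CRF

- Is $P(x, y) \propto \exp(w \cdot \phi(x, y))$ very different from HMM?
- In HMM: $P(x, y) = P(y_1 \mid \text{start}) \prod_{l=1}^{L-1} P(y_{l+1} \mid y_l)\, P(\text{end} \mid y_L) \prod_{l=1}^{L} P(x_l \mid y_l)$
- $\log P(x, y) = \log P(y_1 \mid \text{start}) + \sum_{l=1}^{L-1} \log P(y_{l+1} \mid y_l) + \log P(\text{end} \mid y_L) + \sum_{l=1}^{L} \log P(x_l \mid y_l)$
- Emission term: $\sum_{l=1}^{L} \log P(x_l \mid y_l) = \sum_{s,t} \log P(t \mid s) \times N_{s,t}(x, y)$. The sum enumerates all possible tags $s$ and all possible words $t$; $\log P(t \mid s)$ is the log probability of word $t$ given tag $s$; $N_{s,t}(x, y)$ is the number of times tag $s$ and word $t$ appear together in $(x, y)$.
- Example: in a five-word sentence where one (tag, word) pair occurs twice and three other pairs occur once, the counts are 2, 1, 1, 1 and $N_{s,t}(x, y) = 0$ for any other $s$ and $t$, so the five log-probability terms collapse into four terms weighted by 2, 1, 1 and 1.
- $\log P(y_1 \mid \text{start}) = \sum_{s} \log P(s \mid \text{start}) \times N_{\text{start},s}(x, y)$
- $\sum_{l=1}^{L-1} \log P(y_{l+1} \mid y_l) = \sum_{s,s'} \log P(s' \mid s) \times N_{s,s'}(x, y)$
- $\log P(\text{end} \mid y_L) = \sum_{s} \log P(\text{end} \mid s) \times N_{s,\text{end}}(x, y)$
- Putting the four terms together: $\log P(x, y) = \sum_{s,t} \log P(t \mid s)\, N_{s,t}(x, y) + \sum_{s} \log P(s \mid \text{start})\, N_{\text{start},s}(x, y) + \sum_{s,s'} \log P(s' \mid s)\, N_{s,s'}(x, y) + \sum_{s} \log P(\text{end} \mid s)\, N_{s,\text{end}}(x, y) = w \cdot \phi(x, y)$, so $P(x, y) = \exp(w \cdot \phi(x, y))$.
- Under this reading, $w_{s,t} = \log P(x_i = t \mid y_i = s)$, i.e. $P(x_i = t \mid y_i = s) = e^{w_{s,t}}$, and $w_{s,s'} = \log P(y_i = s' \mid y_{i-1} = s)$, i.e. $P(y_i = s' \mid y_{i-1} = s) = e^{w_{s,s'}}$.
- However, we do not give $w$ any constraints during training, so the weights need not be log probabilities and only $P(x, y) \propto \exp(w \cdot \phi(x, y))$ holds.

Feature Vector

- What does $\phi(x, y)$ look like? It has two parts.
- Part 1: relations between tags and words. $N_{s,t}(x, y)$ is the number of times tag $s$ and word $t$ appear together. If there are $|S|$ possible tags and $|L|$ possible words, Part 1 has $|S| \times |L|$ dimensions.
- Part 2: relations between tags. $N_{s,s'}(x, y)$ is the number of times tags $s$ and $s'$ appear consecutively in $y$. If there are $|S|$ possible tags, Part 2 has $|S| \times |S| + 2|S|$ dimensions (the $2|S|$ cover start and end).
- Define any $\phi(x, y)$ you like!

CRF: Training Criterion

- Given training data: $\{(x^1, \hat{y}^1), (x^2, \hat{y}^2), \dots, (x^N, \hat{y}^N)\}$
- Find the weight vector $w^*$ maximizing the objective function $O(w)$: $w^* = \arg\max_{w} O(w)$, with $O(w) = \sum_{n=1}^{N} \log P(\hat{y}^n \mid x^n)$.
- $\log P(\hat{y}^n \mid x^n) = \log P(x^n, \hat{y}^n) - \log \sum_{y'} P(x^n, y')$
- The first term maximizes what we observe; the second minimizes what we don't observe.

CRF: Gradient Ascent

- Gradient descent: find a set of parameters $\theta$ minimizing a cost function $C(\theta)$, moving in the opposite direction of the gradient: $\theta \to \theta - \eta \nabla C(\theta)$.
- Gradient ascent: find a set of parameters $\theta$ maximizing an objective function $O(\theta)$, moving in the same direction as the gradient: $\theta \to \theta + \eta \nabla O(\theta)$.

CRF: Training

- $O(w) = \sum_{n=1}^{N} \log P(\hat{y}^n \mid x^n) = \sum_{n=1}^{N} O^n(w)$. Let me show the gradient for $w_{s,t}$; the one for $w_{s,s'}$ is very similar.
- After some math: $\dfrac{\partial O^n(w)}{\partial w_{s,t}} = N_{s,t}(x^n, \hat{y}^n) - \sum_{y'} P(y' \mid x^n)\, N_{s,t}(x^n, y')$
- First term: if word $t$ is labeled by tag $s$ in the training example $(x^n, \hat{y}^n)$, then increase $w_{s,t}$.
- Second term: if word $t$ is labeled by tag $s$ in some $(x^n, y')$ that is not in the training examples, then decrease $w_{s,t}$. The sum over $y'$ can be computed efficiently by a Viterbi-like dynamic program (the forward-backward algorithm).
- In vector form: $\nabla O^n(w) = \phi(x^n, \hat{y}^n) - \sum_{y'} P(y' \mid x^n)\, \phi(x^n, y')$
- Stochastic gradient ascent: randomly pick a training example $(x^n, \hat{y}^n)$ and update $w \to w + \eta \left( \phi(x^n, \hat{y}^n) - \sum_{y'} P(y' \mid x^n)\, \phi(x^n, y') \right)$.

CRF: Inference

- $y = \arg\max_{y \in Y} P(y \mid x) = \arg\max_{y \in Y} P(x, y) = \arg\max_{y \in Y} w \cdot \phi(x, y)$, since $P(x, y) \propto \exp(w \cdot \phi(x, y))$.
- Done by Viterbi as well.

CRF vs. HMM

- CRF: increase $P(x^n, \hat{y}^n)$, decrease $P(x^n, y')$. HMM does not do that.
- To obtain correct results, $(x, \hat{y})$ should satisfy $P(x, \hat{y}) > P(x, y)$. CRF is more likely to achieve that than HMM.
- In the earlier example ($y_{i-1} = N$, $x_i = a$, $y_i = ?$), HMM answers V. CRF is free to adjust the weights that play the roles of $P(y_i \mid y_{i-1})$ and $P(x_i \mid y_i)$, so it can answer D.

Synthetic Data

- Generating data from a mixed-order HMM.
- Transition probability: $\alpha P(y_i \mid y_{i-1}) + (1 - \alpha) P(y_i \mid y_{i-1}, y_{i-2})$
- Emission probability: $\alpha P(x_i \mid y_i) + (1 - \alpha) P(x_i \mid y_i, x_{i-1})$
- Comparing HMM and CRF: all the approaches only consider 1st-order information, i.e. only the relation of $y_{i-1}$ and $y_i$.
- In general, all the approaches have worse performance with smaller $\alpha$.
- Ref: John D. Lafferty, Andrew McCallum, and Fernando C. N. Pereira, "Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data", ICML, 2001.

Synthetic Data: CRF vs. HMM

- (Figure: HMM results plotted against CRF results on the synthetic data sets.)
- Smaller $\alpha$: the 1st-order HMM assumption is inaccurate.
- Data are generated from an HMM.

CRF: Summary

- Evaluation: $F(x, y) = P(y \mid x)$
- Inference: $\tilde{y} = \arg\max_{y \in Y} P(y \mid x) = \arg\max_{y \in Y} w \cdot \phi(x, y)$
- Training: $w^* = \arg\max_{w} \sum_{n=1}^{N} \log P(\hat{y}^n \mid x^n)$, with the update $w \to w + \eta \left( \phi(x^n, \hat{y}^n) - \sum_{y'} P(y' \mid x^n)\, \phi(x^n, y') \right)$

Structured Perceptron

- Evaluation: $F(x, y) = w \cdot \phi(x, y)$, with $\phi(x, y)$ the same as in CRF.
- Inference: $\tilde{y} = \arg\max_{y \in Y} w \cdot \phi(x, y)$, done by Viterbi.
- Training: for all $n$ and all $y \neq \hat{y}^n$, we want $w \cdot \phi(x^n, \hat{y}^n) > w \cdot \phi(x^n, y)$. Compute $\tilde{y}^n = \arg\max_{y} w \cdot \phi(x^n, y)$ and update $w \to w + \phi(x^n, \hat{y}^n) - \phi(x^n, \tilde{y}^n)$.

Structured Perceptron vs. CRF

- Structured Perceptron (hard): $\tilde{y}^n = \arg\max_{y} w \cdot \phi(x^n, y)$, then $w \to w + \phi(x^n, \hat{y}^n) - \phi(x^n, \tilde{y}^n)$.
- CRF (soft): $w \to w + \eta \left( \phi(x^n, \hat{y}^n) - \sum_{y'} P(y' \mid x^n)\, \phi(x^n, y') \right)$.

Structured SVM

- Evaluation: $F(x, y) = w \cdot \phi(x, y)$, with $\phi(x, y)$ the same as in CRF.
- Inference: $\tilde{y} = \arg\max_{y \in Y} w \cdot \phi(x, y)$, done by Viterbi.
- Training: consider margin and error.
- Way 1: Gradient Descent.
- Way 2: Quadratic Programming (Cutting Plane Algorithm).

Structured SVM: Error Function

- Error function $\Delta(\hat{y}^n, y)$: the difference between $y$ and $\hat{y}^n$.
- The cost function of structured SVM is an upper bound of the total error $\sum_{n} \Delta(\hat{y}^n, \tilde{y}^n)$.
- Theoretically, $\Delta(y, \hat{y}^n)$ can be any function you like. However, you need to solve Problem 2.1: $\bar{y}^n = \arg\max_{y} \left[ \Delta(\hat{y}^n, y) + w \cdot \phi(x^n, y) \right]$.
- Example: $\hat{y}$ = A T T C G G G G A T and $y$ = A T T A G G A G A A differ in 3 of 10 positions, so $\Delta(\hat{y}, y) = 3/10$.
- In this case, Problem 2.1 can be solved by the Viterbi algorithm.

Performance of Different Approaches

- POS Tagging. Ref: Nguyen, Nam, and Yunsong Guo. Comparisons of sequence labeling algorithms and extensions. ICML, 2007.
- Named Entity Recognition. Ref: Tsochantaridis, Ioannis, et al. Large margin methods for structured and interdependent output variables. Journal of Machine Learning Research, 2005.

Concluding Remarks

- .tw/tlkagk/courses/MLDS_2015/Structured%20Lecture/Segmental%20CRF%20(v8).fsp/index.html (please open with IE)
- The above approaches can be combined with deep learning to obtain better performance.
- Next l...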
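
The two-step HMM generation story for "John saw the saw" can be checked numerically. The sketch below is only an illustration: the function name `hmm_joint` and the dictionary layout are assumptions, and the tables contain just the probabilities quoted on the slides for the tag sequence PN V D N, not the full model.

```python
import math

# Transition and emission probabilities quoted on the slides.
# Only the entries needed for the tag sequence PN V D N are listed.
TRANS = {
    ("start", "PN"): 0.4,
    ("PN", "V"): 0.8,
    ("V", "D"): 0.25,
    ("D", "N"): 0.95,
    ("N", "end"): 0.1,
}
EMIT = {
    ("PN", "John"): 0.2,
    ("V", "saw"): 0.17,
    ("D", "the"): 0.63,
    ("N", "saw"): 0.17,
}

def hmm_joint(words, tags, trans, emit):
    """Return P(x, y) = P(y) * P(x | y) for one sentence and one tag sequence."""
    path = ["start"] + list(tags) + ["end"]
    # Step 1: transition probabilities, including start and end.
    p_y = math.prod(trans.get((a, b), 0.0) for a, b in zip(path, path[1:]))
    # Step 2: one emission probability per (tag, word) position.
    p_x_given_y = math.prod(emit.get((s, t), 0.0) for s, t in zip(tags, words))
    return p_y * p_x_given_y

p = hmm_joint(["John", "saw", "the", "saw"], ["PN", "V", "D", "N"], TRANS, EMIT)
print(p)
```

Missing table entries are treated as probability 0 here, which is a simplification of this sketch rather than something the slides specify.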
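
The counting estimates and the HMM drawback can be reproduced with a few lines. This is a sketch under an assumption: the training set `TRAIN` below is hypothetical, chosen so that counting yields exactly the probabilities on the slide (P(V|N) = 9/10, P(D|N) = 1/10, P(a|V) = 1/2, P(a|D) = 1). The helper names are also illustrative.

```python
from collections import Counter

# Hypothetical training set: (previous tag, current tag, current word) with a
# repeat count, chosen so that counting reproduces the slide's probabilities.
TRAIN = [(("N", "V", "c"), 9), (("P", "V", "a"), 9), (("N", "D", "a"), 1)]

trans_count, prev_count = Counter(), Counter()
emit_count, tag_count = Counter(), Counter()
for (prev, tag, word), n in TRAIN:
    trans_count[(prev, tag)] += n   # count(s -> s')
    prev_count[prev] += n           # count(s) for transitions
    emit_count[(tag, word)] += n    # count(s -> t)
    tag_count[tag] += n             # count(s) for emissions (current tag only)

def p_trans(tag, prev):
    """P(y_l = tag | y_{l-1} = prev), estimated by counting."""
    return trans_count[(prev, tag)] / prev_count[prev]

def p_emit(word, tag):
    """P(x_l = word | y_l = tag), estimated by counting."""
    return emit_count[(tag, word)] / tag_count[tag]

# Given y_{l-1} = N and x_l = a, score each candidate tag as the HMM would.
scores = {tag: p_trans(tag, "N") * p_emit("a", tag) for tag in ("V", "D")}
hmm_choice = max(scores, key=scores.get)
print(scores, hmm_choice)
```

In this hypothetical data the combination (N followed by V, emitting a) never occurs, yet it receives the higher score, which is the drawback the slides describe.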
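
The claim that Viterbi replaces enumeration of all |S|^L tag sequences with an O(L|S|^2) search can be illustrated as follows. This is a minimal sketch, not the lecture's own code: the two-tag model and all of its numbers are invented for illustration, and `brute_force` exists only to check the dynamic program against exhaustive enumeration.

```python
import itertools
import math

def viterbi(words, tags, log_trans, log_emit):
    """Return (best log score, best tag list) in O(L * |S|^2) time."""
    # best[s] = best log score of any tag prefix that ends in tag s
    best = {s: log_trans[("start", s)] + log_emit[(s, words[0])] for s in tags}
    back = []
    for word in words[1:]:
        new_best, pointers = {}, {}
        for s in tags:
            prev = max(tags, key=lambda r: best[r] + log_trans[(r, s)])
            new_best[s] = best[prev] + log_trans[(prev, s)] + log_emit[(s, word)]
            pointers[s] = prev
        best = new_best
        back.append(pointers)
    last = max(tags, key=lambda s: best[s] + log_trans[(s, "end")])
    score = best[last] + log_trans[(last, "end")]
    path = [last]
    for pointers in reversed(back):   # follow back-pointers to recover the path
        path.append(pointers[path[-1]])
    return score, path[::-1]

def brute_force(words, tags, log_trans, log_emit):
    """Score every one of the |S|^L tag sequences; used only as a check."""
    def log_joint(seq):
        path = ["start", *seq, "end"]
        return (sum(log_trans[(a, b)] for a, b in zip(path, path[1:]))
                + sum(log_emit[(s, w)] for s, w in zip(seq, words)))
    best = max(itertools.product(tags, repeat=len(words)), key=log_joint)
    return log_joint(best), list(best)

# Hypothetical two-tag model (numbers invented for illustration).
TAGS = ["N", "V"]
TRANS = {("start", "N"): 0.6, ("start", "V"): 0.4,
         ("N", "N"): 0.3, ("N", "V"): 0.5, ("N", "end"): 0.2,
         ("V", "N"): 0.6, ("V", "V"): 0.1, ("V", "end"): 0.3}
EMIT = {("N", "a"): 0.7, ("N", "b"): 0.3, ("V", "a"): 0.2, ("V", "b"): 0.8}
LOG_TRANS = {k: math.log(v) for k, v in TRANS.items()}
LOG_EMIT = {k: math.log(v) for k, v in EMIT.items()}

sentence = ["a", "b", "a"]
score, path = viterbi(sentence, TAGS, LOG_TRANS, LOG_EMIT)
check_score, check_path = brute_force(sentence, TAGS, LOG_TRANS, LOG_EMIT)
print(path, math.exp(score))
```

Working in log space avoids underflow on long sentences; the same routine can serve CRF or structured perceptron inference if the log probabilities are swapped for learned weights.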
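
The rewriting of log P(x, y) as an inner product w · φ(x, y) can be made concrete with a sparse count vector. The sketch below assumes a dictionary-based encoding with keys such as `("emit", s, t)` and `("trans", s, s')`; that encoding and the function names are choices made here for illustration. The weights are set to the logs of the probabilities quoted for "John saw the saw", which is the special case where CRF and HMM coincide.

```python
import math
from collections import Counter

def phi(words, tags):
    """Count-based feature vector: tag-word counts plus tag-tag counts."""
    feats = Counter()
    for s, t in zip(tags, words):
        feats[("emit", s, t)] += 1        # N_{s,t}(x, y)
    path = ["start"] + list(tags) + ["end"]
    for s, s_next in zip(path, path[1:]):
        feats[("trans", s, s_next)] += 1  # N_{s,s'}(x, y), incl. start and end
    return feats

def score(weights, feats):
    """Inner product w . phi(x, y) over sparse dictionaries."""
    return sum(weights.get(key, 0.0) * count for key, count in feats.items())

# Weights equal to log probabilities from the slides (HMM as a special case).
WEIGHTS = {
    ("trans", "start", "PN"): math.log(0.4),
    ("trans", "PN", "V"): math.log(0.8),
    ("trans", "V", "D"): math.log(0.25),
    ("trans", "D", "N"): math.log(0.95),
    ("trans", "N", "end"): math.log(0.1),
    ("emit", "PN", "John"): math.log(0.2),
    ("emit", "V", "saw"): math.log(0.17),
    ("emit", "D", "the"): math.log(0.63),
    ("emit", "N", "saw"): math.log(0.17),
}

features = phi(["John", "saw", "the", "saw"], ["PN", "V", "D", "N"])
joint = math.exp(score(WEIGHTS, features))
print(joint)
```

Once training is allowed to move the weights freely, `score` is no longer a log probability, which is why the slides switch from equality to proportionality.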
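
The "hard" structured perceptron update, w → w + φ(x, ŷ) − φ(x, ỹ), can be sketched on a single training sentence. Several things here are assumptions made for brevity: the feature encoding, the function names, and the use of brute-force enumeration in place of Viterbi for the argmax (reasonable only because the example has 4 tags and 4 words).

```python
from collections import Counter
from itertools import product

def phi(words, tags):
    """Sparse counts of (tag, word) pairs and consecutive tag pairs."""
    feats = Counter(("emit", s, t) for s, t in zip(tags, words))
    path = ["start", *tags, "end"]
    feats.update(("trans", a, b) for a, b in zip(path, path[1:]))
    return feats

def predict(weights, words, tagset):
    """Hard argmax over all tag sequences (brute force stands in for Viterbi)."""
    def total(tags):
        return sum(weights[k] * v for k, v in phi(words, tags).items())
    return max(product(tagset, repeat=len(words)), key=total)

def perceptron_step(weights, words, gold_tags, tagset):
    """Apply w <- w + phi(x, gold) - phi(x, guess) when the guess is wrong."""
    guess = predict(weights, words, tagset)
    if list(guess) != list(gold_tags):
        weights.update(phi(words, gold_tags))   # add features of the gold tags
        weights.subtract(phi(words, guess))     # subtract features of the guess
    return guess

tagset = ["PN", "V", "D", "N"]
words = ["John", "saw", "the", "saw"]
gold = ["PN", "V", "D", "N"]

weights = Counter()            # all weights start at zero
for _ in range(10):            # a few passes over the single example
    perceptron_step(weights, words, gold, tagset)

final_guess = list(predict(weights, words, tagset))
print(final_guess)
```

The CRF update differs only in the subtracted term: instead of the features of the single best guess, it subtracts the expectation of φ under P(y'|x), which is the "soft" counterpart.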