Hammersley-Clifford theorem

The Hammersley-Clifford theorem is a result in probability theory, mathematical statistics and statistical mechanics that gives necessary and sufficient conditions under which a positive probability distribution can be represented as a Markov network (also known as a Markov random field). It states that a probability distribution that has a positive mass or density satisfies one of the Markov properties with respect to an undirected graph G if and only if it is a Gibbs random field, that is, its density can be factorized over the cliques (or complete subgraphs) of the graph.

The relationship between Markov and Gibbs random fields was initiated by Roland Dobrushin [1] and Frank Spitzer [2] in the context of statistical mechanics. The theorem is named after John Hammersley and Peter Clifford, who proved the equivalence in an unpublished paper in 1971 [3][4]. Simpler proofs using the inclusion-exclusion principle were given independently by Geoffrey Grimmett [5], Preston [6] and Sherman [7] in 1973, with a further proof by Julian Besag in 1974 [8].

Notes
1. Dobrushin, P. L. (1968), "The Description of a Random Field by Means of Conditional Probabilities and Conditions of Its Regularity", Theory of Probability and its Applications 13 (2): 197-224, doi:10.1137/1113026
2. Spitzer, Frank (1971), "Markov Random Fields and Gibbs Ensembles", The American Mathematical Monthly 78 (2): 142-154, doi:10.2307/2317621, JSTOR 2317621
3. Hammersley, J. M.; Clifford, P. (1971), Markov fields on finite graphs and lattices
4. Clifford, P. (1990), "Markov random fields in statistics", in Grimmett, G. R.; Welsh, D. J. A., Disorder in Physical Systems: A Volume in Honour of John M. Hammersley, Oxford University Press, pp. 19-32, ISBN 0198532156, MR 1064553, retrieved 2009-05-04
5. Grimmett, G. R. (1973), "A theorem about random fields", Bulletin of the London Mathematical Society 5 (1): 81-84, doi:10.1112/blms/5.1.81, MR 0329039
6. Preston, C. J. (1973), "Generalized Gibbs states and Markov random fields", Advances in Applied Probability 5 (2), 1426035
7. Sherman, S. (1973), "Markov random fields and Gibbs random fields", Israel Journal of Mathematics 14 (1): 92-103, doi:10.1007/BF02761538, MR 0321185
8. Besag, J. (1974), "Spatial interaction and the statistical analysis of lattice systems", Journal of the Royal Statistical Society. Series B (Methodological) 36 (2): 192-236, MR 0373208, JSTOR 2984812

Further reading
- Bilmes, Jeff (Spring 2006), Handout 2: Hammersley-Clifford, course notes from University of Washington course.
- Grimmett, Geoffrey, Probability on Graphs, Chapter 7
- Helge, The Hammersley-Clifford Theorem and its Impact on Modern Statistics
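
The factorization over cliques named in the theorem statement can be written out explicitly. The following is only a sketch in common textbook notation: the symbols $\mathcal{C}(G)$ (the set of cliques of G), $\psi_C$ (a non-negative potential on clique C), $x_C$ (the variables in C) and $Z$ (the normalizing constant) are assumptions introduced here and are not taken from the article text.

```latex
p(x) \;=\; \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \psi_C(x_C),
\qquad
Z \;=\; \sum_{x} \prod_{C \in \mathcal{C}(G)} \psi_C(x_C).
```

As a small illustration, on the chain graph A - B - C the maximal cliques are {A, B} and {B, C}, so the density takes the form $p(a,b,c) \propto \psi_{AB}(a,b)\,\psi_{BC}(b,c)$, from which A and C are conditionally independent given B.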
From Wikipedia, the free encyclopedia.

The first afternoon of the memorial session for Julian Besag in Bristol was an intense and at times emotional moment, where friends and colleagues of Julian shared memories and stories. This collection of tributes showed how much of a larger-than-life character he was, from his long-lasting and wide-ranging impact on statistics to his very high expectations, both for himself and for others, leading to a total and uncompromising research ethics, to his passion for extreme sports and the outdoors. (The stories during and after dinner were of a more personal nature, but at least as enjoyable!) The talks on the second day showed how much and how deeply Julian had contributed to spatial statistics and agricultural experiments, to pseudo-likelihood, to Markov random fields and image analysis, and to MCMC methodology and practice. I hope I did not botch my presentation on the history of MCMC too much, while I found reading through the 1974, 1986 and 1993 Read Papers and their discussions an immensely rewarding experience (I wish I had done so prior to completing our Statistical Science paper, but it was bound to be incomplete by nature!). Some interesting links made by the audience were the prior publication of proofs of the Hammersley-Clifford theorem in 1973 (by Grimmett, Preston, and Sherman, respectively), as well as the proposal of a Gibbs sampler by Brian Ripley as early as 1977 (even though Hastings did use Gibbs steps in one of his examples). Christophe Andrieu also pointed out to me a very early Monte Carlo review by John Halton in the 1970 SIAM Review, a review that I will read (and comment on) as soon as possible. Overall, I am quite glad I could take part in this memorial and I am grateful to both Peters
for organising it as a fitting tribute to Julian.

Markov Chain Monte Carlo (MCMC) methods are currently a very active field of research. MCMC methods are sampling methods based on Markov chains which are ergodic with respect to the target probability measure. The principle of adaptive methods is to optimize on the fly some design parameters of the algorithm with respect to a given criterion reflecting the sampler's performance (optimize the acceptance rate, optimize an importance sampling function, etc.). A postdoctoral position is open to work on the numerical analysis of adaptive MCMC methods: convergence, numerical efficiency, development and analysis of new algorithms. A particular emphasis will be given to applications in statistics and molecular dynamics. The position is funded by the French National Research Agency (ANR) through the 2009-2012 project ANR-08-BLAN-0218. The position will benefit from an interdisciplinary environment involving numerical analysts, statisticians and probabilists, and from strong interactions between the partners of the project ANR-08-BLAN-021

In the most recent issue of Statistical Science, the special topic is "Celebrating the EM Algorithm's Quandunciacentennial". It contains a historical survey by Martin Tanner and Wing Wong on the emergence of MCMC Bayesian computation in the 1980s. This survey is more focused and more informative than our global history (also to appear in Statistical Science). In particular, it provides the authors' analysis as to why MCMC was delayed by ten years or so (or even more when considering that a Gibbs sampler as a simulation tool appears in both Hastings' (1970) and Besag's (1974) papers). They dismiss our concerns about computing power (I was running Monte Carlo simulations on
my Apple IIe by 1986 and a single mean square error curve evaluation for a James-Stein type estimator would then take close to a weekend!) and Markov innumeracy, rather attributing the reluctance to a lack of confidence in the method. This perspective remains debatable as, apart from Tony O'Hagan, who was then fighting against Monte Carlo methods as being un-Bayesian (1987, JRSS D), I do not remember any negative attitude at the time about simulation, and the immediate spread of the MCMC methods from Alan Gelfand's and Adrian Smith's presentations of their 1990 paper shows on the contrary that the Bayesian community was ready for the move.

Another interesting point made in this historical survey is that Metropolis and other Markov chain methods were first presented outside the simulation sections of books like Hammersley and Handscomb (1964), Rubinstein (1981) and Ripley (1987), perpetuating the impression that such methods were mostly optimisation or niche-specific methods. This is also why Besag's earlier works (not mentioned in this survey) did not get wider recognition until later. Something I was not [...] sampling (i.e. population Monte Carlo) in the Bayesian literature of the 1980s, with proposals from Herman van Dijk, Adrian Smith, and others. The appendix about Smith et al. (1985), the 1987 special issue of JRSS D, and the computation contents of Valencia 3 (that I sadly missed for being in the Army!) is also quite informative about the perception of computational Bayesian statistics at
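this time.

A missing connection in this survey is Gilles Celeux and Jean Diebolt's stochastic EM (or SEM). As early as 1981, with Michel Broniatowski, they proposed a simulated version of EM for mixtures where the latent variable z was simulated from its conditional distribution rather than replaced with its expectation. So this was the first half of the Gibbs sampler for mixtures we completed with Jean Diebolt about ten years later. (Also found in Gelman and King, 1990.) These authors did not get much recognition from the community, though, as they focused almost exclusively on mixtures, used simulation to produce a randomness that would escape the local mode attraction rather than targeting the posterior distribution, and did not analyse the Markovian nature of their algorithm until later with the simulated annealing EM algorithm.

Probabilistic graphical models are divided into directed and undirected models. Directed models mainly include Bayesian networks and hidden Markov models; undirected models mainly include Markov random field (MRF) models and conditional random field (CRF) models. In 2001, Lafferty and colleagues at Carnegie Mellon University (John Lafferty, Andrew McCallum, Fernando Pereira) proposed the CRF model for sequence data ("Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data"). This model models the posterior probability directly, which resolves two problems of the MRF model: the need for complex likelihood modelling when multiple features are used, and the inability to exploit contextual information in the observed image. In 2003, Dr. Kumar extended the CRF model to a 2-D lattice structure and began introducing it into image analysis, which attracted great attention from the academic community.

The task is to estimate the label image corresponding to a given observed image (y: the observed image; x: the unknown label image).

1. If the posterior probability is modelled directly (the first term in the formula), one obtains a discriminative probabilistic framework. In particular, if the posterior is modelled directly by a Gibbs distribution, (x, y) is called a CRF and the resulting model is called a discriminative CRF model.
2. If (x, y) is modelled jointly (the second term in the formula), one obtains a joint probabilistic framework. In particular, if the Markov property of the double random field (x, y) is considered, i.e. the second term of the formula is a Gibbs distribution, then (x, y) is called a pairwise MRF (PMRF) [9].
3. If the posterior is modelled through p(x) and p(y|x) as in the formula, where p(y|x) is the model that generates the observed image, the framework is called a generative probabilistic framework. In particular, if the prior p(x) follows a Gibbs distribution, x is called an MRF [12] and the resulting model is called a generative MRF model.

(From "Research on Random Field Models for Image Labeling", 面向图像标记的随机场模型研究.)

By the Hammersley-Clifford theorem, the posterior probability of the label field follows a Gibbs distribution, in which $Z(y, \theta)$ is the normalizing function and the potential function defined on clique $c$ carries the parameters $\theta$.

A key problem in the CRF model is the definition of suitable potential functions. Developing different forms of extended CRF models is therefore a major current research direction. There are two specific technical approaches. The first is to extend the potential functions: introducing more complex potential functions makes greater use of multiple features and contextual information. The second is to extend the model structure: introducing more complex model structures makes it possible to exploit higher-level and more varied contextual information.

Extended potential functions
(1) Logistic regression (LR)
(2) Support vector machine (SVM)
(3) Kernel functions
(4) Boost
(5) Probit

Extended model structures
(1) Dynamic CRF model. The dynamic CRF (DCRF) model performs several labeling tasks simultaneously on the given observed data, so as to make full use of the correlations between different types of labels.
(2) Hidden CRF model. Another class of extended graph structure for the CRF model introduces, between the observed image and the label image, an intermediate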
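hidden-variable layer h; the resulting model is called a hidden CRF (Hidden Conditional Random Field, HCRF). Introducing the hidden layer gives the CRF model richer expressive power and allows some substructures to be modelled. The hidden variables may be abstract, or they may have a definite physical meaning.
(3) Tree-structured CRF model. In the standard graph structure of the CRF model, the correlation between labels is represented by the edges of the lattice structure.
(4) Hybrid CRF model.

1. Markov assumptions: limited history and stationarity. Limited history means that the current state depends only on a finite history.

2. HMM: given an observation sequence $O_1, O_2, O_3, \ldots$, each observation $O_i$ corresponds to a hidden state in the sequence $S_1, S_2, \ldots, S_n$. The HMM addresses three problems:
(1) Computing the probability of the observation sequence: the forward algorithm suffices.
(2) Given the observation sequence, computing the most probable hidden state sequence: the Viterbi algorithm, with complexity O(N*N*T).
(3) Given the observation sequence and the set of states, estimating the parameters A (state transition matrix) and B (emission probabilities): the EM algorithm, i.e. the forward-backward algorithm.

Problem (2) resembles a sequence labeling problem:

$$\Pr(O|S) = p(O_1|S_1)\,p(O_2|S_2)\cdots p(O_n|S_n)$$
$$P(S) = p(S_1|\mathrm{start})\,p(S_2|S_1)\cdots p(S_n|S_{n-1})$$
$$S^* = \arg\max_S P(S|O) = \arg\max_S \Pr(O|S)\,P(S) = \arg\max_S \prod_i p(O_i|S_i)\,p(S_i|S_{i-1})$$

ME classifier: classifies a given observation O. ME needs to extract the relevant features from O and compute the corresponding weights w. Note that it mainly solves the problem of classifying the observation O, as in text classification, i.e. $P(C=c|O)$.

MEMM: a sequence labeling model that combines ME and HMM. It offers more features than the HMM and outperforms it, since it takes into account the influence of the observations near time t as well as of the state:

$$S^* = \arg\max_S P(S|O) = \arg\max_S \prod_i p(S_i|O, S_{i-1})$$

where O may be $O_i$, but also $O_{i-1}$ and other observations.

Maximum Entropy Model

Let us start from a simple example. Both Washington (华盛顿) and Virginia (维吉利亚) can be a person name or a place name, and from the corpus we only know that p(person name) = 0.6. What is then a good value for p(Washington = person name)? An intuitive idea is p(Washington = person name) = 0.3: the mass 0.6 is split evenly between the two names. Why? Because this makes no assumption beyond the available evidence, which is the same as maximizing the entropy. This is the principle of the maximum entropy model.

Now for the definition of the model. First, the goal: given a context x, estimate p(y|x). Next, from the training data we obtain a sequence of labelled samples $(x_i, y_i)$, where $x_i$ is the context and $y_i \in Y$ is the class. Then we construct feature functions

$$f(x,y) = \begin{cases} 1 & \text{if } x, y \text{ satisfy some condition, e.g. } x = \text{记者*},\ y = \text{person name} \\ 0 & \text{otherwise} \end{cases}$$

Note that x is a context, i.e. a vector, not a single word. (The notion of feature in the maximum entropy model differs from that in pattern recognition: here a feature is a feature function, usually binary-valued, although some authors define it directly as real-valued. For instance, jeon-sigir06 defines f as a KDE distance; the benefit of that definition is not clear to me.)

The constraint on the model is that, for every feature function, the expectation under the model equals the expectation under the sample:

$$E_p(f) = E_{\tilde p}(f)$$

where

$$E_p(f) = \sum_{x,y} p(x,y)\,f(x,y) = \sum_{x,y} p(x)\,p(y|x)\,f(x,y) \approx \sum_{x,y} \tilde p(x)\,p(y|x)\,f(x,y)$$
$$E_{\tilde p}(f) = \sum_{x,y} \tilde p(x,y)\,f(x,y)$$

and, for every x, $\sum_y p(y|x) = 1$. The entropy of the model is

$$H(p) = -\sum_{x,y} \tilde p(x)\,p(y|x)\log p(y|x)$$

Maximizing the entropy subject to the constraints, the problem is to find

$$p^* = \arg\max_{p \in P}\; -\sum_{x,y} p(y|x)\,\tilde p(x)\log p(y|x)$$

where

$$P = \Big\{\, p(y|x) \;\Big|\; \forall f_i:\ \sum_{x,y} p(y|x)\,\tilde p(x)\,f_i(x,y) = \sum_{x,y} \tilde p(x,y)\,f_i(x,y);\ \ \forall x:\ \sum_y p(y|x) = 1 \Big\}$$

It can be shown that the optimal solution has the form

$$p(y|x) = \frac{\exp\big(\sum_i \lambda_i f_i(x,y)\big)}{Z_x}, \qquad Z_x = \sum_y \exp\Big(\sum_i \lambda_i f_i(x,y)\Big)$$

For the detailed proof, see the write-up by qxred.

Hidden Markov Model

A Markov model is in effect a finite-state machine with a transition probability between every pair of states. In a hidden Markov model the states are not visible: we only see the output sequence, since each state transition emits an observation. Once the observation sequence has been observed, we want to find the best state sequence. Let O denote the observations and X the hidden variables. The model looks for the best hidden states, those with the largest term $P(O|X)P(X)$ in

$$P(O) = \sum_X P(O|X)\,P(X)$$

where

$$P(X) = p(x_1)\,p(x_{2:n}|x_1) = p(x_1)\,p(x_2|x_1)\,p(x_{3:n}|x_1,x_2) = \cdots$$

Under the assumption that $x_i$ depends only on $x_{i-1}$,

$$P(X) = p(x_1)\,p(x_2|x_1)\,p(x_3|x_2)\cdots$$

Similarly,

$$P(O|X) = p(o_1|x_{1:n})\,p(o_{2:n}|o_1, x_{1:n}) = \cdots$$

and under the assumption that $o_i$ depends only on $x_i$,

$$P(O|X) = p(o_1|x_1)\,p(o_2|x_2)\cdots$$

Putting the two together,

$$P(O) = \sum_X p(x_1)\,p(x_2|x_1)\,p(o_1|x_1)\,p(x_3|x_2)\,p(o_2|x_2)\cdots$$

Define the forward variable $\alpha_i(t)$ as the total probability of the observations up to time t with the chain in state $S_i$ at time t:

$$\alpha_j(t) = \Big[\sum_{i=1}^{N} \alpha_i(t-1)\,p(x_t=j|x_{t-1}=i)\Big]\,p(o_t|x_t=j)$$

Define the backward variable $\beta_i(t)$ as the probability of the remaining part of the observation sequence given state $S_i$ at time t:

$$\beta_i(t) = \sum_{j=1}^{N} p(x_{t+1}=j|x_t=i)\,p(o_{t+1}|x_{t+1}=j)\,\beta_j(t+1)$$

Then

$$P(O, x_t=i) = \alpha_i(t)\,\beta_i(t)$$

and the probability of the observation sequence is $P(O) = \sum_i \alpha_i(t)\,\beta_i(t)$ for any t. The best state sequence can be obtained by dynamic programming, and the model parameters by the EM algorithm. For details on EM, see again the write-up by qxred.

Maximum Entropy Markov Model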
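The drawback of the HMM is that it determines the state sequence from the observation sequence with a joint model, i.e. it uses a joint model to solve a conditional problem. Moreover, it is almost impossible to enumerate all possible observation sequences. The MEMM addresses these problems.

First, the MEMM differs fundamentally from the MM and the HMM: the MEMM estimates P(S|O), whereas the MM estimates P(S) and the HMM estimates P(O).

$$P(S|O) = P(s_1|O)\,P(s_{2:n}|s_1,O) = P(s_1|O)\,P(s_2|s_1,O)\,P(s_{3:n}|s_1,s_2,O)$$

and then, under the model assumptions,

$$P(S|O) = P(s_1|o_1)\,P(s_2|s_1,o_2)\,P(s_3|s_2,o_3)\cdots$$

Redefine the feature function with $a = \langle b, r \rangle$, where b is an indicator function on the current observation and r is a state value:

$$f_a(o_t, s_t) = \begin{cases} 1 & \text{if } b(o_t) \text{ is true and } s_t = r \\ 0 & \text{otherwise} \end{cases}$$

The constraint then becomes, for each previous state $s'$ with $m_{s'}$ training transitions,

$$E_a = \frac{1}{m_{s'}} \sum_{k=1}^{m_{s'}} \sum_{s \in S} P(s|s',o_k)\,f_a(o_k, s) = \frac{1}{m_{s'}} \sum_{k=1}^{m_{s'}} f_a(o_k, s_k) = F_a$$

This objective is essentially the same as that of ME, so the solution has the form

$$P(s|s',o) = \frac{\exp\big(\sum_a \lambda_a f_a(o,s)\big)}{Z(o,s')}$$

The forward and backward variables of the HMM are then used again to find the best sequence, and the sequence actually obtained comes from computing

$$P(s|o) = P(s_0)\,P(s_1|s_0,o_0)\,P(s_2|s_1,o_1)\cdots$$

Conditional Random Fields

The MEMM in effect uses local information to optimize a global objective, and it suffers from label bias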
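
The forward recursion and the dynamic-programming search for the best state sequence described in the HMM notes above can be sketched as follows. This is a minimal illustration under stated assumptions, not the notes' own implementation: the function names (`forward`, `viterbi`, `joint`, `brute_force_likelihood`), the initial distribution `pi`, and the toy numbers in `A` and `B` are all hypothetical, introduced only for this example.

```python
import itertools


def forward(pi, A, B, obs):
    """Return P(O) by the forward recursion, in O(N*N*T) time.

    pi[i]   : initial probability of state i (assumed given)
    A[i][j] : transition probability p(x_t = j | x_{t-1} = i)
    B[j][o] : emission probability p(o_t = o | x_t = j)
    """
    n = len(pi)
    # alpha[i] holds P(o_1..o_t, x_t = i) for the current t.
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
                 for j in range(n)]
    return sum(alpha)


def viterbi(pi, A, B, obs):
    """Return (best_path, P(O, best_path)) by dynamic programming."""
    n = len(pi)
    # delta[i] is the probability of the best partial path ending in state i.
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backptrs = []
    for o in obs[1:]:
        step, new_delta = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: delta[i] * A[i][j])
            step.append(best_i)
            new_delta.append(delta[best_i] * A[best_i][j] * B[j][o])
        backptrs.append(step)
        delta = new_delta
    # Trace the back-pointers from the best final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for step in reversed(backptrs):
        path.append(step[path[-1]])
    path.reverse()
    return path, delta[last]


def joint(pi, A, B, states, obs):
    """P(O, X) for one explicit state sequence."""
    p = pi[states[0]] * B[states[0]][obs[0]]
    for t in range(1, len(obs)):
        p *= A[states[t - 1]][states[t]] * B[states[t]][obs[t]]
    return p


def brute_force_likelihood(pi, A, B, obs):
    """Sum P(O, X) over every state sequence; exponential, for checking only."""
    n = len(pi)
    return sum(joint(pi, A, B, s, obs)
               for s in itertools.product(range(n), repeat=len(obs)))


# Toy two-state, two-symbol model (hypothetical numbers).
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1]
```

On this toy model `viterbi(pi, A, B, obs)` selects the path `[0, 1, 1]`, and `forward` agrees with `brute_force_likelihood`, which enumerates all state sequences as in the sum over X in the notes. For longer sequences one would normally work in log space to avoid underflow; that refinement is left out here to keep the sketch close to the formulas.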