Optimization of EM Algorithm and Its Application to Spike Classification

Yin Haibing, Hu Dewen, Liu Yadong, Li Ming, Wan Yucheng
Department of Automatic Control, College of Mechatronics and Automation, National University of Defense Technology, Changsha (410073)
E-mail:

Abstract
Recently, several methods have been developed for automatic spike classification, including expectation-maximization (EM) clustering based on multivariate t-distribution mixture models. However, studies have shown that the EM algorithm converges only linearly, so for spike classification the method is impractical because of its large computational cost. In this study, we introduce an optimized EM algorithm for spike classification with multivariate t-distribution finite mixture models. Our algorithm combines the EM iteration with classical gradient ascent in the high-dimensional feature space of spikes, so that the convergence improves beyond the original linear rate. Applied to real spike data, the optimized EM algorithm shows faster and more robust convergence and a better spike-classification capability.
Keywords: spike classification, EM algorithm, gradient ascent

1 Introduction
Over the last three decades, much progress has been made in the recording and analysis of neuronal activity [1,2], and a large part of this work concentrates on spike sorting, which is the key step in exploring a neuron and its activity [3]. Surveying the literature of the past decade shows that spike sorting remains one of the primary problems facing neuroscientists; it includes the detection and classification of action potentials, the decomposition of overlaps, and the assignment of spike shapes [2]. There are many difficulties in these spike-sorting tasks, and the main sources are thought to be threshold detection, the various types of detection errors, and the ubiquitous misclassification errors caused by overlapping spikes [3]. In recent years several methods have been developed for spike classification, namely template matching [4,5], wavelet transforms [6], independent component analysis (ICA) [7,8], and finite mixture models fitted with the expectation-maximization (EM) algorithm [9]. Because spike sorting can be cast as a statistical likelihood-estimation problem, and because the EM algorithm is a powerful tool in that framework, this study concentrates on EM clustering with multivariate finite mixture models.
The EM algorithm, first put forward by Dempster et al. [10], has been applied widely in statistical analysis. Based on the EM algorithm and its extensions, the multivariate mixture model has found many applications in cluster analysis and spike sorting [9,11-13]. Classification methods based on the EM algorithm have many merits, chief among them monotone convergence to a local maximum. However, the convergence rate is the primary drawback of EM algorithms, and many researchers have worked to improve it [14-16]. Some of them concentrated on the EM algorithm with Gaussian mixture models and achieved considerable improvements in convergence rate [14]. In the context of spike classification, recent research suggests that the multivariate t-distribution provides a more accurate description of spike statistics than multivariate Gaussian models [9,17].
Thus, an investigation of how to improve the convergence of the primal EM algorithm is needed when the multivariate t-distribution mixture model replaces the Gaussian distribution. We optimized the original EM algorithm by means of classical gradient ascent with multivariate t-distribution models and applied the algorithm to real spike data. The remainder of this study is organized as follows. In the second section we briefly review the original EM algorithm with multivariate t-distribution mixture models and then establish our optimized EM algorithm; after that we give a theoretical proof of the efficiency of the optimized algorithm, and a simulation experiment demonstrates its validity. In the third section we apply the method to real spike data. The fourth and fifth sections are the discussion and the conclusion, respectively.

2 Theory
In spike-classification problems we may assume that the recorded spikes come from a number of distinct neurons and that the number of neurons is unknown. The aim of classification is to determine which spike trains are of the same kind, i.e., which spikes come from the same neuron, and what the probability is that an arbitrary spike comes from a given neuron.

2.1 The statistical characteristics of spikes
In statistical terms, every spike waveform can be regarded as a point in a high-dimensional space, so the spike distribution can be characterized by a multivariate probability density function, with the number of samples of each spike taken as the dimension of the multivariate space.
That is, we have n spikes, each with m sample points. Suppose these spikes come from g neurons over the whole experiment, and that the spikes from a particular neuron can be described by a multivariate probability density function. We use the m-dimensional vector $x_i$ to represent a sampled spike waveform or a vector of features, and the index j (1 ≤ j ≤ g) labels the neurons. We assume that each neuron contributes a proportion $\pi_j$ (0 ≤ $\pi_j$ ≤ 1) of the n spikes and that the distribution of spikes from neuron j has parameter $\theta_j$. The probability density of each spike is then [3,9,11]:

$$p(x_i) = \sum_{j=1}^{g} \pi_j \, p(x_i \mid \theta_j) \qquad (1)$$

Recent evidence has shown that t-distribution mixture models describe spike data better than normal ones when examined through the Mahalanobis distance [9,17]. In this study we therefore use the multivariate t-distribution for the spike distribution and set $\theta_j = \{\mu_j, \Sigma_j, v\}$, where $\mu_j$ is the mean of the spikes belonging to the j-th component, $\Sigma_j$ is the covariance of the spikes belonging to the j-th component, and v is the degree of freedom (DOF) of the multivariate t-distribution. We then have the following probabilistic model [9]:

$$p(x) = \prod_{i=1}^{n} p(x_i) = \prod_{i=1}^{n} \sum_{j=1}^{g} \pi_j \, p(x_i \mid \mu_j, \Sigma_j, v) \qquad (2)$$

where

$$p(x_i \mid \mu_j, \Sigma_j, v) = \frac{\Gamma\!\left(\frac{v+m}{2}\right)}{\Gamma\!\left(\frac{v}{2}\right)(\pi v)^{m/2}\,|\Sigma_j|^{1/2}} \left(1 + \frac{\delta(x_i, \mu_j; \Sigma_j)}{v}\right)^{-\frac{v+m}{2}} \qquad (3)$$

$$\delta(x, \mu_j; \Sigma_j) = (x - \mu_j)^T \Sigma_j^{-1} (x - \mu_j) \qquad (4)$$

Here $\Gamma$ is the gamma function, $\pi_j \ge 0$ with $\sum_{j=1}^{g} \pi_j = 1$, and m is the dimension of the spike data vectors $x_i$. Here and elsewhere in this study, the superscript T denotes the transpose of a vector or a matrix.
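To make equations (1), (3), and (4) concrete, the following is a minimal Python sketch (our own illustration, assuming NumPy and SciPy are available; the function names are not from the paper) of evaluating the multivariate t density and the mixture density for a single spike vector:

```python
import numpy as np
from scipy.special import gammaln

def mvt_logpdf(x, mu, sigma, v):
    """Log density of the multivariate t-distribution, equation (3)."""
    m = mu.shape[0]
    diff = x - mu
    delta = diff @ np.linalg.solve(sigma, diff)      # Mahalanobis distance, equation (4)
    _, logdet = np.linalg.slogdet(sigma)
    return (gammaln((v + m) / 2.0) - gammaln(v / 2.0)
            - 0.5 * m * np.log(np.pi * v) - 0.5 * logdet
            - 0.5 * (v + m) * np.log1p(delta / v))

def mixture_pdf(x, pis, mus, sigmas, v):
    """Mixture density p(x_i) of equation (1) for one spike vector x."""
    return sum(pi * np.exp(mvt_logpdf(x, mu, s, v))
               for pi, mu, s in zip(pis, mus, sigmas))
```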
2.2 The EM algorithm
Considering the probabilistic model of equation (1), the best-fitting parameters $\pi_1,\dots,\pi_g,\theta_1,\dots,\theta_g$ can be obtained by maximizing the model likelihood, or equivalently the model log-likelihood, and thanks to the structure of the EM algorithm the maximization has an analytic solution at each step. For convenience of computation, instead of equation (2) we use the following log-likelihood function:

$$L = \log p(x \mid \Theta) = \sum_{i=1}^{n} \log \sum_{j=1}^{g} \pi_j \, p(x_i \mid \mu_j, \Sigma_j, v) \qquad (5)$$

This function is maximized by the following iterative EM algorithm.

Expectation step (E-step): update $z_{ij}$ (the membership of spike i in unit j), $p_{ij}$ (the probability of spike i under unit j), and $u_{ij}$ (the weight indicating the typicality of spike i for unit j) [11]:

$$z_{ij} = \frac{\pi_j \, p_{ij}}{\sum_{l=1}^{g} \pi_l \, p_{il}} \qquad (6)$$

$$p_{ij} = p\!\left(x_i \mid \mu_j^{(k-1)}, \Sigma_j^{(k-1)}, v^{(k-1)}\right) \qquad (7)$$

$$u_{ij} = \frac{m + v^{(k-1)}}{\delta\!\left(x_i, \mu_j^{(k-1)}; \Sigma_j^{(k-1)}\right) + v^{(k-1)}} \qquad (8)$$

Maximization step (M-step): update $\pi_j^{(k)}$ (the proportion of unit j), $\mu_j^{(k)}$ (the mean of unit j), $\Sigma_j^{(k)}$ (the covariance of unit j), and $v^{(k)}$ (the estimated degree of freedom, DOF) [9,11]:

$$\pi_j^{(k)} = \frac{\sum_{i=1}^{n} z_{ij}}{n} \qquad (9)$$

$$\mu_j^{(k)} = \frac{\sum_{i=1}^{n} z_{ij} u_{ij}\, x_i}{\sum_{i=1}^{n} z_{ij} u_{ij}} \qquad (10)$$

$$\Sigma_j^{(k)} = \frac{\sum_{i=1}^{n} z_{ij} u_{ij} \left(x_i - \mu_j^{(k)}\right)\left(x_i - \mu_j^{(k)}\right)^T}{\sum_{i=1}^{n} z_{ij}} \qquad (11)$$

$v^{(k)}$ is the solution of

$$\sum_{i=1}^{n}\sum_{j=1}^{g} z_{ij}\left[\psi\!\left(\frac{m + v^{(k-1)}}{2}\right) + \log\frac{2}{\delta\!\left(x_i, \mu_j^{(k)}; \Sigma_j^{(k)}\right) + v^{(k-1)}} - u_{ij} + \log\frac{v^{(k)}}{2} - \psi\!\left(\frac{v^{(k)}}{2}\right) + 1\right] = 0$$

which has the approximate closed-form solution

$$v^{(k)} = \frac{2}{y + \log y - 1} + 0.0416\left[1 + \operatorname{erf}\!\left(0.6594 \log\frac{2.1971}{y + \log y - 1}\right)\right] \qquad (12)$$

$$y = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{g} z_{ij}\left[\psi\!\left(\frac{m + v^{(k-1)}}{2}\right) + \log\frac{2}{\delta\!\left(x_i, \mu_j^{(k-1)}; \Sigma_j^{(k-1)}\right) + v^{(k-1)}} - u_{ij}\right]$$

where $\psi$ denotes the digamma function.

Looking at the formulas above, the iterative calculation becomes tedious for large classification data sets. To reduce the computation time, we optimize the calculation in the next section.

2.3 The optimized EM algorithm
Ever since the EM algorithm appeared, many researchers have studied its convergence rate with the hope of improving computational efficiency and hardware utilization. Considerable progress has been achieved, such as the analysis of the asymptotic convergence rate of the EM algorithm for Gaussian mixtures [15] and the improvement of the linear convergence to super-linear convergence with gradient ascent [14]. Inspired by the combination of gradient ascent with the EM algorithm for Gaussian mixture models, we optimize the computation by combining gradient ascent with the primal EM algorithm for the multivariate t-distribution in spike classification.

At each iteration of the EM algorithm, we optimize the M-step as follows:

$$a^{(k+1)} - a^{(k)} = P_a^{(k)} \left.\frac{\partial L}{\partial a}\right|_{a = a^{(k)}} \qquad (13)$$

$$\mu_j^{(k+1)} - \mu_j^{(k)} = P_{\mu_j}^{(k)} \left.\frac{\partial L}{\partial \mu_j}\right|_{\mu_j = \mu_j^{(k)}} \qquad (14)$$

$$\operatorname{vec}\Sigma_j^{(k+1)} - \operatorname{vec}\Sigma_j^{(k)} = P_{\Sigma_j}^{(k)} \left.\frac{\partial L}{\partial \operatorname{vec}\Sigma_j}\right|_{\Sigma_j = \Sigma_j^{(k)}} \qquad (15)$$

where

$$P_a^{(k)} = \frac{1}{n}\left[\operatorname{diag}\!\left(\pi_1^{(k)}, \pi_2^{(k)}, \dots, \pi_g^{(k)}\right) - a^{(k)}\left(a^{(k)}\right)^T\right] \qquad (16)$$

$$P_{\mu_j}^{(k)} = \frac{\Sigma_j^{(k)}}{\sum_{i=1}^{n} z_{ij} u_{ij}} \qquad (17)$$

$$P_{\Sigma_j}^{(k)} = \frac{2}{\sum_{i=1}^{n} z_{ij}}\left(\Sigma_j^{(k)} \otimes \Sigma_j^{(k)}\right) \qquad (18)$$

Here the vector $a$ denotes the mixing proportions $(\pi_1, \pi_2, \dots, \pi_g)^T$, j indexes the mixture components (j = 1, 2, ..., g), k denotes the iteration number, "$\operatorname{vec} B$" denotes the vector obtained by stacking the column vectors of the matrix B, and "$\otimes$" denotes the Kronecker product. Moreover, given the constraints $\pi_j^{(k)} \ge 0$ and $\sum_{j=1}^{g} \pi_j^{(k)} = 1$, $P_a^{(k)}$ is a positive definite matrix, and the matrices $P_{\mu_j}^{(k)}$ and $P_{\Sigma_j}^{(k)}$ are positive definite with probability one for n sufficiently large. Because the DOF has no convenient closed-form gradient expression, it keeps its original update of equation (12) in the computation [9].
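To show how the updates (13)-(18) can be computed in practice, here is a minimal Python sketch (our own construction, not the authors' code); it assumes the E-step quantities $z_{ij}$ and $u_{ij}$ of equations (6)-(8) are already available, and it applies $P_{\Sigma_j}$ through the identity $\Sigma_j (\partial L/\partial \Sigma_j) \Sigma_j$ instead of forming the Kronecker product explicitly:

```python
import numpy as np

def gradient_ascent_m_step(X, pis, mus, sigmas, z, u):
    """Gradient-ascent M-step of equations (13)-(18) (sketch).

    X: (n, m) spikes; pis: (g,); mus: (g, m); sigmas: (g, m, m);
    z, u: (n, g) memberships and weights from the E-step, eqs. (6)-(8).
    """
    n, m = X.shape
    g = len(pis)

    # Mixing proportions, eqs. (13) and (16):
    # dL/dpi_j = sum_i z_ij / pi_j, and P_a = (1/n) [diag(a) - a a^T].
    grad_a = z.sum(axis=0) / pis
    P_a = (np.diag(pis) - np.outer(pis, pis)) / n
    new_pis = pis + P_a @ grad_a

    new_mus = np.empty_like(mus)
    new_sigmas = np.empty_like(sigmas)
    for j in range(g):
        inv_s = np.linalg.inv(sigmas[j])
        diff = X - mus[j]                      # (n, m)
        w = z[:, j] * u[:, j]                  # z_ij * u_ij

        # Means, eqs. (14) and (17): dL/dmu_j = sum_i z_ij u_ij Sigma_j^{-1} (x_i - mu_j)
        grad_mu = inv_s @ (w @ diff)
        P_mu = sigmas[j] / w.sum()
        new_mus[j] = mus[j] + P_mu @ grad_mu

        # Covariances, eqs. (15) and (18): the Kronecker-product step is applied as
        # Sigma_j^{(k+1)} = Sigma_j^{(k)} + (2 / sum_i z_ij) Sigma_j (dL/dSigma_j) Sigma_j.
        S = (w * diff.T) @ diff                # sum_i z_ij u_ij (x_i - mu_j)(x_i - mu_j)^T
        grad_sigma = 0.5 * (inv_s @ S @ inv_s - z[:, j].sum() * inv_s)
        new_sigmas[j] = sigmas[j] + (2.0 / z[:, j].sum()) * sigmas[j] @ grad_sigma @ sigmas[j]

    return new_pis, new_mus, new_sigmas
```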
The remainder of this section proves the optimized algorithm theoretically. Equations (13) and (14) take the same form as in the Gaussian-mixture case [14], so here we prove the last one, equation (15). First, the gradient of the log-likelihood with respect to $\Sigma_j$ can be calculated as

$$\frac{\partial L}{\partial \Sigma_j} = \sum_{i=1}^{n} \frac{\pi_j f(x_i \mid \theta_j)}{\sum_{l=1}^{g} \pi_l f(x_i \mid \theta_l)}\,\frac{\partial \ln f(x_i \mid \theta_j)}{\partial \Sigma_j} = \sum_{i=1}^{n} z_{ij}\,\frac{1}{2}\left[u_{ij}\,\Sigma_j^{-1}(x_i - \mu_j)(x_i - \mu_j)^T \Sigma_j^{-1} - \Sigma_j^{-1}\right] \qquad (19)$$

In the original EM algorithm of Peel et al. [11], the covariance of unit j is updated as [11]:

$$\begin{aligned}
\Sigma_j^{(k+1)} &= \Sigma_j^{(k)} + \left[\frac{\sum_{i=1}^{n} z_{ij} u_{ij} (x_i - \mu_j)(x_i - \mu_j)^T}{\sum_{i=1}^{n} z_{ij}} - \Sigma_j^{(k)}\right] \\
&= \Sigma_j^{(k)} + \Sigma_j^{(k)}\left[\frac{\sum_{i=1}^{n} z_{ij} u_{ij}\, \Sigma_j^{(k)-1} (x_i - \mu_j)(x_i - \mu_j)^T \Sigma_j^{(k)-1}}{\sum_{i=1}^{n} z_{ij}} - \Sigma_j^{(k)-1}\right]\Sigma_j^{(k)} \\
&= \Sigma_j^{(k)} + \frac{2}{\sum_{i=1}^{n} z_{ij}}\,\Sigma_j^{(k)} \left.\frac{\partial L}{\partial \Sigma_j}\right|_{\Sigma_j = \Sigma_j^{(k)}} \Sigma_j^{(k)}
\end{aligned} \qquad (20)$$

Since $\operatorname{vec}(AXB) = (B^T \otimes A)\operatorname{vec}X$, we obtain the matrix $P_{\Sigma_j}^{(k)}$:

$$P_{\Sigma_j}^{(k)} = \frac{2}{\sum_{i=1}^{n} z_{ij}}\left(\Sigma_j^{(k)} \otimes \Sigma_j^{(k)}\right) \qquad (21)$$

Furthermore, for an arbitrary matrix U we have

$$\begin{aligned}
\operatorname{vec}(U)^T P_{\Sigma_j}^{(k)} \operatorname{vec}(U) &= \frac{2}{\sum_{i=1}^{n} z_{ij}}\operatorname{vec}(U)^T \left(\Sigma_j^{(k)} \otimes \Sigma_j^{(k)}\right)\operatorname{vec}(U) = \frac{2}{\sum_{i=1}^{n} z_{ij}}\operatorname{tr}\!\left(U^T \Sigma_j^{(k)} U \Sigma_j^{(k)}\right) \\
&= \frac{2}{\sum_{i=1}^{n} z_{ij}}\operatorname{tr}\!\left[\left(\Sigma_j^{(k)1/2}\, U\, \Sigma_j^{(k)1/2}\right)^T \left(\Sigma_j^{(k)1/2}\, U\, \Sigma_j^{(k)1/2}\right)\right] \ge 0
\end{aligned} \qquad (22)$$

Thus the matrix $P_{\Sigma_j}^{(k)}$ is positive definite with probability one, and by classical optimization theory the updated covariance is a legal step along the gradient-ascent direction.

3 Results
We applied the optimized EM algorithm to simulated data and to real spike data. For ease of comparison, the primal EM algorithm was applied to the identical data in each figure and table.

3.1 Simulation
The simulation data were generated as follows. We used four planar (bivariate) t-distributed components, each with the same mixing proportion of 0.25; their mean centers were (-4, 0), (0, -4), (+4, 0), and (0, +4), respectively; the DOF was 10; and the correlation matrices of the components were

$$\Sigma_1 = \Sigma_3 = \begin{pmatrix} 1 & 0.7 \\ 0.7 & 1 \end{pmatrix}, \qquad \Sigma_2 = \Sigma_4 = \begin{pmatrix} 1 & -0.7 \\ -0.7 & 1 \end{pmatrix}$$
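As an illustration of how bivariate t-distributed clusters of this kind can be generated (a sketch of our own, not the authors' code; it uses the standard construction x = μ + z/√w with z Gaussian and w ~ χ²(v)/v, and the per-component sample counts are an assumption):

```python
import numpy as np

def sample_mvt(mean, scale, dof, size, rng):
    """Draw multivariate t samples as mean + z / sqrt(w), z ~ N(0, scale), w ~ chi2(dof)/dof."""
    z = rng.multivariate_normal(np.zeros(len(mean)), scale, size)
    w = rng.chisquare(dof, size) / dof
    return np.asarray(mean) + z / np.sqrt(w)[:, None]

rng = np.random.default_rng(0)
means = [(-4, 0), (0, -4), (4, 0), (0, 4)]
scales = [np.array([[1.0, 0.7], [0.7, 1.0]]), np.array([[1.0, -0.7], [-0.7, 1.0]])] * 2
# Equal proportions of 0.25: 250 samples per component, 1000 in total (illustrative counts).
data = np.vstack([sample_mvt(m, s, dof=10, size=250, rng=rng)
                  for m, s in zip(means, scales)])
```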
The original simulation data are illustrated in Fig. 1, together with the classification results of the EM algorithm and of our optimized algorithm. Each data set has n = 1000 samples; the ellipses indicate the 2-sigma regions, so about 95% of the data of each class lie inside the corresponding ellipse, and the label "+" marks the center of each component. The later figures in this study follow the same convention. Note that the same label or color within one figure denotes the identical component, whereas the same color in different figures does not.

Fig. 1. Application of the EM algorithm and of our optimized EM algorithm to simulated data with a bivariate t-distribution. (a) The primal simulation data, (b) the primal EM algorithm, and (c) our optimized EM algorithm with gradient ascent. The letters and rectangles in each panel are added for later discussion.

We applied the primal EM algorithm and the optimized EM algorithm to the simulated data set; the classification results are given in Table 1. The parameters have the following meaning: the default proportion of the component in the mixture model (Prop.), the default mean center of the component (μ), the default (and estimated) correlation matrix of the component (Corr), the estimated covariance matrix of the component (Cov), the default (and estimated) degree of freedom of the t-distribution (DOF), and the number of iterations of the algorithm (Ite.). As Table 1 shows, the original and the optimized algorithms give closely matching classification results for the simulation data; nevertheless, they differ in accuracy, processing speed, and so on. First, within the estimation error, the original EM algorithm and the optimized one give the same result, with the optimized algorithm having the smaller error. Second, the optimized algorithm shows a better convergence rate than the original algorithm.

Table 1. Parameters of the finite mixture models (component 1 as an example)

Parameter      Prop.    μ                   Corr                            Cov                              DOF     Ite.
Simulation     0.25     (-4, 0)             [1.0 0.7; 0.7 1.0]              [1.216 0.7660; 0.7660 1.089]     10      --
Primal EM      0.2717   (-4.088, 0.0984)    [1.000 0.6681; 0.6681 1.000]    [0.9595 0.6128; 0.6128 0.9104]   9.084   103
Optimized EM   0.2717   (-4.088, 0.0985)    [1.000 0.6681; 0.6681 1.000]    [0.9599 0.6131; 0.6131 0.9109]   9.105   91

3.2 Spike classification
Our real spike data were obtained with a 16-microelectrode array; of the 16 available electrodes, 14 provided single- or multi-unit activity. A more detailed description of the extracellular acquisition system is provided by Shoham et al. [9].
Spike sorting begins after the spike data have been obtained from the multi-electrode experiments. Since the earliest spike-sorting research, many studies have concentrated on finite Gaussian mixture models; in statistical terms, however, the multivariate t-distribution is more appropriate for describing spikes in a high-dimensional space [9]. In our study we used the optimized EM algorithm with the multivariate t-distribution for spike classification. For convenience of computation and illustration, we preprocessed the spike waveform data with time alignment, de-noising, and principal component analysis (PCA), and then displayed the spike waveforms by their first two principal components. Fig. 2 shows the
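As a small illustration of the PCA step just described (our own sketch, assuming the waveforms have already been time-aligned and de-noised; it is not the authors' pipeline):

```python
import numpy as np

def first_two_pcs(waveforms):
    """Project spike waveforms onto their first two principal components.

    waveforms: (n_spikes, n_samples) array of time-aligned, de-noised spike waveforms.
    Returns an (n_spikes, 2) feature matrix used for clustering and display.
    """
    centered = waveforms - waveforms.mean(axis=0)
    # The right singular vectors of the centered data are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T
```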
