


The Nature of Statistical Learning Theory
by V. N. Vapnik
Berlin: Springer-Verlag, 1995

A Useful Biased Estimator

Vapnik is one of the Big Names in machine learning and statistical inference; this is his statement of what is important, how to do it, and who figured out how to do it. His views on all these matters are decided, at least a little idiosyncratic, and worth attending to.

The general setting of the problem of statistical learning, according to Vapnik, is as follows. We want to estimate some functional which depends on an unknown distribution over a probability space X - it could be a concept in the machine-learning sense, regression coefficients, moments of the distribution, Shannon entropy, etc.; even the distribution itself. We have a class of admissible distributions, called hypotheses, and a loss functional, an integral over X which tells us, for each hypothesis, how upset we should be when we guess wrong; this implicitly depends on the true distribution. Clearly we want the best hypothesis, the one which minimizes the loss functional - but to explicitly calculate that we'd need to know the true distribution. Vapnik assumes that we have access to a sequence of independent random variables, all drawn from the (stationary) true distribution. What then are we to do?

Vapnik's answer takes two parts. The first has to do with empirical risk minimization: approximate the true, but unknown, loss functional, which is an integral over the whole space X, with a sum over the observed data points, and go with the hypothesis that minimizes this empirical risk; call this, though Vapnik doesn't, the ERM hypothesis. It's possible that the ERM hypothesis will do badly in the future, because we blundered into unrepresentative data, but we can show necessary and sufficient conditions for the loss of the ERM hypothesis to converge in probability to the loss of the best hypothesis.
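The ERM recipe just described is easy to sketch in code. A minimal, hypothetical illustration (the threshold classifiers and 0-1 loss here are my choices for concreteness, not anything from the book), replacing the integral that defines the true risk with an average over the sample:

```python
import numpy as np

def empirical_risk(h, X, y):
    """Average 0-1 loss on the sample: the sum over observed
    data points that stands in for the integral over X."""
    return float(np.mean(h(X) != y))

def erm(hypotheses, X, y):
    """Empirical risk minimization over a finite hypothesis class:
    return the hypothesis with the smallest empirical risk."""
    return min(hypotheses, key=lambda h: empirical_risk(h, X, y))

# Toy data: points on the unit interval, labeled by a true threshold.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=200)
y = (X > 0.6).astype(int)

# A small class of threshold classifiers h_t(x) = 1[x > t], t = 0, 0.1, ..., 1.
hypotheses = [lambda x, t=i / 10: (x > t).astype(int) for i in range(11)]

best = erm(hypotheses, X, y)
print(empirical_risk(best, X, y))  # 0.0: the class contains the true threshold
```

The hypothesis class here contains the function that generated the labels, so the minimized empirical risk is zero; the interesting question, which the bounds below address, is how far that empirical figure can stray from the true risk.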
Moreover, we can prove that, under certain very broad conditions, if we just collect enough data points, then the loss of the ERM hypothesis is, with high probability, within a certain additive distance ("confidence interval" - Vapnik's scare-quotes) of the loss of the best hypothesis. These conditions involve the Vapnik-Chervonenkis dimension, and a related quantity called the Vapnik-Chervonenkis entropy. Very remarkably, we can even calculate how much data we need to get a given approximation, at a given level of confidence, regardless of what the true distribution is, i.e. we can calculate distribution-independent bounds. (They do, however, depend on the nature of the integrands in the loss functional.) As Vapnik points out, these results about convergence, approximation, etc. are in essence extensions of the Law of Large Numbers to spaces of functions. As such (though he does not point this out), the assumption that successive data points are independent and identically distributed is key to the whole exercise. He doesn't talk about what to do when this assumption fails.

The second part of Vapnik's procedure is an elaboration of the first: for a given amount of data, we pick the hypothesis which minimizes the sum of the empirical risk and the confidence interval about it. He calls this structural risk minimization, though to be honest I couldn't tell you what structure he has in mind. More popular principles of inference - maximum likelihood, Bayesianism, and minimum description length - are all weighed in the balance against structural risk minimization and found more or less wanting.

Vapnik's view of the history of the field is considerably more idiosyncratic than most of his opinions: in epitome, it is that everything important was done by himself and Chervonenkis in the late 1960s and early 1970s, and that everyone else, American computer scientists especially, is a bunch of wankers. Indeed, this is a very Russian book in several senses.
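For the record, the distribution-independent bound invoked a few paragraphs back can be written out. One standard form (the constants differ across statements in the literature, and this is not necessarily the exact version in the book) reads: for a class of VC dimension $d$, 0-1 loss, and a sample of size $n$, with probability at least $1 - \eta$ every hypothesis $h$ satisfies

```latex
R(h) \;\le\; R_{\mathrm{emp}}(h)
  \;+\; \sqrt{\frac{d\left(\ln\frac{2n}{d} + 1\right) + \ln\frac{4}{\eta}}{n}}
```

where $R$ is the true risk (the loss functional) and $R_{\mathrm{emp}}$ the empirical risk; the square-root term is the "confidence interval" that structural risk minimization adds to the empirical risk before minimizing.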
I don't just mean that it clearly wasn't written (or edited) by somebody fluent in English - the missing articles, dropped copulas, and mangled verb tenses are annoying but not so bad as to conceal Vapnik's meaning. More important, and more characteristically Russian, is the emphasis on mathematical abstraction, logical rigor and formal elaboration, all for their own sweet sakes. (Detailed proofs are, however, left to his papers.) Vapnik opposes the idea that complex theories don't work while simple algorithms do, which is fair enough, but he seems almost hurt that simple algorithms ever work, that something as pragmatic and unanalytical as a neural network can not just work, but sometimes even outperform machines based on his own principles. There are a number of other oddities here, like an identification of Karl Popper's notion of the unfalsifiable with classes of functions with infinite VC dimension, and some talk about Hegel I didn't even try to understand.

I think Vapnik suffers from a certain degree of self-misunderstanding in calling this a summary of learning theory, since many issues which would loom large in a general theory of learning - computational tractability, choosing the class of admissible hypotheses, representations of hypotheses and how the means of representation may change, etc. - are just left out. Instead this is an excellent overview of a certain sort of statistical inference, a generalization of the classical theory of estimation. In the hands of a master like Vapnik, this covers such a surprisingly large territory that it's almost no wonder he imagines it extends over the entire field. That said, there is a lot here for those interested in even the most general and empirical aspects of learning and inference, though they'll ne