Foreign-Language Literature Translation: Research on Machine Learning

Machine Learning Research: Four Current Directions

Thomas G. Dietterich

■ Machine learning research has been making great progress in many directions. This article summarizes four of these directions and discusses some current open problems. The four directions are (1) the improvement of classification accuracy by learning ensembles of classifiers, (2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.

The last five years have seen an explosion in machine learning research. This explosion has many causes: First, separate research communities in symbolic machine learning, computational learning theory, neural networks, statistics, and pattern recognition have discovered one another and begun to work together. Second, machine learning techniques are being applied to new kinds of problem, including knowledge discovery in databases, language processing, robot control, and combinatorial optimization, as well as to more traditional problems such as speech recognition, face recognition, handwriting recognition, medical data analysis, and game playing.

In this article, I selected four topics within machine learning where there has been a lot of recent activity. The purpose of the article is to describe the results in these areas to a broader AI audience and to sketch some of the open research problems. The topic areas are (1) ensembles of classifiers, (2) methods for scaling up supervised learning algorithms, (3) reinforcement learning, and (4) the learning of complex stochastic models.

The reader should be cautioned that this article is not a comprehensive review of each of these topics. Rather, my goal is to provide a representative sample of the research in each of these four areas. In each of the areas, there are many other papers that describe relevant work. I apologize to those authors whose work I was unable to include in the article.

Ensembles of Classifiers

The first topic concerns methods for improving accuracy in supervised learning. I begin by introducing some notation. In supervised learning, a learning program is given training examples of the form $\{(x_1, y_1), \ldots, (x_m, y_m)\}$ for some unknown function $y = f(x)$. The $x_i$ values are typically vectors of the form $\langle x_{i1}, x_{i2}, \ldots, x_{in} \rangle$ whose components are discrete or real valued, such as height, weight, color, and age. These are also called the
features of $x_i$. I use the notation $x_{ij}$ to refer to the $j$th feature of $x_i$. In some situations, I drop the $i$ subscript when it is implied by the context. The $y$ values are typically drawn from a discrete set of classes $\{1, \ldots, k\}$ in the case of classification or from the real line in the case of regression. In this article, I focus primarily on classification. The training examples might be corrupted by some random noise.

Given a set $S$ of training examples, a learning algorithm outputs a classifier. The classifier is a hypothesis about the true function $f$. Given new $x$ values, it predicts the corresponding $y$ values. I denote classifiers by $h_1, \ldots, h_L$.

An ensemble of classifiers is a set of classifiers whose individual decisions are combined in some way (typically by weighted or unweighted voting) to classify new examples. One of the most active areas of research in supervised learning has been the study of methods for constructing good ensembles of classifiers. The main discovery is that ensembles are often much more accurate than the individual classifiers that make them up.

An ensemble can be more accurate than its component classifiers only if the individual classifiers disagree with one another (Hansen and Salamon 1990). To see why, imagine that we have an ensemble of three classifiers $\{h_1, h_2, h_3\}$, and consider a new case $x$. If the three classifiers are identical, then when $h_1(x)$ is wrong, $h_2(x)$ and $h_3(x)$ are also wrong. However, if the errors made by the classifiers are uncorrelated, then when $h_1(x)$ is wrong, $h_2(x)$ and $h_3(x)$ might be correct, so that a majority vote correctly classifies $x$. More precisely, if the error rates of $L$ hypotheses $h_\ell$ are all equal to $p < 1/2$ and if the errors are independent, then the probability that the majority vote is wrong is the area under the binomial distribution where more than $L/2$ hypotheses are wrong. Figure 1 shows this area for a simulated ensemble of 21 hypotheses, each having an error rate of 0.3. The area under the curve for 11 or more hypotheses being simultaneously wrong is 0.026, which is much less than the error rate of the individual hypotheses.

Of course, if the individual hypotheses make uncorrelated errors at rates exceeding 0.5, then the error rate of the voted ensemble increases as a result of the voting. Hence, the key to successful ensemble methods is to construct individual classifiers with error rates below 0.5 whose errors are at least somewhat uncorrelated.

Methods for Constructing Ensembles

Many methods for constructing ensembles have been developed. Some methods are general, and they can be applied to any learning algorithm. Other methods are specific to particular algorithms. I begin by reviewing the general techniques.

Subsampling the Training Examples

The first method manipulates the training examples to generate multiple hypotheses. The learning algorithm is run several times, each time with a different subset of the training examples. This technique works especially well for unstable learning algorithms, that is, algorithms whose output classifier undergoes major changes in response to small changes in the training data. Decision tree, neural network, and rule learning algorithms are all unstable. Linear regression, nearest neighbor, and linear threshold algorithms are generally stable.

The most straightforward way of manipulating the training set is called bagging. On each run, bagging presents the learning algorithm with a training set that consists of a sample of $m$ training examples drawn randomly with replacement from the original training set of $m$ items. Such a training set is called a bootstrap replicate of the original training set, and the technique is called bootstrap aggregation (Breiman 1996a). Each bootstrap replicate contains, on the average, 63.2 percent of the original set, with several training examples appearing multiple times.

Another training set sampling method is to construct the training sets by leaving out disjoint subsets. For example, the training set can be divided randomly into 10 disjoint subsets. Then, 10 overlapping training sets can be constructed by dropping out a different one of these 10 subsets. This same procedure is used to construct training sets for tenfold cross-validation; so, ensembles constructed in this way are sometimes called cross-validated committees (Parmanto, Munro, and Doyle 1996).

The third method for manipulating the training set is illustrated by the ADABOOST algorithm, developed by Freund and Schapire (1996, 1995) and shown in figure 2. Like bagging, ADABOOST manipulates the training examples to generate multiple hypotheses. ADABOOST maintains a probability distribution $p_\ell(x)$ over the training examples. In each iteration $\ell$, it draws a training set of size $m$ by sampling with replacement according to the probability distribution $p_\ell(x)$. The learning algorithm is then applied to produce
a classifier $h_\ell$. The error rate $\epsilon_\ell$ of this classifier on the training examples (weighted according to $p_\ell(x)$) is computed and used to adjust the probability distribution on the training examples. (In figure 2, note that the probability distribution is obtained by normalizing a set of weights $w_\ell(i)$ over the training examples.) The effect of the change in weights is to place more weight on examples that were misclassified by $h_\ell$ and less weight on examples that were correctly classified. In subsequent iterations, therefore, ADABOOST constructs progressively more difficult learning problems.

The final classifier, $h_f$, is constructed by a weighted vote of the individual classifiers. Each classifier is weighted according to its accuracy for the distribution $p_\ell$ that it was trained on.

In line 4 of the ADABOOST algorithm (figure 2), the base learning algorithm Learn is called with the probability distribution $p_\ell$. If the learning algorithm Learn can use this probability distribution directly, then this procedure generally gives better results. For example, Quinlan (1996) developed a version of the decision tree learning program C4.5 that works with a weighted training sample. His experiments showed that it worked extremely well. One can also imagine versions of backpropagation that scaled the computed output error for training example $(x_i, y_i)$ by the weight $p_\ell(i)$. Errors for important training examples would cause larger gradient descent steps than errors for unimportant (low-weight) examples.

However, if the algorithm cannot use the probability distribution $p_\ell$ directly, then a training sample can be constructed by drawing a random sample with replacement in proportion to the probabilities $p_\ell$. This procedure makes ADABOOST more stochastic, but experiments have shown that it is still effective.

Figure 3 compares the performance of C4.5 to C4.5 with ADABOOST.M1 (using random sampling). One point is plotted for each of 27 test domains taken from the Irvine repository of machine learning databases (Merz and Murphy 1996). We can see that most points lie above the line $y = x$, which indicates that the error rate of ADABOOST is less than the error rate of C4.5. Figure 4 compares the performance of bagging (with C4.5) to C4.5 alone. Again, we see that bagging produces sizable reductions in the error rate of C4.5 for many problems. Finally, figure 5 compares bagging with boosting (both using C4.5 as the underlying algorithm). The results show that the two techniques are comparable, although boosting appears to still have an advantage over bagging.

Manipulating the Input Features

A second general technique for generating multiple classifiers is to manipulate the set of input features available to the learning algorithm. For example, in a project to identify volcanoes on Venus, Cherkauer (1996) trained an ensemble of 32 neural networks. The 32 networks were based on 8 different subsets of the 119 available input features and 4 different network sizes. The input feature subsets were selected (by hand) to group features that were based on different image processing operations (such as principal component analysis and the fast Fourier transform). The resulting ensemble classifier was able to match the performance of human experts in identifying volcanoes. Tumer and Ghosh (1996) applied a similar technique to a sonar data set with 25 input features. However, they found
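
The majority-vote argument in the Ensembles of Classifiers section can be checked numerically. The following is a minimal Python sketch, assuming (as the text does) that every hypothesis errs independently at the same rate; the function name `majority_vote_error` is illustrative and does not come from the article.

```python
from math import comb


def majority_vote_error(num_hypotheses: int, error_rate: float) -> float:
    """Probability that more than half of the hypotheses are wrong.

    Assumes independent errors at a common rate, so the number of wrong
    hypotheses follows a binomial distribution.
    """
    first_losing_count = num_hypotheses // 2 + 1  # smallest "more than half"
    return sum(
        comb(num_hypotheses, k)
        * error_rate ** k
        * (1 - error_rate) ** (num_hypotheses - k)
        for k in range(first_losing_count, num_hypotheses + 1)
    )


if __name__ == "__main__":
    # The article's setting: 21 hypotheses, each with error rate 0.3.
    print(round(majority_vote_error(21, 0.3), 3))
    # Error rates above 0.5 make the vote worse than a single hypothesis.
    print(majority_vote_error(21, 0.6) > 0.6)
```

With 21 hypotheses at an error rate of 0.3, the sum covers 11 through 21 simultaneous errors, which is the area the text describes for figure 1.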
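
The two resampling schemes described under Subsampling the Training Examples (bootstrap replicates for bagging, and the leave-one-subset-out sets behind cross-validated committees) can be sketched in a few lines of Python. The function names are hypothetical, chosen for illustration only. The 63.2 percent figure quoted for bagging is consistent with the expected fraction of distinct items in a replicate, $1 - (1 - 1/m)^m$, which approaches $1 - 1/e$ as $m$ grows.

```python
import random


def bootstrap_replicate(examples, rng):
    """Draw len(examples) items uniformly at random, with replacement."""
    return [rng.choice(examples) for _ in range(len(examples))]


def expected_unique_fraction(m: int) -> float:
    """Expected fraction of the m originals that appear in one replicate."""
    return 1.0 - (1.0 - 1.0 / m) ** m


def cross_validated_training_sets(examples, num_folds, rng):
    """Split into num_folds disjoint subsets; drop a different one each time."""
    shuffled = list(examples)
    rng.shuffle(shuffled)
    folds = [shuffled[i::num_folds] for i in range(num_folds)]
    return [
        [ex for j, fold in enumerate(folds) if j != dropped for ex in fold]
        for dropped in range(num_folds)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    data = list(range(10000))
    replicate = bootstrap_replicate(data, rng)
    print(len(set(replicate)) / len(data))   # empirical unique fraction
    print(expected_unique_fraction(len(data)))
    print([len(s) for s in cross_validated_training_sets(data, 10, rng)])
```

Each returned training set would then be passed to the same learning algorithm, and the resulting classifiers combined by voting.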
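
The reweighting loop described for ADABOOST can also be sketched in code. Figure 2 is not reproduced in this text, so the sketch below is an assumption: it follows the commonly published AdaBoost.M1 update (multiply the weights of correctly classified examples by $\beta = \epsilon / (1 - \epsilon)$, renormalize, and vote with weight $\log(1/\beta)$). The base learner `train_stump`, a one-feature threshold rule that uses the distribution directly, is a hypothetical stand-in for Learn, and the handling of a zero-error classifier is a design choice of this sketch.

```python
import math


def train_stump(xs, ys, dist):
    """Hypothetical base learner: best weighted threshold rule on one feature."""
    best_error, best_rule = None, None
    for threshold in sorted(set(xs)):
        for low_label, high_label in ((0, 1), (1, 0)):
            def rule(x, t=threshold, lo=low_label, hi=high_label):
                return hi if x >= t else lo
            error = sum(p for x, y, p in zip(xs, ys, dist) if rule(x) != y)
            if best_error is None or error < best_error:
                best_error, best_rule = error, rule
    return best_rule


def adaboost_m1(xs, ys, rounds):
    """Return a list of (vote_weight, classifier) pairs."""
    m = len(xs)
    weights = [1.0 / m] * m
    ensemble = []
    for _ in range(rounds):
        total = sum(weights)
        dist = [w / total for w in weights]      # normalize weights
        h = train_stump(xs, ys, dist)            # learner uses dist directly
        eps = sum(p for x, y, p in zip(xs, ys, dist) if h(x) != y)
        if eps >= 0.5:                           # no better than chance: stop
            break
        if eps == 0:                             # perfect rule: let it decide alone
            return [(1.0, h)]
        beta = eps / (1.0 - eps)
        # Shrink the weight of correctly classified examples.
        weights = [p * (beta if h(x) == y else 1.0)
                   for x, y, p in zip(xs, ys, dist)]
        ensemble.append((math.log(1.0 / beta), h))
    return ensemble


def ensemble_predict(ensemble, x):
    """Weighted vote over the individual classifiers."""
    votes = {}
    for vote_weight, h in ensemble:
        label = h(x)
        votes[label] = votes.get(label, 0.0) + vote_weight
    return max(votes, key=votes.get)


if __name__ == "__main__":
    xs = [1, 2, 3, 4, 5, 6]
    ys = [1, 1, 0, 0, 1, 1]   # no single threshold separates these labels
    ensemble = adaboost_m1(xs, ys, rounds=3)
    print([ensemble_predict(ensemble, x) for x in xs])
```

On this toy data no single threshold rule is correct everywhere, yet the weighted vote of three rules reproduces every training label, which illustrates how shifting weight onto misclassified examples yields classifiers with complementary errors.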