Mach Learn (2006) 63:211-215
DOI 10.1007/s10994-006-8919-x

GUEST EDITORIAL

Machine learning and games

Michael Bowling, Johannes Fürnkranz, Thore Graepel, Ron Musick

Published online: 10 May 2006
Springer Science + Business Media, LLC 2006

M. Bowling (corresponding author)
e-mail: bowling@cs.ualberta.ca

J. Fürnkranz
e-mail: fuernkranz@informatik.tu-darmstadt.de

T. Graepel
e-mail:

R. Musick
e-mail:

The history of the interaction of machine learning and computer game-playing goes back to the earliest days of Artificial Intelligence, when Arthur Samuel worked on his famous checker-playing program, pioneering many machine-learning and game-playing techniques (Samuel, 1959, 1967). Since then, both fields have advanced considerably, and research in the intersection of the two can be found regularly in conferences in their respective fields and in general AI conferences. For surveys of the field we refer to Ginsberg (1998), Schaeffer (2000), Fürnkranz (2001); edited volumes have been compiled by Schaeffer and van den Herik (2002) and by Fürnkranz and Kubat (2001).

In recent years, the computer games industry has discovered AI as a necessary ingredient to make games more entertaining and challenging and, vice versa, AI has discovered computer games as an interesting and rewarding application area. The industry's perspective is witnessed by a plethora of recent books offering gentle introductions to AI techniques for game programmers (Collins, 2002; Champanard, 2003; Bourg & Seemann, 2004; Schwab, 2004) or a series of edited collections of articles (Rabin, 2002, 2003, 2006). AI research on computer games began to follow developments in the games industry early on, but since John Laird's keynote address at the AAAI 2000 conference, in which he advocated Interactive Computer Games as a challenging and rewarding application area for AI (Laird & van Lent, 2001), numerous workshops (Fu & Orkin, 2004; Aha et al., 2005), conferences, and special issues of journals (Forbus & Laird, 2002) demonstrate the growing importance of game-playing applications for Artificial Intelligence.

Games, whether created for entertainment, simulation, or education, provide great opportunities for machine learning. The variety of possible virtual worlds and the subsequent ML-relevant problems posed for the agents in those worlds is limited only by the imagination. Furthermore, not only is the games industry large and growing (having surpassed the movie industry in revenue a few years back), but it is faced with a tremendous demand for novelty that it struggles to provide. Against this backdrop, machine-learning-driven successes would draw high-profile attention to the field. Surprisingly, however, the more commercial the game to date, the less impact learning has made. This is quite unlike other great matches between application and data-driven analytics such as data mining and OLAP.

Topics of particular importance for successful game applications include learning how to play the game well, player modeling, adaptivity, model interpretation and of course performance. These needs can be recast as a call for new practical and theoretical tools to help with:

learning to play the game: Game worlds provide excellent test beds for investigating the potential to improve agents' capabilities via learning. The environment can be constructed with varying characteristics, from deterministic and discrete as in classical board and card games to non-deterministic and continuous as in action computer games. Learning algorithms for such tasks have been studied quite thoroughly. Probably the best-known instance of a learning game-playing agent is the Backgammon-playing program TD-Gammon (Tesauro, 1995).

learning about players: Opponent modeling, partner modeling, team modeling, and multiple team modeling are fascinating, interdependent and largely unsolved challenges that aim at improving play by trying to discover and exploit the plans, strengths, and weaknesses of a player's opponents and/or partners. Among the grand challenges in this line of work are games like Poker, where opponent modeling is crucial to improve over game-theoretically optimal play (Billings et al., 2002).

behavior capture of players: Creating a convincing avatar based on a player's in-game behavior is an interesting and challenging supervised learning task. For example, in Massive Multiplayer Online Role-playing Games (MMORGs) an avatar that is trained to simulate a user's game-playing behavior could take its creator's place at times when the human player cannot attend to the game character. First steps in this area have been made in commercial video games such as Forza Motorsport (Xbox), where the player can train a “Drivatar” that learns to go around the track in the style of the player by observing and learning from the driving style of that player and generalizing to new tracks and cars.

model selection and stability: Online settings lead to what is effectively the unsupervised construction of models by supervised algorithms. Methods for biasing the proposed model space without significant loss of predictive power are critical not just for learning efficiency, but also for interpretive ability and end-user confidence.

optimizing for adaptivity: Building opponents that can just barely lose in interesting ways is just as important for the game world as creating world-class opponents. This requires building highly adaptive models that can substantively personalize to adversaries or partners with a wide range of competence and rapid shifts in play style. By introducing a very different set of update and optimization criteria for learners, a wealth of new research targets is created.

model interpretation: “What's my next move” is not the only query desired of models in a game, but it is certainly the one which gets the most attention. Creating the illusion of intelligence requires “painting a picture” of an agent's thinking process. The ability to describe the current state of a model and the process of inference in that model from decision to decision enables queries that provide the foundation for a host of social actions in a game such as predictions, contracts, counter-factual assertions, advice, justification, negotiation, and demagoguery. These can have as much or more influence on outcomes as actual in-game actions.

performance: Resource requirements for update and inference will always be of great importance. The AI does not get the bulk of the CPU or memory, and the machines driving the market will always be underpowered compared to typical desktops at any point in time.

This special issue contains three articles and one research note that span the wide range of research in the intersection of game playing and machine learning.

In the first contribution, Adaptive Game AI with Dynamic Scripting, Spronck et al. tackle the problem of adaptivity by dynamically modifying the rules which govern character behavior in-game. This paper is targeted at the commercial games industry, and provides some good insight into problems faced by the creators of today's role-playing games. The authors propose four functional and four computational requirements for on-line learning in games. They then proceed to show how dynamic scripting fits into those requirements, and provide experimental evidence of the potential promise of this approach. Dynamic scripting can be characterized as stochastic optimization. The authors evaluate dynamic scripting on both the task of providing the toughest opponent possible, and on the task of difficulty scaling. Good difficulty scaling underpins what makes most games fun; solving this problem is often very challenging, and the solutions are almost always ad hoc. The authors present experimental data that compares dynamic scripting to static opponents and those controlled by Q-Learning and Monte Carlo. The test environments include both simulated games and an actual commercial game (Neverwinter Nights), and help to present a very interesting study which is sure to blaze a path for further interesting research.

The second paper, Universal Parameter Optimization in Games Based on SPSA by Szepesvari and Kocsis, considers the problem of optimizing parameters to improve the performance of parameterized policies for game play. They consider the Simultaneous Perturbation Stochastic Approximation (SPSA) method introduced by Spall (1992), which is a general gradient-free optimization method that is applicable to a wide range of optimization problems. The authors demonstrate that SPSA is applicable to a wide range of typical optimization problems in games and propose several methods to enhance the performance of SPSA. These enhancements include the use of common random numbers and antithetic variables, a combination with RPROP and the reuse of samples. The application to games considers the domain of learning to play Omaha Hi-Lo Poker with their poker program McRaise. SPSA combined with their proposed enhancements leads to poker performance competitive with TD-learning, the method so successfully used by Tesauro (1995) for learning a world-class evaluation function for Backgammon and still used in today's world-class backgammon programs such as JellyFish and Snowie.

The third contribution, Learning to Bid in Bridge by Markovitch and Amit, addresses the problem of bidding in the game of Bridge. While research in Bridge playing has pioneered Monte Carlo search algorithms for the playing phase of card games and resulted in programs of considerable strength (Ginsberg, 1999), the bidding phase, in which the goal (the so-called contract) of the subsequent playing phase is determined, is still a major weakness of existing Bridge programs. This paper is about an approach that supports the difficult bidding phase in the game of Bridge with techniques from machine learning, in particular opponent modeling via the learning of decision nets and via model-based Monte Carlo sampling to address the problem of hidden information. The evaluation clearly establishes that the system improves with learning, and it seems that the level of play achieved by this program surpasses the level of the bidding module of current state-of-the-art programs and approaches that of an expert player.

Finally, Sadikov and Bratko present a research note on Learning Long-term Chess Strategies from Databases. They address the problem of knowledge discovery in game databases. For many games or subgames (such as chess endgames), there are game databases available, which contain perfect information about the game in the sense that for every possible position, the game-theoretic outcome is stored in a database. However, although these databases contain all information to allow perfect play, they are not amenable to human analysis, and are typically not very well understood. For example, chess Grandmaster John Nunn analyzed several simple chess endgame databases, resulting in a series of widely acknowledged endgame books (Nunn, 1992, 1994b, 1995), but readily admitted that he does not yet understand all aspects of the databases he analyzed (Nunn, 1994a). This paper reports on an attempt to make headway by automatically constructing playing strategies from chess endgame databases. It describes a method for breaking up the problem into different game phases. For each phase, it is then proposed to learn a separate evaluation function via linear regression. Experiments in the king and rook vs. king, or king and queen vs. king and rook, endgames show encouraging results, but also illustrate the difficulty of the problem.

Machine learning has been instrumental to date in building some of the world's best players in Backgammon and has led to interesting results in games like Chess and Go. To move into mainstream commercial games, machine learning research has to face what in many ways are the harder problems of losing in interesting ways, creating more useful illusions of intelligence, hyper-fast adaptation, and taking on persona. The articles in this special issue provide a glimpse into different facets of all of these problems.

References

Aha, D. W., Muñoz-Avila, H. M., & van Lent, M. (Eds.). (2005). Reasoning, representation, and learning in computer games: Proceedings of the IJCAI workshop. Edinburgh, Scotland: Naval Research Laboratory, Navy Center for Applied Research in Artificial Intelligence. Technical Report AIC-05-127.

Billings, D., Pena, L., Schaeffer, J., & Szafron, D. (2002). The challenge of poker. Artificial Intelligence, 134(1-2), 201-240, Special Issue on Games, Computers and Artificial Intelligence.

Bourg, D. M., & Seemann, G. (2004). AI for game developers: Creating intelligent behavior in games. O'Reilly.

Champanard, A. (2003). AI game d