Critique of Agent Model

Eric Xing, †*, Mingkai D[...]

Institute of Foundation Models, Mohamed bin Zayed University of Artificial Intelligence
{eric.xing,mingkai.deng,jinyu.hou}@mbzuai.ac.ae

Abstract

What is an agent? What constitutes agency? With the rise of Large Language Model (LLM) systems marketed as “coding agents”, “AI co-scientists”, and other “agentic” tools that promise to drive up productivity, and at the same time, “existential” concerns such as AI escaping human control with destructive power under a speculative “machine agency” against humans, it has become essential to clarify where automation ends and agency begins, both for building capable systems and for understanding whether and what to fear. Drawing on Descartes’ grounding of agency in independent thought, and on portrayals of autonomous beings in science fiction, we survey the current landscape of AI agents, and analyze agent architectures along five dimensions: goal, identity, decision-making, self-regulation, and learning. Specifically, we argue that genuine agency requires these structures to be internalized within the system itself rather than assembled through external scaffolding. This distinction between agentic systems, whose competence resides in engineered workflows, and agentive systems, whose capabilities (including social interaction) arise endogenously, defines the boundary between systems designed for prescribed tasks, and those capable of operating in the open world with true autonomy. Building on this analysis, we propose the Goal-Identity-Configurator (GIC) architecture for a general-purpose agent model, combining hierarchical goal decomposition, identity evolution, simulative reasoning grounded in a separately trained world model, learned self-regulation, and self-directed learning from both real and simulated experience. Furthermore, we share insight on the auditability, controllability, and safety of agentive systems that possess greater autonomy and “agency”, but remain under human oversight.

1 Introduction

What is an agent? What constitutes genuine agency? For centuries, the question of human agency has been associated with properties such as long-term goals, evolving identity, purposeful planning, formation of social relationships, self-regulation, self-reflection, all the way toward moral responsibility and free will. Philosophical accounts, from Aristotle’s discussions of purposeful action [9] to later views by Descartes [25] that thinking defines existence (“Cogito, ergo sum”), suggest that agents are not just static entities that respond to external stimuli, but dynamic individuals with the ability to reason independently and act freely but rationally in pursuit of goals and well-being.

∗ Co-first author

Can such biologically rooted agency be realized through artificial and mechanical means? A familiar illustration of autonomous artificial agents appears in science fiction. Blade Runner [68], a genre-defining classic, portrays replicants, a type of bio-engineered [...] in strength, agility, and intelligence. These replicants are by no means perfect: they experience [...] move and work in the physical world [...] their own sense of self. Eventually, some bravely step out of their assigned roles to f[...] uncertainty and freedom. Such thought experiments highlight that agency is not synonymous with operational excellence (although often called for), but instead involves the capacity for goal-directed actions, self-development, self-reflection, participation in complex social environments, and, ultimately, possession of free will, morality, and a drive for self-actuation.

This deeper notion of agency stands in contrast to many modern systems labeled as “agents” in contemporary AI research [...] (e.g., software engineering, computer use [...] scaffolding, including predefined tools, workflows, and programmatic control loops that guide behavior through externally defined structures [...] impressive practical success, their capabilities largely arise from orchestrating predefined workflows within constrained environments. In many cases, behaviors are determined by externally specified tools, protocols or training processes [e.g., 4, 6, 88], rather than by an endogenous, flexible decision-making process and intrinsic will.

We find it useful to distinguish between two levels of autonomous systems. Agentic systems, such as those described earlier, complete tasks autonomously through orchestrated tools and workflows; their competence resides primar[...] LLM. Agentive systems, exemplified by biological agents and discussed at length in this paper, [...] long-term goals, evolving self-identity, simulating future possibilities, regulating when and how to reason, or learning better
behaviors) rather than following prescribed procedures, whether at inference time (e.g., fixed planning-execution workflows) or across the development lifecycle (e.g., manual training–deployment–retraining cycles). Current AI systems are largely agentic but not yet agentive: much of their competence resides in their workflows and harnesses, not in the model itself. Consequently, such systems are often better understood as sophisticated software [...] rather than genuinely autonomous agents. While these systems represent [...] only a portion of the broader challenge of artificial agency. Indeed, it is difficult to imagine how enumerating every possible behavior through tools, prompts, or skills will allow AI systems to scale to the diversity and adaptability observed in biological agents.

Humans, for example, exhibit multiple tiers of intelligence (Figure 1): linguistic and symbolic reasoning (e.g., reading, writing, coding), physical and spatial competence (e.g., navigation, manipulation), social understanding (e.g., coordinating and competing with other agents), and higher-level “philosophical” capacities (e.g., curiosity, self-reflection, and goal formation). A single cognitive architecture is able to support this broad range of behaviors without requiring explicit re-engineering [...]

Figure 1: Humans exhibit multiple layers of intelligence: linguistic and symbolic reasoning, physical and spatial competence, social understanding, and higher-level “philosophical” capacities.

Motivated by this observation, we argue that agency should not be treated as the accumulation of external scaffolding, but rather as a property emerging from a model capable of developing its identity, pursuing goals, and expressing and organizing its behavior across diverse environments [...] Rather than constructing agents through increasingly complex software pipelines, we study the [...] broad range of actions with the flexibility, adaptability, and autonomy associated with natural agents (e.g., humans and other animals), and of learning autonomously and perpetually. We refer to such a model as an Agent Model. Specifically, an agent model (AM) is a reasoning model that generates real-world actions based on its goals $g$ and identity $i$. Formally, an AM $\pi$ maps the current world state $s$ to a predicted action $a$ through, for example, a conditional probability
distribution:

$$a \sim p_\pi(a \mid s, g, i)$$

Equipped with such a model, a machine can draw on conceptual knowledge and logical/mathematical reasoning for abstract problem-solving, as well as act in the physical world via its end actuators (e.g., a humanoid body). Crucially, conditioning on goal $g$ and identity $i$ enables the system to inspect, decompose, and revise its long-term objectives (e.g., self-preservation or safety constraints) and self-model (e.g., capabilities and roles) rather than leaving them implicitly distributed across model weights and thus difficult to modify. Whether these are kept fixed by design or updated dynamically is a hallmark of the distinction between agentic and agentive systems. Similarly, how the decision-making procedures [...] and require externally scheduled training to improve, while agentive ones regulate their own deliberation mode during inference (e.g., reacting immediately to an emergency vs. planning carefully for a complex maneuver) and capability updates during learning (e.g., retreating into simulated practice to address an identified weakness). Agency, in this view, arises from intentional actions generated by the model itself rather than from passively following externally scaffolded instructions. We discuss these distinctions in more detail in §2.

How, then, should such a model be built? A basic principle, which we discuss formally in §4.3 and §4.5, is that the agent model must be kept functionally distinct from a world model [85]: the former decides what to do, the latter predicts what will happen. Collapsing the two, as several recent proposals do [86, 48, 56], conflates reward-driven action selection with fidelity-driven next-state prediction, undermining the reliability of both planning and simulation.

At a high level, constructing and training an Agent Model involves five key aspects: goal, identity, decision-making, self-regulation, and learning. The past two years have seen an explosion of systems [...] were offered in these attempts, but a systemic treatment of all aspects with a single framework possible for implementation is still unavailable. In this paper, we categorize these approaches and analyze their limitations towards scalable and general-purpose agents [...] introduce the GIC (Goal-Identity-Configurator) architecture, which provides concrete proposals for each of the five aspects of artificial agency and resultant capabilities within a single adaptive system, paired with a separately learned world model. Specifically, the GIC architecture combines: 1) hierarchical goal decomposition with persistent objectives; 2) an evolving identity that adapts without needing retraining; 3) simulative planning through an internal world model (System II) alongside reactive action (System I); 4) self-regulation of when and how deeply to deliberate via a learned configurator (System III); and 5) self-directed learning from both real and simulated experience. We present these ideas in detail in the sections that follow.

Figure 2: Illustration of an agent acting in an environment to achieve a goal.

2 The Boundary Between Agentic and Agentive Systems

Having introduced the distinction between agentic systems, which complete tasks through externally orchestrated tools and workflows, and agentive systems, whose capabilities arise from internal organization, we now formalize the dimensions along which they differ. Our goal is not to dismiss existing agentic systems, but to identify the minimal properties required for genuine agency, as a guideline for inspiring plausible design and implementation. Each dimension below defines a spectrum: at one end, the relevant structure is fully prescribed by external engineering; at the other, it is maintained and revised internally by the agent as part of its own decision-making.

2.1 Preliminaries: Agent-Environment Model

We begin with a minimal formulation of sequential decision making as a neutral foundation for the discussion that follows. Consider an environment (or universe) represented by a stochastic dynamical system $\mu$, encompassing virtual, physical, and social components. The environment [...] discrete time steps indexed by $t$ (continuous time steps can be approximated by infinitesimally small discrete steps). Let $s_t$ denote the world (and internal) state at time $t$ and $a_t$ an action. The environment defines a transition distribution $p_\mu(s_{t+1} \mid s_t, a_t)$, and an agent is modeled as a policy $\pi$ that produces an action distribution $p_\pi(a_t \mid s_t)$. Given an initial state $s_t$, the interaction between $\pi$ and $\mu$ induces a trajectory distribution:

$$p(a_t, s_{t+1}, a_{t+1}, \ldots, s_T \mid s_t) = \prod_{k=t}^{T-1} p_\pi(a_k \mid s_k)\, p_\mu(s_{k+1} \mid s_k, a_k) \tag{1}$$

Equation 1 describes observable interaction dynamics without assuming any particular internal structure of the agent. The factorization also decomposes the subject of our discussion into exactly two objects: the agent factor $p_\pi(a_k \mid s_k)$, which decides what to do, and the universe factor $p_\mu(s_{k+1} \mid s_k, a_k)$, which determines what happens next. An agent model (AM) is a learned realization of the former; a world model (WM) is a learned approximation of the latter.

We note that the term “world model” has recently been used more broadly, encompassing not only next-state prediction but also next-action generation [86, 48, 56], in effect collapsing the two factors of Equation 1 into a single object. Throughout this paper, we keep them distinct: “world model” refers strictly to the universe factor, and “agent model” to the agent factor together with the internal structures, introduced below, that realize it. We believe the absence of a clear, functional definition of the agent model, distinct from the world model, may have contributed to action generation being absorbed into world-model frameworks by default; this paper offers one such definition and explores its consequences for how the agent reasons (§4.3, §5.2), why the two models cal[...]

In the following subsections, we construct an agent model by introducing latent variables (goals, identity, plans, and regulation mechanisms) that formalize the properties of endogenous agency outlined above. While goals and identity could also be viewed as components of the world state observable by other agents (e.g., one agent inferring an[...] them here as latent variables internal t[...] structures are endogenously maintained vs. externally prescribed.

2.2 Goals and Subgoals

We first enrich the agent-environment formulation by introducing goals, which represent desired outcomes guiding decision-making over time. We denote the agent’s goal at time $t$ by a latent variable $g_t$, conditioning action selection as $p_\pi(a_t \mid s_t, g_t)$. As with the other dimensions discussed below, we distinguish two limiting cases. On one end are externally specified goals, where objectives $g_t$ are supplied at each step (e.g., user instructions, prompts, or task specifications) [...] the interaction ends. On the other end are internally persistent goals $g$, which remain consistent over long horizons. An agent with persistent goals $g$ interprets immediate tasks not as its entire objective, but as subgoals $g_t$ within a larger, continuing trajectory of behavior. In this view, responding to individual user instructions is equivalent to having the top-level goal of “satisfy external directions”, with each instruction as a subgoal. The agent’s capacity, however, extends beyond this speci[...] dependency and priority, and revisable as new information arrives: [...]

This hierarchical structure isolates the difficulty of long-horizon planning in the decomposition module $\delta$, while each subgoal $g_t$ can be pursued by short-horizon capabilities that are easier to learn and supervise. A common way to evaluate goal-directed behavior is through a reward function $r(s_t, g_t)$ measuring the compatibility between the current state and the agent’s current subgoal, and the long-term performance of a policy is evaluated by the expected discounted cumulative reward, also known as the value function [74], with the discount parameter $\gamma_t$ satisfying $\lim_{t \to \infty} \gamma_t = 0$: [...]

The degree to which goal formation, decomposition, and maintenance [...] is one axis along which agentic systems become agenti[...] specified instructions; agentive systems maintain, decompose, and re[...] their ongoing decision-making.

2.3 Identity

We next introduce identity: a latent variable $i_t$ capturing persistent properties that influence decision-making across time, such as capabilities, constraints, affordances, and relationships with other entities. Identity conditions action selection as $p_\pi(a_t \mid s_t, g_t, i_t)$, separating internal self-knowledge from observable dynamics. A key question is how identity is maintained. At one end, identity is static: $i_t = i_0$ for all $t$, fixed by system design (e.g., system prompts, configuration files, or predefined roles). Such designs are practical when the environment is well-understood and predictable, but adaptation requires external re-engineering rather than endogenous updating. At the other end, identity evolves with the environment and internal state $s_t$ through the transition $\iota$: $i_t \sim p_\iota(i_t \mid s_t, i_{t-1})$. An agent with adaptive identity revises its self-model in response to success, failure, or environmental [...]day. Identity in this sense functions not merely as initialization but as an evolving latent state participating in ongoing decision-making: capabilities and role assumptions may be revised, new affordances may be discovered, and relationships with other entities may be updated [...] observed interactions. The degree to which identity is originated, maintained, and revised internally is one axis along which notions of [...]

2.4 Decision-Making

Given goals and identity, an agent must select actions that account for future consequences. [...] true world state $s_t$. Instead, it receives observations $o_t$ and infers a belief state $\hat{s}_t$ representing its best estimate of the world. A learned world model $f$ can then predict the next belief state given a proposed action, according to $p_f(\hat{s}_{t+1} \mid \hat{s}_t, a_t)$. This $f$ is precisely a learned realization of the universe factor of Equation 1, now operating in belief space: it remains a model of the world, distinct from the agent model that queries it. By simulating sequences of actions and their predicted consequences, the agent can approximate optimal behavior without access to the true environment dynamics. Formally, the optimal policy under the world model $f$ selects action sequences that maximize expected goal progress under simulated state transitions, conditioned on the agent’s current subgoal $g_t$ and identity $i_t$: [...]

We refer to this form of deliberation as simulative reasoning (a form of System II reasoning): the agent proposes candidate actions, predicts their consequences through the world model $f$, and selects the sequence that maximizes expected long-term progress. In contrast to traditional logical reasoning (e.g., deduction, induction, abduction), simulative reasoning provides a general-purpose planning mechanism grounded in verifiable next-state prediction, applicable across diverse tasks without domain-specific procedures [85]. In practice, exact optimization over Equation 3 is intractable. We thus denote by $\pi_f$ a simulative planner that approximates $\pi$. Its output is a plan $c_t$ encoding the current belief, a selected action [...]

$$c_t = (\hat{s}_t, a_t, \hat{s}_{t+1}, a_{t+1}, \ldots, \hat{s}_{T'}) \sim p_{\pi_f}(\cdot \mid \hat{s}_t, g_t, i_t). \tag{4}$$

The plan provides structured grounding for coherent behavior over long horizons: predicted future [...] guide execution when anticipated states are encountered or when the current state is highly uncertain (e.g., landing an airplane in low visibility). Given a plan $c_t$, the agent selects concrete actions through an actor $\alpha$ that handles fine-grained reactive execution: $a_t \sim p_\alpha(\cdot \mid \hat{s}_t, c_t)$. This reactive component (System I) captures execution patterns that are difficult to encode in structured plans and enables fast response when d[...] agentive systems is therefore whether planning is an internal computational process (i.e., the agent forms, revises, and acts on plans as a result of its own decision-making) or [...] procedure (e.g., forced reaction, predefined workflow, or always-on model-predictive control). A[...]

2.5 Self-Regulation

Long-horizon planning introduces a question beyond what action to take: how should the decision be made? Different situations call for different amounts and types of internal computation, depending on urgency, difficulty, uncertainty, and resource budget. Some decisions may be handled by direct policy execution (e.g., dodging a ball), while others benefit from extended deliberation or replanning (e.g., strategizing a full match). More broadly, such meta-decisions also encompass whether to [...] or abandon a goal, whether to act or refrain from acting, and how to prioritize competing objectives, extending beyond computational resource allocation to behavioral and normative dimensions. We refer to the capacity to control these internal modes of operation as self-regulation. We model this through a configurator $\kappa$, which outputs a regulation variable $u_t$ governing the agent’s decision mode at each step (e.g., whether to act directly, continue executing an existing plan $c_{t-1}$, invoke additional planning, or revise goals): $u_t \sim p_\kappa(\cdot \mid s_t, g_t, i_t, c_{t-1})$.

Self-regulation is thus itself part of the agent’s policy: the allocation of internal effort adapts with experience rather than following fixed rules or designer-specified workflows. Furthermore, the configurator may extend beyond inference-time deliberation to govern the agent’s own learning process (e.g., deciding when to act in the environment, when to retreat into simulation for practice, when to update its world model, and when to revise its self-model). We return to this point below. The degree to which deliberation control is endogenous to the agent is another axis along which agentic systems are distinguished from agentive ones. Agentic systems follow externally pr[...] workflows; agentive systems organize their own computation in response to changing circumstances.

2.6 Learning

The preceding subsections describe how an agent ac[...] question is how those capabilities improve over time. In most existing systems, learning terminates before deployment, and behavioral change thereafter requires external intervention such as retraining or prompt redesign. A growing body of work addresses this limitation under labels such as “never-ending learning” [53], “recursive self-improvement” [63] or “autoresearch” [42], which use AI systems to automate aspects of the traditional training pipeline (e.g., generating synthetic tasks and curricula, performing automated evaluation). However, in virtually all such “AI training AI” systems, the learning process itself remains external to the agent, with training decisions (e.g., when to learn, what data to use, how long to train, and when to stop) ultimately made by the human engineer, not by the agent whose capabilities are being updated. A more complete notion of agency, on the other hand, treats learning as continuous and endogenous, taking two complementary forms [...] and learning from simulated experience, where the agent generates hypothetical trajectories through its world model $f$ and trains on them without real-world interaction. Formally, we define $\lambda$ as the learning process that outputs the next parameters $\theta_{t+1}$ given current parameters $\theta_t$ and real and simulated experiences $D_\mu$ and $D_f$ as below:

$$\theta_{t+1} \sim p_\lambda(\cdot \mid \theta_t, D_\mu, D_f).$$

Simulative learning is particularly valuable when real-world trial-and-error is dangerous, expensive, or slow. Note that the two models implicated here learn from different signals: the world model $f$ improves by reducing prediction error against observed transitions, while the agent’s decision-making components $\theta$ improve through goal-directed feedback, as detailed in §4.5. Another key difference from current “AI-builds-AI” approaches is that in the self-directed agent, learning is governed by the configurator $\kappa$ as part of the agent’s own policy, rather than being imposed on the agent as an external schedule. In addition to model parameters $\theta$, the self-model $i$ may also be updated in the manner di[...] without needing full retraining. The degree to which learning is internally initiated and regulated [...] those that automate training with AI, are still agentic as the training loop remains external and the agent remains frozen unless retrained. Agentive systems, by contrast, improve autonomously and perpetually through experience, augmenting external interaction with internal world-model simulations, and governing their own learning as an integral part of their ongoing decision-making.

2.7 Coordination and Communication

In a social environment, an agent must often decide whether to communicate, whom to engage, what information to share, and how to interpret the behavior of others in light of their likely identities, capabilities, and goals. Communication and coordination thus emerge as autonomous decisions, arising from the agent’s native communicative abilities, an environment composed of other agents, and tasks that require multi-agent interaction. Natural agents exhibit a further capacity for self-organization: individuals form, revise, and dissolve patterns of coordination, without requiring those structures to be specified in advance. In practice, many existing systems construct “multi-agent teams” [83] or “agent swarms” [e.g., 59], but these often externally specify the nature and pattern of interaction (e.g., team membership, communication protocols, role assignments, and coordination [...] consisting of a federation of tasks rather than a genuine multi-agent society. As with the other dimensions, how multi-agent interaction is handled delineates the boundary between agentic and agentive systems: agentic systems require orchestrating interaction patterns externally; agentive systems allow collective organization to emerge as an internal decision of participating agents.

The properties introduced above together characterize what genuine agency should minimally possess. The distinction between agentic and [...] structures (e.g., goals, identity) exist, but in how these behaviors originate: through externally engineered pipelines that prescribe behavior, or an internal configurator capable of adapting, revising, and organizing their own decision-making processes (e.g., planning, self-regulation, learning, and interaction). This perspective motivates the remainder of the paper, where we first examine whether and where current agentic systems fall short of this vision (§3-4), and then present the Goal-Identity-Configurator (GIC) agent model architecture where these structures arise as components of a single adaptive system, paired with a separately learned world model (§5).

3 Landscape of Systems Labeled as “Agents”

The term “agent” is currently applied to a remarkably broad range of systems, from simple automation scripts to embodied learning systems. This breadth, however, obscures an important distinction highlighted in the previous section: systems may appear goal-directed while differing fundamentally in where the organization of behavior resides. Rather than organizing the landscape by application domain, we examine it through the mechanisms that produce behavior. This perspective reveals a continuum from systems whose competence is almost entirely prescribed by software structure, to systems that increasingly internalize planning, actin[...]
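
The separation drawn in §2.1 to §2.5 between the universe factor, the world model, the planner, the reactive actor, and the configurator can be illustrated with a small toy. The sketch below is only an illustration under strong assumptions and is not the GIC implementation: the 1-D integer state, the noisy `world_step`, the hand-coded `world_model` (standing in for a learned one), the `Identity.max_step` field, and the threshold rule inside `configurator` are all hypothetical choices made here for brevity.

```python
import random
from dataclasses import dataclass
from itertools import product


@dataclass
class Identity:
    """Toy self-model i: the largest step the agent believes it can take."""
    max_step: int = 1


def world_step(state, action, rng):
    """Universe factor p_mu(s' | s, a): a 1-D walk with random drift (toy assumption)."""
    return state + action + rng.choice([-1, 0, 0, 1])


def world_model(state, action):
    """World model f: a noise-free, hand-coded stand-in for a learned predictor."""
    return state + action


def reward(state, goal):
    """r(s, g): negative distance between the state and the goal."""
    return -abs(goal - state)


def reactive_actor(state, goal, identity):
    """System I: move toward the goal, clipped to the agent's max_step."""
    return max(-identity.max_step, min(identity.max_step, goal - state))


def simulative_planner(state, goal, identity, horizon=3):
    """System II: score every action sequence inside the world model."""
    actions = range(-identity.max_step, identity.max_step + 1)

    def score(sequence):
        simulated, total = state, 0.0
        for action in sequence:
            simulated = world_model(simulated, action)
            total += reward(simulated, goal)
        return total

    return max(product(actions, repeat=horizon), key=score)


def configurator(state, goal, plan):
    """Configurator kappa: pick the decision mode u_t for this step."""
    if abs(goal - state) <= 1:
        return "act"  # close to the goal: react directly
    return "continue" if plan else "plan"


def rollout(start, goal, identity, steps, seed=0):
    """Sample one trajectory by alternating the agent and universe factors."""
    rng = random.Random(seed)
    state, plan, trajectory = start, [], [start]
    for _ in range(steps):
        mode = configurator(state, goal, plan)
        if mode == "plan":
            plan = list(simulative_planner(state, goal, identity))
        if mode == "act":
            plan = []  # discard any stale plan when reacting
            action = reactive_actor(state, goal, identity)
        else:
            action = plan.pop(0)
        state = world_step(state, action, rng)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    print(rollout(start=0, goal=5, identity=Identity(max_step=1), steps=10))
```

The point of the layout is that `world_model` is only ever queried by the planner and never selects actions itself, mirroring the agent-factor versus universe-factor split; in an agentive system the fixed rule in `configurator` would itself be learned rather than hand-written.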

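As a hedged sketch, the expected discounted cumulative reward described in §2.2 can be written in one form consistent with the surrounding definitions. The symbol $V^{\pi}$, the summation index $k$, and the choice to condition only on the starting state $s_t$ are assumptions made here, with the expectation taken over trajectories drawn from Equation 1:

```latex
V^{\pi}(s_t) \;=\; \mathbb{E}\!\left[\, \sum_{k=t}^{\infty} \gamma_k \, r(s_k, g_k) \;\middle|\; s_t \right],
\qquad \lim_{k \to \infty} \gamma_k = 0 .
```

A familiar special case is geometric discounting, $\gamma_k = \gamma^{\,k-t}$ with $0 \le \gamma < 1$, which satisfies the limit condition.
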