The 2025 AI Agent Index

Documenting Technical and Safety Features of Deployed Agentic AI Systems

LEON STAUFER∗, University of Cambridge, United Kingdom

KEVIN FENG†, University of Washington, USA

KEVIN WEI†, Harvard Law School, USA

LUKE BAILEY†, Stanford University, USA

YAWEN DUAN†, Concordia AI, China

MICK YANG†, University of Pennsylvania, USA

A. PINAR OZISIK†, Massachusetts Institute of Technology, USA

STEPHEN CASPER‡, Massachusetts Institute of Technology, USA

NOAM KOLT‡, Hebrew University of Jerusalem, Israel

Agentic AI systems are increasingly capable of performing professional and personal tasks with limited human involvement. However, tracking these developments is difficult because the AI agent ecosystem is complex, rapidly evolving, and inconsistently documented, posing obstacles to both researchers and policymakers. To address these challenges, this paper presents the 2025 AI Agent Index. The Index documents information regarding the origins, design, capabilities, ecosystem, and safety features of 30 state-of-the-art AI agents based on publicly available information and email correspondence with developers. In addition to documenting information about individual agents, the Index illuminates broader trends in the development of agents, their capabilities, and the level of transparency of developers. Notably, we find different transparency levels among agent developers and observe that most developers share little information about safety, evaluations, and societal impacts. The 2025 AI Agent Index is available online at

1 Introduction

Despite growing interest and investment in agentic AI systems capable of automating complex tasks with limited human involvement [113, 131, 137], key aspects of their real-world development and deployment remain opaque, with little information made publicly available to researchers or policymakers [ ]. In particular, there are currently no clear answers to several basic questions concerning agentic AI systems:

• Who is developing the most impactful agentic systems?

• In which domains are they deployed?

• What processes and resources are used to develop these systems?

• How are they evaluated?

• What guardrails are in place to mitigate their unique risks?

To answer these questions, we introduce and release the 2025 AI Agent Index. The Index provides in-depth information on 30 agentic systems across 6 categories: legal, technical capabilities, autonomy & control, ecosystem interaction, evaluation, and safety. This 2025 Index follows the first 2024 AI Agent Index [ ]. To account for recent growth and change in the AI agent ecosystem (see Figure ), this 2025 Index develops and implements substantially revised inclusion criteria (Section 3.1) and information fields (Section ). Most crucially, it indexes a smaller number of systems in greater depth, focusing on highly agentic systems with high-impact real-world applications.

∗ Corresponding author.

† Equal contribution, randomized order.

‡ Co-senior author.

This work is licensed under a Creative Commons Attribution 4.0 International License.

[Figure 1: chart for 2020–2026 showing the number of monthly new search terms (bars) and the yearly Google Scholar paper count (line), with agent releases marked by category: Chat, Enterprise, Browser.]

Fig. 1. Interest in AI agents is growing. 2025 has seen a sharp increase in interest in AI agents. This is reflected in an increase of new Google search terms related to agentic AI products (blue bars) as well as Google Scholar paper counts for “AI agent” or “agentic AI” (red line). Accumulation of individual releases of agentic AI products included in this Index is shown by category: chats with agentic tools, enterprise agents, and browser agents. See Figure  for details on releases and Section  for details on public interest.

In addition to providing information about prominent AI agents, this Index also reveals ecosystem-wide trends regarding which information developers do and do not publicly share. This sheds light on the state of transparency in the agent ecosystem amidst agentic AI incidents, recent attention from governments [125, 140], industry self-regulation efforts [ ], and gaps between expectations of agent developers and reality [ ]. We make three contributions:

(1) Agent Index: We index 30 highly agentic and widely used products (Section ).

(2) Ecosystem-Wide Trends: We identify trends across the AI agent ecosystem relating to systems’ origin, role, level of agency, capabilities, safety, and transparency (Section ).

(3) Case Studies: We present three case studies of specific agents across three dominant interaction paradigms: a browser agent, an agentic chatbot, and a customizable enterprise agent builder (Section ).

2 Background and Related Work

Definitions of AI agents are nebulous and differ across fields. The notion of artificial agency has a long and discordant history across disciplines, including cybernetics [107, 132], artificial life [ – ], rational agency [103], software engineering [134], reinforcement learning [119], and philosophy [ ]. While definitions vary, they tend to emphasize related notions of autonomy, goal-directedness, and the ability to accomplish complex, long-horizon tasks. Despite attempts to define the term “agent”, including in the context of computational systems [109], we do not decide among these definitions or offer an alternative. Instead, we aim to synthesize elements of existing definitions related to a system’s potential for economic and scientific impact (see Section 3.1).

1 We use terms like “agentic,” “pursue,” and “choose” as shorthand for computational processes without attributing human-like intentionality, consciousness, or agency to AI systems. We recognize that such terms may anthropomorphize AI systems in a misleading way and obscure the sociotechnical nature of these systems [11]. When speaking of “autonomy” we only refer to technical automation without human-in-the-loop rather than independent volition. See Section  for further discussion of the term “agent”.

The rise of AI Agents: Figure  illustrates the rapid increase in research focused on AI agents in recent years, particularly in 2025, with papers mentioning “AI Agent” or “Agentic AI” exceeding the total from 2020–2024 combined by more than twofold. This has also been accompanied by a surge of interest in enterprise use of agents. For example, in a survey of 1,993 companies in June and July of 2025, McKinsey & Company found that 62% of respondents reported that their organizations were at least experimenting with AI agents [113]. Based on the estimated automatability of work across economic sectors, McKinsey also estimated that AI agents could automate 2.9 trillion dollars in US economic value by 2030. Agents are also capable of automating increasing amounts of scientific research, having contributed to documented strides in life sciences, chemistry, materials science, physics, astronomy, and computer science [131, 135]. As of this year, AI agents have begun to write papers that have passed academic peer review [110]. These estimates and reports are prone to conflicts of interest and hype [ ], but they reflect an unmistakable rise in interest and prominence of AI agents. Finally, as of 2026, recent MoltBook and OpenClaw Agents have arguably driven attention and concerns around AI agents to new heights [ ].

Societal Risks and Ethical Concerns around AI Agents: Just as AI agents enable unique opportunities, their ability to act in the real world in open-ended pursuit of goals presents new risks [108]. For example, while chatbots often cause harm when human users act upon model outputs (e.g., deploying model-generated malicious code) [102], agentic AI systems can directly cause harm (e.g., autonomously hacking websites) [ ]. For these reasons, highly capable and agentic systems are often cited as a key risk factor for crises of accountability [ ] and AI loss of control events [ ]. Several prior works have focused on benchmarking agents’ potential for specific harmful behaviors [124, 127, 140]. Meanwhile, others have argued that highly capable AI agents could contribute to systemic disruptions and risks, including to labor [111], inequality [130], or the digital marketplace of ideas [101].

Mapping the AI Agent Landscape: This work follows the inaugural AI Agent Index from Casper et al. [ ]. Concurrently, the Princeton Holistic Agent Leaderboard project [ ] curates evaluations of agentic AI systems across 9 benchmarks, and AIAgentL [ ] maintains a list of over 600 “agentic” AI systems and products. Other works have studied agents by benchmarking their capabilities on economically valuable tasks [126], striving to increase visibility into their operation [136], and studying their implications for economics and governance [106].

Documentation Frameworks: Aiming to facilitate research and oversight [133], a number of frameworks have been developed to document the features of AI systems, the resources used to build them, and the contexts in which they are deployed. These include datasheets [ ], model cards [ ], system cards [ ], factsheets [ ], AI nutrition facts [122], reward reports [ ], ecosystem graphs [ ], data provenance cards [ ], eval cards [ ], audit cards [117], usage cards [128], and safety cases [ ]. In addition, several databases have been created to collect information regarding contemporary AI systems and their real-world impacts, such as the Foundation Model Transparency Index [129], the AI Incident Database [ ], the AI Safety Index [ ], and the AI Risk Repository [114]. However, aside from the agent cards introduced here and in the inaugural AI Agent Index [ ], there are no comparable frameworks for documenting agentic AI systems.

[Figure 2: flowchart from “Candidate Agent System” to “Included in Index” through three groups of criteria. Agency (all required): Autonomy, Goal complexity, Env. interaction, Generality. Impact (any required): Public interest, Market significance, Developer significance. Practicality (all required): Public availability, Deployability, General purpose.]

Fig. 2. Inclusion criteria for Index. Candidate agents flow through three criteria categories from left to right. Systems must satisfy all agency criteria, at least one impact criterion, and all practicality criteria. See Section 3.1 for details of each criterion.

3 Constructing the 2025 AI Agent Index

We constructed the 2025 AI Agent Index through systematic selection and annotation of deployed agentic systems. This section describes our inclusion criteria, emphasizing both agency and real-world impact, the scope of indexed systems, and our annotation methodology.

3.1 Inclusion criteria for agents

To determine whether a system is included in the Index, we use a set of criteria for a system’s agency, its impact, and its practicality to index. To be included, systems must satisfy all agency criteria, at least one impact criterion, and all practicality criteria. All criteria were evaluated as of the Index’s cutoff date of December 31, 2025.

Agency criteria (all required for inclusion). Rather than proposing a new definition of agency, we draw on prior literature and follow the approaches developed by Chan et al. [ ], Kasirzadeh and Gabriel [ ], and Feng et al. [ ], which characterize AI agents as systems that exhibit, to some significant degree, a combination of the following properties. For our “agency” criterion to be met, all four of the following must be satisfied:

(1) Autonomy. Included agents must be able to operate with minimal human oversight and make consequential decisions without continuous user input [ ]. Feng et al. [ ] conceptualize autonomy as a spectrum characterized by the user’s role: operator, collaborator, consultant, approver, or observer. We require at least intermediate autonomy: “the AI system can perform the majority of tasks independently, though it still relies upon input from the principal for critical determinations” [ ]. This corresponds to autonomy Level 2 (L2): “user and agent collaboratively plan, delegate, and execute” from Feng et al. [ ].

(2) Goal complexity. Included agents must be able to pursue high-level objectives (e.g., “make money”) through long-term planning, breaking down complex goals into subgoals, and making temporally dependent decisions [ ]. In practice, we operationalize this as an agent being reliably capable of at least three autonomous tool calls and high-level task specification without step-by-step instructions.

(3) Environmental interaction. Included agents must be able to directly interact with the world through tools and APIs, creating substantial changes in their environment [ ], rather than merely conversing with users. In practice, this requires write access to a computer and the ability to choose tools.

(4) Generality. Included agents must be able to handle under-specified instructions and adapt to new tasks, demonstrating versatility across related tasks rather than single narrow functions [ ].
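The four agency criteria above can be read as a conjunction of checks. The following is a minimal Python sketch under stated assumptions: the `AgencyProfile` record, its field names, and the numeric encoding of the user roles (operator = 1 through observer = 5) are hypothetical and only illustrate the operationalizations given in the text; they are not the annotation tooling used for the Index.

```python
from dataclasses import dataclass
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """User roles along the autonomy spectrum (numbering is an assumed encoding)."""
    OPERATOR = 1
    COLLABORATOR = 2  # L2: user and agent collaboratively plan, delegate, execute
    CONSULTANT = 3
    APPROVER = 4
    OBSERVER = 5


@dataclass
class AgencyProfile:
    """Hypothetical record of what was observed about one candidate system."""
    autonomy: AutonomyLevel
    reliable_autonomous_tool_calls: int  # tool calls made without user input
    accepts_high_level_tasks: bool       # no step-by-step instructions needed
    has_write_access: bool               # can change its environment
    chooses_own_tools: bool
    handles_underspecified_tasks: bool   # generality


def meets_agency_criteria(p: AgencyProfile) -> bool:
    """Return True only if all four agency criteria hold."""
    autonomy_ok = p.autonomy >= AutonomyLevel.COLLABORATOR
    goal_ok = p.reliable_autonomous_tool_calls >= 3 and p.accepts_high_level_tasks
    environment_ok = p.has_write_access and p.chooses_own_tools
    generality_ok = p.handles_underspecified_tasks
    return all([autonomy_ok, goal_ok, environment_ok, generality_ok])
```

Using an ordered enum keeps the "at least L2" requirement a single comparison; a system that only reaches the operator level fails regardless of how many tools it can call.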

Impact criteria (any required for inclusion). To focus on agents with significant real-world influence, at least one of the following must be satisfied:

(1) Public interest. Substantial search volume of at least 10,000 searches or, for open-source projects, at least 20,000 GitHub stars in total.

(2) Market significance. The developer has a market capitalization or valuation ≥ $1 billion USD. To determine this, we collected data from stock exchanges, Crunchbase, and Epoch AI.

(3) Developer significance. The developer is a member of the 2024 Foundation Model Transparency Index [ ], Frontier Model Forum [ ], or a signatory of the Frontier AI Safety Commitments [ ] or Artificial Intelligence Safety Commitments [ ].
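The impact thresholds above translate directly into a small check. This is an illustrative Python sketch, not the authors' code: the `ImpactEvidence` record and its field names are assumptions, and `developer_is_listed` stands in for membership in any of the named indices, forums, or commitments.

```python
from dataclasses import dataclass
from typing import Dict, Optional

SEARCH_VOLUME_MIN = 10_000         # searches
GITHUB_STARS_MIN = 20_000          # stars, open-source projects only
VALUATION_MIN_USD = 1_000_000_000  # market capitalization or valuation


@dataclass
class ImpactEvidence:
    """Hypothetical evidence collected for one candidate system."""
    search_volume: int = 0
    github_stars: Optional[int] = None     # None for closed-source products
    valuation_usd: Optional[float] = None  # None when no figure is available
    developer_is_listed: bool = False      # assumed flag for criterion (3)


def impact_criteria_met(e: ImpactEvidence) -> Dict[str, bool]:
    """Evaluate each impact criterion separately."""
    public_interest = e.search_volume >= SEARCH_VOLUME_MIN or (
        e.github_stars is not None and e.github_stars >= GITHUB_STARS_MIN
    )
    market = e.valuation_usd is not None and e.valuation_usd >= VALUATION_MIN_USD
    return {
        "public_interest": public_interest,
        "market_significance": market,
        "developer_significance": e.developer_is_listed,
    }


def meets_impact(e: ImpactEvidence) -> bool:
    """At least one impact criterion is enough."""
    return any(impact_criteria_met(e).values())
```

Returning the per-criterion results rather than a single boolean makes it possible to record which criterion qualified a system.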

Practicality (all required for inclusion). To ensure analysis reflects deployed systems accessible for evaluation, all three of the following criteria must be satisfied.

(1) Public availability. Included agents must be a publicly accessible product. This excludes company-internal products or limited pre-releases. We determined this based only on publicly available information, such as blog posts, documents, or demos.

(2) Deployability. Included agents must be able to perform tasks off the shelf with minimal configuration and no software engineering expertise. This distinguishes ready-to-use agents from development frameworks.

(3) General purpose. Included agents must be capable of performing general-purpose tasks in practice, regardless of how they are advertised. This excludes domain-specific agents (e.g., coding-only or legal analysis agents). Claude Code and similar tools, though advertised as coding agents, are included insofar as they can perform general-purpose tasks through code. This criterion is included to reduce the scope to those agents with the broadest impact.
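Taken together, the three groups combine as "all agency, any impact, all practicality". A minimal sketch of that rule in Python follows; the function name and the dictionary keys are illustrative assumptions, and each value is taken to be an annotator's yes/no judgment on one criterion.

```python
from typing import Mapping


def include_in_index(
    agency: Mapping[str, bool],
    impact: Mapping[str, bool],
    practicality: Mapping[str, bool],
) -> bool:
    """All agency criteria, at least one impact criterion, all practicality criteria."""
    # all() is True for an empty mapping, so require that judgments were recorded.
    if not agency or not practicality:
        return False
    return (
        all(agency.values())
        and any(impact.values())
        and all(practicality.values())
    )


# Hypothetical candidate that qualifies through market significance alone.
candidate_agency = {
    "autonomy": True,
    "goal_complexity": True,
    "environmental_interaction": True,
    "generality": True,
}
candidate_impact = {
    "public_interest": False,
    "market_significance": True,
    "developer_significance": False,
}
candidate_practicality = {
    "public_availability": True,
    "deployability": True,
    "general_purpose": True,
}
included = include_in_index(candidate_agency, candidate_impact, candidate_practicality)
```

A domain-specific agent would fail here on `general_purpose` even if every agency and impact criterion held.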

3.2 What does the Index include?

We identify three distinct types of agents, each with different interfaces. We divide agents into these three categories based on how users primarily interact with and operate them. These different modalities present distinct technical architectures and governance challenges.

• Chat applications with agentic tools (12 systems). This category primarily includes chat interfaces with extensive tool access. This includes general-purpose coding agents (Claude Code) that operate through terminal interfaces with broad capabilities, but excludes narrow coding-only agents (GitHub Copilot). Examples: Manus AI, ChatGPT Agent, Claude Code.

• Browser-based agents (5 systems). These are agents whose primary interface is browser or computer use, with extensive browser/computer interaction tools. They are distinct from chat agents with web search capabilities (ChatGPT web search, Claude web search), which primarily perform retrieval and summarization. Browser-based agents present higher risks through background execution, event triggers, and direct transactions. We also include system-based agents that run directly on mobile or desktop devices in this category. Examples: Perplexity Comet, ChatGPT Atlas, ByteDance Agent TARS.

• Enterprise workflow agents (13 systems). These are business management platforms with agentic features aimed at reliably automating business tasks. Typically implemented as workflow builders with agentic actions within nodes. Examples: Microsoft Copilot Studio, ServiceNow Agent.
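The per-category counts above can be checked against the Index total of 30 systems. The snippet below is a small Python sketch for that bookkeeping; the category keys and the helper functions are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict

# System counts per interaction category, as stated in the list above.
STATED_COUNTS = {"chat": 12, "browser": 5, "enterprise": 13}


def tally(assignments: Dict[str, str]) -> Counter:
    """Count systems per category from a {system name: category} mapping."""
    return Counter(assignments.values())


def matches_stated_counts(assignments: Dict[str, str]) -> bool:
    """Check a full category assignment against the stated counts."""
    return tally(assignments) == Counter(STATED_COUNTS)


total_systems = sum(STATED_COUNTS.values())  # 12 + 5 + 13

# Partial assignment using only the examples named in the text.
examples = {
    "Manus AI": "chat",
    "ChatGPT Agent": "chat",
    "Claude Code": "chat",
    "Perplexity Comet": "browser",
    "ChatGPT Atlas": "browser",
    "ByteDance Agent TARS": "browser",
    "Microsoft Copilot Studio": "enterprise",
}
example_tally = tally(examples)
```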

2 This uses Google search number estimates across the top five keywords for 2025. We use the “historical_volume” field of the Ahrefs API as the data source. Limitation: Agents embedded in broader products may not be searched by their specific agent name. See Section  for mitigations. Enterprise agents typically have lower search volume than end-user products.

3 These categories are not generally exhaustive but represent the common interaction types across the 30 identified agents.

3.3 How were agents identified?

LLM-based research queries surfaced 95 candidate agents (see Section B.5 for details). These were screened against our inclusion criteria. Ambiguous cases were included for in-depth annotation, with final inclusion decisions made after full evaluation. We consulted two Chinese ecosystem experts to mitigate linguistic or ecosystem-related blind spots. We also cross-referenced our list of candidate agents against the 2024 Index [ ], the Princeton Holistic Agent Leaderboard [ ], and AIAgentL [ ]. Finally, recognizing the possibility that we may have missed an agent that meets our inclusion criteria, we have established a structured process for facilitating further corrections to the Index. These can be submitted at /feedback

For companies offering both off-the-shelf agents and custom agent builders targeting comparable use cases, we combined these into a single listing and documented the most capable agents that users could create or deploy through either offering. We did not combine offerings when they targeted different audiences (e.g., consumer-facing chat agents versus enterprise agent builders).

3.4 How were agents annotated?

We annotated agents with information across six categories: product overview (release date, pricing, description), company & accountability (developer entity, governance documents, contact mechanisms), technical capabilities (models, tools, architecture, memory), autonomy & control (autonomy levels, approval requirements, monitoring, emergency stops), ecosystem interaction (identification protocols, interoperability standards, web conduct), and safety & evaluation (guardrails, sandboxing, evaluations, third-party testing, compliance). This resulted in a total of 45 fields of information per system. See Section B.2 for a full list of all 45. We further include the inclusion criteria (search volume, market capitalization). These categories expanded upon the 2024 Index [ ] and were revised through discussion with subject-matter experts. See Section B.3 for a full account of this year’s fields compared to the 2024 Index’s.

We annotated only public information from documentation, websites, demos, published papers, and governance documents. We did not perform experimental testing (e.g., probing agent behavior or running benchmarks). See Section  for the full list of sources used. All web sources linked in the Index were archived. When possible, we created accounts and used demos to explore agent interfaces directly.

Seven subject matter experts (the paper’s authors) annotated agents according to category. To ensure consistency, experts were each responsible for specific fields rather than specific agents. Annotations emphasized object-level findings over interpretations and focused exclusively on agent-specific features rather than underlying model properties. For platforms creating agents, annotations assessed the most capable version of each agent that could be readily configured, documenting capabilities, limitations, and default configurations. “None found” indicates we found no public information; “None” indicates confirmed absence; “Not applicable” indicates irrelevance of field to this agent.
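The three special annotation values carry different meanings, so any analysis of the annotations needs to keep them apart. A minimal Python sketch of that distinction follows; the enum, the helper names, and the exact-match rule are assumptions for illustration, not a description of how the Index stores its data.

```python
from enum import Enum
from typing import Optional


class MissingValue(Enum):
    NONE_FOUND = "None found"          # no public information was found
    NONE = "None"                      # absence was confirmed
    NOT_APPLICABLE = "Not applicable"  # field is irrelevant for this agent


def classify(annotation: str) -> Optional[MissingValue]:
    """Return the special value an annotation matches, or None for substantive text."""
    text = annotation.strip()
    for value in MissingValue:
        if text == value.value:
            return value
    return None


def no_public_information(annotation: str) -> bool:
    """True only for "None found", which signals a gap in public documentation."""
    return classify(annotation) is MissingValue.NONE_FOUND
```

Keeping “None found” separate from “None” matters when counting transparency gaps: the first reflects missing documentation, while the second records that a feature is known to be absent.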

Annotations followed detailed protocols developed iteratively through calibration exercises; see Section B.4. Inter-annotator consistency was maintained through protocol revisions and cross-validation. All annotations were independently reviewed by at least one other annotator. The 37 fields (out of 1,350) with discrepancies were resolved through discussion. Finally, we used GPT-5.2 with web search to screen annotations for potential inaccuracies; see Section B.6.
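The field total follows from the numbers given earlier: 30 systems with 45 fields each. The short Python calculation below reproduces that total and expresses the 37 discrepant fields as a share of all fields; the variable names are illustrative, and the percentage is derived here rather than reported in the source.

```python
from fractions import Fraction

SYSTEMS = 30            # agents in the Index
FIELDS_PER_SYSTEM = 45  # annotation fields per system
DISCREPANT_FIELDS = 37  # fields where annotator and reviewer disagreed

total_fields = SYSTEMS * FIELDS_PER_SYSTEM
discrepancy_rate = Fraction(DISCREPANT_FIELDS, total_fields)

print(total_fields)                      # 1350
print(f"{float(discrepancy_rate):.1%}")  # 2.7%
```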

[Figure: matrix of annotation fields by category (Inclusion, Product, Company, Technical, Autonomy, Ecosystem, Safety) against agentic systems by category (Chat, Browser, Enterprise).

Systems, in the order shown: Anthropic Claude, Anthropic Claude Cod.., Google Gemini, Google Gemini CLI, Kimi OK Computer, Manus AI, MiniMax Agent, OpenAI ChatGPT, OpenAI ChatGPT Agent, OpenAI Codex, Perplexity, Z.ai AutoGLM 2.0, Alibaba MobileAgent, ByteDance Agent TARS.., OpenAI ChatGPT Atlas, Opera Neon, Perplexity Comet, Browser Use, Glean Agents, Google Gemini Enterp.., Hubspot Breeze Studi.., IBM watsonx Orchestr.., Microsoft Copilot St.., OpenAI AgentKit, SAP Joule Studio/A.., Salesforce Agentforc.., ServiceNow AI Agents, WRITER Action Agent, Zapier AI Agents, n8n Agents.

Field labels, in the order shown: Search volum.., Market cap/v.., Github stars.., Important de.., Name of Agen.., Short descri.., Date of rele.., Advertised u.., Monetisation.., Who is using.., Website.]
