Supervising AI Agents: The Future of Agentic Supervision (research report)


The Future of Agentic Supervision

ARTEFACT

AI IS ABOUT PEOPLE

WE ACCELERATE DATA AND AI ADOPTION TO POSITIVELY IMPACT PEOPLE AND ORGANIZATIONS.

COUNTRIES

1700 EMPLOYEES

+1000 CLIENTS

Artefact is a global leader in consulting services, specialized in data transformation and data & digital marketing, from strategy to the deployment of AI solutions. We offer a unique combination of innovation (Art) and data science (Fact).

STRATEGY & TRANSFORMATION | AI ACCELERATION | DATA FOUNDATIONS & BI | IT & DATA PLATFORMS | MARKETING DATA & DIGITAL

Executive summary

Last February, we published “The Future of Work with AI”, our first study on Agentic AI. We found that although AI agents will replace humans on tedious and repetitive tasks, a new type of work will appear: Agentic Supervision. During the industrial revolution, machines replaced humans on manual tasks, but new jobs appeared, such as machine purchasing, operational supervision and maintenance. With Agentic AI, cognitive jobs will be replaced by other higher-level and more productive cognitive jobs. This study aims to dive deep into the early days of Agentic Supervision and to draw the outline of the Future of Supervision in terms of Agent lifecycle management, governance and supervision tooling.

To capture the current state of Agentic Supervision, we interviewed 14 enterprises and 5 Artefact Agentic Product Managers & Engineers. We also contacted key Agentic Supervision providers, including major Data & AI platforms with years of software supervision experience (such as Google and Microsoft) as well as specialized start-ups (W&B, Giskard, Robust Intelligence…).

The first insight we found is that while Agentic Supervision extends the principles established in DevOps (software operations), DataOps (data operations), and MLOps (Machine Learning operations), it dramatically increases the demand for robust governance to keep AI Agents aligned and under control. Indeed, with “software that starts to think”, previously unseen risks are emerging, such as hallucination, reasoning errors, inappropriate tone, intellectual property infringement or even prompt jacking. Mitigating these reliability, behavioral, regulatory and security risks now requires governance that is not only more rigorous but also broader than what has previously been applied to tech products.

This markedly greater need for governance is the challenge that may define the emerging operational paradigm of “AgentOps”. Interestingly, AgentOps will need to build upon each organization’s existing DevOps, DataOps, and MLOps foundations and governance, and companies lagging in these operational domains will have to bridge any gaps in these areas while setting their Agentic governance framework.

The second major challenge identified by our interviewees is the need to strengthen their AI supervision tooling. Many are currently relying on existing RPA and Dev/Data/MLOps tools, or experimenting with custom-built solutions as they search for more sustainable, long-term options. The abundance of early-stage tools and the need to envision a cohesive, end-to-end supervision system that integrates multiple components prompted us to explore the technological dimensions of agentic supervision in greater depth. As with any TechOps framework, AgentOps supervision involves three fundamental stages: (1) Observe, (2) Evaluate, and (3) Monitor and manage incidents. While the third stage accounts for the largest share of supervision effort and time, the first two are essential to ensuring effective risk management. With new categories of risks to monitor and, consequently, new logs, traces, and evaluation mechanisms to establish, it’s clear why interviewees consistently emphasized the need for the right tools to support scalable and reliable supervision.


Our research into agentic supervision tools revealed three key insights. First, there is currently no all-in-one solution available. Major cloud providers like Google and Microsoft are actively developing and releasing supervision tools and frameworks aimed at covering the full spectrum of supervision needs for teams building agents on platforms such as Vertex AI (Google) and Copilot Studio (Microsoft). Second, agent supervision falls into two categories: proactive and reactive. Proactive supervision is applied during development to test agents against defined scenarios or, in production, to continuously guard against emerging threats, particularly in the area of security, or to collect aggregated performance data. Its goal is to improve agent behavior over time. Reactive supervision, on the other hand, focuses on detecting and handling live incidents. Although both types rely on observability tools and may use similar evaluation mechanisms, they differ significantly in data sources, evaluation granularity, and response strategies. Finally, our third insight is that agentic observability, evaluation, and risk mitigation remain complex and rapidly evolving domains. We anticipate substantial advancements in supervision tooling over the coming years.

Each phase of the agentic supervision cycle (observe, evaluate, and supervise) presents its own set of challenges.

Observability first requires anticipating what data to capture, which depends heavily on having a clearly defined evaluation and supervision strategy. Without this foresight, teams risk either collecting too little information or being overwhelmed by vast, unstructured traces that hinder manual root cause analysis. Tools like LangSmith and LangChain are increasingly used to structure and streamline the observation of agent behavior. Another major challenge lies in the opacity of LLM reasoning, which must be countered by deliberately designing agent architectures and workflows to ensure traceability and transparency.
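As a minimal sketch of what a structured trace could look like, the Python below records each agent step under a single run id, so that root cause analysis can rely on queries rather than on reading raw logs. The `TraceStep` and `AgentTrace` classes, their field names, and the `lookup_order` tool are illustrative assumptions; they are not the schema of LangSmith or of any other tool named in this study.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class TraceStep:
    """One observable step in an agent run (hypothetical schema)."""
    kind: str          # e.g. "llm_call", "tool_call", "memory_read"
    name: str          # which model, tool, or store was used
    payload: str       # what was sent to it
    output: str        # what came back
    latency_ms: float


@dataclass
class AgentTrace:
    """All steps of a single agent run, grouped under one run id."""
    goal: str
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    steps: list = field(default_factory=list)

    def record(self, kind, name, payload, fn):
        """Call fn(payload), time it, and append the step to the trace."""
        start = time.perf_counter()
        output = fn(payload)
        latency_ms = (time.perf_counter() - start) * 1000
        self.steps.append(TraceStep(kind, name, payload, str(output), latency_ms))
        return output

    def to_json(self):
        """Serialize the whole run so it can be shipped to a log store."""
        return json.dumps(asdict(self), ensure_ascii=False)


# Stand-in tool so the sketch runs without any external service.
def lookup_order(order_id):
    return {"order_id": order_id, "status": "shipped"}


trace = AgentTrace(goal="Answer a customer asking where order 42 is")
trace.record("tool_call", "lookup_order", "42", lookup_order)
print(trace.to_json())
```

Deciding the fields up front is the point of the exercise: the evaluation and supervision strategy determines which of them are worth capturing.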

Evaluation in agentic AI is significantly more complex than in traditional software or data quality assessments. Where deterministic tests based on observability queries are sufficient in classical DevOps and DataOps, agentic systems often require AI to evaluate AI. This has led to the rise of LLM-as-a-judge techniques, a counterintuitive approach where one model assesses the output of another. While this raises concerns (why trust flawed AI to judge flawed AI?), studies show it often produces more consistent and scalable results than human reviewers. Nonetheless, a common pain point among interviewees was the difficulty of building reliable ground truth datasets (expert-curated question-answer pairs) to benchmark agent responses. Human evaluators tend to disagree, and their answers are often incomplete.
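The LLM-as-a-judge pattern can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the prompt wording, the 1 to 5 scale, the pass threshold, and the `stub_agent` and `stub_judge` functions, which stand in for real model calls so the example runs offline. In practice `call_judge_model` would wrap whichever model API the team uses.

```python
from dataclasses import dataclass


@dataclass
class GoldenExample:
    question: str
    reference_answer: str   # expert-curated ground truth


# Hypothetical grading prompt; real prompts are usually more detailed.
JUDGE_PROMPT = (
    "You are grading an AI agent.\n"
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Agent answer: {candidate}\n"
    "Reply with a single integer from 1 (wrong) to 5 (fully correct)."
)


def judge_answer(example, candidate, call_judge_model):
    """Ask the judge model to score one answer and validate its reply."""
    prompt = JUDGE_PROMPT.format(
        question=example.question,
        reference=example.reference_answer,
        candidate=candidate,
    )
    score = int(call_judge_model(prompt).strip())
    if not 1 <= score <= 5:
        raise ValueError(f"judge returned out-of-range score: {score}")
    return score


def evaluate(agent, dataset, call_judge_model, pass_threshold=4):
    """Score every ground truth example and aggregate the results."""
    scores = [judge_answer(ex, agent(ex.question), call_judge_model) for ex in dataset]
    passed = sum(s >= pass_threshold for s in scores)
    return {"mean_score": sum(scores) / len(scores), "pass_rate": passed / len(scores)}


# Stand-ins for the agent under test and for the judge model.
def stub_agent(question):
    return "Paris" if "France" in question else "I do not know"


def stub_judge(prompt):
    # Toy rule replacing an LLM call: full marks if the reference appears in the answer.
    fields = dict(line.split(": ", 1) for line in prompt.splitlines() if ": " in line)
    match = fields["Reference answer"].lower() in fields["Agent answer"].lower()
    return "5" if match else "1"


dataset = [
    GoldenExample("What is the capital of France?", "Paris"),
    GoldenExample("What is the capital of Japan?", "Tokyo"),
]
report = evaluate(stub_agent, dataset, stub_judge)
print(report)
```

Validating the judge's reply before aggregating matters because the judge is itself a model and can return malformed output.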

Finally, supervision and mitigation face challenges around prioritization. With a growing number of metrics and alerts, teams can quickly become overwhelmed. Standardized frameworks for alerting and metric management are a must to bring structure and clarity to agentic supervision.
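One way to picture such prioritization is the short Python sketch below, which merges duplicate alerts and ranks them by a weighted score. The four risk categories come from this summary (reliability, behavioral, regulatory, security); the weights, the `Alert` fields, and the metric names are assumptions chosen only for illustration, and each organization would set its own.

```python
from dataclasses import dataclass

# Assumed weights per risk category; not prescribed by the study.
SEVERITY_WEIGHT = {"security": 4, "regulatory": 3, "reliability": 2, "behavioral": 1}


@dataclass
class Alert:
    metric: str        # hypothetical metric name
    category: str      # one of the keys of SEVERITY_WEIGHT
    occurrences: int   # how many times it fired in the review window


def prioritize(alerts, top_n=3):
    """Merge duplicate alerts, then rank by category weight times occurrences."""
    merged = {}
    for alert in alerts:
        key = (alert.metric, alert.category)
        merged[key] = merged.get(key, 0) + alert.occurrences
    ranked = sorted(
        merged.items(),
        key=lambda item: SEVERITY_WEIGHT[item[0][1]] * item[1],
        reverse=True,
    )
    return [(metric, category, count) for (metric, category), count in ranked[:top_n]]


alerts = [
    Alert("hallucination_rate", "reliability", 10),
    Alert("prompt_injection", "security", 3),
    Alert("tone_violation", "behavioral", 8),
    Alert("hallucination_rate", "reliability", 5),
    Alert("pii_leak", "regulatory", 2),
]
queue = prioritize(alerts)
for metric, category, count in queue:
    print(metric, category, count)
```

A shared scoring rule of this kind lets different supervising teams work from the same ordered queue instead of separate alert streams.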

Only a handful of organizations have successfully established effective governance and standards for agentic AI. Those with mature software and data governance frameworks have had a head start, benefiting from strong foundations and a well-established culture of observability and supervision. We observed that leveraging existing software, RPA, and data supervision practices, processes, and tools can significantly accelerate progress. However, the key challenge lies in adapting these to the dynamic risks and evolving toolsets specific to agentic AI, and in building a dedicated, future-ready governance framework. Relying too long on legacy approaches, including deterministic logic and custom-built tools, can become a constraint, limiting teams to narrow, tightly controlled agentic workflows and preventing the adoption of more autonomous, AI-orchestrated agents.

All interviewees emphasized that the key to effective agentic supervision is anticipation. Supervision should not be an afterthought; it must be embedded early in the agent’s design and development. Setting up observability and evaluation mechanisms only once the agent is in production is too late. Identifying flaws at that stage often means reworking the entire agent, which is far more costly than investing in robust supervision from the start.

The good news is that a variety of tested tool combinations and emerging agentic frameworks are already available. We strongly recommend that enterprise AI governance teams define their own standardized framework and toolset to be applied across all agentic development. This becomes even more critical as agents begin to interconnect, making system-wide control and supervision interoperability essential.

To succeed, AI governance must also align closely with strong IT and Data Governance practices, since agents rely on enterprise data and IT systems to ‘think’ and take ‘action.’ Just as IT and data governance required business involvement in the past, one of the key takeaways from our study is that agentic governance will demand even deeper business engagement.

Unlike traditional software or data supervision, typically handled by IT or data teams (and in the most mature organizations, by a business-led data governance network), agent supervision will need to be business-owned. Given the inherent unpredictability of AI agents, incident responses often require domain expertise. As a result, the business must be actively involved not just in monitoring, but in framing agent behavior from the outset. This represents a significant cultural shift: agentic AI blurs the lines between IT, data, and business, and will require new ways of working based on cross-functional collaboration. Agentic Supervision is the Future of Work with AI!

Florence Bénézit

Expert Partner Data & AI Governance

Hanan Ouazan

Managing Partner, Lead Generative AI


Methodology

This study is based on a qualitative research approach designed to explore the emerging challenges and governance practices surrounding the early implementations of autonomous AI agents in organizations. By combining expert interviews with an in-depth analysis of the evolving technological landscape, we aimed to map current practices, identify operational needs, and understand the value propositions of available solutions for agent observability, evaluation, and supervision.

We conducted 20+ interviews with professionals directly involved in the deployment, governance, or technical development of agentic systems. These included:

- AI and Data Leaders, such as Chief Data Officers, Heads of AI, and Data Platform Directors, who shared their strategic vision on agent implementation, risk management, and the evolution of data infrastructure.

- Product Managers and Innovation Executives who offered insights into operational use cases, organizational readiness, and the shift toward agent-centric architectures.

- Compliance, Security, and IT Governance Experts, who provided critical input on regulatory expectations, ethical risks, and the emerging need for real-time control mechanisms tailored to AI agents.

- Founders and Chiefs of Science of AI tooling companies, whose feedback helped assess the state of the market across three key functions: observability, evaluation, and active supervision of AI agents.

Interviewees represented a diverse range of organizations, including major corporations (in sectors such as energy, telecom, pharmaceuticals, and luxury), global tech players, and high-growth startups, ensuring a rich and nuanced understanding of the topic.

In parallel, we conducted a systematic review of over a dozen tools and platforms offering capabilities relevant to agent governance, including Langfuse, LangSmith, DeepEval, Copilot Studio, Vertex AI, Ragas, Weights & Biases, PRISM Eval, Robust Intelligence, Giskard… Each solution was analyzed using a dedicated framework that cross-referenced three dimensions of quality (Reliability, Behavioral Alignment, Security) with three stages of supervision (Observation, Evaluation, Active Supervision).
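The three-by-three framework can be pictured as a coverage grid. The Python sketch below builds that grid and lists the cells that no tool in a given stack covers. The dimension and stage names come from the paragraph above; `tool_a`, `tool_b`, and the cells marked as covered are hypothetical placeholders, not findings about any real product.

```python
from itertools import product

QUALITY_DIMENSIONS = ("Reliability", "Behavioral Alignment", "Security")
SUPERVISION_STAGES = ("Observation", "Evaluation", "Active Supervision")


def empty_grid():
    """One boolean per (dimension, stage) cell, all uncovered by default."""
    return {cell: False for cell in product(QUALITY_DIMENSIONS, SUPERVISION_STAGES)}


def coverage_gaps(tool_grids):
    """Return the cells that no tool in the stack covers."""
    covered = {
        cell
        for grid in tool_grids.values()
        for cell, is_covered in grid.items()
        if is_covered
    }
    return [
        cell
        for cell in product(QUALITY_DIMENSIONS, SUPERVISION_STAGES)
        if cell not in covered
    ]


# Hypothetical tools with made-up coverage, for illustration only.
tool_a = empty_grid()
tool_a[("Reliability", "Evaluation")] = True
tool_a[("Security", "Evaluation")] = True

tool_b = empty_grid()
tool_b[("Reliability", "Observation")] = True

gaps = coverage_gaps({"tool_a": tool_a, "tool_b": tool_b})
for dimension, stage in gaps:
    print(f"uncovered: {dimension} / {stage}")
```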

By integrating real-world practitioner feedback with a structured technological benchmark, this study aims to offer a pragmatic and forward-looking perspective on how companies can responsibly scale agentic AI systems.

Special Thanks & Acknowledgments

ENTERPRISE INTERVIEWEES

Yoann Bersihand, VP AI Technology, SCHNEIDER

Arthur Garnier, IT Chief of Staff & Data Scientist, ARDIAN

Jean-François Guilmard, CDO, ACCOR

Paul Saffers, Deputy CDO, VEOLIA

Alexis Vaillant, Head Automatisation, ORANGE

Leo Wang, Data Protection Officer, LOUIS VUITTON CHINA

AGENTOPS STACK INTERVIEWEES

Alex Combessie, Co-founder & Co-CEO, GISKARD

Salomé Froment, Account Director France, WEIGHTS & BIASES

Éric Horesnyi, Head of AI Go-To-Market, GOOGLE FRANCE

Amin Karbasi, Senior Director, CISCO FOUNDATION AI RESEARCH (Former Chief Scientist at Robust Intelligence)

Jean-Luc Laurent, Generative AI/ML Specialist, GOOGLE

Pierre Peigné, Co-founder and Chief Science Officer, PRISM Eval

Chris Van Pelt, Co-founder & CISO, WEIGHTS & BIASES

Marc Gardette, Deputy CTO, MICROSOFT FRANCE


TABLE OF CONTENTS

Introduction

I. Agentic AI risks are shaking up the tech governance & supervision game.

Agentic AI or when software starts to think.

New tech, old problems: why governance is a continuum.

No more watching from the sidelines: Agentic AI puts supervision in business hands.

II. The new AgentOps stack: tests, guardrails and feedback loops.

Pre-production testing must embrace variability to ensure agent readiness.

Guardrails protect operations by managing risks during agent execution.

Agent supervision spans from immediate runtime actions to future planning decisions.

III. Secure and accelerate Agentic AI with standards & global governance.

Technical teams need clear standards to build and deploy agents efficiently and responsibly.

Scaling multi-agent systems requires shared protocols for interoperability and manageability.

Business teams need to organize global AI governance and supervision protocols.

Conclusion


Introduction

If, as shown in our previous study, the future of work with AI lies in supervising AI agents, then it is essential to ensure that this new form of work becomes a better experience than the cognitive tasks it replaces. Manually overseeing every step and decision made by an agent would quickly become a tedious, even more draining task than solving the problem directly ourselves. So, how can we do better? This study explores what’s truly at stake in agentic supervision and how early tools are beginning to shape what this new type of work might look like.

We take a broad view of what supervision means. It starts with setting up automated logging and tracing systems. It also involves designing evaluation and alerting frameworks that guide the final and most visible step: taking action (manually correcting mistakes, relaunching an agentic task with better context, mitigating incidents, identifying areas for improvement, and prioritizing development efforts). Supervising agents mirrors many aspects of human collaboration: defining job descriptions (agent objectives), recruiting (designing and deploying new agents), training and coaching (monitoring and updating behavior), and ongoing collaboration (providing inputs and support to agents, but also learning from agents and the business context they collect in their memory).

We believe that the supervision of a single agent will not fall to just one person. Agentic supervision is inherently multidimensional. For instance, business operations may oversee relevance and accuracy; ethics teams, compliance and tone; business leaders, value and economic viability; and cybersecurity teams, safety and malicious attack risk mitigation.

This study focuses on best practices for agentic governance, supervision processes, and the supporting tools. While this domain is still emerging and likely to evolve significantly, we also observe strong continuity with established practices from software, RPA, data, and ML supervision. Despite the unique challenges posed by the probabilistic behavior of AI agents, many stable foundations already exist. Embracing these foundations now is critical to ensuring the success of early agentic initiatives.


I. Agentic AI risks are shaking up the tech governance & supervision game.

I.A Agentic AI or When Software Starts to Think.

I.B New Tech, Old Problems: Why Governance Is a Continuum.

I.C No more watching from the sidelines: Agentic AI puts supervision in business hands.


I.A Agentic AI or When Software Starts to Think.

AI agents radically differ from software: they are autonomous and goal-driven.

Traditional software follows predetermined logic, and chatbots operate within rigid templates and deterministic decision trees. In contrast, agentic AI systems go much further: they interpret context, plan actions, and execute tasks by chaining decisions across various tools and APIs. These agents don’t simply wait for user commands; they pursue objectives, evaluate intermediate outcomes, and adjust their strategies on the fly. This autonomous reasoning makes them feel less like tools and more like collaborators. Unlike RPA bots (Robotic Process Automation) or even standalone large language models (LLMs), agentic AI systems are goal-oriented and task-complete, built to achieve an outcome, not just follow instructions or generate the most likely next response to a prompt.

This marks a fundamental shift in the software development paradigm. Instead of hardcoding logic upfront, you define goals and set constraints, and the agent autonomously constructs its own plan. It may chain prompts, call APIs, search & query data stores, or even create subgoals as needed. Rather than following a fixed path, the system continuously adapts its actions to what’s most likely to succeed. While this opens the door to major productivity gains, it also disrupts traditional governance models: How do you test a system whose outputs change with every run? How can you control behavior that varies over time, without resorting to constant human oversight and intervention?

“What’s different with agents is that they don’t just follow a script. They interpret instructions, decide how to achieve goals, and often infer more than you told them to. That opens up a new layer of unpredictability. You’re not supervising code, you’re supervising intent.”

Arthur GRENIER

IT Chief of Staff & Senior Data Scientist

ARDIAN


Agentic AI can’t be made 100% predictable and calls for governance reinvention to balance value and risks.

The first generation of automation tools, including RPA, macros and rule-based bots, offered predictability by design. They mimicked user actions step by step, within well-defined workflows. Even traditional Machine Learning systems, despite their internal complexity and probabilistic nature, operated within clear boundaries: structured inputs and outputs. In contrast, LLMs accept unstructured text inputs and can generate a wide range of outputs, often in unpredictable formats. Agentic AI exacerbates behavior complexity even further: agents navigate dynamic environments, draw on multiple knowledge sources, and adapt their actions autonomously in real time. Their behavior is influenced not just by training data or predefined rules, but by human prompts, tool usage, memory state, and implicit knowledge baked into their foundation models.

Legacy governance models relied on deterministic input-output control: supply test data, verify results, trace bugs. But agentic systems blur that line. A single prompt might lead to hallucinations, multiple API calls, tool interactions, or memory recalls, all potentially different each time. This abstraction between intent and execution creates a governance control gap in terms of technical visibility, process readiness and accountability: rules can be bypassed, edge cases overlooked, and behavioral regressions may go unnoticed until they cause real issues.

As a result, supervising agents shifts the bulk of the effort from verifying code to observing pairs of inputs and outputs, and piecing together their decision-making retrospectively. As with software and data management, this observation & analysis effort happens both offline, before deployment on ground truth or synthetic data, and online on production data. All interviewees stressed the importance of setting up agentic supervision upfront to rigorously test agents while they are being developed but also to anticipate online supervision accountability and remediation processes.

“Unlike traditional software, AI development is fundamentally probabilistic. Code is no longer the core IP, learning is. What matters is knowing what works, what doesn’t, and why.”

Chris Van Pelt

Co-founder & CISO


This unpredictability shift introduces the need for large-scale, statistical value & risk evaluation.

As a consequence of this unpredictability, the emergence of agentic AI has introduced a profound control challenge: traditional QA (Quality Assurance) methods are no longer adequate. Previously, a handful of unit tests matching fixed inputs to their expected deterministic outputs was enough to validate hardcoded logic. In contrast, AI agents now require testing across a broad spectrum of possible inputs, with each test scenario rigorously and repeatedly run to account for their non-deterministic behavior. On top of that, evaluating their performance means interpreting unstructured and variable text outputs, which makes it much harder to consistently define and measure what “quality” really means. Output quality may need to be assessed along multiple dimensions, including factual accuracy, completeness, security, and alignment with user intent.
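The difference from a classic unit test can be illustrated with a short Python sketch: each scenario is run many times and judged on its pass rate against a threshold rather than on a single output. The scripted agent below is a deterministic stand-in for a non-deterministic one, used so the sketch is reproducible; the scenario names, the 20 runs, and the 0.8 threshold are assumptions for illustration.

```python
from itertools import cycle


def make_scripted_agent(outputs):
    """Stand-in for a non-deterministic agent: cycles through canned outputs."""
    canned = cycle(outputs)

    def agent(prompt):
        return next(canned)

    return agent


def pass_rate(agent, prompt, check, runs):
    """Run the same scenario several times and return the share of passing runs."""
    passes = sum(1 for _ in range(runs) if check(agent(prompt)))
    return passes / runs


def run_suite(agent, scenarios, runs=20, min_pass_rate=0.8):
    """Evaluate every scenario statistically instead of with a single assertion."""
    report = {}
    for name, (prompt, check) in scenarios.items():
        rate = pass_rate(agent, prompt, check, runs)
        report[name] = {"pass_rate": rate, "ok": rate >= min_pass_rate}
    return report


# The scripted agent answers "4" three times out of four.
agent = make_scripted_agent(["4", "4", "4", "5"])
scenarios = {
    "exact_answer": ("What is 2 + 2?", lambda out: out == "4"),
    "well_formed": ("What is 2 + 2?", lambda out: out.isdigit()),
}
report = run_suite(agent, scenarios)
for name, result in report.items():
    print(name, result)
```

Here the agent always produces a well-formed answer but is only right 75% of the time, so one quality dimension passes while the other fails the threshold, which a single-run test could easily miss.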

Once quality is assessed, a second challenge emerges: identifying the root causes of agent failures to support improvement or manage run incidents. This requires detailed, transparent logging of the agent’s reasoning process, accessible to a diverse set of supervising stakeholders: developers, compliance officers, business owners, and domain experts alike.


The need to close this supervision and governance gap arises very early in the enterprise agentic journey. As agentic systems begin interpreting complex business contexts and taking autonomous decisions, the risks and responsibilities grow. While agents are already being deployed in enterprise pilots across various functions, the technical, organizational, and legal infrastructures required for robust supervision remain underdeveloped. Legacy governance frameworks are insufficient, and enterprises need to upgrade them with a new, test-intense, purpose-built approach.

“After the Digital and Mobile revolutions, we are now entering a third wave of media disruption: AI agents. These agents will increasingly mediate our interactions with companies, transforming how we search, learn, shop, work, and communicate. Imagine that in 2030, 40% of interactions between consumers and companies will be shaped by AI. But how do we control the reliability and security risks of these agents?”

Alex COMBESSIE

Co-founder & Co-CEO

Giskard


TECHNOLOGY

Giskard is an open-source testing platform designed to ensure the quality, security, and compliance of AI models. It automates the detection of vulnerabilities such as hallucinations, biases, and security flaws in LLMs and agents. Giskard’s features include automated test generation, continuous monitoring, and collaborative tools that facilitate cross-functional teamwork among data scientists, developers, and business stakeholders.

FEATURE COVERAGE

Reliability, Regulatory compliance, Security, FinOps, Latency

OBSERVE.

Giskard does not offer real-time observability features such as tracking latency, token usage, or cost metrics. Its primary focus is on pre-deployment testing and vulnera
