Argus: A Multi-Agent Sensitive Information Leakage Detection Framework Based on Hierarchical Reference Relationships
Bin Wang¹, Hui Li¹,*, Liyang Zhang², Qijia Zhuang², Ao Yang¹, Dong Zhang³, Xijun Luo³,*, Bing Lin⁴
¹ Guangdong Provincial Key Laboratory of Ultra High Definition Immersive Media Technology, Shenzhen Graduate School, Peking University
² University of Electronic Science and Technology of China
³ Tencent Security Platform Department
⁴ China Unicom (Guangdong) Industrial Internet Co., Ltd
thebinking66@, lih64@, {2022090908021, 2022090917007}@, jarvisya@, {zalezhang, junjunluo}@, gds-cyhlw@
arXiv:2512.08326v1 [cs.CR] 9 Dec 2025
Abstract
Sensitive information leakage in code repositories has emerged as a critical security challenge. Traditional detection methods, which rely on regular expressions, fingerprint features, and high-entropy calculations, suffer from high false-positive rates, which not only reduce detection efficiency but also significantly increase the manual screening burden on developers. Recent advances in large language models (LLMs) and multi-agent collaborative architectures have demonstrated remarkable potential in tackling complex tasks, offering a novel technological perspective for sensitive information detection. In response to these challenges, we propose Argus, a multi-agent collaborative framework for detecting sensitive information. Argus employs a three-tier detection mechanism that integrates key content, file context, and project reference relationships to effectively reduce false positives and enhance overall detection accuracy. To comprehensively evaluate Argus in real-world repository environments, we developed two new benchmarks: one to assess genuine leak detection capabilities and another to evaluate false-positive filtering performance. Experimental results show that Argus achieves up to 94.86% accuracy in leak detection, with a precision of 96.36%, recall of 94.64%, and an F1 score of 0.955. Moreover, the analysis of 97 real repositories incurred a total cost of only $2.21. All code implementations and related datasets are publicly available at /TheBinKing/Argus-Guard for further research and application.
CCS Concepts
• Computing methodologies → Natural language processing;
• Security and privacy → Software security engineering.
Keywords
sensitive information leakage, code repository security, multi-agent systems, large language models, contextual semantic analysis
ACM Reference Format:
Bin Wang, Hui Li, Liyang Zhang, Qijia Zhuang, Ao Yang, Dong Zhang, Xijun Luo, and Bing Lin. 2026. Argus: A Multi-Agent Sensitive Information Leakage Detection Framework Based on Hierarchical Reference Relationships. In 2026 IEEE/ACM 48th International Conference on Software Engineering (ICSE ’26), April 12–18, 2026, Rio de Janeiro, Brazil. ACM, New York, NY, USA, 13 pages. /10.1145/3744916.3773208
This work is licensed under a Creative Commons Attribution 4.0 International License.
ICSE ’26, Rio de Janeiro, Brazil
© 2026 Copyright held by the owner/author(s).
ACM ISBN 979-8-4007-2025-3/2026/04
/10.1145/3744916.3773208
1 INTRODUCTION
Public code repositories, such as GitHub, have become central platforms for developer collaboration and version control in modern software development. These platforms enable developers to efficiently share code, track issues, and manage versions rigorously, thereby significantly enhancing both development efficiency and code quality. However, their open nature also introduces new security challenges, particularly regarding the management and protection of sensitive information [23]. According to monitoring data from GitGuardian [9], sensitive information leakage incidents on GitHub reached 12.8 million in 2023, a 28% increase over 2022, with the trend continuing upward. These leaks primarily involve API keys, database credentials, private keys, and other critical data, posing serious risks not only to individual privacy but also to enterprises by exposing them to severe security vulnerabilities and potential economic losses [46]. The paper How Bad Can It Git? Characterizing Secret Leakage in Public GitHub Repositories [28] discusses the prevalence of secret leakage in open-source Git repositories, highlighting the urgency of addressing this issue.
Current approaches to detecting sensitive information leaks can be broadly classified into two categories. The first comprises rule-based detection tools (e.g., Gitleaks and TruffleHog) that rely on regular expressions and entropy calculations [38]. The second involves machine learning methods designed to reduce false positives through model training. However, both approaches have inherent limitations. While rule-based tools offer extensive coverage, some tools have a false positive rate of over 80% [2], which substantially undermines their utility. As Chess and McGraw [4] have noted, “an excessively high false positive rate ultimately leads to 100% of leaks being overlooked because users will eventually disregard the detection results.” Conversely, machine learning methods [32], though effective in reducing false positives, lack a deep understanding of code semantics, rendering them less effective in managing complex contextual relationships.
In recent years, the advent of LLMs has opened a new technical pathway for sensitive information detection [12]. Compared to traditional methods, LLMs offer superior text comprehension, enabling them to deeply analyze code context and identify potential sensitive information. However, relying solely on LLMs presents challenges: they may struggle to precisely verify key formats and identify placeholders, and their output stability can diminish when processing lengthy texts. To overcome these limitations, the concept of “AI-empowered software engineering” has emerged. This approach leverages multiple AI agents working in collaboration to address complex tasks. The principle of “collaborative AI for SE” involves the coordinated operation of several AI agents, each compensating for the limitations of a single agent when tackling intricate problems. For instance, in tasks such as code review and generation, multi-agent systems have demonstrated significant advantages, such as reducing security vulnerabilities by 13% [29] when an LLM responsible for code generation collaborates with agents for static analysis and fuzz testing, while ensuring functional correctness. These findings underscore the potential of a collaborative multi-agent strategy in handling the diverse and highly accurate detection requirements of source code sensitive information.
Motivated by these insights, we propose a multi-agent sensitive information detection framework named Argus. This framework employs a three-tier detection mechanism that integrates key content, file context, and project reference relationships, effectively compensating for the limitations of a single LLM. Each agent focuses on a specific detection task, and through their coordinated efforts, the system achieves stable and precise detection outcomes. Additionally, we have developed a comprehensive evaluation dataset that encompasses common sensitive information scenarios found in open-source projects. Experimental results demonstrate that Argus attains a detection accuracy of 94.86% on this dataset, significantly outperforming existing methods.
The main contributions of this paper are as follows:
(1) We propose a novel three-level detection mechanism that offers a comprehensive analysis of sensitive information, providing a fresh perspective on applying LLMs in the field of security detection.
(2) We construct two benchmark datasets based on real-world code repository scenarios, covering a wide range of sensitive information types and usage scenarios, thereby establishing a unified standard for evaluating detection tools. In addition, we verify the validity of the secrets in each repository to mitigate potential security risks.
(3) We design and implement a multi-agent sensitive information detection framework named Argus, which achieves a precision of 96.36% and a recall of 94.64% on the benchmark dataset, significantly outperforming previous baseline tools by efficiently identifying genuine leaks while effectively filtering out false positives.
2 PROBLEM AND MOTIVATION
2.1 Problem Description and Definition of Secret Leak Detection
In this study, a “secret” refers to sensitive information that appears in plaintext within a code repository without any form of masking or encryption. Such information typically exhibits the following characteristics:
(1) Format Characteristics: These secrets often have fixed prefixes or distinct character structures. For example, an AWS access key might start with “AKIA”, or an RSA private key may be identified by markers such as “-----BEGIN PRIVATE KEY-----”. Existing literature indicates that rule-based methods primarily target these formatted patterns.
(2) Semantic Relevance: The content is semantically tied to authentication, authorization, or secure communications and is closely linked to actual business operations. A leak of such information (for instance, an API key from OpenAI) could directly cause service disruptions or financial losses.
Based on these features, this work defines a “secret leak” as the occurrence where any file in a code repository contains plaintext information that meets the definition of a secret and has not been properly masked or encrypted. It is important to note that some texts meeting these characteristics may also be present in a repository; however, if contextual cues or repository indicators make it clear that the secret was intentionally made public by the developer, it should not be considered a secret leak. Thus, the core task of this work is to accurately identify and pinpoint unintentional secret disclosures by developers.
2.2 Excessive False Positives
Current methods for detecting secret leaks in code repositories (e.g., TruffleHog, Gitleaks) suffer from severe false positive issues. This not only reduces the practical efficiency of these tools but also significantly increases the manual review burden on developers, effectively rendering a high false positive rate equivalent to low detection accuracy in practice [1]. Most existing detection tools rely on custom rules based on regular expressions, fingerprint features, and high-entropy calculations. However, these approaches have clear limitations. For example, many tools mistakenly classify commit hash strings as sensitive information. Similarly, placeholder strings intentionally left by developers (e.g., key templates in the form “sk-xxxxxxxxxxxxxxxxx”) are erroneously flagged as leaks even though they are merely intended to guide the user in entering the actual key.
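The two failure modes described above can be reproduced with a minimal Python sketch. The entropy threshold, the token pattern, and the "sk-" prefix pattern are illustrative assumptions and are not taken from TruffleHog, Gitleaks, or any other real tool; the hex string and the template key are made-up inputs.

```python
import math
import re
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy of `text`, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in Counter(text).values())


# Assumed rule 1: any token of 20+ characters whose entropy exceeds a threshold.
TOKEN_RE = re.compile(r"[A-Za-z0-9_\-]{20,}")
ENTROPY_THRESHOLD = 3.0  # illustrative value, not taken from any real tool

# Assumed rule 2: a prefix-only pattern loosely modelled on "sk-" style keys.
PREFIX_RE = re.compile(r"sk-[A-Za-z0-9]{16,}")


def entropy_flags(line: str) -> list[str]:
    """Tokens that the entropy rule would report."""
    return [t for t in TOKEN_RE.findall(line) if shannon_entropy(t) > ENTROPY_THRESHOLD]


def prefix_flags(line: str) -> list[str]:
    """Tokens that the prefix rule would report."""
    return PREFIX_RE.findall(line)


commit_hash = "9fceb02d0ae598e95dc970b74767f19372d61af8"  # example 40-character hex string
placeholder = "sk-" + "x" * 17                            # template key, never a real credential

print(entropy_flags(f"commit {commit_hash}"))            # the hash is reported as a secret
print(prefix_flags(f'OPENAI_API_KEY="{placeholder}"'))   # the placeholder is reported as a secret
```

Neither rule looks at what the token means or where it sits, which is why both harmless strings are reported.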
Table 1: Leak Data Statistics
Platform   TE          RL       LR(%)   RL>5    TR
GitLab     1,606,827   9,803    23.44   2,330   41,826
GitHub     2,295,293   37,149   6.21    8,287   597,933
Gitee      494,247     14,750   7.27    4,512   203,012
Note: TE = Total Entries, RL = Repositories with Leaks, LR = Leak Repo Ratio, RL>5 = Repositories with >5 Leak Entries, TR = Total Repositories.
2.2.1 General Evaluation. In our comprehensive evaluation, we employed the actively maintained open-source tool TruffleHog to survey 2,022 mirror backups from GitHub, GitLab and Gitee. Scanning the entire dataset proved prohibitively expensive, so we randomly sampled a large number of repositories to estimate the false positive rate. Given the manpower required to manually verify every detection, we initially treated TruffleHog’s outputs as ground truth. The detailed results appear in Table 1.
Our analysis shows that over 7.3% of repositories contained at least one reported leak, rising to 23.44% on GitLab, and that approximately 5.57% of repositories reported more than five leaks.
Structured data files (.csv, .json) accounted for roughly 520,000 detections while document files (.md, .txt) comprised about 10% of all findings. Notably, repositories reporting more than 50 leaks contributed 75.69% of the total detection volume.
A closer examination of TruffleHog’s outputs revealed a high prevalence of false positives concentrated in just a few rules. For example, entries flagged by the “Github” and “Gitlab” rules made up 72.4% of all detections, yet many of these corresponded to commit hashes or default configuration files misclassified as secrets. Likewise, the “JDBC” and “URI” rules proved overly broad, frequently tagging test data and boilerplate as sensitive. On Gitee, such spurious entries totaled around 26,000, representing 5% of all flags. We found that these noisy files typically feature templated structures, highly repetitive fields and rigid formatting, causing the same rule to trigger repeatedly across multiple projects. This systematic amplification of false alarms places a heavy burden on downstream analysis.
To validate our false positive assessment, we randomly sampled 2,000 entries from TruffleHog’s full output. Two independent annotators with security expertise reviewed each record, classifying it as a genuine secret or a false positive based on contextual semantics, structural patterns and known non-sensitive markers. A third reviewer resolved any disagreements. The final annotations showed that fewer than 3.4% of sampled records represented genuine leaks; the vast majority consisted of default values, placeholders, debug information or highly repetitive strings. These findings confirm that while secret leaks are indeed widespread, false positives are pervasive.
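The ratios quoted in this subsection can be recomputed from Table 1 with a short Python sketch. The dictionary layout and variable names are ours; the numbers are copied from the table. One observation from the recomputation: the 5.57% figure for repositories with more than five leaks coincides with the GitLab row (RL>5 divided by TR), whereas the same ratio over all three platforms comes out lower, so the figure appears to be platform-specific.

```python
# Values copied from Table 1; the key names are our own shorthand.
TABLE_1 = {
    "GitLab": {"RL": 9_803, "RL_gt5": 2_330, "TR": 41_826},
    "GitHub": {"RL": 37_149, "RL_gt5": 8_287, "TR": 597_933},
    "Gitee": {"RL": 14_750, "RL_gt5": 4_512, "TR": 203_012},
}


def pct(part: int, whole: int) -> float:
    """Percentage rounded to two decimals, as reported in the table."""
    return round(100.0 * part / whole, 2)


# LR column: repositories with leaks divided by total repositories, per platform.
leak_ratio = {name: pct(row["RL"], row["TR"]) for name, row in TABLE_1.items()}

total_repos = sum(row["TR"] for row in TABLE_1.values())
overall_leak_ratio = pct(sum(row["RL"] for row in TABLE_1.values()), total_repos)

# Share of repositories with more than five leak entries.
gitlab_gt5_ratio = pct(TABLE_1["GitLab"]["RL_gt5"], TABLE_1["GitLab"]["TR"])
overall_gt5_ratio = pct(sum(row["RL_gt5"] for row in TABLE_1.values()), total_repos)

print(leak_ratio, overall_leak_ratio, gitlab_gt5_ratio, overall_gt5_ratio)
```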
Table 2: False Positive Statistics with Version Information
Repository     Version   TH    GL    ST             WP
moby           c710b88   83    181   (148,11,0)     73
kubernetes     9253c9b   142   306   (110,66,19)    27
bitcoin        bf03c45   3     71    (2,2,102)      12
neovim         8b98642   5     7     (2,0,0)        3
webpack        3612d36   17    1     (1,1,0)        4
spring-boot    8964203   56    26    (28,11,11)     2
fastapi        113da5b   1     28    (45,0,0)       1
pandas         0691c5c   2     1     (1,2,0)        4
vue            13f4e7d   56    1     (1,0,0)        3
transformers   5d7739f   5     13    (0,66,8)       2
Note: TH = TruffleHog, GL = Gitleaks, ST = SpectralOps, WP = Whispers. The values of ST represent the number of detections with (high, mid, low) severity levels.
2.2.2 False Positive Experiment. To assess the limitations of current sensitive information detection methods, we selected 10 high-star repositories from GitHub and analyzed the false positive counts using four tools: TruffleHog, Gitleaks, SpectralOps, and Whispers. The experimental results indicate that TruffleHog generated 370 false positives. Although its deep scan strategy helps in effectively filtering out high-entropy strings, there is still room for improvement in handling complex encodings and boundary cases. Gitleaks, despite offering broad detection coverage, produced 635 false positives due to overly broad rule settings that led to numerous false positives in test code and template files, thereby increasing the manual review burden. SpectralOps reported the highest number of false positives. Although it categorizes the results into high, medium, and low risk to provide developers with a prioritization reference, its reliance on machine learning and context analysis has not sufficiently reduced false positives from non-sensitive content. In contrast, Whispers generated only 131 false positives, a lower count primarily attributable to its limited detection scope (focusing solely on hard-coded files) and restricted language support (limited to JavaScript, Java, Go, and PHP) (see Table 2 for details).
Overall, although the actual occurrence of sensitive information leaks in these high-star repositories is relatively low, the prevalent issue of excessive false positives not only increases the manual review workload for developers but also risks overlooking genuine sensitive information. To enhance detection efficiency and security, future strategies must aim to reduce false positives further, through the incorporation of context analysis and dynamic rule adjustment, while maintaining a broad coverage.
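The per-tool totals quoted above follow directly from the rows of Table 2. The following Python sketch recomputes them; the tuple layout is our own, the counts are copied from the table, and the SpectralOps total (not stated explicitly in the text) is obtained by summing its three severity levels.

```python
# Rows copied from Table 2: (TruffleHog, Gitleaks, SpectralOps (high, mid, low), Whispers).
TABLE_2 = {
    "moby": (83, 181, (148, 11, 0), 73),
    "kubernetes": (142, 306, (110, 66, 19), 27),
    "bitcoin": (3, 71, (2, 2, 102), 12),
    "neovim": (5, 7, (2, 0, 0), 3),
    "webpack": (17, 1, (1, 1, 0), 4),
    "spring-boot": (56, 26, (28, 11, 11), 2),
    "fastapi": (1, 28, (45, 0, 0), 1),
    "pandas": (2, 1, (1, 2, 0), 4),
    "vue": (56, 1, (1, 0, 0), 3),
    "transformers": (5, 13, (0, 66, 8), 2),
}

totals = {
    "TruffleHog": sum(row[0] for row in TABLE_2.values()),
    "Gitleaks": sum(row[1] for row in TABLE_2.values()),
    "SpectralOps": sum(sum(row[2]) for row in TABLE_2.values()),
    "Whispers": sum(row[3] for row in TABLE_2.values()),
}

# Tool with the largest number of reported false positives.
noisiest_tool = max(totals, key=totals.get)
print(totals, noisiest_tool)
```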
3 DATASETS
Current datasets in the secret leak detection domain exhibit several shortcomings. First, most datasets focus exclusively on a single type of secret (e.g., keys or credentials), resulting in a significant gap between the collected data and what is observed in real-world code repositories [27]. Second, these datasets generally lack hierarchical grading and detailed categorization of sensitive information, making it difficult to thoroughly evaluate the distinct characteristics and risks associated with various types of secrets. Moreover, due to the variable quality of projects on GitHub, many existing datasets inadvertently include a large number of low-quality or inactive projects, which introduces sample bias. Lastly, even when some datasets are sourced from reputable repositories, the secrets contained therein may still be active, thereby posing additional sensitivity and security risks [8][3].
To address the limitations found in existing datasets (limited secret types, lack of hierarchical annotation, and insufficient validation), we construct two new datasets: CommonLeak and TrustedFalseSecrets. CommonLeak is based on a TruffleHog scan of a GitHub snapshot from June 2022. We manually selected 97 representative projects covering ten common secret types: AWS, GitHub, Huggingface, JDBC, MongoDB, OpenAI, PostgreSQL, PrivateKey, Redis, and URI. Each candidate was reviewed by two independent annotators using context and semantics to distinguish real secrets from false positives. Disagreements were resolved by a third reviewer. All confirmed true leaks were deactivated for safe release. The final dataset contains 57 true positives and 40 false positives. Details are shown in Figure 2 and Table 3. TrustedFalseSecrets focuses on representative false positives; we curated 20 typical cases from ten well-maintained open-source repositories to illustrate common misclassifications made by regex-based tools. This dataset offers a clean benchmark for evaluating false positive mitigation techniques (see Table 8).
4 METHODOLOGY
In this section, we present the design of Argus. From a methodological perspective, Argus employs a three-level analysis framework to assess secrets, leveraging a multi-agent collaboration mechanism to distribute and coordinate tasks. Additionally, it utilizes a shared memory pool to record intermediate processes and facilitate information sharing among agents.
Table 3: Composition of Config and Others
Config                            Others
Subcategory   Total   Prop.       Subcategory   Total   Prop.
Env           9       25.71%      Java          5       23.81%
Json          5       14.29%      CS            2       9.52%
Properties    4       11.43%      Dockerfile    2       9.52%
Ipynb         3       8.57%       Shell         2       9.52%
Markdown      3       8.57%       Typescript    2       9.52%
Key           3       8.57%       PHP           2       9.52%
Git           2       5.71%       C             2       9.52%
Data          2       5.71%       Gradle        1       4.76%
Pem           2       5.71%       Html          1       4.76%
Txt           1       2.86%       TCL           1       4.76%
Conf          1       2.86%       CPP           1       4.76%
Figure 2: Composition of Dataset and Subcategories. (a) Composition of dataset categories. (b) Composition of dataset language.
4.1 Three-Level Contextual Semantic Analysis
Traditional tools for scanning code repositories for secrets are frequently overwhelmed by false positives. To address this challenge, we propose a detection method based on three-tier contextual semantic analysis, implemented through a multi-agent system that automates decision-making. This approach decomposes the secret detection task into three interconnected layers: the analysis of intrinsic key features, the semantic interpretation of its immediate context, and the examination of project-level reference relationships. Together, these layers form a hierarchical, traceable, and interpretable detection process (as shown in Figure 3).
4.1.1 Level 1: Analysis of Intrinsic Semantics. At this initial level, the focus is solely on the secret’s own features. The goal is to rapidly dismiss obvious false positives by inspecting characteristics such as readability, placeholder usage, and adherence to specific key formats. False positives at this stage generally fall into three categories:
(1) Readable Keys: For example, a string like https://readonly:readonly@www.pauldreik.se is semantically clear and lacks the high entropy or specific structure expected of a genuine key. Traditional tools that rely solely on entropy and regex matching often fail to properly filter such “readable” pseudo-keys, whereas LLMs can use their semantic analysis capabilities to recognize these non-genuine characteristics.
(2) Keys with Placeholders: For instance, mongodb://username:password@server appears in documentation (e.g., Markdown files) as an example, using fixed placeholders like username or password. By analyzing large-scale data, we have identified a set of common placeholders. Our placeholder detection tool checks for these markers within the key, thereby flagging such cases as likely false positives.
(3) Keys Not Conforming to Specific Formats: For example, jdbc:postgresql:// might be a truncated version of a legitimate key format, omitting the required username and password. Traditional regex-based detection does not differentiate between valid and invalid formats across key types. To remedy this, we have designed precise regex patterns for major key types (e.g., AWS, JDBC, MongoDB) to verify compliance with their expected formats.
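Categories (2) and (3) can be illustrated with a minimal Python sketch of a placeholder check and a format check. The placeholder vocabulary and the three regular expressions below are assumptions made for illustration; they are not the lists or patterns used by Argus. Category (1), readable keys, is left to the LLM in the paper and is not covered here.

```python
import re

# Hypothetical placeholder vocabulary; the paper derives its own set from large-scale data.
PLACEHOLDERS = ("username", "password", "your_key", "example", "changeme", "xxxx")

# Illustrative format patterns, not the authors' regexes.
FORMAT_PATTERNS = {
    # "AKIA" followed by 16 upper-case letters or digits.
    "AWS": re.compile(r"^AKIA[0-9A-Z]{16}$"),
    # user:password@host must all be present.
    "MongoDB": re.compile(r"^mongodb(\+srv)?://[^:@/\s]+:[^@/\s]+@[^/\s]+"),
    # A host and a non-empty password parameter must be present.
    "JDBC": re.compile(r"^jdbc:[a-z0-9]+://[^/\s?]+.*[?&;]password=[^&;\s]+"),
}


def level1_evidence(candidate: str, key_type: str) -> dict:
    """Collect Level 1 evidence for one candidate; no verdict is issued here."""
    lowered = candidate.lower()
    placeholders = [p for p in PLACEHOLDERS if p in lowered]
    pattern = FORMAT_PATTERNS.get(key_type)
    format_ok = bool(pattern.search(candidate)) if pattern else None
    return {"placeholders": placeholders, "format_ok": format_ok}


print(level1_evidence("mongodb://username:password@server", "MongoDB"))
print(level1_evidence("jdbc:postgresql://", "JDBC"))
```

The first candidate has a complete shape but consists of placeholders, while the second fails the format check outright; in both cases the output is evidence to be weighed rather than a final decision.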
Implementation-wise, each tool comprises an LLM and a specific function. For example, a key format checker integrates a GPT-4o model with a custom prompt and a regular expression matching function. The function’s parameters, return values, and usage are embedded in the prompt to enable the LLM to accurately utilize the tool and provide correct feedback based on its detection results. Each agent consists of an LLM and multiple tools. For instance, a preliminary inspection agent includes an LLM with a custom prompt and tools such as placeholder and key format checkers. The definitions of these tools are incorporated into the prompt to guide the agent in selecting appropriate tools for secret inspection. Unlike traditional secret detection tools, the detection results (e.g., regular expression matches) are not treated as final conclusions but rather as evidence for the LLM’s judgment. The LLM performs secondary analysis and inference, considering the actual characteristics of the key, to infer whether the key is genuine or merely resembles one.
Implementation-wise, the relevant tools are encapsulated within individual agents. Instead of directly using regex match results as the final verdict, these results are provided to an LLM, which combines the clues with the key’s intrinsic features to infer whether the key is genuine or merely resembles one.
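The idea of treating tool outputs as evidence rather than verdicts can be sketched as follows. This is a simplified illustration under stated assumptions, not the Argus implementation: the `Tool` class, the prompt wording, and both example tools are hypothetical, all tools are run eagerly instead of being selected by the agent, and the model is replaced by a stub function so that the example runs without any external service.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str            # embedded in the prompt so the model knows what the tool reports
    run: Callable[[str], str]   # returns a short evidence string for a candidate


def build_prompt(candidate: str, tools: list[Tool]) -> str:
    """Assemble tool outputs as evidence; the verdict is left to the model."""
    lines = [f"Candidate secret: {candidate}", "Evidence from tools:"]
    for tool in tools:
        lines.append(f"- {tool.name} ({tool.description}): {tool.run(candidate)}")
    lines.append("Reply GENUINE or FALSE_POSITIVE, with a one-line reason.")
    return "\n".join(lines)


def inspect(candidate: str, tools: list[Tool], llm: Callable[[str], str]) -> str:
    """Pass the evidence prompt to whatever model callable is supplied."""
    return llm(build_prompt(candidate, tools))


# Hypothetical tools, for illustration only.
tools = [
    Tool("placeholder_checker", "reports known placeholder words",
         lambda c: "placeholder found" if re.search(r"username|password", c, re.I) else "none"),
    Tool("format_checker", "reports whether the MongoDB URI shape is complete",
         lambda c: "format ok" if re.match(r"mongodb://[^:@/]+:[^@/]+@[^/]+", c) else "format broken"),
]


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call; it only reacts to the placeholder evidence.
    return "FALSE_POSITIVE" if "placeholder found" in prompt else "GENUINE"


verdict = inspect("mongodb://username:password@server", tools, stub_llm)
print(verdict)
```

Passing the model in as a callable keeps the evidence-gathering code independent of any particular LLM client.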
4.1.2 Level 2: Semantic Analysis of the Secret’s Immediate Context. At the second level, the focus shifts to cases where a key, while appearing authentic on its own, is intended solely for demonstration, teaching, or testing. Because such keys exhibit all the intrinsic characteristics of genuine secrets, a standalone analysis is insufficient. Instead, the surrounding context must be examined.
Figure 3: Overview of the Argus framework and its operational flow
For instance, a document might include a SECRET_ACCESS_KEY example accompanied by explanatory text clarifying its instructional purpose. Traditional tools might simply flag the key as a risk, but our multi-agent system features an advanced context analysis module. This module scrutinizes annotations, comments, and nearby narrative cues to determine if the key is merely illustrative rather than operational.
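A crude approximation of this context check is to collect the lines around a candidate and look for instructional cues. The sketch below assumes a fixed cue list and a fixed window radius, both hypothetical; in Argus this judgment is made by an LLM over the surrounding text rather than by keyword matching. The key value is the well-known sample string from AWS documentation, not a live credential.

```python
# Hypothetical cue phrases that suggest a key is illustrative.
DEMO_CUES = ("example", "sample", "demo", "placeholder", "for testing", "tutorial")


def surrounding_text(lines: list[str], index: int, radius: int = 3) -> str:
    """Return the lines around `index`, excluding the candidate line itself."""
    start = max(0, index - radius)
    neighbours = lines[start:index] + lines[index + 1:index + 1 + radius]
    return "\n".join(neighbours)


def demo_cues_near(lines: list[str], index: int, radius: int = 3) -> list[str]:
    """List the cue phrases that occur in the neighbouring lines."""
    context = surrounding_text(lines, index, radius).lower()
    return [cue for cue in DEMO_CUES if cue in context]


readme = [
    "## Configuration",
    "The following is an example only; replace it with your own key.",
    "SECRET_ACCESS_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "",
]
settings = [
    "import os",
    'SECRET_ACCESS_KEY = "<redacted>"',
    "client = connect(SECRET_ACCESS_KEY)",
]

print(demo_cues_near(readme, 2))    # cues found in the explanatory sentence
print(demo_cues_near(settings, 1))  # no instructional cues around the assignment
```

The candidate line is excluded from the search so that the result reflects the context and not the key string itself.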
4.1.3 Level 3: Global Reference Analysis at the Project Level. In cases where keys are embedded as standalone files (e.g., RSA private keys or certificates) and display all the attributes of genuine secrets, relying solely on intrinsic feature analysis or immediate contextual evaluation may not yield definitive results. To address this, Level 3 detection examines the key’s role and its relationships within the entire project. Below is an illustration of the Level 3 detection process using an RSA private key inspection as an example:
(1) Initial Discovery: The scanning tool detects a file matching the RSA private key format. Since it does not trigger obvious false positive conditions in tiers one or two, its authenticity remains undetermined.
(2) Reference Path Check: The advanced module retrieves the file’s reference location within the project. For example:
The RSA private key is referenced in the file: final_dataset\PrivateKey\...\pay.py
This suggests the key is likely utilized by a functional module.
(3) Contextual Analysis of the Reference: A further examination of the code in pay.py shows that the key file is read and assigned to a variable (e.g., app_private_key_string) in conjunction with terms like alipay_public_key_string, indicating its role in genuine payment or encryption operations.
(4) Final Determination: Lacking any indicators that the key is used for testing or demonstration, and given its active usage in core functionalities, the system concludes that it is a genuine secret leak.
Overall, Level three focuses on project-level usage and reference relationships. If a key cannot be ruled out as a false positive via intrinsic or contextual analyses, examining its practical deployment (through reference paths, function calls, or file dependencies) often yields the final determination: if it is employed in production, it is treated as a genuine leak requiring immediate remediation.
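The reference path check can be approximated by searching source files for mentions of the key file's name. The following Python sketch is a simplification under stated assumptions: it matches on the bare file name only, the suffix list is illustrative, and the temporary project with its file names is invented for the demonstration. The paper's Level 3 analysis additionally reasons about how the referencing code uses the key.

```python
import tempfile
from pathlib import Path

# Illustrative set of source-file suffixes to search.
SOURCE_SUFFIXES = {".py", ".js", ".ts", ".java", ".go", ".php"}


def find_references(project_root: Path, key_file: Path) -> list[tuple[Path, int, str]]:
    """Return (relative path, line number, line) for each mention of the key file's name."""
    hits = []
    for path in sorted(project_root.rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if key_file.name in line:
                hits.append((path.relative_to(project_root), lineno, line.strip()))
    return hits


# Build a throwaway project: one key file and one module that reads it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    key = root / "keys" / "app_private_key.pem"
    key.parent.mkdir()
    key.write_text("-----BEGIN PRIVATE KEY-----\n(redacted)\n-----END PRIVATE KEY-----\n")
    (root / "pay.py").write_text(
        'app_private_key_string = open("keys/app_private_key.pem").read()\n'
    )
    references = find_references(root, key)

print(references)
```

Each hit gives the analyst, or a downstream agent, a concrete location whose surrounding code can then be inspected for production or test usage.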
4.2 Role Specialization
In our multi-agent system, we first designate an initial screening agent to locate high-entropy or feature-based secret candidates. Next, a Commander acts as the ultimate decision-maker, delegating tasks to two specialized roles: the Basic Check Agent and the Advanced Check Agent. Each role is equipped with specific tool and functional capabilities, working together to determine