Building a Prompt Data-Driven Engine for the LLMOps Era
Speaker: Yilun Liu, Huawei 2012 Laboratories, Text Machine Translation Lab

Team introduction: the Huawei Text Machine Translation Lab.

From AIOps to LLMOps: the strong generalization and language-understanding abilities of large models are driving the evolution of AIOps. Prompts act as a bridge between the human cognitive world and the digital world of large models: they clarify the model's intended task, provide clearer reasoning paths, and improve the efficiency of human-model interaction. Effective interaction strategies elicit content that matches human intent and needs, and training models on question-answer pairs helps them understand what humans want.

• Pain point 1: Traditional intelligent O&M algorithms depend on task-specific data, and expert annotation is time-consuming and labor-intensive.
• Pain point 2: Traditional O&M systems have poor interpretability and weak interactivity.
• Pain point 3: Unstable quality of prompt training data degrades model performance.
• Pain point 4: Insufficient coverage of prompt training data harms the comprehensiveness of AI capabilities.

Two lines of work address these pain points: LogPrompt (accepted at ICSE 2024 / ICPC 2024) and building the LLMOps data flywheel (accepted at ICDE 2024).

Motivation: Existing approaches rely on massive training data and lack interpretability
• Logs are semi-structured text, and new features continuously introduce unseen logs.
• Existing approaches perform well on log analysis tasks only when trained on the majority of logs (tested on the remaining 20% or 10% of logs); performance drastically declines when training data is limited to 10% or lower.
• The reliance on massive training data makes existing methods ineffective and inflexible in online scenarios.
• Existing methods only offer prediction values without rationales, limiting the understanding of incidents.
• Practitioners often need to spend extra effort to interpret the results and act (identify root causes, compose reports, etc.).

Motivation: Large language models have the potential to address these challenges, and prompts matter
• Large language models (LLMs) have a powerful generalization ability to unseen user instructions, so they may also be able to handle unseen logs in the online situation of log analysis.
• In our preliminary experiments, ChatGPT with a simple prompt achieved an F1-score of only 0.189 in anomaly detection. However, our best prompt strategy outperformed the simple prompt by 0.195.
• Since log analysis is a domain-specific task, applying a simple prompt to LLMs can result in poor performance.
• Unlike existing deep-learning methods, LLMs can also generate rationales, reports, etc.; log interpretation is essentially a writing task.
• Many prompt philosophies have been proposed for NLP tasks, such as CoT.
• The primary objective of LogPrompt is to improve the performance and interpretability of log analysis in the online scenario, through proper strategies of prompting LLMs.

Approach: The chain-of-thought (CoT) prompt
• The CoT prompt puts an example with intermediate thinking steps before an input problem (e.g., a math problem); the model is encouraged to follow the same thinking steps.
• The concept of chain of thought (CoT), a series of intermediate reasoning steps, was introduced by Wei et al. [1]. The CoT prompt emulates the human thought process by requiring the model to include thinking steps when addressing complex problems, and can enhance the performance of LLMs in challenging tasks, such as solving mathematical problems.
• Advantages of CoT prompting: break down unseen problems into manageable steps; enhance the interpretability and transparency of LLM outputs; unleash the abilities learned in the pre-training phase.

[1] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., "Chain-of-thought prompting elicits reasoning in large language models," Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022.

Approach: Adapting the CoT prompt to the field of log analysis
• In manual log analysis, practitioners also engage in a series of reasoning steps to reach a conclusion, for instance when the boundary between a normal log and an abnormal log is unclear. To emulate this thinking process:
• Implicit CoT: Humans mostly have reasons before conclusions. Thus, the model is required to justify its decisions with reasons.
• Explicit CoT: We further explicitly define intermediate steps to regulate the thinking process, e.g., restricting the definition of anomaly to only alerts explicitly expressed in the log.

[Figure: comparison of the Standard Prompt and LogPrompt (CoT), both performing analysis based on a task description and input logs.]

Approach: Other prompt strategies (see the sketch after this section)
• In-context prompt: This approach uses samples of labeled logs to set the context for the task. The LLM then predicts on new logs, using the labeled examples as context.
• Format control: We employ two functions, f_X([X]) and f_Z([Z]), to establish the expected format and answer range, like "a binary choice between abnormal and normal", or "a parsed log template".
• Self-prompt: This strategy involves the LLM generating its own prompts. A meta-prompt describing the task instructs the LLM to generate candidate prompt prefixes, with the most effective prompt chosen based on performance.

Experiment: Evaluation setup
• The effectiveness of LogPrompt is evaluated mainly using the LogHub datasets, which contain real-world logs from nine different domains, including supercomputers, distributed systems, operating systems, mobile systems, and server applications.
• Two of the datasets (BGL and Spirit) were annotated by domain experts to identify anomalous events for the purpose of anomaly detection.
• To evaluate log parsing performance, eight of the datasets have log templates manually extracted from a subset of 2,000 log messages in each domain.
• All log data were timestamped, enabling the datasets to be split into train/test sets chronologically.
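To make the prompt strategies above concrete, below is a minimal sketch of how a LogPrompt-style prompt could be assembled, combining format control (f_X/f_Z), explicit CoT steps, and optional in-context examples. The slides do not give the exact prompt templates, so every string and function name here is an illustrative assumption, not the paper's wording.

```python
# Minimal, illustrative sketch of assembling a LogPrompt-style prompt.
# All template strings below are assumptions for demonstration purposes.

def f_x(logs: list[str]) -> str:
    """Format-control wrapper for the input slot [X]: numbered raw logs."""
    return "\n".join(f"({i + 1}) {log}" for i, log in enumerate(logs))

def f_z() -> str:
    """Format-control wrapper for the answer slot [Z]: constrain the answer range."""
    return ("For each log, answer with a binary choice between 'abnormal' and "
            "'normal', followed by a brief reason.")

def build_prompt(logs, examples=None):
    """Combine explicit CoT steps, optional in-context examples, and format control."""
    cot_steps = (
        "Think step by step: (1) treat a log as abnormal only if it contains "
        "an explicitly expressed alert; (2) give the reason before the conclusion."
    )
    parts = [cot_steps]
    if examples:  # in-context prompt: labeled logs set the task context
        parts += [f"Example log: {log}\nAnswer: {label}" for log, label in examples]
    parts += [f_z(), "Input logs:", f_x(logs)]
    return "\n\n".join(parts)

print(build_prompt(
    ["ERROR: disk quota exceeded on /dev/sda1", "session opened for user root"],
    examples=[("connection reset by peer", "abnormal")],
))
```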
Experiment: Setup details
• In our primary experiments, the underlying LLM is accessed via APIs provided by external services.
• The initial temperature coefficient is set to 0.5, maintaining a balance between increasing the model's reasoning capabilities through diverse token exploration and limiting detrimental randomness.
• If the response format is invalid, the query is resubmitted with the temperature coefficient increased by 0.4 until the response format is correct. The format failure rate is less than 1%, which is consistent with existing literature.
• The train/test datasets are split chronologically to simulate online scenarios.

Experiment: LogPrompt has a strong ability to handle scarcity of training data
• Task of log parsing: for each dataset, most baseline methods are trained on the first 10% of logs and evaluated on the remaining 90%, while LogPrompt is directly tested on the remaining 90% without any in-domain training data.
• We adopt the F1-score as the metric (see the sketch after this section). To calculate it, we tokenize the predicted log template into a list of tokens, then treat the tokens as results from a classification task over {template, variable}.
• LogPrompt achieved the best F1-score on six of the eight datasets, outperforming existing methods that require resources for in-domain training.
• Task of anomaly detection: for the baselines, the first 4,000 logs in each dataset are used for training. Both LogPrompt and the trained baselines are then tested on the remaining logs.
• We report the session-level F1-score of anomalies. A session is formed using fixed-window grouping with a length of 100 log templates.
• Despite the existing methods being trained on thousands of logs, LogPrompt still achieved strong performance on both datasets without utilizing any in-domain training data, with an average improvement of 55.9% in terms of F1-score.
• This advantage makes LogPrompt a suitable choice for log analysis in online scenarios.

Experiment: LogPrompt yields helpful and comprehensible content for practitioners during real-world log analysis
• A novel evaluation task of log interpretation: a total of 200 logs were randomly sampled for the human evaluation, accompanied by LogPrompt's actual outputs, with 100 logs related to log parsing and 100 related to anomaly detection (evenly distributed across domains).
• Incorrectly predicted logs (FPs and FNs) were not included in this evaluation. An equal number of normal and abnormal samples were included for anomaly detection, and each selected log for log parsing was required to contain at least one variable. Practitioners scored the outputs according to the criteria, independently.
• We reported two metrics for both tasks, in terms of usefulness and readability: average scores consistently exceeded four, and the average HIP was consistently above 80%, indicating content considered helpful and readable overall by experienced log analysts.

Experiment: More analysis of LogPrompt's interpretability
• Feedback from practitioners:
• "I appreciate the ability of LogPrompt in supporting the interpretation of logs from various domains. As our system continuously incorporates third-party services, sometimes I have to read through the manuals to decipher logs from unfamiliar domains. The explanations generated by LogPrompt can provide a swift grasp of the logs before I find the official definitions."
• "LogPrompt can definitely help in composing reports shortly after system crashes, where I often need to brief non-technical colleagues in meetings."
• "In the realm of software O&M, false alarms are an inescapable reality, with false positives imposing substantial time costs on engineers and false negatives causing severe ramifications. Accompanying automatic analysis outcomes with explanations enables engineers to more promptly ascertain the credibility of a purported anomaly, thereby reducing the time spent on subsequent actions."
• Bad case analysis: a major factor behind bad cases is the LLM's lack of domain knowledge, which leads to overly general interpretations of some domain-specific terms; for example, it may describe specific parameters only in generic terms. Another cause is the lack of semantic content in some logs, which can be attributed to their brevity or richness of non-NLP patterns (i.e., digits, codes and addresses).
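As a concrete reading of the log-parsing metric described above, here is a minimal sketch of a token-level F1 computation. It assumes whitespace tokenization, position-aligned templates, and a "<*>" wildcard marking variables; these are illustrative assumptions, not necessarily the paper's exact procedure.

```python
# Illustrative token-level F1 for log parsing: each token of the predicted
# template is classified as static "template" text or a "variable" placeholder,
# and F1 is computed with "variable" as the positive class.
# Assumptions (not from the slides): whitespace tokenization, templates
# aligned position by position, and "<*>" marking variable slots.

def template_token_f1(predicted: str, ground_truth: str, wildcard: str = "<*>") -> float:
    tp = fp = fn = 0
    for pred_tok, true_tok in zip(predicted.split(), ground_truth.split()):
        pred_var = pred_tok == wildcard
        true_var = true_tok == wildcard
        tp += pred_var and true_var
        fp += pred_var and not true_var
        fn += true_var and not pred_var
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Example: the prediction misses one variable slot ("30" should be "<*>").
print(template_token_f1(
    "Connection from <*> closed after 30 retries",
    "Connection from <*> closed after <*> retries",
))  # -> 0.666...
```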
Experiment: Ablation study on the three prompt strategies
• Compared to prompt 5, prompt 2 provides more formal and accurate words (such as "standardized" and "convert") and clearly outlines the intermediate steps for the task (identifying and replacing variables, then converting to a template).
• Interestingly, utilizing only the implicit CoT (requiring the model to generate reasons) can still improve model performance, likely because, with more NLP-style explanations, the distribution of the generated answers is closer to that seen in the models' pre-training phase.
• An overly long context prefixed to the prompt may cause LLMs to pay less attention to the new input logs, thereby deteriorating task performance, which is why the performance peaks at a moderate number of in-context examples.

Future work: Domain-adapting smaller-scale LLMs for compatibility with advanced prompt strategies
• Applying LogPrompt to a smaller-scale LLM: Vicuna-13B.
• The deployment of large, proprietary, API-dependent LLMs in local environments can be a challenging task, and the reliability of services may be compromised if the API services of the LLMs become unavailable.
• Therefore, for industrial usage, it is crucial for LogPrompt to be compatible with alternative open-source, privacy-protected, smaller-scale LLMs.
• Online log parsing with smaller-scale LLMs: although Vicuna has only 13B parameters, when equipped with LogPrompt it achieves log-parsing performance comparable to the 175B GPT model on the HDFS, Linux and Proxifier datasets.
• Additionally, when the prompt strategy transitions from a simple prompt to LogPrompt, Vicuna exhibits significant performance improvements, with an average increase of 380.7% in F1-score.
• As Vicuna is open-source and requires fewer resources for training and deployment, the success of LogPrompt on Vicuna holds promising implications for building in-domain industrial applications.
• The performance of smaller-scale LLMs like Vicuna still has room for improvement. Since the base model…
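Since the slides argue for compatibility with open-source, locally deployable models, below is a minimal sketch of swapping an external API backend for a local checkpoint via the Hugging Face transformers text-generation pipeline. The model id, generation parameters, and prompt text are assumptions for illustration, not the paper's exact configuration.

```python
# Illustrative sketch: running a LogPrompt-style prompt against a local
# open-source model instead of an external API. The checkpoint and generation
# settings are assumptions; a 13B model requires a correspondingly large GPU.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="lmsys/vicuna-13b-v1.5",  # assumed checkpoint for Vicuna-13B
)

prompt = (
    "Think step by step: treat a log as abnormal only if it contains an "
    "explicitly expressed alert, and give the reason before the conclusion.\n"
    "Answer with a binary choice between 'abnormal' and 'normal'.\n"
    "Input log: ERROR: disk quota exceeded on /dev/sda1"
)

# temperature mirrors the 0.5 setting reported in the experimental setup above
result = generator(prompt, max_new_tokens=128, temperature=0.5, do_sample=True)
print(result[0]["generated_text"])
```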
