Building a Prompt Data-Driven Engine for the LLMOps Era
Speaker: Yilun Liu, Huawei 2012 Laboratories, Text Machine Translation Lab

Team introduction: the Huawei Text Machine Translation Lab.

From AIOps to LLMOps: the strong generalization and language-understanding abilities of large models are driving the evolution of AIOps. Prompts act as a bridge between the human cognitive world and the digital world of large models: they clarify the model's intended task, provide clearer reasoning paths, and improve the efficiency of human-model interaction. Effective interaction strategies elicit content that matches human intent and needs, and training models on question-answer pairs helps them understand what humans want.

• Pain point 1: Traditional intelligent O&M algorithms depend on task-specific data, and expert annotation is time-consuming and labor-intensive.
• Pain point 2: Traditional O&M systems have poor interpretability and weak interactivity.
• Pain point 3: Unstable quality of prompt training data degrades model performance.
• Pain point 4: Insufficient coverage of prompt training data harms the comprehensiveness of AI capabilities.

Two lines of work address these pain points: LogPrompt (accepted at ICSE 2024 / ICPC 2024) and building the LLMOps data flywheel (accepted at ICDE 2024).

Motivation: Existing approaches rely on massive training data and lack interpretability
• Logs are semi-structured text, and new features continuously introduce unseen logs.
• Existing approaches perform well on log analysis tasks only when trained on the majority of logs (tested on the remaining 20% or 10% of logs); performance drastically declines when training data is limited to 10% or lower.
• The reliance on massive training data makes existing methods ineffective and inflexible in online scenarios.
• Existing methods only offer prediction values without rationales, limiting the understanding of incidents.
• Practitioners often need to spend extra effort to interpret the results and act (identify root causes, compose reports, etc.).

Motivation: Large language models have the potential to address these challenges, and prompts matter
• Large language models (LLMs) have a powerful generalization ability to unseen user instructions, so they may also be able to handle unseen logs in the online situation of log analysis.
• In our preliminary experiments, ChatGPT with a simple prompt achieved an F1-score of only 0.189 in anomaly detection. However, our best prompt strategy outperformed the simple prompt by 0.195.
• Since log analysis is a domain-specific task, applying a simple prompt to LLMs can result in poor performance.
• Unlike existing deep-learning methods, LLMs can also generate rationales, reports, etc.; log interpretation is essentially a writing task.
• Many prompt philosophies have been proposed for NLP tasks, such as CoT.
• The primary objective of LogPrompt is to improve the performance and interpretability of log analysis in the online scenario, through proper strategies of prompting LLMs.

Approach: The chain-of-thought (CoT) prompt
• The CoT prompt puts an example with intermediate thinking steps before an input problem (e.g., a math problem); the model is encouraged to follow the same thinking steps.
• The concept of chain of thought (CoT), a series of intermediate reasoning steps, was introduced by Wei et al. [1]. The CoT prompt emulates the human thought process by requiring the model to include thinking steps when addressing complex problems, and can enhance the performance of LLMs in challenging tasks, such as solving mathematical problems.
• Advantages of CoT prompting: break down unseen problems into manageable steps; enhance the interpretability and transparency of LLM outputs; unleash the abilities learned in the pre-training phase.

[1] J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., "Chain-of-thought prompting elicits reasoning in large language models," Advances in Neural Information Processing Systems, vol. 35, pp. 24824–24837, 2022.

Approach: Adapting the CoT prompt to the field of log analysis
• In manual log analysis, practitioners also engage in a series of reasoning steps to reach a conclusion, for instance when the boundary between a normal log and an abnormal log is unclear. To emulate this thinking process:
• Implicit CoT: Humans mostly have reasons before conclusions. Thus, the model is required to justify its decisions with reasons.
• Explicit CoT: We further explicitly define intermediate steps to regulate the thinking process, e.g., restricting the definition of anomaly to only alerts explicitly expressed in the log.

[Figure: comparison of the Standard Prompt and LogPrompt (CoT), both performing analysis based on a task description and input logs.]

Approach: Other prompt strategies (see the sketch after this section)
• In-context prompt: This approach uses samples of labeled logs to set the context for the task. The LLM then predicts on new logs, using the labeled examples as context.
• Format control: We employ two functions, f_X([X]) and f_Z([Z]), to establish the expected format and answer range, like "a binary choice between abnormal and normal", or "a parsed log template".
• Self-prompt: This strategy involves the LLM generating its own prompts. A meta-prompt describing the task instructs the LLM to generate candidate prompt prefixes, with the most effective prompt chosen based on performance.

Experiment: Evaluation setup
• The effectiveness of LogPrompt is evaluated mainly using the LogHub datasets, which contain real-world logs from nine different domains, including supercomputers, distributed systems, operating systems, mobile systems, and server applications.
• Two of the datasets (BGL and Spirit) were annotated by domain experts to identify anomalous events for the purpose of anomaly detection.
• To evaluate log parsing performance, eight of the datasets have log templates manually extracted from a subset of 2,000 log messages in each domain.
• All log data were timestamped, enabling the datasets to be split into train/test sets chronologically.
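To make the prompt strategies above concrete, below is a minimal sketch of how a LogPrompt-style prompt could be assembled, combining format control (f_X/f_Z), explicit CoT steps, and optional in-context examples. The slides do not give the exact prompt templates, so every string and function name here is an illustrative assumption, not the paper's wording.

```python
# Minimal, illustrative sketch of assembling a LogPrompt-style prompt.
# All template strings below are assumptions for demonstration purposes.

def f_x(logs: list[str]) -> str:
    """Format-control wrapper for the input slot [X]: numbered raw logs."""
    return "\n".join(f"({i + 1}) {log}" for i, log in enumerate(logs))

def f_z() -> str:
    """Format-control wrapper for the answer slot [Z]: constrain the answer range."""
    return ("For each log, answer with a binary choice between 'abnormal' and "
            "'normal', followed by a brief reason.")

def build_prompt(logs, examples=None):
    """Combine explicit CoT steps, optional in-context examples, and format control."""
    cot_steps = (
        "Think step by step: (1) treat a log as abnormal only if it contains "
        "an explicitly expressed alert; (2) give the reason before the conclusion."
    )
    parts = [cot_steps]
    if examples:  # in-context prompt: labeled logs set the task context
        parts += [f"Example log: {log}\nAnswer: {label}" for log, label in examples]
    parts += [f_z(), "Input logs:", f_x(logs)]
    return "\n\n".join(parts)

print(build_prompt(
    ["ERROR: disk quota exceeded on /dev/sda1", "session opened for user root"],
    examples=[("connection reset by peer", "abnormal")],
))
```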
Experiment: Setup details
• In our primary experiments, the underlying LLM is accessed via APIs provided by external services.
• The initial temperature coefficient is set to 0.5, maintaining a balance between increasing the model's reasoning capabilities through diverse token exploration and limiting detrimental randomness.
• If the response format is invalid, the query is resubmitted with the temperature coefficient increased by 0.4 until the response format is correct. The format failure rate is less than 1%, which is consistent with existing literature.
• The train/test datasets are split chronologically to simulate online scenarios.

Experiment: LogPrompt has a strong ability to handle scarcity of training data
• Task of log parsing: for each dataset, most baseline methods are trained on the first 10% of logs and evaluated on the remaining 90%, while LogPrompt is directly tested on the remaining 90% without any in-domain training data.
• We adopt the F1-score as the metric (see the sketch after this section). To calculate it, we tokenize the predicted log template into a list of tokens, then treat the tokens as results from a classification task over {template, variable}.
• LogPrompt achieved the best F1-score on six of the eight datasets, outperforming existing methods that require resources for in-domain training.
• Task of anomaly detection: for the baselines, the first 4,000 logs in each dataset are used for training. Both LogPrompt and the trained baselines are then tested on the remaining logs.
• We report the session-level F1-score of anomalies. A session is formed using fixed-window grouping with a length of 100 log templates.
• Despite the existing methods being trained on thousands of logs, LogPrompt still achieved strong performance on both datasets without utilizing any in-domain training data, with an average improvement of 55.9% in terms of F1-score.
• This advantage makes LogPrompt a suitable choice for log analysis in online scenarios.

Experiment: LogPrompt yields helpful and comprehensible content for practitioners during real-world log analysis
• A novel evaluation task of log interpretation: a total of 200 logs were randomly sampled for the human evaluation, accompanied by LogPrompt's actual outputs, with 100 logs related to log parsing and 100 related to anomaly detection (evenly distributed across domains).
• Incorrectly predicted logs (FPs and FNs) were not included in this evaluation. An equal number of normal and abnormal samples were included for anomaly detection, and each selected log for log parsing was required to contain at least one variable. Practitioners scored the outputs according to the criteria, independently.
• We reported two metrics for both tasks, in terms of usefulness and readability: average scores consistently exceeded four, and the average HIP was consistently above 80%, indicating content considered helpful and readable overall by experienced log analysts.

Experiment: More analysis of LogPrompt's interpretability
• Feedback from practitioners:
• "I appreciate the ability of LogPrompt in supporting the interpretation of logs from various domains. As our system continuously incorporates third-party services, sometimes I have to read through the manuals to decipher logs from unfamiliar domains. The explanations generated by LogPrompt can provide a swift grasp of the logs before I find the official definitions."
• "LogPrompt can definitely help in composing reports shortly after system crashes, where I often need to brief non-technical colleagues in meetings."
• "In the realm of software O&M, false alarms are an inescapable reality, with false positives imposing substantial time costs on engineers and false negatives causing severe ramifications. Accompanying automatic analysis outcomes with explanations enables engineers to more promptly ascertain the credibility of a purported anomaly, thereby reducing the time spent on subsequent actions."
• Bad case analysis: a major factor behind bad cases is the LLM's lack of domain knowledge, which leads to overly general interpretations of some domain-specific terms; for example, it may describe specific parameters only in generic terms. Another cause is the lack of semantic content in some logs, which can be attributed to their brevity or richness of non-NLP patterns (i.e., digits, codes and addresses).
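As a concrete reading of the log-parsing metric described above, here is a minimal sketch of a token-level F1 computation. It assumes whitespace tokenization, position-aligned templates, and a "<*>" wildcard marking variables; these are illustrative assumptions, not necessarily the paper's exact procedure.

```python
# Illustrative token-level F1 for log parsing: each token of the predicted
# template is classified as static "template" text or a "variable" placeholder,
# and F1 is computed with "variable" as the positive class.
# Assumptions (not from the slides): whitespace tokenization, templates
# aligned position by position, and "<*>" marking variable slots.

def template_token_f1(predicted: str, ground_truth: str, wildcard: str = "<*>") -> float:
    tp = fp = fn = 0
    for pred_tok, true_tok in zip(predicted.split(), ground_truth.split()):
        pred_var = pred_tok == wildcard
        true_var = true_tok == wildcard
        tp += pred_var and true_var
        fp += pred_var and not true_var
        fn += true_var and not pred_var
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Example: the prediction misses one variable slot ("30" should be "<*>").
print(template_token_f1(
    "Connection from <*> closed after 30 retries",
    "Connection from <*> closed after <*> retries",
))  # -> 0.666...
```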
Experiment: Ablation study on the three prompt strategies
• Compared to prompt 5, prompt 2 provides more formal and accurate words (such as "standardized" and "convert") and clearly outlines the intermediate steps for the task (identifying and replacing variables, then converting to a template).
• Interestingly, utilizing only the implicit CoT (requiring the model to generate reasons) can still improve model performance, likely because, with more NLP-style explanations, the distribution of the generated answers is closer to that seen in the models' pre-training phase.
• An overly long context prefixed to the prompt may cause LLMs to pay less attention to the new input logs, thereby deteriorating task performance, which is why the performance peaks at a moderate number of in-context examples.

Future work: Domain-adapting smaller-scale LLMs for compatibility with advanced prompt strategies
• Applying LogPrompt to a smaller-scale LLM: Vicuna-13B.
• The deployment of large, proprietary, API-dependent LLMs in local environments can be a challenging task, and the reliability of services may be compromised if the API services of the LLMs become unavailable.
• Therefore, for industrial usage, it is crucial for LogPrompt to be compatible with alternative open-source, privacy-protected, smaller-scale LLMs.
• Online log parsing with smaller-scale LLMs: although Vicuna has only 13B parameters, when equipped with LogPrompt it achieves log-parsing performance comparable to the 175B GPT model on the HDFS, Linux and Proxifier datasets.
• Additionally, when the prompt strategy transitions from a simple prompt to LogPrompt, Vicuna exhibits significant performance improvements, with an average increase of 380.7% in F1-score.
• As Vicuna is open-source and requires fewer resources for training and deployment, the success of LogPrompt on Vicuna holds promising implications for building in-domain industrial applications.
• The performance of smaller-scale LLMs like Vicuna still has room for improvement. Since the base model…
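Since the slides argue for compatibility with open-source, locally deployable models, below is a minimal sketch of swapping an external API backend for a local checkpoint via the Hugging Face transformers text-generation pipeline. The model id, generation parameters, and prompt text are assumptions for illustration, not the paper's exact configuration.

```python
# Illustrative sketch: running a LogPrompt-style prompt against a local
# open-source model instead of an external API. The checkpoint and generation
# settings are assumptions; a 13B model requires a correspondingly large GPU.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="lmsys/vicuna-13b-v1.5",  # assumed checkpoint for Vicuna-13B
)

prompt = (
    "Think step by step: treat a log as abnormal only if it contains an "
    "explicitly expressed alert, and give the reason before the conclusion.\n"
    "Answer with a binary choice between 'abnormal' and 'normal'.\n"
    "Input log: ERROR: disk quota exceeded on /dev/sda1"
)

# temperature mirrors the 0.5 setting reported in the experimental setup above
result = generator(prompt, max_new_tokens=128, temperature=0.5, do_sample=True)
print(result[0]["generated_text"])
```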
