
Detecting, Managing, and Diagnosing Failures with FUSE
John Dunagan, Juhan Lee (MSN), Alec Wolman
WIP

Goals & Target Environment
  • Improve the ability of large internet portals to gain insight into failures
  • Non-goals:
      • Masking failures
      • Using machine learning to infer abnormal behavior

MSN Background
  • Messenger, Hotmail, Search, many other "properties"
  • Large (100 million users)
  • Sources of complexity:
      • Multiple data centers
      • Large number of machines
      • Complex internal network topology
      • Diversity of applications and software infrastructure

The Plan
  • Detecting, managing, and diagnosing failures
  • Review MSN's current approaches
  • Describe our solution at a high level

Detecting Failures
  • Monitor system availability with heartbeats
  • Monitor application availability & quality of service using synthetic requests
  • Customer complaints: telephone, email
  • Problems:
      • These approaches provide limited coverage; it is harder to catch failures that don't affect every request
      • Data on detected failures often lacks the detail needed to suggest a remedy: which front end is flaky? Which app component caused the end-user failure?

Managing Failures
  • Definition:
      • Ability to prioritize failures
      • Detecting component service degradation
      • Characterizing app stability
      • Capacity planning
  • When server "x" fails, what is the impact of this failure?
  • Better use of ops and engineering resources
  • Current approach: no systematic attempt to provide this functionality

Our solution (in 2 steps): Detecting and Managing Failures
  • Step 1: Instrument applications to track user requests across the "service chain"
      • Each request is tagged with a unique id
      • The service chain is composed on-the-fly with the help of app instrumentation
      • For each request:
          • Collect per-hop performance information
          • Collect per-request failure status
      • Centralized data collection

What kinds of failures?
  • We can handle:
      • Machine failures
      • Network connectivity problems
  • Most:
      • Misconfiguration
      • Application bugs
  • But not all:
      • Application errors where the app itself doesn't detect that there is a problem

Diagnosing Failures
  • Assigning responsibility to a specific hw or sw component
      • Insight into internals of a component
      • Cross-component interactions
  • Current approach: instrument applications
      • App-specific log messages
  • Problems:
      • High request rates lead to log rollover
      • Perceived overhead leads to detailed logging being enabled during testing and disabled in production

FUSE Background
  • FUSE (OSDI 2004): lightweight agreement on only one thing: whether or not a failure has occurred
  • Lack of a positive ack = failure

Step 2: Conditional Logging
  • Step 2: Implement "conditional logging" to significantly reduce the overhead of collecting detailed logs across different machines in the service chain
  • Step 1 provides the ability to identify a request across all participants in the service chain; FUSE provides agreement on failure status across that chain
  • While the fate is undecided: detailed log messages are stored in main memory
      • The common-case overhead of logging is vastly reduced
  • Once the fate of the service chain is decided, we discard app logs for successful requests and save logs for failures
      • The quantity of data generated is manageable when most requests are successful

Example
  • Benefits:
      • FUSE allows monitoring of real transactions
          • All transactions, or a sampled subset to control overhead
      • When a request fails, FUSE provides an audit trail
          • How far did it get?
          • How long did each step take?
          • Any additional application-specific context
      • FUSE can be deployed incrementally

Issues
  • Overload policy: need to handle bursts of failures without inducing more failures
  • How much effort does it take to make apps FUSE-enabled?
  • Are the right components FUSE-enabled?
  • Identifying and filtering false positives
  • Tracking request flow is non-trivial with network load balancers

Status
  • We've implemented FUSE for MSN, integrated with the ASP.NET rendering engine
  • Testing in progress
  • Roll-out at end of summer

Backups

FUSE is Easy to Integrate
  • Example current code on Front End:
        ReceiveRequestFromClient();
        SendRequestToBackEnd();
  • Example code on F
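
The conditional logging idea from Step 2 can be illustrated with a minimal Python sketch. This is not the FUSE API or the MSN implementation: the class `ConditionalLogger` and its methods `new_request`, `log`, `resolve`, and `pending_count` are hypothetical names chosen for illustration, and the request's fate is passed in as a plain boolean rather than being agreed on across the service chain. The sketch only shows the buffering policy: detailed messages stay in memory while the fate is undecided, are discarded on success, and are kept on failure.

```python
import uuid
from collections import defaultdict


class ConditionalLogger:
    """Hypothetical sketch: buffer per-request logs until the request's fate is known."""

    def __init__(self):
        # request id -> list of (hop, message) held in main memory
        self._pending = defaultdict(list)
        # request id -> messages kept because the request failed
        self.saved = {}

    def new_request(self):
        # Step 1: tag each request with a unique id.
        return uuid.uuid4().hex

    def log(self, request_id, hop, message):
        # While the fate is undecided, detailed messages stay in memory only.
        self._pending[request_id].append((hop, message))

    def resolve(self, request_id, succeeded):
        # Once the fate is decided: discard logs on success, keep them on failure.
        messages = self._pending.pop(request_id, [])
        if not succeeded:
            self.saved[request_id] = messages

    def pending_count(self):
        # Number of requests whose fate is still undecided.
        return len(self._pending)


logger = ConditionalLogger()
ok_id = logger.new_request()
bad_id = logger.new_request()
logger.log(ok_id, "front-end", "request received")
logger.log(bad_id, "front-end", "request received")
logger.log(bad_id, "back-end", "no positive ack")
logger.resolve(ok_id, succeeded=True)    # logs for the successful request are dropped
logger.resolve(bad_id, succeeded=False)  # logs for the failed request form the audit trail
```

In this sketch only the failed request leaves anything behind, which is why the saved volume stays small when most requests succeed. A real deployment would also need the overload policy raised under Issues, for example a cap on how much is buffered during a burst of failures.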
