System Management and Monitoring

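Deployment, Scheduling, and Monitoring of Big Data Systems
Wei Xu (徐葳)

Goals of this lecture
- Why system management matters
- From bare metal to a big data system
- Maintaining and managing global system state
- Consistency and Chubby / ZooKeeper
- Task scheduling
- Monitoring software and hardware systems

How to listen to this lecture
- Me: researcher and practitioner (split personality)

The blood and tears of a system administrator (me)
- The 200-node cluster I maintain
- Production support: 100s of researchers running "big data" workloads
- Systems research: self-driving big data infrastructure
- Everything managed by …

HPC: the good old days (for sysadmins)
- Rocks Cluster Rolls

John Boyle. Biology must develop its own big-data systems. Nature (World View). July 2013.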

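Demand 1: customers want flexibility …

Motivation 2: customers demand performance …
- Prof. David Haussler, biologist at UC Santa Cruz: "God Damn I/O!"

We have a variety of applications
- Scientific image processing
- Cryo-EM and protein structure
- Social "big data"
- Social networking
- Online education data

Lots of dependencies …
- Natural language processing (*Image courtesy of Prof. Gerard de Melo @ Tsinghua)

Resource hungry too …
- C, C++, Java
- Genome analysis

Customer's needs change …
- Protein design

System deployment: from bare metal to a big data system
- Source: Juju website

Basic idea: install one machine, then let it install all the other machines automatically
- Rocks Cluster Rolls
- Head node, compute nodes
- No applicable roll? == Sorry

Network and hardware configuration
- IPMI, DNS, Cisco router, RAID, BMC, VPN, firewall

Solution: customized servers + whole-rack delivery
- Open Data Center Committee (ODCC) whole-rack servers
- OCP whole-rack servers
- *Delivery and deployment at shipping-container scale
- Photo from Lintao Zhang

Hardware support
- How do we control a bare machine remotely?
- Intelligent Platform Management Interface (IPMI)
  - Implementation: a dedicated BMC chip
  - Functions: reboot the machine, console, voltage, temperature, network connection
- PXE (network boot)
  - Bootp, TFTP

Operating system and infrastructure
- CentOS, GPU drivers, LDAP, image servers, Cobbler, local repo, storage, OS optimization, network drivers, SSO, OS drivers, security, storage, SR-IOV

Solution: configuration management
- Ubuntu MAAS (Metal as a Service)
- Turn configuration into a program
- Popular configuration management tools
- Configuration management: visualization
- Figure from Juju website

Project requirements and deadlines
- Stage 0: choose a project and form a team (this week)
- Stage 1: first meeting with the users; submit a requirements analysis and project plan (November 11)
- Stage 2: meet the users at least once every two weeks; submit progress reports (November 25, December 9, December 23)
- Stage 3: project presentation (in class on December 30)
- Stage 4: submit the project report (end of week 17)

Course project comments
1. In quite a few teams' reports the background section strays far from the project itself, while the parts that actually concern the project are passed over in a sentence; it reads like padding.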

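2. Several teams have to crawl their own data, and the people writing the crawler are the same people building the system; try not to let the crawling delay the later parts.

3. Some teams' technical approach is so far only a survey of existing techniques and is not yet organized, which may slow down later progress.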

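4. The few teams whose projects were assigned to them all did quite well and deserve encouragement.

Course project: special reminders
- No plagiarism (citing the source does not make it acceptable)
- Borrowing some figures is fine, but the source must be credited

Questions about the Hadoop assignment?

Maintaining and managing global system state

Global system state: challenges
- GFS master
- MapReduce master
- Dryad master
- Problem: which node is the master? What if the master node dies?
- Solution? Find someone to decide who the master is
- Any problem with that?

Chubby's solution: a service that provides
- synchronization (leader election, shared env. info.)
- reliability
- availability
- easy-to-understand semantics
- performance, throughput, latency only secondary

Primary Election
- Distributed consensus problem
- Asynchronous communication: loss, delay, reordering
- Why is it hard?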

FLP impossibility result

A model: the Two Generals problem
- Two armies are on opposite sides of a city in the valley
- The two generals should coordinate the attack; each has an initial value (attack or retreat)
- The only communication is through sending messengers, who are prone to being captured or lost in the valley
- No deterministic algorithm for reaching consensus!
- Proof by contradiction

Fischer-Lynch-Paterson (FLP)
- Even if we have reliable message delivery …
- No consensus can be guaranteed in an asynchronous communication system in the presence of any failures (even a single failed process).
- Intuition: a "failed" process may just be slow, and can rise from the dead at exactly the wrong time.

Paxos Introduction
- Paxos is an asynchronous consensus algorithm.
- The FLP result says no asynchronous consensus algorithm can guarantee both safety and liveness.
- Paxos is guaranteed safe. Consensus is a stable property: once reached it is never violated; the agreed value is not changed.
- Paxos is not guaranteed live. Consensus is reached if "a large enough subnetwork ... is non-faulty for a long enough time." Otherwise Paxos might never terminate.

Paxos: the name

Paxos Consensus Model

Leslie Lamport
- Turing Award, 2013: "fundamental contributions to the theory and practice of distributed and concurrent systems, notably the invention of concepts such as causality and logical clocks, safety and liveness, replicated state machines, and sequential consistency"
- LaTeX
- Sequential consistency
- Byzantine fault tolerance
- Paxos algorithm
- Photo from Wikipedia

A Paxos Round

Replicated State Machine
- Maintain replicas by executing operations in exactly the same order
- Requires all replicas to "agree" on the (set and) order of operations
- The point: if one server fails, we can use the other servers, which have exactly the same state

Using Paxos
- Three (five) replicas

- Clients can contact any replica (not just the primary)
- Server appends each client op to a replicated *log* of operations
  - Put, Get, Update, Delete
- Numbered log entries ("instances", indexed by seq)
- Paxos agreement on the content of each log entry
- Note: each instance (log entry) is an entirely separate Paxos agreement with entirely separate proposal numbers

Using Paxos to replicate states
- Figure: a KV server with its Paxos peer (library) and the other peers; client ops such as GET(a) and PUT(a,b) fill the numbered instances (log entries) of the log

Example 1: Write
- Figure: Client 1 issues PUT(a,b); Kvpaxos servers S1, S2, S3 agree on "Log Entry 3, PUT(a,b)" and each appends it to its log

Example 2: Read
- Figure: Client 2 issues GET(a); S1, S2, S3 agree on "Log Entry 4, GET(a)"; the log is scanned up to Log Entry 4 (PUT(a,b), GET(a))

Consistent during a Partition
- Figure: a partition separates Kvpaxos servers S1, S2, S3; Clients 1, 2, 3 issue GET, POST, POST; requests on one side of the partition work, requests on the other side do not

Chubby
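Chubby keeps its own state in a Paxos-replicated log much like the one in the examples above, so a small sketch of a single Paxos instance (one log entry) may help before looking at its design. The Python below is only an in-process simulation under simplifying assumptions: direct method calls stand in for messages, and there are no crashes, retries, or competing concurrent proposers. The names `Acceptor` and `propose` are illustrative and come from neither the lecture nor any library. It shows why the majority side of a partition can still agree on a log entry while the minority side cannot, and why a later proposer must adopt a value that was already chosen.

```python
class Acceptor:
    """One replica's acceptor state for a single Paxos instance (one log entry)."""

    def __init__(self):
        self.promised = -1       # highest proposal number promised so far
        self.accepted_n = -1     # number of the proposal accepted so far, if any
        self.accepted_v = None   # value of that accepted proposal

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_v
        return False, -1, None

    def accept(self, n, value):
        """Phase 2: accept (n, value) unless a higher number was promised."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_v = value
            return True
        return False


def propose(n, value, reachable, cluster_size):
    """Run one round with proposal number n against the reachable acceptors.

    Returns the chosen value, or None if no majority could be reached.
    """
    majority = cluster_size // 2 + 1

    # Phase 1: collect promises from the acceptors we can reach.
    replies = [a.prepare(n) for a in reachable]
    promises = [(an, av) for ok, an, av in replies if ok]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the highest-numbered one.
    prev_n, prev_v = max(promises, key=lambda p: p[0])
    if prev_n >= 0:
        value = prev_v

    # Phase 2: ask the acceptors to accept the (possibly adopted) value.
    acks = sum(1 for a in reachable if a.accept(n, value))
    return value if acks >= majority else None


# Log entry 3 on servers S1, S2, S3, with S3 partitioned away at first.
s1, s2, s3 = Acceptor(), Acceptor(), Acceptor()
print(propose(1, "PUT(a,b)", [s1, s2], 3))      # majority side: PUT(a,b)
print(propose(2, "PUT(a,c)", [s3], 3))          # minority side: None
print(propose(3, "PUT(a,c)", [s1, s2, s3], 3))  # after healing: PUT(a,b)
```

The last call is the safety property in action: once PUT(a,b) has been chosen for the entry, a later proposer with a different value ends up re-proposing PUT(a,b) instead of overwriting it.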

Design: System Structure
- Two main components:
  - server (Chubby cell)
  - client library
- Figure from the Chubby paper

Design: Files, Dirs, Handles
- FS interface
  - /ls/cs6464-cell/lab2/test
- specialized API
- also via interface used by GFS

Lock Leases
- Session maintained through KeepAlives
- Handles, locks, cached data remain valid
  - client must acknowledge invalidation messages
- Terminated explicitly, or after lease timeout

Open source alternative: ZooKeeper
- Figure: the ZooKeeper service, a group of servers (one of them the leader) serving many clients
- All servers store a copy of the data (in memory)
- A leader is elected at startup
- Followers service clients; all updates go through the leader
- Update responses are sent when a majority of servers have persisted the change

Example use of ZooKeeper
- (Well-known address for ZooKeeper)
- Image copied from …

Task scheduling: problems and challenges

Problem: Resource Sharing in Data Centers
- Problem: no single framework is optimal for all applications
- Want to run multiple frameworks in a single cluster
  - … to maximize utilization
  - … to share data between frameworks
- Figure: Hadoop, Pregel, and MPI on a shared cluster
- Slide from Lintao Zhang

Solution: Resource Scheduler
- Figure: a resource manager assigning nodes to Hadoop, Pregel, …
- Slide from Lintao Zhang

What are the "demands"?
- Multiple users
- Jobs / tasks
- Each has different requirements
- Requests coming in over time (online)

What are the "resources"?
- CPU, RAM, disk space
- Networking
- Special constraints

  - Location, colocation
- Special hardware

Goals for the scheduler (1)
- What resources are available?
  - resource tracking (who already has what)
  - failure handling

Goals for the scheduler (2)
- Who can get what resource (and when)?
  - Fairness
  - Improve utilization
  - Improve average completion time
  - Improve power efficiency
  - (often) conflicting goals

Goals for the scheduler (3)
- How can the user access the resource?
  - naming

Other goals
- Ensure user isolation (containers, VMs)
- Allow users to monitor their services
- A description language / UI for resource specs

Task scheduling example: Borg

Borg
- 10+ years @ Google
- Managing millions of machines

Resources managed by Borg
- ~10,000 (median) servers per cell
- Heterogeneous machines
  - Size, processor type, external IPs, performance
  - Special hardware like SSD

Demands
- Jobs / tasks
- Different sizes
- Prod / non-prod
- Online and batch
- Requirement descriptions written in BCL
- Can "update" task requirements
  - Rolling updates

Borg Architecture
- Source: Borg EuroSys paper

How Borg achieved the goals

Resource Tracking
- Through Borglets (local agents on each machine)
  - Monitoring + executions
- (Logically) single central Borg master
  - Fault tolerant using Chubby (always knows which is the current master)
  - Records all jobs in a Paxos store

Borg Scheduling Policy
- Priority + admission control
- Used a scoring mechanism

  - Minimize the cost change when placing a job
  - Vs. "best fit"

Naming
- Borg names a process with an IP address + ports

  - To allow different jobs to run on a single machine
- Should this be done by the scheduler?

Other things Borg handles
- Package distribution (how to copy the binaries to all machines)
- Autoscaling
- Re-packing tasks
- Containers to do performance isolation
- Monitoring UI
- Debugging UI
- Tracing integration (later)
- BCL (Borg Configuration Language)
- Local disk management
- ……

Lessons
- The Borg master should be the kernel of the data center
  - Other things can move to separate services
- Should simplify naming and addressing management

- Should have multiple ways to group tasks (not necessarily jobs)
- Too many optimizations for power users, too complicated (230 specifications in BCL)
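The scoring idea from the Borg scheduling policy above (score the feasible machines, as opposed to plain "best fit") can be sketched roughly as follows. This is an illustration only: the `Machine` class, both score functions, and the normalized CPU/RAM fractions are assumptions made for the example, and `stranding_score` is a made-up stand-in, not Borg's actual scoring function.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    free_cpu: float   # free CPU as a fraction of the machine (0.0 to 1.0)
    free_ram: float   # free RAM as a fraction of the machine (0.0 to 1.0)


def feasible(m, cpu, ram):
    """A machine is a candidate only if the task fits on it."""
    return m.free_cpu >= cpu and m.free_ram >= ram


def best_fit_score(m, cpu, ram):
    """'Best fit': prefer the machine with the least total leftover."""
    return -((m.free_cpu - cpu) + (m.free_ram - ram))


def stranding_score(m, cpu, ram):
    """Hypothetical alternative: prefer balanced leftovers, so that
    neither CPU nor RAM is left stranded without the other."""
    return -abs((m.free_cpu - cpu) - (m.free_ram - ram))


def place(machines, cpu, ram, score):
    """Pick the feasible machine with the highest score and reserve resources."""
    candidates = [m for m in machines if feasible(m, cpu, ram)]
    if not candidates:
        return None   # the task stays pending
    best = max(candidates, key=lambda m: score(m, cpu, ram))
    best.free_cpu -= cpu
    best.free_ram -= ram
    return best.name


def fresh_cluster():
    return [Machine("a", 0.5, 0.8), Machine("b", 0.7, 0.7)]


print(place(fresh_cluster(), 0.4, 0.4, best_fit_score))    # a
print(place(fresh_cluster(), 0.4, 0.4, stranding_score))   # b
```

Best fit picks machine `a` because it leaves less in total, but that leaves `a` with plenty of RAM and almost no CPU; the second score picks `b`, whose leftovers stay balanced. Keeping the score function as a parameter makes it easy to compare policies on the same cluster.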

Open source: Kubernetes

Task scheduling: Mesos

Mesos Demo

Mesos Architecture
- Slide from Lintao Zhang

Resource Offering

Resource offers: offer available resources to frameworks, and let them pick which resources to use and which tasks to launch.
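A minimal sketch of such an offer loop, assuming a much-simplified model in which each free node is offered to the frameworks in turn and at most one framework takes it. The names `Offer`, `Framework`, and `offer_round` are hypothetical and are not the Mesos API. The optional `accepts` predicate stands in for the filter optimization the slide goes on to describe (for example "nodes with > 8 GB RAM"), which lets the master skip offers a framework would reject anyway.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    node: str
    cpus: int
    ram_gb: int


class Framework:
    def __init__(self, name, task_cpus, task_ram_gb, pending, accepts=None):
        self.name = name
        self.task_cpus = task_cpus
        self.task_ram_gb = task_ram_gb
        self.pending = pending                       # tasks still waiting to run
        self.accepts = accepts or (lambda offer: True)   # optional filter predicate

    def on_offer(self, offer):
        """The framework, not the master, decides how many tasks to launch."""
        fits = min(offer.cpus // self.task_cpus, offer.ram_gb // self.task_ram_gb)
        launched = min(self.pending, fits)
        self.pending -= launched
        return launched


def offer_round(free_nodes, frameworks):
    """Offer each free node to the frameworks in order; return what was launched."""
    launched = []
    for offer in free_nodes:
        for fw in frameworks:
            if not fw.accepts(offer):
                continue          # short-circuit: skip an offer that would be rejected
            count = fw.on_offer(offer)
            if count > 0:
                launched.append((offer.node, fw.name, count))
                break             # simplification: one framework per offer
    return launched


nodes = [Offer("n1", cpus=4, ram_gb=4), Offer("n2", cpus=16, ram_gb=32)]
hadoop = Framework("hadoop", task_cpus=2, task_ram_gb=4, pending=3,
                   accepts=lambda o: o.ram_gb > 8)
mpi = Framework("mpi", task_cpus=1, task_ram_gb=1, pending=2)
print(offer_round(nodes, [hadoop, mpi]))
# [('n1', 'mpi', 2), ('n2', 'hadoop', 3)]
```

The master here knows nothing about task sizes or placement preferences; all of that lives in `on_offer`, which is the division of labour the offer model is meant to achieve.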

- Keeps Mesos simple, lets it support future frameworks
- Decentralized decisions might not be optimal

Optimization: let frameworks short-circuit rejection by providing a predicate on the resources to be offered
- E.g. "nodes from list L" or "nodes with > 8 GB RAM"
- Could generalize to other hints as well
- Slide from Lintao Zhang

Task scheduling: Sparrow

Problem: what if the scheduler is too slow? Make it distributed?
- Centralized scheduler: knows the global resource state
- Decentralized schedulers: how do they synchronize state?

Figure: timeline of workloads against a task-duration axis (10 min., 10 sec., 100 ms, 1 ms)
- 2004: MapReduce batch job
- 2009: Hive query
- 2010: Dremel query
- 2012: Impala query
- 2010: In-memory Spark query
- 2013: Spark streaming

Scheduler throughput needed on 1000 16-core machines (16,000 cores; keeping every core busy takes about 16,000 / task duration decisions per second)
- 10 min. tasks: 26 decisions/second
- 10 sec. tasks: 1.6K decisions/second
- 100 ms tasks: 160K decisions/second
- 1 ms tasks: 16M decisions/second
- Figure from Kay Ousterhout et al., Sparrow presentation

Problems with multiple schedulers?
- Figure: several schedulers placing the tasks of a job on workers
- Figure from Kay Ousterhout et al., Sparrow presentation

Per-task sampling
- Power of Two Choices
- Figure: each scheduler probes a sample of workers for every task of a job
- Figure from Kay Ousterhout et al., Sparrow presentation
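Per-task sampling with the power of two choices can be tried out with a few lines of Python. This is a toy sketch under loud assumptions: queues only grow (tasks never finish), probing is instantaneous, and the function name `place_tasks` and all parameters are made up for the illustration. It does not model the rest of Sparrow's design.

```python
import random


def place_tasks(num_workers, num_tasks, probes, rng):
    """Place tasks one at a time: each task probes `probes` distinct random
    workers and joins the shortest of the probed queues."""
    queues = [0] * num_workers
    for _ in range(num_tasks):
        sampled = rng.sample(range(num_workers), probes)
        target = min(sampled, key=lambda w: queues[w])
        queues[target] += 1
    return queues


# probes=1 is purely random placement; probes=2 is the power of two choices.
random_placement = place_tasks(100, 10_000, probes=1, rng=random.Random(1))
two_choices = place_tasks(100, 10_000, probes=2, rng=random.Random(1))
print("longest queue with 1 probe :", max(random_placement))
print("longest queue with 2 probes:", max(two_choices))
```

With the average queue length fixed at 100 tasks per worker, the longest queue under two probes should stay much closer to that average than under random placement, and no scheduler needs a global view of the cluster to achieve it.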
