




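Deployment, Scheduling, and Monitoring of Big Data Systems
徐葳 (Wei Xu)

Goals of this lecture
- Why system administration matters
- From bare metal to a big data system
- Maintaining and managing global system state
- Consistency and Chubby / ZooKeeper
- Task scheduling
- Monitoring of hardware and software systems

How to listen to this lecture
- Me: researcher and practitioner (split personality)

The blood and tears of a system administrator (me)
- The 200-node cluster I maintain
- Production support: 100s of researchers running "big data" workloads
- Systems research: self-driving big data infrastructure
- Everything managed by …

HPC: the good old days (for sysadmins)
- Rocks Cluster Rolls
- John Boyle. Biology must develop its own big-data systems. Nature (World View). July 2013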
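Demand 1: customers want flexibility …

Motivation 2: customers demand performance …
- Prof. David Haussler, biologist at UC Santa Cruz
- "God Damn I/O!"

We have a variety of applications
- Scientific image processing
- Cryo-EM and protein structure
- Social "big data"
- Social networking
- Online education data

Lots of dependencies …
- Natural language processing
* Image courtesy of Prof. Gerard de Melo @ Tsinghua

Resource hungry too …
- C, C++, Java
- Genome analysis

Customer's needs change …
- Protein design

System deployment: from bare metal to a big data system
Source: Juju website

Basic idea: install one machine, then install all the other machines automatically
- Rocks Cluster Rolls
- Head node, compute nodes
- No applicable roll? == Sorry

Network and hardware configuration
- IPMI, DNS, Cisco router, RAID, BMC, VPN, firewall

Solution: customized servers + whole-rack delivery
- Open Data Center Committee (ODCC) whole rack
- OCP whole rack
* Container-scale delivery and deployment
Photo from Lintao Zhang

Hardware support
- How do you control a bare machine remotely?
- Intelligent Platform Management Interface (IPMI)
  - Implementation: a dedicated BMC chip
  - Functions: reboot the machine, console, voltage, temperature, network connection
- PXE (network boot): Bootp, TFTP

Operating system and basic infrastructure
- CentOS, GPU drivers, LDAP, image servers, Cobbler, local repo, storage, OS optimization, network drivers, SSO, OS drivers, security, storage, SR-IOV

Solution: configuration management
- Ubuntu MAAS (Metal as a Service)
- Turn configuration into programs
- Popular configuration management tools

Configuration management: visualization
Figure from Juju website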
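The idea of turning configuration into programs can be illustrated with a minimal sketch. It assumes a desired state written as data and a converge step that only performs the missing actions, so running it twice is harmless. The Node class, the package and service names, and converge() are hypothetical and do not correspond to any particular configuration management tool.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory stand-in for the state of one machine."""
    packages: set = field(default_factory=set)
    services: set = field(default_factory=set)


# Desired state, written as data instead of a manual checklist (made-up names).
DESIRED = {
    "packages": {"ldap-client", "gpu-driver"},
    "services": {"sshd"},
}


def converge(node, desired):
    """Bring the node to the desired state; return the actions that were taken."""
    actions = []
    for pkg in sorted(desired["packages"] - node.packages):
        node.packages.add(pkg)
        actions.append(f"install {pkg}")
    for svc in sorted(desired["services"] - node.services):
        node.services.add(svc)
        actions.append(f"start {svc}")
    return actions


node = Node(packages={"ldap-client"})
first_run = converge(node, DESIRED)    # performs only the missing steps
second_run = converge(node, DESIRED)   # idempotent: nothing left to do
print(first_run)
print(second_run)
```

Because the description is data and the converge step is idempotent, the same program can be applied to one machine or to every machine in the cluster.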
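Project requirements and deadlines
- Stage 0: choose a project and form a team (this week)
- Stage 1: first meeting with the users; submit the requirements analysis and project plan (November 11)
- Stage 2: meet with the users at least once every two weeks; submit progress reports (November 25, December 9, December 23)
- Stage 3: project presentation (in class on December 30)
- Stage 4: submit the project report (end of week 17)

Course project comments
1. For quite a few teams, the background section strays from the project itself, while the parts that actually relate to the project get only a passing mention; it reads a bit like padding.
2. A few teams need to crawl their own data, and the same people are writing the crawler and building the system; try not to let that hold up the later stages.
3. Some teams' technical approach is only a survey of existing technology and is not yet organized, which may affect later progress.
4. The few teams that had their project assigned all did quite well and deserve encouragement.

Course project: special reminders
- No plagiarism (adding the source does not make it acceptable)
- Quoting some figures is fine, but the source must be credited

Questions about the Hadoop assignment?

Maintaining and managing global system state

Global system state: challenges
- GFS -- master
- MapReduce -- master
- Dryad -- master
- Question: which node is the master? What if the master node goes down?
- A fix? Find someone to decide who the master is
- The problem with that?

Chubby's solution: a service that provides
- synchronization (leader election, shared environment information)
- reliability
- availability
- easy-to-understand semantics
- performance, throughput, and latency are only secondary

Primary election
- Distributed consensus problem
- Asynchronous communication: loss, delay, reordering
- Why is it hard?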
FLP impossibility result

A model: the two generals problem
- Two armies are on opposite sides of a city in the valley
- The two generals should coordinate the attack; each has an initial value (attack or retreat)
- The only communication is through sending messengers, which are prone to being captured or lost in the valley
- No deterministic algorithm for reaching consensus!
- Proof by contradiction

Fischer-Lynch-Paterson (FLP)
- Even if we have reliable message delivery …
- No consensus can be guaranteed in an asynchronous communication system in the presence of any failures.
- Intuition: a "failed" process may just be slow, and can rise from the dead at exactly the wrong time.

Paxos introduction
- Paxos is an asynchronous consensus algorithm.
- The FLP result says no asynchronous consensus algorithm can guarantee both safety and liveness.
- Paxos is guaranteed safe.
  - Consensus is a stable property: once reached it is never violated; the agreed value is not changed.
- Paxos is not guaranteed live.
  - Consensus is reached if "a large enough subnetwork ... is non-faulty for a long enough time."
  - Otherwise Paxos might never terminate.

Paxos: the name
Paxos Consensus Model

Leslie Lamport
- Turing Award, 2013: "fundamental contributions to the theory and practice of distributed and concurrent systems, notably the invention of concepts such as causality and logical clocks, safety and liveness, replicated state machines, and sequential consistency"
- LaTeX
- Sequential consistency
- Byzantine fault tolerance
- Paxos algorithm
Photo from Wikipedia

A Paxos Round
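Since the round itself survives only as a figure title, here is a minimal sketch of one single-decree Paxos round (prepare/promise, then accept). It is a synchronous, in-memory simulation: there is no real network, and failures are modelled only by the list of reachable acceptors. The names Acceptor, propose, and reachable are assumptions made for illustration and are not taken from the slides.

```python
class Acceptor:
    def __init__(self):
        self.promised = 0        # highest proposal number promised so far
        self.accepted_n = 0      # number of the accepted proposal (0 = none)
        self.accepted_v = None   # value of the accepted proposal

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_v
        return False, None, None

    def accept(self, n, v):
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n, self.accepted_v = n, v
            return True
        return False


def propose(acceptors, reachable, n, value):
    """Run one round with proposal number n; return the chosen value or None."""
    majority = len(acceptors) // 2 + 1
    replies = [a.prepare(n) for a in reachable]
    promises = [(an, av) for ok, an, av in replies if ok]
    if len(promises) < majority:
        return None
    # Safety rule: adopt the value of the highest-numbered accepted proposal.
    prev_n, prev_v = max(promises, key=lambda p: p[0])
    if prev_n > 0:
        value = prev_v
    acks = sum(a.accept(n, value) for a in reachable)
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors, acceptors[:2], n=1, value="PUT(a,b)")
# A later proposer with a different value must adopt the value already chosen.
second = propose(acceptors, acceptors[1:], n=2, value="PUT(a,c)")
# A proposal with a stale number is rejected by every acceptor.
stale = propose(acceptors, acceptors, n=1, value="PUT(a,d)")
print(first, second, stale)
```

The second call shows the safety property from the introduction: once a value is chosen, later rounds can only choose the same value. Liveness is not modelled here; competing proposers could keep pre-empting each other.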
Replicated State Machine
- Maintain replicas by executing operations in exactly the same order
- Requires all replicas to "agree" on the (set and) order of operations
- The point: if one server fails, clients can use the other servers, which have exactly the same state

Using Paxos
- Three (five) replicas
- Clients can contact any replica (not just the primary)
- The server appends each client op to a replicated *log* of operations
  - Put, Get, Update, Delete
- Numbered log entries ("instances", seq)
- Paxos agreement on the content of each log entry
- Note: each instance (log entry) is an entirely separate Paxos agreement with entirely separate proposal numbers

Using Paxos to replicate states
- Diagram: client ops arrive at the KV server, which uses a Paxos peer (library) to talk to the other peers
- Log of numbered instances (log entries): GET(a), PUT(a,b), …

Example 1: Write
- Diagram: Client 1 sends PUT(a,b); Kvpaxos servers S1, S2, S3 agree on "LogEntry 3, PUT(a,b)"
- Each log now holds PUT(a,b) as log entry 3

Example 2: Read
- Diagram: Client 2 sends GET(a); Kvpaxos servers S1, S2, S3 agree on "LogEntry 4, GET(a)"
- Each log now reads …, PUT(a,b), GET(a)
- Scan up to LogEntry 4

Consistent during a partition
- Diagram: Kvpaxos servers S1, S2, S3 are split by a partition; clients 1, 2, 3 issue GET, POST, POST
- One side of the partition: works!
- The other side: does not work!
- Paxos needs a majority of the servers for every log entry, so only the side that still holds a majority can make progress
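The read and write examples above can be sketched as a key-value server that appends every operation to an agreed log and then replays the log up to its own entry. The agreement step is stubbed out here: a plain Python list stands in for the Paxos library, and a reachable count stands in for the partition. ReplicatedLog, KVServer, and all parameter names are hypothetical.

```python
class ReplicatedLog:
    """Stand-in for the Paxos library: an agreed, ordered list of operations."""

    def __init__(self, n_servers):
        self.n_servers = n_servers
        self.entries = []

    def append(self, op, reachable):
        """Agree on the next entry; needs a majority of servers reachable."""
        if reachable <= self.n_servers // 2:
            raise RuntimeError("no majority: cannot agree on a log entry")
        self.entries.append(op)
        return len(self.entries) - 1   # sequence number of the new entry


class KVServer:
    def __init__(self, log):
        self.log = log
        self.state = {}
        self.applied = 0   # number of log entries already applied

    def _catch_up(self, seq):
        """Apply log entries in order, up to and including entry seq."""
        while self.applied <= seq:
            kind, key, value = self.log.entries[self.applied]
            if kind == "PUT":
                self.state[key] = value
            self.applied += 1

    def put(self, key, value, reachable):
        seq = self.log.append(("PUT", key, value), reachable)
        self._catch_up(seq)

    def get(self, key, reachable):
        # The read is itself a log entry, so it is ordered after earlier writes.
        seq = self.log.append(("GET", key, None), reachable)
        self._catch_up(seq)
        return self.state.get(key)


log = ReplicatedLog(n_servers=3)
s1, s2 = KVServer(log), KVServer(log)
s1.put("a", "b", reachable=3)
value_seen_by_s2 = s2.get("a", reachable=2)   # majority side of a partition
try:
    s1.get("a", reachable=1)                  # minority side of a partition
    minority_ok = True
except RuntimeError:
    minority_ok = False
print(value_seen_by_s2, minority_ok)
```

Putting the GET into the log is what lets any replica answer it consistently: the replica only replies after applying every earlier entry.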
Chubby

Design: System Structure
- Two main components: server (Chubby cell), client library
Figure from the Chubby paper

Design: Files, Dirs, Handles
- FS interface
  - /ls/cs6464-cell/lab2/test
  - specialized API
  - also via an interface used by GFS

Lock Leases
- Session maintained through KeepAlives
- Handles, locks, cached data remain valid
  - client must acknowledge invalidation messages
- Terminated explicitly, or after lease timeout

Open source alternative: ZooKeeper
- Diagram: a ZooKeeper service made of several servers, one of them the leader, serving many clients
- All servers store a copy of the data (in memory)
- A leader is elected at startup
- Followers service clients; all updates go through the leader
- Update responses are sent when a majority of servers have persisted the change

Example use of ZooKeeper
- (Well-known address for ZooKeeper)
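To make the lease mechanism concrete, the following is a minimal sketch of master election on top of a lock with a lease. It is a toy, single-process stand-in for a lock service: it is not replicated, and it is not the Chubby or ZooKeeper API. LockService, its methods, and the explicit now argument (used instead of a real clock so the example is deterministic) are all assumptions.

```python
class LockService:
    """Toy, single-process stand-in for a lock service (not fault tolerant)."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, client, now):
        """Grant the lock if it is free or the previous lease has expired."""
        if self.holder is None or now >= self.expires_at:
            self.holder = client
        if self.holder == client:
            self.expires_at = now + self.lease_seconds
            return True
        return False

    def keep_alive(self, client, now):
        """Extend the lease; fails if the client no longer holds the lock."""
        if self.holder == client and now < self.expires_at:
            self.expires_at = now + self.lease_seconds
            return True
        return False

    def current_master(self, now):
        """Anyone can ask who the master is at a well-known place."""
        return self.holder if now < self.expires_at else None


svc = LockService(lease_seconds=10)
a_wins = svc.try_acquire("A", now=0)         # A becomes master
b_loses = svc.try_acquire("B", now=1)        # lock is held, B stays a standby
svc.keep_alive("A", now=8)                   # A renews its lease
master_at_9 = svc.current_master(now=9)
# A crashes and stops sending keep-alives; its lease runs out at t=18.
b_takes_over = svc.try_acquire("B", now=20)
a_late = svc.keep_alive("A", now=21)         # the old master finds out it lost
print(master_at_9, b_takes_over, a_late)
```

The lease answers the earlier question of what happens when the master goes down: nobody has to detect the crash explicitly, because the lock frees itself once the keep-alives stop.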
Task scheduling: problems and challenges

Problem: resource sharing in data centers
- No single framework is optimal for all applications
- Want to run multiple frameworks in a single cluster
  - … to maximize utilization
  - … to share data between frameworks
- Diagram: Hadoop, Pregel, and MPI on a shared cluster
Slide from Lintao Zhang

Solution: resource scheduler
- Diagram: a resource manager assigns nodes to Hadoop, Pregel, …
Slide from Lintao Zhang

What are the "demands"?
- Multiple users
- Jobs -- tasks
- Each has different requirements
- Requests come in over time (online)

What are the "resources"?
- CPU, RAM, disk space
- Networking
- Special constraints
  - Location, colocation
  - Special hardware

Goals for the scheduler (1)
- What resources are available?
  - resource tracking (who already has what)
  - failure handling

Goals for the scheduler (2)
- Who can get what resource (and when)?
  - Fairness
  - Improve utilization
  - Improve average completion time
  - Improve power efficiency
  - (often) conflicting goals

Goals for the scheduler (3)
- How can the user access the resource?
  - naming

Other goals
- Ensure user isolation (containers, VMs)
- Allow users to monitor their services
- A description language / UI for resource specs
Task scheduling example: Borg

Borg
- 10+ years @ Google
- Managing millions of machines

Resources managed by Borg
- ~10,000 (median) servers per cell
- Heterogeneous machines
  - Size, processor type, external IPs, performance
  - Special hardware like SSD

Demands
- Job, tasks
- Different sizes
- Prod / non-prod
- Online and batch
- Requirement descriptions written in BCL
- Can "update" task requirements
  - Rolling updates

Borg architecture
Source: Borg EuroSys paper

How Borg achieved the goals

Resource tracking
- Through Borglets (local agents on each machine)
  - Monitoring + execution
- (logically) single central Borg Master
  - Fault tolerant using Chubby (always knows which is the current master)
  - Records all jobs in a Paxos store

Borg scheduling policy
- Priority + admission control
- Uses a scoring mechanism
  - Minimize the cost change when placing a job
  - vs. "best fit"

Naming
- Borg names a process with an IP address + ports
  - To allow different jobs to run on a single machine
- Should this be done by the scheduler?

Other things Borg handles
- Package distribution (how to copy the binaries to all machines)
- Autoscaling
- Re-packing tasks
- Containers for performance isolation
- Monitoring UI
- Debugging UI
- Tracing
- Integration (later)
- BCL (Borg Configuration Language)
- Local disk management
- ……

Lessons
- The Borg master should be the kernel of the data center
  - Other things can move to separate services
- Should simplify naming and addressing management
- Should have multiple ways to group tasks (not necessarily jobs)
- Too many optimizations for power users; too complicated (230 specifications in BCL)
- Open source: Kubernetes
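The contrast between a "cost change" score and "best fit" from the scheduling policy slide can be sketched as follows. This is not Borg's actual scoring function: the squared-utilization cost, the Machine fields, and the function names are assumptions chosen only to show that two scoring rules can pick different machines for the same task.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    name: str
    cpu_free: float
    ram_free: float
    cpu_total: float
    ram_total: float


def feasible(m, cpu, ram):
    return m.cpu_free >= cpu and m.ram_free >= ram


def best_fit_score(m, cpu, ram):
    """Lower is better: leftover capacity after placement (tight packing)."""
    return (m.cpu_free - cpu) / m.cpu_total + (m.ram_free - ram) / m.ram_total


def cost_change_score(m, cpu, ram):
    """Lower is better: increase of a hypothetical cost (squared utilization)."""
    def cost(cpu_free, ram_free):
        u_cpu = 1 - cpu_free / m.cpu_total
        u_ram = 1 - ram_free / m.ram_total
        return u_cpu ** 2 + u_ram ** 2
    return cost(m.cpu_free - cpu, m.ram_free - ram) - cost(m.cpu_free, m.ram_free)


def place(machines, cpu, ram, score):
    """Feasibility check first, then pick the machine with the lowest score."""
    candidates = [m for m in machines if feasible(m, cpu, ram)]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda m: score(m, cpu, ram))
    chosen.cpu_free -= cpu
    chosen.ram_free -= ram
    return chosen.name


def make_cluster():
    return [
        Machine("busy", cpu_free=4, ram_free=8, cpu_total=16, ram_total=64),
        Machine("idle", cpu_free=16, ram_free=64, cpu_total=16, ram_total=64),
        Machine("full", cpu_free=1, ram_free=2, cpu_total=16, ram_total=64),
    ]


packed = place(make_cluster(), cpu=2, ram=4, score=best_fit_score)
spread = place(make_cluster(), cpu=2, ram=4, score=cost_change_score)
print(packed, spread)
```

Best fit packs the task onto the already busy machine and keeps the idle one free for large tasks, while this particular cost function spreads load. Which trade-off is right depends on the conflicting goals listed earlier.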
Task scheduling: Mesos

Mesos demo

Mesos architecture
Slide from Lintao Zhang

Resource offering
- Resource offers: offer available resources to frameworks, let them pick which resources to use and which tasks to launch
  - Keeps Mesos simple, lets it support future frameworks
  - Decentralized decisions might not be optimal
- Optimization: let frameworks short-circuit rejection by providing a predicate on resources to be offered
  - E.g. "nodes from list L" or "nodes with > 8 GB RAM"
  - Could generalize to other hints as well
Slide from Lintao Zhang
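The offer mechanism and the predicate optimization described above can be sketched as a single offer round. This is an illustration only and not the Mesos API: frameworks are modelled as plain dictionaries with two hypothetical callables, filter (the predicate on resources) and accept (the framework's own decision on what to launch).

```python
def offer_round(free_nodes, frameworks):
    """Offer each free node to the frameworks in turn; return the placements."""
    placements = {}
    for node in free_nodes:
        for fw in frameworks:
            # Predicate supplied by the framework: skip offers it would reject.
            if not fw["filter"](node):
                continue
            task = fw["accept"](node)   # the framework decides what to launch
            if task is not None:
                placements[node["name"]] = (fw["name"], task)
                break
    return placements


nodes = [{"name": "n1", "ram_gb": 4}, {"name": "n2", "ram_gb": 16}]

hadoop = {
    "name": "hadoop",
    "filter": lambda n: n["ram_gb"] > 8,        # "nodes with > 8 GB RAM"
    "accept": lambda n: "map-task",
}
mpi = {
    "name": "mpi",
    "filter": lambda n: n["name"] in {"n1"},    # "nodes from list L"
    "accept": lambda n: "rank-0",
}

placements = offer_round(nodes, [hadoop, mpi])
print(placements)
```

The scheduler never needs to understand what a map task or an MPI rank is; it only hands out offers, which is the simplicity argument made on the slide.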
Task scheduling: Sparrow

Question: what if the scheduler is too slow? Go distributed?
- Centralized scheduler: knows the global state of the resources
- Distributed schedulers: how do they synchronize state?

Task duration timeline (axis: 10 min, 10 sec, 100 ms, 1 ms)
- 2004: MapReduce batch job
- 2009: Hive query
- 2010: Dremel query
- 2012: Impala query
- 2010: in-memory Spark query
- 2013: Spark streaming

Scheduler throughput on 1000 16-core machines (16,000 task slots divided by the task duration)
- 10 min tasks: 26 decisions/second
- 10 sec tasks: 1.6K decisions/second
- 100 ms tasks: 160K decisions/second
- 1 ms tasks: 16M decisions/second
Figure from Kay Ousterhout et al., Sparrow presentation

The problem with multiple schedulers?
- Diagram: a job arrives at one of several schedulers, which all place tasks on the same workers
Figure from Kay Ousterhout et al., Sparrow presentation

Per-task sampling
- Power of two choices
- Diagram: for each task, the scheduler probes workers and places the task on one of them
Figure from Kay Ousterhout et al., Sparrow presentation