




Deployment, Scheduling, and Monitoring of Big Data Systems
徐葳 (Xu Wei)

Goals of this lecture
- Why system management matters
- From bare metal to a big data system
- Maintaining and managing global system state
- Consistency and Chubby / ZooKeeper
- Task scheduling
- Monitoring of hardware and software systems

How to follow this lecture
- Me: researcher and practitioner
- Split personality

The blood and tears of a sysadmin (me)
- The 200-node cluster I maintain
- Production support: 100s of researchers running “big data” workloads
- Systems research: self-driving big data infrastructure
- Everything managed by …

HPC: the good old days (for sysadmins)
- Rocks Cluster Rolls

John Boyle. Biology must develop its own big-data systems. Nature (World View). July 2013.

Demand 1: customers want flexibility …

Motivation 2: customers demand performance …
- Prof. David Haussler, biologist at UC Santa Cruz: “God damn I/O!”

We have a variety of applications
- Scientific image processing
- Cryo-EM and protein structure
- Social “big data”
- Social networking
- Online education data

Lots of dependencies …
- Natural language processing
* Image courtesy of Prof. Gerard de Melo @ Tsinghua

Resource hungry too …
- C, C++, Java
- Genome analysis

Customer’s needs change …
- Protein design

System deployment: from bare metal to a big data system
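(Figure source: Juju website)

Basic idea: install one machine, then let it install all the other machines automatically
- Rocks Cluster Rolls
- Head node, compute nodes
- No applicable roll? == Sorry

Network and hardware configuration
- IPMI, DNS, Cisco router, RAID, BMC, VPN, firewall

Solution: customized servers + whole-rack delivery
- Open Data Center Committee (ODCC) whole rack
- OCP whole rack
* Delivery and deployment at shipping-container scale
(Photo from Lintao Zhang)

Hardware support
- How do you control bare metal remotely?
- Intelligent Platform Management Interface (IPMI)
  - Implementation: a dedicated BMC chip
  - Functions: reboot the machine, console, voltage, temperature, network connection
- PXE (network boot): Bootp, TFTP

Operating system and infrastructure
- CentOS, GPU drivers, LDAP, image servers, Cobbler, local repo, storage, OS optimization, network drivers, SSO, OS, drivers, security, storage, SR-IOV

Solution: configuration management
- Ubuntu MAAS (Metal as a Service)
- Turn configuration into programs
- Popular configuration management tools

Configuration management: visualization
(Figure from Juju website)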
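The idea of turning configuration into a program can be sketched as follows: the desired state is written down as data, and a small routine converges a machine toward it. This is a minimal illustration in plain Python, not the interface of MAAS, Juju, or any other real tool; `Node`, `DESIRED`, and `converge` are hypothetical names, and the "machine" is only an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """In-memory stand-in for one machine (hypothetical; no real host is touched)."""
    name: str
    packages: set = field(default_factory=set)
    services: set = field(default_factory=set)


# Desired state written as data: the configuration expressed as a program input.
DESIRED = {
    "packages": {"openjdk", "hadoop"},
    "services": {"sshd", "datanode"},
}


def converge(node, desired):
    """Apply only the missing pieces and return the list of actions taken."""
    actions = []
    for pkg in sorted(desired["packages"] - node.packages):
        node.packages.add(pkg)
        actions.append(f"install {pkg}")
    for svc in sorted(desired["services"] - node.services):
        node.services.add(svc)
        actions.append(f"start {svc}")
    return actions


node = Node("compute-0-1", packages={"openjdk"})
first_run = converge(node, DESIRED)
second_run = converge(node, DESIRED)  # idempotent: nothing is left to do
print(first_run)
print(second_run)
```

Running the routine twice is the point of the sketch: the second run finds nothing to change, which is the property that makes it safe to re-apply the same configuration to hundreds of nodes.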
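Project requirements and deadlines
- Phase 0: choose a project and form a team (this week)
- Phase 1: first meeting with the user; submit the requirements analysis and project plan (November 11)
- Phase 2: meet with the user at least once every two weeks; submit progress reports (November 25, December 9, December 23)
- Phase 3: project presentation (in class on December 30)
- Phase 4: submit the project report (end of week 17)

Course project comments
1. In quite a few groups the background section strays from the project itself, while the parts that concern the project are skimmed over; it reads a bit like padding.
2. Several groups need to crawl their own data, and the same people write the crawler and build the system; try not to let this delay the later parts.
3. Some groups present a technical roadmap that is only an introduction to existing technologies and is not yet organized, which may affect later progress.
4. The few groups that were assigned volunteer projects all did quite well and deserve encouragement.

Course project: special reminders
- No plagiarism (citing the source does not make it acceptable).
- Using some images is fine, but the source must be credited.

Hadoop homework: questions?

Maintaining and managing global system state

Global system state: challenges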
- GFS: master
- MapReduce: master
- Dryad: master
- Questions: which node is the master? What if the master node dies?
- A fix? Find someone to decide who the master is.
- And the problem with that?

Chubby's solution: a service that provides
- synchronization (leader election, shared env. info.)
- reliability
- availability
- easy-to-understand semantics
- performance, throughput, latency only secondary

Primary election
- A distributed consensus problem
- Asynchronous communication: loss, delay, reordering

Why is it hard?
- The FLP impossibility result

A model: the Two Generals' Problem
- Two armies are on opposite sides of a city in the valley.
- The two generals should coordinate the attack; each has an initial value (attack or retreat).
- The only communication is through messengers, who are prone to being captured or lost in the valley.
- No deterministic algorithm can reach consensus!
- Proof by contradiction

Fischer-Lynch-Paterson (FLP)
- Even if we have reliable message delivery …
- No consensus can be guaranteed in an asynchronous communication system in the presence of any failures.
- Intuition: a “failed” process may just be slow, and can rise from the dead at exactly the wrong time.

Paxos introduction
- Paxos is an asynchronous consensus algorithm.
- The FLP result says no asynchronous consensus algorithm can guarantee both safety and liveness.
- Paxos is guaranteed safe.
  - Consensus is a stable property: once reached it is never violated; the agreed value is not changed.
- Paxos is not guaranteed live.
  - Consensus is reached if “a large enough subnetwork ... is non-faulty for a long enough time.”
  - Otherwise Paxos might never terminate.
Paxos: the name

Paxos Consensus Model

Leslie Lamport
- Turing Award, 2013: “fundamental contributions to the theory and practice of distributed and concurrent systems, notably the invention of concepts such as causality and logical clocks, safety and liveness, replicated state machines, and sequential consistency”
- LaTeX
- Sequential consistency
- Byzantine fault tolerance
- Paxos algorithm
(Photo from Wikipedia)

A Paxos Round
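One Paxos round can be sketched as a prepare phase followed by an accept phase. The code below is a simplified single-decree model under stated assumptions: everything runs in one process, message loss is modeled only by leaving acceptors out of the `reachable` list, and nothing is persisted. `Acceptor` and `propose` are illustrative names, not an existing library.

```python
class Acceptor:
    """One Paxos acceptor; state is kept in memory only."""

    def __init__(self):
        self.promised = -1      # highest proposal number promised so far
        self.accepted_n = -1    # number of the accepted proposal, if any
        self.accepted_v = None  # value of the accepted proposal, if any

    def prepare(self, n):
        """Phase 1: promise to ignore proposals numbered below n."""
        if n > self.promised:
            self.promised = n
            return True, self.accepted_n, self.accepted_v
        return False, -1, None

    def accept(self, n, value):
        """Phase 2: accept unless a higher-numbered promise was made."""
        if n >= self.promised:
            self.promised = n
            self.accepted_n = n
            self.accepted_v = value
            return True
        return False


def propose(reachable, cluster_size, n, value):
    """Run one round against the reachable acceptors.

    Returns the chosen value, or None if no majority answered.
    """
    majority = cluster_size // 2 + 1
    promises = []
    for acceptor in reachable:
        ok, accepted_n, accepted_v = acceptor.prepare(n)
        if ok:
            promises.append((accepted_n, accepted_v))
    if len(promises) < majority:
        return None
    # Safety rule: adopt the value of the highest-numbered accepted proposal.
    best_n, best_v = max(promises, key=lambda p: p[0])
    if best_n >= 0:
        value = best_v
    acks = sum(1 for acceptor in reachable if acceptor.accept(n, value))
    return value if acks >= majority else None


acceptors = [Acceptor() for _ in range(3)]
first = propose(acceptors[:2], 3, n=1, value="PUT(a,b)")   # S1 and S2 reachable
second = propose(acceptors[1:], 3, n=2, value="PUT(a,c)")  # S2 and S3 reachable
stale = propose(acceptors, 3, n=1, value="PUT(a,d)")       # old proposal number
print(first, second, stale)
```

The second proposer asks for a different value but ends up re-proposing the one already chosen, which is the safety property; the stale proposer gets no promises and has to retry with a higher number, which is where liveness can be lost.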
Replicated State Machine
- Maintain replicas by executing operations in exactly the same order.
- Requires all replicas to “agree” on the (set and) order of operations.
- The point: if one server fails, clients can use the other servers, which have exactly the same state.

Using Paxos
- Three (five) replicas
- Clients can contact any replica (not just the primary).
- The server appends each client op to a replicated *log* of operations: Put, Get, Update, Delete.
- Numbered log entries: “instances”, identified by seq
- Paxos agreement on the content of each log entry
- Note: each instance (log entry) is an entirely separate Paxos agreement with entirely separate proposal numbers.

Using Paxos to replicate state
- Diagram: a KV server sits on top of a Paxos peer (library) that talks to the other peers.
- The log holds client ops (GET(a), PUT(a,b), …), indexed by instance (log entry) number.

Example 1: Write
- Kvpaxos servers S1, S2, S3; Client 1 sends PUT(a,b).
- Each server records: log entry 3, PUT(a,b).
- Log entry 3 is PUT(a,b).

Example 2: Read
- Kvpaxos servers S1, S2, S3; Client 2 sends GET(a).
- Each server records: log entry 4, GET(a).
- The log now reads PUT(a,b), GET(a), …
- To answer the GET at log entry 4, scan the log up to log entry 4.

Consistent during a Partition
- Kvpaxos servers S1, S2, S3; Clients 1, 2, 3 issue GET, POST, POST.
- A partition splits the system: requests on one side work, requests on the other side do not (only a side that still holds a majority of the servers can reach agreement).
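The read example above, where a server scans the log up to the entry holding the GET, can be sketched like this. The sketch assumes that agreement on each log entry has already happened, so the log is simply given; `KVReplica` and `apply_up_to` are hypothetical names, and the instance numbers start at 0 here rather than at 3 as in the slides.

```python
class KVReplica:
    """A key-value replica that applies an agreed log strictly in order."""

    def __init__(self):
        self.store = {}
        self.applied = 0  # number of log entries applied so far

    def apply_up_to(self, log, seq):
        """Apply entries in order through instance seq.

        Returns the result of the last entry applied by this call.
        """
        result = None
        while self.applied <= seq:
            op, key, value = log[self.applied]
            if op == "PUT":
                self.store[key] = value
                result = None
            elif op == "GET":
                result = self.store.get(key)
            self.applied += 1
        return result


# The agreed log: each position stands for one separate Paxos instance.
log = [("PUT", "a", "b"), ("GET", "a", None)]

s1, s2 = KVReplica(), KVReplica()
r1 = s1.apply_up_to(log, 1)  # serve the GET stored at instance 1
r2 = s2.apply_up_to(log, 1)  # any other replica gives the same answer
print(r1, r2, s1.store == s2.store)
```

Because both replicas replay the same entries in the same order, they return the same answer to the GET and end in the same state, which is what lets a client use any replica.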
Chubby

Design: System Structure
- Two main components: the server (Chubby cell) and the client library
(Figure from the Chubby paper)

Design: Files, Dirs, Handles
- FS interface, e.g. /ls/cs6464-cell/lab2/test
- Specialized API
- Also via the interface used by GFS

Lock Leases
- A session is maintained through KeepAlives.
- Handles, locks, and cached data remain valid.
- The client must acknowledge invalidation messages.
- Terminated explicitly, or after a lease timeout

Open source alternative: ZooKeeper
- ZooKeeper service diagram: several servers, one of them the leader, serving many clients
- All servers store a copy of the data (in memory).
- A leader is elected at startup.
- Followers service clients; all updates go through the leader.
- Update responses are sent when a majority of servers have persisted the change.

Example use of ZooKeeper
- (Well-known address for ZooKeeper)
(Image copied from an external source)
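The session rule described for Chubby (kept alive by KeepAlives, ended explicitly or by a lease timeout) can be sketched with explicit timestamps. This is a toy model, not the Chubby or ZooKeeper client API; the `Session` class and the 12-second lease are assumptions chosen only for illustration.

```python
class Session:
    """A client session that stays valid only while its lease is renewed."""

    def __init__(self, lease_seconds, now):
        self.lease_seconds = lease_seconds
        self.expires_at = now + lease_seconds
        self.closed = False

    def is_valid(self, now):
        """Handles, locks, and cached data are usable only while this holds."""
        return not self.closed and now < self.expires_at

    def keep_alive(self, now):
        """Extend the lease if the session is still valid; report success."""
        if not self.is_valid(now):
            return False
        self.expires_at = now + self.lease_seconds
        return True

    def close(self):
        """Explicit termination."""
        self.closed = True


session = Session(lease_seconds=12, now=0)
renewed = session.keep_alive(now=10)    # lease now runs until t=22
alive_at_20 = session.is_valid(now=20)
alive_at_25 = session.is_valid(now=25)  # no KeepAlive arrived in time
late_renew = session.keep_alive(now=25)
print(renewed, alive_at_20, alive_at_25, late_renew)
```

Passing the clock in as a parameter keeps the sketch deterministic; a real client would read a monotonic clock and would also have to allow for clock drift between client and server.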
Task scheduling: problems and challenges

Problem: Resource Sharing in Data Centers
- No single framework is optimal for all applications.
- Want to run multiple frameworks in a single cluster
  - … to maximize utilization
  - … to share data between frameworks
- Diagram: Hadoop, Pregel, and MPI on a shared cluster
(Slide from Lintao Zhang)

Solution: Resource Scheduler
- Diagram: a resource manager in front of the nodes, with Hadoop and Pregel tasks placed on them
(Slide from Lintao Zhang)

What are the “demands”?
- Multiple users
- Jobs and their tasks
- Each has different requirements
- Requests come in over time (online)

What are the “resources”?
- CPU, RAM, disk space
- Networking
- Special constraints: location, colocation
- Special hardware

Goals for the scheduler (1)
- What resources are available?
  - Resource tracking (who already has what)
  - Failure handling

Goals for the scheduler (2)
- Who can get what resource (and when)?
  - Fairness
  - Improve utilization
  - Improve average completion time
  - Improve power efficiency
  - (often) conflicting goals

Goals for the scheduler (3)
- How can the user access the resource?
  - Naming
- Other goals
  - Ensure user isolation (containers, VMs)
  - Allow users to monitor their services
  - A description language / UI for resource specs
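As one concrete reading of the fairness goal, the sketch below computes a max-min fair split of a single resource. Max-min fairness is chosen here only as an illustration; the slides name fairness as a goal without prescribing a policy, and `max_min_fair` and the demand figures are made up.

```python
def max_min_fair(capacity, demands):
    """Water-filling: grant small demands in full, split the rest evenly."""
    alloc = {}
    remaining = float(capacity)
    pending = dict(demands)
    while pending:
        share = remaining / len(pending)
        fits = [user for user, demand in pending.items() if demand <= share]
        if not fits:
            # Nobody left can be fully satisfied: everyone gets an equal share.
            for user in pending:
                alloc[user] = share
            break
        for user in fits:
            alloc[user] = float(pending.pop(user))
            remaining -= alloc[user]
    return alloc


# Three users ask for 2, 4, and 8 CPU cores on a 10-core machine.
shares = max_min_fair(10, {"alice": 2, "bob": 4, "carol": 8})
print(shares)
```

The user with the largest demand is the only one cut back, which shows how fairness and utilization interact: all ten cores are handed out, but no user can gain without taking from someone who has less.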
Task scheduling example: Borg

Borg
- 10+ years @ Google
- Managing millions of machines

Resources managed by Borg
- ~10,000 (median) servers per cell
- Heterogeneous machines
  - Size, processor type, external IPs, performance
  - Special hardware like SSD

Demands
- Jobs and tasks
- Different sizes
- Prod / non-prod
- Online and batch
- Requirement descriptions written in BCL
- Can “update” task requirements
  - Rolling updates

Borg Architecture
(Source: Borg EuroSys paper)

How Borg achieved the goals

Resource Tracking
- Through Borglets (local agents on each machine)
  - Monitoring + execution
- (Logically) single central Borg master
  - Fault tolerant using Chubby (always knows which is the current master)
  - Records all jobs in a Paxos store

Borg Scheduling Policy
- Priority + admission control
- Uses a scoring mechanism
  - Minimize the cost change when placing a job
  - Vs. “best fit”
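The contrast between a scoring mechanism and “best fit” can be made concrete by treating best fit as just one scoring function plugged into a generic placement step. This is a sketch under simplifying assumptions, not Borg's actual scoring: machines are plain dictionaries, only CPU and RAM are tracked, and adding the two leftovers together is a crude stand-in for a real cost model.

```python
def feasible(machine, task):
    """A machine is a candidate only if the task fits on it."""
    return machine["cpu"] >= task["cpu"] and machine["ram"] >= task["ram"]


def best_fit_score(machine, task):
    """Leftover capacity after placement; smaller means a tighter fit."""
    return (machine["cpu"] - task["cpu"]) + (machine["ram"] - task["ram"])


def place(task, machines, score):
    """Pick the feasible machine with the lowest score and reserve resources."""
    candidates = [m for m in machines if feasible(m, task)]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda m: score(m, task))
    chosen["cpu"] -= task["cpu"]
    chosen["ram"] -= task["ram"]
    return chosen["name"]


machines = [
    {"name": "m1", "cpu": 8, "ram": 32},
    {"name": "m2", "cpu": 4, "ram": 8},
    {"name": "m3", "cpu": 2, "ram": 4},
]
small = place({"cpu": 2, "ram": 4}, machines, best_fit_score)
medium = place({"cpu": 4, "ram": 8}, machines, best_fit_score)
too_big = place({"cpu": 16, "ram": 1}, machines, best_fit_score)
print(small, medium, too_big)
```

Best fit packs tasks tightly and keeps the large machine free for large tasks; swapping in a different `score` function changes the policy without touching the feasibility check or the bookkeeping.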
Naming
- Borg names a process with an IP address + ports.
  - This allows different jobs to run on a single machine.
- Should this be done by the scheduler?

Other things Borg handles
- Package distribution (how to copy the binaries to all machines)
- Autoscaling
- Re-packing tasks
- Containers for performance isolation
- Monitoring UI
- Debugging UI
- Tracing integration (later)
- BCL (Borg Configuration Language)
- Local disk management
- ……

Lessons
- The Borg master should be the kernel of the data center.
  - Other things can move to separate services.
- Naming and addressing management should be simplified.
- There should be multiple ways to group tasks (not necessarily jobs).
- Too many optimizations for power users; too complicated (230 specifications in BCL).
- Open source: Kubernetes
Task scheduling: Mesos

Mesos Demo

Mesos Architecture
(Slide from Lintao Zhang)

Resource Offering
- Resource offers: offer available resources to frameworks, and let them pick which resources to use and which tasks to launch.
  - Keeps Mesos simple, lets it support future frameworks
  - Decentralized decisions might not be optimal
- Optimization: let frameworks short-circuit rejection by providing a predicate on the resources to be offered.
  - E.g. “nodes from list L” or “nodes with > 8 GB RAM”
  - Could generalize to other hints as well
(Slide from Lintao Zhang)
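The offer mechanism with a framework-supplied predicate can be sketched as follows. This is a toy model of the idea on the slide, not the Mesos API: `Framework`, `offer_round`, the node names, and the RAM figures are all hypothetical, and only one resource (RAM) is tracked.

```python
class Framework:
    """A framework that launches fixed-size tasks (hypothetical model)."""

    def __init__(self, name, ram_per_task, predicate):
        self.name = name
        self.ram_per_task = ram_per_task
        self.predicate = predicate  # tells the master which offers are useful

    def pick(self, offers):
        """Choose one offered node with enough RAM, or reject all offers."""
        for node in sorted(offers):
            if offers[node] >= self.ram_per_task:
                return node
        return None


def offer_round(free_ram, framework):
    """Offer only nodes that pass the predicate, then apply the framework's choice."""
    offers = {node: ram for node, ram in free_ram.items()
              if framework.predicate(node, ram)}
    choice = framework.pick(offers)
    if choice is not None:
        free_ram[choice] -= framework.ram_per_task
    return choice


free_ram = {"n1": 4, "n2": 16, "n3": 12}  # GB free per node (made-up numbers)
hadoop = Framework("hadoop", ram_per_task=10,
                   predicate=lambda node, ram: ram > 8)  # "nodes with > 8 GB RAM"
placements = [offer_round(free_ram, hadoop) for _ in range(3)]
print(placements, free_ram)
```

The master never needs to know how the framework schedules its tasks; it only filters offers with the predicate and records the choice, which is the sense in which the design stays simple.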
Task scheduling: Sparrow

Question: what if the scheduler is too slow? Make it distributed?
- Centralized scheduler: knows the global resource state
- Distributed schedulers: how is state synchronized?

Task durations are shrinking
- Timeline labels: 2004: MapReduce batch job; 2009: Hive query; 2010: Dremel query; 2012: Impala query; 2010: in-memory Spark query; 2013: Spark Streaming
- Task duration scale: 10 min, 10 sec, 100 ms, 1 ms
- Scheduler throughput needed on 1,000 16-core machines (16,000 cores divided by the task duration):
  - 10 min tasks: 26 decisions/second
  - 10 sec tasks: 1.6K decisions/second
  - 100 ms tasks: 160K decisions/second
  - 1 ms tasks: 16M decisions/second
(Figure from Kay Ousterhout et al., Sparrow presentation)

What is the problem with multiple schedulers?
- Diagram: a job arrives at one of several schedulers, each of which places tasks on the same pool of workers.
(Figure from Kay Ousterhout et al., Sparrow presentation)

Per-task sampling
- Power of Two Choices
- Diagram: the same schedulers and workers as above
(Figure from Kay Ousterhout et al., Sparrow presentation)
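The power of two choices behind per-task sampling can be tried out with a small simulation: for each task the scheduler probes a few randomly chosen workers and enqueues the task on the one with the shortest queue. This is a simplified sketch, not Sparrow's implementation; tasks never finish, probes see exact queue lengths, and the worker and task counts are arbitrary.

```python
import random


def assign(num_workers, num_tasks, probes, seed=0):
    """Place each task on the least-loaded of `probes` randomly sampled workers."""
    rng = random.Random(seed)
    queues = [0] * num_workers
    for _ in range(num_tasks):
        sampled = rng.sample(range(num_workers), probes)
        target = min(sampled, key=lambda worker: queues[worker])
        queues[target] += 1
    return queues


random_q = assign(1000, 10000, probes=1)      # purely random placement
two_choice_q = assign(1000, 10000, probes=2)  # probe two workers per task
print("longest queue, random placement:", max(random_q))
print("longest queue, two choices:", max(two_choice_q))
```

With one probe the longest queue grows well beyond the average of ten tasks per worker, while a second probe keeps it much closer to the average, and no scheduler needs global state to achieve that.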