ZooKeeper: compiled notes
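1.1 Deploying and using ZooKeeper

1.1.1 System description

ZooKeeper is a highly available, highly reliable coordination system. Distributed programs can use it to store and update critical shared state. Katta uses ZooKeeper to track the liveness of the master node and the search nodes, to assign index files to search nodes, and to detect search-node failures.

1.1.2 Installation and configuration

Installing ZooKeeper is simple: download zookeeper-3.1.1.tar.gz and unpack it to /home/hezhiming/zookeeper-3.1.1. ZooKeeper must be deployed to the same directory on every machine, with the same configuration file on each.

ZooKeeper has two main configuration files:

1. zookeeper-3.1.1/conf/zoo.cfg:

    # The number of milliseconds of each tick
    tickTime=2000
    # The number of ticks that the initial
    # synchronization phase can take
    initLimit=10
    # The number of ticks that can pass between
    # sending a request and getting an acknowledgement
    syncLimit=5
    # the directory where the snapshot is stored.
    dataDir=/home/nutch/zookeeper-3.1.1/zookeeper-data
    # the port at which the clients will connect
    clientPort=3355
    # server.N=host:quorumPort:electionPort
    # (2888 and 3888 in the ZooKeeper documentation's examples; 2555 and 3555 here)
    server.1=devcluster01:2555:3555
    server.2=devcluster05:2555:3555
    server.3=devcluster06:2555:3555

Note: in each server.N line, the first port (2555) is the one followers use to connect to the leader, and the second port (3555) is used for leader election. Both carry traffic between ZooKeeper servers only. Other applications connect to ZooKeeper on clientPort (3355).

2. myid (located in the dataDir configured in zoo.cfg, here /home/nutch/zookeeper-3.1.1/zookeeper-data)

Note: the value in myid must match the server number in zoo.cfg:

    myid on devcluster01: 1
    myid on devcluster05: 2
    myid on devcluster06: 3

1.1.3 Starting ZooKeeper

In the ZooKeeper directory on each server, run:

    Start: ./bin/zkServer.sh start
    Stop:  ./bin/zkServer.sh stop

1.1.4 Using ZooKeeper

Once ZooKeeper has started, no further commands are required; you only need to check that it is running. The following commands are available:

1. ruok - The server will respond with imok if it is running. Otherwise it will not respond at all.
2. kill - When issued from the local machine, the server will shut down.
3. dump - Lists the outstanding sessions and ephemeral nodes. This only works on the leader.
4. stat - Lists statistics about performance and connected clients.

For example:

    $ echo ruok | nc 127.0.0.1 3355
    imok
    $

2.1 The relationship between Katta and ZooKeeper

2.4.1 What is ZooKeeper?

ZooKeeper is a distributed coordination service for distributed applications. Its purpose is to spare distributed applications the burden of building coordination services from scratch. Its basic functions are naming, configuration management, synchronization, and group services. On top of these, a distributed system can implement consistency, group management, leader election, and similar features.

A ZooKeeper cluster consists of several ZooKeeper servers, all of which know about each other. The ZooKeeper system architecture is shown in Figure 2:

Figure 2. ZooKeeper system architecture

- Every Server keeps a copy of the current state of the ZooKeeper system.
- When ZooKeeper starts, one Server is automatically elected Leader; all the other Servers are Followers.
- Follower Servers serve Clients: they accept Client requests and forward update requests to the Leader, which commits them.
- A Client connects to a single ZooKeeper server. It maintains a persistent TCP connection over which it sends requests, receives responses and events, and sends heartbeats. If the TCP connection to the Server breaks, the Client connects to another Server.
- The robustness of a ZooKeeper cluster is the reason we use it: the service keeps running as long as a majority of the servers are up. (The majority requirement exists because, if a group of half the servers or fewer were allowed to keep serving, a partitioned cluster could split into two ZooKeeper services with inconsistent state.)

2.4.2 ZooKeeper's virtual file system

ZooKeeper lets processes on different servers coordinate through a shared, tree-shaped virtual file system that resembles a standard file system. Each data node in the virtual file system is called a znode. Every znode can have data associated with it and can also have child nodes, as shown in Figure 3:

Figure 3. Directory structure of ZooKeeper's virtual file system

- Each znode is identified by an absolute path, such as "/katta/index/index_name".
- Reading and writing znode data are atomic operations: a read returns all the bytes in the znode, and a write replaces the znode's data in full. Each znode carries an access control list (ACL) that restricts who may access it and with what permissions.
- Some znodes are ephemeral. An ephemeral node lives for the lifetime of the session that created it and is deleted when that session ends.
- ZooKeeper provides watches on znodes. A Client can set a watch on a znode to be told when it changes. When the znode changes, those Clients receive a notification and run a callback they registered in advance.

So what can ZooKeeper do for us? A simple example. Suppose we have the following system:

1. Twenty search servers, each responsible for searching part of the index. Each search server sometimes serves searches and sometimes builds indexes, but never does both at once.
2. A master server, which sends search requests to the twenty search servers and merges the result sets.
3. A standby master server, which takes over when the master server goes down.
4. A web CGI, which sends search requests to the master server.

With ZooKeeper we can guarantee that:

1. The master server automatically knows how many servers can currently serve searches, and sends search requests to those servers.
2. The standby master server is activated automatically when the master server goes down.
3. The web CGI automatically learns of changes to the master server's network address.

All of this can be implemented on top of ZooKeeper's virtual file system: state, configuration, location, and similar information is stored in znodes, which are shared by all servers, and the servers coordinate by observing changes to the znodes and their data.

2.4.3 The relationship between Katta and ZooKeeper

- Katta uses ZooKeeper to track the liveness of the Master node and the Node servers, to pass messages between the Master and the Nodes, to store some configuration information, and to keep file reads consistent.
- Communication among the Master (management) server, the Node (search) servers, and the Client servers goes through ZooKeeper.
- A Client server can read the list of Node search servers directly from ZooKeeper, send search requests to those Nodes, and receive the results from them, without going through the Master server.

Appendix: ZooKeeper

ZooKeeper works using distributed processes to coordinate with each other through a shared hierarchical name space that is modeled after a file system.

Data is kept in memory and is backed up to a log for reliability. By using memory ZooKeeper is very fast and can handle the high loads typically seen in chatty coordination protocols across huge numbers of processes.

It's meant to store small bits of configuration information rather than large blobs. Replication is used for scalability and reliability, which means it prefers applications that are heavily read based.

Typical of hierarchical systems, you can add nodes at any point of a tree, get a list of entries in a tree, get the value associated with an entry, and get notification of when an entry changes or goes away.

A weakness of ZooKeeper is that changes can be dropped: because watches are one-time triggers and there is latency between getting the event and sending a new request to get a watch, you cannot reliably see every change that happens to a node in ZooKeeper. Be prepared to handle the case where the znode changes multiple times between getting the event and setting the watch again. (You may not care, but at least realize it may happen.)

Note: between receiving a change notification and setting a new watch, several more changes may already have occurred, and those are lost.

This means that ZooKeeper is a state-based system more than an event system. If you want to use events to log when and how something changed, for example, then you can't do that. You would have to include change history in the data itself.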
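The one-time-trigger behaviour described above can be illustrated with a small simulation. This is only a sketch: `ToyZnode` and `Client` are hypothetical in-memory stand-ins invented for illustration, not the ZooKeeper client API, and no server is involved. The toy keeps a version counter so that the client can at least detect that it skipped intermediate updates.

```python
class ToyZnode:
    """In-memory stand-in for a znode with one-shot watches (hypothetical, not the ZooKeeper API)."""

    def __init__(self, data):
        self.data = data
        self.version = 0
        self._watchers = []

    def get(self, watcher=None):
        """Return (data, version) and optionally register a one-time watcher."""
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data, self.version

    def set(self, data):
        """Replace the data, bump the version, and fire each pending watcher exactly once."""
        self.data = data
        self.version += 1
        pending, self._watchers = self._watchers, []
        for notify in pending:
            notify()


class Client:
    """Reads the node's state and re-registers its watch after every notification."""

    def __init__(self, node):
        self.node = node
        self.notified = False
        self.seen = []
        self.refresh()

    def _on_change(self):
        # The event only says "something changed"; it carries no history.
        self.notified = True

    def refresh(self):
        """Re-read the current state and set a fresh one-time watch."""
        self.notified = False
        self.seen.append(self.node.get(watcher=self._on_change))


node = ToyZnode("v0")
client = Client(node)

node.set("v1")  # fires the client's watch
node.set("v2")  # no watch is registered any more, so no event is produced
node.set("v3")  # likewise unobserved

if client.notified:
    client.refresh()

print(client.seen)
```

The client observes only the initial state and the latest state; the jump in the version number is the only hint that updates in between were never seen, which is why a history has to live in the data itself if it matters.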
Note: ZooKeeper is a state system; it does not itself record event history.

Each process goes to ZooKeeper and finds out which is the primary database. If a new primary is elected, say because a host fails, then ZooKeeper sends an event that allows everyone dependent on the database to react by getting the new primary database.

Using ZooKeeper I can store my state machine definition as a node which is loaded from the static configuration collected from every distribution package in a product. Every process (client) dependent on that node can register as a watcher when it initially reads the state machine. When the state machine is updated, all dependent entities will get an event that causes them to reload the state machine into the process. Simple and straightforward. All processes will eventually get the change, and any rebooting processes will pick up the new state machine on initialization. A very cool way to reliably and centrally control a large distributed application.

Another caveat that may not be obvious on first reading is that your application state machine using ZooKeeper will have to be intimately tied to ZooKeeper's state machine. When a ZooKeeper server dies, for example, your application must process that event and reestablish all your watches on a new server. When a watch event comes, your application must handle the event and set new watches. The algorithms to carry out higher-level operations like locks and queues are driven by multi-step state machines that must be correctly managed by your application. And as ZooKeeper deals with state that is probably stored in your application, it's important to worry about thread safety. Callbacks from the ZooKeeper thread could access shared data structures. An Actor model where you dump ZooKeeper events into your own Actor queues could be a very useful application architecture here for synthesizing different state machines in a thread-safe manner.

Some Fast Facts

- How is data partitioned across multiple machines? Complete replication in memory. (Yes, this is limiting.)
- How do updates happen (interaction across machines)? All updates flow through the master and are considered complete when a quorum confirms the update.
- How do reads happen (is getting a stale copy possible)? Reads go to any member of the cluster. Yes, stale copies can be returned. Typically, these are very fresh, however.
- What is the responsibility of a leader? To assign serial ids to all updates and confirm that a quorum has received the update.

There are several limitations that stand out in this architecture:

- Complete replication limits the total size of data that can be managed using ZooKeeper. This is acceptable in some applicatio
