




免费预览已结束,剩余105页可下载查看
下载本文档
版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领
文档简介
1,应用层网络与p2p,清华大学计算机系,2,应用层网络与p2p,应用层网络 对等网络系统 应用层组播,3,应用层网络,application layer network overlay network 网络:定义主机之间通信的寻址方式、路由方式和服务模型 在现有的internet传输网络之上构建一个完全位于应用层的网络系统 拓扑发现,路由等功能完全由应用层完成,不依赖网络层 基于internet网络的大规模的分布式应用,4,internet,internet本身也是一个overlay network 目标:互联大量的局域网络 方法:通过接入链路和主干网络互联 策略:为分组增加ip报文头,进行寻址和路由,5,应用层网络的优缺点,优点: 易于部署,不依赖于网络设备的升级 可扩展性好 缺点: 增加了复杂性和处理开销 无法利用最佳路由,增加了延迟 破坏了网络的分层结构模型 路由具有“自私”特性,6,弹性覆盖网络ron1,弹性覆盖网络是一种分布式覆盖网络体系结构,它可以使分布式应用检测到路径的失效和周期性的性能降低现象并能够迅速恢复 恢复时间少于20秒 对比:在目前的internet中,bgp恢复时间为几分钟 ron节点监控它们之间的路径的质量并根据这些信息决定 直接利用internet转发分组 通过其他ron节点转发分组,7,internet的故障恢复,a,b,c,d,packet switching and route around failures,8,internet路由的鲁棒性,9,internet路由的冗余特性,在12台主机上进行traceroute,得到的as连接关系图,10,11,internet路由的鲁棒性特点,优点 可扩展性好 缺点 故障检测和恢复比较慢 难以检测性能差的路径 不能利用internet的多路径特点 不能为客户提供多穴(multihoming)连接 难以表示复杂的策略,12,rons goal,to improve communication availability for small groups by at least a factor or 10,many applications collaboration and conferencing virtual private networks (vpns) across public internet overlay internet service,13,ron: routing using overlays,cooperating end-systems in different routing domains can conspire to do better than scalable wide-area protocols,types of failures outages: configuration/operational errors, backhoes, etc. performance failures: severe congestion, denial-of-service attacks, etc.,scalable bgp-based ip routing substrate,14,ron design,performance database,application-specific routing tables policy routing module,ron library,nodes in different routing domains (ases),15,可扩展性,测量和路由会带来额外开销 50个节点是性能上限 对大多数应用已经足够了,16,测试,作者对有12个节点和132条路径的ron进行了64个小时的实际运行测试 测试过程中经历了32次严重的路径失效,平均每次的持续时间超过30分钟 ron用平均不到20秒的时间就可以检测出这些失效情况并从中恢复 ron还改善了应用的丢失率,延迟和吞吐率等性能指标,17,进一步的研究,internet可扩展性和鲁棒性的矛盾 规模可扩展 多个ron是否会有矛盾 可以为哪些应用提供支持,18,对等网络系统,19,对等网络系统,20,sarnoff law:效益规模是o(n):网络是广播媒介,任1发送者(设备)和多个(n-1)接收者(设备),metcalfe law:效益规模是o(n2)网络是全互连媒介,任何1个设备可与其它n-1个交互,同时存在n(n-1)=n2-n个并发执行的事务,reed law:效益规模是o(2n):网络是群组媒介。网络可建立cn2+cn3+cnn-1+cnn = 2n-n-1 个小组,网络服务规模三法则,21,p2p 与 c/s,二者在结构和构成上有很大区别 管理能力、构态能力、功能(查找或发现)、组织、元素(dns)和协议 但又无明显边界 都能运行在不同的(internet / intranet)平台上 都能服务传统或新的应用:ebusiness eservices ,22,有管理自组织,预构-ad-hoc,查找发现,分层mesh,静态移动,依赖服务器独立生存,以ip为中心不以ip为中心,基于dns客户命名,rpc异步,.net,jxta,c/s模式,p2p模式,corba,corba,gnutella,napster,ebusiness,web apps,eservices,distr.apps,ad-hoc nw,clusters,internet intranet,wans,grids,p2p与c/s,23,p2p流量持续增长,24,p2p market share by country,25,what is being shared over p2p?,parameters mix of file formats by volume of traffic generated over the 4 main p2p networks bittorrent edonkey fasttrack gnutella weighted by volume of traffic generated on each network,26,how did it start?,a killer application: napster 1999-2002 free music over the internet key idea: share the content, storage and bandwidth of individual (home) users,internet,27,model,each user stores a subset of files each user has access (can download) files from all users in the system,28,main challenge,find where a particular file is stored,a,b,c,d,e,f,e?,29,other challenges,scale: up to hundred of thousands or millions of machines dynamicity: machines can come and go any time,30,napster,assume a centralized index system that maps files (songs) to machines that are alive how to find a file (song) query the index system return a machine that stores the required file ideally this is the closest/least-loaded machine ftp the file,31,napster,advantages: simplicity easy to implement sophisticated search engines on top of the index system disadvantages: robustness scalability (?),32,napster: example,a,b,c,d,e,f,m1,m2,m3,m4,m5,m6,m1 a m2 b m3 c m4 d m5 e m6 f,e?,33,napster vs. riaa,recording industry association of america (riaa) the riaa sued napster in december 1999 it demanded $10,000 for every copyrighted song traded over the system after long legal battles and a failed attempt to become a pay-based service, napster stopped operation in july 2001,34,35,gnutella,distribute file location idea: flood the request how to find a file: send request to all neighbors neighbors recursively multicast the request eventually a machine that has the file receives the request, and it sends back the answer,36,gnutella,advantages: totally decentralized, highly robust disadvantages: not scalable; the entire network can be swamped with request (to alleviate this problem, each request has a ttl),37,gnutella: example,assume: m1s neighbors are m2 and m3; m3s neighbors are m4 and m5;,a,b,c,d,e,f,m1,m2,m3,m4,m5,m6,38,freenet,addition goals to file location: provide publisher anonymity, security resistant to attacks a third party shouldnt be able to deny the access to a particular file (data item, object), even if it compromises a large fraction of machines,39,freenet,architecture: each file is identified by a unique identifier each machine stores a set of files, and maintains a “routing table” to route the individual requests,40,data structure,each node maintains a common stack id file identifier next_hop another node that store the file id file file identified by id being stored on the local node forwarding each message contains the file id it is referring to if file id stored locally, then stop if not, search for the “closest” id in the stack, and forward the message to the corresponding next_hop,id next_hop file,41,query,api: file = query( id ) upon receiving a query for document id check whether the queried file is stored locally if yes, return it if not, forward the query message,42,query,notes each query is associated a ttl that is decremented each time the query message is forwarded each node maintains the state for all outstanding queries that have traversed it help to avoid cycles when file is returned, the file is cached along the reverse path,43,query example,note: doesnt show file caching on the reverse path,4 n1 f4 12 n2 f12 5 n3,9 n3 f9,3 n1 f3 14 n4 f14 5 n3,14 n5 f14 13 n2 f13 3 n6,n1,n2,n3,n4,4 n1 f4 10 n5 f10 8 n6,n5,query(10),44,insert,api: insert(id, file) two steps search for the file to be inserted if not found, insert the file,45,insert,searching: like query, but nodes maintain state after a collision is detected and the reply is sent back to the originator insertion follow the forward path; insert the file at all nodes along the path a node probabilistically replace the originator with itself; obscure the true originator,46,insert example,assume query returned failure along “gray” path; insert f10,4 n1 f4 12 n2 f12 5 n3,9 n3 f9,3 n1 f3 14 n4 f14 5 n3,14 n5 f14 13 n2 f13 3 n6,n1,n2,n3,n4,4 n1 f4 11 n5 f11 8 n6,n5,insert(10, f10),47,insert example,10 n1 f10 4 n1 f4 12 n2,3 n1 f3 14 n4 f14 5 n3,14 n5 f14 13 n2 f13 3 n6,n1,n3,n4,4 n1 f4 11 n5 f11 8 n6,n5,insert(10, f10),9 n3 f9,n2,orig=n1,48,insert example,n2 replaces the originator (n1) with itself,10 n1 f10 4 n1 f4 12 n2,10 n2 f10 9 n3 f9,10 n2 10 3 n1 f3 14 n4,14 n5 f14 13 n2 f13 3 n6,n1,n2,n3,n4,4 n1 f4 11 n5 f11 8 n6,n5,insert(10, f10),orig=n2,49,insert example,n2 replaces the originator (n1) with itself,10 n1 f10 4 n1 f4 12 n2,10 n1 f10 9 n3 f9,10 n2 10 3 n1 f3 14 n4,10 n4 f10 14 n5 f14 13 n2,n1,n2,n3,n4,10 n4 f10 4 n1 f4 11 n5,n5,insert(10, f10),50,freenet properties,newly queried/inserted files are stored on nodes storing similar ids new nodes can announce themselves by inserting files attempts to supplant or discover existing files will just spread the files,51,freenet summary,advantages provides publisher anonymity totally decentralize architecture robust and scalable resistant against malicious file deletion disadvantages does not always guarantee that a file is found, even if the file is in the network,52,分布式哈希表查找,dht- distributed hash table 问题 给定一个关键字,映射 到一台主机 关键点 正确性 可扩展性 支持网络状态和主机的动态 变化 网络和主机的异构 性 安全,53,hash tables 实例,null,null,null,null,obj5 key=1,obj1 key=15,obj4 key=2,obj2 key=28,7 6 5 4 3 2 1 0,table,obj3 key=4,buckets,hash value,54,dht step 1: the hash,introduce a hash function to map the object being searched for to a unique identifier: e.g., h(“network notes”) 8045 distribute the range of the hash function among all nodes in the network each node must “know about” at least one copy of each object that hashes within its range (when one exists),55,“knowing about objects”,two alternatives node can cache each (existing) object that hashes within its range pointer-based: level of indirection - node caches pointer to location(s) of object,56,dht step 2: routing,for each object, node(s) whose range(s) cover that object must be reachable via a “short” path the different approaches (can,chord,pastry,tapestry) differ fundamentally only in the routing approach any “good” random hash function will suffice,57,dht routing: other challenges,neighbors for each node should scale with growth in overlay participation (e.g., should not be o(n) dht mechanism should be fully distributed (no centralized point that bottlenecks throughput or can act as single point of failure) dht mechanism should gracefully handle nodes joining/leaving the overlay need to repartition the range space over existing nodes need to reorganize neighbor set need bootstrap mechanism to connect new nodes into the existing dht infrastructure,58,分布式哈希查找,chord2 can3 tapestry4 pastry5,59,分布式哈希查找chord,chord实现了下面的功能 给定一个关键字(key),将key映射到某个节点 给对等网络应用的每个数据都分配一个key,那么对等网络中的数据查找问题就可以用chord解决 chord采用了相容哈希(consistent hashing)6的一种变体为节点分配关键字 哈希函数可以做到负载平衡,也就是说所有的节点可以接收到基本相同数量的关键字 当第n个节点加入或者离开网络时,只有1/n的关键字需要移动到另外的位置,60,分布式哈希查找chord,chord进一步改善了相容哈希的可扩展性 在由n个节点组成的网络中,每个节点只需要维护o(logn)个节点的信息 每次查找只需要o(logn)条消息 当节点加入或者离开网络时,chord需要更新路由信息,每次加入或者离开需要传递o(log2n)条消息,61,chord id,m bit identifier space for both keys and nodes key identifier = sha-1(key) key=“letitbe” id=60 node identifier = sha-1(ip address) ip=“” id=123 both are uniformly distributed how to map key ids to node ids?,62,1,6,2,identifier circle,successor(1) = 1,successor(2) = 3,successor(6) = 0,successor nodes,63,6,1,2,successor(6) = 7,6,1,successor(1) = 3,node joins and departures,64,scalable key location,a very small amount of routing information suffices to implement consistent hashing in a distributed environment each node need only be aware of its successor node on the circle queries for a given identifier can be passed around the circle via these successor pointers resolution scheme correct, but inefficient: it may require traversing all n nodes!,65,acceleration of lookups,lookups are accelerated by maintaining additional routing information each node maintains a routing table with (at most) m entries (where n=2m) called the finger table ith entry in the table at node n contains the identity of the first node, s, that succeeds n by at least 2i-1 on the identifier circle s = successor(n + 2i-1) s is called the ith finger of node n, denoted by n.finger(i).node,66,finger tables,1 2 4,1,2) 2,4) 4,0),1 3 0,67,lookup procedure,1 2 4,1,2) 2,4) 4,0),1 3 0,query(6),when node receive lookup request check if it has item in local memory if cant find, forward the request to the successor node that has largest id which is smaller than request id,68,node joins with finger tables,1 2 4,1,2) 2,4) 4,0),1 3 0,finger table,start,int.,succ.,keys,2,4 5 7,4,5) 5,7) 7,3),0 0 0,finger table,start,int.,succ.,keys,6,6,6,6,6,69,node departures with finger tables,1 2 4,1,2) 2,4) 4,0),1 3 0,finger table,start,int.,succ.,keys,1,2 3 5,2,3) 3,5) 5,1),3 3 0,finger table,start,int.,succ.,keys,2,4 5 7,4,5) 5,7) 7,3),6 6 0,finger table,start,int.,succ.,keys,finger table,start,int.,succ.,keys,7 0 2,7,0) 0,2) 2,6),0 0 3,6,6,6,0,3,70,分布式哈希查找can,can基于虚拟的d维笛卡儿坐标空间实现数据组织和查找功能 整个坐标空间动态地分配给系统中的所有节点,每个节点都拥有独立的互不相交的一块区域,71,分布式哈希查找can,新节点加入 can把某个现有节点的区域分裂成同样大小的两块,把其中一块分给新加入的节点,整个过程分为以下三步: (i)新节点首先找到一个已经在can中的节点 (ii)新节点使用can的路由机制找到一个区域将要被分隔的节点 (iii)执行分裂操作,原有区域的邻接区域被告知发生了分裂,这样新节点才能被别的节点路由到,72,can实例:两维空间,空间在节点之间划分 所有的节点将覆盖全部空间 每个节点覆盖一个正方形或者1:2的长方形 实例 节点n1:(1, 2) 是第一个加入节点,它将覆盖整个空间,73,can实例:两维空间,节点 n2:(4, 2) 加入,空间在两个节点间划分,74,can实例:两维空间,节点n3:(3, 5) 加入,从n1的空间中划分一半,75,can实例:两维空间,节点 n4:(5, 5) 和 n5:(6,6) 加入,76,can实例:两维空间,节点: n1:(1, 2); n2:(4,2); n3:(3, 5); n4:(5,5);n5:(6,6) 数据项: f1:(2,3); f2:(5,0); f3:(2,1); f4:(7,5),77,can实例:两维空间,每个数据项保存在拥有这块空间的节点上,78,can:查询实例,每个节点都知道其在d维空间中邻居节点的位置 将查询转发到离id最接近的邻居节点 实例:假定n1要查找f4 有多条路径,可以绕开失效路径 如果一个d维空间划分成n个相等的区域,那么平均路由长度是(d/4)(n1/d),每个节点只需要维护2d的邻接节点信息,1,2,3,4,5,6,7,0,1,2,3,4,5,6,7,0,n1,n2,n3,n4,n5,f1,f2,f3,f4,79,can: 节点离开,当节点离开can时,必须保证它的区域被can系统收回,分配给其他仍然在系统中的节点 一般过程是由某个节点来接管这个区域和所有的(关键字,值)数据库 如果这个区域可以和相邻区域合并形成一个大的区域,那么can将执行合并操作 如果合并不能进行,那么该区域将交给其邻接节点中区域最小的节点,也就是说,这个节点将临时负责两个区域,80,can/chord optimizations,weight neighbor nodes by rtt when routing, choose neighbor who is closer to destination with lowest rtt from me reduces path latency multiple physical nodes per virtual node reduces path length (fewer virtual nodes) reduces path latency (can choose physical node from virtual node with lowest rtt) improved fault tolerance (only one node per zone needs to survive to allow routing through the zone) several others,81,distributed hash tables,distributed hash tables are a key component of scalable and robust overlay networks can: o(d) state, o(d*n1/d) distance chord: o(log n) state, o(log n) distance simplicity is key services built on top of distributed hash tables p2p file storage, i3 (chord) multicast (can, tapestry) persistent storage (oceanstore using tapestry),82,问题,只能进行精确匹配,实用价值有局限 适应节点动态变化的能力低 经过哈希之后,节点的位置信息被破坏了,来自同一个子网的站点很可能节点号相距甚远,这不利于查询性能的优化 基于哈希表的系统不能利用应用本身的信息,许多应用(比如文件系统)的数据本身是按照层次结构组织的,而使用哈希函数后,这些层次信息就丢失了 terradir8中提出了一种面向层次结构的查找机制,83,dhts vs unstructured p2p,dhts good at: exact match for “rare” items dhts bad at: keyword search, etc. cant construct dht-based google tolerating extreme churn gnutella etc. good at: general search finding common objects very dynamic environments gnutella etc. bad at: finding “rare” items,84,基于分布式哈希查找系统的大规模 对等网络应用,协作文件系统cfs (cooperative file system) 9 past10 oceanstore11,85,基于分布式哈希查找系统的大规模 对等网络应用cfs,highest layer provides a file-like interface to user including user-friendly naming and authentication this file systems maps operations to lower-level block operations block storage uses chord to identify responsible node for storing a block and then talk to the block storage server on that node,86,其他p2p应用,应用层组播 流媒体直播与点播 coolstreaming, pplive, ppstream 计算能力的共享 / 计算机支持的协同工作,87,application layer multicast (alm),ip multicast is not globally deployed application layer/level multicast (or overlay multicast) is hence proposed multicasting implemented at end hosts instead of network routers nodes form unicast channels or tunnels between them,88,alm methodologies,tree based content flows from server to nodes in a tree like fashion, every node forwards the content to its children, which in turn forward to their children one point of failure for a complete subtree high recovery time notes tree base approaches: nice, spreadit, zigzag mesh based overcomes tree based flaws nodes maintain state information of many nodes high control overhead notes mesh based approaches include narada and esm from cmu,89,the delay between the source and receivers is small at the same time, the number of redundant packets on any physical link should be low,gatech,“efficient” overlay,cmu,berk2,stan1,stan2,berk1,berk1,high degree (unicast),berk2,gatech,stan2,cmu,stan1,stan2,high latency,cmu,berk2,gatech,stan1,berk1,network efficiency,90,physical link stress (pls),the number of identical copies of a packet that traverse a physical link indicates the bandwidth inefficiency,s,r1,r2,e1,e2,e3,example: pls for link s-r1 is 2 average pls is 7/5,91,relative delay penalty (rdp),the ratio of the delay in the overlay with the delay in the direct unicast path indicates the delay inefficiency,s,r1,r2,e1,e2,e3,20 ms,10 ms,10 ms,example: overlay delay for the path from s to e3 is 60 ms unicast delay is 40 ms therefore, the rdp for e3 is 1.5 ( = 60 ms / 40 ms),10 ms,10 ms,92,alm and p2p,93,p2p industry trends,napster,bittorrent,skype,pplive,file sharing,downloading,video streaming,voip,2001 2003 2004 2005,basic applications,advanced applications,94,p2p technology,streaming technology,p2p streaming,95,coolstreaming简介12,第一个真正意义上的p2p视频直播软件 通过构建p2p覆盖网络实现数据的传输,无需复杂的控制结构 数据驱动 数据的有效性决定流的方向,不存在固定的overlay数据拓扑结构 每个节点随机选择一些邻居节点并定期和邻居交换自己的数据信息,96,alm vs. donet(data-driven overlay network)数据拓扑结构,97,coolstreaming软件结构,缓存管理模块: 管理缓存,一方面从数据调度模块接收数据,同时为发送模块提供数据 数据调度模块: 负责决定数据来源节点,发送请求并接收数据 数据发送模块: 接收到邻居节点的发送请求后,从缓存得到数据并发送至请求成员,98,成员管理,新节点加入时会首先连接源节点,然后源节点从自身成员列表mcache里面选取一个代理deputy,由代理负责新节点的加入过程 当新节点从代理节点那里获得一些候选节点列表后,就向这些候选节点发送连接请求,建立协作关系 加入成功后,每个节点都会保持一个属于自身的局部的节点列表 每个节点会发送消息给其他节点宣布自己的存在,其他节点收到信息会加入该节点到自己的成员列表 成员列表中每一项有一个生命周期时间,过期之后会从成员列表中去除,99,信息更新scam 协议,信息更新通过scam(scalable gossip membership protocol)协议周期交互节点的状态消息实现 消息的格式包含4元组 消息序号seq num、节点标识id、伙伴数量num partner、消息存活时间ttl 当一个节点接收一个消息时候,就会按照其中的信息为相应节点更新状态信息或者建立一个新的状态信息记录 通过scam协议进行状态信息的交互,可以在较低的控制负载下实现成员的管理,100,邻居选择模块,新节点加入时,通过代理节点提供的候选节点列表选择邻居 列表更新或者节点加入的时候, mcache列表会更新,节点会周期性的从mcache列表中随机选择一些新的节点重新选择邻居 节点计算它和邻居之间的双向流量(包括它给邻居的,和邻居给它的),评价邻居节点并选择更优的邻居节点(带宽更高或者有效数据更多) 目标:提高系统的健壮性,101,数据调度管理模块,在donet协议中,视频流被分割成固定长度的段 节点缓存中每个片段的有效性通过缓冲图bm(buffer map)表示 可以采用120位的bm,其中1表示数据有效,0表示数据无效,则120位的bm可以表示120个片段的有效信息 节点间通过和协作节点交互bm获得数据的有效信息,确定可以从哪些协作节点获取自身没有的视频片段,102,调度算法,三个约束 数据可用性 满足视频回放的时间要求 邻居之间不同的带宽传输限制 目标:为每个segment选择邻居数据源,保证延迟最小,使播放流畅,103,调度算法,等价于并行计算机调度问题 np-hard 启发式算法 总体思想:segment提供者越少,越可能出现延时 计算每个视频片断潜在的提供者的数目,通过视频片段的提供者数量来确定优先顺序,然后从少到多来进行调度处理 具有多个提供者的情况,再按照节点的带宽和延时来确定优先顺序,104,pplive,105,ppli
温馨提示
- 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
- 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
- 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
- 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
- 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
- 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
- 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。
最新文档
- 山石盆景工综合考核试卷及答案
- 安阳电影新媒体营销方案
- 营销活动加油卡权益方案
- 贵州苗绣营销策划方案
- 2025版司法局《撤销劳动仲裁裁决申请书》民事类法律文书(空白模板)
- 专业建筑机电安装方案设计
- 语音交换系统施工方案
- 女性情感咨询方案
- 痔围手术期护理
- 咨询公司薪资造价方案
- 读书分享会红色书籍《保卫延安》课件
- 华能集团薪酬管理制度
- T/CIE 147-2022空间行波管加速寿命试验评估技术规范
- 系统性淀粉样变性护理
- 化工过程安全管理导则 (一)
- 四川成都经济技术开发区(龙泉驿区)“蓉漂人才荟”招聘笔试题库2025
- 解除委托退费协议书
- 国家能源集团共享服务中心有限公司-企业报告(业主版)
- 国民经济行业分类代码(2024年版)
- 《缺血性卒中脑细胞保护临床实践中国专家共识(2025年版)》解读
- 《顺丰速运探索》课件
评论
0/150
提交评论