DCN-FatTree: Advanced Computer Networking (complete original course slides)

Advanced Computer Networking
  • Scope: Cutting-edge technological trends in computer networking in the past few years.
  • Time: Tuesday 3, 4, 5 (9:45-12:10, week 1-16); Thursday 1, 2 (7:50-9:25, week 8-12)
  • Venue: Online: Offline: 3B202
  • Score: 60%: ~3 course projects; 40%: final exam (USTC mandatory requirement)

A Scalable, Commodity Data Center Network Architecture
Mohammad Al-Fares, Alexander Loukissas, Amin Vahdat
SIGCOMM 2008
Presented by Ye Tian for Course CS05112

Outline
  • Background
  • Fat-tree based solution
  • Implementation and evaluation
  • Review

Data centers
  • Clusters of thousands of computers
  • Where the Internet lives
  • Google data center /about/datacenters/inside/streetview/
  • Microsoft underwater data center
  • Tencent Gui'an Qixing Data Center
  • Alibaba Qiandao Lake Data Center
  • Data center racks

What is a data center?

DC Communications
  • Plenty of M2M communication; the principal bottleneck in large-scale clusters is often inter-node communication bandwidth.
  • MapReduce: must perform significant data shuffling to transport the output of its map phase before proceeding with its reduce phase.
  • Web search engine: often requires parallel communication with every node in the cluster hosting the inverted index to return the most relevant results.
  • Managed by one single authority

Two approaches for DC network
  • Approach 1: Specialized hardware and communication protocols
      • For example: InfiniBand, Myrinet
      • Do not leverage commodity parts, expensive
      • Not compatible with TCP/IP applications
  • Approach 2: Leverage commodity Ethernet switches and routers to interconnect cluster machines.
      • Unmodified applications, OS, and hardware
      • But how?

Desired Properties for a DC Network Architecture
  • Scalable interconnection bandwidth: an arbitrary host in the data center can communicate with any other host in the network at the full bandwidth of its local network interface.
  • Economies of scale: make cheap off-the-shelf Ethernet switches the basis for large-scale data center networks.
  • Backward compatibility: the entire system should be backward compatible with hosts running Ethernet and IP.

Current Data Center Network Topologies
  • Three tiers: core, aggregation, edge (ToR switch)
  • Two types of switches:
      • 48-port GigE switch, with four 10 GigE uplinks, used at the edge of the tree
      • 128-port 10 GigE switch for higher levels of the communication hierarchy

Problems of the Topology
  • Oversubscription: the ratio of the total bisection bandwidth of a particular communication topology to the worst-case achievable aggregate bandwidth among the end hosts.
      • Ideal: 1:1, all hosts may potentially communicate with arbitrary other hosts at the full bandwidth of their network interface.
      • Typical designs are oversubscribed by a factor of 2.5:1 to 8:1.
  • Multi-path routing: delivering full bandwidth between arbitrary hosts in larger clusters requires a “multi-rooted” tree with multiple core switches.
      • ECMP performs static load-splitting among flows.
      • The multiplicity of paths is limited to 8–16.
  • Cost: for a given oversubscription ratio, cost rises steeply with the size of the cluster.
  • Cost: using the largest 10 GigE and GigE switches to build a data center with 1:1 oversubscription, a cluster can have up to 27,648 hosts.

Fat-tree
  • k pods, each containing two layers of k/2 switches.
  • Each k-port switch in the lower layer is directly connected to k/2 hosts.
  • Each of the remaining k/2 ports is connected to k/2 of the k ports in the aggregation layer.
  • (k/2)^2 k-port core switches. Each has one port connected to each of the k pods.
  • The i-th port of any core switch is connected to pod i such that consecutive ports in the aggregation layer of each pod switch are connected to core switches on (k/2) strides.
  • Focus on designs up to k = 48.
  • Use identical 48-port GigE switches.
  • The network supports 27,648 hosts, made up of 1,152 subnets with 24 hosts each.
  • There are 576 equal-cost paths between any given pair of hosts in different pods.
  • The cost of deploying such a network architecture would be $8.64M, compared to $37M for the traditional techniques.

Architecture Design Motivation
  • There are (k/2)^2 shortest paths between any two hosts in different pods, but only one is chosen.
  • Each path has 5 hops.
  • Protocols like OSPF select paths based on hop count, so it is possible for a small subset of core switches, perhaps only one, to be chosen as the intermediate links between pods.
  • Need a simple, fine-grained method of traffic diffusion.

Addressing
  • All IP addresses in the network are allocated within the private /8 block.
  • The pod switches are given addresses of the form 10.pod.switch.1, where pod denotes the pod number (in [0, k−1]) and switch denotes the position of that switch in the pod (in [0, k−1], starting from left to right, bottom to top).
  • Core switches are given addresses of the form 10.k.j.i, where j and i denote that switch’s coordinates in the (k/2)^2 core switch grid (each in [1, (k/2)], starting from top-left).

Two-level Routing Table
  • Each entry in the main routing table will potentially have an additional pointer to a small secondary table of (suffix, port) entries.
  • A first-level prefix is terminating if it does not contain any second-level suffixes.
  • A secondary table may be pointed to by more than one first-level prefix.
  • Entries in the primary table are left-handed (i.e., /m prefix masks of the form 1^m 0^(32−m)); entries in the secondary tables are right-handed (i.e., /m suffix masks of the form 0^(32−m) 1^m).
  • If the longest-matching prefix search yields a non-terminating prefix, then the longest-matching suffix in the secondary table is found and used.
  • The routing table of any pod switch will contain no more than k/2 prefixes and k/2 suffixes.

Routing Algorithm
  • Pod switches
      • If a host sends a packet to another host in the same pod but on a different subnet, then all upper-level switches in that pod will have a terminating prefix pointing to the destination subnet’s switch.
      • For all other outgoing inter-pod traffic, the pod switches have a default /0 prefix with a secondary table matching host IDs.
      • Employ the host IDs as a source of deterministic entropy; they will cause traffic to be evenly spread upward among the outgoing links to the core switches.
  • Aggregation switches
      • Once a packet reaches its destination pod, the receiving upper-level pod switch will also include a (10.pod.switch.0/24, port) prefix to direct that packet to its destination subnet switch, where it is finally switched to its destination host.
      • Generating the upper aggregation switch routing table; for lower switches, omit lines 3-5.
  • Core switches
      • Once a packet reaches a core switch, there is exactly one link to its destination pod, and that switch will include a terminating /16 prefix for the pod of that packet (10.pod.0.0/16, port).

An Example
  • Source: ; destination:
  • At the gateway switch (), the packet matches the /0 first-level prefix, then matches the /8 second-level suffix, is forwarded to port 2, and is routed to the pod switch . (i=3, z=1)
  • At the gateway switch (), the packet matches the /0 first-level prefix, then matches the /8 second-level suffix, is forwarded to port 2, and is routed to the pod switch . (i=3, z=2)
  • At , the packet matches a terminating /16 prefix, which points to pod 2 on port 2, and switch .
  • At , the packet matches a terminating /24 prefix, which points to the switch responsible for that subnet, on port 0.
  • What if the destination becomes ?
  • Centralized algorithm, not a distributed one. Why is it feasible?

Flow Classification
  • The two-level routing technique is static, but traffic is not evenly distributed among the hosts.
  • Edge switch
      • Recognize subsequent packets of the same flow, and forward them on the same outgoing port. (Packets with the same <src IP, dst IP, src port, dst port, proto> belong to the same flow.)
      • Periodically reassign a minimal number of flow output ports to minimize any disparity between the aggregate flow capacity of different ports.

Flow Scheduling
  • Traffic is dominated by a few large, long-lived flows.
  • Edge switch
      • Additionally detect any outgoing flow whose size grows above a predefined threshold, and periodically send notifications to a central scheduler specifying the source and destination for all active large flows.
  • Central scheduler
      • Maintains boolean state for all links in the network indicating their availability to carry large flows.
      • When the scheduler receives a notification of a new flow, it linearly searches through the core switches to find one whose corresponding path components do not include a reserved link.
      • Upon finding such a path, the scheduler marks those links as reserved, and notifies the relevant lower- and upper-layer switches in the source pod with the correct outgoing port that corresponds to that flow’s chosen path.

Power and Heat Issues
  • Power efficiency of different switches; the last three are 10 GigE switches.
  • Although the fat-tree employs more individual switches, its power and heat figures are better than those incurred by current data center designs, with 56.6% less power consumption and 56.5% less heat dissipation.

Implementation
  • Implement a router prototype with Click.
  • Support the two-level routing table
  • 4 ports

The Click Modular Router Project
  • A software architecture for building flexible and configurable routers
  • /kohler/click

Experiment Description
  • Implement a 4-port fat-tree (k = 4): there are 16 hosts, four pods (each with four switches), and four core switches.
  • Multiplex these 36 elements onto ten physical machines, interconnected by a 48-port ProCurve 2900 switch with 1 Gigabit Ethernet links.
  • Each pod of switches is hosted on one machine; each pod’s hosts are hosted on one machine; and the two remaining machines run two core switches each.
  • For the comparison case of the hierarchical tree network: four machines running four hosts each, and four machines each running four pod switches with one additional uplink.

Benchmark Suite
  • Random: A host sends to any other host in the network with uniform probability.
  • Stride(i): A host with index x will send to the host with index (x + i) mod 16.
  • Staggered Prob (SubnetP, PodP): Where a host will se
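
The two-level lookup described in the Two-level Routing Table and Routing Algorithm slides can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, host IDs are assumed to lie in [2, k/2+1], and the suffix port rule (i − 2 + z) mod (k/2) + k/2 is an assumption that is consistent with the "(i=3, z=1), port 2" step of the example. The address 10.2.0.3 is likewise a made-up destination, since the addresses in the example slides did not survive extraction.

```python
import ipaddress


def build_pod_switch_table(k, pod, z, upper=True):
    """Build a two-level table for pod switch 10.pod.z.1 (hypothetical helper).

    Each entry is (prefix, port, suffixes). A terminating prefix has
    suffixes=None; a non-terminating prefix carries (host_id, port) pairs.
    """
    half = k // 2
    # Assumed rule: host ID i leaves on uplink (i - 2 + z) mod (k/2) + k/2.
    suffixes = [(host_id, (host_id - 2 + z) % half + half)
                for host_id in range(2, half + 2)]
    table = []
    if upper:
        # Terminating /24 prefixes: one per subnet (edge switch) in this pod.
        for subnet in range(half):
            net = ipaddress.ip_network(f"10.{pod}.{subnet}.0/24")
            table.append((net, subnet, None))
    # Default /0 prefix pointing at the secondary (suffix) table.
    table.append((ipaddress.ip_network("0.0.0.0/0"), None, suffixes))
    return table


def lookup(table, dst):
    """Longest-prefix match first, then a suffix match on the host ID byte."""
    addr = ipaddress.ip_address(dst)
    matches = [entry for entry in table if addr in entry[0]]
    _, port, suffixes = max(matches, key=lambda entry: entry[0].prefixlen)
    if suffixes is None:            # terminating prefix
        return port
    host_id = int(addr) & 0xFF      # right-handed /8 suffix: last 8 bits
    for suffix_id, out_port in suffixes:
        if suffix_id == host_id:
            return out_port
    raise LookupError(f"no suffix entry for host ID {host_id}")


edge = build_pod_switch_table(k=4, pod=0, z=1, upper=False)
agg_src = build_pod_switch_table(k=4, pod=0, z=2)
agg_dst = build_pod_switch_table(k=4, pod=2, z=2)
print(lookup(edge, "10.2.0.3"))
print(lookup(agg_src, "10.2.0.3"))
print(lookup(agg_dst, "10.2.0.3"))
```

With k = 4, the lower-layer switch (z = 1) sends host ID 3 up on port 2, the upper-layer switch in the source pod (z = 2) picks port 3, and the upper-layer switch in the destination pod matches the terminating /24 prefix and forwards down on port 0. Because the uplink is chosen from the host ID alone, the table stays at k/2 prefixes plus k/2 suffixes regardless of how many pods exist.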

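The sizing figures quoted in the Fat-tree slides (27,648 hosts, 1,152 subnets and 576 equal-cost paths for k = 48) and in the experiment description (36 elements for k = 4) follow directly from the topology rules. The short Python check below is a sketch with a hypothetical function name; it assumes one subnet per lower-layer switch and one shortest inter-pod path per core switch.

```python
def fat_tree_size(k):
    """Element counts for a fat-tree built from identical k-port switches."""
    if k <= 0 or k % 2:
        raise ValueError("k must be a positive even number")
    half = k // 2
    return {
        "pods": k,
        "edge_switches": k * half,         # k/2 lower-layer switches per pod
        "aggregation_switches": k * half,  # k/2 upper-layer switches per pod
        "core_switches": half * half,      # (k/2)^2
        "hosts": k * half * half,          # k/2 hosts per lower-layer switch
        "subnets": k * half,               # assumed: one subnet per edge switch
        "inter_pod_paths": half * half,    # assumed: one path per core switch
    }


for k in (4, 48):
    size = fat_tree_size(k)
    elements = (size["hosts"] + size["edge_switches"]
                + size["aggregation_switches"] + size["core_switches"])
    print(k, size["hosts"], size["subnets"], size["inter_pod_paths"], elements)
```

For k = 4 the element count is 16 hosts + 16 pod switches + 4 core switches = 36, matching the testbed description.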