大规模数据处理云计算Introduction_第1页
大规模数据处理云计算Introduction_第2页
大规模数据处理云计算Introduction_第3页
大规模数据处理云计算Introduction_第4页
大规模数据处理云计算Introduction_第5页
已阅读5页,还剩34页未读 继续免费阅读

下载本文档

版权说明:本文档由用户提供并上传,收益归属内容提供方,若内容存在侵权,请进行举报或认领

文档简介

大规模数据处理/云计算

Lecture3–MapReduceBasics闫宏飞北京大学信息科学技术学院7/12/2011/~course/cs402/ThisworkislicensedunderaCreativeCommonsAttribution-Noncommercial-ShareAlike3.0UnitedStates

See/licenses/by-nc-sa/3.0/us/fordetailsJimmyLinUniversityofMaryland课程建设SEWMGroupHowdowescaleup?Source:Wikipedia(IBMRoadrunner)DivideandConquer“Work”w1w2w3r1r2r3“Result”“worker”“worker”“worker”PartitionCombineParallelizationChallengesHowdoweassignworkunitstoworkers?Whatifwehavemoreworkunitsthanworkers?Whatifworkersneedtosharepartialresults?Howdoweaggregatepartialresults?Howdoweknowalltheworkershavefinished?Whatifworkersdie?Whatisthecommonthemeofalloftheseproblems?CommonTheme?Parallelizationproblemsarisefrom:Communicationbetweenworkers(e.g.,toexchangestate)Accesstosharedresources(e.g.,data)Thus,weneedasynchronizationmechanismSource:RicardoGuimarãesHerrmannManagingMultipleWorkersDifficultbecauseWedon’tknowtheorderinwhichworkersrunWedon’tknowwhenworkersinterrupteachotherWedon’tknowtheorderinwhichworkersaccessshareddataThus,weneed:Semaphores(lock,unlock)Conditionalvariables(wait,notify,broadcast)BarriersStill,lotsofproblems:Deadlock,livelock,raceconditions...Diningphilosophers,sleepybarbers,cigarettesmokers...Moralofthestory:becareful!CurrentToolsProgrammingmodelsSharedmemory(pthreads)Messagepassing(MPI)DesignPatternsMaster-slavesProducer-consumerflowsSharedworkqueuesMessagePassingP1P2P3P4P5SharedMemoryP1P2P3P4P5MemorymasterslavesproducerconsumerproducerconsumerworkqueueWheretherubbermeetstheroadConcurrencyisdifficulttoreasonaboutConcurrencyisevenmoredifficulttoreasonaboutAtthescaleofdatacenters(evenacrossdatacenters)InthepresenceoffailuresIntermsofmultipleinteractingservicesNottomentiondebugging…Thereality:Lotsofone-offsolutions,customcodeWriteyouowndedicatedlibrary,thenprogramwithitBurdenontheprogrammertoexplicitlymanageeverythingSource:Wikipedia(FlatTire)Source:MITOpenCoursewareSource:MITOpenCoursewareSource:Harper’s(Feb,2008)What’sthepoint?It’sallabouttherightlevelofabstractionThevonNeumannarchitecturehasserveduswell,butisnolongerappropriateforthemulti-core/clusterenvironmentHidesystem-leveldetailsfromthedevelopersNomoreraceconditions,lockcontention,etc.SeparatingthewhatfromhowDeveloperspecifiesthecomputationthatneedstobeperformedExecutionframework(“runtime”)handlesactualexecutionThedatacenteristhecomputer!“BigIdeas”Scale“out”,not“up”LimitsofSMPandlargeshared-memorymachinesMoveprocessingtothedataClusterhavelimitedbandwidthProcessdatasequentially,avoidrandomaccessSeeksareexpensive,diskthroughputisreasonableSeamlessscalabilityFromthemythicalman-monthtothetradablemachine-hourMapReducegggggfffffMapFoldRootsinFunctionalProgrammingTypicalLarge-DataProblemIterateoveralargenumberofrecordsExtractsomethingofinterestfromeachShuffleandsortintermediateresultsAggregateintermediateresultsGeneratefinaloutputKeyidea:provideafunctionalabstractionforthesetwooperationsMapReduce(DeanandGhemawat,OSDI2004)19MapReduceProgrammersspecifytwofunctions:map(k,v)→<k’,v’>*reduce(k’,v’)→<k’’,v’’>*AllvalueswiththesamekeyaresenttothesamereducerTheexecutionframeworkhandleseverythingelse…20mapmapmapmapShuffleandSort:aggregatevaluesbykeysreducereducereducek1k2k3k4k5k6v1v2v3v4v5v6ba12cc36ac52bc78a15b27c2368r1s1r2s2r3s321MapReduceProgrammersspecifytwofunctions:map(k,v)→<k’,v’>*reduce(k’,v’)→<k’,v’>*AllvalueswiththesamekeyaresenttothesamereducerTheexecutionframeworkhandleseverythingelse…What’s“everythingelse”?22MapReduce“Runtime”HandlesschedulingAssignsworkerstomapandreducetasksHandles“datadistribution”MovesprocessestodataHandlessynchronizationGathers,sorts,andshufflesintermediatedataHandleserrorsandfaultsDetectsworkerfailuresandrestartsEverythinghappensontopofadistributedFS(later)23MapReduceProgrammersspecifytwofunctions:map(k,v)→<k’,v’>*reduce(k’,v’)→<k’,v’>*AllvalueswiththesamekeyarereducedtogetherTheexecutionframeworkhandleseverythingelse…Notquite…usually,programmersalsospecify:partition(k’,numberofpartitions)→partitionfork’Oftenasimplehashofthekey,e.g.,hash(k’)modnDividesupkeyspaceforparallelreduceoperationscombine(k’,v’)→<k’,v’>*Mini-reducersthatruninmemoryafterthemapphaseUsedasanoptimizationtoreducenetworktraffic24combinecombinecombinecombineba12c9ac52bc78partitionpartitionpartitionpartitionmapmapmapmapk1k2k3k4k5k6v1v2v3v4v5v6ba12cc36ac52bc78ShuffleandSort:aggregatevaluesbykeysreducereducereducea15b27c298r1s1r2s2r3s3c236825Twomoredetails…BarrierbetweenmapandreducephasesButwecanbegincopyingintermediatedataearlierKeysarriveateachreducerinsortedorderNoenforcedorderingacrossreducers26“HelloWorld”:WordCountMap(Stringdocid,Stringtext):

foreachwordwintext:Emit(w,1);Reduce(Stringterm,Iterator<Int>values):

intsum=0;foreachvinvalues:sum+=v;Emit(term,sum);27MapReducecanreferto…TheprogrammingmodelTheexecutionframework(aka“runtime”)ThespecificimplementationUsageisusuallyclearfromcontext!28MapReduceImplementationsGooglehasaproprietaryimplementationinC++BindingsinJava,PythonHadoopisanopen-sourceimplementationinJavaAnApacheprojectLargecontributionofdevelopmentledbyYahoo,usedinproductionRapidlyexpandingsoftwareecosystemLotsofcustomresearchimplementationsForGPUs,cellprocessors,etc.29split0split1split2split3split4workerworkerworkerworkerworkerMasterUser

Programoutputfile0outputfile1(1)submit(2)schedulemap(2)schedulereduce(3)read(4)localwrite(5)remoteread(6)writeInputfilesMapphaseIntermediatefiles(onlocaldisk)ReducephaseOutputfilesAdaptedfrom(DeanandGhemawat,OSDI2004)30Howdowegetdatatotheworkers?ComputeNodesNASSANWhat’stheproblemhere?31DistributedFileSystemDon’tmovedatatoworkers…moveworkerstothedata!StoredataonthelocaldisksofnodesintheclusterStartuptheworkersonthenodethathasthedatalocalWhy?NotenoughRAMtoholdallthedatainmemoryDiskaccessisslow,butdiskthroughputisreasonableAdistributedfilesystemistheanswerGFS(GoogleFileSystem)forGoogle’sMapReduceHDFS(HadoopDistributedFileSystem)forHadoop32GFS:AssumptionsCommodityhardwareover“exotic”hardwareScale“out”,not“up”HighcomponentfailureratesInexpensivecommoditycomponentsfailallthetime“Modest”numberofhugefilesMulti-gigabytefilesarecommon,ifnotencouragedFilesarewrite-once,mostlyappendedtoPerhapsconcurrentlyLargestreamingreadsoverrandomaccessHighsustainedthroughputoverlowlatencyGFSslidesadaptedfrommaterialby(Ghemawatetal.,SOSP2003)33GFS:DesignDecisionsFilesstoredaschunksFixedsize(64MB)ReliabilitythroughreplicationEachchunkreplicatedacross3+chunkserversSinglemastertocoordinateaccess,keepmetadataSimplecentralizedmanagementNodatacachingLittlebenefitduetolargedatasets,streamingreadsSimplifytheAPIPushsomeoftheissuesontotheclient(e.g.,datalayout)HDFS=GFSclone(samebasicideas)34FromGFStoHDFSTerminologydifferences:GFSmaster=HadoopnamenodeGFSchunkservers=HadoopdatanodesFunctionaldifferences:NofileappendsinHDFS(plannedfeature)HDFSperformanceis(likely)slowerForthemostpart,we’llusetheHadoopterminology…35Adaptedfrom(Ghemawatetal.,SOSP2003)(filename,blockid)(blockid,blocklocati

温馨提示

  • 1. 本站所有资源如无特殊说明,都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
  • 2. 本站的文档不包含任何第三方提供的附件图纸等,如果需要附件,请联系上传者。文件的所有权益归上传用户所有。
  • 3. 本站RAR压缩包中若带图纸,网页内容里面会有图纸预览,若没有图纸预览就没有图纸。
  • 4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
  • 5. 人人文库网仅提供信息存储空间,仅对用户上传内容的表现方式做保护处理,对用户上传分享的文档内容本身不做任何修改或编辑,并不能对任何下载内容负责。
  • 6. 下载文件中如有侵权或不适当内容,请与我们联系,我们立即纠正。
  • 7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

评论

0/150

提交评论