Foreign-Language Translation: Network Performance Measurement
English original: This text is taken from Computer Networks, Fourth Edition, by Andrew S. Tanenbaum.

Network Performance Measurement

When a network performs poorly, its users often complain to the folks running it, demanding improvements. To improve the performance, the operators must first determine exactly what is going on. To find out what is really happening, the operators must make measurements. In this section we will look at network performance measurements. The discussion below is based on the work of Mogul (1993).

The basic loop used to improve network performance contains the following steps:
1. Measure the relevant network parameters and performance.
2. Try to understand what is going on.
3. Change one parameter.

These steps are repeated until the performance is good enough or it is clear that the last drop of improvement has been squeezed out.

Measurements can be made in many ways and at many locations (both physically and in the protocol stack). The most basic kind of measurement is to start a timer when beginning some activity and see how long that activity takes. For example, knowing how long it takes for a TPDU to be acknowledged is a key measurement. Other measurements are made with counters that record how often some event has happened (e.g., number of lost TPDUs). Finally, one is often interested in knowing the amount of something, such as the number of bytes processed in a certain time interval.

Measuring network performance and parameters has many potential pitfalls. Below we list a few of them. Any systematic attempt to measure network performance should be careful to avoid these.

Make Sure That the Sample Size Is Large Enough

Do not measure the time to send one TPDU, but repeat the measurement, say, one million times and take the average. Having a large sample will reduce the uncertainty in the measured mean and standard deviation. This uncertainty can be computed using standard statistical formulas.

Make Sure That the Samples Are Representative

Ideally, the whole sequence of one million measurements should be repeated at different times of the day and the week to see the effect of different system loads on the measured quantity. Measurements of congestion, for example, are of little use if they are made at a moment when there is no congestion. Sometimes the results may be counterintuitive at first, such as heavy congestion at 10, 11, 1, and 2 o'clock, but no congestion at noon (when all the users are away at lunch).

Be Careful When Using a Coarse-Grained Clock

Computer clocks work by incrementing some counter at regular intervals. For example, a millisecond timer adds 1 to a counter every 1 msec. Using such a timer to measure an event that takes less than 1 msec is possible, but requires some care. (Some computers have more accurate clocks, of course.)

To measure the time to send a TPDU, for example, the system clock (say, in milliseconds) should be read out when the transport layer code is entered and again when it is exited. If the true TPDU send time is 300 µsec, the difference between the two readings will be either 0 or 1, both wrong. However, if the measurement is repeated one million times and the total of all measurements added up and divided by one million, the mean time will be accurate to better than 1 µsec.

Be Sure That Nothing Unexpected Is Going On during Your Tests

Making measurements on a university system the day some major lab project has to be turned in may give different results than if made the next day. Likewise, if some researcher has decided to run a video conference over your network during your tests, you may get a biased result. It is best to run tests on an idle system and create the entire workload yourself. Even this approach has pitfalls though. While you might think nobody will be using the network at 3 A.M., that might be precisely when the automatic backup program begins copying all the disks to tape. Furthermore, there might be heavy traffic for your wonderful World Wide Web pages from distant time zones.

Caching Can Wreak Havoc with Measurements

The obvious way to measure file transfer times is to open a large file, read the whole thing, close it, and see how long it takes. Then repeat the measurement many more times to get a good average. The trouble is, the system may cache the file, so only the first measurement actually involves network traffic. The rest are just reads from the local cache. The results from such a measurement are essentially worthless (unless you want to measure cache performance).

Often you can get around caching by simply overflowing the cache. For example, if the cache is 10 MB, the test loop could open, read, and close two 10-MB files on each pass, in an attempt to force the cache hit rate to 0. Still, caution is advised unless you are absolutely sure you understand the caching algorithm.

Buffering can have a similar effect. One popular TCP/IP performance utility program has been known to report that UDP can achieve a performance substantially higher than the physical line allows. How does this occur? A call to UDP normally returns control as soon as the message has been accepted by the kernel and added to the transmission queue. If there is sufficient buffer space, timing 1000 UDP calls does not mean that all the data have been sent. Most of them may still be in the kernel, but the performance utility thinks they have all been transmitted.

Understand What You Are Measuring

When you measure the time to read a remote file, your measurements depend on the network, the operating systems on both the client and server, the particular hardware interface boards used, their drivers, and other factors. If the measurements are done carefully, you will ultimately discover the file transfer time for the configuration you are using. If your goal is to tune this particular configuration, these measurements are fine.

However, if you are making similar measurements on three different systems in order to choose which network interface board to buy, your results could be thrown off completely by the fact that one of the network drivers is truly awful and is only getting 10 percent of the performance of the board.

Translation: Network Performance Measurement

When a network performs poorly, its users usually complain to the network operator and demand better network quality. To improve the network's performance, the operators must first determine what problem has occurred. To find the real problem, the operators must make measurements. In this section we look at network performance measurement. The discussion below is based on the work of Mogul (1993).

The basic loop used to improve network performance consists of the following steps:
(1) Measure the relevant network parameters and performance.
(2) Try to understand the current state of the network.
(3) Change one parameter.

These steps are repeated until the network's performance is good enough, or until all the room for improvement has been exhausted.

Measurements can be made in many ways and in many places (both physical locations and positions in the protocol stack). The most basic measurement technique is to start a timer at the beginning of some activity and then determine how long the activity takes. For example, knowing how long it takes for a TPDU to be acknowledged is a key measurement. Other measurements are made with counters that record how many times some event has occurred, such as the number of lost TPDUs. Finally, one is often interested in the amount of something, such as the number of bytes processed in a given time interval.

Measuring network performance and parameters has many potential pitfalls. Below we list some of them. Any systematic approach to measuring network performance should take care to avoid them.

Make Sure That the Sample Size Is Large Enough

Do not measure the time to send a single TPDU; instead, repeat the measurement, say, one million times and take the average. A large sample reduces the uncertainty in the measured mean and standard deviation. This uncertainty can be computed using standard statistical formulas.

Make Sure That the Samples Are Representative

Ideally, the whole sequence of one million measurements should be repeated at different times of the day and of the week, to show the effect of different system loads on the measured quantity. Measurements of congestion, for example, are of no use if they are made only at a moment when there is no congestion. Sometimes the results may seem counterintuitive at first, such as heavy congestion at 10, 11, 1, and 2 o'clock but none at noon (when all the users are away at lunch).
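
The advice on sample size and on coarse-grained clocks can be checked numerically. The Python sketch below is an illustration only and is not taken from the source text: it simulates timing a 300 µsec event with a clock that ticks once per millisecond, then reports the mean of many readings together with its standard error, computed from the usual sample-variance formula. The function name simulate_coarse_clock and its parameters are hypothetical, and the simulation assumes that each measurement starts at a random point within a clock tick. If the measurements were synchronized with the clock, averaging would not recover the true value.

```python
import math
import random


def simulate_coarse_clock(true_us=300.0, tick_us=1000.0, trials=1_000_000, seed=1):
    """Simulate timing an event of true_us with a clock that ticks every tick_us.

    Assumption: each trial begins at a uniformly random phase within a tick,
    so the difference between the two clock readings is 0 or 1 tick.
    Returns (mean_us, std_error_us).
    """
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        start = rng.random() * tick_us  # random phase within the current tick
        # Clock reading at exit minus clock reading at entry, in whole ticks.
        ticks = math.floor((start + true_us) / tick_us) - math.floor(start / tick_us)
        observed = ticks * tick_us
        total += observed
        total_sq += observed * observed
    mean = total / trials
    # Unbiased sample variance; clamp at zero to guard against rounding.
    variance = max((total_sq - trials * mean * mean) / (trials - 1), 0.0)
    std_error = math.sqrt(variance / trials)  # uncertainty of the mean
    return mean, std_error


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        mean, err = simulate_coarse_clock(trials=n)
        print(f"{n:>9} trials: mean = {mean:8.2f} usec, std error = {err:6.3f} usec")
```

Under these assumptions each reading is 1000 µsec with probability 0.3 and 0 otherwise, so the standard deviation of a single reading is about 458 µsec, and dividing by the square root of one million gives a standard error of roughly 0.46 µsec. That is consistent with the claim that the mean is accurate to better than 1 µsec, and it shows why a handful of readings is not enough.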