Foreign-Language Translation -- Digital Signal Processors Re-embrace Multicore Architecture.doc
http://www.eetasia.com/

Digital Signal Processing Chips reembrace multicore architecture

Posted: 03 Nov 2008

Adding cores to a processor to gain a performance boost, while lowering power demand, has become standard practice in the computing and embedded processor industries. While a similar evolution seems inevitable for all types of high-performance processing, prior experience has made DSP vendors more selective in applying the multicore approach. DSPs are now beginning to reembrace multicore architectures, but mainly for specific applications possessing well-partitioned processing tasks.

Perform partitioning

A DSP application often comprises only a few highly complex tasks, and system performance improvements depend on hastening task execution, not simply running more tasks. Instead of partitioning at the task level, such a system often requires partitioning at the algorithm level. The overall task, such as compressing a video stream, must be broken into steps that can run in parallel on separate cores. The task scheduler or OS cannot perform such partitioning; it must come during the software design.

Many DSP application developers avoid the multicore approach because of the difficulty of algorithm partitioning. At the same time, some tasks such as encryption are not suited to parallelization.

Homogeneous vs. heterogeneous

This doesn't mean that the multicore approach hasn't been tried with DSPs. PicoChip has long had its picoArray architecture that puts multiple, identical cores together for high-performance DSP. In most cases, however, multicore design offerings with DSP had not been homogeneous, that is, having multiple copies of the same core. Instead, they integrated a DSP core with a RISC CPU core.

Such heterogeneous DSPs, for instance, have been part of multicore processor designs for a number of years in the handsets and communications industries. The applications these processors targeted readily separate into signal processing tasks for the DSP and control tasks for the RISC CPU, making partitioning more direct.

One exception was the Blackfin BF561 dual-core DSP from Analog Devices Inc. The device used cores that were designed to handle both types of tasks well, so there was no need to partition along task lines. Instead, developers could assign tasks according to their preference to balance the load among
the cores. Most developers at the time, however, were inexperienced at partitioning software and automated tool support was lacking, so the homogeneous multicore DSP was not quickly adopted.

"The BF561 was an early entrant to the field," said David Katz, ADI's Blackfin applications manager, "and it was ahead of its time. It has taken a while for people to learn partitioning." He noted that ADI has not introduced a homogeneous multicore DSP design since the BF561, although "multicore is an important part of our roadmap strategy."

Other DSP vendors also view multicore as an inevitable trend for DSPs, for the same reason it was adopted in computing: higher performance at lower power.

What is now making homogeneous multicore DSP chip designs appear once more is a shift in the performance increases some systems require. In some applications, the need for performance is moving from performing a single task faster to performing more tasks. This shift is simplifying DSP task partitioning, making it more like that of other embedded applications, and DSP vendors are creating products that capitalize on the opportunity.

Processing change

With the rising demand for VoIP and video over IP, media processing has become one such shifting application. A media gateway design, for instance, must provide a number of voice, audio and video codecs, and handle multiple independent channels. This application structure is easily partitioned into independent tasks, making it a good fit for multicore DSP designs.

The Octasic Vocallo addresses this application space. Vocallo has 15 identical DSP cores, giving designers considerable flexibility in creating parallel and pipelined architectures that strike the right balance of channel capacity for different installations.

Another multicore DSP for communications is Texas Instruments Inc.'s TNETV3020 for high-density core networks. "What we are doing now is application-specific multiprocessing," said Ray Simar, manager of multicore solutions at TI. "In infrastructure applications you're doing the same tasks, so designs can gravitate to multiple copies of the same processor."

The TNETV3020 has six DSP cores, along with a switch fabric and a variety of serial I/O channels, allowing designers to configure the design for tasks such as channel format conversion.

Communications is not the only DSP application
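changing its character. Audio processing has also grown to require high-performance handling of multiple tasks that lend themselves to simple partitioning among multiple cores. According to Sujata Neidig, audio DSP product manager at Freescale Semiconductor Inc., the advent of high definition, Dolby and Blu-ray audio algorithms has increased performance demands on audio DSPs as much as five times, with rising complexity, data rates and numbers of channels. Further, Neidig said more features such as automatic volume control are being integrated into audio.

Recycling code

Freescale's Symphony DSP56724 and DSP56725 DSPs offer a dual-core architecture that allows developers to split the processing burden while reusing their existing code.

Multicore DSPs that target video or mixed audio and video processing are also appearing. Examples are the CT3616 from Cradle Technologies Inc., the Voyageur from Gennum Corp. and multicore DSPs for audio from Cirrus Logic Inc.

The trend toward multicore DSP chip designs may ultimately move the partitioning task out of the developer's hands and into the chip vendor's. An example is the PC302, recently introduced by PicoChip. The company used its general-purpose picoArray architecture to create a device that implements a complete femtocell access point on a single chip. The company handled all the partitioning and put the core system software in on-chip memory, limiting developers' efforts to adding custom functionality.

Such specialized devices may be the near-term future for multicore DSP designs, but long term the multicore approach will envelop general-purpose DSP designs as well. "As we look down the road we can see that multicore for DSP is not a one-trick pony," said Simar. "It will become more common."

Along with that shift will come a growing need for developers to learn how to partition their designs to effectively utilize homogeneous multicore DSPs. "A number of people want a compiler to handle this," said Simar, "but that's not going to happen for a while. We will need to think about things differently in order to apply these devices."

Richard Quinnell
EE Times

Digital Signal Processors Re-embrace Multicore Architecture

Adding cores to a processor to raise performance while lowering power requirements has become standard practice in the computing and embedded processor industries. Although the same evolution seems inevitable for all kinds of high-performance processing, past experience has made digital signal processor (DSP) vendors more selective about adopting the multicore approach. DSPs are now beginning to re-embrace multicore architectures, but mainly for specific applications whose processing tasks can be partitioned cleanly.

Performing the partitioning

DSP applications usually contain only a small number of highly complex tasks, and gains in system performance depend on speeding up task execution rather than simply running more tasks. Rather than partitioning at the task level, DSP systems usually require partitioning at the algorithm level. An entire task, such as compressing a video stream, must be decomposed into multiple steps that can run in parallel on separate cores. The task scheduler or operating system cannot perform this partitioning; it must be done during software design. Many DSP application developers avoid the multicore approach precisely because algorithm partitioning is difficult. Some tasks, such as encryption, also do not lend themselves to parallel operation.

Homogeneous versus heterogeneous

This is not to say that the multicore approach has never been tried with DSPs. PicoChip's long-established picoArray architecture integrates multiple identical cores to support high-performance DSP. In most cases, however, the cores in multicore designs that use a DSP are not all of the same kind. Instead, these designs combine one DSP core with one RISC CPU core. Such heterogeneous DSPs have been used for years in multicore processor designs in the handset and communications industries. The target applications of these processors divide well into signal processing tasks suited to the DSP and control tasks suited to the RISC CPU, which makes partitioning quite simple.

One exception is ADI's Blackfin BF561 dual-core DSP. The cores used in this device perform both kinds of tasks well, so there is no need to partition along task lines. Instead, developers can assign tasks as they see fit to balance the load among the cores. Most developers, however, lacked experience in partitioning software, and automated tool support was also scarce, so homogeneous multicore DSPs were not adopted quickly.

"The BF561 was an early entrant to the field," said David Katz, Blackfin applications manager at ADI, "and it was ahead of its time. It has taken a while for people to learn partitioning." Katz noted that ADI has not introduced a homogeneous multicore DSP design since the BF561, although "multicore is an important part of our roadmap strategy."

Other DSP vendors also regard multicore as an inevitable trend for DSPs, for the same reason it was widely adopted in computing: higher performance at lower power.

What is driving the reappearance of homogeneous multicore DSP chip designs is a shift in the performance gains that some systems require. In some key applications, the demand for performance is moving from executing a single task faster to executing more tasks. This change will simplify DSP task partitioning, making it more like task partitioning in other embedded applications, and DSP vendors are seizing the opportunity to develop products.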
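
The contrast the article draws between task-level and algorithm-level partitioning can be illustrated with a small sketch. This is only an illustration under stated assumptions, not anything taken from the vendors named above: Python threads stand in for DSP cores, and the function names (encode_channel, block_energy, partition_by_task, partition_by_algorithm) and the toy "codec" are hypothetical. In the task-level case each independent channel is already a unit of work; in the algorithm-level case the developer must split one task's data by hand and recombine the partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def encode_channel(samples):
    """Hypothetical stand-in for a per-channel codec: scale 16-bit samples to 8-bit."""
    return [s // 256 for s in samples]


def partition_by_task(channels, cores):
    """Task-level partitioning: every independent channel is one unit of work."""
    with ThreadPoolExecutor(max_workers=cores) as pool:
        return list(pool.map(encode_channel, channels))


def block_energy(block):
    """Partial result for one block: sum of squared samples."""
    return sum(s * s for s in block)


def partition_by_algorithm(samples, cores):
    """Algorithm-level partitioning: split ONE task's data into blocks,
    process the blocks in parallel, then combine the partial results."""
    size = max(1, -(-len(samples) // cores))  # ceiling division, at least 1
    blocks = [samples[i:i + size] for i in range(0, len(samples), size)]
    with ThreadPoolExecutor(max_workers=cores) as pool:
        return sum(pool.map(block_energy, blocks))


if __name__ == "__main__":
    # Two independent channels map directly onto two workers.
    print(partition_by_task([[0, 256, 512], [1024, 300]], cores=2))
    # One signal is split into blocks by the developer, then recombined.
    print(partition_by_algorithm(list(range(10)), cores=4))
```

The block split and the final sum in partition_by_algorithm are the part that, as the article notes, a scheduler or OS cannot supply; in a real codec the steps are rarely this independent, which is why the design work is harder than this sketch suggests.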