RealityEngine Graphics

Kurt Akeley
Silicon Graphics Computer Systems*

* 2011 N. Shoreline Blvd., Mountain View, CA 94043 USA, kurt@sgi.com

Abstract

The RealityEngine™ graphics system is the first of a new generation of systems designed primarily to render texture mapped, antialiased polygons. This paper describes the architecture of the RealityEngine graphics system, then justifies some of the decisions made during its design. The implementation is near-massively parallel, employing 353 independent processors in its fullest configuration, resulting in a measured fill rate of over 240 million antialiased, texture mapped pixels per second. Rendering performance exceeds 1 million antialiased, texture mapped triangles per second. In addition to supporting the functions required of a general-purpose, high-end graphics workstation, the system enables realtime, "out-the-window" image generation and interactive image processing.

CR Categories and Subject Descriptors: I.3.1 [Computer Graphics]: Hardware Architecture; I.3.7 [Computer Graphics]: Three-Dimensional Graphics and Realism - color, shading, shadowing, and texture

1 Introduction

This paper describes and to a large extent justifies the architecture chosen for the RealityEngine graphics system. The designers think of this system as our first implementation of a third-generation graphics system. To us a generation is characterized not by the scope of capabilities of an architecture, but rather by the capabilities for which the architecture was primarily designed: the target capabilities with maximized performance. Because we designed our first machine in the early eighties, our notion of first generation corresponds to this period. Floating point hardware was just becoming available at reasonable prices, frame buffer memory was still quite expensive, and application-specific integrated circuits (ASICs) were not readily available. The resulting machines had workable transformation capabilities, but very limited frame buffer processing capabilities. In particular, smooth shading and depth buffering, which require substantial frame buffer hardware and memory, were not available. Thus the target capabilities of first-generation machines were the transformation and rendering of flat-shaded points, lines, and polygons. These primitives were not lighted, and hidden surface elimination, if required, was accomplished by algorithms implemented by the application. Examples of such systems are the Silicon Graphics Iris 3000 (1985) and the Apollo DN570 (1985). Toward the end of the first-generation period advances in technology allowed lighting, smooth shading, and depth buffering to be implemented, but only with an order of magnitude less performance than was available to render flat-shaded lines and polygons. Thus the target capability of these machines remained first-generation. The Silicon Graphics 4DG (1986) is an example of such an architecture.

Because first-generation machines could not efficiently eliminate hidden surfaces, and could not efficiently shade surfaces even if the application was able to eliminate them, they were more effective at rendering wireframe images than at rendering solids. Beginning in 1988 a second generation of graphics systems, primarily workstations rather than terminals, became available. These machines took advantage of reduced memory costs and the increased availability of ASICs to implement deep frame buffers with multiple rendering processors. These frame buffers had the numeric ability to interpolate colors and depths with little or no performance loss, and the memory capacity and bandwidth to support depth buffering with minimal performance loss. They were therefore able to render solids and full-frame scenes efficiently, as well as wireframe images. The Silicon Graphics GT (1988) [11] and the Apollo DN590 (1988) are early examples of second-generation machines. Later second-generation machines, such as the Silicon Graphics VGX [12], the Hewlett Packard VRX, and the Apollo DN10000 [4] include texture mapping and antialiasing of points and lines, but not of polygons. Their performances are substantially reduced, however, when texture mapping is enabled, and the texture size (of the VGX) and filtering capabilities (of the VRX and the DN10000) are limited.

The RealityEngine system is our first third-generation design. Its target capability is the rendering of lighted, smooth shaded, depth buffered, texture mapped, antialiased triangles. The initial target performance was 1/2 million such triangles per second, assuming the triangles are in short strips, and 10 percent intersect the viewing frustum boundaries. Textures were to be well filtered (8-sample linear interpolation within and between two mipmap [13] levels) and large enough (1024 × 1024) to be usable as true images, rather than simply as repeated textures. Antialiasing was to result in high-quality images of solids, and was to work in conjunction with depth buffering, meaning that no application sorting was to be required. Pixels were to be filled at a rate sufficient to support 30 Hz rendering of full-screen images. Finally, the performance on second-generation primitives (lighted, smooth shaded, depth buffered) was to be no lower than that of the VGX, which renders roughly 800,000 such mesh triangles per second. All of these goals were achieved.

The remainder of this paper is in four parts: a description of the architecture, some specifics of features supported by the architecture, alternatives considered during the design of the architecture, and finally some appendixes that describe performance and implementation details.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. © 1993 ACM 0-89791-601-8/93/008/0015 $1.50

Figure 1. Board-level block diagram of an intermediate configuration with 8 Geometry Engines on the geometry board, 2 raster memory boards, and a display generator board. (Diagram labels: System Bus; geometry board with Command Processor and Geometry Engines; Triangle Bus; raster memory boards with Fragment Generators and Image Engines; display generator board; video.)

2 Architecture

The RealityEngine system is a 3, 4, or 6 board graphics accelerator that is installed in a MIPS RISC workstation. The graphics system and one or more MIPS processors are connected by a single system bus. Figure 1 is a board-level block diagram of the RealityEngine graphics accelerator. The geometry board comprises an input FIFO, the Command Processor, and 6, 8, or 12 Geometry Engines. Each raster memory board comprises 5 Fragment Generators (each with its own complete copy of the texture memory), 80 Image Engines, and enough frame buffer memory to allocate 256 bits per pixel to a 1280 × 1024 frame buffer. The display generator board supports all video functions, including video timing, genlock, color mapping, and digital-to-analog conversion. Systems can be configured with 1, 2, or 4 raster memory boards, resulting in 5, 10, or 20 Fragment Generators and 80, 160, or 320 Image Engines.

To get an initial notion of how the system works, let's follow a single triangle as it is rendered. The position, color, normal, and texture coordinate commands that describe the vertexes of the triangle in object coordinates are queued by the input FIFO, then interpreted by the Command Processor. The Command Processor directs all of this data to one of the Geometry Engines, where the coordinates and normals are transformed to eye coordinates, lighted, transformed to clip coordinates, clipped, and projected to window coordinates. The associated texture coordinates are transformed by a third matrix and associated with the window coordinates and colors. Then window coordinate slope information regarding the red, green, blue, alpha, depth, and texture coordinates is computed.

The projected triangle, ready for rasterization, is then output from the Geometry Engine and broadcast on the Triangle Bus to the 5, 10, or 20 Fragment Generators. (We distinguish between pixels generated by rasterization and pixels in the frame buffer, referring to the former as fragments.) Each Fragment Generator is responsible for the rasterization of 1/5, 1/10, or 1/20 of the pixels in the frame buffer, with the pixel assignments finely interleaved to insure that even small triangles are partially rasterized by each of the Fragment Generators. Each Fragment Generator computes the intersection of the set of pixels that are fully or partially covered by the triangle and the set of pixels in the frame buffer that it is responsible for, generating a fragment for each of these pixels. Color, depth, and texture coordinates are assigned to each fragment based on the initial and slope values computed by the Geometry Engine. A subsample mask is assigned to the fragment based on the portion of each pixel that is covered by the triangle. The local copy of the texture memory is indexed by the texture coordinates, and the 8 resulting samples are reduced by linear interpolation to a single color value, which then modulates the fragment's color.

The resulting fragments, each comprising a pixel coordinate, a color, a depth, and a coverage mask, are then distributed to the Image Engines. Like the Fragment Generators, the Image Engines are each assigned a fixed subset of the pixels in the frame buffer. These subsets are themselves subsets of the Fragment Generator allocations, so that each Fragment Generator communicates only with the 16 Image Engines assigned to it. Each Image Engine manages its own dynamic RAM that implements its subset of the frame buffer. When a fragment is received by an Image Engine, its depth and color sample data are merged with the data already stored at that pixel, and a new aggregate pixel color is immediately computed. Thus the image is complete as soon as the last primitive has been rendered; there is no need for a final frame buffer operation to resolve the multiple color samples at each pixel location to a single displayable color.

Before describing each of the rendering operations in more detail, we make the following observations. First, after it is separated by the Command Processor, the stream of rendering commands merges only at the Triangle Bus. Second, triangles of sufficient size (a function of the number of raster memory boards) are processed by almost all the processors in the system, avoiding only 5, 7, or 11 Geometry Engines. Finally, small to moderate FIFO memories are included at the input and output of each Geometry Engine, at the input of each Fragment Generator, and at the input of each Image Engine. These memories smooth the flow of rendering commands, helping to insure that the processors are utilized efficiently.

2.1 Command Processor

That the Command Processor is required at all is primarily a function of the OpenGL™ [8][7] graphics language. OpenGL is modal, meaning that much of the state that controls rendering is included in the command stream only when it changes, rather than with each graphics primitive. The Command Processor distinguishes between two classes of this modal state. OpenGL commands that are expected infrequently, such as matrix manipulations
and lighting model changes, are broadcast to all the Geometry Engines. OpenGL commands that are expected frequently, such as vertex colors, normals, and texture coordinates, are shadowed by the Command Processor, and the current values are bundled with each rendering command that is passed to an individual Geometry Engine. The Command Processor also breaks long connected sequences of line segments or triangles into smaller groups, each group passing to a single Geometry Engine. The size of these groups is a tradeoff between the increased vertex processing efficiency of larger groups (due to shared vertexes within a group) and the improved load balancing that results from smaller groups. Finally, because the Command Processor must interpret each graphics command, it is also able to detect invalid command sequences and protect the subsequent processors from their effects.

Non-broadcast rendering commands are distributed to the Geometry Engines in pure round-robin sequence, taking no account of Geometry Engine loading. This approach was chosen for its simplicity, and is efficient because the processing requirements of primitives are usually very similar, and because the input and output FIFOs of each Geometry Engine smooth the imbalances due to data-dependent processing such as clipping.

2.2 Geometry Engines

The core of each Geometry Engine is an Intel i860XP processor. Operating at 50 MHz, the combined floating point multiplier and ALU can achieve a peak performance of 100 MFLOPS. Each Intel processor is provided 2 Mbytes of combined code/data dynamic memory, and is supported by a single ASIC that implements the input and output FIFOs, a small register space from which the i860XP accesses incoming commands, and specialized data conversion facilities that pack computed slope data into a format accepted by the Fragment Generators. (Figure 2.)

Figure 2. Individual Geometry Engine. (Diagram labels: i860XP, ASIC, 256K × 64 DRAM; input from Command Processor, output to Triangle Bus.)

All Geometry Engine code is first developed in C, which is cross compiled for the i860XP on MIPS RISC development systems. Code that is executed frequently is then recoded in i860XP assembly code, showing the greatest improvement in performance where scheduling of the vector floating point unit is hand optimized. The assembly code is written to conform to the compiler's link conventions, so that hand-coded and compiled modules are interchangeable for development and documentation purposes.

Most floating point arithmetic is done in single precision, but much of the texture arithmetic, and all depth arithmetic after projection transformation, must be done in double precision to maintain the required accuracy. After transformation, lighting, and clipping, the rasterization setup code treats each parameter as a plane equation, computing its signed slope in the positive X and Y screen directions. Because the parameters of polygons with more than 3 vertexes may be non-planar, the Geometry Engine decomposes all polygons to triangles.

2.3 Triangle Bus

The Triangle Bus acts as a crossbar, connecting the output of each Geometry Engine to the inputs of all the Fragment Generators. Because all Geometry Engine output converges at this bus, it is a potential bottleneck. To avoid performance loss, the Triangle Bus was designed with bandwidth to handle over one million shaded, depth buffered, texture mapped, antialiased triangles per second, more than twice the number of primitives per second that were anticipated from an 8 Geometry Engine system. This performance cushion allows the later-conceived 12 Geometry Engine system to render at full performance, in spite of the greater than expected performance of the individual engines.

In addition to broadcasting the rasterization data for triangles to the Fragment Generators, the Triangle Bus broadcasts point and line segment descriptions, texture images, and rasterization mode changes such as blending functions.

2.4 Fragment Generators

Although each Fragment Generator may be thought of as a single processor, the data path of each unit is actually a deep pipeline. This pipeline sequentially performs the initial generation of fragments, generation of the coverage mask, texture address generation, texture lookup, texture sample filtering, texture modulation of the fragment color, and fog computation and blending. These tasks are distributed among the four ASICs and eight dynamic RAMs that comprise each Fragment Generator. (Figure 3.)

Fragments are generated using Pineda arithmetic [9], with the algorithm modified to traverse only pixels that are in the domain of the Fragment Generator. A coverage mask is generated for 4, 8, or 16 sample locations, chosen on a regular 8 × 8
subsample grid within the square boundaries of the pixel. The hardware imposes no constraints on which subset of the 64 subsample locations is chosen, except that the same subset is chosen for each pixel. The subset may be changed by the application between frames.

Depth and texture coordinate sample values are always computed at the center-most sample location, regardless of the fragment coverage mask. The single depth sample is later used by the Image Engines to derive accurate depth samples at each subpixel location, using the X and Y depth slopes. Taking the texture sample at a consistent location insures that discontinuities are avoided at pixels that span multiple triangles. Color sample values are computed at the center-most sample location only if it is within the perimeter of the triangle. Otherwise the color sample is taken at a sample location within the triangle perimeter that is near the centroid of the covered region. Thus color samples are always taken within the triangle perimeter, and therefore never wrap to inappropriate values.

Based on a level-of-detail (LOD) calculation and the texture coordinate values at the fragment center, the addresses of the eight texels nearest the sample location in the mipmap of texture images are produced. Eight separate banks of texture memory are then accessed in parallel at these locations. The 8 16-bit values that result are merged with a trilinear blend, based on the subtexel coordinates and the LOD fraction, resulting in a single texture color that varies smoothly from frame to frame in an animation. The entire bandwidth of the 8-bank texture memory is consumed by a single Fragment Generator, so each Fragment Generator includes its own complete copy of all texture images in its texture memory, allowing all Fragment Generators to operate in parallel. Separate FIFO memories on the address and data ports of each texture memory bank insure that random page boundary crossings do not significantly degrade the bandwidth available from the dynamic RAMs.

Figure 3. Individual Fragment Generator. (Diagram labels: four ASICs, eight 1M × 16 DRAMs; input from Triangle Bus, output to 16 Image Engines.)

The last ASIC in the Fragment Generator applies the texture color to the fragment's smooth shaded color, typically by modulation. It then indexes its internal fog table with the fragment's depth value and uses the resulting fog blend factor (computed by linear interpolation between the two nearest table entries) to blend the fragment color with the application-defined fog color.

2.5 Image Engines

Fragments output by a single Fragment Generator are distributed equally among the 16 Image Engines connected to that generator. When the triangle was first accepted by the Fragment Generator for processing, its depth slopes in the X and Y screen directions were broadcast to each Image Engine, which stored them for later use. When an Image Engine accepts a fragment, it first uses these two slope values and the fragment's depth sample value to reconstruct the depth values at each subpixel sample location. The arithmetic required for this operation is simplified because the subpixel sample locations are fixed to a regular 8 × 8 grid. The calculations are linear because depth values have been projected to window coordinates just like the X and Y pixel coordinates. At each sample location corresponding to a '1' in the fragment's coverage mask, the computed depth value is compared to the depth value stored in the frame buffer. If the comparison succeeds, the frame buffer color at that subsample location is replaced by the fragment color, and the frame buffer depth is replaced by the derived fragment depth. If any change is made to the pixel's contents, the aggregate pixel color is recomputed by averaging the subpixel sample colors, and is immediately written to the displayable color buffer that will contain the final image.

Each Image Engine controls a single 256K × 16 dynamic RAM that comprises its portion of the frame buffer. (Figure 4.) When the frame buffer is initialized, this memory is partitioned equally among 4K, 8K, or 16K pixels, resulting in pixels with 1024, 512, or 256 bits. All subsample depth and color samples, as well as the one, two, or four displayable color buffers and other auxiliary buffers, are stored in this memory. By default, colors are stored with 12 bits per red, green, blue, and alpha component in both the displayable buffers and the subpixel samples. Depth values are 32 bits each, and are normally required only for each subpixel sample, not for the displayable color buffer or buffers. Color and depth sample resolutions can be reduced to 8,8,8 and 24 bits to allow more samples to be stored per pixel. The 4K partition stores 8 high-resolution samples per pixel, or 16 low-resolution samples per pixel, in addition to two displayable color buffers of the same resolution. The 8K partition stores 4 high-resolution samples per pixel, or 8 low-resolution samples per pixel, again with two displayable color buffers of the same resolution. The 16K partition cannot be used to support multisample antialiasing.

Figure 4. Individual Image Engine. (Diagram labels: Image Engine, 256K × 16 DRAM; input from Fragment Generator, output to Display Generator.)

Because the number of raster memory boards (1, 2, or 4) and the number of pixels per Image Engine (4K, 8K, or 16K) are independent, the RealityEngine system supports a wide variety of frame buffer dimensions, color and depth resolutions, and subpixel samples. For example, a single raster board system supports 16-sample antialiasing at 640 × 512 resolution or aliased rendering at 1280 × 1024 resolution, and a 4-board system supports 8-sample antialiasing at true HDTV (1920 × 1035) resolution or 16-sample antialiasing at 1280 × 1024 resolution.

2.6 Display Hardware

Each of the 80 Image Engines on the raster memory board drives a single-bit, 50 MHz path to the display board, delivering video data at 500 MBytes per second. All 160 single-bit paths of a two raster memory board configuration are active, doubling the peak video data rate. The paths are time multiplexed by pairs of raster memory boards in the four board configuration. Ten crossbar ASICs on the display board assemble the 80 or 160 single-bit streams into individual color components or color indexes. Color components are then dithered from 12 bits to 10 bits and gamma corrected using 1024 × 8 lookup tables. The resulting 8-bit color components drive digital-to-analog converters and are output to the monitor. Color indexes are dereferenced in a 32K-location lookup table, supporting separate color lookup tables for each of up to 40 windows on the screen. Per-pixel display modes, such as the color index offset, are supported by a combination of Image Engine and display board hardware, driven by window ID bits stored in the frame buffer [1].

3 Features

This section provides additional information regarding the architecture's antialiasing, texture mapping, stereo, and clipping capabilities.

3.1 Antialiasing

The architecture supports two fundamentally different antialiasing techniques: alpha and multisample. Alpha antialiasing of points and lines is common to second-generation architectures. Alpha antialiasing is implemented using subpixel and line-slope indexed tables to generate appropriate coverage values for points and lines, compensating for the subpixel position of line endpoints. Polygon coverage values are computed by counting the '1's in the full precision 8 × 8 coverage mask. The fragment alpha value is scaled by the fractional coverage value, which varies from 0.0, indicating no coverage, to 1.0, indicating complete coverage. If pixel blending is enabled, fragments are blended directly into the color buffer; no subpixel sample locations are accessed or required. Alpha antialiasing results in higher quality points and lines than does multisample antialiasing, because the resolution of the filter tables is greater than the 4-bit equivalent of the 16-sample mask. While alpha-antialiased primitives should be rendered back-to-front or front-to-back (depending on the blend function being used) to generate a correct image, it is often possible to get an acceptable point or line image without such sorting. Alpha-antialiased polygons, however, must be sorted near to far to get an acceptable image. Thus this technique is efficiently applied to polygons only in 2D scenes, such as instrument panels, where primitive ordering is fixed and a slight increase in quality is desired.

Multisample antialiasing has already been described. Its principal advantage over alpha antialiasing is its order invariance: points, lines, and polygons can be drawn into a multisample buffer in any order to produce the same final image. Two different mask generation techniques are supported in multisample mode, each with its own advantages and disadvantages. The default mask generation mode is called point sampled; the alternate mode is area sampled. A point sampled mask is geometrically accurate, meaning that each mask bit is set if and only if its subpixel location is within the perimeter of the point, line, or polygon outline. (Samples on the primitive's edge are included in exactly one of the two adjacent primitives.) Such masks insure the correctness of the final image, at the expense of its filtered quality. The final image is correct because all the samples that comprise it are geometrically valid, none having been taken outside their corresponding primitives. It is poorly sampled because the number of bits set in the mask may not closely correspond to the actual area of the pixel that is covered by the primitive, and the final filtering quality depends on this correspondence. Area sampling attempts to insure that the number of '1's in the sample mask is correct plus or minus 1/2 a sample, based on the actual coverage of pixel area by the primitive. (Figure 5.) In order to accomplish this, area sampled masks necessarily include samples that are outside the primitive outline, resulting in image artifacts such as polygon protrusions at silhouettes and T-junctions. Area sampled masks are implemented with a technique that is related to the one described by Andreas Schilling [10]. Point and area sampling can be selected by the application program on a per-primitive basis.

Figure 5. A narrow triangle intersected with a single, 16-sample pixel. The three samples selected by the area sample method accurately represent the fact that almost 20 percent of the pixel is covered by the triangle. (Panel labels: "The single sample selected by the point sample method is darkened." and "The three samples selected by the area sample method are darkened.")

The desirable multisample property of order invariance is lost if alpha transparency and pixel blending are used. Alpha does sometimes carry significant information, usually as a result of the alpha channel in the texture application. For example, trees are often drawn as single polygons, using an alpha matte to express their shape. In order to handle alpha transparency without requiring pixel blending, the Image Engines have the ability to convert fragment alpha values to pseudo-random masks, which are then logically ANDed with the fragment's coverage mask. This method, while not geometrically accurate, provides usable antialiasing of texture mattes, and is order invariant.

3.2 Texture Mapping

In addition to the 2-dimension texture maps described in the architecture section, 1- and 3-dimension maps are also supported. The eight million texel memory associated with each Fragment Generator stores 2D mipmapped images up to 1024 × 1024, and 3D non-mipmapped images up to 256 × 256 × 64. Thus 3D textures can be used to render volumetric images of substantial resolution, at rates up to 30 frames per second. The S, T, and R texture coordinates of each fragment are computed by interpolating S/W, T/W, R/W, and 1/W, then doing the correct divisions at each pixel, resulting in perspective-corrected mapping. Level-of-detail is also computed for each pixel, based on the worst case of the four pixel-to-texel X and Y ratios.

Linear filtering of the nearest texels and mipmap levels is supported for 1D, 2D, and 3D textures, blending a total of 16 texel colors in the 3D mode. In the 2D case such linear filtering is commonly known as trilinear. Bicubic interpolation is supported for 2D, non-mipmapped textures, again blending 16 texels. There is no support for cubic filtering of 1D or 3D textures, or of any mipmapped textures. The default 16-bit texel size supports RGBA texels at 4 bits per component, RGB texels at 5 bits per component (6 bits for green), intensity-alpha texels at 8 bits per component, and intensity texels at 12 bits per component. 32-bit and 48-bit texels can be specified by the application with proportional loss of performance. The maximum RGBA texel resolution is 12 bits per component, equal to the maximum frame buffer color resolution.

Texture magnification can be done by extrapolation of mipmap levels, resulting in a sharpening of the highest resolution mipmap image, or the highest resolution image can be blended with a replicated 256 × 256 detail image, greatly increasing the apparent resolution of the texture without requiring excessive texture storage. Filter functions for RGB and for alpha can be specified separately to improve the quality of texture mattes. Finally, texture memory can be loaded from the application processor's memory at the rate of 80 million 16-bit texels per second, allowing the application to treat texture memory as a managed cache of images.

3.3 Stereo in a Window

Image Engine memory can be configured with separate left and right color buffers for both the visible and nonvisible displayable color buffers, resulting in a total of four 48-bit color buffers per pixel. The display hardware alternately displays the left and right buffer contents of the visible buffers of all windows so configured, and drives a sync signal that can be used to control screen or head-mounted shutters. This stereo-in-a-window capability is both formally and practically compatible with the X protocol: formally because neither frame buffer dimensions nor pixel aspect ratio are changed when it is enabled or disabled, and practically because it allows monoscopic windows such as menus to be rendered and displayed correctly. To reduce eye fatigue, it is advisable to select a reduced-dimension frame buffer when the window system is initialized, allowing the frame display rate to be increased to 90 Hz within the 140 MHz pixel limit of the display board.

3.4 Fast Clipping

RealityEngine polygon clipping is faster than that of our earlier designs for two fundamental reasons: it is implemented more efficiently, and it is required less often. Higher efficiency results from the MIMD Geometry Engine architecture. Because each of the engines executes an independent code sequence, and because each has significant input and output FIFOs, random clipping delays affect only a single engine and are averaged statistically across all the engines. Also, because each Geometry Engine comprises only a single processor, all of that engine's processing power can be devoted to the clipping process. SIMD architectures are less efficient because all processors are slowed when a single processor must clip a polygon. Pipelines of processors, and even MIMD arrangements of short pipelines, are less efficient because only a fraction of available processing power is available to the clipping process.

The requirement for clipping is reduced through a technique we call scissoring. Near and far plane clipping are done as usual, but the left, right, bottom, and top frustum edges are moved well away from the specified frustum, and all triangles that fall within the expanded frustum are projected to extended window coordinates. If culling is done by the application, almost no triangles will actually intersect the sides of the expanded frustum. Projected triangles that are not fully within the viewport are then scissored to match the edges of the viewport, eliminating the portions that are not within the viewport. The Pineda rasterization algorithm that is employed easily and efficiently handles the additional rectilinear edges that result, and no fragment generation performance is lost on scissored regions.

4 Design Alternatives

We think that the most interesting part of design is the alternatives considered, and the reasons for choices, rather than the details of the result. This section highlights some of these alternatives, in roughly decreasing order of significance.

4.1 Single-pass Antialiasing

Multi-pass antialiasing using an accumulation buffer [3] is order invariant, and produces high-quality images in 10 to 20 passes. Further, a system that was fast enough to render 10 to 20 full scene images per frame would be a fantastic generator of aliased images. So why design a complex, multisample frame buffer to accomplish the same thing in one pass? The answer is that significantly more hardware would be required to implement a multi-pass machine with equivalent performance. This is true not only because the multi-pass machine must traverse and transform the object coordinates each pass, but in particular because texture mapping would also be performed for each pass. The component costs for traversal, transformation, parameter interpolation, and texture mapping constitute well over half of the multisample machine cost, and they are not replicated in the multisample architecture. A competing multi-pass architecture would have to replicate this hardware in some manner to achieve the required performance. Even the PixelFlow architecture [6], which avoids repeated traversal and transformation by buffering intermediate results, must still rasterize and texture map repeatedly.

4.2 Multisample Antialiasing

Multisample antialiasing is a rather brute-force technique for achieving order invariant single-pass antialiasing. We investigated alternative sorting buffer techniques derived from the A-buffer algorithm [2], hoping for higher filter quality and correct, single-pass transparency. These techniques were rejected for several reasons. First, sort buffers are inherently more complex than the multisample buffer and, with finite storage allocations per pixel, they may fail in undesirable ways. Second, any solution that is less exact than multisampling with point sampled mask generation will admit rendering errors such as polygon protrusions at silhouettes and T-junctions. Finally, the multisample algorithm matches the single-sample algorithm closely, allowing OpenGL pixel techniques such as stencil, alpha test, and depth test to work identically in single or multisample mode.

4.3 Immediate Resolution of Multisample Color

Our initial expectation was that rendering would update only the multisample color and depth values, requiring a subsequent resolution pass to reduce these values to the single color values for
display. The computational expense of visiting all the pixels in the frame buffer is high, however, and the resolution pass damaged the software model, because OpenGL has no explicit scene demarcations. Immediate resolution became much more desirable when we realized that the single most common resolution case, where the fragment completely replaces the pixel's contents (i.e. the fragment mask is all ones and all depth comparisons pass) could be implemented by simply writing the fragment color to the color buffer, making no change to the 4, 8, or 16 subsample colors, and specially tagging the pixel. Only if the pixel is subsequently partially covered by a fragment is the color in the color buffer copied to the appropriate subsample color locations. This technique increases the performance in the typical rendering case and eliminates the need for a resolution pass.

4.4 Triangle Bus

All graphics architectures that implement parallel primitive processing and parallel fragment/pixel processing must also implement a crossbar somewhere between the geometry processors and the frame buffer [5]. While many of the issues concerning the placement of this crossbar are beyond the scope of this paper, we will mention some of the considerations that resulted in our Triangle Bus architecture. The RealityEngine Triangle Bus is a crossbar between the Geometry Engines and the Fragment Generators. Described in RealityEngine terms, architectures such as the Evans & Sutherland Freedom Series™ implement Geometry Engines and Fragment Generators in pairs, then switch the resulting fragments to the appropriate Image Engines using a fragment crossbar network. Such architectures have an advantage in fragment generation efficiency, due both to the improved locality of the fragments and to only one Fragment Generator being initialized per primitive. They suffer in comparison, however, for several reasons. First, transformation and fragment generation rates are linked, eliminating the possibility of tuning a machine for unbalanced rendering requirements by adding transformation or rasterization processors. Second, ultimate fill rate is limited by the fragment bandwidth, rather than the primitive bandwidth. For all but the smallest triangles the quantity of data generated by rasterization is much greater than that required for geometric specification, so this is a significant bottleneck. (See Appendix 2.) Finally, if primitives must be rendered in the order that they are specified, load balancing is almost impossible, because the number of fragments generated by a primitive varies by many orders of magnitude, and cannot be predicted prior to processor assignment. Both OpenGL and the core X renderer require such ordered rendering.

The PixelFlow [6] architecture also pairs Geometry Engines and Fragment Generators, but the equivalent of Image Engines and memory for a 128 × 128 pixel tile are also bundled with each Geometry/Fragment pair. The crossbar in this architecture is the compositing tree that funnels the contents of rasterized tiles to a final display buffer. Because the frame buffer associated with each processor is smaller than the final display buffer, the final image is assembled as a sequence of 128 × 128 logical tiles. Efficient operation is achieved only when each logical tile is rasterized once in its entirety, rather than being revisited when additional primitives are transformed. To insure that all primitives that correspond to a logical tile are known, all primitives must be transformed and sorted before rasterization can begin. This substantially increases the system's latency, and requires that the rendering software support the notion of frame demarcation. Neither the core X renderer nor OpenGL support this notion.

4.5 12-bit Color

Color component resolution was increased from the usual 8 bits to 12 bits for two reasons. First, the RealityEngine frame buffer stores color components in linear, rather than gamma-corrected, format. When 8-bit linear intensities are gamma corrected, single bit changes at low intensities are discernible, resulting in visible "banding". The combination of 12-to-10 bit dithering and 10-bit gamma lookup tables used at display time eliminates visible banding. Second, it is intended that images be computed, rather than just stored, in the RealityEngine frame buffer. Volume rendering using 3D textures, for example, requires back-to-front composition of multiple slices through the data set. If the frame buffer resolution is just sufficient to display an acceptable image, repeated compositions will degrade the

Figure 6. A scene from a driving simulation running full-screen at 30 Hz.

Figure 7. A 12x magnified subregion of the scene in
figure6.Theskytextureisproperlysampledandthesilhouettesofthegroundandbuildingsagainsttheskyareantialiased.resolutionvisibly.The12bitcomponentsallowsubstantialframebuffercompositiontotakeplacebeforeartifactsbecomevisible.ConclusionTheRealityEnginesystemwasdesignedasahighendworkstationgraphicsacceleratorwithspecialabilitiesinimagegenerationandimageprocessing.Thispaperhasdescribeditsarchitectureandcapabilitiesintherealmofimagegeneration20to60Hzanimationsoffullscreen,fullytextured,antialiasedscenes.Figures6and7.Theimageprocessingcapabilitiesofthearchitecturehavenotbeendescribedatalltheyincludeconvolution,colorspaceconversion,tablelookup,histogramming,andavarietyofwarpingandmappingoperationsusingthetexturemappinghardware.Futuredevelopmentswillinvestigateadditionaladvancedrenderingfeatures,whilecontinuallyreducingthecostofhighperformance,highqualitygraphics.115AcknowledgmentsItwasaprivilegetobeapartoftheteamthatcreatedRealityEngine.Whilemanyteammembersmadeimportantcontributionstothedesign,IespeciallyacknowledgeMarkLeatherfordevelopingthemultisampleantialiasingtechniquethatwaseventuallyadopted,andfordesigningaremarkableintegratedcircuittheImageEnginethatimplementedhisdesign.Also,specialthankstoDougVoorhies,whoreadandcarefullymarkedupseveraldraftsofthispaper,Finally,thankstoJohnMontrym,DanBaum,RolfvanWidenfelt,andtheanonymousreviewersfortheirclarificationsandinsights.Appendix1MeasuredPerformanceThetwomostsignificantperformancecategoriesaretransformratethenumberofprimitivespersecondthatcanbeprocessedbytheGeometryEngines,andfillratethenumberoffragmentspersecondthatcanbegeneratedandmergedintotheframebuffer.Runninginthirdgenerationmodelighting,smoothshading,depthbuffering,texturingandmultisampleantialiasinga12GeometryEnginesystemcanprocess1.5millionpoints,0.7millionconnectedlines,and1.0millionconnectedtrianglespersecond.Insecondgenerationmodelighting,smoothshading,anddepthbufferingthesamesystemcanprocess2.0millionpoints,1.3millionconnectedlines,and1.2millionconnectedtrianglespe
rsecond.Measuredthirdgenerationfillratesfor2and4rasterboardsystemsare120and240millionfragmentspersecond.Measuredsecondgenerationfillratesfor1,2,and4rasterboardsystemsare85,180,and360millionfragmentspersecond.Thethirdgenerationfillratenumbersaresomewhatdependentonrenderingorder,andarethereforechosenasaveragesoverarangeofactualperformances.Appendix2BandwidthandotherStatisticsTriangleBus,fragmenttransferpath,andImageEnginetoframebuffermemorybandwidthsareinroughlytheratiosof11020.Specificnumbersforthetypicaltworasterboardconfigurationare240Mbyte/secontheTriangleBus,3,200Mbyte/secaggregateonthe160FragmentGeneratortoImageEnginebusses,and6,400Mbyte/secaggregateonthe160ImageEnginetoframebufferconnections.Becausethe6,400Mbyte/secframebufferbandwidthissomuchlargerthanthebandwidthrequiredtorefreshamonitorroughly800Mbyte/secat1280n021024n0276HzweimplementtheframebuffermemorywithdynamicRAMratherthanvideoRAM,acceptingthe12percentfillratedegradationinfavorofthelowercostofcommoditymemory.GeometryEnginememoryandtexturememoryarealsoimplementedwithcommodity,16bitdatapathdynamicRAM.Totaldynamicmemoryinthemaximallyconfiguredsystemisjustover1/2Gigabyte.References1AKELEY,KURTANDTOMJERMOLUK.HighPerformancePolygonRendering.InProceedingsofSIGGRAPH88August1988,pp.239–246.2CARPENTER,LOREN.TheAbuffer,AnAntialiasedHiddenSurfaceMethod.InProceedingsofSIGGRAPH84July1984,pp.103–108.3HAEBERLI,PAULANDKURTAKELEY.TheAccumulationBufferHardwareSupportforHighQualityRendering.InProceedingsofSIGGRAPH90August1990,pp.309–318.4KIRK,DAVIDANDDOUGLASVOORHIES.TheRenderingArchitectureoftheDN10000VS.InProceedingsofSIGGRAPH90August1990,pp.299–308.5MOLNAR,STEVEN.ImageCompositionArchitecturesforRealTimeImageGeneration.UniversityofNorthCarolinaatChapelHill,ChapelHill,NC,1991.6MOLNAR,STEVEN,JOHNEYLESANDJOHNPOULTON.PixelFlowHighSpeedRenderingUsingImageComposition.InProceedingsofSIGGRAPH92July1992,pp.231–240.7NEIDER,JACQUELINE,MASONWOOANDTOMDAVIS.OpenGLProgrammingGuide.AddisonWesley,1993.8OPENGLARCHITECTUREREVIEWBOARD.OpenG
LReferenceManual.AddisonWesley,1992.9PINEDA,JUAN.AParallelAlgorithmforPolygonRasterization.InProceedingsofSIGGRAPH88August1988,pp.17–20.10SCHILLING,ANDREAS.ANewSimpleandEfficientAntialiasingwithSubpixelMasks.InProceedingsofSIGGRAPH91July1991,pp.133–141.11SILICONGRAPHICS,INC.Iris4DGTTechnicalReport.SiliconGraphics,Inc.,MountainView,CA,1988.12SILICONGRAPHICS,INC.TechnicalReportPowerSeries.SiliconGraphics,Inc.,MountainView,CA,1990.13WILLIAMS,LANCE.PyramidalParametrics.InProceedingsofSIGGRAPH83July1983,pp.1–11.RealityEngineandOpenGLaretrademarksofSiliconGraphics,Inc.FreedomSeriesisatrademarkofEvansSutherlandComputerCorporation.116
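
Illustrative note on the tagged-pixel shortcut. The immediate-resolution discussion preceding Section 4.4 describes a fast path: a fragment that fully covers a pixel and passes every depth test is written only to the color buffer, the pixel is tagged, and the subsample colors are filled in later only if a partially covering fragment arrives. The Python sketch below is one possible software model of that behavior, not the Image Engine's actual logic. The names Pixel and merge, the single depth value per fragment, and the box-filter average used to refresh the displayable color are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Color = Tuple[float, float, float]
FAR = float("inf")


@dataclass
class Pixel:
    n: int = 4                          # subsamples per pixel (4, 8, or 16)
    color: Color = (0.0, 0.0, 0.0)      # displayable color buffer entry
    tagged: bool = True                 # True: every subsample shares `color`
    sub_colors: List[Color] = field(default_factory=list)
    sub_depths: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        self.sub_colors = [self.color] * self.n
        self.sub_depths = [FAR] * self.n


def merge(pixel: Pixel, frag_color: Color, frag_depth: float,
          mask: List[bool]) -> None:
    """Merge one fragment; `mask` must hold one coverage bit per subsample."""
    passing = [m and frag_depth < d for m, d in zip(mask, pixel.sub_depths)]
    if not any(passing):
        return                          # fragment is hidden or covers nothing
    if all(passing):
        # Fast path: replace the pixel, leave subsample colors untouched.
        pixel.color = frag_color
        pixel.tagged = True
        pixel.sub_depths = [frag_depth] * pixel.n
        return
    if pixel.tagged:
        # First partial coverage: expand the shared color into the subsamples.
        pixel.sub_colors = [pixel.color] * pixel.n
        pixel.tagged = False
    for i, ok in enumerate(passing):
        if ok:
            pixel.sub_colors[i] = frag_color
            pixel.sub_depths[i] = frag_depth
    # Assumed resolve step: displayable color is the mean of the subsamples.
    pixel.color = tuple(
        sum(c[k] for c in pixel.sub_colors) / pixel.n for k in range(3)
    )
```

In this model a scene made mostly of fully covering fragments never touches the subsample color storage, which is the saving the text attributes to the technique; only pixels on primitive edges pay for the expansion and the average.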
