A System for Video Surveillance and Monitoring

Robert T. Collins, Alan J. Lipton and Takeo Kanade
Robotics Institute, Carnegie Mellon University, Pittsburgh, PA
E-MAIL: {rcollins, ajl, …}   PHONE: 412-268-1450   HOMEPAGE: /vsam

Abstract

The Robotics Institute at Carnegie Mellon University (CMU) and the Sarnoff Corporation are developing a system for autonomous Video Surveillance and Monitoring. The technical objective is to use multiple, cooperative video sensors to provide continuous coverage of people and vehicles in cluttered environments. This paper presents an overview of the system and significant results achieved to date.

1 Introduction

The DARPA Image Understanding (IU) program is funding basic research in the area of Video Surveillance and Monitoring (VSAM) to provide battlefield awareness. The thrust of CMU's VSAM research is to develop automated video understanding algorithms that allow a network of active video sensors to automatically monitor objects and events within a complex, urban environment. We have developed video understanding technology that can automatically detect and track multiple people and vehicles within cluttered scenes, and monitor their activities over long periods of time. Human and vehicle targets are seamlessly tracked through the environment using a network of active sensors that cooperatively track targets over areas that cannot be viewed continuously by a single sensor alone. Each sensor transmits symbolic events and representative imagery back to a central operator control station, which provides a visual summary of activities detected over a broad area. The user interacts with the system through an intuitive map-based interface. For example, the user can specify that objects entering a region of interest should trigger an alert, relieving the burden of continually watching that area. The system automatically allocates sensors to optimize system performance while fulfilling user commands.

Although developed within the context of providing battlefield awareness, we believe this technology has great potential for applications in remote monitoring of nuclear facilities. Sample tasks that could be automated are verification that routine maintenance activities are being performed according to schedule, logging and tracking visitors and personnel as they enter and move through the site, and providing security against unauthorized intrusion. Other applications in military and law enforcement scenarios include providing perimeter security for troops, monitoring peace treaties or refugee movements using unmanned air vehicles, providing security for embassies or airports, and staking out suspected drug or terrorist hide-outs by collecting time-stamped pictures of everyone entering and exiting the building.

The following sections present an overview of the video surveillance algorithms developed at CMU over the last two years (Section 2) and their incorporation into a prototype system for remote surveillance and monitoring (Section 3).

This work is funded by the DARPA IU program under VSAM contract number DAAB07-97-C-J031.

2 Video Understanding Technologies

Keeping track of people, vehicles, and their interactions in a complex environment is a difficult task. The role of VSAM video understanding technology in achieving this goal is to automatically "parse" people and vehicles from raw video, determine their geolocations, and automatically insert them into a dynamic scene visualization. We have developed robust routines for detecting moving objects (Section 2.1) and tracking them through a video sequence (Section 2.2) using a combination of temporal differencing and template tracking. Detected objects are classified into semantic categories such as human, human group, car, and truck using shape and color analysis, and these labels are used to improve tracking using temporal consistency constraints (Section 2.3). Further classification of human activity, such as walking and running, has also been achieved (Section 2.4). Geolocations of labeled entities are determined from their image coordinates using either wide-baseline stereo from two or more overlapping camera views, or intersection of viewing rays with a terrain model from monocular views (Section 2.5). The computed geolocations are used to provide higher-level tracking capabilities, such as tasking multiple sensors with variable pan, tilt and zoom to cooperatively track an object through the scene (Section 2.6).
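To make the structure of Section 2 easier to follow, the sketch below outlines one plausible per-frame processing loop. It is an illustration only: the class and method names (`Track`, `detect_moving_blobs`, `match_blobs_to_tracks`, and so on) are hypothetical stand-ins for the components described in Sections 2.1-2.6, not the authors' implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Track:
    """Hypothetical per-target record maintained by the tracker (Section 2.2)."""
    track_id: int
    centroid: tuple                 # image position p(t)
    velocity: tuple                 # image velocity v(t)
    template: object = None         # appearance template
    class_votes: dict = field(default_factory=dict)   # label histogram (Section 2.3)
    geolocation: tuple = None       # 3D scene coordinates (Section 2.5)

def process_frame(frame, t_now, background, tracker, classifier, geolocator):
    """One pass of the detection -> tracking -> classification -> geolocation pipeline."""
    blobs = background.detect_moving_blobs(frame)             # Section 2.1
    tracks = tracker.match_blobs_to_tracks(blobs, t_now)      # Section 2.2
    for trk in tracks:
        label = classifier.classify(trk)                      # Section 2.3
        trk.class_votes[label] = trk.class_votes.get(label, 0) + 1
        trk.geolocation = geolocator.locate(trk.centroid)     # Sections 2.5-2.6
    return tracks   # symbolic events sent back to the operator control station
```

Each stage invoked here is expanded in the corresponding subsection below.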
2.1 Moving Target Detection

The initial stage of the surveillance problem is the extraction of moving targets from a video stream. There are three conventional approaches to automated moving target detection: temporal differencing (two-frame or three-frame) [Anderson et al., 1985]; background subtraction [Haritaoglu et al., 1998; Wren et al., 1997]; and optical flow (see [Barron et al., 1994] for an excellent discussion). Temporal differencing is very adaptive to dynamic environments, but generally does a poor job of extracting all relevant feature pixels. Background subtraction provides the most complete feature data, but is extremely sensitive to dynamic scene changes due to lighting and extraneous events. Optical flow can be used to detect independently moving targets in the presence of camera motion; however, most optical flow computation methods are very complex and are inapplicable to real-time algorithms without specialized hardware.

The approach presented here is similar to that taken in [Grimson and Viola, 1997], and is an attempt to make background subtraction more robust to environmental dynamism. The key idea is to maintain an evolving statistical model of the background that adapts to slow changes in the environment. For each pixel value p_n in the nth frame, a running average p̄_n and a form of standard deviation σ_n are maintained by temporal filtering, implemented as:

    p̄_{n+1} = α p_{n+1} + (1 − α) p̄_n
    σ_{n+1} = α |p_{n+1} − p̄_{n+1}| + (1 − α) σ_n        (1)

where the gain α is set from the frame rate f and a time constant τ specifying how fast (how responsively) the background model should adapt to intensity changes. The influence of old observations decays exponentially over time, and thus the background gradually adapts to reflect current environmental conditions.

If a pixel has a value that is more than 2σ from p̄_n, it is considered a foreground pixel. At this point a multiple-hypothesis approach is used to determine its behavior. A new set of statistics (p̄⁰, σ⁰) is initialized for this pixel and the original set is remembered. If, after time t = 3τ, the pixel value has not returned to its original statistical value, the new statistics are chosen as replacements for the old. Foreground (moving) pixels are aggregated using a connected component approach so that individual target "blobs" can be extracted. Transient moving objects cause short-term changes to the image stream that are not included in the background model but are continually tracked, whereas more permanent changes are (after a time increment of 3τ has elapsed) absorbed into the background (see Figure 1).

Figure 1: Example of moving target detection by dynamic background subtraction.

Figure 2: Target pre-processing. A moving target region is morphologically dilated (twice), eroded, and then its border is extracted.

The moving target detection algorithm described above is prone to three types of error: incomplete extraction of a moving object; erroneous extraction of non-moving pixels; and legitimate extraction of illegitimate targets (such as trees blowing in the wind). Incomplete targets are partially reconstructed by blob clustering and morphological dilation (Figure 2). Erroneously extracted "noise" is removed using a size filter whereby blobs below a certain critical size are ignored. Illegitimate targets must be removed by other means such as temporal consistency and domain knowledge. This is the purview of the target tracking algorithm.
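The per-pixel update of Equation (1) and the 2σ foreground test translate directly into a few lines of array code. The NumPy sketch below is a minimal illustration under stated assumptions: grayscale frames, a fixed gain `alpha` standing in for the frame-rate/time-constant setting described above, and no multiple-hypothesis replacement of stale statistics.

```python
import numpy as np

class AdaptiveBackground:
    """Running-mean / running-deviation background model in the spirit of Equation (1)."""

    def __init__(self, first_frame, alpha=0.05, k_sigma=2.0):
        self.mean = first_frame.astype(np.float32)   # running average of pixel values
        self.dev = np.full_like(self.mean, 5.0)      # running deviation (arbitrary initial value)
        self.alpha = alpha                           # adaptation gain
        self.k_sigma = k_sigma                       # foreground threshold in deviations

    def apply(self, frame):
        """Update the model and return a boolean foreground mask for this frame."""
        frame = frame.astype(np.float32)
        # Equation (1): exponential forgetting of old observations.
        self.mean = self.alpha * frame + (1.0 - self.alpha) * self.mean
        self.dev = self.alpha * np.abs(frame - self.mean) + (1.0 - self.alpha) * self.dev
        # A pixel more than k_sigma deviations from its running mean is foreground.
        return np.abs(frame - self.mean) > self.k_sigma * self.dev
```

A mask produced this way would still be cleaned with the size filter and morphological operations described above before connected-component labeling into blobs.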
2.2 Target Tracking

To begin to build a temporal model of activity, individual objects must be tracked over time. The first step in this process is to take the blobs generated by motion detection and match them between frames of a video sequence. Many systems for target tracking are based on Kalman filters. However, as Isard and Blake point out, these are of limited use because they are based on unimodal Gaussian densities that cannot support simultaneous alternative motion hypotheses [Isard and Blake, 1996]. Isard and Blake present a new stochastic algorithm called CONDENSATION that does handle alternative hypotheses. Work on the problem of multiple data association in radar tracking contexts is also relevant [Bar-Shalom and Fortmann, 1988].

We employ a much simpler approach based on a frame-to-frame matching cost function. A record of each blob is kept with the following information: the image trajectory (position p(t) and velocity v(t) as functions of time) of the object centroid; the blob "appearance" in the form of an image template; the blob size s in pixels; and a color histogram h of the blob. The position and velocity of each blob T_i is determined from the last time step t_last and used to predict a new image position at the current time t_now:

    p_i(t_now) ≈ p_i(t_last) + v_i(t_last)(t_now − t_last)        (2)

Using this information, a matching cost is determined between a known target T_i and a candidate moving blob R_j:

    C(T_i, R_j) = f(|p_i − p_j|, |s_i − s_j|, |h_i − h_j|).        (3)

Targets that are "close enough" in cost space are considered to be potential matches. To lend more robustness to changes in appearance and occlusions, the full tracking algorithm uses a combination of cost and adaptive template matching, as described in detail in [Lipton et al., 1998]. Recent results from the system are shown in Figure 3.

Figure 3: Recent results of moving entity detection and tracking, showing detected objects and trajectories overlaid on the original video imagery. Note that tracking persists even when targets are temporarily occluded or motionless.
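Equations (2) and (3) leave the combining function f unspecified. The sketch below therefore uses a weighted sum of the position, size, and color-histogram distances as a stand-in; the weights, the L1 histogram distance, and the attribute names on the track and blob records are assumptions for illustration, not values from [Lipton et al., 1998].

```python
import numpy as np

def predict_position(track, t_now):
    """Equation (2): constant-velocity prediction of the target centroid."""
    return track.position + track.velocity * (t_now - track.t_last)

def matching_cost(track, blob, t_now, w_pos=1.0, w_size=0.5, w_color=0.5):
    """Equation (3): cost between known target T_i and candidate blob R_j.

    The paper only states C = f(|p_i - p_j|, |s_i - s_j|, |h_i - h_j|);
    the weighted-sum form used here is an assumption.
    """
    d_pos = np.linalg.norm(predict_position(track, t_now) - blob.position)
    d_size = abs(track.size - blob.size)
    d_color = np.abs(track.histogram - blob.histogram).sum()
    return w_pos * d_pos + w_size * d_size + w_color * d_color

def best_match(track, blobs, t_now, max_cost=50.0):
    """Accept the cheapest candidate that is 'close enough' in cost space."""
    if not blobs:
        return None
    cost, blob = min(((matching_cost(track, b, t_now), b) for b in blobs),
                     key=lambda cb: cb[0])
    return blob if cost < max_cost else None
```

In the full system this cost is combined with adaptive template correlation so that tracks survive appearance changes and short occlusions.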
2.3 Target Classification

The ultimate goal of the VSAM effort is to be able to identify individual entities (such as the "FedEx truck", the "4:15pm bus to Oakland" and "Fred Smith") and determine what they are doing. As a first step, entities are classified into specific class groupings such as "humans" and "vehicles". Currently, we are experimenting with a neural network approach (Figure 4). The neural network is a standard three-layer network which uses a backpropagation algorithm for hierarchical learning. Inputs to the network are a mixture of image-based and scene-based entity parameters: dispersedness (perimeter²/area, in pixels); image area (pixels); aspect ratio (height/width); and camera zoom factor. Using a set of motion regions automatically extracted but labeled by hand, the network is trained to output one of three classes: human; vehicle; or human group (two or more humans walking close together). When teaching the network that an input entity is a human, all outputs are set to 0.0 except for "human", which is set to 1.0. Other classes are trained similarly. If the input does not fit any of the classes, such as a tree blowing in the wind, all outputs are set to 0.0.

Figure 4: Neural network approach to target classification. The network has a four-unit input layer (dispersedness, area, aspect ratio, zoom magnification), a 16-unit hidden layer, and a three-unit output layer (single human, multiple human, vehicle); the teach pattern for reject targets sets all outputs to 0.0.

Results from the neural network are interpreted as follows:

    if (output > THRESHOLD)
        classification = maximum NN output
    else
        classification = REJECT

The results for this classification scheme are summarized in Table 1.

Table 1: Results for the neural network classification algorithm.

    Class          Samples    % Correctly Classified
    Human            430         99.5
    Human group       96         88.5
    Vehicle          508         99.4
    False alarms      48         64.5
    Total           1082         96.9

This classification approach is effective for single images. However, one of the advantages of video is its temporal component. To exploit this, classification is performed on every entity at every frame, and the results are kept in a histogram, with the ith bucket containing the number of times the object was classified as class i. At each time step, the class label that has been output most often for each object is chosen as its most likely classification.

2.4 Activity Analysis

After classifying an object, we want to determine what it is doing. Understanding human activity is one of the most difficult open problems in the area of automated video surveillance. Detecting and analyzing human motion in real time from video imagery has only recently become viable, with algorithms like Pfinder [Wren et al., 1997] and W4 [Haritaoglu et al., 1998]. These algorithms represent a good first step to the problem of recognizing and analyzing humans, but they still have some drawbacks. In general, they work by detecting features (such as hands, feet and head), tracking them, and fitting them to some a priori human model such as the cardboard model of Ju et al. [Ju et al., 1996]. Therefore the human subject must dominate the image frame so that the individual body components can be reliably detected.

We use a "star" skeletonization procedure for analyzing the motion of humans that are relatively small in the image. Details can be found in [Fujiyoshi and Lipton, 1998]. The key idea is that a simple form of skeletonization, one that extracts only the broad internal motion features of a target, can be employed to analyze its motion. This method provides a simple, real-time, robust way of detecting extremal points on the boundary of the target to produce a "star" skeleton. The "star" skeleton consists of the centroid of an entity and all of the local extremal points that can be recovered by traversing the boundary of the entity's image (Figure 5a).

Figure 5: (A) The star skeleton is formed by "unwrapping" a region boundary as a distance function from the centroid. This function is then smoothed and extremal points are extracted. (B) Determination of skeleton features measuring gait and posture: θ is the angle the leftmost leg makes with the vertical, and φ is the angle the torso makes with the vertical.

Using only measurements based on the "star" skeleton, it is possible to determine the gait and posture of a moving human being. Figure 5b shows how the two angles θ_n and φ_n are extracted from the skeleton. The value φ_n represents the angle of the torso with respect to vertical, while θ_n represents the angle of the leftmost leg in the figure. Figure 6 shows skeleton motion for typical sequences of walking and running humans, along with the values of θ_n and φ_n. These data were acquired in real time from a video stream with a frame rate of 8 Hz. Comparing the average values of φ_n in Figures 6(e)-(f) shows that the posture of a running target can easily be distinguished from that of a walking one, using the angle of the torso segment as a guide. Also, the frequency of cyclic motion of the leg segments provides cues for distinguishing running from walking.

Figure 6: Skeleton motion sequences: (a) skeleton motion of a walking person; (b) skeleton motion of a running person; (c) leg angle of a walking person; (d) leg angle of a running person; (e) torso angle of a walking person; (f) torso angle of a running person. Clearly, the periodic motion of θ_n provides cues to the target's motion, as does the mean value of φ_n.
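The "star" skeleton construction can be sketched directly: unwrap the boundary as a distance-from-centroid signal, smooth it, and keep the local maxima as extremal points. In the sketch below a moving-average filter stands in for the DFT low-pass smoothing of Figure 5(A), and the lowest and highest extremal points are used as proxies for the leftmost leg and the torso direction; these simplifications, and all names, are ours rather than the exact procedure of [Fujiyoshi and Lipton, 1998].

```python
import numpy as np

def star_skeleton(boundary, centroid, smooth=7):
    """Extremal points of a blob boundary relative to its centroid.

    boundary: (N, 2) array of (x, y) points in traversal order (closed contour).
    centroid: (2,) array. Returns the boundary points whose smoothed
    distance-to-centroid d(i) is a local maximum.
    """
    d = np.linalg.norm(boundary - centroid, axis=1)       # "unwrapped" distance signal
    kernel = np.ones(smooth) / smooth
    padded = np.r_[d[-smooth:], d, d[:smooth]]            # circular padding for a closed contour
    d_s = np.convolve(padded, kernel, mode="same")[smooth:-smooth]
    is_max = (d_s > np.roll(d_s, 1)) & (d_s >= np.roll(d_s, -1))
    return boundary[is_max]

def gait_posture_angles(extremal, centroid):
    """Leg angle (theta) and torso angle (phi) from vertical, cf. Figure 5(B)."""
    lowest = extremal[np.argmax(extremal[:, 1])]          # image y grows downward: a foot point
    highest = extremal[np.argmin(extremal[:, 1])]         # head end of the torso
    theta = np.arctan2(lowest[0] - centroid[0], lowest[1] - centroid[1])
    phi = np.arctan2(highest[0] - centroid[0], centroid[1] - highest[1])
    return theta, phi
```

Tracking θ over time gives the cyclic leg motion used to estimate gait frequency, while the mean of φ separates the forward-leaning posture of running from upright walking, as in Figure 6.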
2.5 Model-based Geolocation

The video understanding techniques described so far have operated purely in image space. A large leap in terms of descriptive power can be made by transforming image blobs and measurements into 3D scene-based objects and descriptors. In particular, determination of object location in the scene allows us to infer the proper spatial relationships between sets of objects, and between objects and scene features such as roads and buildings. Furthermore, we believe the key to coherently integrating a large number of target hypotheses from multiple widely-spaced sensors is computation of target spatial geolocation.

In regions where multiple sensor viewpoints overlap, object locations can be determined very accurately by wide-baseline stereo triangulation. However, regions of the scene that can be simultaneously viewed by multiple sensors are likely to be a small percentage of the total area of regard in real outdoor surveillance applications, where it is desirable to maximize coverage of a large area given finite sensor resources. Determining target locations from a single sensor requires domain constraints, in this case the assumption that the object is in contact with the terrain. This contact location is estimated by passing a viewing ray through the bottom of the object in the image and intersecting it with a model representing the terrain (see Figure 7a). Sequences of location estimates over time are then assembled into consistent object trajectories.

Previous uses of the ray intersection technique for object localization in surveillance research have been restricted to small areas of planar terrain, where the relation between image pixels and terrain locations is a simple 2D homography [Bradshaw et al., 1997; Flinchbaugh and Bannon, 1994; Koller et al., 1993]. This has the benefit that no camera calibration is required to determine the back-projection of an image point onto the scene plane, provided the mappings of at least four coplanar scene points are known beforehand. However, large outdoor scene areas may contain significantly varied terrain. To handle this situation, we perform geolocation using ray intersection with a full terrain model provided, for example, by a digital elevation map (DEM).

Figure 7: (A) Estimating object geolocations by intersecting target viewing rays with a terrain model. (B) A Bresenham-like traversal algorithm determines which DEM cell contains the first intersection of a viewing ray and the terrain.

Given a calibrated sensor, and an image pixel corresponding to the assumed contact point between an object and the terrain, a viewing ray (x0 + ku, y0 + kv, z0 + kw) is constructed, where (x0, y0, z0) is the 3D sensor location, (u, v, w) is a unit vector designating the direction of the viewing ray emanating from the sensor, and k ≥ 0 is an arbitrary distance. General methods for determining where a viewing ray first intersects a 3D scene (for example, ray tracing) can be quite involved. However, when scene structure is stored as a DEM, a simple geometric traversal algorithm suggests itself, based on the well-known Bresenham algorithm for drawing digital line segments. Consider the vertical projection of the viewing ray onto the DEM grid (see Figure 7b). Starting at the grid cell (x0, y0) containing the sensor, each cell (x, y) that the ray passes through is examined in turn, progressing outward, until the elevation stored in that DEM cell exceeds the z-component of the 3D viewing ray at that location. The z-component of the viewing ray at location (x, y) is computed as either

    z0 + (x − x0)(w/u)    or    z0 + (y − y0)(w/v)        (4)

depending on which direction cosine, u or v, is larger. This approach to viewing ray intersection localizes objects to lie within the boundaries of a single DEM grid cell. A more precise sub-cell location estimate can then be obtained by interpolation. If multiple intersections with the terrain beyond the first are required, this algorithm can be used to generate them in order of increasing distance from the sensor, out to some cut-off distance. See [Collins et al., 1998] for more details.
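The Bresenham-like traversal of Figure 7(B) is straightforward to sketch. The code below steps one cell at a time along the larger horizontal direction cosine and applies Equation (4) for the ray height at each cell; uniform square cells, direct array indexing of the DEM, and the requirement of a nonzero horizontal ray component are our simplifications, not details from [Collins et al., 1998].

```python
import numpy as np

def first_dem_intersection(dem, sensor, direction, cell=1.0, max_cells=5000):
    """Return the (ix, iy) DEM cell where a viewing ray first meets the terrain, or None.

    dem: 2D elevation array indexed as dem[ix, iy] with cells of size `cell`.
    sensor: (x0, y0, z0) sensor location; direction: unit vector (u, v, w) with
    a nonzero horizontal component.
    """
    x0, y0, z0 = sensor
    u, v, w = direction
    for step in range(1, max_cells):
        # Advance one cell along the dominant horizontal axis (Bresenham-like march).
        if abs(u) >= abs(v):
            x = x0 + np.sign(u) * step * cell
            y = y0 + v * (x - x0) / u
            z = z0 + (x - x0) * (w / u)          # Equation (4)
        else:
            y = y0 + np.sign(v) * step * cell
            x = x0 + u * (y - y0) / v
            z = z0 + (y - y0) * (w / v)          # Equation (4)
        ix, iy = int(round(x / cell)), int(round(y / cell))
        if not (0 <= ix < dem.shape[0] and 0 <= iy < dem.shape[1]):
            return None                           # ray left the mapped area
        if dem[ix, iy] >= z:                      # stored elevation exceeds ray height
            return ix, iy
    return None
```

A sub-cell location estimate could then be refined by interpolating the terrain within the returned cell, as noted above.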
2.6 Multi-Sensor Cooperation

In most complex outdoor scenes, it is impossible for a single sensor to maintain its view of an object for long periods of time. Objects become occluded by environmental features such as trees and buildings, and sensors have limited effective fields of regard. A promising solution to this problem is to use a network of video sensors to cooperatively track an object through the scene. Tracked objects are then handed off between cameras to greatly extend the total effective area of surveillance coverage.

There has been little work done on autonomously coordinating multiple active video sensors to cooperatively track a moving target. One approach is presented by Matsuyama for a controlled indoor environment, where four cameras lock onto a particular object moving across the floor [Matsuyama, 1998]. We approach the problem more generally by using the object's 3D geolocation, as computed in the last section, to determine where each sensor should look. The pan, tilt and zoom of the closest sensor …