Robust View-based Visual Tracking with Detection of Occlusions

Proceedings of the 2001 IEEE International Conference on Robotics

[...] given here. The affine transformed templates are generated from the following equation:

4 Detection of occluded region

4.1 Visual tracking

The basic visual tracking algorithm for dealing with occlusion is as follows.

1. Detect the occluded region in the subimage where the current best match of the template is obtained.
2. Create a mask corresponding to the occluded region. The mask is used to exclude pixels in the occluded region from the calculation of the correlation error.
3. Using the generated mask, determine the best match of the template as the displacement vector that gives the minimum correlation error:

$$D(u, v) = \min_{(u, v)} \sum_{x}^{N} \sum_{y}^{N} \left| S(x+u,\, y+v) - R(x, y) \right| \qquad (h \le u \le k,\ h \le v \le k) \tag{2}$$

R(x, y) is a reference template with a size of N x N pixels, (u, v) is a displacement vector, S(x, y) is the search region, and D(u, v) is the displacement vector that gives the minimum correlation error.

4.2 Detection of occlusion using the tessellated template method

We propose the "tessellated template method" to detect occlusion caused by general objects. To avoid confusion in the following description, we use two terms: "large template" for the original reference template and "small template" for a template obtained by tessellating the original template. For example, if the large template is tessellated into small templates with n horizontal slices and m vertical slices, we have n x m small templates. In every tracking cycle, the system evaluates the correlation errors of the small templates as well as the correlation error of the large template.

The tracking process consists of two stages:

1. In the first stage, the system repeats the following processes (a) and (b) for all small templates.
   (a) The system calculates the correlation error of a small template. The correlation error is used to detect occlusion at the corresponding part of the image.
       That is, the correlation error of the small template is calculated at a fixed position, without shifting the template over a search area as is normally done in tracking.
   (b) If the correlation error is larger than a certain threshold, the part of the image corresponding to the small template is judged to be occluded by objects or, possibly, by the target itself (self-occlusion). The system then generates a mask corresponding to the region of the small template.
2. In the second stage, the system calculates the correlation error of the large template using the mask created in the first stage. The system excludes the pixels in the masked region from the calculation of the correlation of the large template (Fig. 2, left) and determines the best match position of the template, i.e., the position with the minimum correlation error in a search area.

4.3 Detection of occlusion caused by human hand

To perform visual tracking of an object in human-robot interactions, we must handle the frequent occlusions that occur when a human hand grasps the object. To cope with such occlusions, we use images obtained from an infrared camera. The infrared images allow us to reliably extract regions of the human hand even in the presence of background clutter, changes of brightness, and various textures. The system then generates a mask corresponding to the region of such occlusions so that we can exclude the pixel data in that region from the calculation of correlation with the template (Fig. 2, right).

Figure 2: The two methods for detecting occlusions. Labels in the figure: occluding object, mask area, human hand, effective template area.

However, since the basic visual tracking process employs a color CCD camera for the template matching, we must adjust the fields of view of the CCD camera and the infrared camera to be as close as possible. Therefore, we used a CCD camera whose lens zoom could be controlled through a serial port connection.
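
The two-stage procedure of Section 4.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the paper runs the matching on dedicated tracking hardware, whereas here plain nested lists stand in for grayscale images, and the function names (`sad`, `occlusion_mask`, `best_match`), the tile size, the threshold, and the search radius are all hypothetical choices made for the example.

```python
def sad(template, image, top, left, mask=None):
    """Sum of absolute differences between the template and the image patch
    whose upper-left corner is (top, left). Masked pixels are skipped."""
    total = 0
    for y, row in enumerate(template):
        for x, value in enumerate(row):
            if mask is not None and mask[y][x]:
                continue
            total += abs(image[top + y][left + x] - value)
    return total


def occlusion_mask(template, image, top, left, tile, threshold):
    """Stage 1: evaluate each tile x tile small template at the fixed
    position (top, left) and mask the tiles whose error exceeds threshold."""
    h, w = len(template), len(template[0])
    mask = [[False] * w for _ in range(h)]
    for ty in range(0, h, tile):
        for tx in range(0, w, tile):
            cells = [(y, x)
                     for y in range(ty, min(ty + tile, h))
                     for x in range(tx, min(tx + tile, w))]
            err = sum(abs(image[top + y][left + x] - template[y][x])
                      for y, x in cells)
            if err > threshold:
                for y, x in cells:
                    mask[y][x] = True
    return mask


def best_match(template, image, top, left, mask, radius):
    """Stage 2: search displacements (u, v) within +/- radius of (top, left)
    and return (u, v, error) with the minimum masked error."""
    h, w = len(template), len(template[0])
    best = None
    for v in range(-radius, radius + 1):
        for u in range(-radius, radius + 1):
            y0, x0 = top + v, left + u
            if y0 < 0 or x0 < 0 or y0 + h > len(image) or x0 + w > len(image[0]):
                continue  # displaced template would leave the image
            err = sad(template, image, y0, x0, mask)
            if best is None or err < best[2]:
                best = (u, v, err)
    return best


# Hypothetical test scene: an 8 x 8 gradient template placed at (4, 4) in a
# 16 x 16 image, with its upper-left 4 x 4 tile covered by a bright occluder.
template = [[3 * (8 * y + x) for x in range(8)] for y in range(8)]
image = [[0] * 16 for _ in range(16)]
for y in range(8):
    for x in range(8):
        image[4 + y][4 + x] = template[y][x]
for y in range(4):
    for x in range(4):
        image[4 + y][4 + x] = 255

mask = occlusion_mask(template, image, top=4, left=4, tile=4, threshold=100)
result = best_match(template, image, top=4, left=4, mask=mask, radius=2)
print(result)
```

In this toy scene only the covered tile exceeds the threshold, so the masked search ignores it and still finds the template at zero displacement, while the unmasked error at the same position is large.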
4.4 Evaluation of template

Since each of the affine transformed templates has a different shape and size, we normalized the correlation error by the area of the template using the following equation:

$$\text{normalized-error} = \frac{\text{matching-error}}{\text{area-of-the-template}} \tag{3}$$

We integrate the proposed methods to deal with occlusion as well as with changes of appearance in the 3D environment. This requires a modification of Eq. (3) as follows:

$$\text{normalized-error} = \frac{\text{matching-error}}{\text{area-of-the-template} - \text{occlusion-area}} \tag{4}$$

5 Experiments

We conducted visual tracking experiments to evaluate the proposed method with various kinds of occlusion.

Figure 3: Occlusion by general object. (Horizontal axis: frame, 0 to 95.)

5.1 Prototype system

We use visual tracking hardware (Fujitsu, TRV-CU) [7]. The system allows tracking of color templates with a size of 8n x 8m (n, m = 1, 2, ...) pixels, based on a block matching algorithm. It can track about 500 templates in 33 msec when the template is a black-and-white image with a size of 8 x 8 pixels. When the size of the large template is 8n x 8m pixels, the size of the search area is (8n+15) x (8m+15) (n, m = 1, 2, ...). We used small templates with a size of 8 x 8 pixels to detect occlusion.

The tracking hardware allows us to use a mask in calculating the matching error, i.e., the sum of the absolute differences. Since most of the template images are parallelograms, we use a mask to eliminate the unnecessary parts of the template image from the calculation of the matching error.

Figure 4: Occlusion by human hand. (Horizontal axis: frame, 0 to 95.)

The tracking hardware is controlled by a workstation (Sun, SPARCstation 5, 170 MHz). We use an infrared camera (Mitsubishi Electronic Co., IR-U300M1) to capture images of heat and a CCD camera (Sony, EVI-G20) with a controllable zoom lens.
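
As a small worked illustration of the normalization in Eq. (3) and Eq. (4), the sketch below divides a matching error (a sum of absolute differences) by the number of template pixels that actually took part in the comparison. The function name `normalized_error` and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def normalized_error(matching_error, template_area, occlusion_area=0):
    """Matching error per compared pixel.

    With occlusion_area == 0 this is Eq. (3); otherwise it is Eq. (4),
    where masked (occluded) pixels are removed from the denominator.
    """
    effective_area = template_area - occlusion_area
    if effective_area <= 0:
        raise ValueError("no unoccluded template pixels left to compare")
    return matching_error / effective_area


# Hypothetical case: a 64 x 64 template (4096 pixels) whose per-pixel
# error is 2 on average.
unoccluded = normalized_error(8192, 64 * 64)

# Same template with 1024 pixels masked out: the matching error is summed
# over the remaining 3072 pixels only, so the raw sum is smaller.
occluded = normalized_error(6144, 64 * 64, occlusion_area=1024)

print(unoccluded, occluded)
```

Both calls give the same per-pixel value. Dividing the second matching error by the full template area instead would understate it, which is why the occluded area is subtracted before templates of different effective size are compared.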
In the experiments with the tessellated template method, we used a 6-DOF manipulator (Kawasaki Heavy Industry, JS2) and a CCD camera (Sony, EVI-310).

To implement the software, we use EusLisp [8], developed for robotic applications at the Electrotechnical Laboratory. Since EusLisp provides object-oriented functions, we define classes for the geodesic dome, the facets of the dome, and the template.

5.2 Evaluation of the proposed methods

We conducted experiments to evaluate the proposed methods. In the experiments, we compared the performance of the visual tracking with and without the proposed method. Moreover, we evaluated how large an occluded region is permissible in the tracking.

While the system was tracking the target object for 100 TV frames, the target object was occluded by general objects (Fig. 3) and by human hand (Fig. 4), in a direction from the upper left to the lower right. We used a template with a size of 64 x 64 pixels.

We measured the correlation error with and without the proposed method. The graphs in the middle of Fig. 3 and Fig. 4 show the change of the correlation error under occlusion by general objects and by human hand, respectively. The vertical axis indicates the correlation error and the horizontal axis indicates time, in number of TV frames. The solid lines in the graphs show the correlation error with the proposed method and the dotted lines show the error without it. The lower graphs show the ratio of the area of the occluded region to the area of the template. In these graphs, the vertical axis indicates the percentage of the occluded region and the horizontal axis indicates time. The upper four images of the figures show some intermediate scenes.

The occlusion starts after about 40 frames in both Fig. 3 and Fig. 4. Without the proposed method, the correlation errors increase in proportion to the increase of the occlusion area.
In contrast, the errors of the tracking with the proposed method show no increase, as we had expected. Since the occlusion stopped advancing after about 80 frames, the graphs show that the increase of the errors also stopped. The experimental results show that the proposed methods work stably for tracking the target object even if the occlusion area inside the reference template is up to about 80%.

5.3 Visual tracking with detection of occlusions caused by general objects

We conducted experiments of visual tracking with detection of occlusions caused by general objects. In the experiments, the 6-DOF manipulator grasped a target object (a box with a grip) and moved it behind two books. The size of the reference template is 64 x 64 pixels and we generated templates with rotation angles in multiples of 15 degrees.

Fig. 5 shows the tracking experiments. Since the template in Fig. 5(a) has no occlusion, the occlusion ratio is shown to be 0%. The scenes in Fig. 5(b), (c), and (d) show the cases in which the target object was partially occluded by the two books. In these sequences, the tracking was successfully performed. The cycle time of the tracking is also within one TV frame.

5.4 Visual tracking with detection of occlusions caused by human hand

We conducted experiments of visual tracking with detection of occlusions caused by human hand. In the experiments, the system tracked a target object which was grasped arbitrarily by human hand. We moved the grasped object to change its appearance in the 3D space. The size of the reference template is 64 x 64 pixels at one-second resolution. We create the affine transformed templates using a geodesic dome with 320 facets.

Fig. 6 shows an example of the results. The images on the left of Fig. 6 were captured by the infrared camera. The images on the right were captured by the color CCD camera. The number at the bottom of the figure indicates the ratio of the occluded area to the area of the reference template.
The rectangular frame shows the contour of the template. In the top scenes, the template is not yet occluded and the occlusion ratio is 0%. In the middle scenes, the target object is rotated and is occluded by human hand. As shown in the figure, the visual tracking was successfully performed despite the change of appearance and the occlusion by human hand. In the experiments, the cycle time of the visual tracking was within one TV frame (33 msec).

Fig. 7 shows the facet transition on the geodesic dome which corresponds to the best matched templates during the visual tracking. The characters S and G in the figure indicate the start and the goal facets, respectively.

Figure 6: Visual tracking of a target object with detection of occlusions by hand.

5.5 Occlusion in more complex situations

Fig. 8 shows an example of combining the two methods to cope with occlusion. Using a logical OR operation on the masks obtained by the two methods, we can create a new mask for the template to eliminate the data in the occluded region from the correlation. The template has a size of 128 x 128 pixels at one-second resolution and the cycle time of the visual tracking is within one TV frame (33 msec).

The original template can be tessellated into small templates with a size of 4 x 4 pixels using the mask function. The smaller templates permit the shape of the occluded region to be extracted more accurately, at the expense of cycle time, which typically requires two TV frames.

5.6 Visual tracking in a task of human-robot interactions

Finally, we conducted experiments of visual tracking in a task of human-robot interaction where both types of occlusion usually deteriorate the tracking performance. Fig. 9 shows a task of handing over a diskette from human to robot. Visual tracking of the diskette was successfully performed.
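
When both kinds of occlusion are present, the mask combination described in Section 5.5 amounts to an element-wise logical OR. The sketch below is a minimal illustration with hypothetical 4 x 4 masks and hypothetical function names (`combine_masks`, `occlusion_ratio`). It assumes the two masks are already registered to the same template grid, which in the paper depends on matching the fields of view of the CCD and infrared cameras.

```python
def combine_masks(mask_a, mask_b):
    """Element-wise logical OR of two boolean masks of equal size."""
    return [[a or b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(mask_a, mask_b)]


def occlusion_ratio(mask):
    """Fraction of template pixels marked as occluded."""
    total = sum(len(row) for row in mask)
    occluded = sum(1 for row in mask for flag in row if flag)
    return occluded / total


# Hypothetical masks: one from the tessellated template method and one
# from thresholding the infrared image of the hand.
tessellation_mask = [
    [True,  True,  False, False],
    [True,  True,  False, False],
    [False, False, False, False],
    [False, False, False, False],
]
infrared_mask = [
    [False, False, False, False],
    [False, True,  True,  False],
    [False, True,  True,  False],
    [False, False, False, False],
]

combined = combine_masks(tessellation_mask, infrared_mask)
print(occlusion_ratio(combined))
```

A pixel flagged by either method is excluded from the correlation, so the combined mask covers 7 of the 16 pixels here: the two 4-pixel regions overlap in one pixel.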
In the experiments, the size of the reference template was 96 x 96 pixels at one-second resolution and the cycle time of the visual tracking was within one TV frame (33 msec).

Figure 5: Visual tracking of a target with detection of occlusions by general objects.
Figure 7: Facet transition on the geodesic dome during the visual tracking.
Figure 8: Combining the two methods to cope with occlusion.
Figure 9: Visual tracking in a task of handing over a diskette from human to robot.

6 Conclusion

We presented methods for detecting occlusions in view-based visual tracking. We developed the methods according to the cause of occlusion: general objects and human hand. In the case of occlusion by general objects, we detected the occlusion area using a tessellated template, i.e., we evaluated the correlation errors of the component parts of the tessellation. In the case of occlusion by human hand, we utilized infrared images to detect the occluded region in the target template. The system then creates a mask of the occluded region and eliminates pixels in the mask from the calculation of correlation with the template image. In the prototype system, the generation of the mask and the correlation can be performed, typically, in one frame. We have integrated these methods into the view-based visual tracking system which we have already developed. Experimental results demonstrate the usefulness of the proposed methods. We expect that the applications of the robust view-based visual tracking can be extended to various tasks and environments.

Our plans for future work include extending the system to deal with changes of illumination and changes of appearance of the template with more degrees of freedom.
