2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019

Rapid Estimation of Optical Properties for Simulation-Based Evaluation of Pose Estimation Performance

Thorbjørn Mosekjær Iversen, Jakob Wilm and Dirk Kraft

978-1-7281-4003-2/19/$31.00 ©2019 IEEE

Abstract: A growing trend in computer vision is the use of synthetic images for the evaluation of computer vision algorithms such as 3D pose estimation. This is partly due to the availability of high-quality render engines, which provide highly realistic synthetic images. However, the realism of the rendered images, and thus the reliability of the evaluations, strongly depends on how accurately the scenes are modeled, and it requires considerable time and knowledge to do the modeling manually. Automating the modeling process is therefore crucial for making the rendering of photo-realistic synthetic images accessible to the wider robotics community. We present a method for automatically modeling object and light properties for rigid, opaque plastic objects commonly found in industry. Our method relies on recordings of the environment captured with a consumer 360° camera to model the light, and on analysis-by-synthesis to estimate the optical properties of the objects. We show that the synthetic images rendered based on our automatic modeling method can be used to predict the overall performance of a monocular 3D pose estimation algorithm.

I. INTRODUCTION

The evaluation and tuning of
computer vision algorithms rely on visual input. Depending on the context, it can be time-consuming, impractical, or infeasible to capture the needed vision data. Due to the increase in the computational power of modern PCs as well as the improvement of render engines, it has become increasingly popular to rely on synthetic images in addition to real ones [1].

The generation of realistic synthetic images requires accurate models of the light and objects in the scene. Manual modeling takes considerable effort, which leads to a trade-off between model accuracy and time invested in the modeling process. The computer graphics community has a strong focus on increasing the realism of rendered images and consequently provides sophisticated methods and specialized equipment, such as gonioreflectometers, for accurately measuring the visual properties of objects. However, many of these methods prioritize measurement accuracy over measurement speed or rely on expensive specialized equipment. As a result, these methods are seldom used in the computer vision and robotics communities, where synthetic images are mainly used as a means to an end.

We propose a method for automating the modeling
of light and optical object properties for opaque plastic objects. We use recordings of the environment captured with a consumer 360° camera to model the illumination and analysis-by-synthesis to estimate the optical properties. This allows for the generation of highly realistic synthetic images with a minimum of modeling required. Our method thus provides a compromise between modeling speed and image realism.

The research has been funded in part by Innovation Fund Denmark as a part of the project "MADE - Platform for Future Production". All authors are with the Maersk Mc-Kinney Moller Institute, University of Southern Denmark. {thmi,jaw,kraft}@mmmi.sdu.dk

Fig. 1. Top: A photo (left) and a synthetic image rendered based on our method (right). Bottom: 3D pose estimation performance evaluated by estimating the pose in the photo (left) and the synthetic image (right).

While the level of realism is
less than what can be expected if rigid measurements of the bi-directional reflectance distribution function (BRDF) were performed, the method is simple, fast and still results in highly realistic images (see Fig. 1). We evaluate our method by evaluating the performance of a monocular 3D pose estimation algorithm using both real images and synthetic images modeled using our method.

This paper is structured as follows. First, Section II presents an overview of the related literature. Section III then presents our method for modeling the light and optical object properties using analysis-by-synthesis. Section IV describes the details of the evaluation, and Section V presents the results of the evaluation and a discussion of those results. Finally, Section VI concludes on our method and results.

II. RELATED WORK

In the computer vision literature, synthetic images are utilized in a wide range of use cases. They are for instance used in the training of algorithms [2], [3], [4], analysis of uncertainties [5], evaluation of algorithms [1], [6], [7], analysis of visual features [8], [9], and offline programming and prototyping [10], [11]. The methods used for generating the images, the realism of the images and the methods of evaluation differ greatly depending on the use case. In a study of the use of graphics simulations for the evaluation of vision algorithms, Veeravasarapu et al. [12] found that it is still an open question what level of realism is needed to perform meaningful evaluations. It is, therefore, crucial to include a dedicated evaluation of the realism of synthetic images when these are used for evaluation.

In [13], Rossmann et al. present work that explicitly evaluates the realism of synthetic images. They present a virtual testbed, which provides a framework for simulating various vision sensors, for use in the development and initial testing stage of a robotic system. Their system relies on OpenGL rasterization with a subsequent application of filters. This allows them to render synthetic images in real time.

One of the disadvantages of using rasterization is that some of the subtle interactions between light and material are lost. In order to render images with a high degree of realism, Medeiros et al. [1] use a physically based rendering pipeline to simulate a structured light scanner. The synthetic scanner is then used to make a quantitative evaluation of different structured light scanning techniques, taking effects such as subsurface scattering and projector defocus into account.

In [7], Irgenfried et al. explicitly evaluate the realism of synthetic images based on how a given computer vision algorithm performs when applied to a real image and a corresponding synthetic image. In their work, the performance of different segmentation algorithms is evaluated on both synthetic and real images. They report similar performance when a segmentation algorithm is applied to simple scenes. However, for complex scenes, there are some discrepancies between the segmentation performed on the real and synthetic images. They attribute the discrepancies to insufficient modeling of the global illumination.

The above works require that the scene has been modeled. This includes lights, object material, and object and scene geometry. Since state-of-the-art rendering engines are able to reproduce almost any kind of real-world effect, the differences between synthetic and real images can be almost entirely attributed to insufficient modeling. A common method for capturing the BRDF of a surface is to use a gonioreflectometer. Unfortunately, such a device is both expensive to acquire and time-consuming to use. For this reason, the development of more rapid methods is an active research field (see e.g. the work by Mukaigawa et al. [14]).

An alternative to traditional measurements of object properties is analysis-by-synthesis. This method compares a rendering to a
real image and iteratively adjusts the model parameters until the reconstruction error has been minimized. The model parameters at the minimum are then estimates of the corresponding real-world parameters. An example is the work by Corso et al. [15], where analysis-by-synthesis is used to infer the concentration of apple particles in a glass of apple juice based on an image. Another example is the work by Stets et al. [16], where analysis-by-synthesis is used to determine the absorption coefficient of glass.

Our work is strongly inspired by Stets et al. [16]. In their work, a scene, consisting of glass objects on a white tablecloth, is reconstructed by fusing multi-modal sensor data. Using highly accurate, specialized calibration methods, they are able to render highly realistic synthetic images which are comparable to corresponding real images on a pixel-to-pixel level. Our work uses a streamlined image creation pipeline, which relies on analysis-by-synthesis to infer the object parameters and on recordings of the environment to model the illumination. Although our method does not achieve the same accuracy as [16], it does not rely on expensive specialized equipment and time-consuming recordings to model the scene. This makes our method very useful for many robotics applications, in that the performance of vision equipment can be realistically simulated and optimized before investing engineering effort in implementing a physical cell.

We show that the images rendered based on modeling using our method are comparable to what is achieved by manual modeling. Additionally, we show the images are sufficiently realistic to predict the overall performance of the pose estimation algorithm described in [17]. In this work, pose estimation will refer to the use of this method. To the best of our knowledge, our work is the first to use automatic object material and light modeling to facilitate fast and easy simulation-based evaluation of pose estimation performance.

III. METHOD

In order to render a synthetic image, the scene must be described by a set of models which define relevant entities such as the light, the poses of the objects, and the geometry and material properties of the objects. The choice of models dictates the theoretical maximum limit of realism achievable in the image. This section first describes and discusses the models used in our method, which are all commonly used models. This is followed by a description of our method for automatic estimation of optical object properties.

A. The models

The scene model is split into parts which are approached separately. The three parts are the lighting, the geometry, and the optical properties of the object. The following describes each part in more detail.

Lighting. Ideally, all light sources in the scene should be modeled by their position, intensity, and distribution of light emission. While this is possible in a controlled lab environment, industrial scenes generally include many light sources. As light also reflects off objects which are not visible in the image, a complete light model would require the entire scene to be modeled, including objects outside the field-of-view. To avoid this, the light is modeled by an environment map: the scene is modeled as consisting of local objects illuminated by
an environment which is infinitely far away. This approximation allows the light to bounce between local objects in the scene without interacting with the environment. Modeling the light as coming from infinitely far away can potentially introduce a significant parallax error. It is possible to correct for this using parallax correction. However, this would require a manual estimate of the position of the light in the environment. As our method has been developed to automate the modeling process, we have chosen not to include this correction.

Fig. 2. Data flow diagram illustrating our method for modeling the light and optical object properties.

Scene and object geometry. In this work, object geometry refers to both the shape of the object as well as the relative 6D poses of objects, light, camera, and, in the case of the experiments, markers. Our method has been developed for industrial cases, which means that CAD models are available for all objects of interest. When this is not the case, it is possible to either reverse engineer or 3D scan the objects. It is therefore assumed that digital representations of object shapes are available as triangle meshes.

Optical properties of the object. The appearance of an object depends on the incoming light and how this light interacts with the material. While it is impossible to take into account all physical effects taking place, the computer graphics literature provides models which approximate almost any aspect of light interaction. Our study involves opaque plastic objects since these are common in industry and can be fairly accurately described by a limited set of render parameters. We use Pixar Renderman v21 as our rendering engine. The object material is modeled using the Renderman material PxrSurface.
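The analysis-by-synthesis estimation described in Section II can be illustrated with a minimal sketch. Here `toy_render` is a hypothetical stand-in for an actual Renderman invocation, and the brute-force parameter grid replaces a real optimizer; only the loop structure, i.e. render, compare against the photo, and keep the parameters with the lowest reconstruction error, reflects the approach used by our method.

```python
import itertools

def toy_render(albedo, roughness):
    # Hypothetical stand-in for a physically based renderer: returns a
    # tiny "image" (list of pixel intensities) driven by two material
    # parameters. The real pipeline invokes Renderman with PxrSurface.
    return [albedo * (1.0 - roughness) + 0.1 * i * roughness for i in range(4)]

def reconstruction_error(synthetic, real):
    # Sum of squared per-pixel differences between rendering and photo.
    return sum((s - r) ** 2 for s, r in zip(synthetic, real))

def estimate_parameters(real_image, albedos, roughnesses):
    # Analysis-by-synthesis: render every candidate parameter set and
    # keep the one minimizing the reconstruction error against the photo.
    return min(itertools.product(albedos, roughnesses),
               key=lambda p: reconstruction_error(toy_render(*p), real_image))

# A "photo" produced with known parameters is recovered by the loop.
photo = toy_render(0.6, 0.3)
best = estimate_parameters(photo, [0.2, 0.4, 0.6, 0.8], [0.1, 0.3, 0.5])
print(best)  # -> (0.6, 0.3)
```

In practice the search over a continuous parameter space is done with an iterative optimizer rather than a grid, but the render-compare-update cycle is the same.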
Based on initial experiments and an analysis of the available model parameters, we conclude that opaque plastic objects can be relatively accurately modeled using only three parameters: color, surface roughness, and refractive index. Note that this model treats surface roughness as a material parameter regardless of the fact that surface roughness is the microscopic structure of the object geometry. As a result, objects which have surface areas of different roughness are modeled as consisting of different materials.

B. Estimation of model parameters

The achieved realism of the synthetic images largely depends on the model parameters. The following describes our method for estimating these parameters (the lighting, the geometry, and the optical properties of the object). The method consists of three steps. In the first step, images of the environment are recorded using a Theta V 360° camera and merged into a high dynamic range (HDR) environment map which contains the relative light intensity of the environment. In the second step, an image of the object of interest is captured under specialized calibration conditions. In the third step, the light intensity and object material parameters are optimized jointly. A data flow diagram illustrating our method is shown in Fig. 2.

Fig. 3. a) The coordinate frame of the Theta V camera. b) An equirectangular image captured with the Theta V. In our case, the back camera is covered so it contains no relevant image information. Only the central
part of the front sensor is used for calibration of the response curve. This is indicated by the green box in the figure.

Recording of HDR environment map. Images of the environment are recorded using a Ricoh Theta V camera, which is a consumer camera that can capture 360° images covering the full azimuth angle range (0°, 360°) and elevation angle range (-90°, 90°). These images are captured by two image sensors with corresponding fisheye lenses positioned such that one covers the front-facing hemisphere and the other covers the back-facing hemisphere. Immediately after capture, the two fisheye images are stitched where the hemisphere borders meet and mapped to and saved as a single equirectangular image as shown in Fig. 3b.

In order to perform seamless stitching, the Theta V distorts the two input images before stitching. This distortion is strongest close to the image border. Initial experiments have shown that strong light sources close to the image border tend to be attenuated. As a result, this part of the image is unsuitable for reliable recording of the relative light intensity, which is needed for the creation of an HDR environment map. To mitigate the effect, we place the camera with the front lens pointing upwards and the back lens covered. This results in images where the distortion is strongest close to the table horizon. This approach could potentially lead to inaccurate environment maps if there are strong light sources at a low elevation angle above the table. However, in many industrial applications, light sources are placed mainly above the table surface. This means that the distortion at the edges has a relatively low practical impact on the rendered images. This workaround makes it possible to use the Theta V without the need for low-level hardware
access.

A number of images of the environment are captured at varying exposures. The images are merged into an HDR image using Debevec's method [18]. We use OpenCV's implementation, which generates the images in two steps. First, the response curve of the camera is calibrated, after which the images, which are taken at different exposures, are merged into a single HDR environment map. To limit the negative impact which image stitching has on the calibration, only a subregion of the environment images is used for the calibration of the response curve. The region is the central 25% of the front camera, illustrated in Fig. 3. After the

Fig. 5. a) The analysis-by-synthesis image captured under optimal light conditions, b) a corresponding synthetic image illuminated by environment light with optical properties determined through analysis-by-synthesis optimization, c) a synthetic image using the optical properties from the analysis-by-synthesis optimization but with the environment light replaced with a point light at the camera center.

Fig. 4. a) The analysis-by-synthesis image is captured inside a box shielded from outside light by black curtains. The table
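The exposure-merging step described above can be sketched as follows. This is a simplified sketch that assumes a linear, already-calibrated camera response; the actual pipeline additionally calibrates the response curve with Debevec's method [18], available in OpenCV via `createCalibrateDebevec` and `createMergeDebevec`. The hat-shaped weighting that discounts under- and over-exposed pixels is the part shared with Debevec's scheme.

```python
import numpy as np

def hat_weight(z):
    # Debevec-style hat weighting: trust mid-range pixel values and
    # discount pixels that are nearly under- or over-exposed.
    return 1.0 - np.abs(2.0 * z - 1.0)

def merge_exposures(images, times):
    # Merge normalized LDR exposures (values in [0, 1]) captured at the
    # given exposure times into a relative-radiance HDR map, assuming a
    # linear camera response. Each exposure votes with I / t, weighted
    # by how well-exposed the pixel is in that image.
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(images[0], dtype=np.float64)
    for img, t in zip(images, times):
        w = hat_weight(img)
        num += w * (img / t)
        den += w
    return num / np.maximum(den, 1e-8)

# Synthetic check: a known radiance map and three exposures. One pixel
# saturates at the longest exposure and is recovered from the shorter ones.
radiance = np.array([[0.05, 0.10], [0.80, 0.02]])
times = [0.25, 1.0, 2.0]
ldr = [np.clip(radiance * t, 0.0, 1.0) for t in times]
print(np.allclose(merge_exposures(ldr, times), radiance))  # -> True
```

The recovered map is only defined up to a global scale, which is why the paper treats the environment map as relative light intensity and optimizes the absolute light intensity jointly with the material parameters.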