A Benchmark for Visual-Inertial Odometry Systems Employing Onboard Illumination

Mike Kasper¹, Steve McGuire¹, Christoffer Heckman¹

¹Autonomous Robotics and Perception Group (APRG), Department of Computer Science, University of Colorado, Boulder, Colorado, USA.

Abstract: We present a dataset for evaluating the performance of visual-inertial odometry (VIO) systems employing an onboard light source. The dataset consists of 39 sequences, recorded in mines, tunnels, and other dark environments, totaling more than 160 minutes of stereo camera video and IMU data. In each sequence, the scene is illuminated by an onboard light of approximately 1300, 4500, or 9000 lumens. We accommodate both direct and indirect visual odometry methods by providing the geometric and photometric camera calibrations (i.e. response, attenuation, and exposure times). In contrast with existing datasets, we also calibrate the light source itself and publish data for inferring more complex light models. Ground-truth position data are available for a subset of sequences, as captured by a Leica total station. All remaining sequences start and end at the same position, permitting the use of total accumulated drift as a metric for evaluation. Using our proposed benchmark, we analyze the performance of several state-of-the-art VO and VIO frameworks. The full dataset, including sensor data, calibration sequences, and evaluation scripts, is publicly available online at /research/oivio.

I. INTRODUCTION

Given their versatility and relatively low cost, passive cameras are arguably the most common sensor employed in robotics applications. However, to be used reliably, sufficient scene illumination is required. While these conditions are met in many scenarios, there is growing interest for robots to work in darker environments, such as underground or underwater. This is most evident in the recently proposed DARPA Subterranean Challenge [3], but also highlighted by the emergence of workshops focused on the subject [20], [25] and companies fielding robots in this domain.

In the absence of visual information, the robotics community has primarily relied on depth sensors (e.g. LIDAR and active depth cameras). While these sensors are robust to low-texture surfaces and poor illumination, they are often of limited range, lower resolution, and higher cost. More importantly, their utility is reduced in geometrically ambiguous scenes, such as long hallways or tunnels. Traditionally, visual cues have compensated for these limitations [9].

Fig. 1: Example frames from each environment (i.e. tunnels, mines, woods, and office). Each row shows a sequence of four frames, separated by a few seconds. These images highlight the primary challenges posed by our dataset: dynamic illumination, motion blur, and poor camera exposure.

In order to employ cameras in dark environments, robots can be equipped with an onboard light source. However, this would violate the brightness constancy assumption held by most visual perception systems, as scene illumination will change as a result of the robot's movement. This is particularly problematic for direct methods, which work on image intensities [8], [15], [4]. In contrast, indirect methods are robust to such illumination changes, but are far more susceptible to the motion blur and sensor noise we can expect when working in dark environments, due to inadequate camera exposure [23], [13], [4].
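To make the role of the brightness constancy assumption concrete, the following is a generic sketch of the photometric residual minimized by direct methods; the notation is ours and is not tied to any particular framework cited above.

```latex
% Brightness constancy: a pixel p with inverse depth d_p in reference
% image I_1 should keep its intensity when reprojected into image I_2
% under the relative pose T. Direct methods minimize this residual.
r(\mathbf{p}) \;=\; I_2\!\left( \pi\!\left( T \, \pi^{-1}(\mathbf{p}, d_{\mathbf{p}}) \right) \right) \;-\; I_1(\mathbf{p})
```

With a light source that moves with the camera, the irradiance received from the same surface point changes between frames, so this residual no longer vanishes at the true relative pose unless the illumination itself is modeled.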
To assess the performance of existing methods and to aid the development of novel solutions to the aforementioned challenges, we present a benchmarking dataset for visual-inertial odometry (VIO) systems working in environments illuminated by a single, onboard light source. In total, the dataset contains 39 sequences, with over two hours of stereo camera video and IMU data. The sequences were recorded in a number of challenging environments, including tunnels, mines, low-light indoor scenes, and nighttime outdoor scenes. Several example frames can be seen in Figure 1. For each recorded sequence, we illuminate the scene with a white LED light of approximately 1300, 4500, or 9000 lumens. This allows us to assess how much performance depends on lighting strength. In contrast with other datasets, our benchmark not only provides the geometric and photometric camera calibrations, but also a calibration of the light itself. By publishing a light model, we intend to promote the development of novel VIO algorithms that relax the brightness constancy assumption and model the dynamic illumination of the scene. As a point of comparison, however, we analyze several existing frameworks in Section VII.

Fig. 2: Frames captured while executing approximately the same turn with each lighting configuration. From left to right, the depicted scene is illuminated with 1300, 4500, and 9000 lumens. Note how the camera's auto-exposure compensates for the different levels of illumination, but consequently produces varying amounts of motion blur. To aid visual inspection, we have provided enlarged images for the regions outlined in red and green.

II. RELATED WORK

We draw guidance from the popular EuRoC dataset [2] in terms of the sensors and ground-truthing employed. It contains 11 sequences captured via a hardware-synchronized stereo camera and IMU mounted on top of a micro aerial vehicle. Ground-truth 6-DoF poses and 3-DoF positions were captured via a Vicon motion capture system and a Leica MS50 laser tracker, respectively. The dataset does exhibit some challenging lighting scenarios, where large regions of the environment are poorly illuminated. However, lighting remained constant while capturing each sequence. A limitation of the EuRoC dataset is that it is only well-suited for indirect methods; it not only lacks the camera response and attenuation models required by direct methods, but the exposure times of the stereo cameras are also not synchronized.

In contrast, the TUM monoVO dataset [5] targets direct odometry methods, providing full photometric calibration and exposure times as reported by the camera sensor. However, this is a purely monocular dataset, lacking the second camera and IMU sensor found in [2]. The curators of the TUM monoVO dataset also opt for a different ground-truthing strategy. All sequences start and end in the same position, permitting the evaluation of VO frameworks in terms of total accumulated drift over the entire sequence. The published sequences contain a number of challenging scenes, but again exhibit relatively static lighting.

More in line with our focus on onboard illumination is the Oxford RobotCar dataset [12]. It contains over one year of driving data recorded by a car outfitted with six cameras, a LIDAR, and an IMU.
It facilitates the development and evaluation of a number of perception problems related to autonomous vehicles. The sequences exhibit a variety of weather conditions, captured at day and night. While the nighttime sequences are illuminated by the car's headlights, a significant portion of the illumination is contributed by streetlights. Additionally, no model of the car's headlights is provided.

Taking these concepts further, the ETH-ICL dataset [18] focuses directly on the problem of visual SLAM in dynamically lit environments. The dataset contains both real and synthetic sequences, largely based on the TUM RGB-D benchmark [24] and the ICL-NUIM dataset [7]. Each sequence exhibits some form of dynamic lighting, either by modulating global and local light sources, or by the movement of a flashlight co-located with the camera. While this dataset does contain sequences illuminated by an onboard light source, only two sequences employ such a lighting solution. Additionally, no model of the light source is provided, which we believe could be leveraged to develop novel methods for visual odometry.

In a different vein, the DiLiGenT dataset [22] is not intended for VO research, but rather that of photometric stereo. Photometric stereo, in contrast with binocular stereo, is a technique that typically employs a single camera and one or more lights to infer 3D geometry [26]. The DiLiGenT dataset contains a series of images, taken of 10 objects, captured by a stationary camera under different lighting configurations. In addition to the images themselves, the authors provide a calibration of the employed light array, which consists of a 2D grid of 96 uniformly-spaced, white LED lights. We wish to take a similar approach in our visual odometry dataset.

As can be observed from this brief review, just as there is a large diversity of VO solutions, the same can be said of VO datasets. The dataset we present in the following sections is particularly focused on underground environments, with sensing and calibration considerations to match, including onboard lighting and the usage of stereo cameras and IMUs.

III. DATASET

All sequences in our dataset can be characterized as a visual-inertial rig navigating dark environments, illuminated by an onboard light source. We captured sequences in four types of environments: (1) mines, (2) tunnels, (3) outdoors at night, and (4) indoors with all other lights turned off. While some sequences may exhibit small amounts of external lighting, the predominant illuminant in all scenes is the onboard light source. To permit exploration of lighting solutions, we roughly replicate each trajectory with three different lighting intensities. A few example frames are shown in Figure 2. During data capture, the sensor rig was either handheld or mounted on a ground vehicle (Clearpath Husky UGV). In the remainder of this section we provide details about the sensor rig and ground-truthing strategies.

Fig. 3: Our employed sensors include an Intel RealSense D435i and a LORD Microstrain 3DM-GX5-15 (not visible). The onboard light source is a 9000 lumen, 100 W, white LED light. To modulate the light intensity we use a DC-DC boost regulator. Long-term use of this light requires a large passive heat sink and fan. We attach a tracking prism when ground-truthing position data with the Leica. (Labeled components: Leica prism, light, cameras, IMU, computer, batteries.)
A. Sensor Setup

We capture inertial data with a LORD Microstrain 3DM-GX5-15 at 100 Hz, and a stereo pair of 1280 × 720 grayscale images with an Intel RealSense D435i at 30 Hz. We opted for the RealSense as it is a widely available consumer device featuring a hardware-synchronized, fixed-lens, global-shutter stereo camera. However, as it is primarily intended to be used as an active IR depth sensor, these cameras do not filter out infrared light. This does not negatively impact the acquired visual information, but it does require that we disable the IR emitter during operation. Consequently, we do not publish any depth maps with our dataset.

Each sequence in our benchmark is illuminated by an onboard, maximum 9000 lumen, 100 W, white LED light. Long-term use of this light requires a large passive heat sink and fan. Clearly, such a lighting system is not practical for all robotics applications (e.g. micro air vehicles). We therefore attempt to capture the same trajectory three times, modulating the light's intensity to approximately 100, 50, and 15 percent of its full capacity. This allows us to evaluate the performance of visual odometry systems working with different lighting solutions.

The light and sensors are mounted inside a custom housing, equipped with a power supply and onboard computer for logging data. As our sensor rig is self-contained, all captured sequences exhibit consistent extrinsic calibrations, regardless of the mobile platform employed (i.e. handheld or ground vehicle). Depending on the ground-truthing system employed, the rig may also be outfitted with a laser tracking prism. An image of our sensor rig can be seen in Figure 3.

B. Ground-truth

In this work, we employ two methods for ground-truthing, providing a balance between sampling resolution and trajectory complexity. For a subset of our dataset, we employ a Leica TCRP1203 R300 total station to acquire the ground-truth position of our sensor rig at 10 Hz. Unfortunately, maintaining the line-of-sight required by any tracking solution puts undesirable constraints on the nature of the environments and trajectories we can employ. This is especially true in the narrow passageways common in tunnels and mines. We therefore provide an additional set of sequences that start and end in the same position, permitting the use of total accumulated drift as another evaluation metric. As in [5], we start and end each sequence with a 10-20 second segment of loopy camera motion, observing an easy-to-track scene. We then process just these frames using the ORB-SLAM2 framework [14] to obtain accurate poses for the start and end segments. As our dataset contains stereo images, we need not resolve the scale ambiguity that arises in monocular VO, as required in [5].

IV. DATASET FORMAT

A. Data Format

We organize our dataset using the ASL dataset format, similar to the EuRoC dataset [2]. However, we must also accommodate the additional photometric calibrations and the light source model. As in [2], each captured sequence comprises a collection of sensor data. Each sensor comes with a sensor.yaml file, specifying calibration parameters, and a data.csv file, which either contains the sensor data itself or references files in an optional data folder:

    sensor.yaml
    data.csv
    data

Again, ground-truthing systems are treated as separate sensors. In a slight abuse of terminology, we treat the light source as a sensor without a data.csv file.
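To make this per-sensor layout concrete from a consumer's point of view, here is a minimal sketch (assuming only the Python standard library plus the PyYAML package; the helper name load_sequence is ours and not part of the published evaluation scripts) that enumerates the sensor folders of one sequence and loads each sensor.yaml together with its data.csv when present. A concrete directory listing follows below.

```python
import csv
import os

import yaml  # PyYAML, assumed to be installed


def load_sequence(sequence_dir):
    """Enumerate the sensors of one sequence and return, per sensor,
    its calibration (sensor.yaml) and its data.csv rows (if any)."""
    sensors = {}
    for name in sorted(os.listdir(sequence_dir)):
        sensor_dir = os.path.join(sequence_dir, name)
        yaml_path = os.path.join(sensor_dir, "sensor.yaml")
        if not os.path.isfile(yaml_path):
            continue  # skip anything that is not a sensor folder
        with open(yaml_path) as f:
            calibration = yaml.safe_load(f)

        rows = []
        csv_path = os.path.join(sensor_dir, "data.csv")
        if os.path.isfile(csv_path):  # e.g. the light has no data.csv
            with open(csv_path) as f:
                # Tolerate possible comment/header lines starting with '#'.
                rows = [r for r in csv.reader(f)
                        if r and not r[0].startswith("#")]

        sensors[name] = {"calibration": calibration, "data": rows}
    return sensors
```

Treating the ground-truthing systems and the light as ordinary sensor folders means the same loop handles them without special cases; the light entry simply carries no data rows.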
For example, a sequence recorded on the Husky platform is organized as follows:

    husky0
      imu0
        sensor.yaml
        data.csv
      cam0
        sensor.yaml
        data.csv
        data
          154723105215000000.png
          154723105220000000.png
          ...
      leica0
        sensor.yaml
        data.csv
      light0
        sensor.yaml

1) YAML Files: As with the EuRoC MAV dataset, each sensor provides a sensor.yaml file, which describes all relevant properties unique to the sensor. Additionally, all YAML files share two common fields: sensor_type and T_BS. The sensor_type field specifies one of the following sensor types:

    imu | camera | position | pose | light

and T_BS is a 4 × 4 homogeneous transformation matrix describing the sensor's extrinsic relationship with the sensor rig's frame. All properties listed in the YAML file are assumed to be static throughout the entire sequence.

2) CSV Data Files: The data.csv file either contains all the data captured by the sensor throughout the sequence, or references files in the optional data folder. In either case, each line begins with a timestamp (integer nanoseconds, POSIX) denoting the time at which the corresponding data was recorded. The subsequent fields of each sensor's data.csv file are presented in the following sections.

B. Sensors

1) Cameras: As with the EuRoC dataset [2], each line of a camera's data.csv file contains the timestamp and file name of the captured image. We augment this by including the exposure times and gains reported by the camera during capture. The sensor.yaml file specifies the camera's intrinsics with the fields camera_model, intrinsic_coefficients, distortion_model, distortion_coefficients, and resolution.

However, as our dataset accommodates direct VO methods, we also provide a photometric calibration, via the response.csv and vignette.png files in the sensor folder. The response.csv file contains 255 values, providing all camera inverse-response values over the domain of 8-bit color depth. These values can be used as a simple lookup to convert from captured image intensities to irradiance values. The vignette.png file is a monochrome 16-bit image with the same resolution as the camera. Each pixel in the vignette.png specifies the attenuation factor for its respective coordinates in camera images. Section V-B describes our photometric calibration process; a short usage sketch of these files is given below.

2) IMU: As with the EuRoC dataset, each line of an IMU's data.csv file contains the following fields: timestamp, ω_S [rad s⁻¹], and a_S [m s⁻²], where ω_S ∈ R³ is the angular rate and a_S ∈ R³ is the linear acceleration, both expressed in the sensor's body frame. For all sequences, the Microstrain's data is published under the imu0 sensor. The YAML file contains the noise densities and bias "diffusions", which stochastically describe the sensor's random walk. These are obtained in the same way as in the EuRoC MAV dataset; we refer the reader to [2] for more details.

We also publish measurements from the RealSense D435i's onboard IMU (Bosch Sensortec BMI055) as imu1. However, the RealSense does not capture accelerometer and gyroscope data at the same rate. Rather than complicating IMU data access, we opt to keep the same single-line file format. We configure the RealSense to capture accelerometer data at 250 Hz and gyroscope data at 400 Hz, and only report the gyroscope measurement closest to each accelerometer measurement. However, we still provide files for the individual sensor streams in the imu1/data folder.
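The following is the usage sketch referred to in the camera description above: a hedged example of converting a raw 8-bit frame into exposure-normalized irradiance using the published photometric calibration and the exposure time logged in the camera's data.csv. It assumes NumPy and Pillow are available; the function name, the numeric text layout of response.csv, and the normalization of the 16-bit vignette by its maximum value are our assumptions, not specifications of the dataset.

```python
import numpy as np
from PIL import Image  # Pillow, assumed available for PNG I/O


def to_irradiance(image_path, response_csv, vignette_png, exposure_s):
    """Map raw 8-bit pixel intensities to (relative) irradiance using the
    inverse-response lookup table and the vignette attenuation image."""
    raw = np.asarray(Image.open(image_path), dtype=np.int64)

    # Inverse camera response: one value per 8-bit intensity level.
    inv_response = np.loadtxt(response_csv, delimiter=",").ravel()
    idx = np.minimum(raw, len(inv_response) - 1)  # guard against table length
    energy = inv_response[idx]

    # Vignette: 16-bit attenuation image, normalized here to [0, 1].
    vignette = np.asarray(Image.open(vignette_png), dtype=np.float64)
    vignette = np.maximum(vignette / vignette.max(), 1e-6)  # avoid divide-by-zero

    # Undo the vignette attenuation and the exposure time.
    return energy / (vignette * exposure_s)
```

Direct methods would typically apply such a correction per frame before evaluating photometric residuals, whereas indirect methods can ignore these calibration files entirely.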
3) Light: As previously mentioned, the onboard light is treated like a sensor without a data.csv file. However, it does have a sensor.yaml file. Like all sensors, this YAML file provides the transformation matrix T_BS, which specifies the pose of the light with respect to the sensor system. It also contains two additional fields: size and lumens. Here, size is a 2D vector indicating the horizontal and vertical size, in meters, of the LED patch. The lumens field specifies the approximate light intensity, in lumens, which is one of three values: 9000, 4500, or 1300.

4) Position, Pose: As described in Section III-B, we employ two ground-truthing methods. These are represented by the sensors leica0 and loop0. As with the EuRoC dataset, the data.csv file for these sensors contains a timestamp followed by the measured position (leica0) or pose (loop0).
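To close this section, here is a hedged sketch of reading the published light model (assuming PyYAML and NumPy; the exact on-disk encoding of T_BS is not specified here, so the sketch accepts either a flat 16-value list or an EuRoC-style nested data field, and the choice of the local z-axis as the emission direction is our assumption, not the dataset's).

```python
import numpy as np
import yaml  # PyYAML, assumed available


def load_light_model(yaml_path):
    """Read light0/sensor.yaml: extrinsics, LED patch size, and lumens."""
    with open(yaml_path) as f:
        cfg = yaml.safe_load(f)

    # T_BS: 4x4 homogeneous transform of the light w.r.t. the rig frame.
    raw = cfg["T_BS"]
    values = raw["data"] if isinstance(raw, dict) else raw  # tolerate nesting
    T_BS = np.asarray(values, dtype=np.float64).reshape(4, 4)

    position = T_BS[:3, 3]                                # light center [m]
    direction = T_BS[:3, :3] @ np.array([0.0, 0.0, 1.0])  # assumed emission axis

    width_m, height_m = cfg["size"]  # LED patch size [m]
    lumens = cfg["lumens"]           # approximately 9000, 4500, or 1300
    return position, direction, (width_m, height_m), lumens
```

Such a model could, for instance, be combined with the photometric calibration above to predict how scene irradiance varies with the rig's motion, which is the kind of extension the published light data is intended to enable.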
