A Benchmark for Visual Inertial Odometry Systems Employing Onboard Illumination

Mike Kasper¹, Steve McGuire¹, Christoffer Heckman¹

Abstract: We present a dataset for evaluating the performance of visual inertial odometry (VIO) systems employing an onboard light source. The dataset consists of 39 sequences recorded in mines, tunnels, and other dark environments, totaling more than 160 minutes of stereo camera video and IMU data. In each sequence, the scene is illuminated by an onboard light of approximately 1300, 4500, or 9000 lumens. We accommodate both direct and indirect visual odometry methods by providing the geometric and photometric camera calibrations, i.e. response, attenuation, and exposure times. In contrast with existing datasets, we also calibrate the light source itself and publish data for inferring more complex light models. Ground truth position data are available for a subset of sequences, as captured by a Leica total station. All remaining sequences start and end at the same position, permitting the use of total accumulated drift as a metric for evaluation. Using our proposed benchmark, we analyze the performance of several state-of-the-art VO and VIO frameworks. The full dataset, including sensor data, calibration sequences, and evaluation scripts, is publicly available online at http://arpg.colorado.edu/research/oivio

I. INTRODUCTION

Given their versatility and relatively low cost, passive cameras are arguably the most common sensor employed in robotics applications. However, to be used reliably, sufficient scene illumination is required. While these conditions are met in many scenarios, there is growing interest for robots to work in darker environments, such as underground or underwater. This is most evident in the recently proposed DARPA Subterranean Challenge [3], but also highlighted by the emergence of workshops focused on the subject [20], [25] and companies fielding robots in this domain.

In the absence of visual information, the robotics community has primarily relied on depth sensors,
e.g. LIDAR and active depth cameras. While these sensors are robust to low-texture surfaces and poor illumination, they are often of limited range, lower resolution, and higher cost. More importantly, their utility is reduced in geometrically ambiguous scenes, such as long hallways or tunnels. Traditionally, visual cues have compensated for these limitations [9].

In order to employ cameras in dark environments, robots can be equipped with an onboard light source. However, this would violate the brightness constancy assumption held by most visual perception systems, as scene illumination will change as a result of the robot's movement. This is particularly problematic for direct methods, which work on image intensities [8], [15], [4]. In contrast, indirect methods are robust to such illumination changes, but are far more susceptible to the motion blur and sensor noise we can expect when working in dark environments due to inadequate camera exposure [23], [13], [4].

¹Autonomous Robotics and Perception Group (ARPG), Department of Computer Science, University of Colorado, Boulder, Colorado, USA. Corresponding author: christoffer.heckman@colorado.edu

Fig. 1. Example frames from each environment, i.e. tunnels, mines, woods, and office. Each row shows a sequence of four frames separated by a few seconds. These images highlight the primary challenges posed by our dataset: dynamic illumination, motion blur, and poor camera exposure.

To assess the performance of existing methods and to aid the development of novel solutions to the aforementioned challenges, we present a benchmarking dataset for visual inertial odometry (VIO) systems working in environments illuminated by a single onboard light source. In total, the dataset contains 39 sequences with over two hours of stereo camera video and IMU data. The sequences were recorded in a number of challenging environments, including tunnels, mines, low-light indoor scenes, and nighttime outdoor scenes. Several example frames can be seen in Figure 1. For each recorded sequence, we illuminate the
scene with a white LED light of approximately 1300, 4500, or 9000 lumens. This allows us to assess how much performance depends on lighting strength. In contrast with other datasets, our benchmark not only provides the geometric and photometric camera calibrations, but also a calibration of the light itself. By publishing a light model, we intend to promote the development of novel VIO algorithms that relax the brightness constancy assumption and model the dynamic illumination of the scene. As a point of comparison, however, we analyze several existing frameworks in Section VII.

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019.

Fig. 2. Frames captured while executing approximately the same turn with each lighting configuration. From left to right, the depicted scene is illuminated with 1300, 4500, and 9000 lumens. Note how the camera's auto-exposure compensates for the different levels of illumination, but consequently produces varying amounts of motion blur. To aid visual inspection, we have provided enlarged images for the regions outlined in red and green.

II. RELATED WORK

We draw guidance from the popular EuRoC dataset [2] in terms of sensors and ground truthing employed. It contains 11 sequences captured via a hardware-synchronized stereo camera and IMU mounted on top of a micro aerial vehicle. Ground truth 6-DoF poses and 3-DoF positions were captured via a Vicon motion capture system and a Leica MS50 laser tracker, respectively. The dataset does exhibit some challenging lighting scenarios, where large regions of the environment are poorly illuminated. However, lighting remained constant while capturing each sequence. A limitation of the EuRoC dataset is that it is only well suited for indirect methods: it not only lacks the camera response and attenuation models required by direct methods, but also the exposure times between stereo cameras are not synchronized. In contrast, the TUM monoVO dataset
[5] targets direct odometry methods, providing full photometric calibration and exposure times as reported by the camera sensor. However, this is a purely monocular dataset, lacking the second camera and IMU sensor found in [2]. The curators of the TUM monoVO dataset also opt for a different ground truthing strategy: all sequences start and end in the same position, permitting the evaluation of VO frameworks in terms of total accumulated drift over the entire sequence. The published sequences contain a number of challenging scenes, but again exhibit relatively static lighting.

More in line with our focus on onboard illumination is the Oxford RobotCar dataset [12]. It contains over one year of driving data recorded by a car outfitted with six cameras, a LIDAR, and an IMU. It facilitates the development and evaluation of a number of perception problems related to autonomous vehicles. The sequences exhibit a variety of weather conditions, captured at day and night. While nighttime sequences are illuminated by the car's headlights, a significant portion of illumination is contributed by streetlights. Additionally, no model of the car's headlights is provided.

Taking these concepts further, the ETH-ICL dataset [18] focuses directly on the problem of visual SLAM in dynamically lit environments. The dataset contains both real and synthetic sequences, largely based on the TUM RGB-D benchmark [24] and the ICL-NUIM dataset [7]. Each sequence exhibits some form of dynamic lighting, either by modulating global and local light sources, or by the movement of a flashlight co-located with the camera. While this dataset does contain sequences illuminated by an onboard light source, only two sequences employ such a lighting solution. Additionally, no model of the light source is provided, which we believe could be leveraged to develop novel methods for visual odometry.

In a different vein, the DiLiGenT dataset [22] is not intended for VO research, but rather that of photometric stereo. Photometric stereo, in contrast with
binocular stereo, is a technique that typically employs a single camera and one or more lights to infer 3D geometry [26]. The DiLiGenT dataset contains a series of images taken of 10 objects, captured by a stationary camera and different lighting configurations. In addition to the images themselves, the authors provide a calibration of the employed light array, which consists of a 2D grid of 96 uniformly spaced white LED lights. We wish to take a similar approach in our visual odometry dataset.

As can be observed from this brief review, just as there is a large diversity of VO solutions, the same can be said for VO datasets. The dataset we present in the following sections is particularly focused on underground environments, with sensing and calibration considerations to match, including onboard lighting and the usage of stereo cameras and IMUs.

III. DATASET

All sequences in our dataset can be characterized as a visual inertial rig navigating dark environments illuminated by an onboard light source. We captured sequences in four types of environments: 1) mines, 2) tunnels, 3) outdoors at night, and 4) indoors where all other lights are turned off. While some sequences may exhibit small amounts of external lighting, the predominant illuminant in all scenes is the onboard light source. To permit exploration of lighting solutions, we roughly replicate each trajectory with three different lighting intensities. A few example frames are shown in Figure 2. During data capture, the sensor rig was either handheld or mounted on a ground vehicle (Clearpath Husky UGV). In the remainder of this section, we provide details about the sensor rig and ground truthing strategies.

Fig. 3. Our employed sensors include an Intel RealSense D435i and a LORD MicroStrain 3DM-GX5-15 (not visible); labeled components include the Leica prism, light, cameras, IMU, computer, and batteries. The onboard light source is a 9000-lumen 100W white LED light. To modulate the light intensity, we use a DC-DC boost regulator. Long-term use of this light requires a large passive heat sink and
fan. We equip a tracking prism when ground truthing position data with the Leica.

A. Sensor Setup

We capture inertial data with a LORD MicroStrain 3DM-GX5-15 at 100 Hz, and a stereo pair of 1280x720 grayscale images with an Intel RealSense D435i at 30 Hz. We opted for the RealSense as it is a widely available consumer device featuring a hardware-synchronized, fixed-lens, global-shutter stereo camera. However, as it is primarily intended to be used as an active IR depth sensor, these cameras do not filter out infrared light. This does not negatively impact the acquired visual information, but does require that we disable the IR emitter during operation. Consequently, we do not publish any depth maps with our dataset.

Each sequence in our benchmark is illuminated by an onboard (maximum 9000 lumens) 100W white LED light. Long-term use of this light requires a large passive heat sink and fan. Clearly, such a lighting system is not practical for all robotics applications, e.g. micro air vehicles. We therefore attempt to capture the same trajectory three times, modulating the light's intensity to approximately 100, 50, and 15 percent of its full capacity. This allows us to evaluate the performance of visual odometry systems working with different lighting solutions.

The light and sensors are mounted inside custom housing, equipped with a power supply and onboard computer for logging data. As our sensor rig is self-contained, all captured sequences exhibit consistent extrinsic calibrations regardless of the mobile platform employed, i.e. handheld or ground vehicle. Depending on the ground truthing system employed, the rig may also be outfitted with a laser tracking prism. An image of our sensor rig can be seen in Figure 3.

B. Ground Truth

In this work, we employ two methods for ground truthing, providing a balance between sampling resolution and trajectory complexity. For a subset of our dataset, we employ a Leica TCRP1203 R300 total station to acquire the ground truth position of our sensor rig at 10 Hz.
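Against such 3D position ground truth, a natural metric is the absolute trajectory error of positions after a rigid alignment. The sketch below is a hypothetical helper, not part of the dataset's published evaluation scripts, and assumes the estimated and total-station trajectories have already been associated by timestamp:

```python
import numpy as np

def position_ate(est, gt):
    # est, gt: (N, 3) arrays of time-associated positions, e.g. an
    # estimated VIO trajectory vs. total-station fixes at 10 Hz.
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    E, G = est - mu_e, gt - mu_g
    # Kabsch: rotation R minimizing sum ||R e_i - g_i||^2 over centered points.
    U, _, Vt = np.linalg.svd(E.T @ G)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T  # guard against reflections
    t = mu_g - R @ mu_e
    aligned = est @ R.T + t
    # RMSE of the per-sample position error after alignment.
    return float(np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1))))
```

Because the dataset provides stereo imagery, no scale factor is estimated during alignment; a monocular pipeline would additionally require a similarity (Sim(3)) alignment.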
Unfortunately, maintaining the line of sight required by any tracking solution puts undesirable constraints on the nature of environments and trajectories we can employ. This is especially true in the narrow passageways common in tunnels and mines. We therefore provide an additional set of sequences that start and end in the same position, permitting the use of total accumulated drift as another evaluation metric. As in [5], we start and end each sequence with a 10-20 second sequence of loopy camera motion observing an easy-to-track scene. We then process just these frames using the ORB-SLAM2 framework [14] to obtain accurate poses for the start and end segments. As our dataset contains stereo images, we need not resolve the scale ambiguity problem that arises from monocular VO, as required in [5].

IV. DATASET FORMAT

A. Data Format

We organize our dataset using the ASL dataset format, similar to the EuRoC dataset [2]. However, we must also accommodate the additional photometric calibrations and light source model. As in [2], each captured sequence comprises a collection of sensor data. Each sensor comes with a sensor.yaml file specifying calibration parameters, and a data.csv file that either contains the sensor data itself or references files in an optional data folder:

    sensor.yaml
    data.csv
    data/

Again, ground truthing systems are treated as separate sensors. In a slight abuse of terminology, we treat the light source as a sensor without a data.csv file. For example:

    husky0/
      imu0/
        sensor.yaml
        data.csv
      cam0/
        sensor.yaml
        data.csv
        data/
          154723105215000000.png
          154723105220000000.png
      leica0/
        sensor.yaml
        data.csv
      light0/
        sensor.yaml

1) YAML Files: As with the EuRoC MAV dataset, each sensor provides a sensor.yaml file, which describes all relevant properties unique to the sensor. Additionally, all YAML files share two common fields: sensor_type and T_BS. The sensor_type field specifies one of the following sensor types: imu, camera, position, pose, or light, and T_BS is a 4x4 homogeneous transformation matrix describing the
sensor's extrinsic relationship with the sensor rig's frame. All properties listed in the YAML file are assumed to be static throughout the entire sequence.

2) CSV Data Files: The data.csv file either contains all the data captured by the sensor throughout the sequence, or references files in the optional data folder. In either case, each line first begins with a timestamp (integer nanoseconds, POSIX) denoting the time at which the corresponding data was recorded. The subsequent fields for each sensor's data.csv file are presented in the following sections.

B. Sensors

1) Cameras: As with the EuRoC dataset [2], each line of a camera's data.csv file contains the timestamp and file name of the captured image. We augment this by including the exposure times and gains reported by the camera during capture. The sensor.yaml file specifies the camera's intrinsics with the fields camera_model, intrinsic_coefficients, distortion_model, distortion_coefficients, and resolution.

However, as our dataset accommodates direct VO methods, we also provide a photometric calibration via the response.csv and vignette.png files in the sensor folder. The response.csv file contains 255 values, providing all camera inverse response values over the domain of 8-bit color depth. These values can be used as a simple lookup to convert from captured image intensities to irradiance values. The vignette.png file is a monochrome 16-bit image with the same resolution as the camera. Each pixel in vignette.png specifies the attenuation factor for its respective coordinates in camera images. Section V-B describes our photometric calibration process.

2) IMU: As with the EuRoC dataset, each line of an IMU's data.csv file contains the following fields: timestamp, ω_S [rad s⁻¹], a_S [m s⁻²], where ω_S ∈ R³ is the angular rate and a_S ∈ R³ is the linear acceleration, both expressed in the sensor's body frame. For all sequences, the MicroStrain's data is published under the imu0 sensor. The YAML file contains the noise densities and bias
diffusions, which stochastically describe the sensor's random walk. These are obtained in the same way as in the EuRoC MAV dataset; we refer the reader to [2] for more details.

We also publish measurements from the RealSense D435i's onboard IMU (Bosch Sensortec BMI055) as imu1. However, the RealSense does not capture accelerometer and gyroscope data at the same rate. Rather than complicating IMU data access, we opt to keep the same single-line file format: we configure the RealSense to capture accelerometer data at 250 Hz and gyroscope data at 400 Hz, and only report the gyroscope measurement closest to each accelerometer measurement. However, we still provide files for the individual sensor streams in the imu1 data folder.

3) Light: As previously mentioned, the onboard light is treated like a sensor without a data.csv file. However, it does have a sensor.yaml file. Like all sensors, this YAML file provides the transformation matrix T_BS, which specifies the pose of the light with respect to the sensor system. It also contains two additional fields: size and lumens. Here, size is a 2D vector indicating the horizontal and vertical size, in meters, of the LED patch. The lumens field specifies the approximate light intensity in lumens, which is one of three values: 9000, 4500, or 1300.

4) Position/Pose: As described in Section III-B, we employ two ground truthing methods. These are represented by the sensors leica0 and loop0. As with the EuRoC dataset, the data.csv file for these sensors contains the following fields: timestamp, q_RS, R_p_RS. Here, R_p_RS denotes the position of the sensor with respect to the ground truthing reference frame R, and q_RS is the four-element unit quaternion representing the orientation of the sensor. Given that the Leica laser tracker only captures 3D position, the q_RS field is omitted from its data.csv file. In contrast, the loop closure method described in [5] can provide full 6-DoF poses. However, it only does so for the very beginning and ending of the trajectory. Consequently, its corresponding data.csv file
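The per-pixel photometric correction enabled by the response.csv and vignette.png files described above can be sketched as follows. This is an illustrative helper, not taken from the dataset's own scripts; it assumes an inverse-response lookup table indexed by 8-bit intensity (padded to 256 entries if needed) and a 16-bit vignette image in which the value 65535 denotes no attenuation:

```python
import numpy as np

def photometric_correction(image_u8, inv_response, vignette_u16):
    # image_u8: (H, W) uint8 raw image.
    # inv_response: lookup table mapping 8-bit intensity -> irradiance,
    #   as loaded from response.csv (assumed indexed by pixel value).
    # vignette_u16: (H, W) uint16 attenuation map from vignette.png,
    #   normalized here so that 65535 -> 1.0 (no attenuation).
    inv_resp = np.asarray(inv_response, dtype=np.float64)
    attenuation = vignette_u16.astype(np.float64) / 65535.0
    # Undo the response function, then divide out the vignette falloff;
    # the epsilon guards against division by zero at dark border pixels.
    irradiance = inv_resp[image_u8] / np.maximum(attenuation, 1e-6)
    return irradiance
```

Direct methods would apply this correction, together with the per-frame exposure times in data.csv, before comparing intensities across frames.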