IROS 2019 International Conference Proceedings 2561


TIMTAM: Tunnel-Image Texturally-Accorded Mosaic for Location Refinement of Underground Vehicles with a Single Camera

Fan Zeng1, Adam Jacobson1, David Smith2, Nigel Boswell2, Thierry Peynot1,3, Michael Milford1

Abstract: Many mine-site processes such as vehicle operation require localisation systems that are reliable, robust and work in a range of environmental conditions. In underground operations, GPS is not available: solutions instead rely on static infrastructure or expensive, laser-based solutions with limited operational capability. In this paper we present a new vision-based technique, Tunnel-IMage Texturally-Accorded Mosaic (TIMTAM), for sub-metre, infrastructure-free localisation in underground mining environments using a single camera. Our approach stitches upward-facing camera images to form planar mosaic maps, using locations generated by the coarse mapping engine based on a small number of manually anchored locations. Localisation is achieved by refining coarse location estimations with a best-fit pixel location for the query image within a search neighbourhood in the mosaic map. Our direct pixel-based method is more robust to the challenging illumination and surface-texture environments encountered in underground mine operations than feature-based techniques. Localisation refinement is only triggered when a confidence threshold for the estimate is exceeded. The system is evaluated in a real-world mine tunnel, with results showing that the confidence threshold approach is predictive of the quality of the location estimate refinement, and achieves a reduction in mean localisation metric error of up to 66% from simulated coarse results.

I. INTRODUCTION

Developing technology for underground operation of both manned and unmanned underground mining vehicles is a challenging objective in field robotics.
For mining vehicles to be self-driving, they first need to know their exact locations, while tracking the location of manned vehicles can result in significant efficiency dividends for mine operations planning. Localisation in underground environments is particularly difficult for multiple reasons, including:

1) Underground environments are GPS denied, removing a central pillar of many above-ground autonomy approaches.
2) Infrastructure-based localisation systems, such as Radio Frequency techniques [1, 2], require significant investment in setting up and maintaining the beacons.
3) In geometrically self-similar environments such as long stretches of underground mine tunnels, the scan profiles generated by laser scanners can lack place-uniqueness, requiring a different sensory input, such as camera-based computer vision, to complement the performance of laser-based localisation systems [3, 4].

This research was supported by an Advance Queensland Innovation Partnerships grant from the Queensland Government, Mining3, Caterpillar and the Queensland University of Technology (QUT). MM also received support from an ARC Future Fellowship FT140101229.
1 FZ, AJ, TP and MM are with QUT. 2 DS and NB are with Caterpillar, Inc. 3 TP is also with Mining3.

Fig. 1: (a) Picture of a mining truck with sensors attached. (b) Example of images taken by upward-facing camera (top) and forward-facing camera (bottom). (c) Mosaic of tunnel ceiling below which the top image in (b) was captured. The magenta dot indicates coarse location estimate. TIMTAM examines refinement candidates highlighted in yellow. (d) Patch-normalised query image shown in the inset, side-by-side with its best-fit position delimited in the mosaic map.

There are also unique challenges for vision-based localisation systems in mine tunnels, including poor general illumination and fluctuating artificial light sources both on and off the vehicle (see sample camera image in Fig. 1b).
Furthermore, typical visual features in underground tunnel images are highly aliased (self-similar or repetitive) [5], causing feature-based place recognition algorithms to struggle with false-positive matches generated by visual aliasing. Previously we have proposed methods for saliency-based image filtering [6], and have demonstrated that a coarse vision-only localisation system dubbed Semi-Supervised SLAM [7], using only a forward-facing camera, achieves an average localisation error within 9 metres [7] in underground environments; a level of accuracy useful for inventory tracking but insufficient for automating mining vehicle navigation.

IEEE Robotics and Automation Letters (RAL) paper presented at the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019. Copyright 2019 IEEE.

In this work, we present a direct pixel-based localisation approach to refining an initial coarse estimate of location. Our approach first builds a reference planar mosaic map; the precise localisation engine then refines the location estimate through a direct pixel-based search of candidate poses. This refinement stage also produces a confidence score, enabling the precise localisation engine to perform refinement only when it is likely that the refinement will improve the localisation estimate. TIMTAM, a low-cost vision-only solution, reduces the coarse localisation error by as much as 66% relative to simulated coarse results. Compared with non-mosaic refinement solutions developed previously [5], TIMTAM reduces redundant computations, as well as the size of the map database. It also has the potential to be integrated with other sensors to realise a robust and precise localisation system.

This paper makes the following contributions:

1) A camera-only solution for underground vehicle position refinement is proposed and evaluated using novel real-world underground mine datasets.
2) A method for generating ceiling mosaics for localisation underground, which maintains a consensus record of the overlapped portions of constituent images, minimising the effect of appearance change and reducing map storage requirements.
3) A method for evaluating the match of query images to the ceiling mosaic for localisation, reducing computational requirements.

The rest of the paper proceeds as follows: Section II gives a brief review of the applicability of general SLAM methods, traditional vision-based methods and existing mosaic creation and matching algorithms to the previously raised localisation problem. Section III describes the proposed approach with pseudo-code algorithms. Section IV details the dataset and parameters used in the experiments and the steps performed to obtain the results presented in Section V. An analysis of the results and conclusions are given in Section VI.

II. LITERATURE REVIEW

In this section, we describe the state of the art of SLAM and localisation algorithms, highlighting systems designed to operate in hazardous conditions, such as underground mines, and vision-based techniques that utilise mosaics for localisation.

A. General SLAM Methods

There is a large body of work demonstrating SLAM implementations using onboard robotic sensors such as lasers and cameras. Current SLAM implementations have shown the ability to operate over tremendous distances and within a variety of situations [8, 9, 10, 11, 3, 12, 13, 14]. Systems such as ORB-SLAM2 [15] have demonstrated mapping performance with centimetre-level accuracy in some cases. However, we have not been able to successfully apply these systems using images of underground mine tunnel ceilings. Unsatisfactory mapping and localisation accuracies using forward-facing camera images have also been reported in our previous work [6, 7] for FAB-MAP and ORB-SLAM.

B. Vision-Based Methods

Traditional feature-based place recognition algorithms [16, 17] such as FAB-MAP [18] do not work well in very visually-aliased environments [5]. SeqSLAM [19] addressed appearance changes by matching sequences using direct image comparisons, and NetVLAD [20] constructed a state-of-the-art feature extraction architecture based on a convolutional neural network; however, neither approach directly outputs metric locations. In our previous work [5], Intra-Image SeqSLAM (I2-S2) was proposed as a precision localisation engine. As mentioned in [21], typically a large portion of optical flow vector calculations are redundant when processing each pair of query and reference images. A Fully Convolutional Network (FCN) based sample point selector was used in [21] to select just enough high-quality sample points to process for homography generation, reducing computation time for each query-reference pair. Still, at least a few query-reference pairs need to be compared because of the limited Field of View (FOV), due to the potentially drastic appearance change introduced by a slight viewing angle change [5]. Matching query images to a series of adjacent reference images results in redundant calculations, and storing more than one reference image around each node contributes to a bloated map database. In large-scale environments where lightweight embedded solutions are favoured, such as our underground localisation problem, a reduced map database is desirable.

C. Mosaic Related Methods

The typical approach to image stitching is to use keypoint detections to estimate the homography between adjacent images, which is then used to warp the images to achieve a high-quality stitching line [22, 23, 24]. These methods have been successful in creating seamless mosaics appealing to human eyes and useful for localisation in particular applications.
However, the first priority in place recognition applications is not eliminating visible seams, but creating a map that can be used for template matching over the long term, despite appearance variations. This requirement introduces difficulties that need to be overcome in underground mine environments. The presence of lighting variation within one image is one challenge; patch normalisation and local contrast enhancement such as Contrast Limited Adaptive Histogram Equalization (CLAHE) are able to mitigate such variations. Signum of Laplacian of Gaussian (SLoG) [25, 26] is a filtering step well suited to the underground mining scenario. However, care must be taken not to amplify the local noise that interferes with place recognition. A second challenge is that the tunnel ceiling surfaces being captured are very close to the camera, and are not perfectly planar; the same spot on a rock surface can appear both bright and dark from different viewing angles or under changed lighting conditions. Shadows cast by the ceiling wires change as the vehicle travels along the tunnel. These issues have been addressed in existing mosaic creation solutions such as [25, 26] by filtering and SLoG normalisation of image intensities. Texture analysis has been proposed to match features in a sequence of underwater images [27]. Difference of Gaussian (DoG) based feature detection and Robust Local Binary Pattern (RLBP) descriptors were used for underwater correspondence problems in [28]. In this work, we explore the possibility of not using explicit feature extraction.

Fig. 2: (a) Schematic diagram of the complete localisation system, containing coarse and precise stages. (b) Schematic diagram of the precise localisation system, TIMTAM.

III. APPROACH

The schematic diagram of the complete localisation system is shown in Fig. 2a.
Under the two-stage methodology, the separation of global and local positioning significantly reduces the computational load on the global search. The schematic diagram of the proposed TIMTAM system is shown in Fig. 2b.

A. Coarse Localisation Engine

Our previous work [7] has proposed a coarse localisation system that only uses a forward-facing camera $\text{Cam}_F$. Here is a brief summary of its working principle: In the mapping phase, the coarse map $\text{Map}_C$ of the mine is created, which consists of evenly-separated nodes. Each node has a location and a few reference images captured by $\text{Cam}_F$. In the localisation phase, the query image from the forward-facing camera is compared with reference images associated with hypothesised nodes. The index of the node with the highest matching score is reported as the localisation result, and the belief of the vehicle's location is updated according to the image matching score.

It follows that the resolution of the coarse localisation result is limited by the separation of nodes. Based on the matching of $\text{Cam}_F$ images, further reducing node separations to tens of centimetres does not improve the resolution, as the most valuable pixels that give clues for localisation in $\text{Cam}_F$ images are off the centre of $\text{Cam}_F$ images (at least when driving), and each may correspond to a different depth, with distortions that are difficult to precisely rectify. A 2D laser scanner has been added to the coarse localisation system [7] in the mapping phase. It is practically desirable not to require such laser scanners on the fleet during localisation. On the other hand, upward-facing cameras $\text{Cam}_L$, $\text{Cam}_M$ and $\text{Cam}_R$ capture a continuous quasi-planar surface (the mine tunnel ceiling) that is rich in texture information at quasi-constant distances, and are potentially better choices for the precise localisation engine to work with. As a camera-only solution, the mechanism of TIMTAM for ceiling-image-based mosaic creation and template matching is discussed next.

B. TIMTAM: Tunnel IMage Texturally Accorded Mosaic

TIMTAM is designed for a camera that moves on a quasi-planar surface with 2 translation and 1 rotation Degrees Of Freedom (DOF), $(x, y, \theta)$. Given a few reference images at known locations and a query image at an unknown location, it tries to find the queried location and optionally report the confidence of prediction.

Similarly to the coarse localisation system, TIMTAM also works in two phases. In the map building phase, a mosaic image of the mine tunnel ceiling is created using reference images with known locations. Such locations can be generated by surveying during the mine construction process or interpolated based on node locations in the coarse map. In the localisation phase, the query images are compared against the mosaic image at candidate refinement locations within a search range $R \in \mathbb{N}^{l_a \times l_t}$, where $l_a$ and $l_t$ are the axial (along the tunnel) and transverse (perpendicular to the tunnel side walls) search range, respectively. The quality of each comparison result is evaluated as a confidence score, based on which a decision is made regarding whether a location refinement should be executed or not.

C. Image preprocessing

All raw images from the cameras (Fig. 3c, contrast-enhanced to show finer details) will first be undistorted (Fig. 3a) to reduce the spherical distortions induced by the wide FOV lens, and resized to $I \in \mathbb{N}^{l_{I,\mathrm{row}} \times l_{I,\mathrm{col}}}$. A patch normalisation process is then performed to enhance the contrast and reduce intra-image lighting variations (Fig. 3b) evident in other images in Fig. 3 (darker in centre image rows in this case). SLoG [26] and CLAHE have been used for this purpose. In this work, we implemented an approach similar to SLoG, which is to scale the local contrast to a specified standard deviation, as detailed in Algorithm 1. For 8-bit gray-scale images as used in this paper, depth $d = 2^8 - 1 = 255$, and $I_0 = 2^7 - 1 = 127$. $\Delta I$ is the scaled difference between each individual pixel and its local mean $\bar{I}$, and $\sigma$ is the input scaling constant. Due to the intra-image lighting variation, it might be necessary to use a different $\sigma$ for patch-normalising the reference images used in mosaic map creation ($\sigma_r$) and the query image ($\sigma_q$). The contrast-enhanced original image reduced to the same resolution as Fig. 3b is shown in Fig. 3d for comparison.

Fig. 3: Image preprocessing process: (a) Undistorted image from raw camera output, (b) image in (a) after patch normalisation, compared with (c) contrast-enhanced raw camera image, and (d) image in (c) reduced to the resolution of (b).

Algorithm 1: Patch normalisation
  Input: $I$ with $I_{i,j} \in [0, d]$, $\sigma$  // $I$ is input image
  Output: $I_{pn}$  // patch-normalised image
  $\mathbf{I}_0 \leftarrow I_0 J$  // $J$ is all-one matrix
  $K_{l_{patch}} \leftarrow \frac{1}{(l_{patch})^2} J_{l_{patch} \times l_{patch}}$  // $K$ is convolution kernel
  $\bar{I} \leftarrow I * K_{l_{patch}}$  // $\bar{I}$ is local mean image
  $\Delta I \leftarrow \sigma (I - \bar{I})$
  $I_{pn} \leftarrow \mathbf{I}_0 + \Delta I$

D. Mosaic Map Creation

One by one, according to their known poses $(x, y, \theta)$, pre-processed reference images $I_r$ are projected and overlaid onto $M$, a scaled-down mosaic map of the mine tunnel ceiling. Two additional images, both the same size as the mosaic $M$, are updated during the process. The first is a validity image $V_M$, in which a pixel of "True" means the corresponding pixel in $M$ has been defined (previously visited by an $I_r$ projection), and vice versa. The second is an "accordance weight" image $W$ that records one weight value for each corresponding pixel in $M$ according to the consensus between all processed reference images. As detailed in Algorithm 2, for each projected pixel, if it lies outside the defined mosaic, it is taken as is and added onto the mosaic, with $W$ incremented by 1. If it overlaps with the existing mosaic, then a consensus check in terms of whether the pixel is "dark" ($\le I_0 - \epsilon$) is performed, in which $\epsilon$ is a tolerance value.
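If the projected pixel agrees with the existing value within tolerance $\epsilon$, its value is weighted in; otherwise that pixel in the mosaic is set back to $I_0$.

Algorithm 2: Create mosaic map
  Input: $\{(I_r, (x, y, \theta))\}$, $\epsilon$  // set of reference images with known locations, and $\epsilon$ the tolerance
  Output: $M \in \mathbb{N}^{l_{M,\mathrm{row}} \times l_{M,\mathrm{col}}}$, $M_{i,j} \in [0, 255]$
  $V_M \in \mathbb{N}^{l_{M,\mathrm{row}} \times l_{M,\mathrm{col}}} \leftarrow$ False
  $W \in \mathbb{N}^{l_{M,\mathrm{row}} \times l_{M,\mathrm{col}}} \leftarrow 0$
  for $(I_r, (x, y, \theta)) \in \{(I_r, (x, y, \theta))\}$ do
    $I_{r,proj} \leftarrow \mathrm{project}(I_r, M, (x, y, \theta))$
    for Pixel $\in I_{r,proj}$ do
      if $V_M[\mathrm{Pixel}]$ = False then
        $M[\mathrm{Pixel}] \leftarrow I_{r,proj}[\mathrm{Pixel}]$
        $V_M[\mathrm{Pixel}] \leftarrow$ True
        $W[\mathrm{Pixel}] \leftarrow W[\mathrm{Pixel}] + 1$
      else if ($I_{r,proj}[\mathrm{Pixel}] \le I_0 - \epsilon$ and $M[\mathrm{Pixel}] \le I_0 - \epsilon$) or (I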
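The patch normalisation of Algorithm 1 and the per-pixel consensus update described for mosaic creation can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `patch_normalise` and `update_mosaic`, the parameter names `sigma` and `eps`, the edge padding, the clipping to the valid intensity range, the symmetric "both bright" test, and the running-mean form of "weighted in" are all hypothetical choices made here, since the text does not specify them. The projection step is assumed to have been done already, so `proj` is a reference image already warped into mosaic coordinates and `proj_valid` marks the pixels it covers.

```python
import numpy as np

I0 = 127.0  # mid-grey offset for 8-bit images (2**7 - 1), as stated in the text


def patch_normalise(img, l_patch=3, sigma=1.0, depth=255.0):
    """Algorithm 1 sketch: I_pn = I0 + sigma * (I - local mean of I)."""
    if l_patch % 2 == 0:
        raise ValueError("this sketch assumes an odd patch size")
    img = np.asarray(img, dtype=np.float64)
    pad = l_patch // 2
    # Assumption: replicate edge pixels; the text does not say how borders are handled.
    padded = np.pad(img, pad, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (l_patch, l_patch))
    # Mean over each window, equivalent to convolving with K = J / l_patch**2.
    local_mean = windows.mean(axis=(-2, -1))
    # Assumption: clip the result to the valid intensity range [0, depth].
    return np.clip(I0 + sigma * (img - local_mean), 0.0, depth)


def update_mosaic(M, V, W, proj, proj_valid, eps=10.0):
    """Merge one projected reference image into mosaic M (arrays modified in place)."""
    overlap = proj_valid & V   # pixels the mosaic already defines
    fresh = proj_valid & ~V    # pixels seen for the first time

    # Consensus test on overlapping pixels.
    both_dark = (proj <= I0 - eps) & (M <= I0 - eps)    # "dark" test from the text
    both_bright = (proj >= I0 + eps) & (M >= I0 + eps)  # assumed symmetric case
    agree = overlap & (both_dark | both_bright)
    disagree = overlap & ~agree

    # Assumption: "weighted in" is taken to be a running mean weighted by W.
    M[agree] = (W[agree] * M[agree] + proj[agree]) / (W[agree] + 1)
    W[agree] += 1
    # From the text: disagreeing pixels are set back to I0.
    # Assumption: W is left unchanged for these pixels.
    M[disagree] = I0

    # First visit: the pixel is taken as is and its weight incremented.
    M[fresh] = proj[fresh]
    V[fresh] = True
    W[fresh] += 1
    return M, V, W
```

A typical use would start from a zero-filled floating-point `M`, an all-False boolean `V` and an integer zero `W`, then call `update_mosaic` once per patch-normalised, projected reference image. Resetting disagreeing pixels to the neutral value `I0` means that regions whose appearance flips between viewpoints contribute little to later template matching.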
