Papers 2G Highlights Edition ICRA 2018 files 2226

Uploader: 我*** | IP location: Beijing | Uploaded: 2019-12-21 | Format: PDF | Pages: 8 | Size: 3.68 MB


Document summary

GOSELO: Goal-Directed Obstacle and Self-Location Map for Robot Navigation using Reactive Neural Networks

Asako Kanezaki¹, Jirou Nitta¹, and Yoko Sasaki¹

Abstract: Robot navigation using deep neural networks has been drawing a great deal of attention. Although reactive neural networks easily learn expert behaviors and are computationally efficient, they suffer from generalization of policies learned in specific environments. As such, reinforcement learning and value iteration approaches for learning generalized policies have been proposed. However, these approaches are more costly. In the present paper, we tackle the problem of learning reactive neural networks that are applicable to general environments. The key concept is to crop, rotate, and rescale an obstacle map according to the goal location and the agent's current location so that the map representation will be better correlated with self-movement in the general navigation task, rather than the layout of the environment. Furthermore, in addition to the obstacle map, we input a map of visited locations that contains the movement history of the agent, in order to avoid failures that the agent travels back and forth repeatedly over the same location. Experimental results reveal that the proposed network outperforms the state-of-the-art value iteration network in the grid-world navigation task. We also demonstrate that the proposed model can be well generalized to unseen obstacles and unknown terrain. Finally, we demonstrate that the proposed system enables a mobile robot to successfully navigate in a real dynamic environment.

I. INTRODUCTION

Path planning is crucial to realizing autonomous cars and automobile robots (Fig. 1). In particular, path planning for a path from a given starting location to a goal within a grid map of obstacles has been studied extensively. When using a path planning technique to navigate a mobile agent by observing a map with cameras or depth sensors attached to the agent, the fact that the map can change
dynamically from moment to moment should be taken into consideration. Even when moving through a familiar environment with access to a fixed map, an agent can fail to follow the planned path when the agent is interrupted by a human or an unmapped obstacle. We believe that the most important feature for navigation on a dynamic map is the ability to estimate the next best step instantly when the map is updated, and we believe this is even more important than calculating a perfect optimal path to the goal, which must be adjusted every time an interruption occurs.

The proposed method is based on a convolutional neural network (CNN) to estimate the next best step among neighboring pixels in a grid map (Fig. 2). We refer to such a CNN as a reactive CNN because it reacts to specific patterns on a map in order to determine the movement of the agent. Navigation based on a reactive CNN has three main advantages, as described below.

¹The authors are with the National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan. {kanezaki asako, nitta jiriu, y sasaki}@aist.go.jp

Fig. 1. Hardware configuration of the Peacock mobile robot. Labels: LIDAR (Velodyne HDL-32e); moving base (Pioneer 3DX); robot-embedded PC (Intel NUC5iRYH); dimensions 96 cm and 55 cm.

Fig. 2. Overview of our method for tackling the problem of navigating an agent (blue triangles) from a starting location (S) to a goal (G). The black areas indicate obstacles through which the agent cannot pass. We derive a CNN that takes as input an image for which the channels correspond to sub-regions of the map on different scales surrounding the agent and that probabilistically determines the optimal direction in which to proceed. Labels: environment map; sub-regions of the map (input); CNN; probability of candidate directions (output).

First, a reactive CNN estimates the next best step in a constant time in any situation. In contrast, the computational time of most existing path planning methods, such as the A* search [6] and rapidly exploring random tree
(RRT) [13], [12], depends on the scale and complexity of the map. Furthermore, such classical path planning methods will fail when there is no path to the goal. A CNN-based method can suggest a plausible direction in which to proceed at every moment, regardless of the existence of a path, which is important for navigation in cluttered dynamic environments. Second, a reactive CNN can use GPU acceleration due to its high potential for parallelization. This is also a major advantage over many classical path planning methods that cannot be wholly parallelized because every point on a path is dependent on other locations. Finally, a reactive CNN can efficiently learn expert behaviors, e.g., human controls, without modeling the rewards and the policy behind the behaviors.

IEEE Robotics and Automation Letters (RAL) paper presented at the 2018 IEEE International Conference on Robotics and Automation (ICRA), May 21-25, 2018, Brisbane, Australia.

The most significant drawback of using a reactive CNN for navigation is that it does not generalize well to unknown domains. Reactive CNNs learn the mapping from state observation, i.e., obstacle map and goal location, to action, e.g., "go straight" and "turn left". Here, the mapping could be fairly complicated because a small difference in state observation, for instance, moving the goal location slightly but over a wall, could totally change the optimal action. Tamar et al. [22] pointed out this issue and presented a novel neural-network-based model called the value iteration network (VIN), which can effectively learn to plan, rather than learn reactive policy. This model, however, suffers from high computational cost because it iterates the computation of state values several dozens or hundreds of times.

Instead of learning a complicated model, we tackle the generalization problem by simplifying the state observation representation. We propose the Goal-directed Obstacle and SElf-LOcation (GOSELO) map, the concept of which is illustrated in Fig. 3. Since we know the goal and the agent's current
location in a grid-world domain, which is the same setting as in [22], we crop, rotate, and rescale (by area averaging) the obstacle map so that the goal and the agent's location are fixed in the converted image. The transformation makes the map representation egocentric, and therefore the patterns of the transformed map become more correlated to the agent's self-movement. This simple operation makes the mapping problem between observation and action much more straightforward. For instance, if there is no obstacle on the central vertical line of the converted image, the agent will most likely continue to proceed to the goal.

In the present paper, the CNN models learned with GOSELO outperform the VIN [22] in the grid-world navigation task in terms of both success rate and computational time when the resolution of the grid world is larger than 32 × 32. We also show that the model can be well generalized to unseen obstacles and unknown terrain. Finally, we navigate a real autonomous robot using GOSELO in a real dynamic environment.

II. RELATED WORK

Autonomous driving based on CNNs with image input has attracted attention in recent years. Chen et al. [3] trained a CNN with supervised human driving in a video game in order to estimate the motion affordance for autonomous driving based on images. Lei et al. [14] applied Q-learning to a CNN with a depth image input that estimated the output distribution of the moving commands, such as "move forward" or "move diagonally right or left". Ammirato et al. [1] trained a CNN to determine the next best move in order to improve object classification. Although these methods are effective at generating/estimating predictable motions from the current view, such as obstacle avoidance, they are not easily applied to long-term motion estimation for guiding the agent to the goal over a distance. More recently, Brahmbhatt and Hays [2] presented a CNN-based navigation system in large cities that predicts the directions to a specific destination, e.g., a gas station, from street-view images. Although learning visual navigation to a limited number of destinations is effective (Brahmbhatt and Hays considered five classes of destinations, e.g., churches and gas stations), as of yet, it is impossible to navigate to an unknown class of destination or navigate in unknown streets.

Fig. 3. Concept of GOSELO. Red stars indicate goal locations, and blue triangles indicate the current location of the agent. We crop, rotate, and rescale the obstacle map according to the goal and the agent's current location. Goal-directed obstacle and self-location is better correlated with actions in the navigation task than the original grid-world map. For instance, if there is no obstacle on the central vertical line of the GOSELO map, then it would be best for the agent to continue to proceed toward the goal. (See the first two images at bottom left.)

In order to learn a general navigation policy, we consider the problem of grid map navigation. The use of reinforcement learning, for example, by deep Q-learning networks (DQNs) [15], which estimate the optimal movement that leads an agent to win a game, i.e., to reach the goal in this scenario, could be considered. However, since the search space grows exponentially when the scale of the grid map increases, reinforcement learning with a random initialized policy is extremely inefficient for exploring a large environment. Furthermore, reinforcement learning typically requires engineering by hand in order to design a reward function. As such, a number of previous studies have used supervised or semi-supervised approaches for learning action policies from optimal movements [4], [8], [19]. There are also strategies that combine supervised learning and self-training that collects additional training samples by the learned policy [16], [23].

The VIN [22] is the closest approach to that considered herein and has provided impressive results for the grid-world navigation task. The VIN has also been used for visual robot navigation tasks in realistic environments [5]. The main advantage of the
VIN over a standard reactive CNN is the ability to learn to plan. The VIN is a deep neural network that approximates the value iteration algorithm, which predicts the outcomes of state transitions based on a Markov decision process, and so is capable of learning policies that generalize well to unseen task instances. The main drawback of the VIN is its computational complexity. Since the VIN involves an iterative process for value prediction that further increases as the resolution of grids increases, it does not scale to large domains. In contrast, instead of learning a complex predictive model, we improve the representation of the task with no additional computation time.

Fig. 4. Detailed diagram of the proposed GOSELO map representation. Points S, G, and P in this figure correspond to the starting location, the goal, and the agent's current location, respectively. The black pixels indicate obstacles, and the red pixels indicate the locations through which the agent has already passed. In practice, the red pixels have integer numbers that indicate how many times the agent has visited the locations. The task is to determine the next step for navigating the agent from P to G. First, the original 2D map is rotated so that G is located directly above P. The squares at different scales, whose centers are located on the center of the line GP, are then cropped and resized by area averaging to be input to the CNN. The first three channels of the input image correspond to the obstacle map, and the latter three channels correspond to the self-location history. Panel labels: original 2D map; obstacles; path of the agent; rotate; crop; resize; obstacle map (channels 1 to 3); self-location map (channels 4 to 6); crop sizes L + 4, 4L, 8L.

TABLE I. Durability, generality, and scalability of various methods. Columns: A* search [6], Reactive CNN, VIN [22], GOSELO. Rows: (a) Durability, (b) Generality, (c) Scalability. [Cell entries not recoverable from the extracted text.]

Note that classical path planning methods, such as A* search [6], have a drawback in that they fail to predict the next step when there is no
path to the goal. This situation frequently occurs in cluttered dynamic environments. Moreover, the computational complexity of A* search depends on the scale of the grid map. Table I summarizes (a) the ability to predict the next step in any situation (durability), (b) the ability to generalize to unknown domains (generality), and (c) scalability to large environments of the methods considered herein.

III. METHOD

A. GOSELO map representation

A detailed diagram of the proposed GOSELO map is shown in Fig. 4, which is a novel representation of an image with multiple channels for the input to a CNN. This representation is a combination of two maps: a map of obstacles observed by the agent's laser scanner and a map of self-location history from the beginning to the current state. The pixels of the former map have binary values, where 1 indicates a pixel is occupied by an obstacle and 0 indicates a free or invisible space. Each pixel of the latter map has an integer value that represents how many times the agent has visited a location. These maps are transformed in the manner described below.

First, we rotate and translate maps so that the goal G is located directly above the current location P and the center point M of line GP is at the center of the entire image. The important point here is that the direction from P to G should be aligned regardless of their positions in the original map; it does not matter if the orientation of the alignment is vertical or horizontal or anything else. Next, we crop a square of size (L + 4) × (L + 4) centered at M, where L denotes the number of the pixels on the line GP. We then also crop squares of size αL × αL and merge them as additional channels. Here, we use two additional channels with α = 4, 8, which were experimentally determined. Finally, we construct an image of six channels, i.e., an image of size W × H × 6, as the input of a CNN, which consists of three channels from an obstacle map and the three channels from a self-location history.

Owing to the multiscale nature of the proposed
representation, the proposed system is able to consider both local and global features of the environment. The first channel of the input image, which represents an obstacle map between G and P, is related to obstacle avoidance. If there is no obstacle on the central vertical line of this channel image, the agent would most likely continue to move directly to the goal. Figure 5 shows the average images of GOSELO maps for the respective optimal directions in which to proceed, as derived experimentally in the present study. This figure indicates that there are patterns of obstacle locations and self-location history that are strongly correlated with the respective optimal movements. Intuitively, the CNN of the present study learns such movement-specific patterns of GOSELO maps by supervision.

B. Relation to recurrent neural networks

The objective of using self-location maps as the input of the CNN is to consider the agent's movement history. Getting trapped in local minima is most often addressed during the navigation task. The agent may sometimes travel back and forth over the same location and never reach the goal.¹ In order to prevent this problem, we train a CNN with the information of visited locations so that the CNN has the preference of moving toward an unvisited location. The images in the bottom rectangle in Fig. 5 indicate that such a preference exists, where darker pixels show the locations the agent has visited. The top-left image in the bottom rectangle, for instance, shows that when the agent moves to the right, the agent tends to have already visited the left side.

¹See from 01:50 in the supplementary video.

Fig. 5. Average images of GOSELO maps for the respective optimal directions (shown in the bottom) in which to proceed. The images for channels 1 to 3 of GOSELO (top three rows) correspond to the sub-region images of obstacle maps, whereas those for channels 4 to 6 (bottom three rows) correspond to the sub-region images of past self-location maps. Note that an obstacle map has binary pixel values, which is one if a pixel is occupied by an obstacle, whereas a past self-location map has integer numbers that indicate how many times the agent has visited the locations.

An alternative solution to considering temporal information is to use a recurrent neural network (RNN), where the outputs of hidden layers in previous time steps are input to some middle layer of the network in the current time step. Basic RNN (including LSTM) with certain parameters would perform similarly to our approach; however, it is ambiguous which parameters, as well as architecture of RNN, should be used for this task. Indeed, the use of a CNN with GOSELO can actually be regarded as a specific type of RNN, where the outputs of the final layers in all of the previous time steps from the beginning are converted in the form of a three-channel image and are concatenated to the current obstacle map as input. Note that the past self-location map implicitly represents the history of the network outputs, i.e., the history of directions that the agent selected. Our approach of using the past self-location map as additional input is computationally efficient because it increases only the computation time of the first convolutional layer, which is negligible compared to the total computation time.

C. Training of the CNN

We randomly generated various pairings of starting location and goal, which we refer to hereinafter as scenarios, and computed the optimal paths via A* search [6]. We then extracted the direction of the next step at every pixel on the path as the ground truth of the CNN output. More specifically, we repeat a cycle of moving the agent one step forward, updating the obstacle map and the past self-location map, converting these updated maps into a GOSELO map (CNN input), and computing the direction of the next step (CNN output) by A* search. We use AlexNet [11] as the default architecture of the CNN, whereas the first convolutional layer takes an image with
six channels and the last fully connecte
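
The GOSELO transformation of Section III-A (rotate so that G is directly above P, center the midpoint M, crop at several scales, rescale by area averaging) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `goselo_channel` and `goselo_map`, the 8 × 8 output size, and the `sub` parameter are hypothetical; L is taken as the Euclidean distance between P and G; the crop sides are assumed to be L + 4, 4L, and 8L, as the labels in Fig. 4 suggest; and area averaging is approximated by averaging a small grid of nearest-neighbor samples per output cell.

```python
import math


def goselo_channel(grid, p, g, side, out=8, sub=4):
    """Rotate, crop, and rescale one map into an out x out image.

    grid: 2D list indexed [row][col]; p, g: (row, col) of agent and goal.
    side: edge length of the cropped square, in source pixels.
    sub:  samples per output cell and axis (approximates area averaging).
    """
    (pr, pc), (gr, gc) = p, g
    length = math.hypot(gr - pr, gc - pc)
    if length == 0:
        raise ValueError("agent is already at the goal")
    ur, uc = (gr - pr) / length, (gc - pc) / length  # unit vector P -> G ("up")
    rr, rc = uc, -ur                                 # "up" turned 90 deg clockwise ("right")
    mr, mc = (pr + gr) / 2.0, (pc + gc) / 2.0        # midpoint M of line GP
    cell = side / out                                # source pixels per output cell
    rows, cols = len(grid), len(grid[0])

    image = []
    for i in range(out):
        row = []
        for j in range(out):
            total = 0.0
            for a in range(sub):
                for b in range(sub):
                    v = side / 2.0 - (i + (a + 0.5) / sub) * cell  # offset toward G
                    h = (j + (b + 0.5) / sub) * cell - side / 2.0  # offset to the right
                    r = int(math.floor(mr + v * ur + h * rr + 0.5))
                    c = int(math.floor(mc + v * uc + h * rc + 0.5))
                    if 0 <= r < rows and 0 <= c < cols:  # outside the map counts as 0
                        total += grid[r][c]
            row.append(total / (sub * sub))
        image.append(row)
    return image


def goselo_map(obstacles, visits, p, g, out=8):
    """Six channels: three scales of the obstacle map, then three of the visit map."""
    length = math.hypot(g[0] - p[0], g[1] - p[1])
    sides = (length + 4, 4 * length, 8 * length)  # assumed crop sizes (see lead-in)
    return ([goselo_channel(obstacles, p, g, s, out) for s in sides]
            + [goselo_channel(visits, p, g, s, out) for s in sides])
```

Because the crop is defined relative to the line GP, two scenarios that differ only by a rotation of the whole map yield the same channel image, which is the property the text relies on when it says the representation correlates with self-movement rather than with the layout of the environment.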
