Learning Multiple Sensorimotor Units to Complete Compound Tasks using an RNN with Multiple Attractors

Kei Kase1,2, Ryoichi Nakajo1, Hiroki Mori3 and Tetsuya Ogata1,2

*This work was based on results obtained from a project commissioned by the New Energy and Industrial Technology Development Organization. 1Department of Intermedia Studies, Waseda University, Tokyo, Japan. 2Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan. 3Future Robotics Organization, Waseda University, Tokyo, Japan. {kase, nakajo, mori, ogata}@idr.ias.sci.waseda.ac.jp

Abstract— As the complexity of robot tasks increases, many general tasks can be considered compound in form, consisting of shorter tasks. Therefore, for robots to generate various tasks, they need to be able to execute shorter tasks in succession, as appropriate to the situation. Following the design principle of constructing an architecture for robots to execute complex tasks compounded from multiple subtasks, this study proposes a visuomotor-control framework with the characteristics of a state machine that trains shorter tasks as sensorimotor units. The design procedure of the training framework consists of four steps: (1) segment the entire task into appropriate subtasks, (2) define the subtasks as states and transitions in a state machine, (3) collect data for the subtasks, and (4) train the neural networks: (a) an autoencoder to extract visual features and (b) a single recurrent neural network to generate the subtasks, realizing a pseudo-state-machine model through a constraint on the hidden values. We implemented this framework on two different robots to perform repetitive tasks with error-recovery motions, subsequently confirming the ability of the robots to switch sensorimotor units based on visual input at the attractors of the hidden values created by the constraint.

I. INTRODUCTION

To increase the applicability of robots, they need to be able to execute general tasks, implemented in as simple a manner as possible. Many general tasks executed by people can be considered compound in nature and, therefore, can be separated into shorter tasks. A similar argument applies to Hierarchical Task Network planning [1], where sequences of decomposed low-level tasks are computed to execute high-level tasks with increased performance. Robots with the ability to learn multiple low-level tasks and complete them under the appropriate circumstances to accomplish a target task would be beneficial.

A state machine is a traditional approach that robots use to execute multiple tasks. Robots controlled by a state machine can choose a specific task for a particular input and embed error recovery for robust task generation. On the other hand, methods for manipulating robots using deep learning have gained prevalence, as they do not require hand-engineered features and can be trained in an end-to-end fashion. Use of a state machine for a specific task can be implemented with deep learning; however, the prepared low-level tasks often involve hand-engineered motions [2]. Although the low-level tasks can also be learned using deep learning for its generalization ability, the functions of choosing low-level tasks and executing them are handled by two different models. As visuomotor control using deep learning has gained increasing attention, previous studies have reported that multi-task learning improves overall performance [3], [4].
Multi-task learning can be applied to learning a compound task from low-level tasks (referred to as sensorimotor units in this work); however, few studies have applied learning of multiple sensorimotor units using a single framework and in series [5]. Therefore, this study extends the work of [5] and focuses on generating learned sensorimotor units applicable to completing a target task according to state-machine-like characteristics.

Hand-engineering robot control is laborious and becomes difficult as the complexity of the task increases, because it requires modeling both the robot and the working environment, processing sensory inputs, and defining a state machine fit for the task. We utilize a predictive-learning framework for the consecutive generation of low-level tasks that can be trained in an end-to-end fashion within a few days without the need for hand-engineering. Our framework comprises two neural networks for visuomotor control in an end-to-end fashion. The first neural network is an autoencoder (AE) that extracts visual features autonomously from images captured by a camera. The second is a recurrent neural network (RNN) that uses integrated information from the current visual features and the current robot joint angles to predict the next step. For an RNN to generate tasks consecutively, the work of [5] implemented a constraint on the hidden values to form a single attractor with behavior similar to a point attractor. Since the limitation of a single attractor restricts the representations the framework can have, we modified the constraint to form multiple attractors and embed the characteristics of a state machine into the RNN.

II. RELATED WORKS

It is common for traditional robot-manipulation methods to decompose a target task into low-level tasks and combine them into a sequence of actions to generate a more complex task [6]. Decomposing tasks allows the reuse of low-level tasks and the embedding of error-recovery motions, if necessary. Ijspeert et al. [7] proposed a method of learning motion primitives for use in generating complex behaviors to create versatile attractor dynamics from a known environment; however, the design of this dynamical system is hand-engineered. Because hand-engineering often becomes laborious as the complexity of the tasks increases, deep learning has been applied to robot manipulation due to its ability to extract features autonomously and train a network in an end-to-end fashion.

Deep-learning methods used for robot control have successfully allowed robots to learn tasks such as grasping [8] and bottle-cap closing [9]. Levine et al. [9] demonstrated numerous tasks using such a framework, although the tasks were not trained simultaneously. Although not a deep-learning framework, Haruno et al. [10] proposed a method for learning the multiple inverse models necessary for control and for selecting the inverse model appropriate for a given environment, describing the framework as a mixture-of-experts architecture, where a specific model is chosen for a particular purpose.

Learning multiple tasks simultaneously (multi-task learning) is an important aspect of deep learning that aims to improve performance and is often simple, as the network is not specifically designed for a particular task.
Previous studies have applied multi-task learning and demonstrated its effect in the fields of translation [11], vision [12], and speech [13]. Additionally, multi-task learning has been applied to visuomotor control, resulting in improved performance [3], [4]. These methods learned multiple tasks simultaneously using a single neural-network architecture; however, they did not consider generating multiple tasks in series. Rahmatizadeh et al. [14] demonstrated the benefit of multi-task learning using a low-cost robotic arm; however, the transition between tasks was not considered. Yu et al. [15] trained sensorimotor units separately using a single reinforcement-learning architecture to generate sensorimotor units capable of completing target tasks, using a separate network architecture specialized in deciding which sensorimotor units to generate. In this study, we focused on implementing the functions of both learning multiple sensorimotor units and generating them in series, as appropriate to a given situation, using a single network architecture.

As a deep-learning framework for visuomotor control, reinforcement learning allows robots to acquire motor skills from trial and error, through which robots can learn unconventional and non-obvious motions to complete tasks; however, the exploration needed to optimize the network can be demanding for real robot systems [16]. To overcome this problem, learning methods such as guided policy search [9] and transfer from a simulated to a real robot domain [17] have been proposed. A reinforcement-learning algorithm is optimized using a reward function, which can be difficult to design.

Predictive learning is another deep-learning framework for visuomotor control [18]. Predictive learning allows robots to acquire motor skills through demonstration, allowing them to learn a model of the environment. The robot predicts subsequent steps from the experience gained during the demonstrations; it is thereby trained to predict the next step from previous inputs. Therefore, a predictive-learning algorithm is optimized from the demonstrations and requires minimal configuration, as a task-specific reward function is unnecessary.

This study utilized a predictive-learning framework because only minimal configuration is required. By segmenting tasks into sensorimotor units, it is possible to train repetitive tasks, embed error-recovery motions, and lessen the burden of preparing demonstrations for robots to learn from. We trained multiple sensorimotor units using a single predictive-learning framework in an end-to-end fashion and applied a constraint enabling the framework to consecutively generate the sensorimotor units to complete the target task.

III. METHODS

In this section, we present our proposed framework of predictive learning for visuomotor control with the ability to consecutively generate multiple sensorimotor units. The proposed framework comprises two types of deep neural networks: one for extracting visual features and one for predictive visuomotor learning. Based on the characteristics of a state machine used to generate multiple sensorimotor units, we used an RNN-based architecture as the predictive-learning framework to implement behavior similar to that of a state machine. This study utilized a convolutional AE (CAE) to extract image features autonomously and a long short-term memory (LSTM) network to predict the next robot joint angles based on previous visuomotor inputs (Fig. 1).
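As a concrete illustration of this pipeline, the following is a minimal sketch in PyTorch. The layer sizes, the 20-dimensional image feature, and the 12 joint angles are our own illustrative assumptions (the paper does not specify them at this point); only the overall CAE-plus-LSTM structure follows Fig. 1.

```python
# A minimal sketch of the CAE + LSTM pipeline in Fig. 1 (PyTorch assumed).
# Layer sizes, feature dimension, and joint count are illustrative, not the paper's.
import torch
import torch.nn as nn

class CAE(nn.Module):
    """Convolutional autoencoder; no pooling, so positional information is kept."""
    def __init__(self, feat_dim=20):
        super().__init__()
        self.encoder = nn.Sequential(            # 3x128x128 -> feat_dim
            nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(32 * 32 * 32, feat_dim),   # bottleneck: the image feature
        )
        self.decoder = nn.Sequential(            # feat_dim -> 3x128x128
            nn.Linear(feat_dim, 32 * 32 * 32), nn.ReLU(),
            nn.Unflatten(1, (32, 32, 32)),
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, img):
        feat = self.encoder(img)
        return self.decoder(feat), feat          # reconstruction and feature

class VisuomotorLSTM(nn.Module):
    """Predicts the next visuomotor vector from the current one."""
    def __init__(self, feat_dim=20, n_joints=12, hidden=100):
        super().__init__()
        self.lstm = nn.LSTMCell(feat_dim + n_joints, hidden)
        self.out = nn.Linear(hidden, feat_dim + n_joints)

    def forward(self, feat, joints, state=None):
        h, c = self.lstm(torch.cat([feat, joints], dim=-1), state)
        return self.out(h), (h, c)               # prediction for step t+1
```

The CAE would be trained first on the reconstruction error alone, after which its bottleneck output serves as the visual half of the visuomotor vector fed to the LSTM.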
To create attractors and form a pseudo-state machine, we applied a constraint to the hidden values of the LSTM. The details of the CAE, the LSTM, the implementation of the pseudo-state machine, the teaching of the framework, and the generation of sensorimotor units are described in the following sections.

Fig. 1: Overview of the predictive visuomotor-control model. The image is captured from the robot, and the CAE extracts the image feature. The captured joint angles are concatenated with the extracted image feature to allow the LSTM to predict the next step. The predicted joint angles are then signaled to the robot.

A. Image-feature Extraction

We utilize the CAE as a visual-processing architecture to autonomously extract image features. The AE consists of encoding layers that learn the important input features and decoding layers that attempt to decode the learned features back into their original form. With its bottleneck-structured neural network, the AE can reduce the dimensionality of the image input and use the compressed data as image features. The CAE uses convolutional layers in the encoder and deconvolutional layers in the decoder. In this study, because visuomotor control uses positional information for task generation, we did not use pooling layers. The most dimensionally reduced feature is at the fully connected layer that connects the convolutional and deconvolutional layers. The CAE was trained using the mean squared error between the input and the output.

B. RNN Architecture and Pseudo-state Machine

We used an LSTM as the RNN-based architecture in our framework to allow prediction of visuomotor information. The LSTM predicts the future observation at time step (t+1) from the current observation and the hidden values of the LSTM at time step t. Therefore, the loss function L_{Predict} used to predict the future observation from the current observation is as follows:

L_{Predict} = \sum_{s=0}^{S} \sum_{t=0}^{T} \| \hat{y}^{(s)}(t) - y^{(s)}(t) \|^2,    (1)

where T is the total number of steps in a task, y^{(s)}(t) and \hat{y}^{(s)}(t) are the training signal and the predicted output, respectively, for the s-th sequence, and S is the number of sequences.

To implement the characteristics of a state machine in the LSTM, we applied a constraint to the hidden values so that the hidden values at the end of one sensorimotor unit and at the start of the next sensorimotor unit become closer. The constraint should create attractors between the sensorimotor units to allow them to switch accordingly based on the input. Additionally, the constraint allows the LSTM to act as a pseudo-state machine; however, there is a possibility of transitioning to undefined states on unexpected input, because the state machine is "pseudo". The attractors and sensorimotor units can be described as either the states or the transitions of the state machine. A description of the pseudo-state machine used for a towel-rolling task (described in the following section) is shown in Figs. 2a and 2b, with the constraint achieved through calculation of the loss function L_{Constraint} as follows:

L_{Constraint} = \sum_{(p,q)} \alpha_{p,q} \| H_p(T) - H_q(0) \|^2,    (2)

where H_p(t) and H_q(t) are the hidden values for sequences p and q, respectively, and \alpha_{p,q} is the parameter that controls the loss of context, which is set to 1 when (p,q) ∈ E and to 0 when (p,q) ∉ E in our model. E is the set of sequence pairs requiring the constraint. The LSTM is trained to minimize both L_{Predict} and L_{Constraint}.
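The following is a minimal sketch of how these two losses could be computed, assuming the illustrative PyTorch modules sketched above. The pair set E and the unit weighting \alpha_{p,q} = 1 follow Eq. (2), while the batch layout, feature split, and helper names are our own.

```python
# A sketch of the training losses in Eqs. (1) and (2) (PyTorch assumed).
# `model` is the illustrative VisuomotorLSTM above; feat_dim=20 matches it.
import torch

def predict_loss(model, seqs, feat_dim=20):
    """Eq. (1): squared next-step prediction error, summed over sequences.
    seqs: list of (T+1, feat_dim + n_joints) visuomotor tensors."""
    loss, traces = torch.zeros(()), []
    for seq in seqs:
        state, hiddens = None, []
        for t in range(len(seq) - 1):
            feat = seq[t, :feat_dim].unsqueeze(0)
            joints = seq[t, feat_dim:].unsqueeze(0)
            pred, state = model(feat, joints, state)
            loss = loss + ((pred.squeeze(0) - seq[t + 1]) ** 2).sum()
            hiddens.append(state[0].squeeze(0))   # hidden value h at step t
        traces.append(torch.stack(hiddens))
    return loss, traces

def constraint_loss(traces, E):
    """Eq. (2): pull the final hidden state of sequence p toward the hidden
    state at the start of sequence q for every pair (p, q) in E (alpha = 1)."""
    return sum((((traces[p][-1] - traces[q][0]) ** 2).sum() for p, q in E),
               torch.zeros(()))
```

Training would then minimize the sum of the two terms over the collected demonstrations, e.g. `loss = predict_loss(...)[0] + constraint_loss(...)`.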
C. Training and Generation Methods

We first designed the target task and how to separate it into sensorimotor units. A human then demonstrates a motion for each sensorimotor unit by teleoperation or direct teaching. The images and robot joint angles are recorded while the robot reproduces the demonstrated motion. Using the trained CAE, the image features are then extracted and concatenated with the corresponding joint angles to create visuomotor signals. The visuomotor data for each sensorimotor unit are then used to train the LSTM to predict the (t+1)-th visuomotor data from the t-th visuomotor data and the hidden values. Calculation of the robot joint angles is performed in a closed loop, whereas the image features are calculated in an open loop.

During the generation process (Fig. 1), the robot first captures the current image and joint angles, after which the image is processed by the CAE to obtain the image features, which are then concatenated with the joint angles to form the current visuomotor data. The LSTM uses these data to predict the subsequent visuomotor data, and the next motor command is signaled to the robot. As the robot finishes executing the commanded signal, the entire generation process is repeated.
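A sketch of this generation loop, again assuming the illustrative modules above; the robot interface (capture_image, read_joints, send_joints) is a hypothetical placeholder, not an API from the paper.

```python
# A sketch of the generation loop in Fig. 1 (PyTorch assumed). The robot I/O
# methods used here are hypothetical placeholders, not the authors' API.
import torch

@torch.no_grad()
def generate(cae, lstm, robot, n_steps, feat_dim=20):
    state = None
    for _ in range(n_steps):
        img = robot.capture_image()      # (1, 3, 128, 128) RGB tensor
        joints = robot.read_joints()     # (1, n_joints) current joint angles
        _, feat = cae(img)               # CAE extracts the image feature
        pred, state = lstm(feat, joints, state)  # predict next visuomotor step
        robot.send_joints(pred[:, feat_dim:])    # signal predicted joint angles
```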
IV. EXPERIMENT

The evaluation tasks are designed to determine whether the framework can learn multiple sensorimotor units and generate them in series to complete a target task according to the characteristics of a state machine, with the tasks designed to take advantage of those characteristics. We prepared a towel-rolling task to embed repetitive motion and to demonstrate positional generalization and robustness against visual distractors. Additionally, we prepared a skewering task to demonstrate the low programming cost of this framework, as barely any configuration is needed. By involving an error-recovery motion, the skewering task demonstrates the importance of collecting training data separately as sensorimotor units and learning them with state-machine characteristics.

We used the Nextage Open robot from Kawada Robotics to generate the towel-rolling task as a goal (Fig. 2c). This is a dual-arm robot with a camera mounted on the head; each arm has six degrees of freedom (DoF) and a gripper attached. We divided the towel-rolling task into five different sensorimotor units: ready w/o relocate, ready to relocate, relocate, roll, and finish. Specifically, the robot prepares to interact with the towel. If the towel is not in a favorable position for rolling, the robot relocates the towel. If the towel is in a favorable position for rolling, the robot rolls the towel until it is completely rolled. When the towel is rolled, the robot returns to the initial position. For this task, the hidden values of the LSTM are constrained to have three attractors (Figs. 2a and 2b). Fig. 2a shows the state machine when the sensorimotor units are represented as states and the attractors as transitions of the state machine; Fig. 2b shows the state machine when the sensorimotor units are represented as transitions and the attractors as states. The conceptualized image of the hidden values of the LSTM is similar to that depicted in Fig. 2b, and this study uses this depiction unless otherwise stated.

The training data for the towel-rolling task include demonstrations of the robot manipulating a yellow towel at 25 different positions in a square area, where each point is 2 cm apart. If the towel is at the center of the square area, the roll task is demonstrated without relocation. If the towel is at any of the other 24 points, the relocation task is demonstrated before the roll task. The Nextage robot was trained with data equal to three full towel-rolling tasks at each towel position. The training data for the Nextage robot were created directly by teleoperating the robot using a 3D mouse. As the robot reproduces the training motions, it records images and joint angles simultaneously at 10 fps. The robot captures a 128×128 RGB image using the left camera on its head, and the CAE extracts the image features by reducing
