Improved Learning Accuracy for Learning Stable Control from Human Demonstrations

Shaokun Jin(1,2), Zhiyang Wang(1,3,4), Yongsheng Ou(1,3,4) and Yimin Zhou(3,4)

Abstract: Learning from Demonstration (LfD) has been identified as an effective method for making robots adapt to similar kinds of tasks. In this work, a framework of learning from demonstration is proposed for modelling robot motions. We present an approach based on dimension ascending to learn a dynamical system so that the reproduced motions closely follow the demonstrations. In addition, the reproductions ultimately reach and stop at the target, which reflects the robustness of the method. Therefore, the accuracy and stability of the system can be better guaranteed simultaneously. The effectiveness of the proposed approach is verified by handwriting experiments on the LASA dataset.

Index Terms: Learning from demonstration, dynamical systems, point-to-point motions, stability analysis.

I. INTRODUCTION

The rapid development of robotics has played an invaluable role in the progress of human society, bringing far-reaching influences on the production mode and living style of mankind [1]-[3]. However, there are still many tasks that humans can easily accomplish but robots cannot. In factories, most robot production lines are good at performing large-scale pre-programmed tasks but are less competent with products of low volume and rich variety. In households, even though robots have gradually been adopted, most common daily tasks can hardly be performed by them because of complicated procedures or high-precision manipulation requirements. In addition, robots with poor robustness can hardly adapt to unstructured environments, which are always filled with disturbances and uncertainties.

As one of the most significant intelligent control technologies, Learning from Demonstration (LfD), also called Programming by Demonstration (PbD) [4], [5], greatly enhances the ability of robots to automatically master human
control strategies. The main idea of this framework is to demonstrate

(1) S. Jin, Z. Wang and Y. Ou are with Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China.
(2) S. Jin is with the Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen 518055, China.
(3) Z. Wang, Y. Ou and Y. Zhou are with Guangdong Provincial Key Laboratory of Robotics and Intelligent System, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Blvd., Shenzhen, China.
(4) Z. Wang, Y. Ou and Y. Zhou are also with Key Laboratory of Human-Machine-Intelligence Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, 1068 Xueyuan Blvd., Shenzhen, China.
Yongsheng Ou is the corresponding author (ys ou).
This work was jointly supported by National Natural Science Foundation of China (Grants No. U1613210), Guangdong Special Support Program (2017TX04X265), Primary Research

(b) transform the S-shape to a 3-dimensional trajectory; (c) the balls are the isotimic surfaces of a quadratic Lyapunov function, and the transformed trajectory is consistent with the quadratic Lyapunov function; (d) map the reproduced trajectory onto a 2-dimensional one as the final result.

methods for motion learning. Dynamical Movement Primitives (DMP) were proposed by Ijspeert et al. [8]; they provide building blocks that can be utilized and modulated to perform complex motions in real time. The same authors also introduced dynamical systems for movement imitation in [9]. Duan et al. proposed the Fast and Stable Learning of Dynamical Systems (FSM-DS) [10], which emphasizes improving the learning speed, a property that is crucial for practical applications. However, an inherent dilemma of FSM-DS is that the model learned under the stability constraints might yield inaccurate reproductions when the given demonstrations violate the Lyapunov function (Fig. 1). By learning a Stable Estimator of Dynamical Systems, a new framework was provided for the robot to learn point-to-point motions [11]. Different from the work
of Ijspeert et al. on coupled dynamical systems [8], this framework models a desired dynamical system in the state space instead of the phase space, which increases the expressiveness of the learned control strategy. CLF-DM [12] divides the model learning into three procedures to learn the human control strategy. Neural Imprinted Stable Vector Fields (NIVF) was devised in [13] to generate a strategy model via neural networks. K. Neumann et al. investigated task-dependent Lyapunov candidates to reduce the influence of the stability constraints on accuracy [14]. SEDS [15] was proposed to improve the accuracy by combining the diffeomorphism method. Considering that the accuracy

3: Extract x_t by Eq. (16) as the reproduction result in the original operational space at the t-th instant.
4: else
5: Compute z_0 by Eqs. (19)-(22), then construct the initial (d+1)-dimensional state from x_0 and z_0.
6: Apply this initial state to the dynamical system in Eq. (24) and compute the (d+1)-dimensional state variables at the t-th instant through numerical integration.
7: Extract x_t by Eq. (16) as the reproduction result in the original operational space at the t-th instant.
8: end if

Fig. 5. The computing rule of the SEA functional. It is easy to compute the tetragon areas enclosed by any two adjacent sampling point pairs, and the SEA is the sum of all such areas along both the reproduced and demonstrated trajectories. (Legend: reference trajectory, reproduction, target, start point, swept error area.)

the robot to reproduce accurate and stable trajectories. The factors causing inaccuracy (that is, reproduced trajectories that cannot closely follow the corresponding demonstrated ones) mainly consist of two aspects. The first is that there are intersections among the demonstrated trajectories. The other, which is also the dominant one, is the stability-versus-accuracy dilemma. Since the approach transforms the d-dimensional demonstrations into (d+1)-dimensional ones for learning, both factors can be avoided. This is the core advantage of the proposed approach. Subsequently, we compare the proposed approach
with state-of-the-art work using the Swept Error Area (SEA) criterion. The SEA functional is commonly used to quantify how similar the reproduction is to the demonstration, and is computed as

$$E = \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n - 1} A\left(x_{t,n},\, x_{t+1,n},\, \hat{x}_{t,n},\, \hat{x}_{t+1,n}\right) \tag{30}$$

where $x_{t,n}$ and $\hat{x}_{t,n}$ represent the sampling points in the demonstration and reproduction trajectories, respectively, $T_n$ is the number of sampling points in the n-th demonstration, and $N$ is the number of demonstrations.

Fig. 4. The reproduced results of the proposed approach on the LASA dataset. (Legend: start point, target, reproductions, demonstrations.)

Fig. 6. Reproductions of the S-shape and W-shape from new start points other than the demonstrated start points by the dynamical system in Eq. (24), panels (a)-(f). In each sub-figure, the left part shows a 3-dimensional reproduction and the right part shows its corresponding 2-dimensional counterpart as the final reproduction result. For each shape, we select 3 different new start points for reproductions. Additionally, each sub-figure includes the same demonstrated trajectory and a reproduced trajectory starting from the demonstrated start point as references. It can be seen that the reproduction from a new start point still maintains geometric and topological features similar to both the demonstration and the reproduction from the demonstrated start point. (Legend: demonstration, reproduction from the demonstrated start point, reproduction from a new start point, target point, start points.)

Fig. 5 shows the general computing rule of SEA, i.e., using the area enclosed by the demonstration and reproduction trajectories to represent the inaccuracy. Thus, the smaller the
enclosed area, the more accurate the reproduction, and vice versa. The SEA results are listed in Table I. Additionally, we select the S-shape and W-shape in the LASA dataset as examples to illustrate the reproduction performance of the dynamical system in Eq. (24) when starting from new points other than the demonstrated start points (Fig. 6).

TABLE I. MEAN SEA FOR THE LASA DATASET
Approach          Mean SEA (mm^2)
FSM-DS            11122.22
CLF-DM            6722.28
NIVF              6246.20
SEDS              6074.81
proposed SEDS     2653.38

Finally, the stability of the proposed approach is also validated. Since the dynamical system in Eq. (24) has been proved in Section IV to have the same stability property as the learned dynamical system in Eq. (23), we only demonstrate the stability performance of the learned dynamical system in Eq. (23) to verify the stability of both systems. By randomly selecting 3-dimensional start points and reproducing trajectories, we can see that all the trajectories finally converge to the equilibrium point (Fig. 7). Due to limited space, we present only three representative results.

Fig. 7. The stability evaluation of the proposed approach on the LASA dataset. We randomly pick the G-shape, W-shape and S-shape for illustration.

VI. CONCLUSION

This paper mainly addresses the dilemma between stability and accuracy in the robot motion modelling problem. Different from conventional approaches, we choose to ascend the dimension of the provided demonstrated motions in an attempt to avoid such a trade-off. With the proposed approach, we first transform the given d-dimensional trajectories into (d+1)-dimensional ones. The projections of the transformed counterparts onto the original operational space are made to be the given d-dimensional examples. Besides, the transformed demonstrations are consistent with the quadratic Lyapunov function. We subsequently learn a model from these motions and let the robot reproduce them with the learned model. The simulations undertaken on the LASA dataset validate the
effectiveness of the proposed approach.

REFERENCES

[1] X. Zhang, Y. Fang, N. Sun, "Minimum-time trajectory planning for underactuated overhead crane systems with state and control constraints," IEEE Transactions on Industrial Electronics, vol. 61, no. 12, pp. 6915-6925, 2014.
[2] W. He, Y. Chen, Z. Yin, "Adaptive Neural Network Control of an Uncertain Robot with Full-State Constraints," IEEE Transactions on Cybernetics, vol. 46, no. 3, pp. 620-629, 2016.
[3] X. Liang, H. Wang, Y. Liu, W. Chen, G. Hu, J. Zhao, "Adaptive task-space cooperative tracking control of networked robotic manipulators without task-space velocity measurements," IEEE Transactions on Cybernetics, vol. 46, no. 10, pp. 2386-2398, 2017.
[4] B. D. Argall, S. Chernova, M. Veloso, and B. Browning, "A survey of robot learning from demonstration," Rob. Auton. Syst., vol. 57, no. 5, pp. 469-483, 2009.
[5] A. G. Billard, S. Calinon, and R. Dillmann, "Learning from Humans," in Springer Handbook of Robotics, 2nd edition, 2016, pp. 1995-2014.
[6] W. He, Y. Dong, C. Sun, "Adaptive Neural Impedance Control of a Robotic Manipulator With Input Saturation," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 46, no. 3, pp. 334-344, 2017.
[7] H. Wang, D. Guo, H. Xu, W. Chen, T. Liu, K. K. Leang, "Eye-in-hand tracking control of a free-floating space manipulator," IEEE Transactions on Aerospace and Electronic Systems, vol. 53, no. 4, pp. 1855-1865, 2017.
[8] A. Ijspeert, J. Nakanishi, H. Hoffmann, "Dynamical movement primitives: Learning attractor models for motor behaviors," Neural Computation, vol. 25, no. 2, pp. 328-373, 2013.
[9] A. J. Ijspeert, J. Nakanishi, and S. Schaal, "Movement imitation with nonlinear dynamical systems in humanoid robots"
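
As a supplementary illustration of the SEA measure in Eq. (30), the following minimal Python sketch sums the tetragon areas between consecutive sampling point pairs and averages over the demonstrations. It is not the authors' implementation: the function names `quad_area` and `swept_error_area` are hypothetical, trajectories are assumed to be 2-dimensional lists of points with equal length per demonstration/reproduction pair, and the tetragon area is computed with the shoelace formula, which is our own choice (for a self-intersecting tetragon it returns the absolute net area, a simplification).

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def quad_area(p1: Point, p2: Point, q1: Point, q2: Point) -> float:
    """Area of the tetragon p1 -> p2 -> q2 -> q1 (shoelace formula).

    p1, p2 are consecutive demonstration samples; q1, q2 are the matching
    reproduction samples. Assumption: absolute net area is used, so a
    self-intersecting tetragon is only approximated.
    """
    verts = [p1, p2, q2, q1]
    twice_area = 0.0
    for i in range(4):
        x0, y0 = verts[i]
        x1, y1 = verts[(i + 1) % 4]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def swept_error_area(
    demos: Sequence[List[Point]], repros: Sequence[List[Point]]
) -> float:
    """Mean swept error area over N demonstration/reproduction pairs."""
    if len(demos) == 0 or len(demos) != len(repros):
        raise ValueError("need the same non-zero number of demos and repros")
    total = 0.0
    for demo, repro in zip(demos, repros):
        if len(demo) != len(repro):
            raise ValueError("each pair must have the same number of samples")
        # Sum over t = 1 .. T_n - 1 of the area between adjacent sample pairs.
        for t in range(len(demo) - 1):
            total += quad_area(demo[t], demo[t + 1], repro[t], repro[t + 1])
    return total / len(demos)


if __name__ == "__main__":
    # Toy data (made up for illustration, not taken from the LASA dataset):
    # the reproduction runs parallel to the demonstration at distance 1.
    demo = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    repro = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
    print(swept_error_area([demo], [repro]))
```

In this toy case the two trajectories enclose two unit squares, so a perfect reproduction would give 0 and the offset one gives 2. If the point coordinates are in millimetres, the result is in mm^2, matching the unit of Table I.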