Learning Multirobot Joint Action Plans from Simultaneous Task Execution Demonstrations

Murilo Fernandes Martins
Dept. of Elec. and Electronic Engineering
Imperial College London
London, UK

Yiannis Demiris
Dept. of Elec. and Electronic Engineering
Imperial College London
London, UK
y.demiris@imperial.ac.uk

ABSTRACT
The central problem of designing intelligent robot systems which learn by demonstrations of desired behaviour has been largely studied within the field of robotics. Numerous architectures for action recognition and prediction of intent of a single teacher have been proposed. However, little work has been done addressing how a group of robots can learn by simultaneous demonstrations of multiple teachers.

This paper contributes a novel approach for learning multirobot joint action plans from unlabelled data. The robots firstly learn the demonstrated sequence of individual actions using the HAMMER architecture. Subsequently, the group behaviour is segmented over time and space by applying a spatio-temporal clustering algorithm.

The experimental results, in which humans teleoperated real robots during a search and rescue task deployment, successfully demonstrated the efficacy of combining action recognition at the individual level with group behaviour segmentation, spotting the exact moment when robots must form coalitions to achieve the goal, thus yielding reasonable generation of multirobot joint action plans.

Categories and Subject Descriptors
I.2.9 [Artificial Intelligence]: Robotics

General Terms
Algorithms, Design, Experimentation

Keywords
Learning by Demonstration, Multirobot Systems, Spectral Clustering

1. INTRODUCTION
A substantial amount of studies in Multirobot Systems (MRS) addresses the potential applications of engaging multiple robots to collaboratively deploy complex tasks such as search and rescue, distributed mapping and exploration of unknown environments, as well as hazardous tasks and foraging; for an overview of the field, see [13]. Designing distributed intelligent systems, such as MRS, is a profitable technology which brings benefits such as flexibility, redundancy and robustness, among others.

Cite as: Learning Multirobot Joint Action Plans from Simultaneous Task Execution Demonstrations, M. F. Martins, Y. Demiris, Proc. of 9th Int. Conf. on Autonomous Agents and Multiagent Systems (AAMAS 2010), van der Hoek, Kaminka, Lespérance, Luck and Sen (eds.), May 10-14, 2010, Toronto, Canada, pp. 931-938. Copyright © 2010, International Foundation for Autonomous Agents and Multiagent Systems. All rights reserved.

[Figure 1: The P3-AT mobile robots used in this paper, equipped with onboard computers, cameras, laser and sonar range sensors.]

Similarly, a substantial amount of studies have proposed numerous approaches to robot Learning by Demonstration (LbD); for a comprehensive review, see [1]. Equipping robots with the ability to understand the context in which they interact, without the need of configuring or programming the robots, is an extremely desired feature.

Regarding LbD, the methods which have been proposed are mostly focussed on a single-teacher, single-robot scenario. In [7], a single robot learnt a sequence of actions demonstrated by a single teacher. In [12], the authors presented an approach where a human acted both as a teacher and collaborator to a robot. The robot was able to match the predicted resultant state of the human's movements to the observed state of the environment based on its underlying capabilities.
A supervised learning method was presented in [4] using Gaussian mixture models, in which a four-legged robot was teleoperated during a navigation task.

Few studies addressed the prediction of intent in adversarial multiagent scenarios, such as the work of [3], in which group manoeuvres could be predicted based upon existing models of group formation. In the work of [5], multiple humanoid robots requested a teacher's demonstration when facing unfamiliar states. In [14], the problem of extracting group behaviour from observed coordinated manoeuvres of multiple agents along time was addressed by using a clustering algorithm. The method presented in [9] allowed a single robot to predict the intentions of 2 humans based on spatio-temporal relationships.

However, the challenge of designing an MRS system in which multiple robots learn group behaviour by observation of multiple teachers concurrently executing a task has not been addressed hitherto.

This paper presents a novel approach for LbD in MRS, the Multirobot Learning by Demonstration (MRLbD), in which multiple robots are capable of learning task solution procedures (denominated multirobot joint action plans) by observing the simultaneous execution of desired behaviour demonstrated by humans. This is achieved by firstly learning a demonstrated sequence of actions at single robot level, and subsequently applying a Spectral Clustering (SC) algorithm to segment the group behaviour. Lastly, a multirobot joint action plan is generated by combining the actions at single robot level with the segmented group behaviour, resulting in a sequence of individual actions for each robot, as well as joint actions that require coalition formation.
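For readers unfamiliar with the technique, the following is a minimal, self-contained sketch of spectral clustering over spatio-temporal observations (here, rows of the form [x, y, t]), in the spirit of the segmentation stage just described. The affinity construction, parameter values and toy data are our own illustrative assumptions; the paper's actual segmentation follows the algorithm of [14], implemented in Matlab.

```python
# Sketch of spectral clustering over spatio-temporal robot observations.
# Affinity construction and parameters are illustrative assumptions.

import numpy as np

def spectral_clusters(features, k, sigma=1.0):
    """Cluster rows of `features` (e.g. [x, y, t] per observation)."""
    # Gaussian affinity from pairwise spatio-temporal distances.
    sq = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    A = np.exp(-sq / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Normalised affinity L = D^{-1/2} A D^{-1/2}.
    d = A.sum(1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = D_inv_sqrt @ A @ D_inv_sqrt
    # Embed each observation with the k leading eigenvectors, row-normalised.
    vals, vecs = np.linalg.eigh(L)
    X = vecs[:, -k:]
    X /= np.linalg.norm(X, axis=1, keepdims=True)
    # Tiny fixed-iteration k-means to label the embedded points.
    rng = np.random.default_rng(0)
    centres = X[rng.choice(len(X), k, replace=False)]
    for _ in range(20):
        labels = ((X[:, None, :] - centres[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (labels == j).any():
                centres[j] = X[labels == j].mean(0)
    return labels

# Two robots moving together near the origin, one far away:
# the far observations should end up in their own cluster.
obs = np.array([[0, 0, 0], [0.2, 0, 0], [0.1, 0.1, 1], [0.3, 0.1, 1],
                [5, 5, 0], [5.1, 5, 1]], dtype=float)
print(spectral_clusters(obs, k=2))
```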
The remainder of this paper is organised as follows: Section 2 discusses the issues that must be addressed when designing an MRLbD system. In Section 3, the teleoperation platform, developed to allow remote control of real robots (pictured in Fig. 1) engaged in a realistic search and rescue activity, is detailed. This section also describes how the HAMMER architecture [7] and an implementation of the SC algorithm proposed by [14] were utilised to tackle the action recognition and group behaviour segmentation issues. Then, Section 4 describes the experimental tests carried out to demonstrate the multirobot plan generation, and Section 5 analyses the results obtained. Finally, Section 6 presents the conclusions and further work.

2. SYSTEM DESIGN ISSUES
The MRLbD architecture proposed in this paper is based upon a platform for robot teleoperation, whose design was inspired by the work of [8] and [16], as well as the LbD architectures presented in [7, 9].

The design of any MRS encompasses issues common to this field of research. In particular, systems for robot teleoperation bring forth three central design issues, which are discussed in the following sections.

2.1 Human vs. robot-centred perception
Platforms for teleoperation usually provide restricted perception of the remote environment in which a robot is inserted. However, depending upon the application and the environment, the human can be strategically positioned in such a way that global, unrestricted observation is feasible.

The first design issue to be addressed is that of human vs. robot-centred perception: should the human be allowed to observe the world with their own senses, or should they have their perception restricted to robot-mediated data?

While the former results in a simplified system, the aforementioned potential applications of MRS inevitably fall into the latter. The teleoperation platform implemented in this work is therefore based upon a restricted perception of the environment, providing the human with the same remote perception that the robot can acquire locally through its sensors (the human is "placed into the robot's perceptual shoes").

2.2 Observations of human behaviour
Another key issue in designing a teleoperation platform is how to define the commands that are sent to the robot. The human's actions are not directly observable to the robots. Although the humans are "placed in the robot's perceptual shoes", a robot has access only to its teleoperator's manoeuvre commands, rather than the human's intended actions, as illustrated in Fig. 2.

[Figure 2: Human action space vs. observable data. Possible human actions (push object, search for object, move, wander) must be inferred; the observable data is a series of joystick commands.]

Two straightforward possibilities arise: send commands which represent the robot's underlying capabilities, or send control signals, such as motor commands, to the robot.

The former assumption is coherent with most LbD methods, as the robot's primitive behaviours are used to match the actions observed by the robots. However, the human would be limited to a few inflexible, handcrafted actions programmed into the robots, which are usually application specific and also dependent upon the robot's design.

Conversely, the latter possibility allows humans to play with a full repertoire of actions, restricted only by environmental conditions. The teleoperation platform herein described makes use of this feature, even though matching motor commands to the robot's primitive behaviours is a more complex issue; a minimal sketch of such a mapping is given below.

A key factor which strongly motivated this decision is the gain in flexibility, allowing the presented approach to be applied to distinct robots, such as unmanned aerial vehicles and wheeled mobile robots, in a handful of potential applications in the field of MRS with little or no modification required.
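As a concrete illustration of the control-signal option, here is a minimal sketch of how raw joystick axes might be translated into the translational and rotational speeds that the platform sends to the robots (cf. Section 3.1). The axis conventions, speed limits, deadzone and function name are our own assumptions, not the authors' implementation.

```python
# Hypothetical mapping from joystick axes to motor commands
# (translational and rotational speeds). Axis conventions and
# speed limits are illustrative assumptions.

MAX_TRANS_SPEED = 0.7   # m/s, assumed limit for a P3-AT-class robot
MAX_ROT_SPEED = 1.5     # rad/s, assumed limit

def joystick_to_motor_command(axis_y, axis_x, deadzone=0.05):
    """Map normalised joystick axes in [-1, 1] to (v, w).

    axis_y: forward/backward deflection -> translational speed v
    axis_x: left/right deflection       -> rotational speed w
    """
    # Ignore tiny deflections so the robot does not creep.
    if abs(axis_y) < deadzone:
        axis_y = 0.0
    if abs(axis_x) < deadzone:
        axis_x = 0.0
    v = axis_y * MAX_TRANS_SPEED
    w = -axis_x * MAX_ROT_SPEED  # pushing right turns clockwise
    return v, w

print(joystick_to_motor_command(0.8, -0.2))  # -> (0.56, 0.3)
```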
2.3 Action recognition at single robot level
When addressing the problem of recognising observed actions, a mismatch between observed data and robot internal states might happen. This is a common issue known as the correspondence problem [10].

Even though the robot is able to passively observe the actions being performed by the human operator using its own sensors, mapping manoeuvre commands to robot primitive behaviours inevitably falls into the correspondence problem. Recognising actions from observed data when using a teleoperation platform becomes even more challenging, as the state of the environment is only partially observable. Thus, important variables may not be present at a particular observation of the state of the environment.

Furthermore, human actions have deliberative and reactive components. During task execution, a human deliberately actuates on the joystick in order to manoeuvre the teleoperated robot. In addition, sudden changes in the perceived environment (e.g., a moving obstacle appears in front of the robot) result in a reactive behaviour of the human, attempting to change the robot's course. Likewise, the human may perform certain actions sequentially or simultaneously, resulting in a combination of actions, while the robot has access to the joystick commands only.

[Figure 3: Diagrammatic statement of the HAMMER architecture. Based on state s_t, multiple inverse models (I_1 to I_n) compute motor commands (M_1 to M_n), with which the corresponding forward models (F_1 to F_n) form predictions P_1 to P_n regarding the next state s_{t+1}, which are verified at s_{t+1}.]

In order to recognise actions from observed data and manoeuvre commands, this paper makes use of the Hierarchical Attentive Multiple Models for Execution and Recognition (HAMMER) architecture [7], which has been proven to work very well when applied to distinct robot scenarios. HAMMER is based upon the concept of multiple hierarchically connected inverse-forward models. In this architecture, an inverse model has as inputs the observed state of the environment and the target goal(s), and its outputs are the motor commands required to achieve or maintain the target goal(s). On the other hand, forward models have as inputs the observed state and motor commands, and the output is a prediction of the next state of the environment. As illustrated in Fig. 3, each inverse-forward pair results in a hypothesis by simulating the execution of a primitive behaviour; the predicted state is then compared to the observed state to compute a confidence value. This value represents how correct that hypothesis is, thus determining which robot primitive behaviour would result in the most similar outcome to the observed action.
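The following is a minimal sketch, under our own assumptions, of the inverse-forward evaluation loop just described: each hypothesis simulates one primitive behaviour, predicts the next state, and the prediction error is turned into a confidence score. The 2D-pose state, the constant-command inverse models and the confidence function are deliberate simplifications for illustration, not the authors' implementation.

```python
# Minimal sketch of HAMMER-style inverse-forward hypothesis scoring.
# The 2D-pose state, the primitive behaviours and the confidence
# function are simplifying assumptions for illustration.

import math

def forward_model(state, command, dt=0.5):
    """Predict the next pose (x, y, theta) given (v, w) motor commands."""
    x, y, th = state
    v, w = command
    return (x + v * math.cos(th) * dt, y + v * math.sin(th) * dt, th + w * dt)

# Each inverse model maps the current state to the motor command its
# primitive behaviour would issue; here they are simple constants.
inverse_models = {
    "move_forward": lambda s: (0.5, 0.0),
    "turn_left":    lambda s: (0.0, 0.8),
    "turn_right":   lambda s: (0.0, -0.8),
}

def confidence(predicted, observed):
    """Higher confidence for smaller prediction error."""
    err = math.dist(predicted[:2], observed[:2]) + abs(predicted[2] - observed[2])
    return 1.0 / (1.0 + err)

def recognise(state_t, state_t1):
    """Score every hypothesis at s_t against the observed s_{t+1}."""
    scores = {}
    for name, inverse in inverse_models.items():
        command = inverse(state_t)                    # inverse model: state -> command
        prediction = forward_model(state_t, command)  # forward model: -> predicted s_{t+1}
        scores[name] = confidence(prediction, state_t1)
    return max(scores, key=scores.get), scores

best, scores = recognise((0.0, 0.0, 0.0), (0.25, 0.0, 0.0))
print(best)  # -> "move_forward"
```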
3. SYSTEM IMPLEMENTATION
The MRLbD approach proposed in this paper is demonstrated using the aforementioned platform for robot teleoperation, which consists of client/server software written in C++ to control the P3-AT robots (Fig. 1) utilised in the experiments, as well as an implementation of the HAMMER architecture for action recognition and a Matlab implementation of the SC algorithm similar to the one presented in [14]. An overview of the teleoperation platform can be seen in Fig. 4.

[Figure 4: Overview of the teleoperation platform developed in this paper. The human-robot interface (joystick, visualisation, robot control) communicates over a Wi-Fi network with the robot cognitive capabilities (environment perception, Player server, logging, robot hardware); plan extraction combines action recognition (HAMMER) with group behaviour segmentation to produce the multirobot plan.]

The server software comprises the robot cognitive capabilities and resides on the robot's onboard computer. The server is responsible for acquiring the sensor data and sending motor commands to the robot, whereas the client software runs on a remote computer and serves as the interface between the human operator and the robot.

3.1 The robot cognitive capabilities
Within the robot cognitive capabilities block, the server communicates with the robot hardware by using the well-known robot control interface Player [6], which is a network server that works as a hardware abstraction layer to interface with a variety of robotic hardware.

Initially, the internal odometry sensors are read. This data provides the current robot's pose, which is updated as the robot moves around and used as the ground truth pose for calculating objects' poses and building the 2D map of the environment. Odometry sensors are known to inherently accumulate incremental errors and hence lead to inaccurate pose estimations; nevertheless, it is shown later, in Section 5, that this inaccuracy was immaterial to the results.

The image captured from the robot's camera (320x240 pixels, coloured, at 30 frames per second) is compressed using the JPEG algorithm and sent to the client software over a TCP/IP connection using the Wi-Fi network.

Additionally, the image is also used to recognise objects based upon a database of known objects, using the approach presented in [15]. This algorithm detects the pose (Cartesian coordinates in the 3D space, plus rotation on the respective axes) of unique markers. The known objects database comprises a set of unique markers and the object that each marker is attached to, as well as offset values to compute the pose of the object based upon the detected marker's pose.

A short-memory algorithm, based upon confidence levels, was also implemented to enhance the object recognition; an object's pose is tracked for approximately 3 seconds after it has last been seen. This approach was found extremely useful during the experiments, as the computer vision algorithm cannot detect markers from distances greater than 2 metres, and occlusion is likely to happen in real applications.

The Sick LMS-200 laser range scanner provides millimetre-accuracy distance measurements (from up to 80 metres), ranging from 0 degrees (right-hand side of the robot) to 180 degrees (left-hand side). In addition, 16 sonar range sensors, placed in a ring configuration on the robot, retrieve moderately accurate distance measurements from 0.1 to 5 metres, with a 30-degree field of view each. Despite the lack of precision, the sonar sensors play a fundamental role in the overall outcome of the teleoperation platform: as the human operator has limited perception of the environment, particular manoeuvres (mainly when reversing the robot) may be potentially dangerous and result in a collision. Thus, obstacle avoidance is achieved by using an implementation based upon the well-known VFH (Vector Field Histogram) algorithm [2]. However, the human operator is able to inhibit the sonar readings as desired, a feature which is useful when pushing objects and when passing through narrow gaps and doorways.

In addition, joystick inputs are constantly received from the client and translated into motor commands (translational and rotational speeds), which are then sent to the robot through the Player interface.

Lastly, all the data manipulated by the server is incrementally stored in a log file every 0.5 seconds. The log file comprises a series of observations made by the robot during task execution, each composed of the following elements: a time stamp in the Unix time format; the robot's pose based on odometry data; a list of the poses of recognised objects and their unique identifications; laser and sonar range sensor readings; and finally, joystick inputs and motor commands.
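Since these logged observations are the raw material for the action recognition and clustering stages, the following is a minimal sketch of what one record might look like. The field names and types are our own assumptions derived from the list of elements above, not the authors' actual file format.

```python
# Hypothetical structure for one logged observation, following the
# elements listed above. Field names and types are assumptions.

from dataclasses import dataclass, field
from typing import List, Tuple

Pose = Tuple[float, float, float]  # x (m), y (m), theta (rad)

@dataclass
class Observation:
    timestamp: float               # Unix time, logged every 0.5 s
    robot_pose: Pose               # from odometry
    objects: List[Tuple[int, Pose]] = field(default_factory=list)  # (id, pose)
    laser: List[float] = field(default_factory=list)   # range readings (m)
    sonar: List[float] = field(default_factory=list)   # 16 readings (m)
    joystick: Tuple[float, float] = (0.0, 0.0)         # raw axes
    motor_command: Tuple[float, float] = (0.0, 0.0)    # (v, w)

obs = Observation(timestamp=1274112000.0, robot_pose=(1.2, 0.4, 0.0),
                  objects=[(7, (2.0, 0.5, 1.57))])
print(obs.robot_pose)
```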
3.2 The human-robot interface
The client software constitutes the human-robot interface, in which the visualisation module displays to the human operator the sensor data received from the server. This data comprises the robot's onboard camera image, battery level and Wi-Fi signal strength, sonar and laser range data, and the robot's pose based upon odometry sensors. The image is decompressed and displayed in a dedicated window, while a second window shows a sketch of the robot in the centre, as well as the sonar and laser data. A screenshot of the human-robot interface can be seen in Fig. 5.

Furthermore, line segments are extracted from the laser data (using an implementation of the Split-and-Merge algorithm described in [11]; a minimal sketch of this extraction is given below) and displayed on top of the raw laser data. These line segments are mostly red-coloured, apart from line segments whose width coincides with a known object's width; these segments become blue-coloured. This colour differentiation is not yet used by the robots, but it is meant to serve as an attention mechanism for the operator, highlighting probable objects of interest based upon their shape.

Also, in case another robot is recognised, a blue-coloured ellipse boundary representing the observed robot's pose is displayed; a green-coloured circle border is drawn otherwise. Note that the sizes of these shapes are scaled according to the real size of the objects.

In addition, an image on the top-right side of the main window displays the map
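The Split-and-Merge extraction referenced above can be sketched as follows: the scan is recursively split at the point farthest from the line joining the segment's endpoints until every point fits a line. This is a minimal sketch of the split phase only (the merge phase, which joins adjacent collinear segments, is omitted), with a threshold and point representation chosen purely for illustration rather than taken from [11].

```python
# Minimal sketch of split-and-merge line extraction from 2D laser
# points. Threshold and point representation are illustrative
# assumptions; the merge phase is omitted for brevity.

import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    num = abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1)
    return num / math.hypot(x2 - x1, y2 - y1)

def split(points, threshold=0.05):
    """Recursively split a scan into segments whose points all lie
    within `threshold` metres of the segment's end-point line."""
    if len(points) < 3:
        return [points]
    a, b = points[0], points[-1]
    dists = [point_line_distance(p, a, b) for p in points[1:-1]]
    worst = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[worst - 1] <= threshold:
        return [points]  # all points fit one line segment
    # Split at the farthest point and recurse on both halves.
    return split(points[:worst + 1], threshold) + split(points[worst:], threshold)

# Two walls meeting at a right angle yield two segments.
scan = [(float(x), 0.0) for x in range(5)] + [(4.0, float(y)) for y in range(1, 5)]
print(len(split(scan)))  # -> 2
```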
