Cognitive Robotic Architecture for Semi-Autonomous Execution of Manipulation Tasks in a Surgical Environment

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019

Giacomo De Rossi(1), Marco Minelli(2), Alessio Sozzi(3), Nicola Piccinelli(1), Federica Ferraguti(2), Francesco Setti(1), Marcello Bonfè(3), Cristian Secchi(2) and Riccardo Muradore(1)

(1) Department of Computer Science, University of Verona, Verona, Italy ({giacomo.derossi, nicola.piccinelli, francesco.setti, riccardo.muradore}@univr.it)
(2) Department of Engineering Sciences and Methods, University of Modena and Reggio Emilia, Reggio Emilia, Italy ({marco.minelli, federica.ferraguti, cristian.secchi}@unimore.it)
(3) Department of Engineering, University of Ferrara, Ferrara, Italy ({alessio.sozzi, marcello.bonfe}@unife.it)

Abstract: The development of robotic systems with a certain level of autonomy to be used in critical scenarios, such as an operating room, necessarily requires a seamless integration of multiple state-of-the-art technologies. In this paper we propose a cognitive robotic architecture that is able to help an operator accomplish a specific task. The architecture integrates an action recognition module to understand the scene, a supervisory control to make decisions, and a model predictive control to plan collision-free trajectories for the robotic arm, taking into account obstacles and model uncertainty. The proposed approach has been validated on a simplified scenario involving only a da Vinci surgical robot and a novel manipulator holding standard laparoscopic tools.

I. INTRODUCTION

Robotic technology has been pervading our lives since the beginning of the 60s, when the first industrial robots were put in use, expanding then to other fields such as domotics, autonomous guided vehicles and drones, and surgical robots. Technology evolved in all these fields, producing robots able to perform specific preprogrammed tasks in an extremely fast, precise and repetitive way. The next significant leap forward will arise with the introduction of intelligent systems that can operate autonomously or semi-autonomously in cooperation with human agents. This evolution means that the robot has to robustly interact with the environment, perceive it, interpret the actions of the other agents and make decisions on top of it. These robots will need improved dexterity and perception capabilities, provided by cognitive functions that can support them in decision making and performance monitoring and enhance the general quality of task accomplishment. We refer the reader to [1], [2], [3], [4], [5] and the references therein for what concerns the surgical robotics area.

Currently, the classification of autonomy encompasses six basic levels: Level 0 (no autonomy), Level 1 (robot assistance), Level 2 (task autonomy), Level 3 (conditional autonomy), Level 4 (high autonomy) and Level 5 (full autonomy) [6]. Exploring the transition from level 0 to levels 1 and 2, the robot is required to embed cognitive capabilities to provide operative support in a shared-control approach, with the human always in charge of decisions. These levels are the basic building blocks of autonomy, i.e. the ability to understand the task, to plan a proper action and to ensure its safe execution.

In this paper we will focus on providing cognitive capabilities to surgical robots. In particular, we will develop a cognitive robotic architecture for helping an operator perform a cooperative task.
This will be the first step towards semi-autonomous execution of more demanding cooperative tasks over complex surgical procedures. This work is funded by the EU project SARAS (saras-project.eu), which is the acronym of Smart Autonomous Robotic Assistant Surgeon.

A. The SARAS approach

The SARAS solo-surgery platform will be a very sophisticated example of a shared-control system: a surgeon teleoperates a couple of robotic laparoscopic tools and cooperates with an autonomous system on a shared environment (i.e. a manikin of the human abdomen) to perform complex surgical procedures. The goal of SARAS is to substitute the assistant surgeon next to the patient within the operating room with an autonomous system controlling the same kind of standard laparoscopic tools. The general system architecture we are developing is shown in Figure 1.

Fig. 1: Overview of the SARAS solo-surgery platform architecture.

In this scenario the main surgeon is seated at the da Vinci console and remotely controls the da Vinci tools. We started by designing a multi-master multi-slave (MMMS) bilateral teleoperation system where the laparoscopic tools are teleoperated by an assistant, in order to store videos and kinematic time series of the two robots during the surgical procedures [7], [8]. This data will be used in the future to train a machine learning algorithm for understanding the procedure from the beginning to the end, in order to 1) recognize the action of the main surgeon, 2) make decisions on the task that the autonomous arms have to execute to help the surgeon (i.e. what, where and when), and 3) plan collision-free trajectories for the arms, avoiding tools and anatomical structures. Moreover, such data, together with the clinical knowledge [9], will be used to design a supervisory control that handles point 2) above.

B. Problem Statement

We report here on the very first step towards this ambitious goal. We focus on a basic, non-surgical task that resembles a common exercise for training in robotic minimally invasive surgery (R-MIS). The surgeon picks a ring with the da Vinci robotic tool and moves it to the middle of the field of view of the endoscope. The autonomous SARAS system is trained to move to that position and grasp the ring that the surgeon hands over. Finally, the SARAS arm moves and drops the ring into a target area. The experimental setup is shown in Figure 2, while Figure 3 presents the block diagram of the control architecture.

Fig. 2: Experimental setup. (a) da Vinci arm and SARAS arm developed by Medineering. (b) da Vinci robotic tool (left) and commercial laparoscopic tool mounted on the SARAS arm end effector (right), as seen through the 3D endoscope.

It is worth mentioning that the 3D endoscope provides the streaming video both to the surgeon on the da Vinci console and to the SARAS system. This video is analyzed by an Action Recognition module to detect the current action. The most likely action, together with a confidence level, is elaborated by a Supervisory Control that decides what to do and the target position x_g where the SARAS robot has to move. The desired target point is taken as a reference input by a Model Predictive Controller (MPC). The MPC then computes the desired trajectory x_d to move the arm towards x_g, taking into account the confidence level (the smaller the confidence level, the lower the velocity) and the obstacles. The low-level controller of the robot receives x_d and computes the torques needed to command the four degrees of freedom of the robot to move the tip of the grasper to the desired position. In this paper we do not enter into the details of the robot inner controller, since it is quite standard (inverse kinematics plus a PD controller).

Fig. 3: Solo-surgery control architecture (blocks: operator at the da Vinci console, da Vinci robot, 3D endoscope, Action Recognition, Supervisory Controller, Model Predictive Control, SARAS arm with its inner controller).

All the mentioned subsystems will be described in depth in the following sections.
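To make the dataflow of Figure 3 concrete, the following minimal Python sketch wires the three subsystems into one perception-decision-planning loop. Every interface named here (endoscope.read, recognize, supervisor.update, mpc.step, robot.track) is a hypothetical placeholder introduced for illustration; it does not correspond to the actual SARAS code.

```python
# Minimal sketch of the Fig. 3 dataflow (hypothetical interfaces, not the SARAS implementation).

def control_loop(endoscope, recognize, supervisor, mpc, robot):
    """One perception -> decision -> planning -> actuation cycle per endoscope frame."""
    while True:
        frame = endoscope.read()                           # shared video stream (surgeon + SARAS side)
        action, confidence = recognize(frame)              # most likely action + confidence in [0, 1]
        x_g, conf = supervisor.update(action, confidence)  # goal pose for the SARAS arm
        x = robot.end_effector_position()                  # current Cartesian position of the tool tip
        x_d = mpc.step(x, x_g, conf)                       # next desired position; slower when conf is low
        robot.track(x_d)                                   # inner controller: inverse kinematics + PD
```

The only couplings the loop relies on are the ones described above: the recognized action and its confidence drive the supervisory controller, and the same confidence also scales the velocity commanded by the MPC.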
C. Contributions

The main contributions of this paper are:

- a cognitive architecture to interpret the scene, by recognizing the actions of the other agents (i.e. the robotic tools teleoperated by the surgeon), and to make decisions on how to interact with them; this architecture is composed of two modules, an Action Recognition and a Supervisory Controller;

- a Model Predictive Control that exploits the outcomes of the cognitive architecture to plan collision-free trajectories and to modulate the velocity according to how confident the system is about the recognized actions and to the presence of obstacles on the way to the target pose; and

- a seamless integration of perception, decision, planning and action in a simplified but still realistic surgical training scenario.

II. ACTION RECOGNITION

The action recognition module employs a Convolutional Neural Network (CNN) to analyze a sequence of frames and provide action labels to the system. The network used in this work is composed of a sequence of layers that resembles the schema of a VGG network [10], but with a shallower depth. Figure 4 presents a schematic view of the structure. Like most neural networks of this type, its structure is a cascade of convolutional filters (Conv 3x3), increasing in number layer after layer from 64 to 512, interleaved with max-pooling layers and ReLU (Rectified Linear Unit) non-linear activation functions. The kernel size (3x3) has been maintained throughout all layers to improve feature detection at different scales. The classification is the output of two fully connected layers (FC) and a softmax function, which also provides the required confidence percentage.

Fig. 4: Neural network schema for action recognition: the RGB and MHI images are processed simultaneously as a 4-channel enhanced frame (Conv 3x3x64, 3x3x128, 3x3x256 x2, 3x3x512 x4, FC1, FC2; output over the actions A01-A09).
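The paper does not give the full layer-by-layer specification, so the following PyTorch sketch only mirrors the pattern just described: a 4-channel RGB+MHI input, 3x3 convolutions growing from 64 to 512 channels interleaved with ReLU and max pooling, two fully connected layers, and a softmax over the nine actions. The exact number of convolutional blocks, the pooling positions, the FC widths and the motion-history parameters are assumptions made for illustration.

```python
# Illustrative sketch of a shallow VGG-like action classifier over 4-channel
# (RGB + motion history) frames. Layer counts, pooling positions, FC widths and
# the MHI parameters are assumptions; only the overall pattern follows the text.
import numpy as np
import torch
import torch.nn as nn

class ActionCNN(nn.Module):
    def __init__(self, n_actions: int = 9, in_channels: int = 4):
        super().__init__()
        cfg = [64, "M", 128, "M", 256, 256, "M", 512, 512, "M", 512, 512, "M"]
        layers, c = [], in_channels
        for v in cfg:
            if v == "M":
                layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
            else:
                layers += [nn.Conv2d(c, v, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
                c = v
        self.features = nn.Sequential(*layers)
        self.pool = nn.AdaptiveAvgPool2d((4, 4))           # keeps the head independent of input size
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(512 * 4 * 4, 1024), nn.ReLU(inplace=True),   # FC1
            nn.Linear(1024, n_actions),                            # FC2 -> action logits
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 4, H, W) -- RGB frame stacked with its motion history image
        logits = self.classifier(self.pool(self.features(x)))
        return torch.softmax(logits, dim=1)                # per-action confidence

def motion_history_image(gray_frames, tau=20, diff_thresh=25):
    """Crude MHI over the last tau frames (about 2 s at 10 fps): recently moved
    pixels are bright, older motion decays linearly. Parameters are assumptions."""
    mhi = np.zeros_like(gray_frames[0], dtype=np.float32)
    for prev, cur in zip(gray_frames[:-1], gray_frames[1:]):
        moved = np.abs(cur.astype(np.int16) - prev.astype(np.int16)) > diff_thresh
        mhi = np.where(moved, 1.0, np.clip(mhi - 1.0 / tau, 0.0, 1.0))
    return mhi   # stack with the normalized RGB frame to form the 4-channel input

# probs = ActionCNN()(torch.randn(1, 4, 224, 224)); action_id = int(probs.argmax(1))
```

In practice one would train on the logits with a cross-entropy loss and keep the softmax only at inference time, where its maximum serves as the confidence forwarded to the supervisory controller.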
This network has been trained from scratch using a customized dataset of videos taken with the setup shown in Figure 2, in which both the da Vinci and SARAS arms are teleoperated. They are recorded using the left camera of the da Vinci stereo endoscope. In total, 20 videos of approximately 100 frames each, at 10 frames per second, have been taken, all representing the same cooperative task, with the corresponding ground-truth labelling. The labelling has been divided into 9 different fine-grained actions for the main surgeon (MS) and the assistant surgeon (AS):

A01: MS moves to the red ring
A02: MS picks the ring
A03: MS lifts the ring
A04: MS moves the ring to the exchange area
A05: AS moves toward the ring
A06: AS grasps the ring and MS leaves the ring
A07: AS moves with the ring to the delivery area
A08: AS drops the ring
A09: AS moves back to the starting position

The decision to train the network from scratch, instead of fine-tuning an existing model, is due to three main reasons: the highly specialized use-case scenario (laparoscopic surgery); the use of 4-channel images (RGB + Motion History Image, MHI [11]), which limits the number of available trained networks; and the reduced time and computational requirements for both training and evaluation of the model. Additionally, since this network represents a variation of the one presented in [11], it had already proved to be effective when tested on the similar JIGSAWS dataset [12]. Indeed, when limited to the task at hand, the resulting network was deemed optimal thanks to its ability to provide accurate action segmentation with high confidence. It acts causally on each frame, augmented by its motion history over a period of 2 seconds. This is a well-known and effective technique to maintain correlation among sequences of frames and improve the stability of action detection [13]. Nevertheless, the training occurred under varying conditions of light, contrast and objects in the scene to improve generalization capabilities, which also inevitably results in a drop in action detection certainty. Figure 5 shows an example of the output of the action recognition module.

Fig. 5: Result of an action segmentation performed on a test video taken from the training dataset under ideal conditions. The color of the bars refers to the color code used to label the actions. It can be noted how the model still suffers from uncertainty in the prediction around frames 10, 20 and 60.

III. SUPERVISORY CONTROL

The result obtained by the action recognition module at each iteration determines the task the robot should carry out. A supervisory controller is needed to coordinate the evolution of the task and the motion of the SARAS robotic arm. This supervisor is implemented using a Finite State Machine (FSM), where each action (see Section II) corresponds to a state of the FSM, whereas the transitions between the states are fired any time a new action is recognized. Figure 7 shows the structure of the FSM, implementing the state update logic described in Algorithm 1.

Algorithm 1: FSM state update
 1: input: action, confidence
 2: if confidence ≥ THRESHOLD then
 3:   if action ≠ NEXT(current state) then
 4:     action_filtered = FILTER(BUFFER)
 5:     if action_filtered ≠ current action then
 6:       next action = ask_user_confirmation(action)
 7:     else
 8:       next action = current action
 9:     end if
10:   else
11:     next action = action
12:   end if
13: else
14:   next action = current action
15: end if
16: DO_TRANSITION_TO(next action)

At the very beginning of the operation, the FSM fills a FIFO buffer of fixed dimension with the actions coming from the action recognition network; after the buffer is filled, the oldest action in the buffer is considered for the first computation of the state. This approach produces a delay between the actual state of the task and the action used to determine the state of the FSM. However, this delay is useful to have a prediction window on how the action will evolve in the future. As shown in Figure 6, this window makes it possible to detect and filter out spurious recognitions, guaranteeing a more robust control. Since the delay should be as small as possible, to avoid penalizing the promptness of the system, the buffer dimension is chosen equal to 5 actions. Within each state, the FIFO buffer is updated with the predicted action and, consequently, the next control action is evaluated by filtering over its elements.

Fig. 6: Filtering example with A01 as current state at instant k and A03 as recognized action at instant k+1. A03 is not subsequent to A01, so the filter is applied and α_A01 and α_A03 are computed as shown in the figure, where α_i is the confidence of the recognition of the action in the i-th location of the buffer and n_j is the number of occurrences of action j; since α_A01 is higher, no transition is performed.

This buffering avoids possible spurious switching between states, e.g. due to high segmentation uncertainties. Every time the newly recognized action respects the correct sequence (see Section II for the standard sequence of actions), the recognition is accepted without filtering and the transition to the next state is performed. Otherwise, if the action does not follow the standard sequence, it could be a spurious recognition. To manage this uncertainty, the following heuristic has been developed: the occurrences of the recognized action and of the action related to the current state are counted and then weighted by their confidences over the buffer of future actions. The action with the highest score is then considered for the transition: if it is different from the current state, the recognition can be considered not spurious; on the other hand, if it is out of sequence, the system interrogates the user to solve the ambiguity. In the event the user does not confirm the recognition, the task is aborted to guarantee the highest safety. After each transition, the FSM returns as output the goal pose x_g the robot should reach to carry out the action related to the new current state, together with the confidence of the recognition.

Fig. 7: Scheme of the FSM: the states embedding the actions (A01, ..., A09, plus START, END and USER INPUT) are represented with circles, while the transitions are represented with arrows (legend: in-sequence or current-action recognition; ask user for a not-in-sequence filtered recognition; current-action filtered recognition; user confirmation; user abort).
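A compact way to read Algorithm 1 together with the FIFO buffering is sketched below in Python. The threshold value, the scoring rule inside the filter (confidence-weighted occurrence count over the 5-slot look-ahead window) and the way the delayed oldest element is consumed are assumptions made for illustration; the class and function names are likewise hypothetical.

```python
# Sketch of the supervisory FSM update (Algorithm 1) with the 5-slot FIFO filter.
# Threshold, the scoring rule inside FILTER and how the delayed oldest recognition
# is consumed are illustrative assumptions, not the SARAS implementation.
from collections import deque

ACTIONS = [f"A0{i}" for i in range(1, 10)]        # A01 ... A09, nominal task sequence
THRESHOLD = 0.6                                   # assumed confidence threshold
BUFFER_SIZE = 5                                   # look-ahead window chosen in the paper

def next_in_sequence(state: str) -> str:
    return ACTIONS[(ACTIONS.index(state) + 1) % len(ACTIONS)]

class SupervisoryFSM:
    def __init__(self, initial_state: str = "A01"):
        self.state = initial_state
        self.buffer = deque(maxlen=BUFFER_SIZE)   # (action, confidence) pairs, newest last

    def _filter(self) -> str:
        """Score each action by the sum of the confidences of its occurrences in the buffer."""
        scores = {}
        for action, conf in self.buffer:
            scores[action] = scores.get(action, 0.0) + conf
        return max(scores, key=scores.get)

    def update(self, action: str, confidence: float, ask_user=lambda a: False) -> str:
        """Push the newest recognition, then update the state from the (delayed) oldest one."""
        self.buffer.append((action, confidence))
        if len(self.buffer) < BUFFER_SIZE:        # still filling the FIFO at start-up
            return self.state
        oldest_action, oldest_conf = self.buffer[0]
        if oldest_conf >= THRESHOLD:
            if oldest_action == next_in_sequence(self.state):
                self.state = oldest_action        # in sequence: accept without filtering
            elif self._filter() != self.state:    # out of sequence and not filtered away
                if ask_user(oldest_action):       # ask the operator to resolve the ambiguity
                    self.state = oldest_action
                else:
                    raise RuntimeError("task aborted by the user for safety")
        return self.state                         # the goal pose x_g is looked up from the state
```

In this reading, a low-confidence or filtered-out recognition simply leaves the current state, and hence the current goal pose, unchanged, which matches the conservative behaviour described above.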
IV. MODEL PREDICTIVE CONTROL

The goal pose x_g is used within the Model Predictive Controller (MPC) to plan the motion of the robot toward x_g, whereas the confidence level is exploited to modulate the Cartesian velocity. The main idea behind introducing the MPC is to deal with the uncertainty of the recognized action while guaranteeing the satisfaction of constraints critical for the robot (e.g. maximum torque or speed limits) and/or for the application (e.g. safe interaction with anatomical structures).

A. Robot model

The SARAS arm used for the experimental validation and, in general, the laparoscopic tools are 4-DOF systems with a remote centre of motion (RCM). If we neglect the rotation around the tool, the pose of the robot is defined as the Cartesian position of the end effector. As a consequence, the robot with the inner controller (shown within the dashed rectangle in Figure 3) can be kinematically modeled as a single integrator in the discrete-time domain:

x(k+1) = x(k) + B u(k)        (1)

where x ∈ ℝ³ are the coordinates of the position of the end effector in the task space, B = diag(t_c, t_c, t_c) is the input matrix, t_c is the sample time (t = k t_c, k ∈ ℤ) and u ∈ ℝ³ represents the control input, namely the velocities of the end effector.
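The MPC formulation itself is not spelled out in the excerpt above, so the sketch below only combines the ingredients already introduced: the single-integrator prediction model (1), a quadratic goal-tracking cost, and a Cartesian speed limit scaled by the recognition confidence. The horizon length, the weights, the bound value and the use of cvxpy are assumptions, and the obstacle (capsule clearance) constraints of the next subsection are omitted for brevity.

```python
# Hedged sketch of a goal-tracking MPC over the single-integrator model (1).
# Horizon, weights, speed bound and the use of cvxpy are assumptions; obstacle
# (capsule clearance) constraints are omitted for brevity.
import numpy as np
import cvxpy as cp

def mpc_step(x0, x_g, confidence, t_c=0.05, N=10, v_max=0.05):
    """Return the next desired position x_d from the current position x0 [m],
    the goal x_g [m] and the recognition confidence in [0, 1]."""
    x = cp.Variable((3, N + 1))                  # predicted end-effector positions
    u = cp.Variable((3, N))                      # end-effector velocities (control input)
    v_bound = confidence * v_max                 # lower confidence -> slower motion

    cost, constraints = 0, [x[:, 0] == x0]
    for k in range(N):
        cost += cp.sum_squares(x[:, k + 1] - x_g) + 0.1 * cp.sum_squares(u[:, k])
        constraints += [
            x[:, k + 1] == x[:, k] + t_c * u[:, k],   # model (1) with B = diag(t_c, t_c, t_c)
            cp.norm(u[:, k], "inf") <= v_bound,       # confidence-scaled Cartesian speed limit
        ]
    cp.Problem(cp.Minimize(cost), constraints).solve()
    return x0 + t_c * u.value[:, 0]              # receding horizon: apply only the first input

# x_d = mpc_step(np.zeros(3), np.array([0.10, 0.00, 0.05]), confidence=0.8)
```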
B. Robot-Obstacle distance computation

In the present surgical scenario, two robotic laparoscopic tools are supposed to work in a shared workspace. In order to guarantee safety, collisions must be avoided; hence, a strategy to define the distance between the tools is needed. We chose to model the tools using virtual capsules, with the aim of enclosing them inside the fittest and simplest possible shape, such that the distance computation becomes easier. Given a couple of Cartesian points, a capsule is a virtual object composed of two hemispheres centered at those points and a cylinder whose longitudinal axis links the two points, as shown in Figure 8 [14]. Using the position of the end effector and the position of the remote centre of motion of each tool to build the capsules, the minimum distance between two virtual capsules can be computed as

d_{j,i} = d_{ax_j,ax_i} - r_i - r_j        (2)

where d_{ax_j,ax_i} is the minimum distance between the axes of the two capsules and r_i, r_j are their radii.
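The axis-to-axis term d_ax in (2) is the minimum distance between the two capsule axes, i.e. between the segments joining each tool's RCM and end effector. A standard clamped-projection routine for that segment-segment distance is sketched below; the radii and the example values are illustrative, and degenerate (zero-length) axes are not handled.

```python
# Sketch of the capsule-capsule clearance (2): minimum segment-segment distance
# between the tool axes minus the capsule radii. Radii values and example inputs
# are illustrative; zero-length axes are not handled.
import numpy as np

def segment_segment_distance(p1, q1, p2, q2, eps=1e-12):
    """Minimum distance between segments [p1, q1] and [p2, q2]."""
    d1, d2, r = q1 - p1, q2 - p2, p1 - p2
    a, e, f = d1 @ d1, d2 @ d2, d2 @ r
    b, c = d1 @ d2, d1 @ r
    denom = a * e - b * b
    s = np.clip((b * f - c * e) / denom, 0.0, 1.0) if denom > eps else 0.0
    t = (b * s + f) / e
    if t < 0.0:                                   # clamp t and recompute s if necessary
        t, s = 0.0, np.clip(-c / a, 0.0, 1.0)
    elif t > 1.0:
        t, s = 1.0, np.clip((b - c) / a, 0.0, 1.0)
    return float(np.linalg.norm((p1 + s * d1) - (p2 + t * d2)))

def capsule_clearance(ee_i, rcm_i, r_i, ee_j, rcm_j, r_j):
    """Signed clearance d_{j,i} of Eq. (2); negative values mean the capsules overlap."""
    d_ax = segment_segment_distance(ee_i, rcm_i, ee_j, rcm_j)
    return d_ax - r_i - r_j

# d = capsule_clearance(np.array([0.05, 0.02, 0.10]), np.zeros(3), 0.004,
#                       np.array([0.06, -0.01, 0.09]), np.array([0.12, 0.0, 0.0]), 0.004)
```

In the planner, keeping such a clearance above a positive safety margin along the predicted trajectory is one natural way to encode the obstacle avoidance between the teleoperated and the autonomous tools mentioned in Section I-B.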
