Unsupervised Task Segmentation Approach for Bimanual Surgical Tasks Using Spatiotemporal and Variance Properties

Ya-Yen Tsai, Yao Guo, Member, IEEE, and Guang-Zhong Yang, Fellow, IEEE

Abstract — In surgical workflow analysis and training in robot-assisted surgery, automatic task segmentation could significantly reduce manual labeling time and enhance robot learning efficiency. This paper presents an unsupervised segmentation approach that automatically segments a given surgical task without manual intervention. A new segmentation method is presented that relies only on bimanual kinematic trajectories, without the need for prior information about the data. Specifically, surgical tasks are segmented by fusing the spatiotemporal and variance properties of the trajectories. To demonstrate the effectiveness of the proposed method, detailed experiments were first conducted on our own dataset. We segmented trajectories of three different surgical stitches and observed an average F1 score of 77.9% against the ground truths. The same trajectories were then corrupted with different levels of noise, and the segmentation was compared with four other methods; the proposed algorithm demonstrated its robustness against the noise. Finally, to assess its generalization ability, the method was evaluated on the publicly available JIGSAWS dataset, where an average F1 score of 75.5% was achieved.

I. INTRODUCTION

In the past decades, robot-assisted surgeries have supported realizing the full potential of minimally invasive surgery compared to traditional open surgeries [1], [2]. The benefits of such a transition can be seen in many clinical studies and evidence [3], [4]. The provision of task automation is particularly advantageous in situations where surgical subtasks require, for example, extended periods of high concentration for repeated and tedious operations. Learning from Demonstration (LfD) [5] improves the efficiency of programming a robot by learning complicated movements and manipulations through human guidance or the provision
of human demonstrations.

Task segmentation is one of the most critical processes in LfD because it facilitates analyzing and understanding motion behaviors. Complicated motions during surgical tasks typically consist of multiple steps and intricate tool manipulations. Hence, accurately dividing a demonstration into meaningful and homogeneous action units, namely motion primitives (MPs), is challenging, especially when there is no prominent and clear boundary between MPs. Moreover, human annotations for a large amount of data are difficult to maintain, and thus manual labelling is prone to errors.

Y.-Y. Tsai, Y. Guo, and G.-Z. Yang are with the Hamlyn Centre for Robotic Surgery, Imperial College London, SW7 2AZ London, UK (e-mail: {y.tsai17, yao.guo, g.z.yang}@imperial.ac.uk). G.-Z. Yang is also with the Institute of Medical Robotics, Shanghai Jiao Tong University, China. This work was supported by the Engineering and Physical Sciences Research Council (EPSRC) under Grant EP/L020688/1.

Fig. 1: Flowchart of the proposed unsupervised segmentation method. Bimanual trajectories generated from human demonstrations are first segmented based on the spatiotemporal and variance properties of the 6-DoF kinematic trajectories separately. Then the two sets of segments are merged using DBSCAN to form the final segmentation result.

Many unsupervised segmentation methods focus on using kinematic information to segment trajectories [6], [7]. They group sub-trajectories based on the similarity of kinematic features. Buchin et al. [8] used a range of movement characteristics, such as location, speed, velocity, shape, curvature, and sinuosity, to determine the kinematic homogeneity of motions; segmentation points were found at changes in the homogeneity. Despinoy et al. [9], on the other hand, relied on distance metrics such as the Hausdorff distance, the Fréchet distance, and Dynamic Time Warping (DTW) to compute the dissimilarity between trajectories and used this information to find segmentation points. Clustering-based techniques are another
commonly used segmentation strategy. Many of these methods detect the stop-and-move actions of the trajectory data in the spatial domain [10], [11]. As stagnation points often appear dense in space, these methods exploit this property to group neighboring points and segment the trajectory based on the distinct clusters. Extended works have also combined both kinematic homogeneity and clustering to segment a task [12].

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019. 978-1-7281-4003-2/19/$31.00 © 2019 IEEE.

Transition State Clustering (TSC) [13] exploited both video and kinematic information of demonstrations for surgical task segmentation. It identified potential transitions and segmented linear dynamic regimes based on kinematic, sensory, and temporal similarity. The number of clusters was governed by a Dirichlet Process (DP), avoiding the need for a priori knowledge. The algorithm optimized its results by iteratively merging dense clusters while removing sparse ones, repeating until a stopping condition was met. Fard et al. [14] introduced a soft-boundary unsupervised approach (Soft-UGS) that first segments surgical gestures into fine pieces and then iteratively merges homogeneous segments, as defined by Probabilistic Principal Component Analysis (PPCA), the distance between segment centers, and the DTW distance between neighboring segments.

Although many previous works have addressed issues such as trajectory frame dependence in surgical task segmentation, several main problems remain unresolved. Firstly, temporal variations such as noise are often inevitable in human demonstrations. Although smoothing may mitigate this effect, intensive smoothing may alter the critical shape as well as the spatiotemporal information of the original trajectory. This leads to the critical segmentation features becoming less prominent and results in degraded performance of automatic task segmentation. Secondly, inconspicuous segmentation
points or transition periods are another common problem, especially for trajectories involving human demonstrations. The tendency to perform a continuous and smooth transition from one action unit to another is naturally inherited. Distinguishing such motions by hand may be difficult, and this can increase the likelihood of inconsistent manual segmentation. On top of that, data with a high Degree of Freedom (DoF) may make segmentation even more ambiguous. Without incorporating the rotational components of trajectories, more uncertainty is introduced in segment identification and classification [15], which has not been addressed in previous work.

In this paper, we propose a generic and novel task segmentation framework to automatically divide bimanual 6-DoF trajectories into multiple action units. It utilizes two kinematic features derived from the spatiotemporal and variance properties of the trajectories to find initial sets of potential segmentation points. These sets are finally clustered to refine the segmentation result. Fig. 1 illustrates the overall structure of the proposed unsupervised segmentation algorithm. This paper mainly focuses on complex surgery-related applications, but the algorithm is also applicable to other relevant tasks. The main contributions of this paper are two-fold:

- We propose a new framework for complicated task segmentation with bimanual 6-DoF spatiotemporal trajectories as inputs.
- The algorithm fuses two frame-independent kinematic features to enhance the robustness against noise and the segmentation precision and accuracy.

This paper is organized as follows. Section II introduces our proposed method for segmenting a given task automatically. The experiments in Section III evaluate the proposed algorithm by comparing the results with manually labeled ground truths, with and without the presence of additive noise. We also compare its performance against other commonly used segmentation approaches. Section IV presents the discussion and
conclusion, and future works are provided in Section V.

II. METHODOLOGY

Complicated motions involved in surgical tasks typically require multiple steps to accomplish. Decomposing a task into several simple steps allows a better understanding of the constitution of a motion. This paper proposes a novel algorithm to automatically segment bimanual tasks such as surgical suturing. A divide-and-merge approach is followed throughout the framework to refine the segmentation performance.

A. Problem Statement

Let us define a bimanual task T = {Γ_l, Γ_r} as the combination of two motion trajectories, where Γ_l ∈ R^{N×dL} and Γ_r ∈ R^{N×dR} represent the kinematic trajectories of the left hand and the right hand, respectively. dL is the dimension of the features describing the translation and orientation of the left-hand movement over time, while dR describes those of the right hand. N refers to the number of frames in the trajectory data. The purpose of task segmentation is to divide the task T into K consecutive fractions as

T = ∪_{i=1}^{K} S_i(p^i_s, p^i_e),    (1)

where p^i_s and p^i_e indicate the indexes of the starting point and the ending point of the segment S_i. Note that the end point p^{i-1}_e of the segment S_{i-1} coincides with the start point p^i_s of the current segment S_i.

In this paper, the 6-DoF kinematic trajectories of the tool movements are recorded by a visual system tracking visual markers attached to the tips of the tools; we therefore have dL = dR = 6. Specifically, a 6-DoF trajectory is expressed as Γ(t) = [p(t), φ(t)], where p = [x, y, z] is the translation component and φ = [e_x, e_y, e_z] is the rotation component in the Euler-Rodrigues representation.

The proposed segmentation algorithm takes bimanual kinematic trajectories as inputs and produces a set of segmentation points based on the following three steps: spatiotemporal-based segmentation, variance-based segmentation, and a merging step. Firstly, the spatiotemporal-based segmentation method provides the initial set of segments by identifying the spatiotemporal density of the candidate trajectory.
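As a concrete illustration of the segmentation defined in Eq. (1), the K contiguous segments can be recovered from a set of interior breakpoint indexes. The following is a minimal Python sketch; the function and variable names are ours, not from the paper:

```python
def segments_from_breakpoints(breakpoints, n_frames):
    """Turn a sorted list of interior breakpoint indexes into K contiguous
    segments (p_s, p_e) covering a trajectory of n_frames samples.
    The end point of S_{i-1} coincides with the start point of S_i."""
    bounds = [0] + sorted(breakpoints) + [n_frames - 1]
    return [(bounds[i], bounds[i + 1]) for i in range(len(bounds) - 1)]

# Example: a 100-frame demonstration split at frames 30 and 62.
print(segments_from_breakpoints([30, 62], 100))
# [(0, 30), (30, 62), (62, 99)]
```

Note that each segment's end index equals the next segment's start index, matching the boundary-sharing convention stated after Eq. (1).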
Secondly, the variance-based segmentation projects the recorded temporal trajectories onto different coordinate frames to capture the changes in velocity for each feature over time; segments are determined based on the changes in the variance of each feature. By adopting these two methods separately, the given trajectories are divided into two sets of segments. Finally, the merge process extracts the points that exist in both sets and clusters points in the same spatial region to form the final set of segmentation points.

Fig. 2: Example bimanual trajectory generated from a blanket stitch: (a) left-hand trajectory; (b) right-hand trajectory. The trajectory is manually segmented into 9 sections, each of which corresponds to a color.

Segmentation points are defined at the physical boundaries of motions. This decomposes a trajectory into homogeneous segments, each of which has a continuous and smooth motion that comprises only a simple translation and/or rotation. Fig. 2 shows a single cycle of a blanket stitch segmented into 9 MPs. As illustrated, critical points are found where there is a change in homogeneity. In this paper, the algorithm aims to find an optimal segmentation in which a trajectory is partitioned into a minimal number of segments while meeting the requirement of segment homogeneity.

B. Spatiotemporal-based Segmentation

We first explore both the spatial and temporal properties of the points in the given trajectories to determine the potential segmentation points. This method is inspired by the inherent characteristics of human movements [15]. When multiple incoherent motions are performed and combined, motions tend to change gradually in the spatiotemporal domain to ensure smooth transitions. The kinematic characteristics of the transition points can differ from those within the same segments, and they therefore serve as good properties for segmentation. The movements at the transitions are often reflected in the density of the trajectories in space. The gradual transition from one
motion to another results in the poses of the hands hovering at a particular region, which leads to a dense distribution of points within the trajectory. It should be pointed out that not all spatially clustered points represent transition states: clustered points may also come from noise, the presence of other motion primitives, and other transition periods.

To address this, a spatiotemporal-based protocol is proposed to investigate the clustered points that are temporally proximate. For a 6-DoF trajectory, a distance profile is calculated by finding the distance in space between two neighboring points. Considering that translation and rotation are measured in different units, we calculate the distances for the translation and rotation components separately. For the translation component, the Euclidean distance D_trans(t) measures the distance between the point at the current instance, p(t) = [x(t), y(t), z(t)], and the point at the previous instance, p(t-1) = [x(t-1), y(t-1), z(t-1)]. As for the rotational component, we first convert the Euler-Rodrigues representation into a quaternion q to calculate the distance between two 3D rotations. For two 3D rotations that are close enough, the distance between them can be approximated as linear. In this case, the provided trajectories were recorded at 20 Hz, and hence the distance at a time instance, D_rot(t), is calculated using the quaternion q(t) = [q_w(t), q_x(t), q_y(t), q_z(t)] at that instance and the quaternion q(t-1) = [q_w(t-1), q_x(t-1), q_y(t-1), q_z(t-1)] at the previous instance. Eqn. (2) computes the quaternion distance between two neighboring points along the trajectory at time stamp t:

D_rot(t) = arccos(2⟨q(t), q(t-1)⟩² − 1).    (2)

We calculate D_rot and D_trans for the left-hand and the right-hand trajectories respectively; in total, there are four distance profiles for the two hands. As mentioned, the points belonging to a transition state are more closely clustered, and therefore they possess smaller values in a distance profile, whereas peaks represent motions or segments. The potential segmentation points are
consequently found by locating the corners of the peaks. To minimize the effect of noise present in the data, only peaks whose height and prominence exceed pre-defined thresholds are considered.

At a transition period, there exists a moment with zero velocity and acceleration. Zero-velocity crossing is a commonly used method to determine segmentation points, which occur at the zero crossings. Therefore, speed and acceleration profiles derived from the distance profiles of each component are used to further refine the segmentation results. Starting from the segmentation points identified from the distance profiles, each point is examined to check whether it lies at a zero crossing. If it does not, the algorithm searches forward and backward temporally to find the points that meet the criteria. This refinement is performed iteratively for each segmentation point and each component.

Finally, the segmentation results from the four components are combined to form the final set for the spatiotemporal-based segmentation approach. A point is selected for the final set if it exists in one of the four segmentation sets and another point exists temporally nearby in another segmentation set; the vicinity is again determined by a pre-defined threshold. The merged points serve as the initial segmentation points of this framework.

C. Variance-based Segmentation

Although the spatiotemporal-based segmentation policy is capable of generating a set of segmentation points, mis-segmentations still exist due to the noise of the raw trajectories. To solve this, we additionally propose a variance-based segmentation policy that addresses two common problems in task segmentation: noise and frame dependency.

Fig. 3: Illustration of points along a trajectory being projected onto different frames in space.

To address the aforementioned challenges, the variance-based segmentation instead transforms the given trajectory from the
current coordinate system E to different frames in space to perform the segmentation. N_f frames are first randomly selected, and the given bimanual trajectories are projected onto these frames. For each point p(t) along the trajectory, on a new frame Σ_j (j = 1, ..., N_f) we can define a vector pointing from the origin of Σ_j to the point p(t). Next, the projection angles at time t, [α^j_x(t), α^j_y(t), α^j_z(t)], to the three axes of Σ_j are computed. Let us define R(t) as the 3D rotation matrix from the local Euler-Rodrigues representation φ(t) to the new frame Σ_j; the Euler angles [β^j_x(t), β^j_y(t), β^j_z(t)] can be derived from the rotation matrix R(t). These two sets of angles are calculated for the N_f frames. Finally, we obtain N_f sets of 6-DoF reparameterized trajectories for each hand, where the trajectory with respect to frame Σ_j can be expressed as [α^j_x(t), α^j_y(t), α^j_z(t), β^j_x(t), β^j_y(t), β^j_z(t)].

For each time step t, we calculate the variance of the speed profile of each feature of the trajectory. The trajectories of both hands are transformed onto these frames. The changes in the projection angles are computed from the translation component, while the changes in angular displacement are computed from the rotation component. The variances from the translation components, Var_trans, and the variances from the rotation components, Var_rot, calculated for frame Σ_j are summarized as follows:

Var^j_trans(t) = [Var(Δα^j_x(t)), Var(Δα^j_y(t)), Var(Δα^j_z(t))],
Var^j_rot(t) = [Var(Δβ^j_x(t)), Var(Δβ^j_y(t)), Var(Δβ^j_z(t))],    (3)

where j = 1, ..., N_f and Δ denotes the per-step change. The variance profile has a similar property to the distance profile. For the projection angles, their changes at each time step are affected by the location of the points in space. If consecutive points are clustered in a region of space, the changes in the projection angles will be smaller; this property is invariant to any frame. On the other hand, if an object is moving, the trajectory points will be sparser in the spatial domain, and hence the changes in the angles will be greater. By examining the variance of these changes across different frames, we can find the potential segmentation points.
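The projection-angle variance described above can be sketched as follows. This is a simplified Python illustration, not the authors' implementation: frames are reduced to randomly placed, world-aligned origins, only the translation component is used, and the variance is taken over a short sliding window (`n_frames`, `window`, and the frame-sampling scheme are our assumptions):

```python
import numpy as np

def projection_angles(points, origin):
    """Angles between the vector (origin -> p(t)) and the three axes of a
    frame centred at `origin` (axes kept world-aligned for simplicity)."""
    v = points - origin                                  # (N, 3) vectors
    norm = np.linalg.norm(v, axis=1, keepdims=True)
    return np.arccos(np.clip(v / norm, -1.0, 1.0))       # (N, 3) angles

def variance_profile(points, n_frames=5, window=5, seed=0):
    """Sliding-window variance of the projection-angle changes, averaged
    over `n_frames` randomly placed frames (translation part only)."""
    rng = np.random.default_rng(seed)
    lo, hi = points.min(axis=0) - 1.0, points.max(axis=0) + 1.0
    profiles = []
    for _ in range(n_frames):
        origin = rng.uniform(lo, hi)                     # random frame origin
        dang = np.diff(projection_angles(points, origin), axis=0)  # angle "speed"
        var = np.array([dang[max(0, t - window):t + 1].var(axis=0)
                        for t in range(len(dang))])
        profiles.append(var.sum(axis=1))                 # one value per step
    return np.mean(profiles, axis=0)
```

Consistent with the property stated above, a hovering (spatially dense) stretch produces small angle changes and hence low variance in every frame, while a moving stretch produces larger, more irregular changes and a higher variance, regardless of where the frame is placed.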
Points with higher variance impl
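For reference, the two distance profiles of Section II-B (Euclidean distance for translation and Eq. (2) for rotation) can be sketched as follows, assuming unit quaternions stored in [q_w, q_x, q_y, q_z] order:

```python
import numpy as np

def distance_profiles(positions, quaternions):
    """Per-step distance profiles for a 6-DoF trajectory.

    positions:   (N, 3) translation samples [x, y, z]
    quaternions: (N, 4) unit quaternions [qw, qx, qy, qz]
    Returns (D_trans, D_rot), each of length N-1."""
    # Translation: Euclidean distance between consecutive points.
    d_trans = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    # Rotation: Eq. (2), arccos(2 <q(t), q(t-1)>^2 - 1).
    dots = np.sum(quaternions[1:] * quaternions[:-1], axis=1)
    d_rot = np.arccos(np.clip(2.0 * dots ** 2 - 1.0, -1.0, 1.0))
    return d_trans, d_rot
```

For a single step between unit quaternions differing by a rotation of angle θ, Eq. (2) returns θ itself (e.g. a 90° rotation yields π/2), and the clipping guards against floating-point values just outside arccos's domain.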
