Multicamera 3D Reconstruction of Dynamic Surgical Cavities: Non-Rigid Registration and Point Classification

Yun-Hsuan Su, Kevin Huang, Blake Hannaford

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019

Abstract - Deformable objects and surfaces are ubiquitous in the daily lives of humans, from the garments in fashion to soft tissues within the body. Because of this routine interaction with soft materials, humans are adept and trained in manipulation of deformable objects while avoiding irreversible damage. The dexterity and care involved are largely facilitated through a combination of the human haptic sense of touch and visual observations of object deformation [1]. While this scenario presents itself as a trivially intuitive task, it becomes significantly more difficult and complex with the deprivation of both 3D depth perception and haptic senses. This deprived state is not dissimilar to the scenarios encountered in many robot-assisted minimally invasive surgeries. As a result, unintentional tissue damage can occur due to lack of force feedback and fine 3D visibility [2]. One approach to remediate these issues combines real-time dynamic 3D reconstruction and vision-based force estimation for haptic feedback. Toward that end, this work continues research in a series of studies focusing on multicamera 3D reconstruction of dynamic surgical cavities. Previous work introduced a novel approach of camera grouping and pair sequencing [3]. This paper builds upon that work by introducing a method for non-rigid, sparse point cloud registration and subsequent point classification. In particular, to enable deformation and force analyses, surfaces are locally classified into three categories: static, shifting and deforming. The topics addressed in this paper present open challenges and ongoing research directions for researchers to this day [4], and provide a step towards real-time 3D reconstruction and force feedback in robot-assisted surgery.

I. INTRODUCTION

A. Background

Minimally invasive surgery (MIS) affords several benefits over traditional open surgery, including reduced patient recovery time and lower risk of infection [5]. In MIS, surgeons utilize real-time laparoscopic imaging of the surgical cavity, oftentimes an insufflated abdomen, to precisely guide surgical instruments. Advances in medical robotics have catalyzed the clinical introduction of surgical robots, whereby surgeons teleoperate robots outfitted with surgical tools in lieu of manual control. This teleoperated architecture marries the experience and skill of the human surgeon with the dexterity, scalability and precision of machines, all while introducing a configurable software layer for intelligent augmentations. Despite these robot-mediated improvements, several drawbacks exist with the current state of the art.

(Yun-Hsuan Su and Blake Hannaford are with the University of Washington Department of Electrical and Computer Engineering, 185 Stevens Way, Paul Allen Center - Room AE100R, Campus Box 352500, Seattle, WA 98195-2500, USA. Kevin Huang is with Trinity College, Dept. of Engineering, 300 Summit St, Hartford, CT 06106 USA.)

Firstly, 3D vision is challenging with conventional image acquisition methods, largely due to the restrictive and small cavities encountered in MIS. Even with pre-calibrated stereo endoscopes, short baselines between cameras, specular reflections from wet tissue surfaces, and the highly dynamic nature of abdominal surgical scenes make real-time 3D reconstruction an ongoing challenge [4].
As a result, in practice, surgeons are oftentimes provided with 2D video feeds. Secondly, contact interaction forces are no longer transmitted directly through the manual surgical tool to the surgical operator. The indirect control afforded by teleoperation sacrifices immediate sensory feedback. A straightforward yet naïve solution would be to monitor the applied tool-tip force with an end-effector mounted force-torque sensor. However, it is impractical to equip surgical robot end-effectors with additional electronic devices or sensors due to restrictions and complications with sanitation, e.g. autoclave temperatures are likely to damage modern electronic sensors.

B. Contribution

In the authors' previous work [3], a graph-based pairwise camera sequencing method for real-time multicamera 3D reconstruction of dynamic surgical cavities was proposed. Towards realization of simultaneous 3D reconstruction and interaction force estimation, the work here extends those previous findings to non-rigid point cloud registration over time and subsequent classification of locally static, shifting, or deforming surfaces. This work utilizes a robot-assisted MIS scenario for which multiple endoscopes are present in the surgical cavity, as depicted in Fig. 1.

Fig. 1. Multiple independently moving cameras from different views of the surgical cavity. Calculated geometries are represented as a point cloud.

This paper presents a constrained optimization framework for 3D information processing from multiple viewpoints in a dynamic surgical environment such that:

  - point clouds from successive time frames are optimally registered, while simultaneously ensuring shape and smoothness of the resultant 3D model as well as maintaining the dynamic nature of the surgical scenes;
  - points are classified into two main types, static or dynamic; dynamic points are further classified as either deforming or shifting, depending on the relative motion of neighboring points.

C. Related Work

1) Multicamera 3D Reconstruction: This work aims to address the lack of 3D measurement and force feedback in robot-assisted MIS, achieved using dense multicamera 3D reconstruction in tandem with vision-based force estimation. Urey et al. demonstrated that dense 3D reconstruction from multiple viewpoints indeed improved surgeon perception of the entire surgical scene [6]. It should be noted that a multicamera setup for MIS does not require additional incisions. In fact, it was shown by Silvestri et al. that an array of cameras could be mounted on a single insertable unit through a trocar [7]. Individual camera poses were also wirelessly controlled via external magnets, thus improving the articulation of multicamera setups for MIS [8]-[11]. Given multiple viewpoints of a scene, CoSLAM is a promising method targeted toward visual reconstruction using multiple independent cameras in dynamic environments [12]. However, CoSLAM is not amenable to surgical scenarios, where deforming regions may occupy a large portion of the viewable scene. Moreover, in CoSLAM deforming points are indistinguishable from shifting points. This distinction is essential in tissue deformation and force estimation analyses.
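To make the shifting/deforming distinction concrete, the following is a minimal sketch of the neighbor-relative-motion rule described in the contribution above, assuming per-point rigid transform estimates T_i are available from a registration step. The function, its thresholds, and the neighbor test are illustrative assumptions, not the paper's implementation.

    import numpy as np

    # Hypothetical labels matching the paper's taxonomy.
    STATIC, SHIFTING, DEFORMING = "static", "shifting", "deforming"

    def classify_point(T_i, T_neighbors, p_i, eps_move=1e-3, eps_rel=1e-3):
        """Label one point from its own 4x4 transform T_i and those of its
        neighbors (illustrative thresholds, not the paper's tuned values).

        - static:    the point barely moves under its own transform;
        - shifting:  it moves, but neighbors move with (nearly) the same
                     transform, i.e. locally rigid motion;
        - deforming: neighbor transforms disagree, i.e. (T_j - T_i) p_i is
                     large for some neighbor j.
        """
        h = np.append(p_i, 1.0)                      # homogeneous position
        displacement = np.linalg.norm((T_i @ h - h)[:3])
        if displacement < eps_move:
            return STATIC
        # Relative motion of neighbors: (T_j - T_i) applied to p_i.
        rel = max(np.linalg.norm(((T_j - T_i) @ h)[:3]) for T_j in T_neighbors)
        return DEFORMING if rel > eps_rel else SHIFTING

    # Example: a translation shared by all neighbors yields "shifting".
    T = np.eye(4); T[:3, 3] = [0.01, 0.0, 0.0]
    print(classify_point(T, [T.copy(), T.copy()], np.zeros(3)))  # shifting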
Unlike the cameras demonstrated in the CoSLAM method, the cameras in this work mimic those of a real surgical scenario; poses are tracked with the Medtronic StealthStation S7 surgical navigation system, and the surgical tool pose is monitored in real-time by the Raven-II surgical robot platform [13]. It should be noted that the accuracy requirement in surgical 3D reconstruction applications is 1 mm, in order to realistically reflect applied force during tool-tissue interactions with reasonable resolution [2].

2) Non-rigid Registration: Registering multiple frames of medical imaging is necessary to stitch together volumes or scenes of the region of interest in real-time. However, CoSLAM is only suited for larger scale, rigid scenes. Real-time, rigid methods for ultrasound have been proven in clinical validation studies and are adequate only with very slight deformations [14]. Several methods for rigid body point cloud registration exist, including optical flow with probabilistic volumetric reconstruction [15], iterative closest point, and implicit surface generation methods [16]. However, these methods are often inappropriate for the soft tissues encountered in MIS applications. Non-rigid registration of dynamic point clouds is a field of research on its own.

Oftentimes non-rigid registration is formulated as an energy functional containing both data and regularization terms. Regularization terms help to preserve smoothness, affording the optimization procedure robustness to noise and outliers. Wand et al. and Li et al. created deformation fields to fit data during optimization for non-rigid registration [17], [18]. In another approach, Süßmuth et al. introduced an as-rigid-as-possible energy function to promote smoothness [19]. A high-order graph matching technique with implicit embedding energy was used for registration with high deformation by Zeng et al. [20]. Guo et al. demonstrated that using ℓ0 regularization in non-rigid registration can improve robustness and accuracy [21]. Methods for sparse non-rigid surface data are elaborated in [22], [23].

The quadratic data terms used in [18], [21], [23] implicitly assume positional errors with Gaussian distributions. Soft tissue deformations resulting from natural breathing or heartbeat are large, piece-wise smooth signals residing on 3D surfaces. On the other hand, indentations due to tool-tissue interactions usually result in larger positional errors close to the incision point, with smaller errors for the remaining surfaces. This indicates that the positional errors are sparse, and are thus better modeled by a heavy-tailed distribution instead of a Gaussian one. Such a model was presented in [24], in which surfaces were assumed piece-wise smooth, and substantial changes in transformations could occur only in relatively local areas. Thus, this model is incorporated in the proposed registration method; a generic form of such an energy is sketched in Section II-D below.

3) Deformable Shape Correspondence: Deformation tracking can provide crucial information for force estimation. One approach is to match feature points between two frames of the same deformable object, and estimate a correspondence between those frames; registration, alignment, and matching are special cases of the shape correspondence problem. The complexity of determining correspondence relies on context: partial vs. full, dense vs. sparse, semantic vs. geometric, local vs. global, etc. [25]. Methods exist for various shape representations.
Given implicit surfaces, conformal mappings using diffeomorphisms can be used to produce a space-of-shapes or geodesics, which indicate a path to morph between two shapes [26]. Mesh representations of a surface are amenable to topological approaches for deformable tracking. Given a mesh shape model and a few anchor vertices, a mean-value encoding approach can be used to evaluate shape-preserving and rotation invariant deformations [27]. In another method, a robust mesh correspondence search was achieved via a combinatorial tree traversal that heavily weighted self-distortion energy [28]. Large deformations can be tracked so long as the deformed surface is near-isometric to the original genus zero surface. By first flattening meshes using a mid-edge flattening technique and conformal mapping, Lipman et al. developed a Möbius voting technique that could automatically determine dozens of correspondences between genus zero surfaces under large deformations [29]. A similar Möbius approach was used to ascertain intrinsically symmetric point correspondences in [30]. Other approaches involve first segmenting shapes into semantic parts, followed by registrations between near-isometric shapes within these classes. This was achieved by employing eigenfunctions of the Laplace-Beltrami operator [31].

Fig. 2. The flowchart for (a) non-rigid registration and (b) point classification. Grey boxes demarcate the two main results of the algorithm.

Schulman et al. developed an algorithm for tracking deformable objects in real-time from sequences of point clouds. The approach utilized a physics engine and a probabilistic generative expectation maximization algorithm to determine point cloud and mesh model correspondences. The solution was robust to occlusion, yet would not recover well from a divergent estimate [32]. Point clouds are also amenable to skeletal approaches. Given even sparse point clouds, curve skeletons can be extracted via an iterative method assuming shapes are generally cylindrical [33]. By modeling the evolution of competing fronts within an object's volumetric shape, curve skeletons can also be analyzed for both sparse point clouds and meshes [34]. While many of these approaches result in accurate and repeatable shape correspondences, most rely on either a priori assumptions of the object's shape or the presence of numerous distinct features (geometric, color), assets not necessarily available in surgical settings. Furthermore, only a few of these approaches work in real-time.

4) Dynamic Point Classification: In order to estimate interaction force based on tissue deformation, the dynamic changes in the surgical cavity surface must be tracked.
Locally deforming surfaces should be distinguished from static or merely shifting regions. In [12], re-projection error values of mapped 3D point cloud data between inter and intra camera groups provide indications for distinguishing the dynamic or static nature of the observed geometries. In robot-assisted MIS, it is challenging to isolate and segment the moving surgical tool tip points from nearby deforming tissue points. This is exacerbated by the fact that tissue features and colors are often reflected off the metallic tool surface [4]. To overcome this, the work presented here incorporates robot kinematic information with the constrained optimized non-rigid registration results, thus more selectively distinguishing deforming points from merely shifting ones.

II. METHODS

Since both position and color of surface points are useful indicators for feature registration, surface points in this work are stored in six-dimensional color point clouds, i.e. p ∈ R^6, where p = (c_p^T, p_p^T)^T, c_p ∈ R^3 stores the RGB color values, and p_p ∈ R^3 is the Cartesian position vector of point p. In addition, each point position p_p is augmented to form the homogeneous coordinate h_p = (p_p^T, 1)^T. Given this point cloud representation, the flowchart shown in Fig. 2 conveys the overall workflow described in this work, the details of which are described in the following sections.

A. Template Point Cloud

The template point cloud, denoted P, contains ordered points collected during the first time instance. In particular, P is generated accumulatively from multiple viewpoints within the surgical cavity. The process for building the template 3D model from multiple 2D images involves inter-camera matching, pair sequencing, and triangulation, details of which are found in the authors' prior work [3]. A sample of a template point cloud from various viewpoints is shown in Fig. 3. All points are treated as static at time 0.

Fig. 3. Extracted feature points from multiple 2D images captured with 6 cameras from different viewpoints are combined to form P.

B. Target Point Cloud

For each time step after the initial template point cloud is formed, a new set of images is acquired from all cameras. Visible 3D points in the template point cloud are then reprojected to the new set of images based on current camera poses to determine which regions are viewable at the current time instance. Viewable points which are correctly shown in the separate 2D images are updated. Then, the union of all non-viewable 3D template points and the updated regions forms the target point cloud, G, for that time instance.

Fig. 4. The generation of the target point cloud G.

Fig. 4 conveys the process by which the target point cloud is generated. Observe that any feature points within the 2D images which are not yet associated with a 3D point are triangulated, and thereby associated with a new 3D point in the target point cloud, G.

C. Surgical Tool Segmentation and Removal

Robot kinematic pose information is tracked by the Raven-II system [13]. Given this pose, accurate 3D models of the robot, and camera poses, a 2D mask of the tool shaft can be generated and projected onto each camera image. This in effect eliminates any tool points, both in the 2D images and in the generated 3D point cloud [35], [36].
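As a rough illustration of this masking step, the sketch below projects hypothetical tool-shaft sample points into one camera with a standard pinhole model and discards tissue points whose reprojections fall inside a dilated tool mask. The function names, the intrinsics K, and the dilation radius are illustrative assumptions, not the Raven-II integration described here.

    import numpy as np

    def project(K, T_cam, pts3d):
        """Pinhole projection of Nx3 world points to pixel coordinates,
        given intrinsics K (3x3) and world-to-camera extrinsics T_cam (4x4).
        Assumes all points lie in front of the camera (z > 0)."""
        h = np.hstack([pts3d, np.ones((len(pts3d), 1))])   # homogeneous
        cam = (T_cam @ h.T).T[:, :3]                        # camera frame
        uv = (K @ cam.T).T
        return uv[:, :2] / uv[:, 2:3]                       # perspective divide

    def tool_mask_filter(K, T_cam, tool_pts, tissue_pts, radius_px=8.0):
        """Drop tissue points (Nx3) whose reprojection lies within radius_px
        pixels of any projected tool-shaft sample point (Mx3); a crude
        stand-in for a rendered 2D tool mask."""
        tool_uv = project(K, T_cam, tool_pts)
        tissue_uv = project(K, T_cam, tissue_pts)
        # Pairwise pixel distances: tissue x tool.
        d = np.linalg.norm(tissue_uv[:, None, :] - tool_uv[None, :, :], axis=2)
        keep = d.min(axis=1) > radius_px
        return tissue_pts[keep]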
As a result, both template and target point clouds P and G consist only of points representing the topology of the soft tissue within the surgical cavity.

D. Energy Function

Suppose that |P| = N_P and |G| = N_G. Furthermore, denote the set of the first w natural numbers, {1, 2, ..., w}, as N_w. With an appropriate energy function defined, the goal is to determine correspondences between each point p ∈ P and points in the target point cloud G.
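The general shape of such an energy, consistent with the terms annotated in Fig. 2, can be sketched as follows. This is an illustrative form only, assuming per-point transformations T_i, matched target points g_{m(i)}, neighborhoods N(i), a robust heavy-tailed penalty ψ (cf. the sparse error model of [24] and the ℓ0 regularization of [21]), and a regularization weight λ; it is not the paper's exact formulation.

    % Illustrative non-rigid registration energy (assumed form, not the
    % paper's exact definition): T_i are per-point transformation matrices,
    % g_{m(i)} is the target point matched to template point p_i, psi is a
    % robust heavy-tailed penalty, N(i) is the neighborhood of p_i, and
    % lambda weights the piece-wise smoothness regularizer.
    E(\{T_i\}) = \sum_{i=1}^{N_P} \psi\big( \lVert T_i h_{p_i} - h_{g_{m(i)}} \rVert \big)
               + \lambda \sum_{i=1}^{N_P} \sum_{j \in N(i)} \lVert (T_j - T_i)\, h_{p_i} \rVert^2

The second term penalizes exactly the neighbor quantity (T_j - T_i) p_i annotated in Fig. 2, so that substantial changes in transformation occur only in relatively local areas; as the flowchart also notes, an energy of this shape is amenable to minimization via ADMM.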
