IROS 2019 International Conference Proceedings, 0428

Object Singulation by Nonlinear Pushing for Robotic Grasping

Jongsoon Won, Youngbin Park, Byung Ju Yi, and Il Hong Suh

Abstract— In this study, we aim at grasping a single target object in a cluttered environment using a robotic arm. While dexterous grasping for various shapes of objects is not considered in this work, we focus on developing a method to mitigate clutter near the target object as quickly as possible. For this purpose, we propose a method to generate nonlinear pushing motions for object singulation based on an off-the-shelf machine learning algorithm and a typical semantic segmentation algorithm. Through experiments, we show that the success rate of robotic grasping is considerably improved by the proposed pushing behavior. Notably, the nonlinear pushing trajectories allow the robot to perform singulation of the target object in a cluttered environment with fewer trials than the linear pushing usually pursued in related works.

I. INTRODUCTION

In unstructured human environments, robotic grasping systems have to face clutter, i.e., other objects that block direct access to the desired objects. For instance, consider a task such as clearing a dining table. In this case, the robot needs to identify the goal object on the table, determine the object's location, move its arm to reach the object, and grasp the object to move it away. However, due to the clutter on a common dining table, it is considerably difficult for a robotic hand to wrap around a single target object to achieve a good grasp. In this study, we aim to grasp a single target object in the presence of objects obstructing direct access to the desired object.

Although the development of dexterous and general grasping skills is an important issue in the literature of robotic manipulation, we do not focus on such a goal in this paper. We investigate an interactive solution that mitigates the clutter near a target object to increase the success rate of grasping while keeping the number of interactions to a minimum. To do that, we propose a
nonlinear pushing motion. By including this new skill, the robot can isolate the target object before reaching its arm closer to the object and grasping it by closing its fingers.

Figure 1 illustrates the block diagram for the proposed grasping system. The Semantic Segmentation Module (SSM) captures an RGB image, which serves as the input for the generation of a segmented image. Then, the Pushing-Grasping Decision Module (PGDM) determines whether or not it is possible to grasp a goal object without pushing behavior. If it is possible to grasp the object without pushing, the robot executes the Grasping Module (GM); otherwise, the Singulation Module (SM) produces a nonlinear pushing motion to attenuate clutter near the target object. After this trial, if the PGDM still detects the presence of objects obstructing the path towards the goal object, the pushing behavior is executed again. This procedure is repeated until there are no objects that block direct access to the goal object.

Fig. 1. Block diagram for the proposed robotic grasping system.

Jongsoon Won and Youngbin Park are with the Department of Electronics and Computer Engineering, Hanyang University, Korea ({jswon, pa9301}@incorl.hanyang.ac.kr). Byung Ju Yi is with the Department of Electronic Systems Engineering, Hanyang University, Korea (bj@hanyang.ac.kr). Il Hong Suh is with the Division of Computer Science and Engineering, College of Engineering, Hanyang University, Korea. All correspondence should be addressed to Il Hong Suh (ihsuh@hanyang.ac.kr).

Among the four modules in Figure 1, we focus on developing robust and efficient algorithms for the PGDM and SM. In real-life scenarios, the decision to execute pushing or grasping and the trajectories for complete singulation can be difficult to program explicitly because of the variety of objects that can be encountered. To this
end, we build the proposed method on an off-the-shelf machine learning algorithm and a recent semantic segmentation technique.

II. RELATED WORK

Some studies on grasping in cluttered environments have addressed integrated perception and grasping, where the objective is to grasp objects from an unorganized pile [1], [3], [9], [12], [13]. However, these methods aim to grasp objects from a cluttered bin and remove the objects so that the bin eventually becomes empty. Most of them are thus not intensively required to develop non-prehensile skills for increasing the success rate of grasping objects, because the robot in this case can grasp multiple objects in a single trial and can grasp objects in non-cluttered regions earlier than the rest.

[4], [7], [8], and [13] are closely related to our approach. In particular, [4] presents a push-grasp planner that can reduce the uncertainty about an object's pose by acting on it. This approach pushes an object so that it rolls into the hand of the robot, leading to successful grasps that avoid collisions with the clutter. These works assume a context where the individual 3D object model is known. [7] performs object singulation using several push primitives.

2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Macau, China, November 4-8, 2019. 978-1-7281-4003-2/19/$31.00 © 2019 IEEE

Their method is based on object edges and 3D point clouds to detect splitting locations between potential objects; then, push vectors for all candidate boundaries associated with the object of interest are produced. This set of push vectors is ranked, and the highest-ranked push is performed. [8] presents a neural-network-based approach that separates unknown objects in clutter by selecting favorable push actions. This network is trained in a supervised manner; accordingly, more than 3,000 pushing actions are manually labeled as positive or negative by a user who assesses the outcomes of the actions. [13] trained two fully convolutional networks that map from visual observations to
actions: one infers the utility of pushes that can help rearrange cluttered objects to make space for arms and fingers, while the other does the same for grasping. Both networks are trained jointly in a Q-learning framework.

Though the prior works mentioned above explicitly consider the singulation of a particular object or the grasping of a single goal object in clutter, they perform linear pushing to move the object or the surrounding objects. In contrast, we investigate nonlinear pushing trajectories to achieve object singulation with fewer pushes.

III. PROPOSED MODEL

In this section, we describe the modules in our proposed model shown in Figure 1.

A. Semantic Segmentation Module (SSM)

Thanks to the recent development of deep learning techniques, segmentation based on deep neural networks gives precise boundaries of object instances in real time. Semantic segmentation plays a critical role in the proposed system because the segmented image is used in the PGDM, SM, and GM.

We employ the semantic segmentation method presented in [5], where a deep fully convolutional neural network architecture termed SegNet was proposed for pixel-wise classification in real time. The segmentation architecture consists of an encoder network and a corresponding decoder network, followed by a pixel-wise classification layer. The encoder network has 13 convolutional layers. Each encoder in the encoder network performs convolution with a filter bank to produce a set of feature maps. These are then batch normalized, and an element-wise rectified linear nonlinearity (ReLU) is applied. Following that, max pooling with a 2×2 window and stride 2 is performed, and the resulting output is sub-sampled by a factor of 2. More details are described in [5].

B. Pushing-Grasping Decision Module (PGDM)

To determine whether to perform the pushing or the grasping action, the segmented image obtained using the SSM and the dilation operation [6] are used. Dilation is one of the two basic operators in the field of mathematical morphology, the other being
erosion. Dilation is typically applied to binary images; the basic effect of the operator on a binary image is to gradually enlarge the boundaries of the regions of foreground pixels. In particular, the dilation operator takes two data items as inputs: first, the image that needs to be dilated, and second, a set of coordinate points known as a kernel. The kernel determines the precise effect of the dilation operation on the input image. Dilation can be performed iteratively. Therefore, in general, there are two parameters that need to be determined for this algorithm: the size of the kernel and the number of iterations K.

Algorithm 1 PGDM
Input: S (segmented image), t (target object index)
1: construct O_t using S and t
2: compute O_t^dial using O_t ⊕ B in (2)
3: compute o_surr in (3)
4: if |o_surr| > 0 then behavior ← pushing
5: else behavior ← grasping
6: return behavior

In mathematical morphology, the dilation of a binary image A by a structuring element B is defined by

A ⊕ B = ⋃_{b∈B} A_b,   (1)

where A_b is the translation of A by b. When the structuring element B has an origin, one way to think of Equation (1) is to take copies of A and translate them by the movement vectors defined by each of the pixels in B; if we union these copies together, we get A ⊕ B.

Let O represent the entire object region in the segmented image, and let B be the complement of O, which is thus the background. Therefore, the union of the two sets, denoted by S = O ∪ B, is the entire segmented image. The object region O is partitioned into n subregions O_1, O_2, ..., O_n such that ⋃_i O_i = O. Each subregion is a set whose elements are pixel coordinates p = (x, y). Let O_i denote a binary image for an object O_i. Then, the dilation of the binary image O_i by the structuring element B is given by

O_i ⊕ B = ⋃_{b∈B} (O_i)_b.   (2)

The set of pixels that have value one after dilation is denoted by O_i^dial = {p | v_p = 1, p ∈ O_i ⊕ B}. Let o be the object index set in a segmented image; the object index set excluding the target object index t is denoted by ō = {i | i ∈ o, i ≠ t}. The index set of the surrounding objects which block direct access
to the goal object is defined as

o_surr = {i | O_t^dial ∩ O_i ≠ ∅, i ∈ ō}.   (3)

O_surr is the region of the surrounding objects, such that O_surr = ⋃_i O_i, i ∈ o_surr.

The PGDM operates as described in Algorithm 1. First, we dilate an image in which the target object is set as the foreground while the other regions are set as the background. Then, we check whether the enlarged target object overlaps with any of the pixels of the other objects. If some pixels are overlapped, the pushing behavior is selected; otherwise, the grasping action is executed.

Fig. 2. Two types of decision boundary. The left and right figures show the loop and non-loop boundaries, respectively. The red region represents O_t^dial and the blue region illustrates O_surr. Purple dots denote the ends of the line segments L̄_i. There are three and two line segments in the left and right images, respectively.

C. Singulation Module (SM)

The primary contribution of this paper is the singulation module, which efficiently moves away objects obstructing the path towards the goal object. To generate such trajectories, we use a classifier, namely nonlinear Support Vector Machines (SVMs) [10], instead of relying on typical algorithms used in classical control problems. In principle, various nonlinear classifiers can be applied to our problem; however, we choose SVMs because they construct a maximum-margin separator, i.e., a decision boundary with the largest possible distance to the example points, which helps them generalize well. Standard SVMs create a linear separating hyperplane, but the nonlinear version of SVMs can embed the data into a higher-dimensional space using the so-called kernel trick. The high-dimensional linear separator is then nonlinear in the original space.

We formulate the generation of nonlinear pushing trajectories as a binary classification problem. Let D+ and D− be the training data for the positive and negative classes, respectively. To apply nonlinear SVMs to generate a trajectory for singulation of the target object, the points included in the goal
object are considered as the positive class, whereas the points in the region belonging to the surrounding objects are set as the negative class. These are defined by

D+ = {(p_i, y_i = +1) | p_i ∈ O_t^dial}, i = 1, ..., n, n = |O_t^dial|,
D− = {(p_i, y_i = −1) | p_i ∈ O_surr}, i = 1, ..., m, m = |O_surr|.   (4)

The trajectory for pushing is obtained from the decision boundary determined by the nonlinear SVMs. Figure 2 shows the two types of decision boundaries, called the loop boundary and the non-loop boundary. The algorithm for selecting the final trajectory varies depending on the type of boundary. Let L be the set of pixels included in the nonlinear decision boundary, and let L̄ denote the subset of L whose elements are not located in object regions. L̄ is partitioned into n lines L̄_1, L̄_2, ..., L̄_n such that ⋃_i L̄_i = L̄. The partition is determined by investigating the connectivity of the 8 adjacent pixels. In the case of the loop decision boundary, the trajectory for pushing is determined by

traj_loop = L̄ − L̄_long,   (5)

Fig. 3. Left top: RGB image. Right top: segmented image. Left bottom: the regions of dilation of the goal object and the surrounding objects; red and blue indicate the goal and surrounding objects, respectively. Right bottom: the green line and circles denote the final trajectory traj_loop produced by the SM and the two end points of the path.

Algorithm 2 SM
Input: O_t^dial, O_surr, o_surr produced by the PGDM, and S (segmented image)
1: construct the training dataset in (4)
2: determine the decision boundary using nonlinear SVMs
3: construct L, L̄
4: construct L̄_1, L̄_2, ..., L̄_n
5: long ← argmax_i |L̄_i|
6: determine the SVM type based on |L̄_long|
7: if SVM type = linear SVMs then
8:   reconstruct O_surr using o_surr and S
9:   compute traj = traj_non-loop in (6)
10: else if SVM type = nonlinear SVMs then
11:   determine the boundary type
12:   if boundary type = loop then
13:     compute traj = traj_loop in (5)
14:   else if boundary type = non-loop then
15:     determine L̄_i^e1, L̄_i^e2
16:     compute traj = traj_non-loop in (6)
17: return traj

where L̄_long denotes the set of pixels in the longest line segment, defined by long = argmax_i |L̄_i|, and L̄ − L̄_long represents the set of all elements that belong to L̄ but not to L̄_long. Therefore, the
path is the shortest line that passes through all the surrounding objects. One of the end points of the line segment is randomly set as the start point, and the other is determined as the end point.

In the case of the non-loop decision boundary, there are two line segments that contact the boundary of the segmented image, denoted by L̄_i^e1 and L̄_i^e2. The trajectory of the end effector for pushing is determined by

traj_non-loop = L̄ − (L̄_i^e1 ∪ L̄_i^e2).   (6)

This path passes through all the surrounding objects and lies completely within the workspace. In practice, for the start point, the k-th adjacent pixel from an end point of the line segment is selected so that the end effector reaches the starting position without collision.

Figure 3 illustrates the input and output of each module in our proposed system. The RGB image is the input for the SSM; the SSM produces the segmented image, which is then used as the input for the PGDM. The PGDM constructs the regions of dilation of the goal object and the surrounding objects and gives them to the SM. Finally, the SM determines the final trajectory for pushing.

If the target object is completely surrounded by neighboring objects, the robot necessarily collides with the neighboring objects while moving to the start position; the robot might be physically damaged in this case. To mitigate this, the robot performs linear pushing in advance to break this complete surrounding. Whether to use linear SVMs or not is determined based on the cardinality of the set L̄_long. Therefore, nonlinear SVMs should be run initially to make this decision because, as mentioned before, L̄_long can only be constructed from the decision boundary produced by nonlinear SVMs. If the number of elements is smaller than a certain threshold T, linear SVMs are employed; otherwise, the decision boundary constructed using nonlinear SVMs is used to generate a pushing trajectory. The final trajectory for linear SVMs is computed using Equation (6), because a decision boundary constructed by linear SVMs must be of the non-loop boundary type. In the case of linear SVMs, O_surr is
recomputed based on the region of the biggest object in the index set of surrounding objects o_surr, so as to rearrange the neighboring objects as widely as possible: O_surr = O_k, where k = argmax_i |O_i|, i ∈ o_surr. The procedure by which the SM generates the trajectory of the end effector for pushing is given in Algorithm 2.

D. Grasping Module (GM)

In this study, we do not aim to develop dexterous grasping for complex shapes of objects. Therefore, we take into account primitive shapes of objects and implement the grasping module in a simple manner. For a successful grasp without geometric object models, we consider planar grasps only. A planar grasp is one wherein the grasp configuration is along and perpendicular to the workspace. Hence, the grasp configuration includes three dimensions: the (x, y) position of the grasp point on the surface of the table and θ, the angle of the grasp.

The robot executes three primitive actions in a sequential manner, namely reaching, orienting, and grasping. If the robot is not operated sequentially, the end effector sometimes collides with the target object even though the position and orientation of the object are correctly estimated.

The aim of the reaching action is to move the end effector to 20 cm above the center of the goal object while ensuring that the vector between the end effector and the adjacent (final) joint is normal to the table plane. The three-dimensional center position of the goal object is computed in the following manner. First, the points included in the goal object are obtained from the single-view 2.5D point clouds extracted from a Kinect camera, wherein the segmented image is used to select the points included in the target object. Second, the center position of the target is computed by averaging the positions of the selected points in the camera coordinate system. The center position with respect to the robot base is obtained using a pre-computed calibration matrix.

For the orienting action, the shortest axis of the goal object is calculated using the
segmented image and eigenv
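The dilation-based decision of Algorithm 1 can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: object regions are represented as sets of pixel coordinates, dilation follows Equation (1), and the overlap test of Equation (3) selects between pushing and grasping. The object layout, kernel, and function names are assumptions for the example.

```python
# Sketch of Algorithm 1 (PGDM). Object regions are sets of (x, y) pixel
# coordinates; the structuring element B is a set of offset vectors.

def dilate(region, kernel, iterations=1):
    """Morphological dilation: A (+) B is the union of A translated by each b in B."""
    out = set(region)
    for _ in range(iterations):
        out = {(x + dx, y + dy) for (x, y) in out for (dx, dy) in kernel}
    return out

def pgdm(objects, t, kernel, iterations=1):
    """Return ('pushing', o_surr) if any non-target object overlaps the
    dilated target region O_t^dial, else ('grasping', empty set)."""
    o_t_dial = dilate(objects[t], kernel, iterations)
    o_surr = {i for i in objects if i != t and o_t_dial & objects[i]}
    return ("pushing" if o_surr else "grasping"), o_surr

# 3x3 square structuring element (offsets around the origin).
B = {(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)}
objects = {
    0: {(5, 5)},      # target object O_t
    1: {(7, 5)},      # 2 px away: overlaps O_t^dial after 2 dilations
    2: {(20, 20)},    # far away: never overlaps
}
behavior, surr = pgdm(objects, t=0, kernel=B, iterations=2)
# behavior == "pushing", surr == {1}
```

With `iterations=1` the dilated target would not reach object 1 and the module would select grasping, which is how the iteration count K controls the clearance demanded around the target.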
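The SM's boundary-generation step can be sketched with an off-the-shelf nonlinear SVM. Here scikit-learn's `SVC` with an RBF kernel stands in for the nonlinear SVMs of [10]; the synthetic pixel clusters, the value of C, and the |f| threshold for extracting boundary pixels are illustrative assumptions, not the paper's settings.

```python
# Fit a nonlinear (RBF-kernel) SVM separating target pixels (+1) from
# surrounding-object pixels (-1), then take grid points near the zero level
# set of the decision function as the candidate pushing path L.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
target = rng.normal(loc=[10.0, 10.0], scale=1.0, size=(50, 2))    # O_t^dial pixels
surround = rng.normal(loc=[20.0, 10.0], scale=1.0, size=(50, 2))  # O_surr pixels
X = np.vstack([target, surround])
y = np.hstack([np.ones(50), -np.ones(50)])

clf = SVC(kernel="rbf", C=10.0).fit(X, y)

# Evaluate the decision function on a pixel grid between the two regions;
# the boundary L is where it is (nearly) zero.
xs, ys = np.meshgrid(np.arange(5.0, 25.0, 0.1), np.arange(8.0, 12.0, 0.1))
grid = np.column_stack([xs.ravel(), ys.ravel()])
f = clf.decision_function(grid)
L = grid[np.abs(f) < 0.3]   # pixels on (near) the decision boundary
```

Because the maximum-margin boundary bends around both classes, the extracted pixel set already threads between the target and the surrounding objects; the subsequent steps of Algorithm 2 only have to partition it into segments and pick the pushing path.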
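Partitioning the boundary pixel set L̄ into 8-connected line segments and discarding the longest one, as Equation (5) prescribes for the loop case, can be sketched as follows. The toy pixel set is an assumption for illustration.

```python
# Split a set of boundary pixels into 8-connected components and apply
# Eq. (5): traj_loop = L_bar minus its longest segment.

def split_segments(pixels):
    """Partition a set of (x, y) pixels into 8-connected components."""
    remaining, segments = set(pixels), []
    while remaining:
        stack = [remaining.pop()]
        comp = set()
        while stack:
            x, y = stack.pop()
            comp.add((x, y))
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    n = (x + dx, y + dy)
                    if n in remaining:
                        remaining.remove(n)
                        stack.append(n)
        segments.append(comp)
    return segments

# Two separate 8-connected runs of pixels: lengths 4 and 2.
L_bar = {(0, 0), (1, 0), (2, 1), (3, 1), (10, 10), (11, 11)}
segs = split_segments(L_bar)
longest = max(segs, key=len)       # L_bar_long, long = argmax_i |L_bar_i|
traj_loop = L_bar - longest        # Eq. (5)
# traj_loop == {(10, 10), (11, 11)}
```

Dropping the longest segment keeps the shorter boundary pieces that actually cut between the target and the surrounding objects, matching the loop-boundary case in Figure 2.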
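The GM's center-position computation for the reaching action (mask-select the target's point-cloud points, average them in the camera frame, then map into the base frame with the calibration matrix) can be sketched as below. The function name, the toy 2×2 "point cloud", and the calibration transform are assumptions for the example.

```python
# Estimate the goal object's 3-D center by averaging the point-cloud points
# selected by the segmentation mask, then map it into the robot base frame
# with a pre-computed 4x4 calibration matrix.
import numpy as np

def object_center_base(points_cam, seg_mask, target_id, T_base_cam):
    """points_cam: (H, W, 3) per-pixel 3-D points in the camera frame,
    seg_mask: (H, W) integer object labels, T_base_cam: 4x4 transform."""
    pts = points_cam[seg_mask == target_id]   # (N, 3) target points only
    center_cam = pts.mean(axis=0)             # average in the camera frame
    center_h = np.append(center_cam, 1.0)     # homogeneous coordinates
    return (T_base_cam @ center_h)[:3]        # center in the base frame

# Toy example: a 2x2 image where the target (label 1) covers the top row.
points = np.array([[[0.0, 0.0, 1.0], [0.2, 0.0, 1.0]],
                   [[0.0, 0.2, 1.2], [0.2, 0.2, 1.2]]])
mask = np.array([[1, 1], [0, 0]])
T = np.eye(4)
T[:3, 3] = [0.5, 0.0, 0.0]                    # camera offset 0.5 m along x
center = object_center_base(points, mask, 1, T)
# center == [0.6, 0.0, 1.0]
```

Averaging over all masked points rather than using a single depth pixel makes the estimate robust to sensor noise and small segmentation errors, which is why the 2.5D cloud is filtered through the segmented image first.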
