Atomic force microscope tip localization and tracking through deep learning based vision inside an electron microscope

Shuai Liang, Mokrane Boudaoud, Catherine Achard, Weibin Rong and Stéphane Régnier

Shuai Liang, Mokrane Boudaoud, Catherine Achard and Stéphane Régnier are with Sorbonne Université, campus Pierre et Marie Curie / ISIR CNRS UMR 7222, 4 Place Jussieu, CC 173, Pyramide T55/65, 75005 Paris. Weibin Rong is with Harbin Institute of Technology, State Key Lab. of Robotics and System, 92 West Dazhi Street, Nan Gang District, Harbin, China. liang@sorbonne-universite.fr, mokrane.boudaoud@sorbonne-universite.fr, catherine.achard@sorbonne-universite.fr, stephane.regnier@sorbonne-universite.fr, rwb

Abstract— Scanning Electron Microscopy (SEM) is an ideal observation tool for small scale robotics. It has the potential to achieve automated nano-robotic tasks such as nano-handling and nano-assembly. Path following control of nano-robot end-effectors using SEM vision feedback is a key for an intuitive programming of elementary robotic task sequences. It requires the ability to track end-effectors under various SEM scan speeds. SEM suffers, however, from tricky issues that limit robotic tracking capabilities. This paper focuses on one specific issue related to the compromise between the scan speed and the image quality. This restriction seriously limits the performance of conventional vision tracking algorithms when used with electron images. At high scan speed, images are very noisy, making it very difficult to differentiate the robot end-effector from the background and hence limiting the tracking capabilities. The work reported in this paper explores for the first time the potential value of Convolutional Neural Networks (ConvNets) in the context of nano-robotic vision tracking inside a SEM. The aim is to localize an end-effector, an AFM cantilever in the case of this study, from SEM images for any scan speed configuration and despite low image quality. For that purpose, a dataset of AFM tip images is built from SEM images for the learning algorithm. Network performances are estimated under different SEM scan speeds. Thanks to the learning algorithm, experimental results show robust AFM tip tracking capabilities inside the SEM under various scan speed conditions.

I. INTRODUCTION

Nano-robotic systems are efficient platforms for automation tasks at the small scales. They have generated a lot of interest in various fields of research, such as material science [1], health [2], micro/nano assembly [3] and more generally physics [4]. Specifically, inertial-actuated nano-robotic systems [5], [6], characterized by both a millimeter displacement range and a nanometer resolution, are among the most popular.

A. Automation inside SEM

A series of robotic applications involving SEM have been reported in the literature. One can cite works related to the electrical characterization of nano-wires [7], mechanical characterization [8], and the pick and place of small scale objects such as colloidal particles [4] and carbon nanotubes (CNTs) [9], and so on. The common point between these applications is the need for coarse and fine positioning capabilities of the robotic system and a SEM vision feedback.

B. SEM image processing

Several tracking algorithms are available using conventional image processing methods. Standard algorithms are divided into three main steps. The first one is a preprocessing step that reduces the noise in the images. It can be done using Gaussian filtering or a nonlinear anisotropic diffusion approach [10].
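As an illustration of this preprocessing step, a Gaussian prefilter can be applied to a raw frame before any detection. The sketch below uses OpenCV; the file name, kernel size and sigma are illustrative assumptions, not values from the paper.

```python
import cv2

# Load a noisy grayscale SEM frame (hypothetical file name).
frame = cv2.imread("sem_frame.png", cv2.IMREAD_GRAYSCALE)

# Gaussian prefiltering: smooth the electron-image noise before
# binarization; kernel size and sigma are illustrative choices.
denoised = cv2.GaussianBlur(frame, (5, 5), 1.5)
```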
Secondly, the image is binarized in order to localize a region of interest (RoI). More or less advanced methods can be used, such as the segment detection method (SDM) [9] or binary large object (BLOB) detection [4]. Morphological operations [7] are often used to improve the detection. The third step is the RoI tracking, which is often performed using simple heuristics. The lack of robustness of the RoI detection step has led to another kind of method that skips this step. It is based on template matching, where a template is searched in the image using the sum of squared differences (SSD) between the image and the template, or using a correlation. However, these algorithms are highly sensitive to variations of the contrast, the brightness or the signal-to-noise ratio [9]. Experiments carried out in the context of the present paper have shown that template matching performance decreases as the signal-to-noise ratio decreases. Therefore, a more powerful image processing method is needed to track an AFM cantilever inside a SEM in a robust way.
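For concreteness, the SSD-based template matching baseline referred to above can be sketched as follows. This is a minimal OpenCV illustration with hypothetical file names, not the authors' implementation.

```python
import cv2

# Noisy SEM frame and a template of the cantilever tip (hypothetical files).
frame = cv2.imread("sem_frame.png", cv2.IMREAD_GRAYSCALE)
template = cv2.imread("tip_template.png", cv2.IMREAD_GRAYSCALE)

# Sum-of-squared-differences map: the best match minimizes the SSD score.
ssd_map = cv2.matchTemplate(frame, template, cv2.TM_SQDIFF)
_, _, min_loc, _ = cv2.minMaxLoc(ssd_map)
x, y = min_loc  # top-left corner of the best-matching window
```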
C. Deep learning

Deep learning is a class of machine learning. It is increasingly applied to challenging modern computer vision problems, thanks to its outstanding capability to distinguish and extract target representations from massive and intricate data, showing significant superiority and potential to enhance robotic performance in various aspects, such as speech recognition and visual servoing. Deep convolutional neural networks (CNNs) are driving the advances in recognition for a multitude of computer vision challenges, showing high superiority in both holistic and local feature extraction [11]. The application and combination of convolutional layers and pooling layers are the key factors that boost the performance of CNNs. Convolutional layers enable the network to extract local features that are chosen and learned according to the application.

Deep Neural Networks (DNNs) have achieved remarkable success on both image classification [12] and object localization. Modern approaches perform multiple object classifications and localizations at the same time. Among them, some methods [13] use an external RoI proposal generator and apply a classification algorithm with a box regression to recognize the object and refine its position. In R-CNN [13], each RoI is resized to a fixed size, and a pre-trained CNN is employed to extract a feature vector of size 4096. Multiple SVMs are learned on these vectors to perform the multi-class classification. Then, for each positive classification, a linear regression is done to refine the position of the object. This method is time consuming, as the feature extraction, the classification and the regression are done for each RoI (2000 in the original article). An improvement is made in Fast R-CNN [14], as the CNN is applied only once on the whole image and the feature vectors are extracted after the CNN, at the locations of the RoIs. Recent approaches such as Faster R-CNN [15], SSD [16] or YOLO [17] change the two-step method and include the region proposal generator in the network in an end-to-end way. They use a multi-part loss that combines a classification loss and a regression loss. A good review of recent techniques can be found in [18].

In the previous approaches, the localization is made using bounding boxes, which are useful to coarsely localize the object in the image. Other works consider the estimation of the pose of complex objects. This problem has been widely addressed for human pose estimation [19], [20]. Two main approaches can be distinguished: those that predict heatmaps (maps of presence probability) and those that directly predict the position (x, y) of each joint. While the first methods [20] are based on hourglass or stacked hourglass networks, the second ones [19] use generic convolutional layers and regression. This kind of approach, also used in [21], is well adapted to our problem, as we can directly regress the x and y positions of the AFM tip from SEM imaging in a more precise way than with bounding boxes.

D. Motivation of the paper

The aim is to tackle the issue of tracking the position of an AFM (Atomic Force Microscope) tip inside a SEM at various scan speeds of the electron microscope. The SEM used in this work is the ZEISS EVO-LS 25. This microscope features 15 scan speed modes. Scan mode 1 is the fastest one, but it provides very noisy images that do not allow distinguishing precisely the AFM tip from the background, as shown in Fig. 1(a). Scan mode 15 is the slowest one; it requires much more time to get an image and leads to the best image quality among the scan speed modes. The image quality at scan speed mode 5 can be seen in Fig. 1(b); this mode requires 15 s to get an image. To this end, a learning algorithm that allows extracting precisely the coordinates of the AFM tip at various scan speeds, despite the noisy images of the fast scan modes, is studied. A ConvNet based network is built to regress the end-effector (i.e. AFM tip) localization within SEM imaging, which is further used as position feedback in a path following control loop. This work shows the potential of Deep Learning and enriches the feasible solutions to deal with the tricky SEM imaging issues.

[Fig. 1. AFM tip observed with the SEM using scan speed mode 1 (a) and mode 5 (b), with a 20 µm scale bar. Scan speed 1 is the fastest one; it allows obtaining an image every 1.3 s, but the image quality is bad and it is hard to differentiate the AFM tip from the background. Scan speed mode 5 allows obtaining a better image quality, but it requires 15 s to acquire an image.]

II. DEEP LEARNING MODEL FOR AFM TIP COORDINATE REGRESSION IN SEM IMAGING

The aim is to develop an end-to-end, pixel-wise prediction of the AFM tip coordinates (x, y) from the images acquired by the SEM. The studied images are very specific grayscale images, with many geometric and repetitive patterns. It is difficult to take advantage of the large databases of the literature, such as ImageNet, the Microsoft COCO dataset [22], or of other applications like human pose estimation [20]. Therefore, a specific model has been built to localize the AFM tip in images captured by the SEM.

A. Problem definition and model architecture

Among the deep learning methods, those based on a first detection of bounding boxes cannot be used in our application, as the whole object is not fully visible at the magnification of the SEM images. So, we propose to directly regress the extremity (x, y) of the tip using a standard CNN [19]. As preprocessing, data normalization is implemented by subtracting the mean of the pixel values and dividing by their standard deviation.
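A minimal sketch of this per-image normalization, assuming a NumPy array input; the small epsilon is an added safeguard, not mentioned in the paper.

```python
import numpy as np

def normalize_frame(img: np.ndarray) -> np.ndarray:
    """Subtract the mean and divide by the standard deviation of the pixels."""
    img = img.astype(np.float32)
    return (img - img.mean()) / (img.std() + 1e-8)  # epsilon guards flat images
```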
The image is then passed through several layers. All convolutional layers have a kernel size of 3 × 3 pixels, a stride fixed to 1 pixel and a padding that keeps the original size of the image. The spatial max-pooling layers are done with a 2 × 2 pixel window and a stride of 2. The last layers are fully connected layers, with two units that predict the x and y positions of the AFM tip. The Rectified Linear Unit (ReLU) activation function is used for all the layers except the fully connected one, which has a linear activation function. Moreover, a Local Response Normalization (LRN) is done before the ReLU layers.

[Fig. 2. AFM tip localization with the regression network: classification layers act as a feature extractor first, followed by fully convolutional layers for fine-tuning, and a fully connected layer regresses the numerical AFM tip coordinates (convolution + ReLU, max pooling, fully connected).]
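The following Keras sketch assembles the layers as described above. The number of convolutional stages and the filter counts are illustrative assumptions (the exact dimensions are given in the paper's Fig. 2), and the LRN step is approximated with a Lambda layer around TensorFlow's local_response_normalization.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_tip_regressor(input_shape=(154, 205, 1)):
    """Coordinate-regression ConvNet sketched from Sec. II-A.

    3x3 convolutions with stride 1 and 'same' padding, LRN before each
    ReLU, 2x2 max pooling with stride 2, and a final fully connected
    layer with two linear units predicting the (x, y) tip position.
    Filter counts below are assumptions, not the paper's exact values.
    """
    inputs = layers.Input(shape=input_shape)
    x = inputs
    for filters in (64, 128, 256):  # assumed channel progression
        x = layers.Conv2D(filters, 3, strides=1, padding="same")(x)
        x = layers.Lambda(tf.nn.local_response_normalization)(x)  # LRN
        x = layers.ReLU()(x)
        x = layers.MaxPooling2D(pool_size=2, strides=2)(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(2, activation="linear")(x)  # (x, y) coordinates
    return models.Model(inputs, outputs)
```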
B. Loss function

The last fully connected layer gives the prediction of the numerical tip coordinates x_p = (x_p, y_p)^T. The loss is estimated as the Euclidean distance between the predicted coordinates x_p = (x_p, y_p)^T and the ground truth coordinates x_g = (x_g, y_g)^T. This loss, named Mean Square Error (MSE), is defined by:

\[
L = \frac{1}{N} \sum_{i=1}^{N} \left| \mathbf{x}_p - \mathbf{x}_g \right|^2 \tag{1}
\]

where N is the number of images in the batch.

C. Construction of SEM imaging database

Some popular datasets of natural images, such as ImageNet or the Microsoft COCO dataset [22], exist in the literature. However, no training data are available for AFM cantilevers in SEM images. Thus, a dedicated database is built for AFM tip localization. In order to consider the images under conventional SEM standard use, images are captured over a wide range of SEM magnifications, distributed at 340, 377, 423, 474, 526, 572, 622 and 675 times. This leads to images with an important variation in the size of the AFM tip. Moreover, the position of the tip is changed during the acquisition to simulate all the possible configurations, as illustrated in Fig. 3. The final database contains 1476 images of size 612 × 460, collected through UDP communication from the SEM at scan speed mode 3.

[Fig. 3. Examples of AFM tip images used for the learning process, captured at SEM magnifications of 340 and 675 times with various tip positions.]

III. PATH FOLLOWING STRATEGY OF THE INERTIAL-ACTUATED NANO-ROBOTIC SYSTEM INSIDE THE SEM

This section relies on the proposed learning network to predict the AFM tip coordinates, which are used as position feedback in the path following control loop.

A. Kinematic Model

The path following control strategy is designed considering the specific case of a 2-DOF holonomic inertial-actuated nano-robotic system (i.e. the X and Y axes of the Cartesian nano-robot of [23]). Let us consider the local frame R_p, the Frenet frame R_s and the world frame R_0 [24], as in Fig. 4. The controller specifies the local velocities of the AFM tip, v_adv and v_rec, with respect to the Frenet frame R_s. These velocities are further projected onto the X and Y axes as v_x and v_y in the world frame R_0. With v_x and v_y as inputs, the closed loop velocity control previously designed in [25] is applied to each axis of the nano-robotic system driving the AFM cantilever. The aim is that the AFM tip position converges to a defined curve in the world frame. In this work, the AFM tip orientations in the frames R_p, R_s and R_0 are assumed to be fixed.

The AFM tip position is characterized by a point p = (x, y)^T in the world frame R_0, which is predicted by the proposed learning network from SEM images. p_p is the perpendicular projection of p on the reference curve. The AFM tip translation velocity v_t within the frame R_s is divided into the tangential velocity v_adv along the path and the normal velocity v_rec perpendicular to the path. In Fig. 4, k_p and k_pf are the tangential and normal unit vectors, respectively, of the reference curve at p_p. k_x and k_y are the unit vectors of the x and y axes in the frame R_0, respectively. The path following error is the distance |d_pf| = ||p_p − p||. In equation (2), the AFM tip velocity v_t is an independent variable.

[Fig. 4. Frenet frame based kinematic model: the tangential velocity v_adv and the normal velocity v_rec are calculated in the Frenet frame R_s and projected into v_x and v_y in the world frame R_0.]

B. Path Following Control Law

The control process consists of two steps. First, the Frenet frame velocity calculation, which computes and assigns the tangential velocity v_adv and the normal velocity v_rec to steer the AFM tip so that it converges to the reference path. Secondly, the projection of v_adv and v_rec onto the X and Y axes as v_x and v_y for the inertial actuator velocity control (Fig. 5).

[Fig. 5. Path following control flow chart: given the reference path and the AFM tip localization by the learning network, if the distance |d_pf| between the AFM tip and its projection p_p on the reference path exceeds the threshold distance d_d, the tip only moves perpendicularly towards the reference path at velocity v_rec. Otherwise, it moves both towards and along the reference path with velocities v_rec and v_adv. These are further projected into v_x and v_y in the world frame R_0, as the velocity control inputs of the nano-robotic system.]

1) Velocity assignment in the Frenet frame: The flow chart of the controller used here is described in Fig. 5. The velocities v_adv and v_rec are calculated from the specified desired translation velocity v_t of the AFM tip in the world frame R_0, where |v_t| = v_constant and

\[
\mathbf{v}_t = \underbrace{\alpha \, \mathbf{k}_p}_{\mathbf{v}_{adv}} + \underbrace{\beta \, |d_{pf}| \, \mathbf{k}_{pf}}_{\mathbf{v}_{rec}}, \tag{2}
\]

α is the amplitude of the tangential velocity v_adv and β|d_pf| is the amplitude of the normal velocity v_rec toward the reference path. The AFM tip velocity v_t is specified as the constant v_constant to ensure a smooth and stable tracking. α and β are determined by the following rules:

\[
\begin{cases}
\alpha = \text{constant}, \; \beta = \text{constant} & \text{if } |d_{pf}| \le d_d \\
\alpha = 0, \; \beta = \text{constant} & \text{otherwise}
\end{cases} \tag{3}
\]

where d_d is a threshold distance between the AFM tip position and the reference path, as shown in Fig. 4.

2) Velocity assignment in the world frame: The projected velocities v_x and v_y in the world frame are obtained from v_adv and v_rec as follows:

\[
\begin{cases}
v_x = \mathbf{v}_{adv} \cdot \mathbf{k}_x + \mathbf{v}_{rec} \cdot \mathbf{k}_y, \\
v_y = \mathbf{v}_{adv} \cdot \mathbf{k}_y + \mathbf{v}_{rec} \cdot \mathbf{k}_x,
\end{cases} \tag{4}
\]

where k_x and k_y are the unit vectors of the X and Y axes in R_0. The velocities v_x and v_y are used as velocity control input references to achieve the final path following task (Fig. 4).
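A compact NumPy sketch of the velocity assignment of Fig. 5 is given below. The gains alpha and beta and the threshold d_d are illustrative parameters, and the world-frame components are obtained here by the direct vector sum of v_adv and v_rec, which is a straightforward reading of the projection step; it is a sketch, not the authors' implementation.

```python
import numpy as np

def frenet_velocity_command(p, p_proj, k_p, alpha, beta, d_d):
    """Velocity assignment sketched from Fig. 5 and eqs. (2)-(3).

    p      -- current AFM tip position (x, y), predicted by the network
    p_proj -- perpendicular projection of p on the reference path
    k_p    -- unit tangent vector of the path at p_proj
    alpha  -- tangential gain (constant); beta -- normal gain (constant)
    d_d    -- threshold distance on the path following error
    """
    d_vec = np.asarray(p_proj) - np.asarray(p)  # error vector toward the path
    d_pf = np.linalg.norm(d_vec)                # path following error |d_pf|
    v_rec = beta * d_vec                        # normal velocity toward the path
    if d_pf > d_d:
        v_adv = np.zeros(2)                     # far from the path: only converge
    else:
        v_adv = alpha * np.asarray(k_p)         # near the path: also advance
    v = v_adv + v_rec                           # world-frame velocity vector
    return v[0], v[1]                           # (v_x, v_y) control inputs
```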
IV. EXPERIMENTS

A. Tip localization using the learning network

1) Experimental setup: For each image of the database, the position of the tip extremity is manually annotated, and the x and y coordinates that will be regressed are recorded. Then, the whole database (1476 images) is split into a training part (80%) and a testing part (20%). Considering the training efficiency and the computation capability, the raw images are resized to 205 × 154 pixels. The network is trained using Keras in a Jupyter Notebook.
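A sketch of this experimental setup in code, reusing the build_tip_regressor function from the architecture sketch above. The data arrays are random stand-ins for the SEM frames and annotations, and the Adam optimizer is an assumption, since the preview does not state the training hyper-parameters.

```python
import numpy as np
import cv2
from sklearn.model_selection import train_test_split

# Stand-ins for the database: 1476 raw 612 x 460 frames and (x, y) labels
# expressed in the resized 205 x 154 frame (both hypothetical here).
images = np.random.randint(0, 256, (1476, 460, 612), dtype=np.uint8)
labels = np.random.rand(1476, 2) * [205, 154]

# Resize the raw SEM frames to 205 x 154 (width x height).
resized = np.stack([cv2.resize(img, (205, 154)) for img in images])

# 80 % training / 20 % testing split of the database.
X_train, X_test, y_train, y_test = train_test_split(resized, labels,
                                                    test_size=0.2)

model = build_tip_regressor()                # network sketched in Sec. II-A
model.compile(optimizer="adam", loss="mse")  # MSE loss of eq. (1)
model.fit(X_train[..., None], y_train,       # add the channel dimension
          validation_data=(X_test[..., None], y_test))
```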
