IROS 2019 International Conference Proceedings, paper 0372
Deep Learning of Proprioceptive Models for Robotic Force Estimation

Erik Berger(1), Daniel Eger Passos(2), Steve Grehl(2), Heni Ben Amor(3), Bernhard Jung(2)

Abstract— Many robotic tasks require fast and accurate force sensing capabilities to ensure adaptive behavior execution. While dedicated force/torque (FT) sensors are a common option, such devices induce extra costs, need an additional power supply, and add weight to otherwise lightweight robotic systems. This paper presents a machine learning approach for estimating external forces acting on a robot based on common internal sensors only. In the training phase, a behavior-specific proprioceptive model is learned as a compact representation of the expected proprioceptive feedback during task execution. First, the proprioceptive sensors relevant for the given behavior are identified using information-theoretic measures. Then, the proprioceptive model is learned using deep learning techniques. During behavior execution, the proprioceptive model is applied to actual sensor readings for the estimation of external forces. Experiments performed with the UR5 robot demonstrate the ability for fast and accurate force estimation, even in situations where a dedicated commercial FT sensor is not applicable.

I. INTRODUCTION

Robot platforms which assist or replace humans in physically demanding tasks need precise force sensing capabilities for controlled and safe interactions with their environment [1]. A common way to enable such functionality is by monitoring the interaction with the environment directly via dedicated force sensors. Major drawbacks of such special-purpose sensors are the increased costs and reduced payload of the robot platform. To overcome these limiting factors, one could implement an accurate analytical model which takes into account the kinematic structure, mass distribution, and dynamics of the controlled robot system [2], [3]. The required amount of expert knowledge, as well as the time investment needed for customizing such a model for complex nonlinear systems, is a major limitation of analytical approaches. In contrast to that, humans do not rely on such precise mathematical models and instead learn to approximate highly nonlinear system dynamics from prior experience. More precisely, the sense of self, or proprioception [4], allows humans to learn and adapt a skill by becoming familiar with the particular task. Proprioception refers to the sense of the relative position of one's own body parts and the strength of effort being employed in movement [5]. In particular, experience about a wide variety of proprioceptive receptors, which are also referred to as proprioceptors, is generalized with regard to the actual state of the body. For example, [...]

The conditional mutual information (CMI) I(y; x_i^j | X) determines the information shared between the fill level y and the i-th proprioceptor x_i which is not already contained in the selection X. Furthermore, the index j defines multiple possible time lags, which enables the recognition of time-delayed dependencies. In particular, the CMI quantifies the amount of information shared between the processes A and B when a third process C is known:

I(A;B|C) = H(A,C) + H(B,C) − H(A,B,C) − H(C),   (2)

where H denotes the Shannon entropy. On the other hand, the multivariate mutual information (MMI) I(y; x_i^j; X) quantifies the amount of information gained compared to the already contained information:

I(A;B;C) = H(A) + H(B) − H(A,B) − I(A;B|C),   (3)

where positive values indicate a predominant portion of redundancy, while negative values imply synergistic effects. The corresponding iterative selection algorithm is constructed as follows. In each iteration, the proprioceptor with the highest CMI (or MMI) is added to the condition C by merging it with the already contained proprioceptors. The resulting condition is equivalent to the intersection of the corresponding sensor streams. To give a simple example, two sensors (1,0,1,1,0) and (1,1,1,0,0), which contain the four state patterns (1,1), (0,1), (1,0), (0,0), are merged to the condition (1,2,1,3,4). By repeating this process, the information in C about A grows iteratively and therefore approaches the maximum of shared information with the target.
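The merging-based selection procedure just described can be sketched in plain Python for discrete sensor streams. This is an illustrative reimplementation, not the authors' code; the names `entropy`, `cmi`, `merge`, and `select` are chosen here, and time lags are omitted for brevity.

```python
from collections import Counter
from math import log2

def entropy(*streams):
    """Shannon entropy of the joint distribution of one or more
    equal-length discrete streams."""
    joint = list(zip(*streams))
    n = len(joint)
    return -sum(c / n * log2(c / n) for c in Counter(joint).values())

def cmi(a, b, c):
    """Eq. (2): I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)."""
    return entropy(a, c) + entropy(b, c) - entropy(a, b, c) - entropy(c)

def merge(x, y):
    """Merge two streams into one by relabeling their joint state
    patterns, mirroring the intersection example in the text."""
    patterns = {}
    return [patterns.setdefault(p, len(patterns) + 1) for p in zip(x, y)]

def select(target, sensors, eps=1e-9):
    """Greedy CMI-based selection: repeatedly add the sensor sharing the
    most additional information with the target given the condition."""
    order, condition = [], [0] * len(target)  # constant = empty selection
    remaining = dict(enumerate(sensors))
    while remaining:
        i, gain = max(((i, cmi(target, x, condition))
                       for i, x in remaining.items()), key=lambda t: t[1])
        if gain <= eps:  # no remaining sensor adds information: stop
            break
        order.append(i)
        condition = merge(condition, remaining.pop(i))
    return order
```

For the two example sensors above, `merge([1,0,1,1,0], [1,1,1,0,0])` yields `[1, 2, 1, 3, 4]`, matching the condition in the text.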
The procedure ends when no additional information is contained in the remaining proprioceptors. The sequence of proprioceptors contained in C therefore reflects their relevance with respect to the target, e.g., the fill level.

Fig. 4: A FT150 is attached to the end effector of the UR5 robotic arm to measure the three-dimensional forces f = (f1, f2, f3) and the corresponding overall strength |f|. The recorded training data X is concatenated, where 0 s to 75 s represent empty, 75 s to 150 s half-full, and 150 s to 225 s full lifting behaviors. The force measurements are disturbed by the inertia of the water and the acceleration of the robot. Hence, the fill level is not evident from these measurements, and the FT150 is not sufficient to solve the corresponding classification task.

Fig. 5: Utilizing the proposed iterative proprioceptor selection procedure, CMI and MMI are applied to the training data. The CMI selection grows faster and requires fewer proprioceptors X to obtain 100% of the shared information with the fill level y. This is due to the fact that MMI relies on the relation between redundancy and synergy, while CMI focuses on the overall information. In practice, we noticed that proprioceptors selected with MMI contain less redundancy, while proprioceptors selected with CMI maximize the shared information. For this reason, the selection procedure usually provides better results when utilizing CMI rather than MMI.

Then, the CMI and a similar MMI algorithm are applied for different time delays (0 to 10). Figure 5 shows the increasing ratio of shared information to Shannon entropy, I(X;y)/H(y) (1). Both measures require ten proprioceptors in the subset X to gather about 99% of the shared information with the fill level y. As expected, due to ignoring the amount of redundancy, the information gain when utilizing CMI grows faster than that of MMI and requires only 15 instead of 17 proprioceptors to obtain 100% of the shared information. Consequently, fewer proprioceptors are required to completely obtain all information about the actual state of the fill level. This makes CMI the method of choice for the proposed task. Subsequently, the ten proprioceptors with the highest combined CMI, X_CMI^10, are utilized for training, while V_CMI^10 is used for validation purposes.

C. Proprioceptive Model Learning

The proprioceptive model is implemented by a deep neural network classifier which is trained on the previously selected subset of proprioceptors X_CMI^10. More precisely, a neural network architecture is applied which (1) implements a temporally indefinite memory and (2) provides a categorical probability distribution as output.

The first is realized by implementing Long Short-Term Memory (LSTM) layers [10]. In contrast to classical recurrent architectures, these layers have the ability to remember information for an indefinite delay. Each repetition of the lifting behavior is represented by time series data of 625 equidistant measurements. For each time step, the classifier provides an estimate of the station's fill level. Here, the usage of LSTM layers allows remembering the past proprioception for the complete sequence of time series data. As a result, the classifier becomes more confident with each additional measurement of the behavior execution. This further has the advantage that spontaneous noise is filtered automatically.

Categorical output is obtained through a softmax activation function [11] inside the output neurons of the corresponding network. This function squashes the r inputs z_1, ..., z_r to r outputs σ_1, ..., σ_r in the range (0, 1):

σ_j = f_softmax(z_j) = e^{z_j} / Σ_{i=1}^{r} e^{z_i},  j = 1, ..., r.   (4)

In contrast to the usage of a sigmoidal or linear activation function, the softmax outputs sum to one, Σ_{i=1}^{r} σ_i = 1, and are therefore equivalent to a categorical probability distribution. To provide a probability-based output, the fill levels y ∈ {0 ml, 125 ml, 250 ml} are categorized into the classes y ∈ {(1,0,0), (0,1,0), (0,0,1)}.
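As a minimal illustration of Eq. (4), a softmax over the three class scores might look as follows. The max-shift for numerical stability is a standard trick assumed here, not mentioned in the text:

```python
from math import exp

def softmax(z):
    """Eq. (4): sigma_j = e^{z_j} / sum_i e^{z_i}.
    Shifting by max(z) avoids overflow without changing the result."""
    m = max(z)
    e = [exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]
```

The outputs always sum to one, so the three output neurons can be read directly as a categorical probability distribution over the fill-level classes.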
The overall LSTM network architecture then becomes:

• Input layer: for each time step, the selected proprioceptors X_CMI^10 are utilized as input neurons. These ten neurons are fully connected to the cell inputs and gate units contained in the hidden layer, which results in 400 connections.

• Hidden layer: there is one hidden layer, which is composed of ten LSTM blocks. In turn, each block contains its gate units and exactly one cell. The cell outputs are transmitted to the output layer, but are also fed back to all cell inputs and gate units, which results in 430 additional connections.

• Output layer: the number of output neurons is equivalent to the dimensionality of the fill-level classes. Hence, three output neurons are used as an estimate of the actual class. Finally, these neurons utilize a softmax activation function and therefore return a categorical probability distribution.

Fig. 6: The iteratively selected proprioceptors X_CMI^10 contain 99% of the information about the fill level y, and therefore the LSTM accurately estimates the corresponding class. The remaining proprioceptors contain only minor information about y and result in frequent classification errors. Utilizing all proprioceptors X requires the network to extract the relevant correlations itself. The resulting estimates are similar to the usage of X_CMI^10 but require considerably more computational effort.

The resulting network consists of 860 connections and is referred to as LSTM_CMI. Furthermore, each connection contains a bias, which doubles the number of weighted connections to 1720. Due to this large number of connections, a visual representation of LSTM_CMI is omitted.

Training the proposed LSTM_CMI requires a back-propagation process which adapts the connection weights with regard to the training data. More precisely, the 45 behavior examples X_CMI^10 are separated into sequences with a length of 625 equidistant measurements. Each sequence is iteratively fed into the input layer of LSTM_CMI, where the softmax function returns a discrete set of 625 × 3 elements. This set describes the categorical probability distribution over the different fill levels and is evaluated by utilizing the Cross-Entropy Error (CEE). Back-propagation is repeated for various epochs and is stopped when overfitting occurs (early stopping) or the classification accuracy is sufficient. The class with the highest probability is utilized as the generated estimate of our model. An estimate which is not equivalent to the correct class contained in y is interpreted as a classification error.

The percentage of training errors with regard to the learning epoch is shown in Figure 6. Here, the ten most beneficial proprioceptors X_CMI^10 are compared to all sensors X and to the less beneficial remaining proprioceptors. As can be seen, the proposed selection approach outperforms the usage of the less beneficial sensors. Similar results are achieved by the usage of all sensors, but the increased number of input neurons also requires considerably more computational effort. Another drawback of utilizing all sensors is shown in Figure 7. After 300 epochs, the most beneficial proprioceptors V_CMI^10 accurately estimate the fill level contained in the validation data. This outperforms the usage of the less beneficial remaining proprioceptors. Furthermore, using all proprioceptors V suffers from spurious correlations which are contained in the training data and fail to generalize to accurate estimates for the validation data (Figure 7). Here, the validation data V is utilized to monitor the model's generalization ability for arbitrary inputs. As stated above, the position and fill level (100 ml) of V are not contained in the training data X. Hence, an accurate classifier should assign the highest probability to the class equivalent to the closest fill level, (0,1,0) = 125 ml. The classification errors for all sensors X reflect a major problem of deep learning with few data points. More precisely, the network utilizes dependencies between the proprioceptors and the fill level which are only correct within the limited size of the training data.
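The per-step CEE evaluation and the early-stopping criterion used during training can be sketched as follows. `cross_entropy` and `early_stop` are illustrative helpers; the patience-based stopping rule is one common variant and not necessarily the authors' exact criterion.

```python
from math import log

def cross_entropy(pred, target):
    """CEE for one time step: -sum_k t_k * log(p_k), one-hot target.
    A small epsilon guards against log(0)."""
    eps = 1e-12
    return -sum(t * log(p + eps) for p, t in zip(pred, target))

def early_stop(val_errors, patience=3):
    """Stop when the validation error has not improved for
    `patience` consecutive epochs (a sign of overfitting)."""
    if len(val_errors) <= patience:
        return False
    best = min(val_errors[:-patience])
    return min(val_errors[-patience:]) >= best
```

In a training loop, the CEE would be summed over all 625 steps of a sequence, and `early_stop` checked once per epoch on the validation error.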
The corresponding spurious correlations are not contained in V, resulting in wrong estimates. The classification error is even higher than for the less beneficial remaining proprioceptors, which contain only less relevant information and also result in a high classification error. In contrast to that, V_CMI^10 achieves good results after 300 epochs and the most accurate estimate after 700 epochs.

III. EXPERIMENTS

Different experiments have been conducted to validate the accuracy and applicability of the proposed deep learning proprioceptive model. To this end, the previously introduced classifier LSTM_CMI was utilized.

A. Runtime Evaluation

Each behavior execution, as shown in Figure 8, results in a sequence containing 625 proprioceptive measurements of time series data. This sequence is used as the network's input, which returns a categorical probability distribution. Due to the temporal data integration capabilities of LSTMs, the classifier becomes more confident about the fill level with each additional input. Figure 9 shows this fact for three different sequences contained in the training data X_CMI^10 and one sequence of the validation data V_CMI^10. A high-certainty decision regarding the fill level is typically achieved after less than half a second. Hence, the lifting behavior does not need to be finished to generate an accurate decision. The most probable class is used to adapt the extraction process. For the proposed task, three classes are sufficient to enable the robot to distinguish between a correct, a partially successful, and an incorrect extraction process.

Fig. 8: One execution of the examined behavior. The UR5 robotic arm lifts the self-contained water extraction station after gathering an unknown amount of liquid. Here, the proprioceptive model is utilized to estimate the fill level after five seconds of the behavior execution.

Fig. 9: The LSTM_CMI estimates the fill level by transforming each of the 625 inputs to a categorical probability distribution. Here, the output generated for four behavior executions with varying fill levels (0 ml, 125 ml, 250 ml) is shown. The classifier becomes more confident with each additional input, and a correct decision can usually be made after half a second. As shown for the validation sequence, LSTM_CMI also generalizes adequate estimates for arbitrary inputs.

More precisely, the robot can utilize the proprioceptive model to implement a set of reaction rules:

• 0 ml = (1,0,0): in case of an incorrect extraction, it is assumed that the station was not in contact with water, and the robot therefore has to adapt its lifting position.

• 125 ml = (0,1,0): a partially filled station indicates that an adequate position was selected, but the process needs to be repeated for a longer period.

• 250 ml = (0,0,1): in case of a successful extraction process, the station is returned to its docking position.

At runtime, the measured proprioception is preprocessed in the same manner as the training data and subsequently processed by the classifier LSTM_CMI. Here, the network's processing time for one runtime measurement is 1.2 ms with a standard deviation of 0.2 ms. This is far below the 8 ms cycle provided by the interface of the UR5 robot and ensures the real-time applicability of the presented approach.

B. LSTM Advantages

The advantages of LSTM over a classical Recurrent Neural Network (RNN) are demonstrated for the given classification task. To ensure comparability, this RNN is constructed with a structure similar to LSTM_CMI. In particular, the input and output layers are equivalent to those contained in LSTM_CMI. Furthermore, the network contains two hidden layers which are fully connected with each other. The first hidden layer consists of ten and the second of five neurons, where all neurons utilize a sigmoid activation function. The last ten outputs of the network are then fed back as a recurrent input to the network itself.
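A single forward step of such a recurrent baseline might be sketched as follows. The layer sizes (10 inputs, hidden layers of 10 and 5 sigmoid neurons, 3 softmax outputs) follow the text, but the feedback wiring and the random placeholder weights are assumptions of this sketch, not the authors' REC_CMI implementation.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

class SimpleRNN:
    """Minimal Elman-style recurrent forward pass (sketch only, untrained):
    the previous first-hidden-layer activations are concatenated with the
    current input at every step."""
    def __init__(self, n_in=10, n_h1=10, n_h2=5, n_out=3, seed=0):
        rnd = random.Random(seed).uniform
        self.W1 = [[rnd(-.1, .1) for _ in range(n_in + n_h1)] for _ in range(n_h1)]
        self.W2 = [[rnd(-.1, .1) for _ in range(n_h1)] for _ in range(n_h2)]
        self.W3 = [[rnd(-.1, .1) for _ in range(n_h2)] for _ in range(n_out)]
        self.h1 = [0.0] * n_h1  # recurrent state fed back each step

    def step(self, x):
        z = list(x) + self.h1   # input plus previous hidden activations
        self.h1 = [sigmoid(sum(w * v for w, v in zip(row, z))) for row in self.W1]
        h2 = [sigmoid(sum(w * v for w, v in zip(row, self.h1))) for row in self.W2]
        out = [sum(w * v for w, v in zip(row, h2)) for row in self.W3]
        m = max(out)            # softmax output layer
        e = [math.exp(o - m) for o in out]
        s = sum(e)
        return [v / s for v in e]
```

Back-propagating through many such steps multiplies sigmoid derivatives, which is exactly the vanishing-gradient issue the comparison below attributes the RNN's weaker accuracy to.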
This enables the network to remember its internal activation, which corresponds to the received proprioceptive measurements. The outputs of the second hidden layer are transmitted to the output layer. The resulting network contains an overall of 2830 weights and is referred to as REC_CMI. Similar to LSTM_CMI, back-propagation is utilized to adapt the connection weights by an offline learning procedure on the training data X_CMI^10. This process is repeated from ten initial weight configurations for a maximum of 1000 epochs. The weight configuration which achieves the highest classification accuracy is then utilized as the final REC_CMI.

With regard to the training data X_CMI^10, REC_CMI results in 32.45% wrongly assigned classes, while LSTM_CMI has an overall classification error of 2.53%. The less accurate results of REC_CMI can be explained by the vanishing gradient problem. In contrast to that, the usage of LSTM blocks almost eliminates the vanishing gradient problem and enables LSTM_CMI to remember the proprioception for the complete execution time frame of the behavior.

C. Proprioceptive Information

As mentioned in Section II-B, the proprioception X_CMI^10 shares more than 99% of its information with the fill level y. This raises the question of how much information is required to learn an accurate network. Hence, the learning curve when using a different amount of information, and consequently a different number of proprioceptors, is evaluated. Figure 10 illustrates the mean CEE (blue curve) of the softmax layer and the corresponding standard deviation (gray area).

Fig. 10: The number of input proprioceptors influences the mean learning curve (blue) and the standard deviation (gray area) of the LSTM. Left: the proprioceptors share only 63% of the information about the fill level, and therefore the learning curve converges early. Middle: an adequate classifier is learned within a thousand epochs when containing more than 99% of the information. Right: Sh
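The shared-information ratio I(X;y)/H(y) that Figures 5 and 10 are parameterized by can be computed for discrete streams as follows (an illustrative sketch, not the authors' code; a ratio of 1.0 means the selection captures all information about the target).

```python
from collections import Counter
from math import log2

def entropy(*streams):
    """Shannon entropy of the joint distribution of equal-length
    discrete streams."""
    joint = list(zip(*streams))
    n = len(joint)
    return -sum(c / n * log2(c / n) for c in Counter(joint).values())

def info_ratio(selection, target):
    """I(X;y)/H(y): fraction of the target's entropy captured by the
    selected proprioceptor streams."""
    mi = entropy(target) + entropy(*selection) - entropy(target, *selection)
    return mi / entropy(target)
```

A perfectly informative selection yields 1.0; an independent one yields 0.0, matching the 63% vs. 99% regimes compared in Figure 10.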
