IROS2019国际学术会议论文集 1438

上传人：我*** IP属地：北京上传时间：2020-04-11 格式：PDF 页数：7 大小：303.87KB 积分：12 举报 版权申诉

已阅读5页，还剩2页未读，继续免费阅读

版权说明：本文档由用户提供并上传，收益归属内容提供方，若内容存在侵权，请进行举报或认领

文档简介

Performance Guarantees for Receding Horizon Search with Terminal Cost Benjamin Biggs Daniel J Stilwell Harun Yetkin James McMahon Abstract We present a novel method of using terminal costs in the construction of a receding horizon search path We prove that the proposed method of constructing search paths provides a theoretical lower bound on the performance of the search path Our result can be interpreted as ensuring that the receding horizon path performs no worse in expectation than a given sub optimal search path This result is especially practical for subsea applications where due to use of side scan sonar in search applications search paths typically consist of parallel straight lines Thus for subsea search applications our approach ensures that expected performance is no worse than the usual subsea search path and it might be much better We demonstrate the effi cacy of the proposed method by planning search paths in simulation using real world data that was acquired by an autonomous underwater vehicle during a subsea survey of Boston Harbor I INTRODUCTION We consider the challenge of planning for search mis sions where the length of an optimal search path is too long to be computed in real time In this paper optimal paths maximize a general deterministic value function For search applications the performance of a search mission is stochastic however the anticipated performance used for planning search missions is deterministic Our work is inspired by subsea search applications and we present numerical simulations of our approach using real world data acquired by an autonomous underwater vehicle The problem of computing search paths through a gridded space with a discrete set of possible actions has been shown to be NP hard 1 Throughout this paper we do not address exhaustive search as is normally a primary goal for coverage path planning algorithms 2 Rather we seek solutions for time limited or resource limited search scenarios where not all locations are necessarily searched We also consider environmental variability That is we seek search paths that maximize search effectiveness given that a search sensor may perform well in some areas and poorly in others The main contribution of this paper is a method of incorporating terminal costs into a receding horizon search path planner such that the resultant receding horizon search paths are theoretically guaranteed to perform at least as well as any path that is already available For a subsea search mission a typical search path is a sequence of parallel straight lines This work was supported by the Offi ce of Naval Research via grants N00014 16 1 2092 N00014 18 1 2627 and N00014 19 1 2194 James McMahon is with the US Naval Research Laboratory Code 7130 Washington D C USA Harun Yetkin is with the Department of Mechatronics Engineering Bart n University Turkey Benjamin Biggs and Daniel Stilwell are with the Bradley Department of Electrical and Computer Engineering Virginia Tech Blacksburg VA USA called a lawn mower or mowing the lawn path In this case our approach is guaranteed to perform in expectation no worse than a mowing the lawn path and might perform much better Receding horizon control otherwise known as model predictive control solves for an optimal control law within a fi nite planning horizon and executes a portion of that optimal solution out to a fi nite execution horizon that is shorter than the planning horizon At the end of the execution horizon a new optimal solution is computed and the process is repeated until the mission is completed i e until the allotted time or resources are exhausted or the desired outcome is achieved 3 5 One of the challenges for path planning using receding horizon RH methods occurs when an agent is unable to effectively construct desirable paths because of local extrema in the search space that arise due to myopic planning horizons 6 Yoo et al 7 propose a method to address the defi ciencies of RH planning by extending the planning horizon until the expected reward satisfi es a given threshold such that the resultant path jumps from thoroughly explored areas to unexplored areas that possess high reward In 8 10 an approximation of the minimum feasible arrival time to a goal location is used as a terminal cost to help the controller generate feasible paths in complex environ ments and avoid becoming trapped in concave obstacles The approaches presented in 7 9 and 10 are shown to be effective using numerical experiments however no formal performance guarantee is provided We note that it is shown in 10 that a terminal cost is a bound on the cost of the receding horizon trajectory plus the terminal cost at the next iteration Our approach which builds upon the same insight provides a guaranteed lower bound on expected search performance for an entire mission The receding horizon approach described in 11 is similar to ours except that in 11 a Lyapunov like constraint is appended to the optimization problem and at each location in the workspace there must be a next step for which the Lyapunov like function decreases We achieve a similar result by imposing a terminal cost instead of a constraint The terminal cost is required to converge to zero over the sequence of RH optimization solutions but it is not required to converge monotonically It is proven in 11 that a terminal constraint can be used to guarantee convergence to a desired state In contrast the method proposed in this paper uses sub optimal paths to determine terminal costs for receding horizon search paths It is proven that these sub optimal paths provide a lower bound to the performance of the corresponding RH path 2019 IEEE RSJ International Conference on Intelligent Robots and Systems IROS Macau China November 4 8 2019 978 1 7281 4003 2 19 31 00 2019 IEEE6362 We note that the performance guarantees provided in this paper do not imply optimality they simply guarantee a lower bound on performance In other words the method ensures that a receding horizon path does not behave poorly A limitation of our approach is the existence of suitable terminal costs For subsea search applications terminal costs can be derived from typical sub optimal search paths This paper is organized in the following manner Formal defi nitions of paths and value functions are provided in Section II The use of terminal costs in the construction of a receding horizon search path and performance guarantees is discussed in Section III We demonstrate in simulation using real world data that the proposed method not only provides a lower bound on the performance of the receding horizon search path but that the search paths produced using the proposed method perform signifi cantly better than this theoretical lower bound in most cases In Section IV we also demonstrate that search paths constructed using the proposed method are competitive with receding horizon search paths generated with much longer planning horizons II PATHS AND VALUE FUNCTIONS The mission area is a region G R2that is partitioned into q non intersecting cells hi G such that Pq i 1hi G An n length path n p0 p0 p1 pn is a feasible sequence of locations composed of pi G that starts at p0 A sequence is feasible if it can be traversed by a mobile search agent The set of all feasible n length paths that start at p0is denoted n p0 An infi nitely long path is denoted p0 p0 p1 p0 The value of an n length path n p0 is J n p0 n X k 1 g p0 pk 1 where g is a non negative function that returns the value of the information gained by visiting pk For search missions the value of visiting a location pkat time tkdepends on how many times pkhas been visited previously That is given pk G g p0 pk returns the value of the information gained by visiting pkconditioned on all previously visited positions p0 p1 pk 1 We also use the notation g pk g p0 pk 2 and Jn p0 J n p0 3 The value of the optimal n length path beginning at p0is J n p0 max n n p0 Jn p0 4 and the optimal n length path beginning at p0is n p0 argmax n n p0 J n p0 5 For the case of infi nite horizon paths the value of the optimal path is J p0 max p0 J p0 6 and the optimal path is p0 argmax p0 J p0 7 Suppose that the total mission length is l and that l n We can partition the value function into two parts Jl l p0 l X k 1 g pk n X k 1 g pk l X i n 1 g pi 8 where we refer to the right most summation in 8 as the cost to go For the case that an optimal n length path can be computed but an l length path is much too long to be computed using reasonable resources the cost to go might be the tail that is ignored when computing a short horizon n length path Throughout we consider both the case that l is fi nite and the case of infi nitely long paths where l Defi nition 2 1 A lower bound for the cost to go in 8 is the function Bl pn that satisfi es Bl pn l X i n 1 g pi 9 for all n l Bl pl 0 and for the infi nite case where l Bl pk 0 as k A Receding Horizon Paths Receding horizon RH paths are constructed from a sequence of n length optimal paths For this discussion we presume that a new n length optimal path is computed after traversing a single step along the previously computed n length optimal path The k length RH path is denoted Pk p0 p1 pk Because the RH path is composed of a sequence of single steps that come from optimal n length paths pj pj 1 Pk are the fi rst two elements of the optimal path n pj for all j 1 k To clarify the construction of a RH path proceeds as follows Beginning at the starting position p0 an optimal path n p0 is computed with the planning horizon n The search agent traverses the fi rst step of the path n p0 Thus the fi rst 2 elements of the RH path are P1 p0 p1 n p0 The optimal path n p1 is computed and the process repeats until the mission is completed The value of a receding horizon path Pkis J Pk k X i 1 g p0 pi 10 where g is the non negative function in 1 6363 III TERMINALCOST It is well known that when a sequence of optimal paths is used to naively generate an RH path as in Section II A the RH path does not retain the optimality property of the paths that are used to construct it However we show that if the corresponding value function includes a terminal cost that is a lower bound for the cost to go see 9 then the lower bound for the cost to go is also a lower bound for the cost of the RH path In Corollary 3 2 we show that if a naive suboptimal path for the entire mission is available as in the case in subsea search then the suboptimal path can be used to generate a lower bound for the cost to go In which case the RH path is guaranteed to perform no worse than the naive path Our results are similar to those that impose a terminal cost to generate a stabilizing receding horizon control law such as in 12 14 and 15 The value of an n length path n pk n pk with a terminal cost is V n pk k n X j k 1 g pj Bl pk n 11 where Bl is defi ned in 9 and the optimal n length path is n pk argmax n n pk V n pk 12 We also use the notation V n pk V n pk 13 Similarly we write V n pk 1 Vn n pk 1 where n pk 1 is the optimal n length path that maximizes Vn 1 pk 1 k n X j k 2 g pj Bl pk n 14 In Proposition 3 1 we state a condition under which the value of the RH path is always no less than that of the terminal cost In Corollary 3 2 we show that the hypothesis required in Proposition 3 1 is always satisfi ed if the terminal cost is constructed from the cost to go of an existing naive suboptimal path Proposition 3 1 Suppose the l length receding horizon path Plis composed of 1 step sequences along a sequence of n length optimal paths that maximize 11 For every n length optimal path n pk pk p k 1 p k n 15 suppose that there exists a location pk nthat is feasible from p k n 1 for which g pk n Bl pk n Bl p k n 1 16 then the RH path satisfi es J Pl Bl p0 17 Proof For any location pkalong the RH path the path that maximizes 11 is n pk pk p k 1 p k n 18 and the n 1 length path that starts at p k 1 and maximizes 14 is n 1 p k 1 p k 1 p k 2 p k n 19 Suppose n pk is an n length path composed of n 1 pk and a location pk n that is feasible from the fi nal location p k n 1 along n 1 pk such that n pk n 1 pk pk n 20 The value of n pk is V n pk V n 1 pk Bl p k n 1 g pk n Bl pk n 21 By hypothesis g pk n Bl pk n Bl p k n 1 22 Therefore 21 may be rewritten as V n pk V n 1 pk 23 and because n pk is not optimal V n pk V n pk 24 such that V n pk V n 1 pk 25 From the principle of optimality V n pk g pk 1 V n 1 pk 1 26 Substituting the right hand side of 26 into the left hand side of 25 yields g pk 1 V n 1 pk 1 V n 1 pk 27 Subtracting V n 1 pk 1 from both sides of 27 gives g pk 1 V n 1 pk V n 1 pk 1 28 Evaluating 28 at different values of k from k 0 to k l 1 yields g p1 V n 1 p0 V n 1 p1 g p2 V n 1 p1 V n 1 p2 g pl V n 1 pl 1 V n 1 pl 29 such that summing the inequalities in 29 and simplifying yields l X j 1 g pj V n 1 p0 V n 1 pl 30 Because we consider a path of length l V n 1 pl evaluates to V n 1 pl 0 31 6364 We also recognize that Bl p0 represents a lower bound on the cost to go see 9 such that V n 1 p0 Bl p0 32 and 30 becomes l X j 1 g pj Bl p0 33 where Pl j 1g pj is the value of the l length receding horizon path Pl Therefore we conclude that J Pl Bl p0 34 The lower bound Bl pk on the cost to go see 9 could be defi ned as the value of traversing a naive path through the search space beginning from the position pk In the case of subsea search lawn mower paths perform well 16 and can be computed effi ciently This makes them well suited to determine a lower bound on the cost to go Corollary 3 2 Suppose that l k n pk n pk n pl is a given naive suboptial path and let the cost to go be defi ned Bl pk n l X j k n 1 g pj 35 Then there exists pk n 1feasible from pk n which satisfi es Bl pk n g pk n 1 Bl pk n 1 36 Proof From 35 we note that l X j k n 1 g pj g pk n 1 l X j k n 2 g pj 37 Given that Bl pk n 1 is defi ned by the right most summa tion in 37 we write l X j k n 1 g pj g pk n 1 Bl pk n 1 38 Substituting the left hand side of 35 into the left hand side of 38 yields Bl pk n g pk n 1 Bl pk n 1 39 which satisfi es Bl pk n g pk n 1 Bl pk n 1 40 IV NUMERICALEXPERIMENTS A Problem Formulation The numerical experiments presented in this section ad dress the problem of planning search paths in an uncertain subsea environment as presented in 17 We consider the 61 61 gridded search space shown in Figure 1 We note that length scales have been normalized so that each grid element has unit length and width Each cell hiin the search space contains an unknown number of targets x X Each cell also has associated environment type e E We assume that the environment within the cell is from a fi nite set of possible environment types e1 e2 em A probability distribution on the environment type in cell hiis known a priori The number of targets contained within a cell and the environment type within the cell are assumed to be independent of all other cells We assume the search sensor performance is dependent on the environment e within the cell That is the environment type e of the ithcell hi directly infl uences the uncertainty of the sensor measurements acquired from hi Upon visiting a cell hi the search agent acquires a noisy measurement y Y of the environment type e and z Z of the number of targets contained within the cell The anticipated value of searching a cell is determined through a decision theoretic cost function Specifi cally the anticipated value of searching a cell is the expected reduction in Bayes risk due to the measurements ziand yiacquired upon visiting a cell That is given the estimator d yi which minimizes the posterior expected loss of computing the environment type estimate and the estimator xi which minimizes the posterior expected loss of computing the estimate of the number of targets in cell higiven d y Bayes risk is the posterior expected loss of computing the estimate on the number of targets in cell hi The anticipated value of searching cell hiis B i i r i k 41 where iis the current risk before acquiring the measure ments z and y and is defi ned as i m X j 1 P ej E L x ej 42 r i k is the anticipated risk given k visits to the ithcell hi and is defi ned as r i k X zi 1 X yi 1 X zi k X yi k P zi yi E L x zi zi d yi 43 where zi zi 1 zi k is the set of independent measure ments of the number of targets in cell hi yi yi 1 yi k is the set of independent measurements of the environmental 6365 type in cell hi and P zi yi X x X ej P x z1 ej P x zk ej P y1 ej P yk ej P x P ej 44 is the joint probability of sensing zi yigiven the prior environmental distribution P ej for cell hiand the prior distribution on the number of targets P x in cell hi The actual value of visiting cell hiis the actual risk reduction given measurements y and z That is given i is the current risk before acquiring the measurements z and y and iis the current risk after acquiring measurements z and y then the actual risk reduction for visiting cell hiis B i i i 45 0102030405060 0 10 20 30 40 50 60 0 2 0 3 0 4 0 5 0 6 0 7 0 8 Fig 1 The 61 61 gridded subsea search environment with cell color corresponding to the anticipated risk reduction for a single visit to each cell These numerical experiments consider that the search agent is an autonomous underwater vehicle AUV carrying side scan sonar Side scan sonar operations require the vehi cle to traverse straight line paths in order to acquire useful data when performing synthetic aperture sonar processing 18 At each location the AUV may move forward turn left or turn right We consider that each action contributes to the total length of the search path However we presume that sensor data acquired during a turning maneuver does not contain useful information and thus the non negative function g in 1 used to determine the anticipated value of traversing a path is g pk B i for pk hiwith no turning maneuver 0else 46 Similarly after obtaining measurements y and z of the environment type within cell hiand the number of targets contained within hiwhen trav

人人文库> 全部分类> 教育资料 > 课件下载

温馨提示

1. 本站所有资源如无特殊说明，都需要本地电脑安装OFFICE2007和PDF阅读器。图纸软件为CAD,CAXA,PROE,UG,SolidWorks等.压缩文件请下载最新的WinRAR软件解压。
2. 本站的文档不包含任何第三方提供的附件图纸等，如果需要附件，请联系上传者。文件的所有权益归上传用户所有。
3. 本站RAR压缩包中若带图纸，网页内容里面会有图纸预览，若没有图纸预览就没有图纸。
4. 未经权益所有人同意不得将文件中的内容挪作商业或盈利用途。
5. 人人文库网仅提供信息存储空间，仅对用户上传内容的表现方式做保护处理，对用户上传分享的文档内容本身不做任何修改或编辑，并不能对任何下载内容负责。
6. 下载文件中如有侵权或不适当内容，请与我们联系，我们立即纠正。
7. 本站不保证下载资源的准确性、安全性和完整性, 同时也不承担用户因使用这些下载资源对自己和他人造成任何形式的伤害或损失。

IROS2019国际学术会议论文集 1438

文档简介

温馨提示

最新文档

评论

IROS2019国际学术会议论文集 1438

文档简介

温馨提示

最新文档

评论

相关文档