{"title": "Near-Optimal Edge Evaluation in Explicit Generalized Binomial Graphs", "book": "Advances in Neural Information Processing Systems", "page_first": 4631, "page_last": 4641, "abstract": "Robotic motion-planning problems, such as a UAV flying fast in a partially-known environment or a robot arm moving around cluttered objects, require finding collision-free paths quickly. Typically, this is solved by constructing a graph, where vertices represent robot configurations and edges represent potentially valid movements of the robot between theses configurations. The main computational bottlenecks are expensive edge evaluations to check for collisions. State of the art planning methods do not reason about the optimal sequence of edges to evaluate in order to find a collision free path quickly. In this paper, we do so by drawing a novel equivalence between motion planning and the Bayesian active learning paradigm of decision region determination (DRD). Unfortunately, a straight application of ex- isting methods requires computation exponential in the number of edges in a graph. We present BISECT, an efficient and near-optimal algorithm to solve the DRD problem when edges are independent Bernoulli random variables. By leveraging this property, we are able to significantly reduce computational complexity from exponential to linear in the number of edges. We show that BISECT outperforms several state of the art algorithms on a spectrum of planning problems for mobile robots, manipulators, and real flight data collected from a full scale helicopter. 
Open-source code and details can be found here: https://github.com/sanjibac/matlab_learning_collision_checking", "full_text": "Near-Optimal Edge Evaluation in Explicit Generalized Binomial Graphs

Sanjiban Choudhury
The Robotics Institute
Carnegie Mellon University
sanjiban@cmu.edu

Shervin Javdani
The Robotics Institute
Carnegie Mellon University
sjavdani@cmu.edu

Siddhartha Srinivasa
The Robotics Institute
Carnegie Mellon University
siddh@cs.cmu.edu

Sebastian Scherer
The Robotics Institute
Carnegie Mellon University
basti@cs.cmu.edu

Abstract

Robotic motion-planning problems, such as a UAV flying fast in a partially-known environment or a robot arm moving around cluttered objects, require finding collision-free paths quickly. Typically, this is solved by constructing a graph, where vertices represent robot configurations and edges represent potentially valid movements of the robot between these configurations. The main computational bottlenecks are expensive edge evaluations to check for collisions. State of the art planning methods do not reason about the optimal sequence of edges to evaluate in order to find a collision-free path quickly. In this paper, we do so by drawing a novel equivalence between motion planning and the Bayesian active learning paradigm of decision region determination (DRD). Unfortunately, a straightforward application of existing methods requires computation exponential in the number of edges in a graph. We present BISECT, an efficient and near-optimal algorithm to solve the DRD problem when edges are independent Bernoulli random variables. By leveraging this property, we are able to significantly reduce computational complexity from exponential to linear in the number of edges. 
We show that BISECT outperforms several state of the art algorithms on a spectrum of planning problems for mobile robots, manipulators, and real flight data collected from a full-scale helicopter. Open-source code and details can be found here: https://github.com/sanjibac/matlab_learning_collision_checking

1 Introduction

Motion planning, the task of computing collision-free motions for a robotic system from a start to a goal configuration, has a rich and varied history [23]. Up until now, the bulk of the prominent research has focused on the development of tractable planning algorithms with provable worst-case performance guarantees such as computational complexity [3], probabilistic completeness [24] or asymptotic optimality [20]. In contrast, analysis of the expected performance of these algorithms on the real-world planning problems a robot encounters has received considerably less attention, primarily due to the lack of standardized datasets or robotic platforms. However, recent advances in affordable sensors and actuators have enabled mass deployment of robots that navigate, interact and collect real data. This motivates us to examine the following question: “How can we design planning algorithms that, subject to on-board computation constraints, maximize their expected performance on the actual distribution of problems that a robot encounters?”
This paper addresses a class of robotic motion planning problems where path evaluation is expensive. For example, in robot arm planning [12], evaluation requires expensive geometric intersection

31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.

Figure 1: The feasible path identification problem. (a) The explicit graph contains dynamically feasible maneuvers [27] for a UAV flying fast, with a set of candidate paths. The map shows the distribution of edge validity for the graph. 
(b) Given a distribution over edges, our algorithm checks an edge, marks it as invalid (red) or valid (green), and updates its belief. We continue until a feasible path is identified as free. We aim to minimize the number of expensive edge evaluations.

computations. In UAV path planning [9], evaluation must be done online with limited computational resources (Fig. 1).
State of the art planning algorithms [11] first compute a set of unevaluated paths quickly, and then evaluate them sequentially to find a valid path. Oftentimes, candidate paths share common edges. Hence, evaluation of a small number of edges can provide information about the validity of many candidate paths simultaneously. Methods that check paths sequentially, however, do not reason about these common edges.
This leads us naturally to the feasible path identification problem - given a library of candidate paths, identify a valid path while minimizing the cost of edge evaluations. We assume access to a prior distribution over edge validity, which encodes how obstacles are distributed in the environment (Fig. 1(a)). As we evaluate edges and observe outcomes, the uncertainty of a candidate path collapses. Our first key insight is that this problem is equivalent to decision region determination (DRD) [19, 5] - given a set of tests (edges), hypotheses (validity of edges), and regions (paths), the objective is to drive uncertainty into a single decision region. This linking enables us to leverage existing methods in Bayesian active learning for robotic motion planning.
Chen et al. [5] provide a method to solve this problem by maximizing an objective function that satisfies adaptive submodularity [15] - a natural diminishing returns property that endows greedy policies with near-optimality guarantees. 
Unfortunately, naively applying this algorithm requires O(2^E) computation to select an edge to evaluate, where E is the number of edges in all paths.
We define the Bern-DRD problem, which leverages additional structure in robotic motion planning by assuming edges are independent Bernoulli random variables1, and regions correspond to sets of edges evaluating to true. We propose Bernoulli Subregion Edge Cutting (BISECT), which provides a greedy policy to select candidate edges in O(E). We prove our surrogate objective also satisfies adaptive submodularity [15], and provides the same bounds as Chen et al. [5] while being more efficient to compute.
We make the following contributions:

1. We show a novel equivalence between feasible path identification and the DRD problem, linking motion planning to Bayesian active learning.

2. We develop BISECT, a near-optimal algorithm for the special case of Bernoulli tests, which selects tests in O(E) instead of O(2^E).

3. We demonstrate the efficacy of our algorithm on a spectrum of planning problems for mobile robots, manipulators, and real flight data collected from a full scale helicopter.

1Generally, edges in this graph are correlated, as edges in collision are likely to have neighbours in collision. Unfortunately, even measuring this correlation is challenging, especially in the high-dimensional non-linear configuration space of robot arms. 
Assuming independent edges is a common simplification [23, 25, 7, 2, 11]

2 Problem Formulation

2.1 Planning as Feasible Path Identification on Explicit Graphs

Let G = (V, E) be an explicit graph that consists of a set of vertices V and edges E. Given a pair of start and goal vertices, (v_s, v_g) ∈ V, a search algorithm computes a path ξ ⊆ E - a connected sequence of valid edges. To ascertain the validity of an edge, it invokes an evaluation function Eval : E → {0, 1}. We address applications where edge evaluation is expensive, i.e., the computational cost c(e) of computing Eval(e) is significantly higher than regular search operations2.
We define a world as an outcome vector o ∈ {0, 1}^|E| which assigns to each edge a boolean validity when evaluated, i.e. Eval(e) = o(e). We assume that the outcome vector is sampled from an independent Bernoulli distribution P(o), giving rise to a Generalized Binomial Graph (GBG) [13].
We make a second simplification to the problem - from that of search to that of identification. Instead of searching G online for a path, we frame the problem as identifying a valid path from a library of 'good' candidate paths Ξ = (ξ_1, ξ_2, . . . , ξ_m). The candidate set of paths Ξ is constructed offline, while being cognizant of P(o), and can be verified to ensure that all paths have acceptable solution quality when valid.3 Hence we care about completeness with respect to Ξ instead of G.
We wish to design an adaptive edge selector Select(o), which is a decision tree that operates on a world o, selects an edge for evaluation and branches on its outcome. 
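As a concrete illustration of the GBG model just described - independent Bernoulli edges, sampled worlds, and path (region) validity - here is a minimal sketch; the edge names and probabilities are hypothetical, not from the paper:

```python
import random

def region_prob(region, theta):
    # P(R) = product of theta_e over edges e in R: a path is valid iff all its edges are valid.
    p = 1.0
    for e in region:
        p *= theta[e]
    return p

def sample_world(theta, rng):
    # A world is an outcome vector o in {0, 1}^|E|, each edge drawn independently.
    return {e: 1 if rng.random() < p else 0 for e, p in theta.items()}

def region_valid(region, world):
    # Eval(e) = o(e); the path is feasible iff every edge evaluates to 1.
    return all(world[e] == 1 for e in region)

theta = {'e1': 0.9, 'e2': 0.5, 'e3': 0.8}   # hypothetical edge priors P(o(e) = 1)
path = ['e1', 'e2']                         # one candidate path from the library
print(region_prob(path, theta))             # -> 0.45
```

Evaluating a shared edge (here 'e1') updates the belief over every candidate path containing it, which is exactly the structure the identification problem exploits.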
The total cost of edge evaluation is c(Select(o)). Our objective is to minimize the cost required to find a valid path:

min E_{o∼P(o)} [c(Select(o))]  s.t.  ∀o, ∃ξ : ∏_{e∈ξ} o(e) = 1, ξ ⊆ Select(o)   (1)

2.2 Decision Region Determination with Independent Bernoulli Tests

We now define an equivalent problem - decision region determination with independent Bernoulli tests (Bern-DRD). Define a set of tests T = {1, . . . , n}, where the outcome of each test is a Bernoulli random variable X_t ∈ {0, 1}, P(X_t = x_t) = θ_t^{x_t}(1 − θ_t)^{1−x_t}. We define a set of hypotheses h ∈ H, where each is an outcome vector h ∈ {0, 1}^T mapping all tests t ∈ T to outcomes h(t). We define a set of regions {R_i}_{i=1}^m, each of which is a subset of tests R ⊆ T. A region is determined to be valid if all tests in that region evaluate to true, which has probability P(R) = ∏_{t∈R} P(X_t = 1).
If a set of tests A ⊆ T are performed, let the observed outcome vector be denoted by x_A ∈ {0, 1}^|A|. Let the version space H(x_A) be the set of hypotheses consistent with observation vector x_A, i.e. H(x_A) = {h ∈ H | ∀t ∈ A, h(t) = x_A(t)}.
We define a policy π as a mapping from observation vector x_A to tests. A policy terminates when it shows that at least one region is valid, or all regions are invalid. Let x_T ∈ {0, 1}^T be the ground truth - the outcome vector for all tests. Denote the observation vector of a policy π given ground truth x_T as x_A(π, x_T). The expected cost of a policy π is c(π) = E_{x_T}[c(x_A(π, x_T))] where c(x_A) is the cost of all tests t ∈ A. The objective is to compute a policy π* with minimum cost that ensures at least one region is valid, i.e.

π* ∈ arg min_π c(π)  s.t.  ∀x_T, ∃R_d : P(R_d | x_A(π, x_T)) = 1   (2)

Note that we can cast problem (1) to (2) by setting E = T and Ξ = {R_i}_{i=1}^m. That is, driving uncertainty into a region is equivalent to identification of a valid path (Fig. 2). This casting enables us to leverage efficient algorithms with near-optimality guarantees for motion planning.

2It is assumed that c(e) is modular and non-zero. It can scale with edge length.
3Refer to supplementary on various methods to construct a library of good candidate paths

Figure 2: Equivalence between the feasible path identification problem and Bern-DRD. A path ξ_i is equivalent to a region R_i over valid hypotheses (blue dots). Tests eliminate hypotheses and the algorithm terminates when uncertainty is pushed into a region (R_1) and the corresponding path (ξ_1) is determined to be valid.

3 Related Work

The computational bottleneck in motion planning varies with problem domain and that has led to a plethora of planning techniques [23]. When vertex expansions are a bottleneck, A* [17] is optimally efficient while techniques such as partial expansions [28] address graph searches with large branching factors. The problem class we examine, that of expensive edge evaluation, has inspired a variety of 'lazy' approaches. The Lazy Probabilistic Roadmap (PRM) algorithm [1] only evaluates edges on the shortest path while Fuzzy PRM [26] evaluates paths that minimize probability of collision. The Lazy Weighted A* (LWA*) algorithm [8] delays edge evaluation in A* search and is reflected in similar techniques for randomized search [14, 6]. 
An approach most similar in style to ours is the LazyShortestPath (LazySP) framework [11] which examines the problem of which edges to evaluate on the shortest path. Instead of finding the shortest path, our framework aims to efficiently identify a feasible path in a library of 'good' paths. Our framework is also similar to the Anytime Edge Evaluation (AEE*) framework [25] which deals with edge evaluation on a GBG. However, our framework terminates once a single feasible path is found, while AEE* continues to evaluate edges in order to minimize an expected cumulative sub-optimality bound. Similar to Choudhury et al. [7] and Burns and Brock [2], we leverage priors on the distribution of obstacles to make informed planning decisions.
We draw a novel connection between motion planning and optimal test selection, which has widespread application in medical diagnosis [21] and experiment design [4]. Optimizing the ideal metric, decision theoretic value of information [18], is known to be NP^PP complete [22]. For hypothesis identification (known as the Optimal Decision Tree (ODT) problem), Generalized Binary Search (GBS) [10] provides a near-optimal policy. For disjoint region identification (known as the Equivalence Class Determination (ECD) problem), EC2 [16] provides a near-optimal policy. When regions overlap (known as the Decision Region Determination (DRD) problem), HEC [19] provides a near-optimal policy. The DIRECT algorithm [5], a computationally more efficient alternative to HEC, forms the basis of our approach.

4 The Bernoulli Subregion Edge Cutting Algorithm

The DRD problem in general is addressed by the Decision Region Edge Cutting (DIRECT) [5] algorithm. The intuition behind the method is as follows - as tests are performed, hypotheses inconsistent with test outcomes are pruned away. Hence, tests should be incentivized to push the probability mass over hypotheses into any region as fast as possible. 
Chen et al. [5] derive a surrogate objective function that provides such an incentive by creating separate sub-problems for each region and combining them in a Noisy-OR fashion such that quickly solving any one sub-problem suffices. Importantly, this objective is adaptive submodular [15] - greedily maximizing such an objective results in a near-optimal policy.
We adapt the framework of DIRECT to address the Bern-DRD problem. We first provide a modification to the EC2 sub-problem objective which is simpler to compute when the distribution over hypotheses is non-uniform, while providing the same guarantees. Unfortunately, naively applying DIRECT requires O(2^T) computation per sub-problem. For the special case of independent Bernoulli tests, we present a more efficient Bernoulli Subregion Edge Cutting (BISECT) algorithm, which computes each subproblem in O(T) time. We provide a brief exposition deferring to the supplementary for detailed derivations.

4.1 A simple subproblem: One region versus all

Following Chen et al. [5], we define a 'one region versus all' subproblem, the solution of which helps address the Bern-DRD. Given a single region, the objective is to either push the version space to that region, or collapse it to a single hypothesis. We view a region R as a version space R^H ⊆ H consistent with its constituent tests. We define this subproblem over a set of disjoint subregions S_i. Let the hypotheses in the target region R^H be S_1. Every other hypothesis h ∈ R̄^H is defined as its own subregion S_i, i > 1, where R̄^H is the set of hypotheses for which the region is not valid. 
Determining which subregion is valid falls under the framework of Equivalence Class Determination (ECD), a special case of the DRD problem, and can be solved efficiently by the EC2 algorithm (Golovin et al. [16]). This objective defines a graph with nodes as subregions and edges between distinct subregions, where the weight of an edge is the product of probabilities of subregions. As tests are performed and outcomes are received, the version space shrinks, and probabilities of different subregions are driven to 0. This has the effect of decreasing the total weight of edges. Importantly, the problem is solved iff the weight of all edges is zero. The weight over the set of subregions is:

w[16]({S_i}) = ∑_{j≠k} P(S_j)P(S_k)   (3)

When hypotheses have uniform weight, this can be computed efficiently for the 'one region versus all' subproblem. Let P(S̄_1) = ∑_{i>1} P(S_i):

w[16]({S_i}) = P(S_1)P(S̄_1) + P(S̄_1)(P(S̄_1) − 1/|H|)   (4)

For a non-uniform prior however, this quantity is more difficult to compute. We modify this objective slightly, adding self-edges on subregions S_i, i > 1, enabling more efficient computation while still maintaining the same guarantees:

w_EC({S_i}) = P(S_1)(∑_{i≠1} P(S_i)) + (∑_{i≠1} P(S_i))(∑_{j≥1} P(S_j)) = P(S_1)P(S̄_1) + P(S̄_1)² = P(R̄^H)(P(R^H) + P(R̄^H))   (5)

For region R, let the relevant version space be H^R(x_A) = {h ∈ H | ∀t ∈ A ∩ R, h(t) = x_A(t)}. The set of all hypotheses in R^H consistent with relevant outcomes in x_A is given by R^H ∩ H^R(x_A). The terms P(R^H ∩ H^R(x_A)) and P(R̄^H ∩ H^R(x_A)) allow us to quantify the progress made on determining region validity. Naively computing these terms would require computing all hypotheses and assigning them to correct subregions, thus requiring a runtime of O(2^T). However, for the special case of Bernoulli tests, we can reduce this to O(T) as we can see from the expression

w_EC({S_i} ∩ H^R(x_A)) = (1 − ∏_{i∈(R∩A)} I(x_A(i) = 1) ∏_{j∈(R∖A)} θ_j) (∏_{k∈(R∩A)} θ_k^{x_A(k)} (1 − θ_k)^{1−x_A(k)})²   (6)

We can further reduce this to O(1) when iteratively updated (see supplementary for derivations). We now define a criterion that incentivizes removing edges quickly and has theoretical guarantees. Let f_EC(x_A) be the weight of edges removed on observing outcome vector x_A. This is evaluated as

f_EC(x_A) = 1 − w_EC({S_i} ∩ H^R(x_A)) / w_EC({S_i}) = 1 − (1 − ∏_{i∈(R∩A)} I(x_A(i) = 1) ∏_{j∈(R∖A)} θ_j) (∏_{k∈(R∩A)} θ_k^{x_A(k)} (1 − θ_k)^{1−x_A(k)})² / (1 − ∏_{i∈R} θ_i)   (7)

Lemma 1. The expression f_EC(x_A) is strongly adaptive monotone and adaptive submodular.

4.2 Solving the Bern-DRD problem using BISECT

We now return to the Bern-DRD problem (2) where we have multiple regions {R_1, . . . , R_m} that overlap. Each region R_r is associated with an objective f^r_EC(x_A) for solving the 'one region versus all' problem. 
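The per-region objective f_EC can indeed be computed in a single O(|R|) pass under the independent Bernoulli model. A minimal sketch that recomputes it from scratch (the O(1) incremental update is in the supplementary; the test names and probabilities below are hypothetical):

```python
def f_ec(region, theta, x_obs):
    # f_EC(x_A) = 1 - w_EC({S_i} ∩ H^R(x_A)) / w_EC({S_i}), computed in one pass over R.
    # theta: dict test -> P(X_t = 1); x_obs: dict of observed outcomes in {0, 1}.
    v = 1.0      # indicator that observed tests in R succeeded, times prior mass of unobserved tests in R
    q = 1.0      # probability of the relevant observations, i.e. mass of the version space H^R(x_A)
    prior = 1.0  # P(R) before any observations
    for t in region:
        prior *= theta[t]
        if t in x_obs:
            v *= 1.0 if x_obs[t] == 1 else 0.0
            q *= theta[t] if x_obs[t] == 1 else 1.0 - theta[t]
        else:
            v *= theta[t]
    return 1.0 - (1.0 - v) * q * q / (1.0 - prior)

theta = {'a': 0.5, 'b': 0.5}   # hypothetical Bernoulli tests
region = ['a', 'b']
print(f_ec(region, theta, {}))                  # -> 0.0 (no progress yet)
print(f_ec(region, theta, {'a': 1, 'b': 1}))    # -> 1.0 (region determined valid)
```

The objective starts at 0, increases monotonically with every observation, and reaches 1 exactly when all tests in the region are observed valid.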
Since solving any one such subproblem suffices, we combine them in a Noisy-OR formulation [5] by defining an objective f_DRD(x_A) = 1 − ∏_{r=1}^m (1 − f^r_EC(x_A)), which evaluates to

f_DRD(x_A) = 1 − ∏_{r=1}^m [ (1 − ∏_{i∈(R_r∩A)} I(x_A(i) = 1) ∏_{j∈(R_r∖A)} θ_j) (∏_{k∈(R_r∩A)} θ_k^{x_A(k)} (1 − θ_k)^{1−x_A(k)})² / (1 − ∏_{i∈R_r} θ_i) ]   (8)

Since f_DRD(x_A) = 1 iff f^r_EC(x_A) = 1 for at least one r, we define the following surrogate problem to Bern-DRD

π* ∈ arg min_π c(π)  s.t.  ∀x_T : f_DRD(x_A(π, x_T)) ≥ 1   (9)

The surrogate problem has a structure that allows greedy policies to have near-optimality guarantees.
Lemma 2. The expression f_DRD(x_A) is strongly adaptive monotone and adaptive submodular.
Theorem 1. Let m be the number of regions, p^h_min the minimum prior probability of any hypothesis, π_DRD the greedy policy and π* the optimal policy. Then c(π_DRD) ≤ c(π*)(2m log(1/p^h_min) + 1).
We now describe the BISECT algorithm.

Algorithm 1: Decision Region Determination with Independent Bernoulli Tests({R_i}_{i=1}^m, θ, x_T)
1 A ← ∅ ;
2 while (∄R_i : P(R_i|x_A) = 1) and (∃R_i : P(R_i|x_A) > 0) do
3   T_cand ← SelectCandTestSet(x_A) ;   ▷ Using either (10) or (12)
4   t* ← SelectTest(T_cand, θ, x_A) ;   ▷ Using either (11), (13), (14), (15) or (16)
5   A ← A ∪ t* ;
6   x_t* ← x_T(t*) ;   ▷ Observe outcome for selected test

Algorithm 1 shows the framework for a general decision region determination algorithm. In order to specify BISECT, we need to define two options - a candidate test set selection function SelectCandTestSet(x_A) and a test selection function SelectTest(T_cand, θ, x_A). The unconstrained version of BISECT implements SelectCandTestSet(x_A) to return the set of all tests T_cand that contains only unevaluated tests belonging to active regions

T_cand = ( ∪_{i=1}^m {R_i | P(R_i|x_A) > 0} ) ∖ A   (10)

We now examine the BISECT test selection rule SelectTest(T_cand, θ, x_A)

t* ∈ arg max_{t∈T_cand} (1/c(t)) E_{x_t}[ ( ∏_{r=1}^m (1 − ∏_{i∈(R_r∩A)} I(x_A(i) = 1) ∏_{j∈(R_r∖A)} θ_j) − ∏_{r=1}^m (1 − ∏_{i∈(R_r∩(A∪t))} I(x_A(i) = 1) ∏_{j∈(R_r∖(A∪t))} θ_j) ) (θ_t^{x_t} (1 − θ_t)^{1−x_t})^{2∑_{k=1}^m I(t∈R_k)} ]   (11)

The intuition behind this update is that tests are selected to squash the probability of regions not being valid. It additionally incentivizes selection of tests on which multiple regions overlap.

4.3 Adaptively constraining test selection to most likely region

We observe in our experiments that the surrogate (8) suffers from a slow convergence problem - f_DRD(x_A) takes a long time to converge to 1 when greedily optimized. 
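The unconstrained greedy step just described can be sketched as follows: build the candidate set of unevaluated tests in active regions, then pick the test with the highest expected per-cost gain in f_DRD - the quantity the closed-form rule computes efficiently. This naive version recomputes f_DRD for every candidate; the regions, costs, and probabilities are hypothetical:

```python
def f_ec(region, theta, x_obs):
    # Single-region objective f_EC, as in (7).
    v, q, prior = 1.0, 1.0, 1.0
    for t in region:
        prior *= theta[t]
        if t in x_obs:
            v *= 1.0 if x_obs[t] == 1 else 0.0
            q *= theta[t] if x_obs[t] == 1 else 1.0 - theta[t]
        else:
            v *= theta[t]
    return 1.0 - (1.0 - v) * q * q / (1.0 - prior)

def f_drd(regions, theta, x_obs):
    # Noisy-OR combination: f_DRD = 1 - prod_r (1 - f_EC^r).
    p = 1.0
    for r in regions:
        p *= 1.0 - f_ec(r, theta, x_obs)
    return 1.0 - p

def select_test(regions, theta, x_obs, cost=None):
    # Active regions have P(R_i | x_A) > 0, i.e. no test in R_i observed invalid.
    active = [r for r in regions if all(x_obs.get(t, 1) == 1 for t in r)]
    cand = sorted({t for r in active for t in r if t not in x_obs})
    base = f_drd(regions, theta, x_obs)
    def expected_gain(t):
        g = 0.0
        for outcome in (0, 1):
            p = theta[t] if outcome == 1 else 1.0 - theta[t]
            g += p * (f_drd(regions, theta, {**x_obs, t: outcome}) - base)
        return g / (cost[t] if cost else 1.0)
    return max(cand, key=expected_gain)

regions = [['a', 'b'], ['a', 'c']]       # two candidate paths sharing edge 'a'
theta = {'a': 0.9, 'b': 0.5, 'c': 0.5}   # hypothetical edge priors
print(select_test(regions, theta, {}))   # the loop ends once some f_EC^r reaches 1
```

The loop of Algorithm 1 repeats this selection, observes the chosen test, and terminates when f_DRD reaches 1 (some region determined valid) or no region remains possible.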
To alleviate the convergence problem, we introduce an alternate candidate selection function SelectCandTestSet(x_A) that assigns to T_cand the set of all tests that belong to the most likely region T_maxP, which is evaluated as follows (we will refer to this variant as MAXPROBREG)

T_maxP = { arg max_{R_i ∈ (R_1, R_2, . . . , R_m)} P(R_i|x_A) } ∖ A   (12)

Applying the constraint in (12) leads to a dramatic improvement for any test selection policy, as we will show in Sec. 5.2. The following theorem offers a partial explanation.
Theorem 2. A policy that greedily latches to a region according to the posterior conditioned on the region outcomes has a near-optimality guarantee of 4 w.r.t. the optimal region evaluation sequence.
Applying the constraint in (12) implies we are no longer greedily optimizing f_DRD(x_A). However, the following theorem bounds the sub-optimality of this policy.
Theorem 3. Let p_min = min_i P(R_i), p^h_min = min_{h∈H} P(h) and l = max_i |R_i|. The policy using (12) has a suboptimality of α(2m log(1/p^h_min) + 1), where α ≤ (1 − max((1 − p_min)², p_min^{2l}))^{−1}.

5 Experiments

We evaluate BISECT on a collection of datasets spanning a spectrum of synthetic problems and real-world planning applications. The synthetic problems are created by randomly selecting problem parameters to test the general applicability of BISECT. The motion planning datasets range from simplistic yet insightful 2D problems to more realistic high-dimensional problems as encountered by a UAV or a robot arm. The 7D arm planning dataset is obtained from a high fidelity simulation as shown in Fig. 4(a). Finally, we test BISECT on experimental data collected from a full-scale helicopter that has to avoid unmapped wires at high speed as it comes in to land, as shown in Fig. 4(b). 
Refer to supplementary for exhaustive details on experiments and additional results. Open-source code and details can be found here: https://github.com/sanjibac/matlab_learning_collision_checking

5.1 Heuristic approaches to solving the Bern-DRD problem

We propose a collection of competitive heuristics that can also be used to solve the Bern-DRD problem. These heuristics are various SelectTest(T_cand, θ, x_A) policies in the framework of Alg. 1. To simplify the setting, we assume unit cost c(t) = 1, although it would be possible to extend these to the nonuniform setting. The first heuristic RANDOM selects a test by sampling uniformly at random

t* ∈ T_cand   (13)

We adopt our next heuristic MAXTALLY from Dellin and Srinivasa [11], where the test belonging to the most regions is selected. It uses the following criterion, which exhibits a 'fail-fast' characteristic

t* ∈ arg max_{t∈T_cand} ∑_{i=1}^m I(t ∈ R_i, P(R_i|x_A) > 0)   (14)

The next policy SETCOVER selects tests that maximize the expected number of 'covered' tests, i.e. if a selected test is in collision, how many other tests does it remove from consideration.

t* ∈ arg max_{t∈T_cand} (1 − θ_t) | ( ∪_{i=1}^m {R_i | P(R_i|x_A) > 0} ∖ ∪_{j=1}^m {R_j | P(R_j | x_A, X_t = 0) > 0} ) ∖ {A ∪ {t}} |   (15)

Theorem 4. SETCOVER is a near-optimal policy for the problem of optimally checking all regions.
The last heuristic is derived from a classic heuristic in decision theory: myopic value of information (Howard [18]). MVOI greedily chooses the test that maximizes the change in the probability mass of the most likely region. This test selection works only with SelectCandTestSet(x_A) = T_maxP.

t* ∈ arg max_{t∈T_maxP} (1 − θ_t) max_{i=1,...,m} P(R_i | x_A, X_t = 0)   (16)

We also evaluate against the state of the art LAZYSP [11] planner which explicitly minimizes collision checking effort while trying to guarantee optimality. We ran two variants of LazySP. The first variant is the vanilla unconstrained algorithm that searches for the shortest path on the entire graph, collision checks the path, and repeats. The second variant is constrained to the library of paths used by all other baselines.

5.2 Analysis of results

Table 1 shows the evaluation cost of all algorithms on various datasets normalized w.r.t. BISECT. The two numbers are lower and upper 95% confidence intervals - hence they convey how much fractionally poorer the algorithms are w.r.t. BISECT. The best performance on each dataset is highlighted. We present a set of observations to interpret these results.
O 1. BISECT has a consistently competitive performance across all datasets.

Figure 3: Performance (number of evaluated edges) of all algorithms on 2D geometric planning. Snapshots, at start, interim and final stages respectively, show evaluated valid edges (green), invalid edges (red) and the final path (magenta). The utility of edges as computed by algorithms is shown varying from low (black) to high (cream).

Figure 4: (a) A 7D arm has to perform pick and place tasks at high speed in a table with clutter. (b) Experimental data from a full-scale helicopter that has to react quickly to avoid unmapped wires detected by the sensor. BISECT (given an informative prior) checks a small number of edges around the detected wire and identifies a path. (c) Scenario where regions have size disparity. 
Unconstrained BISECT signi\ufb01cantly outperforms other\nalgorithms on such a scenario.\n\nTable 1 shows that on 13 out of the 14 datasets, BISECT is at par with the best. On 7 of those it is exclusively\nthe best.\nO 2. The MAXPROBREG variant improves the performance of all algorithms on most datasets\n\nTable 1 shows that this is true on 12 datasets. The impact is greatest on RANDOM on the 2D Forest dataset -\nperformance improves from (19.45, 27.66) to (0.13, 0.30). However, this is not true in general. On datasets\nwith large disparity in region sizes as illustrated in Fig. 4(c), unconstrained BISECT signi\ufb01cantly outperforms\nother algorithms. In such scenarios, MAXPROBREG latches on to the most probable path which also happens to\nhave a large number of edges. It performs poorly on instances where this region is invalid, while the other region\ncontaining a single edge is valid. Unconstrained BISECT prefers to evaluate the single edge belonging to region\n1 before proceeding to evaluate region 2, performing optimally on those instances. Hence, the myopic nature of\nMAXPROBREG is the reason behind its poor performance.\nO 3. 
On planning problems, BISECT strikes a trade-off between the complementary natures of MAXTALLY and MVOI.

[Figure 3 plot data - edges evaluated per algorithm: MVoI (|A|: 28), SetCover (|A|: 30), MaxTally (|A|: 29), BiSECt (|A|: 20). Figure 4 panel annotations: (b) wires in real flight; (c) Region 1: single edge with low probability, Region 2: many edges with high probability.]

Table 1: Normalized evaluation cost - (lower, upper) bound of 95% confidence interval

Columns: LAZYSP (Unconstrained, Constrained); RANDOM, MAXTALLY, SETCOVER (each Unconstrained, MaxProbReg); MVOI; BISECT (Unconstrained, MaxProbReg)

Synthetic Bernoulli Test: Variation across region overlap

2D Geometric Planning: Variation across environments

(0.00, 0.08)

(0.00, 0.00)
(−0.11, 0.00)

(0.03, 0.18)

(0.045, 0.21)

(0.00, 0.09)

(4.18, 6.67)
(0.12, 0.29)
(3.27, 4.40)
(0.05, 0.25)
(2.86, 4.26)
(0.00, 0.28)

(19.5, 27.7)
(0.13, 0.30)
(13.4, 17.8)
(0.11, 0.42)
(13.8, 16.6)
(0.33, 0.51)

(3.49, 5.23)
(0.12, 0.25)
(3.04, 4.30)
(0.14, 0.24)
(2.62, 3.85)
(0.06, 0.26)

(4.68, 6.55)
(0.09, 0.18)
(4.12, 4.89)
(0.00, 0.12)
(2.76, 3.93)
(0.10, 0.20)

(10.8, 14.3)
(1.38, 2.51)
(6.96, 11.3)
(0.16, 0.55)
(18.9, 25.6)
(−0.17, 0.01)

(5.82, 12.1)
(0.00, 0.57)
(5.43, 10.02)
(−0.03, 
0.45)\n\n2D Geometric Planning: Variation across region size\n\n(0.00, 0.17)\n\n(0.00, 0.14)\n\n(12.1, 16.0)\n(0.12, 0.42)\n(13.3, 16.8)\n(0.09, 0.27)\n\n(4.47, 5.13)\n(0.06, 0.24)\n(2.18, 3.77)\n(\u22120.04, 0.08)\n\nNon-holonomic Path Planning: Variation across environments\n\n(1.97, 3.81)\n(0.15, 0.47)\n(0.97, 2.45)\n(0.02, 0.51)\n\n(0.97, 1.59)\n(0.24, 0.72)\n(0.28, 1.19)\n(0.00, 0.38)\n\n(0.09, 0.18)\n(\u22120.11, 0.11)\n7D Arm Planning: Variation across environments\n\n(9.79, 11.14)\n(0.25, 0.38)\n(8.40, 11.47)\n(0.21, 0.28)\n\n(22.4, 29.7)\n(0.46, 0.79)\n(13.0, 15.8)\n(0.00, 0.12)\n\n(2.63, 5.28)\n(0.00, 0.00)\n(3.72, 4.54)\n(\u22120.11, 0.11)\n\n(0.28, 0.54)\n\n(15.1, 19.4)\n(0.13, 0.31)\n(7.92, 9.85)\n(0.14, 0.36)\n\n(4.80, 6.98)\n(0.00, 0.04)\n(3.96, 6.44)\n(0.00, 0.00)\n\n(0.02, 0.20)\nDatasets with large disparity in region sizes\n\n(3.00, 3.50)\n\n(6.60, 10.5)\n\n(6.50, 8.00)\n(3.00, 4.50)\n(9.50, 11.3)\n(6.90, 10.8)\n\n(5.50, 6.50)\n(5.00, 7.50)\n(2.80, 6.10)\n(6.80, 8.30)\n\n(1.36, 2.17)\n(0.00, 0.11)\n(1.42, 2.07)\n(0.00, 0.11)\n\n(3.00, 3.50)\n(3.00, 3.50)\n(6.60, 10.5)\n(6.60, 10.5)\n\n(1.77, 3.01)\n(0.18, 0.40)\n(3.55, 4.67)\n(0.14, 0.33)\n(2.94, 3.71)\n(0.09, 0.22)\n\n(3.53, 5.07)\n(0.00, 0.09)\n(1.36, 2.11)\n(0.14, 0.29)\n(2.07, 2.94)\n(0.00, 0.00)\n\n(2.00, 3.41)\n(0.00, 0.38)\n(1.04, 1.62)\n(0.00, 0.14)\n\n(1.42, 2.36)\n(0.00, 0.00)\n(1.77, 2.64)\n(0.00, 0.00)\n(1.33, 1.81)\n(0.00, 0.00)\n\n(1.90, 2.46)\n(0.00, 0.00)\n(0.76, 1.20)\n(0.00, 0.00)\n(0.91, 1.44)\n(0.00, 0.00)\n\n(0.94, 1.42)\n(0.00, 0.00)\n(0.41, 0.91)\n(0.00, 0.00)\n\n(1.54, 2.46)\n(0.00, 0.00)\n(3.28, 3.78)\n(0.00, 0.00)\n\n(0.32, 0.67)\n(0.00, 0.00)\n(1.23, 1.75)\n(0.00, 0.00)\n\n(0.00, 0.00)\n(3.00, 3.50)\n(0.00, 0.00)\n(7.30, 11.2)\n\nSmall\nm : 100\nMedium\nm : 500\nLarge\nm : 1e3\n\nForest\n\nOneWall\n\nTwoWall\n\nOneWall\nm : 300\nOneWall\nm : 858\n\nForest\n\nOneWall\n\nTable\n\nClutter\n\nSynth.\n(T : 10)\n2D Plan\n(m : 2)\n\nWe examine this in the context of 2D 
planning as shown in Fig. 3. MAXTALLY selects edges belonging to many paths, which is useful for path elimination, but it does not reason about the event in which the edge is not in collision. MVOI selects edges to eliminate the most probable path, but it does not reason about how many paths a single edge can eliminate. BISECT switches between these behaviors, thus achieving greater efficiency than both heuristics.

O 4. BISECT checks informative edges in collision avoidance problems encountered by a helicopter.

Fig. 4(b) shows the efficacy of BISECT on experimental flight data from a helicopter avoiding wires.

6 Conclusion

In this paper, we addressed the problem of identifying a feasible path from a library while minimizing the expected cost of edge evaluation, given priors on the likelihood of edge validity. We showed that this problem is equivalent to a decision region determination problem, where the goal is to select tests (edges) that drive uncertainty into a single decision region (a valid path). We proposed BISECT, an efficient and near-optimal algorithm that solves this problem by greedily optimizing a surrogate objective. We validated BISECT on a spectrum of problems against state-of-the-art heuristics and showed that it has consistent performance across datasets. This work serves as a first step towards importing Bayesian active learning approaches into the domain of motion planning.

Acknowledgments

We would like to acknowledge the support from ONR grant N000141310821. We would like to thank Shushman Choudhury for insightful discussions and the 7D arm planning datasets. We would like to thank Oren Salzman, Mohak Bhardwaj, Vishal Dugar and Paloma Sodhi for feedback on the paper.
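The two-region scenario behind Observation 2 can be made concrete with a small expected-cost calculation. The following Python sketch (with hypothetical probabilities and region sizes, not the paper's experimental values) compares checking the cheap single-edge region first, as unconstrained BISECT prefers, against committing to the most probable many-edge region first, as MAXPROBREG does:

```python
# Hypothetical two-region instance (all numbers illustrative):
#   Region 1: a single edge, valid with low probability p1.
#   Region 2: n independent Bernoulli edges, each valid with probability q.
# A region is a valid path iff all its edges are valid; evaluation stops
# once some region is proven valid or every region is proven invalid.

p1, n, q = 0.05, 20, 0.97

# Expected number of checks to resolve Region 2 when its edges are tested
# in sequence, stopping at the first invalid edge:
#   E2 = sum_{k=0}^{n-1} q^k   (edge k+1 is checked iff the first k passed)
E2 = sum(q**k for k in range(n))
p2_valid = q**n  # probability that Region 2 is a valid path

# Strategy A (BISECT-like): test Region 1's single edge first.
# With probability p1 we are done after one check; otherwise Region 2
# must still be resolved.
cost_A = 1 + (1 - p1) * E2

# Strategy B (MAXPROBREG-like): resolve the most probable region first.
# Region 1's edge is tested only if Region 2 turns out to be invalid.
cost_B = E2 + (1 - p2_valid) * 1

print(f"E[cost], single edge first:    {cost_A:.2f}")
print(f"E[cost], probable region first: {cost_B:.2f}")
# Algebraically, cost_A - cost_B = p2_valid - p1 * E2, so checking the
# single cheap edge first wins whenever p1 * E2 exceeds p2_valid.
```

With these numbers the single-edge-first strategy is cheaper in expectation, mirroring the observation that MAXPROBREG's myopic commitment to the large, probable region is wasteful on instances where that region is invalid.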