{"title": "Optimization for Approximate Submodularity", "book": "Advances in Neural Information Processing Systems", "page_first": 396, "page_last": 407, "abstract": "We consider the problem of maximizing a submodular function when given access to its approximate version. Submodular functions are heavily studied in a wide variety of disciplines, since they are used to model many real world phenomena, and are amenable to optimization. However, there are many cases in which the phenomena we observe is only approximately submodular and the approximation guarantees cease to hold. We describe a technique which we call the sampled\nmean approximation that yields strong guarantees for maximization of submodular functions from approximate surrogates under cardinality and intersection of matroid constraints. In particular, we show tight guarantees for maximization under a cardinality constraint and 1/(1+P) approximation\nunder intersection of P matroids.", "full_text": "Optimization for Approximate Submodularity\n\nAvinatan Hassidim\n\nBar Ilan University and Google\navinatan@cs.biu.ac.il\n\nYaron Singer\n\nHarvard University\n\nyaron@seas.harvard.edu\n\nAbstract\n\nWe consider the problem of maximizing a submodular function when given access\nto its approximate version. Submodular functions are heavily studied in a wide\nvariety of disciplines since they are used to model many real world phenomena\nand are amenable to optimization. There are many cases however in which the\nphenomena we observe is only approximately submodular and the optimization\nguarantees cease to hold. In this paper we describe a technique that yields strong\nguarantees for maximization of monotone submodular functions from approximate\nsurrogates under cardinality and intersection of matroid constraints. 
In particular, we show tight guarantees for maximization under a cardinality constraint and a 1/(1 + P) approximation under intersection of P matroids.

1 Introduction

In this paper we study maximization of approximately submodular functions. For nearly half a century submodular functions have been extensively studied since they are amenable to optimization and are recognized as an effective modeling instrument. In machine learning, submodular functions capture a variety of objectives that include entropy, diversity, image segmentation, and clustering. Although submodular functions are used to model real world phenomena, in many cases the functions we encounter are only approximately submodular. This is either due to the fact that the objectives are not exactly submodular, or alternatively they are, but we only observe their approximate version. In the literature, approximate utility functions are modeled as surrogates of the original function that have been corrupted by random variables drawn i.i.d. from some distribution. Some examples include:

• Revealed preference theory. Luce's famous model assumes that an agent's revealed utility f̃ : 2^N → R can be approximated by a well-behaved utility function f : 2^N → R s.t. f̃(S) = f(S) + ξ_S for every S ⊆ N, where ξ_S is drawn i.i.d. from a distribution [CE16]. f̃ is a utility function that approximates f, and multiple queries to f̃ return the same response;

• Statistics and learning theory. The assumption in learning is that the data we observe is generated by f̃(x) = f(x) + ξ_x, where f is in some well-behaved hypothesis class and ξ_x is drawn i.i.d. from some distribution. The use of f̃ is not to model corruption by noise but rather the fact that data is not exactly manufactured by a function in the hypothesis class;

• Active learning. 
There is a long line of work on noise-robust learning where one has access to a noisy membership oracle f̃(x) = f(x) + ξ_x and for every x we have that ξ_x is drawn i.i.d. from a distribution [Ang88, GKS90, Jac94, SS95, BF02, Fel09]. In this model as well, the oracle is consistent and multiple queries return the same response. For set functions, one can consider active learning in experimental design applications where the objective function is often submodular and the goal would be to optimize f : 2^N → R given f̃.

32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada.

Similar to the above examples, we say that a function f̃ : 2^N → R+ is approximately submodular if there is a submodular function f : 2^N → R+ and a distribution D s.t. for each set S ⊆ N we have that f̃(S) = ξ_S f(S), where ξ_S is drawn i.i.d. from D.¹ The modeling assumption that ξ_S is not adversarially chosen but drawn i.i.d. is crucial. Without this assumption, for f̃(S) = ξ_S f(S) where ξ_S ∈ [1 − ε, 1 + ε], even for subconstant ε > 0 no algorithm can obtain an approximation strictly better than n^{−1/2} to maximizing either f̃ or f under a cardinality constraint, when n = |N| [HS17]. This hardness stems from the fact that approximate submodularity implies that the function is close to submodular, but its marginals (or gradients) are not well approximated by those of the submodular function. When given an α-approximation to the marginals, the greedy algorithm produces a 1 − 1/e^α approximation. Furthermore, even for continuous submodular functions, a recent line of work shows that gradient methods produce strong guarantees with approximate gradient information of the function [HSK17, CHHK18, MHK18].

In contrast to the vast literature on submodular optimization, optimization of approximate submodularity is nascent. 
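The noise model above can be made concrete in a few lines of code. Below is a minimal Python sketch (an illustration, not the authors' implementation): the oracle draws one multiplier ξ_S per set and caches it, so that repeated queries on the same set return the same response, exactly as the consistency assumption requires. The example function f(S) = √|S| and the uniform noise on [0.5, 1.5] (a bounded-support distribution, hence admissible in this setting) are our own toy choices.

```python
import math
import random

def make_approximate_oracle(f, sample_noise, seed=0):
    """Wrap a submodular f with multiplicative noise: f~(S) = xi_S * f(S).

    xi_S is drawn i.i.d. per set but cached, so repeated queries on the
    same set return the same value (the oracle is consistent).
    """
    rng = random.Random(seed)
    cache = {}  # frozenset(S) -> noise multiplier xi_S

    def f_tilde(S):
        key = frozenset(S)
        if key not in cache:
            cache[key] = sample_noise(rng)
        return cache[key] * f(S)

    return f_tilde

# Toy instance: f(S) = sqrt(|S|) is monotone submodular; the noise
# distribution D is uniform on [0.5, 1.5] (bounded support).
f = lambda S: math.sqrt(len(S))
f_tilde = make_approximate_oracle(f, lambda rng: rng.uniform(0.5, 1.5))

S = {1, 2, 3}
assert f_tilde(S) == f_tilde(S)                  # consistency of the oracle
assert 0.5 * f(S) <= f_tilde(S) <= 1.5 * f(S)    # multiplicative guarantee
```

Note that because ξ_S is independent across sets, the noisy marginals f̃(S ∪ {a}) − f̃(S) can be far from those of f, which is the source of the hardness discussed above.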
For distributions bounded in [1 − ε/k, 1 + ε/k] the function is sufficiently close to submodular for the approximation guarantees of the greedy algorithm to go through [HS, LCSZ17, QSY+17]. Without this assumption, the greedy algorithm performs arbitrarily poorly. In [HS17] the authors give an algorithm that obtains an approximation arbitrarily close to 1 − 1/e under a cardinality constraint that is sufficiently large. For arbitrary cardinality and general matroid constraints there are no known approximation guarantees.

1.1 Our Contribution

In this paper we consider the problem max_{S∈F} f(S) when f : 2^N → R is a non-negative monotone submodular function defined over a ground set N of size n, the algorithm is only given access to an approximate surrogate f̃, and F is a uniform matroid (cardinality) or intersection of matroids constraint. We introduce a powerful technique which we call the sampled mean approximation and show:

• Optimal guarantees for maximization under a cardinality constraint. The result in [HS17] gives an approximation arbitrarily close to 1 − 1/e for k ∈ Ω(log log n). This is a fundamental limitation of their technique, which initializes the solution with a set of size Ω(log log n) used for "smoothing" the approximately submodular function (see Appendix G.1 for more details). The technique in this paper is novel and yields an approximation of 1 − 1/e for any k ≥ 2, and 1/2 for k = 1, which is information theoretically tight, as we later show;

• 1/(1 + P) approximation for intersection of P matroids. We utilize the sampled mean approximation method to produce the first results for the more challenging case of maximization under general matroid constraints. Our approximation guarantees are comparable with those achievable with a greedy algorithm for monotone submodular functions;

• Information theoretic lower bounds. 
We show that no randomized algorithm can obtain an approximation strictly better than 1/2 + O(n^{−1/2}) for max_{a∈N} f(a) given an approximately submodular oracle f̃, and that no randomized algorithm can obtain an approximation strictly better than (2k − 1)/2k + O(n^{−1/2}) for maximization under a cardinality constraint k;

• Bounds in Extreme Value Theory. As we later discuss, some of our results may be of independent interest to Extreme Value Theory (EVT), which studies bounds on the maximum sample (or the top samples) from some distribution. To achieve our main result we prove subtle properties about extreme values of random variables where not all samples are created equal and the distributions generalize those typically studied in EVT.

The results above are for the problem max_{S∈F} f(S) when the algorithm is given access to f̃. In some applications, however, f̃ is the function that we actually wish to optimize, i.e. our goal is to solve max_{S∈F} f̃(S). If f̃(S) approximates f(S) well on all sets S, we can use the solution for max_{S∈F} f(S) as a solution for max_{S∈F} f̃(S). In general, however, a solution that is good for max_{S∈F} f(S) can be arbitrarily bad for max_{S∈F} f̃(S). In Appendix E we give a black-box reduction showing that these problems are essentially equivalent. Specifically, we show that given a solution

¹Describing f̃ as a multiplicative approximation of f is more convenient for analyzing multiplicative approximation guarantees. 
This is w.l.o.g. as all our results apply to additive approximations as well.

to max_{S∈F} f(S), one can produce a solution that is of arbitrarily close quality to max_{S∈F} f̃(S) when F is any uniform matroid, an intersection of matroids of rank Ω(log n), or an intersection of matroids of any rank when the distribution D has bounded support.

1.2 Technical Overview

The approximately submodular functions we consider approximate a submodular function f using samples from a distribution D in the class of generalized exponential tail distributions, defined as:

Definition. A noise distribution D has a generalized exponential tail if there exists some x_0 such that for x > x_0 the probability density function is ρ(x) = e^{−g(x)}, where g(x) = Σ_i a_i x^{α_i} for some (not necessarily integer) α_0 ≥ α_1 ≥ ..., s.t. α_0 ≥ 1. If D has bounded support we only require that either it has an atom at its supremum, or that ρ is continuous and non-zero at the supremum.

This class of distributions is carefully defined. On the one hand it is general enough to contain Gaussian and Exponential distributions, as well as any distribution with bounded support. On the other hand it has enough structure that one can leverage.² Note that optimization in this setting always requires that the support is independent of n and that n is sufficiently large. Throughout the paper we assume that D has a generalized exponential tail and that n is sufficiently large.

Theorem. For any non-negative monotone submodular function there is a deterministic polynomial-time algorithm which optimizes the function under a cardinality constraint k ≥ 3 and obtains an approximation ratio that is arbitrarily close to 1 − 1/e with probability 1 − o(1) using access to an approximate oracle. For k ≥ 2 there is a randomized algorithm whose approximation ratio is arbitrarily close to 1 − 1/e, in expectation over the randomization of the algorithm. 
For k = 1 the algorithm achieves a 1/2 approximation in expectation, and no randomized algorithm can achieve an approximation better than 1/2 + o(1), in expectation.

The main part of the proof involves analysis of the following greedy algorithm. The algorithm iteratively chooses bundles of elements of size O(1/ε). In each iteration, the algorithm first identifies a bundle x whose addition to the current solution approximately maximizes the approximate mean value F̃. Informally, F̃(x) is the average value of f̃ evaluated on all bundles at Hamming distance one from x. Then, the algorithm does not choose x but rather the bundle at Hamming distance one from x whose addition to the current solution maximizes the approximate submodular value f̃.

The major technical challenge is in analyzing the regime in which k ∈ Ω(1/ε²) ∩ O(√(log n)). At a high level, in this regime the analysis relies on showing that the marginal contribution of the bundle of elements selected in every iteration is approximately largest. Doing so requires proving subtle properties about extreme values of random variables drawn from the generalized exponential tail distribution, and the analysis fully leverages the properties of the distribution and the fact that k ∈ O(√(log n)). This is of independent interest to Extreme Value Theory (EVT), which tries to bound the maximum sample (or the top samples) from some distribution. If we considered the constant function f(S) = 1 for S ≠ ∅, and tried to maximize an approximate version of f with respect to some distribution, this would be a classical EVT setting. One can view the bounds we develop as bounds on a generalization of EVT, where not all samples are created equal.

For general matroid constraints we apply the sampled-mean technique and obtain an approximation comparable to that of applying the greedy algorithm on a monotone submodular function.

Theorem. 
For any non-negative monotone submodular function there is a deterministic polynomial-time algorithm which optimizes the function under an intersection of P matroids constraint and obtains an approximation ratio arbitrarily close to 1/(1 + P) given an approximate oracle.

Paper organization. The main contribution of the paper is the definition of the sampled mean approximation in Section 2 and the subsequent analysis of the algorithm for cardinality constraints in Section 3 and matroid constraints in Section 4. The techniques are novel, and the major technical crux of the paper is in analyzing the algorithm in Section 3. Optimization for small rank and lower bounds are in Appendix B.1. In Appendix G we further discuss related work, and in Appendix F we discuss extensions of the algorithms to related models.

²For example, if for every S the noise is s.t. ξ_S = 2^{100} w.p. 1/2^{100} and 0 otherwise, but n = 50, it is likely that the oracle will always return 0, in which case we cannot do better than selecting an element at random.

2 The Sampled Mean Approximation

We begin by defining the sampled mean approximation of a function. This approach considers bundles x of size c ∈ O(1/ε). We can then run a variant of a greedy algorithm which adds a bundle of size c in every iteration. For a given bundle x and set S, we define a ball to be all bundles obtained by swapping a single element from x with another not in S ∪ x. We denote x_{ij} = (x \ {x_i}) ∪ {x_j}.

Definition. For S ⊂ N and a bundle x ⊂ N, the ball around x is B_S(x) := {x_{ij} : i ∈ x, j ∉ S ∪ x}.

We illustrate a ball in Figure 1. Notice that as long as |S| ≤ (1 − δ)|N| for some fixed δ > 0, we have that |B_S(x)| ∈ Ω(n). This will allow us to derive weak (but sufficient) concentration bounds.

Definition. Let f : 2^N → R. 
For a set S ⊆ N and a bundle x ⊆ N, the mean value, noisy mean value, and mean marginal contribution of x given S are, respectively:

F(S ∪ x) := E_{z∼B_S(x)}[f(S ∪ z)]   (1)
F̃(S ∪ x) := E_{z∼B_S(x)}[f̃(S ∪ z)]   (2)
F_S(x) := E_{z∼B_S(x)}[f_S(z)]   (3)

The following lemma implies that under the right conditions, the bundle that maximizes the noisy mean value is a good approximation of the bundle whose (non-noisy) mean value is largest.

Lemma 2.1. Let x ∈ argmax_{z:|z|=c} F̃(S ∪ z) where c = 2/ε. Then, w.p. 1 − exp(−Ω(n^{1/4})):

F_S(x) ≥ (1 − ε) max_{z:|z|=c} F_S(z).

The above lemma gives us a weak concentration bound in the following sense. While it is generally not true that F̃(S ∪ z) ≈ F(S ∪ z), we can upper bound F̃(S ∪ z) in a meaningful way and show that F̃(S ∪ x*) ≈ F(S ∪ x*) for x* ∈ argmax_z f_S(z) by using submodularity. This allows us to show that the mean marginal contribution of the x that maximizes the noisy mean value is an ε-approximation to the maximal (non-noisy) mean marginal contribution. Details and proofs are in Appendix A.

In addition to approximating the mean marginal contribution given a noisy oracle, an important property of the sampled-mean approach is that it well-approximates the true marginal contribution.

Lemma 2.2. For any ε > 0 and any set S ⊂ N, let x be a bundle of size 1/ε; then:

F_S(x) ≥ (1 − ε) f_S(x).

The proof is in Appendix A and exploits a natural property of submodular functions: the removal of a random element from a sufficiently large set does not significantly affect its value, in expectation. Let x* ∈ argmax_{b:|b|=c} f_S(b). Lemma 2.1 and Lemma 2.2 together imply the following corollary:

Corollary 2.3. 
For a fixed ε > 0, let c = 3/ε and x ∈ argmax_{b:|b|=c} F̃(S ∪ b). Then, w.p. at least 1 − exp(−Ω(n^{1/4})) we have that: F_S(x) ≥ (1 − ε) f_S(x*).

At a first glance, it may seem as if running the greedy algorithm with F instead of f suffices. The problem, however, is that the mean marginal contribution F_S may be an unbounded overestimate of the true marginal contribution f_S. Consider for example an instance where S = ∅ and there is a bundle x of size c s.t. for every ∅ ≠ z ⊆ x we have f(z) = δ for some arbitrarily small δ, while every other subset T ⊆ N \ x is complementary to x and has some arbitrarily large value M. In this case, x = argmax_{z:|z|=c} F(z) and F(x) = M + δ while f(x) = δ.

3 The Sampled Mean Greedy for Cardinality Constraints

The SM-GREEDY begins with the empty set S and at every iteration considers all bundles (sets) of size c ∈ O(1/ε) to add to S. At every iteration, the algorithm first identifies the bundle x which maximizes the noisy mean value. After identifying x, it then considers all possible bundles z ∈ B_S(x) and takes the one whose noisy value is largest. We include a formal description below.

Figure 1: Illustration of a ball around x = {x_1, x_2} where N = {x_1, x_2, x_3} and S = ∅. We think of x as a point in [0, 1]³ and B_S(x) = {x_{1,3}, x_{2,3}} = {{x_2, x_3}, {x_1, x_3}} = {(0, 1, 1), (1, 0, 1)}.

Algorithm 1 SM-GREEDY
Input: budget k, precision ε > 0, c ≥ 56/ε
1: S ← ∅
2: while |S| < c · ⌊k/c⌋ do
3:   x ← argmax_{b:|b|=c} F̃(S ∪ b)
4:   x̂ ← argmax_{z∈B(x)} f̃(S ∪ z)
5:   S ← S ∪ x̂
6: end while
7: return S

A key step in analyzing greedy algorithms like the one above is showing that in every iteration the marginal contribution of the element selected by the algorithm is arbitrarily close to maximal. 
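The two-step selection rule of SM-GREEDY can be sketched in Python. This is a brute-force illustration under toy parameters, not the authors' implementation: the argmax over all bundles of size c is exponential in c, so it is only feasible for very small n and c, and the inline noisy oracle (f(S) = √|S| with cached uniform multipliers) is our own assumption.

```python
import math
import random
from itertools import combinations

def ball(x, S, ground):
    """B_S(x): bundles obtained by swapping one element of x with one
    element outside S ∪ x (Hamming distance one from x)."""
    outside = [j for j in ground if j not in S and j not in x]
    return [tuple(sorted((set(x) - {i}) | {j})) for i in x for j in outside]

def noisy_mean(f_tilde, S, x, ground):
    """F~(S ∪ x): the average of f~ over the ball around x."""
    B = ball(x, S, ground)
    return sum(f_tilde(S | set(z)) for z in B) / len(B)

def sm_greedy(f_tilde, ground, k, c):
    """Each iteration: (i) find the bundle x of size c maximizing the noisy
    mean value F~, then (ii) add the bundle in B_S(x) whose noisy value f~
    is largest (not x itself)."""
    S = set()
    while len(S) < c * (k // c):
        candidates = combinations(sorted(ground - S), c)
        x = max(candidates, key=lambda b: noisy_mean(f_tilde, S, b, ground))
        x_hat = max(ball(x, S, ground), key=lambda z: f_tilde(S | set(z)))
        S |= set(x_hat)
    return S

# Toy run: f(S) = sqrt(|S|) with consistent multiplicative noise on [0.9, 1.1].
rng, cache = random.Random(0), {}
def f_tilde(S):
    key = frozenset(S)
    if key not in cache:
        cache[key] = rng.uniform(0.9, 1.1)  # xi_S, one draw per set
    return cache[key] * math.sqrt(len(S))

solution = sm_greedy(f_tilde, ground=set(range(8)), k=4, c=2)
assert len(solution) == 4
```

The indirection in step (ii) is the point of the sampled mean approach: the ball average washes out the noise when choosing x, while the final pick inside the ball exploits the largest noise multiplier among mostly good bundles.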
This can then be used in a standard inductive argument to show that the algorithm obtains an approximation arbitrarily close to 1 − 1/e. The main crux of the analysis is showing that this property indeed holds in SM-GREEDY w.h.p. when k ∈ Ω(1/ε²) ∩ O(√(log n)). For k > √(log n) the analysis becomes substantially simpler since it suffices to argue that the algorithm chooses an element whose marginal contribution is approximately optimal in expectation (details are in the proof of Theorem 3.5).

3.1 Analysis for k ∈ Ω(1/ε²) ∩ O(√(log n))

Throughout the rest of this section we analyze a single iteration of SM-GREEDY in which a set S ⊂ N was selected in previous iterations and the algorithm adds a bundle x̂ of size c. Specifically, x̂ ∈ argmax_{z∈B_S(x)} f̃(S ∪ z), where x ∈ argmax_{b:|b|=c} F̃(S ∪ b).

As discussed above, for x* ∈ argmax_{b:|b|=c} f_S(b) we want to show that when c ∈ O(1/ε):

f_S(x̂) ≥ (1 − ε) f_S(x*).

To do so, we will define two kinds of bundles in B_S(x), called good and bad. A good bundle is a bundle z for which f_S(z) ≥ (1 − (2/3)ε) f_S(x*), and a bad bundle z is one s.t. f_S(z) ≤ (1 − ε) f_S(x*). Our goal is to prove that the bundle x̂ added by the algorithm is w.h.p. not bad. Since, according to the definition of good and bad, the true marginal contribution of good bundles has a fixed advantage over the true marginal contribution of bad bundles, and x̂ is the bundle with the largest noisy value, essentially what we need to show is that the largest noise multiplier of a good bundle is sufficiently close to the largest noise multiplier of a bad bundle, with sufficiently high probability.

As a first step we quantify the fraction of good bundles in a ball, which will then allow us to bound the values of the noise multipliers of good and bad bundles. The following claim implies that at least half
The following claim implies at least half\nof the bundles in BS(x) are good and at most half are bad (the proof is in Appendix B).\nClaim 3.1. Suppose FS(x) 1 \u270f\n\n3 fS(x?). Then at least half of the bundles in BS(x) are good.\n\nWe next de\ufb01ne two thresholds \u21e0inf and \u21e0sup. The threshold \u21e0inf is used as a lower bound on the\nlargest value obtained when sampling at least |BS (x)|\nrandom variables from the noise distribution.\nSince at least half of the bundles in the ball are good, \u21e0inf is a lower bound on the largest noise\nmultiplier of a good bundle. The threshold \u21e0sup is used as an upper bound on the largest value\nobtained when sampling at most |BS (x)|\nrandom variables from the noise distribution. Since at most\nhalf of the bundles in the ball are bad, \u21e0sup upper bounds the value of a noise multiplier of a bad\nbundle. Throughout the rest of this section D will denote a generalized exponential tail distribution.\n\n2\n\n2\n\n5\n\n\fDe\ufb01nition. Let m = |BS (x)|\n\n2\n\n. For probability density function \u21e2(x) of D we de\ufb01ne:\n\n\u2022 \u21e0sup is the value for which:R 1\n\u2022 \u21e0inf is the value for which:R 1\n\n\u21e0inf\n\n\u21e0sup\n\nClaim 3.2. Let m = |BS (x)|\n\n2\n\n\u21e2(x)dx = 2\n\nm log n;\n\n\u21e2(x)dx = 2 log n\nm .\n\n, \u21e01, . . . ,\u21e0 m be i.i.d samples of D and \u21e0? = max{\u21e01, . . . ,\u21e0 m}. Then:\nPrh \u21e0inf \uf8ff \u21e0? \uf8ff \u21e0sup i 1 \n\nlog n\n\n3\n\nThe proof is in Appendix B. Since at least half of the bundles in the ball are good and at most half are\nbad, the above claim implies that with probability 1 3/ log n the maximal value of noise multipliers\nof good and bad bundles fall within the range [\u21e0inf,\u21e0 sup]. Since this holds in every iteration, when\nk 2 O(plog n) by a union bound we get that this holds in all iterations w.p. 
1 − o(1).

At this point, our lower bound on the largest noisy value of a good bundle is:

max_{z∈good} f̃(S ∪ z) ≥ ξ_inf × (f(S) + (1 − (2/3)ε) f_S(x*))

and the upper bound on the noisy value of any bad bundle is:

max_{z∈bad} f̃(S ∪ z) ≤ ξ_sup × (f(S) + (1 − ε) f_S(x*)).

Let x̂ = argmax_{z∈B_S(x)} f̃(S ∪ z). We show that given the right bound on ξ_inf against ξ_sup, x̂ is not a bad bundle (it does not have to be a good bundle). Lemma 3.3 below gives us such a bound.

Lemma 3.3. For any generalized exponential tail distribution, fixed β > 0, and sufficiently large n:

ξ_inf ≥ (1 − β/√(log n)) ξ_sup.

The proof is quite technical and fully leverages the fact that k ∈ O(√(log n)) and the properties of generalized exponential tail distributions. The main challenge is that the tail of the distribution is not necessarily monotone (see Appendix B for further discussion). We defer the proof to Appendix B.

Proving the Main Lemma. We can now prove our main lemma, which shows that for any k ∈ ω(1/ε) ∩ O(√(log n)) taking a bundle according to the sampled mean approach is guaranteed to be close to the optimal bundle. We state it for a cardinality constraint, but this fact holds more generally for any matroid of rank k. We include a short proof sketch and defer the full proof to Appendix B.

Lemma 3.4. Let ε > 0, and assume that k ∈ ω(1/ε) ∩ O(√(log n)) and that f_S(x*) ∈ Ω(f(S)/k). Then, in every iteration of SM-GREEDY we have that with probability at least 1 − 4/log n:

f_S(x̂) ≥ (1 − ε) f_S(x*).

Proof Sketch. From Corollary 2.3 we know that in every iteration, with overwhelmingly high probability, F_S(x) ≥ (1 − λ) f_S(x*) for λ = ε/3. 
Since Lemma 3.3 applies for any fixed β > 0, we know that for sufficiently large n we can lower bound the value of the maximal good bundle in the ball:

max_{z∈good} f̃(S ∪ z) ≥ (1 − 2λ/(3√(log n))) ξ_sup × [f(S) + (1 − 2λ) f_S(x*)].

Let b be the bad bundle with the maximal noisy value. To bound this:

f̃(S ∪ b) = max_{z∈bad} ξ_{S∪z} f(S ∪ z) ≤ ξ_sup × [f(S) + (1 − 3λ) f_S(x*)].

The difference between the two bounds is positive, implying that a bad bundle is not selected.

Theorem 3.5. For any fixed ε > 0 and k ≥ 4/ε², SM-GREEDY returns a set S s.t. w.p. 1 − o(1):

f(S) ≥ (1 − 1/e − ε)OPT.

Proof. Let κ = ⌊k/c⌋ and use O to denote the set of κ bundles of size c whose total value is largest. To simplify notation we will treat sets of bundles as sets of elements. We will show that f(S) ≥ (1 − 1/e − δ) f(O) where δ = ε/2. Notice that this implies f(S) ≥ (1 − 1/e − ε)OPT since κ > k/c − 1 and by submodularity f(O) ≥ (1 − δ)OPT when k ≥ 1/δ² = 4/ε².

We introduce some notation: we use x̂_i to denote the bundle selected at iteration i ∈ [κ], S_i = ∪_{j≤i} x̂_j, x_i ∈ argmax_z F̃(S_{i−1} ∪ z), x*_i ∈ argmax_z f_{S_{i−1}}(z), and O is the set of κ bundles with maximal value. Define Δ_i = f(O) − f(S_{i−1}), and S_0 = ∅. From submodularity and monotonicity:

f_{S_{i−1}}(x*_i) ≥ Δ_i / κ.   (1)

Consider now a set of bundles {z_1, ..., z_κ} where for every i ∈ [κ] we have that z_i is drawn u.a.r. from B_{S_{i−1}}(x_i). For each such bundle we can assign a random variable ζ_i for which f_{S_{i−1}}(z_i) = ζ_i f_{S_{i−1}}(x*_i). Since in every iteration i ∈ [κ] we choose the set whose value is maximal in B_{S_{i−1}}(x_i), by stochastic dominance we know that f_{S_{i−1}}(x̂_i) ≥ f_{S_{i−1}}(z_i), and therefore:

f(S_i) − f(S_{i−1}) = f_{S_{i−1}}(x̂_i) ≥ ζ_i f_{S_{i−1}}(x*_i) ≥ (ζ_i/κ) Δ_i.

We will now show by induction that for all i ∈ [κ] we have that Δ_{i+1} ≤ ∏_{j=1}^{i} (1 − ζ_j/κ) f(O). This is clearly the case for i = 1 when S_0 = ∅, and in general, applying the inductive hypothesis, we get:

Δ_{i+1} = f(O) − f(S_i) = Δ_i − (f(S_i) − f(S_{i−1})) ≤ Δ_i (1 − ζ_i/κ) ≤ ∏_{j=1}^{i+1} (1 − ζ_j/κ) f(O).

Observe that the solution of the algorithm S respects f(S) = f(S_κ) = f(O) − Δ_{κ+1}, thus:

f(S) ≥ (1 − ∏_{j=1}^{κ} (1 − ζ_j/κ)) f(O) ≥ (1 − e^{−(1/κ)Σ_{j=1}^{κ} ζ_j}) f(O).

From Lemma 3.4, when k ≤ √(log n), for every i ∈ [κ − 1] we have that w.p. 1 − 4/log n:³

f_{S_{i−1}}(x̂_i) ≥ (1 − δ) f_{S_{i−1}}(x*_i).

Therefore by a union bound, with probability 1 − o(1), we have that ζ_i ≥ (1 − δ) for all i ∈ [κ]. In particular, (1/κ)Σ_{j=1}^{κ} ζ_j ≥ (1 − δ). Otherwise, when k > √(log n), from Lemma 2.1:

E_{z∼B(x_i)}[f_{S_{i−1}}(z)] = F_{S_{i−1}}(x_i) ≥ (1 − δ/2) f_{S_{i−1}}(x*_i).

Thus E[ζ_{i+1}] ≥ (1 − δ/2), and by Chernoff, when κ > √(log n), we get (1/κ)Σ_{i=1}^{κ} ζ_i ≥ (1 − δ) w.p. at least 1 − exp(−δ²κ/8). Therefore, in both cases, when k ≤ √(log n) and when k ≥ √(log n), we have that (1/κ)Σ_{j=1}^{κ} ζ_j ≥ (1 − δ) w.p. 1 − o(1). Since 1 − e^{−(1−δ)} ≥ 1 − 1/e − δ, this implies f(S) ≥ (1 − 1/e − δ) f(O).

Constant k and information theoretic lower bounds. For any constant k, a single iteration of a minor modification of SM-GREEDY suffices. In Appendix B.1 we show an approximation arbitrarily close to 1 − 1/k w.h.p. and 1 − 1/(k + 1) in expectation. For k = 1 this is arbitrarily close to 1/2. In Appendix D we show nearly matching lower bounds, and in particular that no randomized algorithm can obtain an approximation ratio better than 1/2 + o(1) when k = 1, and that it is impossible to obtain an approximation better than (2k − 1)/2k + O(1/√n) for the optimal set of size k.

Theorem 3.6. 
For any non-negative monotone submodular function there is a deterministic polynomial-time algorithm which optimizes the function under a cardinality constraint k ≥ 3 and obtains an approximation ratio that is arbitrarily close to 1 − 1/e with probability 1 − o(1) using access to an approximate oracle. For k ≥ 2 there is a randomized algorithm whose approximation ratio is arbitrarily close to 1 − 1/e, in expectation over the randomization of the algorithm. For k = 1 the algorithm achieves a 1/2 approximation in expectation, and no randomized algorithm can achieve an approximation better than 1/2 + o(1), in expectation.

³W.l.o.g. we assume that in every iteration f_{S_{i−1}}(x*_i) ∈ Ω(f(S_{i−1})/k) in order to apply Lemma 3.4. Since κ ∈ Ω(1/ε), k ∈ Ω(1/ε²), and f(S) ≤ OPT, ignoring iterations where this does not hold costs δεOPT for a small fixed δ.

4 Approximation Algorithm for Matroids

For an intersection of matroids F, the algorithm from Section 3 is generalized as described below.

Algorithm 2 SM-MATROID-GREEDY
Input: intersection of matroids F, precision ε > 0, c ≥ 56/ε
1: S ← ∅, X ← N
2: while X ≠ S do
3:   X ← X \ {x : S ∪ x ∉ F}
4:   x ← argmax_{b:|b|=c} F̃(S ∪ b)
5:   x̂ ← argmax_{z∈B(x)} f̃(S ∪ z)
6:   S ← S ∪ x̂
7: end while
8: return S

The analysis of the algorithm uses the lemma below, which is a generalization of the classic result of [NWF78a]. The proof can be found in the Appendix.

Lemma 4.1. Let O be the optimal solution, k = |O|, and for every iteration i of SM-MATROID-GREEDY let S_i be the set of elements selected and x*_i ∈ argmax_{|z|=c} f_{S_{i−1}}(z). Then:

f(O) ≤ (P + 1) Σ_{i=1}^{k/c} f_{S_i}(x*_i).

Theorem 4.2. Let F denote the intersection of P ≥ 1 matroids on the ground set N, and let f : 2^N → R be a non-negative monotone submodular function. 
Then with probability 1 − o(1) the SM-MATROID-GREEDY algorithm returns a set S ∈ F s.t.:

f(S) ≥ ((1 − ε)/(P + 1)) OPT.

Proof Sketch. If the rank of the matroid is O(1/ε²) we can simply apply the case of small k as in the previous section. Otherwise, assume the rank is at least Ω(1/ε²). Let κ = k/c, let S_i = {x̂_1, ..., x̂_i} be the current solution of bundles of size c at iteration i ∈ [κ] of the algorithm, and let x*_i be the optimal bundle at iteration i, i.e. x*_i = argmax_{b:|b|≤c} f_{S_{i−1}}(b). In every iteration i ∈ [κ], similar to the proof of Theorem 3.5, since we choose the set whose value is maximal in B_{S_{i−1}}(x_i) we have:

f_{S_{i−1}}(x̂_i) ≥ ζ_i f_{S_{i−1}}(x*_i)

where ζ_i is a random variable with mean (1 − δ/2). Therefore:

f(S) = Σ_{i=1}^{κ} f_{S_{i−1}}(x̂_i) ≥ Σ_{i=1}^{κ} ζ_i f_{S_{i−1}}(x*_i).

From Lemma 3.4, when k ≤ √(log n), for every i ∈ [κ − 1] we have that w.p. 1 − 4/log n:

f_{S_{i−1}}(x̂_i) ≥ (1 − δ) f_{S_{i−1}}(x*_i).

Therefore by a union bound, with probability 1 − o(1), we have that ζ_i ≥ (1 − δ) for all i ∈ [κ]. Otherwise, when k > √(log n), we apply a Chernoff bound. We get that with probability 1 − o(1):

f(S) ≥ Σ_{i=1}^{κ} ζ_i f_{S_{i−1}}(x*_i) ≥ (1 − δ) Σ_{i=1}^{κ} f_{S_{i−1}}(x*_i).

From Lemma 4.1 this implies the result.

Acknowledgements. A.H. is supported by 1394/16 and by a BSF grant. Y.S. is supported by NSF grant CAREER CCF-1452961, NSF CCF-1301976, BSF grant 2014389, NSF USICCS proposal 1540428, a Google Research award, and a Facebook research award.

References

[Ang88] Dana Angluin. Queries and concept learning. Machine Learning, 2(4):319–342, 1988.

[AS04] Alexander A. Ageev and Maxim Sviridenko. Pipage rounding: A new method of constructing algorithms with proven performance guarantee. J. Comb. Optim., 8(3), 2004.

[BDF+10] D. Buchfuhrer, S. Dughmi, H. Fu, R. Kleinberg, E. Mossel, C. H. Papadimitriou, M. Schapira, Y. Singer, and C. Umans. 
Inapproximability for VCG-based combinatorial auctions. In SIAM-ACM Symposium on Discrete Algorithms (SODA), pages 518–536, 2010.

[BF02] Nader H. Bshouty and Vitaly Feldman. On using extended statistical queries to avoid membership queries. Journal of Machine Learning Research, 2:359–395, 2002.

[BFNS12] Niv Buchbinder, Moran Feldman, Joseph Naor, and Roy Schwartz. A tight linear time (1/2)-approximation for unconstrained submodular maximization. In 53rd Annual IEEE Symposium on Foundations of Computer Science, FOCS 2012, New Brunswick, NJ, USA, October 20-23, 2012, pages 649–658, 2012.

[BFNS14] Niv Buchbinder, Moran Feldman, Joseph Naor, and Roy Schwartz. Submodular maximization with cardinality constraints. In Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014, pages 1433–1452, 2014.

[BH11] Maria-Florina Balcan and Nicholas J. A. Harvey. Learning submodular functions. In Proceedings of the 43rd ACM Symposium on Theory of Computing, STOC 2011, San Jose, CA, USA, 6-8 June 2011, pages 793–802, 2011.

[BMKK14] Ashwinkumar Badanidiyuru, Baharan Mirzasoleiman, Amin Karbasi, and Andreas Krause. Streaming submodular maximization: massive data summarization on the fly. In The 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '14, New York, NY, USA, August 24-27, 2014, pages 671–680, 2014.

[BSS10] D. Buchfuhrer, M. Schapira, and Y. Singer. Computation and incentives in combinatorial public projects. In EC, pages 33–42, 2010.

[CCPV07] Gruia Calinescu, Chandra Chekuri, Martin Pál, and Jan Vondrák. Maximizing a submodular set function subject to a matroid constraint. In Integer programming and combinatorial optimization, pages 182–196. Springer, 2007.

[CE11] Chandra Chekuri and Alina Ene. Approximation algorithms for submodular multiway partition.
In IEEE 52nd Annual Symposium on Foundations of Computer Science, FOCS 2011, Palm Springs, CA, USA, October 22-25, 2011, pages 807–816, 2011.

[CE16] C. P. Chambers and F. Echenique. Revealed Preference Theory. Econometric Society Monographs. Cambridge University Press, 2016.

[CHHK18] Lin Chen, Christopher Harshaw, Hamed Hassani, and Amin Karbasi. Projection-free online optimization with stochastic gradient: From convexity to submodularity. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, pages 813–822, 2018.

[CJV15] Chandra Chekuri, T. S. Jayram, and Jan Vondrák. On multiplicative weight updates for concave and submodular function maximization. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, ITCS 2015, Rehovot, Israel, January 11-13, 2015, pages 201–210, 2015.

[DFK11] Shahar Dobzinski, Hu Fu, and Robert D. Kleinberg. Optimal auctions with correlated bidders are easy. In STOC, pages 129–138, 2011.

[DK14] J. Djolonga and A. Krause. From MAP to marginals: Variational inference in Bayesian submodular models. In Advances in Neural Information Processing Systems (NIPS), 2014.

[DLN08] Shahar Dobzinski, Ron Lavi, and Noam Nisan. Multi-unit auctions with budget limits. In FOCS, 2008.

[DNS05] Shahar Dobzinski, Noam Nisan, and Michael Schapira. Approximation algorithms for combinatorial auctions with complement-free bidders. In STOC, pages 610–618, 2005.

[DRY11] Shaddin Dughmi, Tim Roughgarden, and Qiqi Yan. From convex optimization to randomized mechanisms: toward optimal combinatorial auctions. In STOC, pages 149–158, 2011.

[DS06] Shahar Dobzinski and Michael Schapira. An improved approximation algorithm for combinatorial auctions with submodular bidders.
In Proceedings of the Seventeenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1064–1073. Society for Industrial and Applied Mathematics, 2006.

[DV12] Shahar Dobzinski and Jan Vondrák. The computational complexity of truthfulness in combinatorial auctions. In EC, pages 405–422, 2012.

[Fei98] Uriel Feige. A threshold of ln n for approximating set cover. Journal of the ACM (JACM), 45(4):634–652, 1998.

[Fel09] Vitaly Feldman. On the power of membership queries in agnostic learning. Journal of Machine Learning Research, 10:163–182, 2009.

[FFI+15] Uriel Feige, Michal Feldman, Nicole Immorlica, Rani Izsak, Brendan Lucier, and Vasilis Syrgkanis. A unifying hierarchy of valuations with complements and substitutes. In AAAI, pages 872–878, 2015.

[FMV11] Uriel Feige, Vahab S. Mirrokni, and Jan Vondrák. Maximizing non-monotone submodular functions. SIAM Journal on Computing, 40(4):1133–1153, 2011.

[FNW78] Marshall L. Fisher, George L. Nemhauser, and Laurence A. Wolsey. An analysis of approximations for maximizing submodular set functions—II. Springer, 1978.

[FV06] Uriel Feige and Jan Vondrák. Approximation algorithms for allocation problems: Improving the factor of 1-1/e. In Foundations of Computer Science, 2006. FOCS '06. 47th Annual IEEE Symposium on, pages 667–676. IEEE, 2006.

[GFK10] D. Golovin, M. Faulkner, and A. Krause. Online distributed sensor selection. In IPSN, 2010.

[GK10] R. Gomes and A. Krause. Budgeted nonparametric learning from data streams. In Int. Conference on Machine Learning (ICML), 2010.

[GK11] D. Golovin and A. Krause. Adaptive submodularity: Theory and applications in active learning and stochastic optimization. JAIR, 42:427–486, 2011.

[GKS90] Sally A. Goldman, Michael J. Kearns, and Robert E. Schapire. Exact identification of circuits using fixed points of amplification functions (abstract).
In Proceedings of the Third Annual Workshop on Computational Learning Theory, COLT 1990, University of Rochester, Rochester, NY, USA, August 6-8, 1990, page 388, 1990.

[HS] Thibaut Horel and Yaron Singer. Maximizing approximately submodular functions. In Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016.

[HS17] Avinatan Hassidim and Yaron Singer. Submodular optimization under noise. In COLT, 2017.

[HSK17] S. Hamed Hassani, Mahdi Soltanolkotabi, and Amin Karbasi. Gradient methods for submodular maximization. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages 5843–5853, 2017.

[Jac94] Jeffrey C. Jackson. An efficient membership-query algorithm for learning DNF with respect to the uniform distribution. In 35th Annual Symposium on Foundations of Computer Science, Santa Fe, New Mexico, USA, 20-22 November 1994, pages 42–53, 1994.

[JB11a] S. Jegelka and J. Bilmes. Approximation bounds for inference using cooperative cuts. In Int. Conference on Machine Learning (ICML), 2011.

[JB11b] S. Jegelka and J. Bilmes. Submodularity beyond submodular energies: Coupling edges in graph cuts. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1897–1904, 2011.

[KG07] A. Krause and C. Guestrin. Nonmyopic active learning of Gaussian processes: an exploration-exploitation approach. In Int. Conference on Machine Learning (ICML), 2007.

[KKT03] D. Kempe, J. Kleinberg, and E. Tardos. Maximizing the spread of influence through a social network. In ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2003.

[KLMM05] Subhash Khot, Richard J. Lipton, Evangelos Markakis, and Aranyak Mehta. Inapproximability results for combinatorial auctions with submodular utility functions.
In Internet and Network Economics, pages 92–101. Springer, 2005.

[KMVV13] Ravi Kumar, Benjamin Moseley, Sergei Vassilvitskii, and Andrea Vattani. Fast greedy algorithms in mapreduce and streaming. In SPAA, 2013.

[KOJ13] P. Kohli, A. Osokin, and S. Jegelka. A principled deep random field for image segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2013.

[LB11a] H. Lin and J. Bilmes. A class of submodular functions for document summarization. In ACL/HLT, 2011.

[LB11b] H. Lin and J. Bilmes. Optimal selection of limited vocabulary speech corpora. In Proc. Interspeech, 2011.

[LCSZ17] Qiang Li, Wei Chen, Xiaoming Sun, and Jialin Zhang. Influence maximization with ε-almost submodular threshold functions. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages 3804–3814, 2017.

[LKG+07] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2007.

[LMNS09] Jon Lee, Vahab S. Mirrokni, Viswanath Nagarajan, and Maxim Sviridenko. Non-monotone submodular maximization under matroid and knapsack constraints. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May 31 - June 2, 2009, pages 323–332, 2009.

[LSST13] Brendan Lucier, Yaron Singer, Vasilis Syrgkanis, and Éva Tardos. Equilibrium in combinatorial public projects. In WINE, pages 347–360, 2013.

[MHK18] Aryan Mokhtari, Hamed Hassani, and Amin Karbasi. Conditional gradient method for stochastic submodular maximization: Closing the gap.
In International Conference on Artificial Intelligence and Statistics, AISTATS 2018, 9-11 April 2018, Playa Blanca, Lanzarote, Canary Islands, Spain, pages 1886–1895, 2018.

[MSV08] Vahab S. Mirrokni, Michael Schapira, and Jan Vondrák. Tight information-theoretic lower bounds for welfare maximization in combinatorial auctions. In Proceedings 9th ACM Conference on Electronic Commerce (EC-2008), Chicago, IL, USA, June 8-12, 2008, pages 70–77, 2008.

[NW78] George L. Nemhauser and Leonard A. Wolsey. Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research, 3(3):177–188, 1978.

[NWF78a] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular set functions—II. Math. Programming Study 8, 1978.

[NWF78b] George L. Nemhauser, Laurence A. Wolsey, and Marshall L. Fisher. An analysis of approximations for maximizing submodular set functions—I. Mathematical Programming, 14(1):265–294, 1978.

[PP11] Christos H. Papadimitriou and George Pierrakos. On optimal single-item auctions. In STOC, pages 119–128, 2011.

[PSS08] Christos H. Papadimitriou, Michael Schapira, and Yaron Singer. On the hardness of being truthful. In 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008, October 25-28, 2008, Philadelphia, PA, USA, pages 250–259, 2008.

[QSY+17] Chao Qian, Jing-Cheng Shi, Yang Yu, Ke Tang, and Zhi-Hua Zhou. Subset selection under noise. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 4-9 December 2017, Long Beach, CA, USA, pages 3563–3573, 2017.

[RLK11] M. Gomez Rodriguez, J. Leskovec, and A. Krause. Inferring networks of diffusion and influence. ACM TKDD, 5(4), 2011.

[SGK09] M. Streeter, D. Golovin, and A. Krause. Online learning of assignments.
In Advances in Neural Information Processing Systems (NIPS), 2009.

[SS95] Eli Shamir and Clara Schwartzman. Learning by extended statistical queries and its relation to PAC learning. In Computational Learning Theory: Eurocolt '95, pages 357–366. Springer-Verlag, 1995.

[SS08] Michael Schapira and Yaron Singer. Inapproximability of combinatorial public projects. In WINE, pages 351–361, 2008.

[VCZ11] Jan Vondrák, Chandra Chekuri, and Rico Zenklusen. Submodular function maximization via the multilinear relaxation and contention resolution schemes. In Proceedings of the Forty-third Annual ACM Symposium on Theory of Computing, STOC '11, pages 783–792, New York, NY, USA, 2011. ACM.

[Von08] Jan Vondrák. Optimal approximation for the submodular welfare problem in the value oracle model. In STOC, pages 67–74, 2008.