{"title": "Beyond Online Balanced Descent: An Optimal Algorithm for Smoothed Online Optimization", "book": "Advances in Neural Information Processing Systems", "page_first": 1875, "page_last": 1885, "abstract": "We study online convex optimization in a setting where the learner seeks to minimize the sum of a per-round hitting cost and a movement cost which is incurred when changing decisions between rounds. We prove a new lower bound on the competitive ratio of any online algorithm in the setting where the costs are $m$-strongly convex and the movement costs are the squared $\\ell_2$ norm. This lower bound shows that no algorithm can achieve a competitive ratio that is $o(m^{-1/2})$ as $m$ tends to zero. No existing algorithms have competitive ratios matching this bound, and we show that the state-of-the-art algorithm, Online Balanced Decent (OBD), has a competitive ratio that is $\\Omega(m^{-2/3})$. We additionally propose two new algorithms, Greedy OBD (G-OBD) and Regularized OBD (R-OBD) and prove that both algorithms have an $O(m^{-1/2})$ competitive ratio. The result for G-OBD holds when the hitting costs are quasiconvex and the movement costs are the squared $\\ell_2$ norm, while the result for R-OBD holds when the hitting costs are $m$-strongly convex and the movement costs are Bregman Divergences. Further, we show that R-OBD simultaneously achieves constant, dimension-free competitive ratio and sublinear regret when hitting costs are strongly convex.", "full_text": "Beyond Online Balanced Descent: An Optimal\nAlgorithm for Smoothed Online Optimization\n\nGautam Goel*1 Yiheng Lin*2,1 Haoyuan Sun*1 Adam Wierman1\n\n2Institute for Interdisciplinary Information Sciences, Tsinghua University\n\n1California Institute of Technology\n\nAbstract\n\nWe study online convex optimization in a setting where the learner seeks to mini-\nmize the sum of a per-round hitting cost and a movement cost which is incurred\nwhen changing decisions between rounds. 
We prove a new lower bound on the competitive ratio of any online algorithm in the setting where the costs are m-strongly convex and the movement costs are the squared ℓ2 norm. This lower bound shows that no algorithm can achieve a competitive ratio that is o(m^{-1/2}) as m tends to zero. No existing algorithms have competitive ratios matching this bound, and we show that the state-of-the-art algorithm, Online Balanced Descent (OBD), has a competitive ratio that is Ω(m^{-2/3}). We additionally propose two new algorithms, Greedy OBD (G-OBD) and Regularized OBD (R-OBD), and prove that both algorithms have an O(m^{-1/2}) competitive ratio. The result for G-OBD holds when the hitting costs are quasiconvex and the movement costs are the squared ℓ2 norm, while the result for R-OBD holds when the hitting costs are m-strongly convex and the movement costs are Bregman divergences. Further, we show that R-OBD simultaneously achieves a constant, dimension-free competitive ratio and sublinear regret when hitting costs are strongly convex.

1 Introduction

We consider the problem of Smoothed Online Convex Optimization (SOCO), a variant of online convex optimization (OCO) where the online learner pays a movement cost for changing actions between rounds. More precisely, we consider a game where an online learner plays a series of rounds against an adaptive adversary. In each round, the adversary picks a convex cost function ft : R^d → R≥0 and shows it to the learner. 
After observing the cost function, the learner chooses an action xt and pays a hitting cost ft(xt), as well as a movement cost c(xt, xt−1), which penalizes the online learner for switching points between rounds.

SOCO was originally proposed in the context of dynamic power management in data centers [28]. Since then it has seen a wealth of applications, from speech animation to management of electric vehicle charging [24–26], and more recently applications in control [21, 22] and power systems [5, 27]. SOCO has been widely studied in the machine learning community, with the special cases of online logistic regression and smoothed online maximum likelihood estimation receiving recent attention [22].

Additionally, SOCO has connections to a number of other important problems in online algorithms and learning. Convex Body Chasing (CBC), introduced in [20], is a special case of SOCO [14].

Gautam Goel, Yiheng Lin, and Haoyuan Sun contributed equally to this work. This work was supported by NSF grants AitF-1637598 and CNS-1518941, with additional support for Gautam Goel provided by an Amazon AWS AI Fellowship.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

The problem of designing competitive algorithms for Convex Body Chasing has attracted much recent attention, e.g. [2, 6, 14]. SOCO can also be viewed as a continuous version of the Metrical Task System (MTS) problem (see [9, 11, 12]). A special case of MTS is the celebrated k-server problem, first proposed in [30], which has received significant attention in recent years (see [13, 15]).

Given these connections, the design and analysis of algorithms for SOCO and related problems has received considerable attention in the last decade. SOCO was first studied in the scalar setting in [29], which used SOCO to model dynamic "right-sizing" in data centers and gave a 3-competitive algorithm. 
A 2-competitive algorithm was shown in [8], also in the scalar setting, which matches the lower bound for online algorithms in this setting [1]. Another rich line of work studies how to design competitive algorithms for SOCO when the online algorithm has access to predictions of future cost functions (see [16, 17, 27, 28]).

Despite a large and growing literature on SOCO and related problems, for nearly a decade the only known constant-competitive algorithms that did not use predictions of future costs were for one-dimensional action spaces. In fact, the connections between SOCO and Convex Body Chasing highlight that, in general, one cannot expect dimension-free constant-competitive algorithms due to an Ω(√d) lower bound (see [18, 20]). However, recently there has been considerable progress moving beyond the one-dimensional setting for large, important classes of hitting and movement costs.

A breakthrough came in 2017 when [18] proposed a new algorithm, Online Balanced Descent (OBD), and showed that it is constant competitive in all dimensions in the setting where the hitting costs are locally polyhedral and movement costs are the ℓ2 norm. The following year, [22] showed that OBD is also constant competitive, specifically 3 + O(1/m)-competitive, in the setting where the hitting costs are m-strongly convex and the movement costs are the squared ℓ2 norm. Note that this setting is of particular interest because of its importance for online regression and LQR control (see [22]).

While OBD has proven to be a promising new algorithm, at this point it is not known whether OBD is optimal for the competitive ratio, or if there is more room for improvement. This is because there are no non-trivial lower bounds known for important classes of hitting costs, the most prominent of which is the class of strongly convex functions.

Contributions of this paper. 
In this paper we prove the first non-trivial lower bounds on SOCO with strongly convex hitting costs, both for general algorithms and for OBD specifically. These lower bounds show that OBD is not optimal and that there is an order-of-magnitude gap between its performance and the general lower bound. Motivated by this gap and the construction of the lower bounds, we present two new algorithms, both variations of OBD, which have competitive ratios that match the lower bound. More specifically, we make four main contributions in this paper.

First, we prove a new lower bound on the performance achievable by any online algorithm in the setting where the hitting costs are m-strongly convex and the movement costs are the squared ℓ2 norm. In particular, in Theorem 1, we show that as m tends to zero, any online algorithm must have competitive ratio at least Ω(m^{-1/2}).

Second, we show that the state-of-the-art algorithm, OBD, cannot match this lower bound. More precisely, in Theorem 2 we show that, as m tends to zero, the competitive ratio of OBD is Ω(m^{-2/3}), an order of magnitude higher than the lower bound of Ω(m^{-1/2}). This immediately begs the question: can any online algorithm close the gap and match the lower bound?

Our third contribution answers this question in the affirmative. In Section 4, we propose two novel algorithms, Greedy Online Balanced Descent (G-OBD) and Regularized Online Balanced Descent (R-OBD), which are able to close the gap left open by OBD and match the Ω(m^{-1/2}) lower bound. Both algorithms can be viewed as "aggressive" variants of OBD, in the sense that they chase the minimizers of the hitting costs more aggressively than OBD. In Theorem 3 we show that G-OBD matches the lower bound up to constant factors for quasiconvex hitting costs (a more general class than m-strongly convex). 
In Theorem 4 we show that R-OBD has a competitive ratio that precisely matches the lower bound, including the constant factors, and hence can be viewed as an optimal algorithm for SOCO in the setting where the costs are m-strongly convex and the movement cost is the squared ℓ2 norm. Further, our results for R-OBD hold not only for squared ℓ2 movement costs; they also hold for movement costs that are Bregman divergences, which commonly appear throughout information geometry, probability, and optimization.

Finally, in our last section we move beyond the competitive ratio and additionally consider regret. We prove in Theorem 6 that R-OBD can simultaneously achieve a bounded, dimension-free competitive ratio and sublinear regret in the case of m-strongly convex hitting costs and squared ℓ2 movement costs. This result helps close a crucial gap in the literature. Previous work has shown that it is not possible for any algorithm to simultaneously achieve both a constant competitive ratio and sublinear regret in general SOCO problems [19]. However, this was shown through the use of linear hitting and movement costs. Thus, the question of whether it is possible to simultaneously achieve a dimension-free, constant competitive ratio and sublinear regret when hitting costs are strongly convex has remained open. The closest previous result is from [18], which showed that OBD can achieve either a constant competitive ratio or sublinear regret with locally polyhedral cost functions depending on the "balance condition" used; however, both cannot be achieved simultaneously. 
Our result (Theorem 6) shows that R-OBD can simultaneously provide a constant competitive ratio and sublinear regret for strongly convex cost functions when the movement costs are the squared ℓ2 norm.

2 Model & Preliminaries

An instance of Smoothed Online Convex Optimization (SOCO) consists of a convex action set X ⊂ R^d, an initial point x0 ∈ X, a sequence of non-negative convex cost functions f1, ..., fT : R^d → R≥0, and a movement cost c : R^d × R^d → R≥0. In every round, the environment picks a cost function ft (potentially adversarially) for an online learner. After observing the cost function, the learner chooses an action xt ∈ R^d and pays a cost that is the sum of the hitting cost, ft(xt), and the movement cost, a.k.a. switching cost, c(xt, xt−1). The goal of the online learner is to minimize its total cost over T rounds: cost(ALG) = ∑_{t=1}^T ft(xt) + c(xt, xt−1).

We emphasize that it is the movement costs that make this problem interesting and challenging; if there were no movement costs, c(xt, xt−1) = 0, the problem would be trivial, since the learner could always pay the optimal cost simply by picking the action that minimizes the hitting cost in each round, i.e., by setting xt = arg min_x ft(x). The movement cost couples the cost the learner pays across rounds, which means that the optimal action of the learner depends on unknown future costs.

There is a long literature on SOCO, both focusing on algorithmic questions, e.g., [8, 18, 22, 29], and applications, e.g., [24–26, 28]. The variety of applications studied means that a variety of assumptions about the movement costs have been considered. Motivated by applications to data center capacity management, movement costs have often been taken as the ℓ1 norm, i.e., c(x1, x2) = ‖x1 − x2‖_1, e.g. [8, 29]. 
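For concreteness, the total-cost objective just defined is straightforward to evaluate for any fixed action sequence; a minimal sketch (the function names below are ours, not the paper's):

```python
def soco_cost(fs, xs, x0, c):
    """Total SOCO cost of an action sequence: sum_t f_t(x_t) + c(x_t, x_{t-1})."""
    total, prev = 0.0, x0
    for f, x in zip(fs, xs):
        total += f(x) + c(x, prev)
        prev = x
    return total

# Squared l2 movement cost in one dimension, the setting focused on below.
sq_move = lambda x, y: 0.5 * (x - y) ** 2
```

For example, with hitting costs (x − 1)^2 and x^2 and the action sequence (1, 0) started at x0 = 0, the hitting costs are zero but each switch pays 1/2, for a total cost of 1.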
However, recently, more general norms have been considered, and the setting of squared ℓ2 movement costs has gained attention due to its use in online regression problems and connections to LQR control, among other applications (see [3, 21, 22]).

In this paper, we focus on the setting of the squared ℓ2 norm, i.e. c(x2, x1) = (1/2)‖x2 − x1‖_2^2; however, we also consider a generalization of the ℓ2 norm in Section 4.2 where c is the Bregman divergence. Specifically, we consider c(xt, xt−1) = Dh(xt‖xt−1) = h(xt) − h(xt−1) − ⟨∇h(xt−1), xt − xt−1⟩, where both the potential h and its Fenchel conjugate h∗ are differentiable. Further, we assume that h is α-strongly convex and β-strongly smooth with respect to an underlying norm ‖·‖. Definitions of each of these properties can be found in the appendix.

Note that the squared ℓ2 norm is itself a Bregman divergence, with α = β = 1 and ‖·‖ = ‖·‖_2: Dh(xt‖xt−1) = (1/2)‖xt − xt−1‖_2^2. However, more generally, when h(y) = ∑_i yi ln yi with domain Δn = {y ∈ [0, 1]^n | ∑_i yi = 1}, Dh(xt‖xt−1) is the Kullback–Leibler divergence (see [7]). Further, h is 1/(2 ln 2)-strongly convex and 1/(δ ln 2)-strongly smooth in the domain X = Pδ = {y ∈ [0, 1]^n | ∑_i yi = 1, yi ≥ δ} (see [18]). This extension is important given the role Bregman divergence plays across optimization and information theory, e.g., see [4, 31].

As for movement costs, a variety of assumptions have been made about hitting costs. In particular, because of the emergence of pessimistic lower bounds when general convex hitting costs are considered, papers typically have considered restricted classes of functions, e.g., locally polyhedral [18] and strongly convex [22]. 
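As a quick check on the definitions above, the Bregman divergence can be computed directly from a potential h and its gradient; the sketch below (names ours) covers the two special cases just mentioned, the squared ℓ2 cost and the KL divergence:

```python
import numpy as np

def bregman(h, grad_h, x, y):
    """D_h(x || y) = h(x) - h(y) - <grad h(y), x - y>."""
    return h(x) - h(y) - np.dot(grad_h(y), x - y)

# Squared l2 potential: h(x) = (1/2)||x||^2 gives D_h(x||y) = (1/2)||x - y||^2.
h_sq = lambda x: 0.5 * np.dot(x, x)
g_sq = lambda x: x

# Negative entropy: h(y) = sum_i y_i ln y_i gives the KL divergence on the simplex.
h_ent = lambda x: float(np.sum(x * np.log(x)))
g_ent = lambda x: np.log(x) + 1.0
```

On the simplex the linear terms involving ∑_i (x_i − y_i) cancel, which is why the negative-entropy case reduces exactly to ∑_i x_i ln(x_i / y_i).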
In this paper, we focus on hitting costs that are m-strongly convex; however, our results in Section 4.1 generalize to the case of quasiconvex functions.

Competitive Ratio and Regret. The primary goal of the SOCO literature is to design online algorithms that (nearly) match the performance of the offline optimal algorithm. The performance metric used to evaluate an algorithm is typically the competitive ratio, because the goal is to learn in an environment that is changing dynamically and is potentially adversarial. The competitive ratio is the worst-case ratio of the total cost incurred by the online learner and the offline optimal cost. The cost of the offline optimal is defined as the minimal cost of an algorithm with full knowledge of the sequence of costs {ft}, i.e. cost(OPT) = min_{x1,...,xT} ∑_{t=1}^T ft(xt) + c(xt, xt−1). Using this, the competitive ratio is defined as sup_{f1,...,fT} cost(ALG)/cost(OPT).

Algorithm 1 Online Balanced Descent (OBD)
1: procedure OBD(ft, xt−1, γ)   ▷ Procedure to select xt
2:   vt ← arg min_x ft(x)
3:   Let x(l) = Π_{K_t^l}(xt−1). Initialize l = ft(vt). Here K_t^l = {x | ft(x) ≤ l}.
4:   Increase l. Stop when c(x(l), xt−1) = γ(l − ft(vt)).
5:   xt ← x(l).
6:   return xt

Note that another important performance measure of interest is the regret. 
In this paper, we study a generalization of the classical regret called the L-constrained regret, which is defined as follows. The L-(constrained) dynamic regret of an online algorithm ALG is ρL(T) if, for all sequences of cost functions f1, ..., fT, we have cost(ALG) − cost(OPT(L)) ≤ ρL(T), where OPT(L) is the cost of an L-constrained offline optimal solution, i.e., one with movement cost upper bounded by L: OPT(L) = min_{x∈X^T} ∑_{t=1}^T ft(xt) + c(xt, xt−1) subject to ∑_{t=1}^T c(xt, xt−1) ≤ L.

As the definitions above highlight, the regret and competitive ratio both compare with the cost of an offline optimal solution; however, regret constrains the movement allowed by the offline optimal. The classical notion of regret focuses on the static optimal (L = 0), but relaxing that to allow limited movement bridges regret and the competitive ratio since, as L grows, the L-constrained offline optimal approaches the offline (dynamic) optimal. Intuitively, one can think of regret as being suited for evaluating learning algorithms in (nearly) static settings, while the competitive ratio is suited for evaluating learning algorithms in dynamic settings.

Online Balanced Descent. The state-of-the-art algorithm for SOCO is Online Balanced Descent (OBD). OBD, which is formally defined in Algorithm 1, uses the operator Π_K(x) : R^d → K to denote the ℓ2 projection of x onto a convex set K; this operator is defined as Π_K(x) = arg min_{y∈K} ‖y − x‖_2. Intuitively, it works as follows. In every round, OBD projects the previously chosen point xt−1 onto a carefully chosen level set of the current cost function ft. The level set is chosen so that the hitting costs and movement costs are "balanced": in every round, the movement cost is at most a constant γ times the hitting cost. The balance helps ensure that the online learner is matching the offline costs. 
Since neither cost is too high, OBD ensures that both are comparable to the offline optimal. The parameter γ can be tuned to give the optimal competitive ratio, and the appropriate level set can be efficiently selected via binary search.

Implicitly, OBD can be viewed as a proximal algorithm with a dynamic step size [32], in the sense that, like proximal algorithms, OBD iteratively projects the previously chosen point onto a level set of the cost function. Unlike traditional proximal algorithms, OBD considers several different level sets, and carefully selects the level set in every round so as to balance the hitting and movement costs. We exploit this connection heavily when designing Regularized OBD (R-OBD), which is a proximal algorithm with a special regularization term added to the objective to help steer the online learner towards the hitting cost minimizer in each round.

OBD was proposed in [18], where the authors show that it has a constant, dimension-free competitive ratio in the setting where the movement costs are the ℓ2 norm and the hitting costs are locally polyhedral, i.e. grow at least linearly away from the minimizer. This was the first time an algorithm had been shown to be constant competitive beyond one-dimensional action spaces. In the same paper, a variation of OBD that uses a different balance condition was proven to have O(√(TL)) L-constrained regret for locally polyhedral hitting costs. OBD has since been shown to also have a constant, dimension-free competitive ratio when movement costs are the squared ℓ2 norm and hitting costs are strongly convex, which is the setting we consider in this paper. 
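The OBD step described above can be sketched numerically. The version below (all names ours) specializes to the quadratic hitting cost ft(x) = (m/2)‖x − vt‖_2^2, for which the projection onto a sublevel set has a closed form, and uses bisection on the level l to enforce the balance condition:

```python
import numpy as np

def obd_step(v, m, x_prev, gamma=1.0, tol=1e-10):
    """One OBD step for the quadratic hitting cost f(x) = (m/2)||x - v||^2.

    Projects x_prev onto the sublevel set {x : f(x) <= l}, with l chosen by
    bisection so that the movement cost (1/2)||x(l) - x_prev||^2 equals
    gamma times the hitting cost l - f(v) = l (balance condition).
    """
    d = x_prev - v
    dist = np.linalg.norm(d)
    if dist == 0.0:
        return x_prev.copy()
    f_prev = 0.5 * m * dist ** 2  # hitting cost of standing still

    def project(l):
        # Sublevel sets are balls around v; projection moves along the ray v -> x_prev.
        r = np.sqrt(2.0 * l / m)
        if dist <= r:
            return x_prev.copy()
        return v + (r / dist) * d

    def balance_gap(l):
        x = project(l)
        move = 0.5 * float(np.dot(x - x_prev, x - x_prev))
        return move - gamma * l  # positive while the movement cost is still too large

    lo, hi = 0.0, f_prev
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if balance_gap(mid) > 0:
            lo = mid
        else:
            hi = mid
    return project(hi)
```

For instance, with m = 1, v = 0, x_prev = (2, 0) and γ = 1, the balance condition picks the level l = 1/2, so the step lands at (1, 0): the movement cost (1/2)·1^2 equals γ times the hitting cost (1/2)·1^2.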
However, up until this paper, lower bounds for the strongly convex setting did not exist, and it was not known whether the performance of OBD in this setting is optimal, or whether OBD can simultaneously achieve sublinear regret and a constant, dimension-free competitive ratio.

3 Lower Bounds

Our first set of results focuses on lower bounding the competitive ratio achievable by online algorithms for SOCO. While [18] proves a general lower bound for SOCO showing that the competitive ratio of any online algorithm is Ω(√d), where d is the dimension of the action space, there are large classes of important problems where better performance is possible. In particular, when the hitting costs are m-strongly convex, [22] has shown that OBD provides a dimension-free competitive ratio of 3 + O(1/m). However, no non-trivial lower bounds are known for the strongly convex setting.

Our first result in this section shows a general lower bound on the competitive ratio of SOCO algorithms when the hitting costs are strongly convex and the movement costs are quadratic. Importantly, there is a gap between this bound and the competitive ratio for OBD proven in [22]. Our second result further explores this gap. We show a lower bound on the competitive ratio of OBD which highlights that OBD cannot achieve a competitive ratio that matches the general lower bound. This gap, and the construction used to show it, motivate us to propose new variations of OBD in the next section. We then prove that these new algorithms have competitive ratios that match the lower bound.

We begin by stating the first lower bound for strongly convex hitting costs in SOCO.

Theorem 1. Consider hitting cost functions that are m-strongly convex with respect to the ℓ2 norm and movement costs given by (1/2)‖xt − xt−1‖_2^2. 
Any online algorithm must have a competitive ratio of at least (1/2)(1 + √(1 + 4/m)).

Theorem 1 is proven in the appendix using an argument that leverages the fact that, when the movement cost is quadratic, reaching a target point via one large step is more costly than reaching it by taking many small steps. More concretely, to prove the lower bound we consider a scenario on the real line where the online algorithm encounters a sequence of cost functions whose minimizers are at zero, followed by a very steep cost function whose minimizer is at x = 1. Without knowledge of the future, the algorithm has no incentive to move away from zero until the last step, when it is forced to incur a large cost; however, the offline adversary, with full knowledge of the cost sequence, can divide the journey into multiple small steps.

Importantly, the lower bound in Theorem 1 highlights the dependence of the competitive ratio on m, the convexity parameter. It shows that the case where online algorithms do the worst is when m is small, and that the algorithms that match the lower bound up to a constant are those for which the competitive ratio is O(m^{-1/2}) as m → 0+. Note that our results in Section 4 show that there exist online algorithms that precisely achieve the competitive ratio in Theorem 1. However, in contrast, the following shows that OBD cannot match the lower bound in Theorem 1.

Theorem 2. Consider hitting cost functions that are m-strongly convex with respect to the ℓ2 norm and movement costs given by (1/2)‖xt − xt−1‖_2^2. The competitive ratio of OBD is Ω(m^{-2/3}) as m → 0+, for any fixed balance parameter γ.

As we have discussed, OBD is the state-of-the-art algorithm for SOCO, and has been shown to provide a competitive ratio of 3 + O(1/m) [22]. 
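The effect exploited in the proof sketch of Theorem 1, that with squared ℓ2 movement costs splitting one unit-length move into N equal steps reduces the total movement cost from 1/2 to 1/(2N), is easy to check directly (a toy illustration, not the paper's full construction):

```python
def movement_cost(points):
    """Total squared-l2 movement cost (1/2)(x_t - x_{t-1})^2 along a 1-D path."""
    return sum(0.5 * (b - a) ** 2 for a, b in zip(points, points[1:]))

one_step = movement_cost([0.0, 1.0])                       # one big step: 1/2
n = 10
many_steps = movement_cost([i / n for i in range(n + 1)])  # N small steps: 1/(2N)
```

This is exactly why the offline adversary, which knows the steep cost is coming, pays far less by spreading the journey to x = 1 over many rounds.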
However, Theorem 2 highlights a gap between OBD and the general lower bound. If the lower bound is achievable (which we prove it is in the next section), this implies that OBD is a sub-optimal algorithm.

The proof of Theorem 2 gives important intuition about what goes wrong with OBD and how the algorithm can be improved. Specifically, our proof of Theorem 2 considers a scenario where the cost functions have minimizers very near each other, but OBD takes a series of steps without approaching the minimizing points. The optimal is able to pay little cost and stay near the minimizers, but OBD never moves enough to be close to the minimizers. Figure 1 illustrates the construction, showing OBD moving along the circumference of a circle, while the offline optimal stays near the origin.

Figure 1: Counterexample used to prove Theorem 2. In the figure, {xt} are the choices of OBD and {x∗t} are the choices of the offline optimal.

4 Algorithms

The lower bounds in Theorem 1 and Theorem 2 suggest a gap between the competitive ratio of OBD and what is achievable via an online algorithm. Further, the construction used in the proof of Theorem 2 highlights the core issue that leads to inefficiency in OBD. In the construction, OBD takes a large step from xt−1 to xt, but its distance to the offline optimal, x∗t, decreases only by a very small amount. This means that OBD is continually chasing the offline optimal but never closing the gap. In this section, we take inspiration from this example and develop two new algorithms that build on OBD but ensure that the gap to the offline optimal x∗t shrinks.

How to ensure that the gap to the offline optimal shrinks is not obvious since, without knowledge about the future, it is impossible to determine how x∗t will evolve. A natural idea is to determine an online estimate of x∗t and then move towards that estimate. 
Motivated by the construction in the proof of Theorem 2, we use the minimizer of the hitting cost at round t, vt, as a rough estimate of the offline optimal and ensure that we close the gap to vt in each round.

There are a number of ways of implementing the goal of ensuring that OBD moves more aggressively toward the minimizer of the hitting cost each round. In this section, we consider two concrete approaches, each of which (nearly) matches the lower bound in Theorem 1.

The first approach, which we term Greedy OBD (Algorithm 2), is a two-stage algorithm, where the first stage applies OBD and then a second stage explicitly takes a step directly towards the minimizer (of carefully chosen size). We introduce the algorithm and analyze its performance in Section 4.1. Greedy OBD is order-optimal, i.e. matches the lower bound up to constant factors, in the setting of squared ℓ2 norm movement costs and quasiconvex hitting costs.

The second approach for ensuring that OBD moves aggressively toward the minimizer uses a different view of OBD. In particular, Greedy OBD uses a geometric view of OBD, which is the way OBD has been presented previously in the literature. Our second view uses a "local view" of OBD that parallels the local view of gradient descent and mirror descent, e.g., see [7, 23]. In particular, the choice of an action in OBD can be viewed as the solution to a per-round local optimization. Given this view, we ensure that OBD more aggressively tracks the minimizer by adding a regularization term to this local optimization which penalizes points that are far from the minimizer. We term this approach Regularized OBD (Algorithm 3), and study it in Section 4.2. Note that Regularized OBD has a competitive ratio that precisely matches the lower bound, including the constant factors, when movement costs are Bregman divergences and hitting costs are m-strongly convex. 
Thus, it applies for more general movement costs than Greedy OBD but less general hitting costs.

4.1 Greedy OBD

The formal description of Greedy Online Balanced Descent (G-OBD) is given in Algorithm 2. G-OBD has two steps each round. First, the algorithm takes a standard OBD step from the previous point xt−1 to a new point x′t, which is the projection of xt−1 onto a level set of the current hitting cost ft, where the level set is chosen to balance hitting and movement costs. G-OBD then takes an additional step directly towards the minimizer of the hitting cost, vt, with the size of the step chosen based on the convexity parameter m. G-OBD can be implemented efficiently using the same approach as described for OBD [18]. G-OBD has two parameters γ and µ. The first, γ, is the balance parameter in OBD, and the second, µ, is a parameter controlling the size of the step towards the minimizer vt.

Algorithm 2 Greedy Online Balanced Descent (G-OBD)
1: procedure G-OBD(ft, xt−1)   ▷ Procedure to select xt
2:   vt ← arg min_x ft(x)
3:   x′t ← OBD(ft, xt−1, γ)
4:   if µ√m ≥ 1 then
5:     xt ← vt
6:   else
7:     xt ← µ√m · vt + (1 − µ√m) · x′t
8:   return xt

Algorithm 3 Regularized OBD (R-OBD)
1: procedure R-OBD(ft, xt−1)   ▷ Procedure to select xt
2:   vt ← arg min_x ft(x)
3:   xt ← arg min_x ft(x) + λ1·c(x, xt−1) + λ2·c(x, vt)
4:   return xt

Note that the two-step approach of G-OBD is reminiscent of the two-stage algorithm used in [10]; however, the resulting algorithms are quite distinct.

While the addition of a second step in G-OBD may seem like a small change, it improves performance by an order of magnitude. 
We prove that G-OBD asymptotically matches the lower bound proven in Theorem 1, not just for m-strongly convex hitting costs but more broadly for quasiconvex costs.

Theorem 3. Consider quasiconvex hitting costs such that ft(x) ≥ ft(vt) + (m/2)‖x − vt‖_2^2 and movement costs c(xt, xt−1) = (1/2)‖xt − xt−1‖_2^2. G-OBD with γ = 1, µ = 1 is an O(m^{-1/2})-competitive algorithm as m → 0+.

4.2 Regularized OBD

The G-OBD framework is based on the geometric view of OBD used previously in the literature. There are, however, two limitations to this approach. First, the competitive ratio obtained, while having the optimal asymptotic dependence on m, does not match the constants in the lower bound of Theorem 1. Second, G-OBD requires repeated projections, which makes efficient implementation challenging when the functions ft have complex geometry.

Here, we present a variation of OBD based on a local view that overcomes these limitations. Regularized OBD (R-OBD) is computationally simpler and provides a competitive ratio that matches the constant factors in the lower bound in Theorem 1. However, unlike G-OBD, our analysis of R-OBD does not apply to quasiconvex hitting costs. R-OBD is described formally in Algorithm 3. In each round, R-OBD picks a point that minimizes a weighted sum of the hitting and movement costs, as well as a regularization term which encourages the algorithm to pick points close to the minimizer of the current hitting cost function, vt = arg min_x ft(x). Thus, R-OBD can be implemented efficiently using two invocations of a convex solver. Note that R-OBD has two parameters, λ1 and λ2, which adjust the weights of the movement cost and the regularizer respectively.

While it may not be immediately clear how R-OBD connects to OBD, it is straightforward to illustrate the connection in the squared ℓ2 setting. 
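For intuition about the R-OBD update in this setting, note that with a quadratic hitting cost ft(x) = (m/2)‖x − vt‖_2^2 and squared ℓ2 movement costs, the per-round minimization in Algorithm 3 has a closed form; a minimal sketch under that assumption (names ours):

```python
import numpy as np

def r_obd_step(v, m, x_prev, lam1, lam2):
    """R-OBD update for f_t(x) = (m/2)||x - v||^2 with squared l2 movement costs.

    Minimizes f_t(x) + (lam1/2)||x - x_prev||^2 + (lam2/2)||x - v||^2.
    Setting the gradient m(x - v) + lam1(x - x_prev) + lam2(x - v) to zero
    gives a weighted average of the minimizer v and the previous point.
    """
    return ((m + lam2) * v + lam1 * x_prev) / (m + lam2 + lam1)
```

The closed form makes the regularizer's role visible: increasing λ2 shifts the chosen point toward vt, while increasing λ1 keeps it near xt−1.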
In this case, computing xt = arg min_x ft(x) + (λ1/2)‖x − xt−1‖_2^2 is equivalent to doing a projection onto a level set of ft, since the selection of the minimizer can be restated as the solution to ∇ft(xt) + λ1(xt − xt−1) = 0. Thus, without the regularizer, the optimization in R-OBD gives a local view of OBD, and the regularizer then provides more aggressive movement toward the minimizer of the hitting cost.

Not only does the local view lead to a computationally simpler algorithm, but we prove that R-OBD matches the constant factors in Theorem 1 precisely, not just asymptotically. Further, it does this not just in the setting where movement costs are the squared ℓ2 norm, but also in the case where movement costs are Bregman divergences.

Theorem 4. Consider hitting costs that are m-strongly convex with respect to a norm ‖·‖ and movement costs defined as c(xt, xt−1) = Dh(xt‖xt−1), where h is α-strongly convex and β-strongly smooth with respect to the same norm. Additionally, assume {ft}, h and its Fenchel conjugate h∗ are differentiable. Then, R-OBD with parameters 1 ≥ λ1 > 0 and λ2 ≥ 0 has a competitive ratio of max{ (m + λ2β)/(λ1·m), 1 + (β²/α)·λ1·(λ2β + m)^{-1} }. If λ1 and λ2 satisfy m + λ2β = (λ1·m/2)·(1 + √(1 + 4β²/(αm))), then the competitive ratio is (1/2)·(1 + √(1 + 4β²/(αm))).

Theorem 4 focuses on movement costs that are Bregman divergences, which generalizes the case of squared ℓ2 movement costs. 
To recover the squared (cid:96)2 case, we use (cid:107)\u00b7(cid:107) = (cid:107)\u00b7(cid:107)2 and \u03b1 = \u03b2 = 1,\nwhich results in a competitive ratio of 1\nwith the lower bound claimed in Theorem 1. Further, in this case the assumption in Theorem 4 that\nthe hitting cost functions are differentiable is not required (see Theorem ?? in the appendix).\nIt is also interesting to investigate the settings of \u03bb1 and \u03bb2 that yield the optimal competitive ratio.\n\n1 + 4\u03b22\nm\u03b1\n\n1 + 4\u03b22\n\u03b1m\n\nSetting \u03bb2 = 0 achieves the optimal competitive ratio as long as \u03bb1 = 2\n. By\nrestating the update rule in R-OBD as \u2207ft(xt) = \u03bb1(\u2207h(xt\u22121) \u2212 \u2207h(xt)), we see that R-OBD\nwith \u03bb2 = 0 can be interpreted as \u201cone step lookahead mirror descent\u201d. Further R-OBD with \u03bb2 = 0\ncan be implemented even when we do not know the location of the minimizer vt. For example,\n2 (cid:107)x(cid:107)2\n2, we can run gradient descent starting at xt\u22121 to minimize the strongly convex\nwhen h(x) = 1\n2 (cid:107)x \u2212 xt\u22121(cid:107)2\nfunction ft(x) + \u03bb1\n2. Only local gradients will be queried in this process. However, the\nfollowing lower bound highlights that this simple form comes at some cost in terms of generality\nwhen compared with our results for G-OBD.\nTheorem 5. Consider quasiconvex hitting costs such that ft(x) \u2212 ft(vt) \u2265 m\nmovement costs given by c(xt, xt\u22121) = 1\nof \u2126(1/m) when \u03bb2 = 0.\n\n2 and\n2. Regularized OBD has a competitive ratio\n\n2 (cid:107)xt \u2212 xt\u22121(cid:107)2\n\n2 (cid:107)x \u2212 vt(cid:107)2\n\n(cid:113)\n\n(cid:18)\n\n5 Balancing Regret and Competitive Ratio\n\nIn the previous sections we have focused on the competitive ratio; however another important\nperformance measure is regret. In this section, we consider the L-constrained dynamic regret. 
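Before analyzing regret, the λ2 = 0 variant discussed above can be grounded in a short sketch that queries only local gradients of ft and never uses vt. This is an illustrative sketch, not the paper's code: the quadratic hitting cost and the name robd_local_step are our assumptions.

```python
import numpy as np

def robd_local_step(grad_f, x_prev, lam1, lr=0.01, iters=2000):
    """R-OBD step with lam2 = 0 and h(x) = (1/2)||x||^2: minimize
    f_t(x) + (lam1/2)||x - x_prev||^2 by gradient descent started at
    x_prev, using only local gradient queries (no knowledge of v_t)."""
    x = x_prev.copy()
    for _ in range(iters):
        x = x - lr * (grad_f(x) + lam1 * (x - x_prev))
    return x

# Illustrative m-strongly convex hitting cost: f_t(x) = (m/2)||x - v_t||^2.
m = 0.5
v_t = np.array([2.0, 1.0])
grad_f = lambda x: m * (x - v_t)
# The lam1 that is optimal when lam2 = 0 (here alpha = beta = 1):
lam1 = 2.0 / (1.0 + np.sqrt(1.0 + 4.0 / m))
x_prev = np.zeros(2)
x_t = robd_local_step(grad_f, x_prev, lam1)

# Check the "one step lookahead mirror descent" identity
# grad f_t(x_t) = lam1 * (x_prev - x_t), since grad h is the identity here.
assert np.allclose(grad_f(x_t), lam1 * (x_prev - x_t), atol=1e-6)
```

The final assertion is exactly the optimality condition ∇ft(xt) = λ1(∇h(xt−1) − ∇h(xt)) specialized to h(x) = ½‖x‖₂².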
The motivation for our study is [19], which provides an impossibility result showing that no algorithm can simultaneously maintain a constant competitive ratio and sub-linear regret in the general setting of SOCO. However, [19] utilizes linear hitting costs in its construction, and thus it is an open question as to whether this impossibility result holds for strongly convex hitting costs. In this section, we show that the impossibility result does not hold for strongly convex hitting costs. To show this, we first characterize the parameters for which R-OBD gives sublinear regret.

Theorem 6. Consider hitting costs that are m-strongly convex with respect to a norm ‖·‖ and movement costs defined as c(xt, xt−1) = Dh(xt‖xt−1), where h is α-strongly convex and β-strongly smooth with respect to the same norm. Additionally, assume {ft}, h, and its Fenchel conjugate h* are differentiable. Further, suppose that ‖∇h(x)‖* is bounded above by G < ∞, the diameter of the feasible set X is bounded above by D, and ∇h(0) = 0. Then, for λ1, λ2 such that λ1 ≥ 1 − m/(4β) and λ2 = η(T, L, D, G), where η(T, L, D, G) is such that lim_{T→∞} η(T, L, D, G) · (D²/G) · √(T/L) < ∞, the L-constrained regret of R-OBD is O(G√(TL)).

Theorem 6 highlights that O(G√(TL)) regret can be achieved when λ1 ≥ 1 − m/(4β) and λ2 ≤ (KG/D²)√(L/T) for some constant K. This suggests that the tendency to aggressively move towards the minimizer should shrink over time in order to achieve a small regret. It is not possible to use Theorem 6 to simultaneously achieve the optimal competitive ratio and O(G√(TL)) regret for all strongly convex hitting costs (m > 0). However, the corollary below shows that it is possible to simultaneously achieve a dimension-free, constant competitive ratio and an O(G√(TL)) regret for all m > 0. An interesting open question that remains is whether it is possible to develop an algorithm that has sublinear regret and matches the optimal order for competitive ratio.

Corollary 1. Consider the same conditions as in Theorem 6 and fix m > 0. R-OBD with parameters λ1 = max{2(1 + √(1 + 4β²/(αm)))⁻¹, 1 − m/(4β)}, λ2 = 0 has an O(G√(TL)) regret and is max{½(1 + √(1 + 4β²/(αm))), 1 − β/(4α) + β²/(αm)}-competitive.

References

[1] A. Antoniadis and K. Schewior. A tight lower bound for online convex optimization with switching costs. In Proceedings of the International Workshop on Approximation and Online Algorithms, pages 164–175. Springer, 2017.

[2] C. Argue, S. Bubeck, M. B. Cohen, A. Gupta, and Y. T. Lee. A nearly-linear bound for chasing nested convex bodies. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 117–122, 2019.

[3] K. J. Åström and R. M. Murray. Feedback Systems: An Introduction for Scientists and Engineers. Princeton University Press, 2010.

[4] N. Azizan and B. Hassibi. Stochastic gradient/mirror descent: Minimax optimality and implicit regularization. In Proceedings of the International Conference on Learning Representations (ICLR), 2019.

[5] M. Badiei, N. Li, and A. Wierman. Online convex optimization with ramp constraints. In IEEE Conference on Decision and Control (CDC), pages 6730–6736, 2015.

[6] N. Bansal, M. Böhm, M. Eliáš, G. Koumoutsos, and S. W. Umboh. Nested convex bodies are chaseable.
In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1253–1260, 2018.

[7] N. Bansal and A. Gupta. Potential-function proofs for first-order methods. arXiv preprint arXiv:1712.04581, 2017.

[8] N. Bansal, A. Gupta, R. Krishnaswamy, K. Pruhs, K. Schewior, and C. Stein. A 2-competitive algorithm for online convex optimization with switching costs. In Proceedings of Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2015.

[9] Y. Bartal, A. Blum, C. Burch, and A. Tomkins. A polylog(n)-competitive algorithm for metrical task systems. In Proceedings of the ACM Symposium on Theory of Computing (STOC), pages 711–719, 1997.

[10] M. Bienkowski, J. Byrka, M. Chrobak, C. Coester, L. Jez, and E. Koutsoupias. Better bounds for online line chasing. arXiv preprint arXiv:1811.09233, 2018.

[11] A. Blum and C. Burch. On-line learning and the metrical task system problem. Machine Learning, 39(1):35–58, 2000.

[12] A. Borodin, N. Linial, and M. E. Saks. An optimal on-line algorithm for metrical task system. Journal of the ACM, 39(4):745–763, 1992.

[13] S. Bubeck, M. B. Cohen, Y. T. Lee, J. R. Lee, and A. Mądry. k-server via multiscale entropic regularization. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), pages 3–16, 2018.

[14] S. Bubeck, Y. T. Lee, Y. Li, and M. Sellke. Competitively chasing convex bodies. In Proceedings of the ACM SIGACT Symposium on Theory of Computing (STOC), 2019.

[15] N. Buchbinder, A. Gupta, M. Molinaro, and J. Naor. k-servers with a smile: online algorithms via projections. In Proceedings of the ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 98–116, 2019.

[16] N. Chen, A. Agarwal, A. Wierman, S. Barman, and L. L. Andrew. Online convex optimization using predictions.
ACM SIGMETRICS Performance Evaluation Review, 43(1):191–204, 2015.

[17] N. Chen, J. Comden, Z. Liu, A. Gandhi, and A. Wierman. Using predictions in online optimization: Looking forward with an eye on the past. ACM SIGMETRICS Performance Evaluation Review, 44(1):193–206, 2016.

[18] N. Chen, G. Goel, and A. Wierman. Smoothed online convex optimization in high dimensions via online balanced descent. In Proceedings of the Conference On Learning Theory (COLT), pages 1574–1594, 2018.

[19] A. Daniely and Y. Mansour. Competitive ratio vs regret minimization: achieving the best of both worlds. In Proceedings of Algorithmic Learning Theory, pages 333–368, 2019.

[20] J. Friedman and N. Linial. On convex body chasing. Discrete & Computational Geometry, 9(3):293–321, 1993.

[21] G. Goel, N. Chen, and A. Wierman. Thinking fast and slow: Optimization decomposition across timescales. In Proceedings of the IEEE Conference on Decision and Control (CDC), pages 1291–1298, 2017.

[22] G. Goel and A. Wierman. An online algorithm for smoothed regression and LQR control. In Proceedings of Machine Learning Research, volume 89, pages 2504–2513, 2019.

[23] E. Hazan et al. Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4):157–325, 2016.

[24] V. Joseph and G. de Veciana. Jointly optimizing multi-user rate adaptation for video transport over wireless systems: Mean-fairness-variability tradeoffs. In Proceedings of IEEE INFOCOM, pages 567–575, 2012.

[25] S. Kim and G. B. Giannakis. An online convex optimization approach to real-time energy pricing for demand response. IEEE Transactions on Smart Grid, 8(6):2784–2793, 2017.

[26] T. Kim, Y. Yue, S. Taylor, and I. Matthews. A decision tree framework for spatiotemporal sequence prediction.
In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 577–586, 2015.

[27] Y. Li, G. Qu, and N. Li. Using predictions in online optimization with switching costs: A fast algorithm and a fundamental limit. In Proceedings of the American Control Conference (ACC), pages 3008–3013. IEEE, 2018.

[28] M. Lin, Z. Liu, A. Wierman, and L. L. Andrew. Online algorithms for geographical load balancing. In Proceedings of the International Green Computing Conference (IGCC), pages 1–10, 2012.

[29] M. Lin, A. Wierman, L. L. Andrew, and E. Thereska. Dynamic right-sizing for power-proportional data centers. IEEE/ACM Transactions on Networking (TON), 21(5):1378–1391, 2013.

[30] M. S. Manasse, L. A. McGeoch, and D. D. Sleator. Competitive algorithms for server problems. Journal of Algorithms, 11(2):208–230, 1990.

[31] N. Murata, T. Takenouchi, T. Kanamori, and S. Eguchi. Information geometry of U-Boost and Bregman divergence. Neural Computation, 16(7):1437–1481, 2004.

[32] N. Parikh and S. Boyd. Proximal algorithms. Foundations and Trends in Optimization, 1(3):127–239, 2014.