{"title": "Correlation clustering with local objectives", "book": "Advances in Neural Information Processing Systems", "page_first": 9346, "page_last": 9355, "abstract": "Correlation Clustering is a powerful graph partitioning model that aims to cluster items based on the notion of similarity between items. An instance of the Correlation Clustering problem consists of a graph G (not necessarily complete) whose edges are labeled by a binary classifier as\nsimilar and dissimilar. Classically, we are tasked with producing a clustering that minimizes the number of disagreements: an edge is in disagreement if it is a similar edge and is present across clusters or if it is a dissimilar edge and is present within a cluster. Define the disagreements vector to be an n dimensional vector indexed by the vertices, where the v-th index is the number of disagreements at vertex v.\n\nRecently, Puleo and Milenkovic (ICML '16) initiated the study of the Correlation Clustering framework in which the objectives were more general functions of the disagreements vector. In this paper, we study algorithms for minimizing \\ell_q norms (q >= 1) of the disagreements vector for both arbitrary and complete graphs. We present the first known algorithm for minimizing the \\ell_q norm of the disagreements vector on arbitrary graphs and also provide an improved algorithm for minimizing the \\ell_q norm (q >= 1) of the disagreements vector on complete graphs. We also study an alternate cluster-wise local objective introduced by Ahmadi, Khuller and Saha (IPCO '19), which aims to minimize the maximum number of disagreements associated with a cluster. We present an improved (2 + \\eps) approximation algorithm for this objective.", "full_text": "Correlation Clustering with Local Objectives\n\nSanchit Kalhan\n\nKonstantin Makarychev\n\nTimothy Zhou\n\nAbstract\n\nCorrelation Clustering is a powerful graph partitioning model that aims to cluster items based on the notion of similarity between items. 
An instance of the Correlation Clustering problem consists of a graph G (not necessarily complete) whose edges are labeled by a binary classifier as “similar” and “dissimilar”. An objective which has received a lot of attention in the literature is that of minimizing the number of disagreements: an edge is in disagreement if it is a “similar” edge and is present across clusters or if it is a “dissimilar” edge and is present within a cluster. Define the disagreements vector to be an n-dimensional vector indexed by the vertices, where the v-th index is the number of disagreements at vertex v. Recently, Puleo and Milenkovic (ICML ’16) initiated the study of the Correlation Clustering framework in which the objectives were more general functions of the disagreements vector. In this paper, we study algorithms for minimizing ℓ_q norms (q ≥ 1) of the disagreements vector for both arbitrary and complete graphs. We present the first known algorithm for minimizing the ℓ_q norm of the disagreements vector on arbitrary graphs and also provide an improved algorithm for minimizing the ℓ_q norm (q ≥ 1) of the disagreements vector on complete graphs. We also study an alternate cluster-wise local objective introduced by Ahmadi, Khuller and Saha (IPCO ’19), which aims to minimize the maximum number of disagreements associated with a cluster. We present an improved (2 + ε)-approximation algorithm for this objective. Finally, we complement our algorithmic results for minimizing the ℓ_q norm of the disagreements vector with some hardness results.\n\n1 Introduction\n\nA basic task in machine learning is that of clustering items based on the similarity between them. This task can be elegantly captured by Correlation Clustering, a clustering framework first introduced by Bansal et al. [2004]. 
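As a concrete illustration (not from the paper; the toy graph and the clustering below are illustrative assumptions), the disagreements vector and its ℓ_q norms can be computed directly from a labeled graph and a candidate clustering:\n\n```python
# Minimal sketch, not from the paper: the disagreements vector of a
# clustering and its l_q norms. The toy graph below is an assumption.

def disagreements_vector(n, positive, negative, cluster):
    # disagree[v] counts edges incident to v that disagree with the
    # clustering: positive edges cut by it, negative edges kept inside a part.
    disagree = [0] * n
    for u, v in positive:
        if cluster[u] != cluster[v]:
            disagree[u] += 1
            disagree[v] += 1
    for u, v in negative:
        if cluster[u] == cluster[v]:
            disagree[u] += 1
            disagree[v] += 1
    return disagree

def lq_norm(vec, q):
    if q == float('inf'):
        return max(vec)
    return sum(x ** q for x in vec) ** (1.0 / q)

# An inconsistent triangle: (0,1) and (1,2) are similar, (0,2) is dissimilar,
# so every clustering incurs at least one disagreement.
d = disagreements_vector(3, [(0, 1), (1, 2)], [(0, 2)], [0, 0, 0])
print(d)                                        # -> [1, 0, 1]
print(lq_norm(d, 1), lq_norm(d, float('inf')))  # -> 2.0 1
```\n\nMinimizing the ℓ_1 norm of this vector recovers the classical total-disagreements objective, while the ℓ_∞ norm bounds the worst-off vertex.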
In this model, we are given access to items and the similarity/dissimilarity between them in the form of a graph G on n vertices. The edges of G represent whether the items are similar or dissimilar and are labelled as (“+”) and (“−”) respectively. The goal is to produce a clustering that agrees with the labeling of the edges as much as possible, i.e., to group positive edges in the same cluster and place negative edges across different clusters (a positive edge that is present across clusters or a negative edge that is present within the same cluster is said to be in disagreement). The Correlation Clustering problem can be viewed as an agnostic learning problem, where we are given noisy examples and the task is to fit a hypothesis as best as possible to these examples. Co-reference resolution (see e.g., Cohen and Richman [2001, 2002]), spam detection (see e.g., Ramachandran et al. [2007], Bonchi et al. [2014]) and image segmentation (see e.g., Wirth [2017]) are some of the applications in which Correlation Clustering has been used in practice. This task is made trivial if the given labeling is consistent (transitive): if (u, v) and (v, w) are similar, then (u, w) is similar for all vertices u, v, w in G (the connected components on similar edges would give an optimal clustering). Instead, it is assumed that the given labeling is inconsistent, i.e., it is possible that (u, w) are dissimilar even though (u, v) and (v, w) are similar. For such a triplet u, v, w, every possible clustering incurs a disagreement on at least one edge and thus, no perfect clustering exists. The optimal clustering is the one which minimizes the disagreements.\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n
Moreover, as the number of clusters is not predefined, the optimal clustering can use anywhere from 1 to n clusters. Minimizing the total weight of edges in disagreement is the objective that has received the most consideration in the literature. Define the disagreements vector to be an n-dimensional vector indexed by the vertices, where the v-th coordinate equals the number of disagreements at v. Thus, minimizing the total number of disagreements is equivalent to minimizing the ℓ_1 norm of the disagreements vector.\n\nPuleo and Milenkovic [2016] initiated the study of local objectives in the Correlation Clustering framework. They focus on complete graphs and study the minimization of ℓ_q norms (q ≥ 1) of the disagreements vector – for which they provided a 48-approximation algorithm. Charikar, Gupta, and Schwartz [2017] gave an improved 7-approximation algorithm for minimizing ℓ_q disagreements on complete graphs. They also studied the problem of minimizing the ℓ_∞ norm of the disagreements vector (also known as Min Max Correlation Clustering) for arbitrary graphs, for which they provided an O(√n)-approximation.\n\nFor higher values of q (particularly q = ∞), a clustering optimized for minimizing the ℓ_q norm prioritizes reducing the disagreements at vertices that are worst off. Thus, such metrics are very unforgiving in most cases, as it is possible that in the optimal clustering there is only one vertex with high disagreements while every other vertex has low disagreements. Hence, one is forced to infer the most pessimistic picture about the overall clustering. The ℓ_2 norm is a solution to this tension between the ℓ_1 and ℓ_∞ objectives. 
The ℓ_2 norm of the disagreements vector takes into account the disagreements at each vertex while also penalizing the vertices with high disagreements more heavily. Thus, a clustering optimized for the minimum ℓ_2 norm gives a more balanced clustering, as it takes into consideration both the global and the local picture.\n\nRecently, Ahmadi, Khuller, and Saha [2019b] introduced an alternative min max objective for correlation clustering (which we call the AKS min max objective). For a cluster C ⊆ V, let us refer to similar edges with exactly one endpoint in C and dissimilar edges with both endpoints in C as edges in disagreement with respect to C. We call the weight of all edges in disagreement with C the cost of C. Then, the AKS min max objective asks to find a clustering C_1, ..., C_T that minimizes the maximum cost of a cluster C_i. Ahmadi et al. [2019b] gave an O(log n)-approximation algorithm for this objective. Ahmadi, Galhotra, Khuller, Saha, and Schwartz [2019a] improved the approximation factor to O(√(log n) · max{log |E|, log(k)}).\n\nOur contributions. In this paper, we provide positive and negative results for Correlation Clustering with the ℓ_q objective. We first study the problem of minimizing disagreements on arbitrary graphs. We present the first approximation algorithm minimizing any ℓ_q norm (q ≥ 1) of the disagreements vector.\n\nTheorem 1.1. There exists a polynomial time O(n^{1/2 − 1/(2q)} · log^{1/2 + 1/(2q)} n)-approximation algorithm for the minimum ℓ_q disagreements problem on general weighted graphs.\n\nFor the ℓ_2 objective, the above algorithm leads to an approximation ratio of Õ(n^{1/4}), thus providing the first known approximation ratio for optimizing the clustering for this version of the objective. Note that the above algorithm matches the best approximation guarantee of O(log n) for the classical objective of minimizing the ℓ_1 norm of the disagreements vector. 
For the ℓ_∞ norm, our algorithm matches the guarantee of the algorithm by Charikar, Gupta, and Schwartz [2017] up to log factors. Fundamental combinatorial optimization problems like Multicut, Multiway Cut and s-t Cut can be framed as special cases of Correlation Clustering. Thus, Theorem 1.1 leads to the first known algorithms for Multicut, Multiway Cut and s-t Cut with the ℓ_q objective when q ≠ 1 and q ≠ ∞. We can also use the algorithm from Theorem 1.1 to obtain an O(n^{1/2 − 1/(2q)} · log^{1/2 + 1/(2q)} n) bi-criteria approximation for Min k-Balanced Partitioning with the ℓ_q objective (we omit the details here).\n\nNext, we study the case of complete graphs. For this case, we present an improved 5-approximation algorithm for minimizing any ℓ_q norm (q ≥ 1) of the disagreements vector.\n\nTheorem 1.2. There exists a polynomial time 5-approximation algorithm for the minimum ℓ_q disagreements problem on complete graphs.\n\nWe also study the case of complete bipartite graphs, where disagreements need to be bounded for only one side of the bipartition, and not the whole vertex set. We give an improved 5-approximation algorithm for minimizing any ℓ_q norm (q ≥ 1) of the disagreements vector.\n\nTheorem 1.3. There exists a polynomial time 5-approximation algorithm for the minimum ℓ_q disagreements problem on complete bipartite graphs where disagreements are measured for only one side of the bipartition.\n\nIn this paper, we also consider the AKS min max objective. For this objective, we give a (2 + ε)-approximation algorithm, which improves the approximation ratio of O(√(log n) · max{log |E|, log(k)}) given by Ahmadi, Galhotra, Khuller, Saha, and Schwartz [2019a].\n\nTheorem 1.4. 
There exists a polynomial time (2 + ε)-approximation algorithm for the AKS min max problem on arbitrary graphs.\n\nFinally, in the full version of this paper (see supplemental materials), we present an integrality gap of Ω(n^{1/2 − 1/(2q)}) for minimum ℓ_q s-t cut and prove a hardness of approximation of 2 for minimum ℓ_∞ s-t cut.\n\nPrevious work. Bansal, Blum, and Chawla [2004] showed that it is NP-hard to find a clustering that minimizes the total disagreements, even on complete graphs. They give a constant-factor approximation algorithm to minimize disagreements and a PTAS to maximize agreements on complete graphs. For complete graphs, Ailon, Charikar, and Newman [2008] presented a randomized algorithm with an approximation guarantee of 3 to minimize total disagreements. They also gave a 2.5-approximation algorithm based on LP rounding. This factor was improved to slightly less than 2.06 by Chawla, Makarychev, Schramm, and Yaroslavtsev [2015]. Since the natural LP is known to have an integrality gap of 2, the problem of optimizing the classical objective is almost settled with respect to the natural LP. For arbitrary graphs, the best known approximation ratio is O(log n) (see Charikar, Guruswami, and Wirth [2003], Demaine, Emanuel, Fiat, and Immorlica [2006]). Assuming the Unique Games Conjecture, there is no constant-factor approximation algorithm for minimizing ℓ_1 disagreements on arbitrary graphs (see Chawla et al. [2006]). Puleo and Milenkovic [2016] first studied Correlation Clustering with more local objectives. For minimizing ℓ_q (q ≥ 1) norms of the disagreements vector on complete graphs, their algorithm achieves an approximation guarantee of 48. This was improved to 7 by Charikar, Gupta, and Schwartz [2017]. Charikar et al. [2017] also studied the problem of minimizing the ℓ_∞ norm of the disagreements vector on general graphs. 
They showed that the natural LP/SDP has an integrality gap of n/2 for this problem and provided an O(√n)-approximation algorithm for minimum ℓ_∞ disagreements. Puleo and Milenkovic [2016] also initiated the study of minimizing the ℓ_q norm of the disagreements vector (for one side of the bipartition) on complete bipartite graphs. They presented a 10-approximation algorithm for this problem, which was improved to 7 by Charikar, Gupta, and Schwartz [2017]. Recently, Ahmadi et al. [2019b] studied an alternative objective for the correlation clustering problem. Motivated by creating balanced communities for problems such as image segmentation and community detection in social networks, they propose a new cluster-wise min-max objective. This objective minimizes the maximum weight of edges in disagreement associated with a cluster, where an edge is in disagreement with respect to a cluster if it is a similar edge and has exactly one endpoint in the cluster or if it is a dissimilar edge and has both its endpoints in the cluster. They gave an O(√(log n) · max{log |E|, log(k)})-approximation algorithm for this objective. Moreover, they give an O(r^2)-approximation algorithm for graphs that exclude a K_{r,r} minor, and a 14-approximation algorithm for complete graphs.\n\n2 Preliminaries\n\nWe now formally define the Correlation Clustering with ℓ_q objective problem. We will need the following definition. Consider a set of points V and two disjoint sets of edges on V: positive edges E^+ and negative edges E^−. We assume that every edge has a weight w_uv. For every partition P of V, we say that a positive edge (u, v) is in disagreement with P if the endpoints u and v belong to different parts of P; and a negative edge (u, v) is in disagreement with P if the endpoints u and v belong to the same part of P. The vector of disagreements, denoted by disagree(P, E^+, E^−), is a |V|-dimensional vector indexed by the elements of V. 
Its coordinate u equals\n\ndisagree_u(P, E^+, E^−) = Σ_{v:(u,v)∈E^+∪E^−} w_uv · 1((u, v) is in disagreement with P).\n\nminimize max(‖y‖_q, (Σ_{u∈V} z_u)^{1/q})   (P)\nsubject to y_u = Σ_{v:(u,v)∈E^+} w_uv x_uv + Σ_{v:(u,v)∈E^−} w_uv (1 − x_uv)   for all u ∈ V   (P1)\nz_u = Σ_{v:(u,v)∈E^+} w_uv^q x_uv + Σ_{v:(u,v)∈E^−} w_uv^q (1 − x_uv)   for all u ∈ V   (P2)\nx_{v1v2} + x_{v2v3} ≥ x_{v1v3}   for all v1, v2, v3 ∈ V   (P3)\nx_uv = x_vu   for all u, v ∈ V   (P4)\nx_uv ∈ [0, 1]   for all u, v ∈ V   (P5)\n\nFigure 3.1: Convex relaxation for Correlation Clustering with min ℓ_q objective for q < ∞.\n\nThat is, disagree_u(P, E^+, E^−) is the weight of disagreeing edges incident to u. We similarly define a cut vector for a set of edges E:\n\ncut_u(P, E) = Σ_{v:(u,v)∈E} w_uv · 1(u and v are separated by P).\n\nWe use the standard definition for the ℓ_q norm of a vector x: ‖x‖_q = (Σ_u x_u^q)^{1/q} and ‖x‖_∞ = max_u x_u. For a partition P, we denote by P(u) the piece that contains vertex u.\n\nDefinition 1. In the Correlation Clustering problem with ℓ_q objective, we are given a graph G on a set V with two disjoint sets of edges E^+ and E^− and a set of weights w_uv. The goal is to find a partition P that minimizes the ℓ_q norm of the disagreements vector, ‖disagree(P, E^+, E^−)‖_q.\n\nIn our algorithm for Correlation Clustering on arbitrary graphs, we will use a powerful technique of padded metric space decompositions (see e.g., Bartal [1996], Rao [1999], Fakcharoenphol and Talwar [2003], Gupta, Krauthgamer, and Lee [2003]).\n\nDefinition 2 (Padded Decomposition). Let (X, d) be a metric space on n points, and let Δ > 0. A probabilistic distribution of partitions P of X is called a padded decomposition if it satisfies the following properties:\n\n• Each cluster C ∈ P has diameter at most Δ.\n• For every u ∈ X and ε > 0, Pr(Ball(u, ε) ⊄ P(u)) ≤ D · ε/Δ, where Ball(u, ε) = {v ∈ X : d(u, v) ≤ ε}.\n\nTheorem 2.1 (Fakcharoenphol, Rao, and Talwar [2004]). 
Every metric space (X, d) on n points admits a D = O(log n) separating padded decomposition. Moreover, there is a polynomial-time algorithm that samples a partition from this distribution.\n\n3 Convex Relaxation\n\nIn our algorithms for minimizing ℓ_q disagreements in arbitrary and complete graphs, we use the convex relaxation given in Figure 3.1. Our convex relaxation for Correlation Clustering is fairly standard. It is similar to relaxations used in the papers by Garg, Vazirani, and Yannakakis [1996], Demaine, Emanuel, Fiat, and Immorlica [2006], Charikar, Guruswami, and Wirth [2003]. For every pair of vertices u and v, we have a variable x_uv that is equal to the distance between u and v in the “multicut metric”. Variables x_uv satisfy the triangle inequality constraints (P3). They are also symmetric (P4) and x_uv ∈ [0, 1] (P5). Thus, the set of vertices V equipped with the distance function d(u, v) = x_uv is a metric space.\n\nAdditionally, for every vertex u ∈ V, we have variables y_u and z_u (see constraints (P1) and (P2)) that lower bound the number of disagreeing edges incident to u. The objective of our convex program is to minimize max(‖y‖_q, (Σ_u z_u)^{1/q}). Note that all constraints in the program (P) are linear; however, the objective function of (P) is not convex as is. So in order to find the optimal solution, we raise the objective function to the power of q and find feasible x, y, z that minimize the objective max(‖y‖_q^q, Σ_u z_u).\n\nThis program has a polynomial number of linear constraints, and its objective function is convex: this is because the objective function, max(‖y‖_q^q, Σ_u z_u), is the maximum of two convex functions. The first function, ‖y‖_q^q, is the sum of q-th powers of the variables y_u, which are positive; thus, ‖y‖_q^q is convex and differentiable. The second function, Σ_u z_u, is a linear function. 
Therefore, we can use off-the-shelf convex solvers (quadratic solvers for ℓ_2) to get an optimal solution to (P).\n\nLet us verify that program (P) is a relaxation for Correlation Clustering. Consider an arbitrary partitioning P of V. In the integral solution corresponding to P, we set x_uv = 0 if u and v are in the same cluster in P; and x_uv = 1 if u and v are in different clusters in P. In this solution, the distances x_uv satisfy the triangle inequality constraints (P3) and x_uv = x_vu (P4). Observe that a positive edge (u, v) ∈ E^+ is in disagreement with P if x_uv = 1; and a negative edge (u, v) ∈ E^− is in disagreement if x_uv = 0. Thus, in this integral solution, y_u = disagree_u(P, E^+, E^−) and, moreover, z_u ≤ y_u^q. Therefore, in the integral solution corresponding to P, the objective function of (P) equals ‖disagree(P, E^+, E^−)‖_q. Of course, the cost of the optimal fractional solution to the problem may be less than the cost of the optimal integral solution. Thus, (P) is a relaxation for our problem. Below, we denote the cost of the optimal fractional solution to (P) by LP.\n\nWe remark that we can get a simpler relaxation by removing the variables z and changing the objective function to ‖y‖_q. This relaxation also works for the ℓ_∞ norm. We use it in our 5-approximation algorithm.\n\n4 Overview of Algorithms\n\nWe note that some proofs from Subsections 4.1, 4.2 and 4.3 have been deferred to Sections A, B and C respectively (in the supplementary material). These lemmas and their proofs have been referenced appropriately.\n\n4.1 Correlation Clustering on arbitrary graphs\n\nIn this section, we describe our algorithm for minimizing ℓ_q disagreements on arbitrary graphs. We will prove the following main theorem.\n\nTheorem 4.1. There exists a randomized polynomial-time O(n^{(q−1)/(2q)} · log^{(q+1)/(2q)} n)-approximation algorithm for Correlation Clustering with the ℓ_q objective (q ≥ 1).\n\nWe remark that the same algorithm gives an O(√(n log n))-approximation for the ℓ_∞ norm. We omit
We omit\n\nthe details in the conference version of the paper.\nOur algorithm relies on a procedure for partitioning arbitrary metric spaces into pieces of small\ndiameter. In particular, we prove the following theorem,\nTheorem 4.2. There exists a polynomial-time randomized algorithm that given a metric space (X, d)\non n points and parameter returns a random partition P of X such that the diameter of every set\nP in P is at most and for every q 1 (q 6= 1) and every weighted graph G = (X, E, w), we\nhave\nEhk cut(P, E)kqi \uf8ff Cn\n\n2q n \u00b7h\u21e3Xu2X Xv:(u,v)2E\n\nq1\n2q log\n\nq+1\n\nwq\nuv\n\nd(u, v)\n\n \u23181/q\n+\u21e3Xu2X\u21e3 Xv:(u,v)2E\n\n+\n\nwuv\n\nd(u, v)\n\n \u2318q\u23181/qi,\n\n(1)\n\nfor some absolute constant C.\n\nWe defer the proof of the above theorem to Section A.\nWe now show how to use the above metric space partitioning scheme to obtain an approximation\nalgorithm for Correlation Clustering. Note that this proves Theorem 4.1.\n\n5\n\n\fProof of Theorem 4.1. Our algorithm \ufb01rst \ufb01nds the optimal solution x, y, z to the convex relaxation\n(P) presented in Section 3. Then, it de\ufb01nes a metric d(u, v) = xuv on the vertices of the graph. Finally,\nit runs the metric space partitioning algorithm with = 1 /2 from Section A (see Theorem 4.2) and\noutputs the obtained partitioning P.\nLet us analyze the performance of this algorithm. Denote the cost of the optimal solution x, y, z by\nLP . We know that the cost of the optimal solution OP T is lower bounded by LP (see Section 3\nfor details). By Theorem 4.2, applied to the graph G = (V, E+) (note: we ignore negative edges for\nnow),\n\nC\n\n\nq+1\n\nyq\n\nq1\n2q log\n\nq+1\n\nq1\n2q log\n\nn\n\nq +Xu2V\nu 1\n\n2q n \u00b7\u21e3Xu2V\n\nq\u2318 \uf8ff 4Cn\nzu 1\n\nEhk cut(P, E+)kqi \uf8ff\n2q n \u00b7 LP.\n(2)\nRecall that a positive edge is not in agreement if and only if it is cut. Hence, disagree(P, E+, ?) =\ncut(P, E+), and the bound above holds for Ek disagree(P, E+, ?)kq. 
By the triangle inequality, E‖disagree(P, E^+, E^−)‖_q ≤ E‖disagree(P, E^+, ∅)‖_q + E‖disagree(P, ∅, E^−)‖_q. Hence, to finish the proof, it remains to upper bound E‖disagree(P, ∅, E^−)‖_q. Observe that the diameter of every cluster returned by the algorithm is at most Δ = 1/2. For all disagreeing negative edges (u, v) ∈ E^−, we have x_uv ≤ 1/2 and 1 − x_uv ≥ 1/2. Thus, disagree_u(P, ∅, E^−) ≤ 2y_u for every u, and E‖disagree(P, ∅, E^−)‖_q ≤ 2‖y‖_q ≤ 2LP. This completes the proof.\n\n4.2 Correlation Clustering on complete graphs\n\nIn this section, we present our algorithm for Correlation Clustering on complete graphs and its analysis. Our algorithm achieves an approximation ratio of 5 and is an improvement over the approximation ratio of 7 by Charikar, Gupta, and Schwartz [2017].\n\n4.2.1 Summary of the algorithm\n\nOur algorithm is based on rounding an optimal solution to the convex relaxation (P). Recall that for complete graphs, we can get a simpler relaxation by removing the variables z in our convex programming formulation. We start with considering the entire vertex set of unclustered vertices. At each step t of the algorithm, we select a subset of vertices as a cluster C_t and remove it from the set of unclustered vertices. Thus, each vertex is assigned to a cluster exactly once and is never removed from a cluster once it is assigned.\n\nFor each vertex w ∈ V, let Ball(w, ρ) = {u ∈ V : x_uw ≤ ρ} be the set of vertices within a distance of ρ from w. For r = 1/5, the quantity r − x_uw, where u ∈ Ball(w, r), represents the distance from u to the boundary of the ball of radius 1/5 around w. Let V_t ⊆ V be the set of unclustered vertices at step t, and define\n\nL_t(w) = Σ_{u∈Ball(w,r)∩V_t} (r − x_uw).\n\nAt each step t, we select the vertex w_t that maximizes the quantity L_t(w) over all unclustered vertices w ∈ V_t and select the set Ball(w_t, 2r) as a cluster. We repeat this step until all the nodes have been clustered. 
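In code, this clustering loop can be sketched as follows (an illustration under assumptions: the pairwise distances x_uv are taken as given, as if read off an optimal solution to the relaxation, and ties among centers are broken arbitrarily):\n\n```python
# Sketch of the ball-growing loop above with r = 1/5. The distances x[u][v]
# are assumed to come from an optimal solution to the convex relaxation (P);
# here they are simply an input matrix.
R = 1 / 5

def cluster_complete_graph(x):
    n = len(x)
    unclustered = set(range(n))
    clusters = []
    while unclustered:
        # L_t(w) = sum of (r - x_uw) over unclustered u with x_uw <= r.
        def L(w):
            return sum(R - x[u][w] for u in unclustered if x[u][w] <= R)
        w = max(unclustered, key=L)  # center w_t maximizing L_t(w)
        ball = {u for u in unclustered if x[u][w] <= 2 * R}  # Ball(w_t, 2r)
        clusters.append(sorted(ball))
        unclustered -= ball
    return clusters

# All distances 0: everything lands in one cluster.
print(cluster_complete_graph([[0.0, 0.0], [0.0, 0.0]]))  # -> [[0, 1]]
```\n\nEach iteration removes the chosen ball from the unclustered set, so every vertex is clustered exactly once, matching the description above.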
A complete description of the algorithm can be found in Figure B.1 (supplementary material).\n\n4.2.2 Overview of the analysis\n\nOur main result for complete graphs is the following, which proves Theorem 1.2.\n\nTheorem 4.3. Algorithm 2 is a 5-approximation algorithm for Correlation Clustering on complete graphs.\n\nFor an edge (u, v) ∈ E, let LP(u, v) be the LP cost of the edge (u, v): LP(u, v) = x_uv if (u, v) ∈ E^+ and LP(u, v) = 1 − x_uv if (u, v) ∈ E^−. Let ALG(u, v) = 1((u, v) is in disagreement). Define\n\nprofit(u) = Σ_{(u,v)∈E} LP(u, v) − r Σ_{(u,v)∈E} ALG(u, v),\n\nwhere r = 1/5. We show that for each vertex u ∈ V, we have profit(u) ≥ 0 (see Lemma 4.4) and, therefore, the number of disagreeing edges incident to u is upper bounded by 5y(u):\n\nALG(u) = Σ_{v:(u,v)∈E} ALG(u, v) ≤ (1/r) Σ_{v:(u,v)∈E} LP(u, v) = 5y(u).\n\nThus, ‖ALG‖_q ≤ 5‖y‖_q for any q ≥ 1. Consequently, the approximation ratio of the algorithm is at most 5 for any norm ℓ_q.\n\nLemma 4.4. For every u ∈ V, we have profit(u) ≥ 0.\n\nAt each step t of the algorithm, we create a new cluster C_t and remove it from the graph. We also remove all edges with at least one endpoint in C_t. Denote this set of edges by\n\nE_t = {(u, v) : u ∈ C_t or v ∈ C_t}.\n\nNow let\n\nprofit_t(u, v) = LP(u, v) − r · ALG(u, v) if (u, v) ∈ E_t, and 0 otherwise;\n\nprofit_t(u) = Σ_{v∈V_t} profit_t(u, v) = Σ_{(u,v)∈E_t} LP(u, v) − r Σ_{(u,v)∈E_t} ALG(u, v).   (3)\n\nAs all sets E_t are disjoint, profit(u) = Σ_t profit_t(u). Thus, to prove Lemma 4.4, it is sufficient to show that profit_t(u) ≥ 0 for all t. Note that we only need to consider u ∈ V_t, as profit_t(u) = 0 for u ∉ V_t.\n\nConsider a step t of the algorithm and a vertex u ∈ V_t. Let w = w_t be the center of the cluster chosen at this step. 
First, we show that since the diameter of the cluster C_t is 4r, for all negative edges (u, v) ∈ E^− with u, v ∈ C_t, we can charge the cost of disagreement to the edge itself, that is, profit_t(u, v) is nonnegative for (u, v) ∈ E^− (see Lemma B.3). We then consider two cases: x_uw ∈ [0, r] ∪ [3r, 1] and x_uw ∈ (r, 3r].\n\nThe former case is fairly simple, since disagreeing positive edges (u, v) ∈ E^+ (with x_uw ∈ [0, r] ∪ [3r, 1]) have a “large” LP cost. In Lemma B.4 and Lemma B.5, we prove that the cost of disagreement can be charged to the edge itself and hence profit_t(u) ≥ 0.\n\nWe then consider the latter case. For vertices u with x_uw ∈ (r, 3r], profit_t(u, v) for some disagreeing positive edges (u, v) might be negative. Thus, we split the profit at step t for such vertices u into the profit they get from edges (u, v) with v in Ball(w, r) ∩ V_t and from edges with v in V_t \ Ball(w, r). That is,\n\nprofit_t(u) = Σ_{v∈Ball(w,r)∩V_t} profit_t(u, v) + Σ_{v∈V_t\Ball(w,r)} profit_t(u, v).\n\nDenote the first term by P_high(u) and the second term by P_low(u). We show that P_low(u) ≥ −L_t(u) (see Claim B.6 and Lemma B.7) and P_high(u) ≥ L_t(w) (see Claim B.8 and Lemma B.9) and conclude that profit_t(u) = P_high(u) + P_low(u) ≥ L_t(w) − L_t(u) ≥ 0, since L_t(w) = max_{w'∈V_t} L_t(w') ≥ L_t(u). This finishes the proof of Lemma 4.4.\n\n4.3 Correlation Clustering with AKS Min Max Objective\n\nIn this section, we present our improved algorithm for Correlation Clustering with the AKS Min Max Objective. Our algorithm produces a clustering of cost at most (2 + ε)OPT, which improves upon the bound of the O(√(log n) · max{log |E|, log(k)})-approximation algorithm studied by Ahmadi, Galhotra, Khuller, Saha, and Schwartz [2019a].\n\nFor a subset S ⊆ V of vertices, we use cost^+(S) to refer to the weight of positive edges “associated” with S that are in disagreement. 
These are the edges with exactly one endpoint in S. Thus, cost^+(S) = Σ_{(u,v)∈E^+, u∈S, v∉S} w_uv. Similarly, we use cost^−(S) to refer to the weight of dissimilar edges “associated” with S that are in disagreement. These are the edges with both endpoints in S. Thus, cost^−(S) = Σ_{(u,v)∈E^−, u,v∈S} w_uv. The total cost of the set S is cost(S) = cost^+(S) + cost^−(S).\n\nSimilar to the algorithm of Ahmadi et al. [2019b], our algorithm works in two phases. In the first phase, the algorithm covers all vertices of the graph with (possibly overlapping) sets S_1, ..., S_k such that the cost of each set S_i is at most 2OPT (i.e., cost(S_i) ≤ 2OPT for each i ∈ {1, ..., k}). In the second phase, the algorithm finds sets P_1, ..., P_k such that: (1) P_1, ..., P_k are disjoint and cover the vertex set; (2) P_i ⊆ S_i (and, consequently, cost^−(P_i) ≤ cost^−(S_i)); (3) cost^+(P_i) ≤ (1 + ε) cost^+(S_i). The sets P_1, ..., P_k are obtained from S_1, ..., S_k using an uncrossing procedure of Bansal et al. [2011]. Hence the clustering that is output is P = (P_1, ..., P_k). The improvement in the approximation factor comes from the first phase of the algorithm.\n\n4.3.1 Summary of the algorithm\n\nAt the core of our algorithm is a simple subproblem: for a given vertex z ∈ V, find a subset S ⊆ V containing z such that cost(S) is minimized. We solve this subproblem using a linear programming relaxation, which is formulated as follows. The LP has a variable x_u for each vertex u ∈ V. In the intended integral solution, we have x_u = 1 if u is in the set S, and x_u = 0 otherwise. That is, x_u is the indicator of the event “u ∈ S”. The LP has only one constraint: x_z = 1. A complete description of the LP can be found in Figure C.1. In Claim C.1, we show that this LP is indeed a valid relaxation for our subproblem. Moreover, we prove that this LP is half-integral; please see Section C.1 for details. 
We now present our algorithm, which gives a 2-approximation to the subproblem.\n\nRounding algorithm for the subproblem. We present a simple rounding algorithm. Let x* be an optimal half-integral LP solution to the problem. We obtain an integral solution x by rounding down x*, that is, x_u = ⌊x*_u⌋ for all u. Thus, μ_uv ≤ 2 · μ*_uv and η_uv ≤ η*_uv for all positive and negative edges respectively. Thus, the cost of the rounded solution x is at most 2OPT.\n\nRounding algorithm for AKS Min Max Correlation Clustering. To obtain a cover of all the vertices, we pick yet uncovered vertices z ∈ V one by one and, for each z, find a set S(z) as described above. Then, we remove those sets S(z) that are completely covered by other sets. The obtained family of sets S = {S(z)} satisfies the following properties: (1) sets in S cover the entire set V; (2) cost(S) ≤ 2OPT for each S ∈ S; (3) each set S ∈ S is not covered by the other sets in S (that is, for each S ∈ S, S ⊄ ∪_{S'∈(S\{S})} S'). However, the sets S in S are not necessarily disjoint.\n\nFollowing Ahmadi et al. [2019b], we then apply an uncrossing procedure developed by Bansal et al. [2011] to the sets S_i in S and obtain disjoint sets P_i such that (1) P_i ⊆ S_i and (2) cost^+(P_i) ≤ cost^+(S_i) + εOPT for each i (see Lemma C.3 in Section C.2). We have cost^+(P_i) ≤ cost^+(S_i) + εOPT and cost^−(P_i) ≤ cost^−(S_i), since P_i is a subset of S_i. Thus, cost(P_i) ≤ cost(S_i) + εOPT and, consequently, P_1, ..., P_k is a 2(1 + ε)-approximation for Correlation Clustering with the AKS Min Max objective. We note that by slightly modifying our algorithm we can obtain a 2-approximation.\n\nFinally, we show that AKS Min-Max Correlation Clustering is at least as hard as Vertex Cover (see Section C.3 for details). 
Vertex Cover is NP-hard to approximate within any constant factor better than 2 assuming the Unique Games Conjecture (UGC); see Khot and Regev [2008]. Thus, our algorithm gives the best possible approximation if the UGC holds.

References

Saba Ahmadi, Sainyam Galhotra, Samir Khuller, Barna Saha, and Roy Schwartz. Min-max correlation clustering via multicut, 2019a.

Saba Ahmadi, Samir Khuller, and Barna Saha. Min-max correlation clustering via multicut. In International Conference on Integer Programming and Combinatorial Optimization, pages 13–26. Springer, 2019b.

Nir Ailon, Moses Charikar, and Alantha Newman. Aggregating inconsistent information: ranking and clustering. Journal of the ACM (JACM), 55(5):23, 2008.

Nikhil Bansal, Avrim Blum, and Shuchi Chawla. Correlation clustering. Machine Learning, 56(1-3):89–113, 2004.

Nikhil Bansal, Uriel Feige, Robert Krauthgamer, Konstantin Makarychev, Viswanath Nagarajan, Joseph Naor, and Roy Schwartz. Min-max graph partitioning and small set expansion. In 2011 IEEE 52nd Annual Symposium on Foundations of Computer Science, pages 17–26. IEEE, 2011.

Yair Bartal. Probabilistic approximation of metric spaces and its algorithmic applications. In Proceedings of the 37th Annual Symposium on Foundations of Computer Science, pages 184–193. IEEE, 1996.

Francesco Bonchi, David Garcia-Soriano, and Edo Liberty. Correlation clustering: from theory to practice. In KDD, page 1972, 2014.

Moses Charikar, Venkatesan Guruswami, and Anthony Wirth. Clustering with qualitative information. In IEEE Symposium on Foundations of Computer Science (FOCS), 2003.

Moses Charikar, Neha Gupta, and Roy Schwartz. Local guarantees in graph cuts and clustering. In International Conference on Integer Programming and Combinatorial Optimization, pages 136–147. Springer, 2017.

Shuchi Chawla, Robert Krauthgamer, Ravi Kumar, Yuval Rabani, and D. Sivakumar. On the hardness of approximating multicut and sparsest-cut. Computational Complexity, 15(2):94–114, 2006.

Shuchi Chawla, Konstantin Makarychev, Tselil Schramm, and Grigory Yaroslavtsev. Near optimal LP rounding algorithm for correlation clustering on complete and complete k-partite graphs. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pages 219–228. ACM, 2015.

William Cohen and Jacob Richman. Learning to match and cluster entity names. In ACM SIGIR-2001 Workshop on Mathematical/Formal Methods in Information Retrieval, 2001.

William W. Cohen and Jacob Richman. Learning to match and cluster large high-dimensional data sets for data integration. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 475–480. ACM, 2002.

Erik D. Demaine, Dotan Emanuel, Amos Fiat, and Nicole Immorlica. Correlation clustering in general weighted graphs. Theoretical Computer Science, 361(2-3):172–187, 2006.

Jittat Fakcharoenphol and Kunal Talwar. Improved decompositions of graphs with forbidden minors. In 6th International Workshop on Approximation Algorithms for Combinatorial Optimization, pages 36–46, 2003.

Jittat Fakcharoenphol, Satish Rao, and Kunal Talwar. A tight bound on approximating arbitrary metrics by tree metrics. Journal of Computer and System Sciences, 69(3):485–497, 2004.

Naveen Garg, Vijay V. Vazirani, and Mihalis Yannakakis. Approximate max-flow min-(multi)cut theorems and their applications. SIAM Journal on Computing, 25(2):235–251, 1996.

Anupam Gupta, Robert Krauthgamer, and James R. Lee. Bounded geometries, fractals, and low-distortion embeddings. In 44th Annual IEEE Symposium on Foundations of Computer Science (FOCS), page 534. IEEE, 2003.

Subhash Khot and Oded Regev. Vertex cover might be hard to approximate to within 2 − ε. Journal of Computer and System Sciences, 74(3):335–349, 2008.

Gregory Puleo and Olgica Milenkovic. Correlation clustering and biclustering with locally bounded errors. In International Conference on Machine Learning, pages 869–877, 2016.

Anirudh Ramachandran, Nick Feamster, and Santosh Vempala. Filtering spam with behavioral blacklisting. In Proceedings of the 14th ACM Conference on Computer and Communications Security, pages 342–351. ACM, 2007.

Satish Rao. Small distortion and volume preserving embeddings for planar and Euclidean metrics. In Proceedings of the Fifteenth Annual Symposium on Computational Geometry, pages 300–306. ACM, 1999.

Anthony Wirth. Correlation Clustering, pages 280–284. Springer US, Boston, MA, 2017. ISBN 978-1-4899-7687-1. doi: 10.1007/978-1-4899-7687-1_176.