{"title": "Distributed Power-law Graph Computing: Theoretical and Empirical Analysis", "book": "Advances in Neural Information Processing Systems", "page_first": 1673, "page_last": 1681, "abstract": "With the emergence of big graphs in a variety of real applications like social networks, machine learning based on distributed graph-computing~(DGC) frameworks has attracted much attention from big data machine learning community. In DGC frameworks, the graph partitioning~(GP) strategy plays a key role to affect the performance, including the workload balance and communication cost. Typically, the degree distributions of natural graphs from real applications follow skewed power laws, which makes GP a challenging task. Recently, many methods have been proposed to solve the GP problem. However, the existing GP methods cannot achieve satisfactory performance for applications with power-law graphs. In this paper, we propose a novel vertex-cut method, called \\emph{degree-based hashing}~(DBH), for GP. DBH makes effective use of the skewed degree distributions for GP. We theoretically prove that DBH can achieve lower communication cost than existing methods and can simultaneously guarantee good workload balance. Furthermore, empirical results on several large power-law graphs also show that DBH can outperform the state of the art.", "full_text": "Distributed Power-law Graph Computing:\n\nTheoretical and Empirical Analysis\n\nCong Xie\n\nDept. of Comp. Sci. and Eng.\nShanghai Jiao Tong University\n\n800 Dongchuan Road\nShanghai 200240, China\n\nxcgoner1108@gmail.com\n\nWu-Jun Li\n\nNational Key Lab. for Novel Software Tech.\n\nDept. of Comp. Sci. and Tech.\n\nNanjing University\n\nNanjing 210023, China\n\nliwujun@nju.edu.cn\n\nLing Yan\n\nDept. of Comp. Sci. and Eng.\nShanghai Jiao Tong University\n\n800 Dongchuan Road\nShanghai 200240, China\n\nyling0718@sjtu.edu.cn\n\nZhihua Zhang\n\nDept. of Comp. Sci. 
and Eng.\nShanghai Jiao Tong University\n\n800 Dongchuan Road\nShanghai 200240, China\n\nzhang-zh@cs.sjtu.edu.cn\n\nAbstract\n\nWith the emergence of big graphs in a variety of real applications like social networks, machine learning based on distributed graph-computing (DGC) frameworks has attracted much attention from the big data machine learning community. In DGC frameworks, the graph partitioning (GP) strategy plays a key role in the performance, including the workload balance and communication cost. Typically, the degree distributions of natural graphs from real applications follow skewed power laws, which makes GP a challenging task. Recently, many methods have been proposed to solve the GP problem. However, the existing GP methods cannot achieve satisfactory performance for applications with power-law graphs. In this paper, we propose a novel vertex-cut method, called degree-based hashing (DBH), for GP. DBH makes effective use of the skewed degree distributions for GP. We theoretically prove that DBH can achieve lower communication cost than existing methods and can simultaneously guarantee good workload balance. Furthermore, empirical results on several large power-law graphs also show that DBH can outperform the state of the art.\n\n1 Introduction\n\nRecent years have witnessed the emergence of big graphs in a large variety of real applications, such as the web and social network services. Furthermore, many machine learning and data mining algorithms can also be modeled with graphs [13]. Hence, machine learning based on distributed graph-computing (DGC) frameworks has attracted much attention from the big data machine learning community [13, 15, 14, 6, 11, 7]. 
To perform distributed (parallel) graph-computing on clusters with\nseveral machines (servers), one has to partition the whole graph across the machines in a cluster.\nGraph partitioning (GP) can dramatically affect the performance of DGC frameworks in terms of\nworkload balance and communication cost. Hence, the GP strategy typically plays a key role in\nDGC frameworks. The ideal GP method should minimize the cross-machine communication cost,\nand simultaneously keep the workload in every machine approximately balanced.\n\n1\n\n\fExisting GP methods can be divided into two main categories: edge-cut and vertex-cut methods.\nEdge-cut tries to evenly assign the vertices to machines by cutting the edges. In contrast, vertex-cut\ntries to evenly assign the edges to machines by cutting the vertices. Figure 1 illustrates the edge-\ncut and vertex-cut partitioning results of an example graph. In Figure 1 (a), the edges (A,C) and\n(A,E) are cut, and the two machines store the vertex sets {A,B,D} and {C,E}, respectively. In\nFigure 1 (b), the vertex A is cut, and the two machines store the edge sets {(A,B), (A,D), (B,D)}\nand {(A,C), (A,E), (C,E)}, respectively. In edge-cut, both machines of a cut edge should maintain\na ghost (local replica) of the vertex and the edge data. In vertex-cut, all the machines associated\nwith a cut vertex should maintain a mirror (local replica) of the vertex. The ghosts and mirrors are\nshown in shaded vertices in Figure 1. In edge-cut, the workload of a machine is determined by\nthe number of vertices located in that machine, and the communication cost of the whole graph is\ndetermined by the number of edges spanning different machines. In vertex-cut, the workload of a\nmachine is determined by the number of edges located in that machine, and the communication cost\nof the whole graph is determined by the number of mirrors of the vertices.\n\n(a) Edge-Cut\n\n(b) Vertex-Cut\n\nFigure 1: Two strategies for graph partitioning. 
Shaded vertices are ghosts and mirrors, respectively.\n\nMost traditional DGC frameworks, such as GraphLab [13] and Pregel [15], use edge-cut methods [9, 18, 19, 20] for GP. Very recently, the authors of PowerGraph [6] found that vertex-cut methods can achieve better performance than edge-cut methods, especially for power-law graphs. Hence, vertex-cut has attracted more and more attention from the DGC research community. For example, PowerGraph [6] adopts a random vertex-cut method and two greedy variants for GP. GraphBuilder [8] provides some heuristics, such as the grid-based constrained solution, to improve the random vertex-cut method.\nLarge natural graphs usually follow skewed degree distributions like power-law distributions, which makes GP challenging. Different vertex-cut methods can result in different performance for power-law graphs. For example, Figure 2 (a) shows a toy power-law graph with only one vertex having much higher degree than the others. Figure 2 (b) shows a partitioning strategy by cutting the vertices {E, F, A, C, D}, and Figure 2 (c) shows a partitioning strategy by cutting the vertices {A, E}. We can find that the partitioning strategy in Figure 2 (c) is better than that in Figure 2 (b) because the number of mirrors in Figure 2 (c) is smaller, which means less communication cost. The intuition underlying this example is that cutting higher-degree vertices results in fewer mirror vertices. Hence, the power-law degree distribution can be used to facilitate GP. Unfortunately, existing vertex-cut methods, including those in PowerGraph and GraphBuilder, rarely make use of the power-law degree distribution for GP. Hence, they cannot achieve satisfactory performance on natural power-law graphs. PowerLyra [4] tries to combine both edge-cut and vertex-cut together by using the power-law degree distribution. 
However, it lacks a theoretical guarantee.\n\n(a) Sample\n\n(b) Bad partitioning\n\n(c) Good partitioning\n\nFigure 2: Partition a sample graph with vertex-cut.\n\n2\n\n\fIn this paper, we propose a novel vertex-cut GP method, called degree-based hashing (DBH), for distributed power-law graph computing. The main contributions of DBH are briefly outlined as follows:\n\n• DBH can effectively exploit the power-law degree distributions in natural graphs for vertex-cut GP.\n• Theoretical bounds on the communication cost and workload balance for DBH can be derived, which show that DBH can achieve lower communication cost than existing methods and can simultaneously guarantee good workload balance.\n• DBH can be implemented as an execution engine for PowerGraph [6], and hence all PowerGraph applications can be seamlessly supported by DBH.\n• Empirical results on several large real graphs and synthetic graphs show that DBH can outperform the state-of-the-art methods.\n\n2 Problem Formulation\nLet G = (V, E) denote a graph, where V = {v1, v2, . . . , vn} is the set of vertices and E ⊆ V × V is the set of edges in G. Let |V| denote the cardinality of the set V, and hence |V| = n. vi and vj are called neighbors if (vi, vj) ∈ E. The degree of vi is denoted as di, which measures the number of neighbors of vi. Please note that we only need to consider the GP task for undirected graphs, because the workload mainly depends on the number of edges no matter whether the computation is based on directed or undirected graphs. Even if the computation is based on directed graphs, we can also use the undirected counterparts of the directed graphs to get the partitioning results.\nAssume we have a cluster of p machines. Vertex-cut GP is to assign each edge, with the two corresponding vertices, to one of the p machines in the cluster. The assignment of an edge is unique, while vertices may have replicas across different machines. 
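Given such an assignment, the replicas of a vertex are exactly the machines that hold at least one of its adjacent edges. A minimal bookkeeping sketch (the helper names are hypothetical, not code from the paper):

```python
from collections import defaultdict

def replicas(edge_assignment):
    # edge_assignment maps each edge (u, v) to a machine id. A vertex is
    # replicated on every machine that stores at least one of its edges.
    span = defaultdict(set)   # span[v] is the set of machines holding v
    for (u, v), machine in edge_assignment.items():
        span[u].add(machine)
        span[v].add(machine)
    return span

def replication_factor(edge_assignment):
    # Average number of replicas per vertex.
    span = replicas(edge_assignment)
    return sum(len(s) for s in span.values()) / len(span)

# Toy example: vertex 0 touches machines 0 and 1, so it has two replicas.
toy = {(0, 1): 0, (0, 2): 1}
assert replicas(toy)[0] == {0, 1}
```

Minimizing the average span size while keeping the number of edges per machine balanced is exactly the vertex-cut objective.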
For DGC frameworks based on vertex-cut GP, the workload (amount of computation) of a machine is roughly linear in the number of edges located in that machine, and the replicas of the vertices incur communication for synchronization. So the goal of vertex-cut GP is to minimize the number of replicas and simultaneously balance the number of edges on each machine.\nLet M(e) ∈ {1, . . . , p} be the machine edge e ∈ E is assigned to, and A(v) ⊆ {1, . . . , p} be the span of vertex v over different machines. Hence, |A(v)| is the number of replicas of v among different machines. Similar to PowerGraph [6], one of the replicas of a vertex is chosen as the master and the others are treated as the mirrors of the master. We let Master(v) denote the machine in which the master of v is located. Hence, the goal of vertex-cut GP can be formulated as follows:\n\nmin_A (1/n) Σ_{i=1}^n |A(v_i)|\ns.t. max_m |{e ∈ E | M(e) = m}| < λ |E|/p, and max_m |{v ∈ V | Master(v) = m}| < ρ n/p,\n\nwhere m ∈ {1, . . . , p} denotes a machine, and λ ≥ 1 and ρ ≥ 1 are imbalance factors. We define (1/n) Σ_{i=1}^n |A(v_i)| as the replication factor, (p/|E|) max_m |{e ∈ E | M(e) = m}| as the edge-imbalance, and (p/n) max_m |{v ∈ V | Master(v) = m}| as the vertex-imbalance. To get a good balance of workload, λ and ρ should be as small as possible.\nThe degrees of natural graphs usually follow skewed power-law distributions [3, 1]:\n\nPr(d) ∝ d^(-α),\n\nwhere Pr(d) is the probability that a vertex has degree d and the power parameter α is a positive constant. The lower the α is, the more skewed a graph will be. This power-law degree distribution makes GP challenging [6]. 
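The effect of α can be illustrated by sampling degrees from a truncated version of this distribution. A sketch using inverse-CDF sampling over a finite support {d_min, ..., d_max}; the truncation and the parameter values are assumptions for illustration, not choices from the paper:

```python
import bisect
import itertools
import random

def power_law_sampler(alpha, d_min=1, d_max=10**4):
    # Pr(d) ∝ d^(-alpha) on {d_min, ..., d_max}. Precompute the cumulative
    # (unnormalized) weights once, then sample by inverse CDF with bisect.
    weights = [d ** (-alpha) for d in range(d_min, d_max + 1)]
    cum = list(itertools.accumulate(weights))
    def sample(rng):
        u = rng.random() * cum[-1]
        return d_min + bisect.bisect_left(cum, u)
    return sample

rng = random.Random(0)
sample = power_law_sampler(alpha=2.1)
degrees = [sample(rng) for _ in range(10000)]
# Heavy tail: most vertices have tiny degree, a few have very large degree.
print(sorted(degrees)[5000], max(degrees))
```

Lowering α shifts more probability mass into the tail, which is the skewness the paper exploits.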
Although vertex-cut methods can achieve better performance than edge-cut methods for power-law graphs [6], existing vertex-cut methods, such as the random method in PowerGraph and the grid-based method in GraphBuilder [8], cannot make effective use of the power-law distribution to achieve satisfactory performance.\n\n3\n\n\f3 Degree-Based Hashing for GP\nIn this section, we propose a novel vertex-cut method, called degree-based hashing (DBH), to effectively exploit the power-law distribution for GP.\n\n3.1 Hashing Model\n\nWe refer to a certain machine by its index idx, and the idxth machine is denoted as Pidx. We first define two kinds of hash functions: the vertex-hash function idx = vertex_hash(v), which hashes vertex v to the machine Pidx, and the edge-hash function idx = edge_hash(e) or idx = edge_hash(vi, vj), which hashes edge e = (vi, vj) to the machine Pidx.\nOur hashing model includes two main components:\n\n• Master-vertex assignment: The master replica of vi is uniquely assigned to one of the p machines, with equal probability for each machine, by some randomized hash function vertex_hash(vi).\n• Edge assignment: Each edge e = (vi, vj) is assigned to one of the p machines by some hash function edge_hash(vi, vj).\n\nIt is easy to find that the above hashing model is a vertex-cut GP method. The master-vertex assignment can be easily implemented, and can also be expected to achieve a low vertex-imbalance score. On the contrary, the edge assignment is much more complicated. Different edge-hash functions can achieve different replication factors and different edge-imbalance scores. 
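This two-component model can be sketched directly. In the sketch below, a salted SHA-1 stands in for the randomized hash function (an implementation assumption), and the plugged-in edge-hash is the simple random one that hashes the edge as a whole:

```python
import hashlib

def vertex_hash(v, p, salt=b"seed"):
    # Randomized vertex-hash: maps vertex id v to a machine in {0, ..., p-1}.
    # Salted SHA-1 is just one deterministic stand-in for a random hash.
    h = hashlib.sha1(salt + str(v).encode()).digest()
    return int.from_bytes(h[:8], "big") % p

def partition(edges, p, edge_hash):
    # Master-vertex assignment uses vertex_hash itself; edge assignment is
    # delegated to the supplied edge-hash function.
    masters, assignment = {}, {}
    for (u, v) in edges:
        for w in (u, v):
            masters.setdefault(w, vertex_hash(w, p))
        assignment[(u, v)] = edge_hash(u, v, p)
    return masters, assignment

# Simple random vertex-cut: hash the edge as a whole.
random_edge_hash = lambda u, v, p: vertex_hash((u, v), p)

edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
masters, assignment = partition(edges, p=4, edge_hash=random_edge_hash)
```

Swapping in a different `edge_hash` changes the replication factor and edge-imbalance, which is exactly the degree of freedom the method exploits.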
Please note that the replication factor reflects the communication cost, and the edge-imbalance reflects the workload imbalance. Hence, the key of our hashing model lies in the edge-hash function edge_hash(vi, vj).\n\n3.2 Degree-Based Hashing\n\nFrom the example in Figure 2, we observe that in power-law graphs the replication factor, which is defined as the total number of replicas divided by the total number of vertices, will be smaller if we cut vertices with relatively higher degrees. Based on this intuition, we define edge_hash(vi, vj) as follows:\n\nedge_hash(vi, vj) = vertex_hash(vi) if di < dj, and vertex_hash(vj) otherwise.  (1)\n\nIt means that we use the vertex-hash function to define the edge-hash function. Furthermore, the edge-hash function value of an edge is determined by the degrees of the two associated vertices. More specifically, the edge-hash function value of an edge is defined by the vertex-hash function value of the associated vertex with the smaller degree. Hence, our method is called degree-based hashing (DBH). DBH can effectively capture the intuition that cutting vertices with higher degrees will get better performance.\nOur DBH method for vertex-cut GP is briefly summarized in Algorithm 1, where [n] = {1, . . . 
, n}.\n\nAlgorithm 1 Degree-based hashing (DBH) for vertex-cut GP\nInput: The set of edges E; the set of vertices V; the number of machines p.\nOutput: The assignment M(e) ∈ [p] for each edge e.\n1: Initialization: count the degree di for each i ∈ [n] in parallel\n2: for all e = (vi, vj) ∈ E do\n3:   Hash each edge in parallel:\n4:   if di < dj then\n5:     M(e) ← vertex_hash(vi)\n6:   else\n7:     M(e) ← vertex_hash(vj)\n8:   end if\n9: end for\n\n4\n\n\f4 Theoretical Analysis\n\nIn this section, we present theoretical analysis for our DBH method. For comparison, the random vertex-cut method (called Random) of PowerGraph [6] and the grid-based constrained solution (called Grid) of GraphBuilder [8] are adopted as baselines. Our analysis is based on randomization. Moreover, we assume that the graph is undirected and there are no duplicated edges in the graph. We mainly study the performance in terms of the replication factor, edge-imbalance and vertex-imbalance defined in Section 2. Due to space limitation, we put the proofs of all theoretical results into the supplementary material.\n\n4.1 Partitioning Degree-fixed Graphs\nFirstly, we assume that the degree sequence {di}_{i=1}^n is fixed. Then we can get the expected replication factors produced by different methods.\nRandom assigns each edge evenly to the p machines via a randomized hash function. The result can be directly got from PowerGraph [6].\nLemma 1. Assume that we have a sequence of n vertices {vi}_{i=1}^n and the corresponding degree sequence D = {di}_{i=1}^n. A simple randomized vertex-cut on p machines has the expected replication factor:\n\nE[(1/n) Σ_{i=1}^n |A(vi)| | D] = (p/n) Σ_{i=1}^n [1 - (1 - 1/p)^{di}].\n\nBy using the Grid hash function, each vertex has √p rather than p candidate machines compared to Random. Thus we simply replace p with √p to get the following corollary.\nCorollary 1. By using Grid for hashing, the expected replication factor on p machines is:\n\nE[(1/n) Σ_{i=1}^n |A(vi)| | D] = (√p/n) Σ_{i=1}^n [1 - (1 - 1/√p)^{di}].\n\nUsing the DBH method in Section 3.2, we obtain the following result by fixing the sequence {hi}_{i=1}^n, where hi is defined as the number of vi's adjacent edges which are hashed by the neighbors of vi according to the edge-hash function defined in (1).\nTheorem 1. Assume that we have a sequence of n vertices {vi}_{i=1}^n and the corresponding degree sequence D = {di}_{i=1}^n. For each vi, di - hi adjacent edges of it are hashed by vi itself. Define H = {hi}_{i=1}^n. Our DBH method on p machines has the expected replication factor:\n\nE[(1/n) Σ_{i=1}^n |A(vi)| | H, D] = (p/n) Σ_{i=1}^n [1 - (1 - 1/p)^{hi+1}] ≤ (p/n) Σ_{i=1}^n [1 - (1 - 1/p)^{di}],\n\nwhere hi ≤ di - 1 for any vi.\nThis theorem says that our DBH method has a smaller expected replication factor than Random of PowerGraph [6].\nNext we turn to the analysis of the balance constraints. We still fix the degree sequence and have the following result for our DBH method.\nTheorem 2. Our DBH method on p machines with the sequences {vi}_{i=1}^n, {di}_{i=1}^n and {hi}_{i=1}^n defined in Theorem 1 has the edge-imbalance:\n\nmax_m |{e ∈ E | M(e) = m}| / (|E|/p) = [Σ_{i=1}^n hi/p + max_{j∈[p]} Σ_{vi∈Pj} (di - hi)] / (2|E|/p).\n\nAlthough the master vertices are evenly assigned to each machine, we want to show how close the randomized assignment is to the perfect balance. This problem is well studied in the model of uniformly throwing n balls into p bins when n ≫ p(ln p)^3 [17].\n\n5\n\n\fLemma 2. The maximum number of master vertices for each machine is bounded as follows:\n\nPr[MaxLoad > k_a] = o(1) if a > 1,\nPr[MaxLoad > k_a] = 1 - o(1) if 0 < a < 1.\n\nHere MaxLoad = max_m |{v ∈ V | Master(v) = m}|, and k_a = n/p + √(2n ln p / p) (1 - (ln ln p)/(2a ln p)).\n\n4.2 Partitioning Power-law Graphs\nNow we change the sequence of fixed degrees into a sequence of random samples generated from the power-law distribution. As a result, upper bounds can be provided for the above three methods, which are Random, Grid and DBH.\nTheorem 3. Let the minimal degree be dmin and each d ∈ {di}_{i=1}^n be sampled from a power-law degree distribution with parameter α ∈ (2, 3). The expected replication factor of Random on p machines can be approximately bounded by:\n\nE_D[(p/n) Σ_{i=1}^n (1 - (1 - 1/p)^{di})] ≤ p [1 - (1 - 1/p)^{Ω̂}],\n\nwhere Ω̂ = dmin × (α-1)/(α-2).\nThis theorem says that when the degree sequence is under a power-law distribution, the upper bound of the expected replication factor increases as α decreases. This implies that Random yields a worse partitioning when the power-law graph is more skewed.\nLike Corollary 1, we replace p with √p to get the similar result for Grid.\nCorollary 2. By using the Grid method, the expected replication factor on p machines can be approximately bounded by:\n\nE_D[(√p/n) Σ_{i=1}^n (1 - (1 - 1/√p)^{di})] ≤ √p [1 - (1 - 1/√p)^{Ω̂}],\n\nwhere Ω̂ = dmin × (α-1)/(α-2).\nNote that √p [1 - (1 - 1/√p)^{Ω̂}] ≤ p [1 - (1 - 1/p)^{Ω̂}]. So Corollary 2 tells us that Grid can reduce the replication factor, but it is not motivated by the skewness of the degree distribution.\nTheorem 4. Assume each edge is hashed by our DBH method and hi ≤ di - 1 for any vi. The expected replication factor of DBH on p machines can be approximately bounded by:\n\nE_{H,D}[(p/n) Σ_{i=1}^n (1 - (1 - 1/p)^{hi+1})] ≤ p [1 - (1 - 1/p)^{Ω̂'}],\n\nwhere Ω̂' = dmin × (α-1)/(α-2) - dmin × (α-1)/(2α-3) + 1/2.\nNote that p [1 - (1 - 1/p)^{Ω̂'}] < p [1 - (1 - 1/p)^{Ω̂}]. Therefore, our DBH method can expectedly reduce the replication factor. The term (α-1)/(2α-3) increases as α decreases, which means our DBH reduces the replication factor more when the power-law graph is more skewed. 
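To make Theorems 3-4 and Corollary 2 concrete, the three upper bounds can be checked numerically. A minimal sketch; the values of dmin, α and p below are illustrative assumptions, not measurements from the paper:

```python
# Replication-factor upper bounds from Theorems 3-4 and Corollary 2.

def bound_random(p, omega):
    # p * [1 - (1 - 1/p)^omega]
    return p * (1.0 - (1.0 - 1.0 / p) ** omega)

def bound_grid(p, omega):
    # sqrt(p) * [1 - (1 - 1/sqrt(p))^omega]
    sp = p ** 0.5
    return sp * (1.0 - (1.0 - 1.0 / sp) ** omega)

d_min, alpha, p = 1, 2.2, 48  # assumed example values
omega = d_min * (alpha - 1) / (alpha - 2)                        # Omega-hat
omega_dbh = omega - d_min * (alpha - 1) / (2 * alpha - 3) + 0.5  # Omega-hat'

r_random = bound_random(p, omega)
r_grid = bound_grid(p, omega)
r_dbh = bound_random(p, omega_dbh)  # DBH bound: same form, smaller exponent

# DBH's exponent is strictly smaller, so its bound is below Random's.
assert omega_dbh < omega and r_dbh < r_random
print(r_random, r_grid, r_dbh)
```

Lowering `alpha` toward 2 widens the gap between `r_random` and `r_dbh`, matching the claim that DBH helps more on more skewed graphs.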
Note that Grid and our DBH method actually use two different ways to reduce the replication factor. Grid reduces the replication factor more when p grows. These two approaches can be combined to obtain further improvement, which is not the focus of this paper.\nFinally, we show that our DBH method also guarantees good edge-balance (workload balance) under power-law distributions.\n\n6\n\n\fTheorem 5. Assume each edge is hashed by the DBH method with dmin, {vi}_{i=1}^n, {di}_{i=1}^n and {hi}_{i=1}^n defined above. The vertices are evenly assigned. By taking the constant 2|E|/p = E_D[Σ_{i=1}^n di]/p = n E_D[d]/p, there exists ε ∈ (0, 1) such that the expected edge-imbalance of DBH on p machines can be bounded w.h.p. (with high probability). That is,\n\nE_{H,D}[Σ_{i=1}^n hi/p + max_{j∈[p]} Σ_{vi∈Pj} (di - hi)] ≤ (1 + ε) · 2|E|/p.\n\nNote that any ε that satisfies 1/ε ≪ n/p could work for this theorem, which results in a tighter bound for large n. Therefore, together with Theorem 4, this theorem shows that our DBH method can reduce the replication factor and simultaneously guarantee good workload balance.\n\n5 Empirical Evaluation\n\nIn this section, empirical evaluation on real and synthetic graphs is used to verify the effectiveness of our DBH method. The cluster for the experiments contains 64 machines connected via 1 GB Ethernet. Each machine has 24 Intel Xeon cores and 96 GB of RAM.\n\n5.1 Datasets\n\nThe graph datasets used in our experiments include both synthetic and real-world power-law graphs. Each synthetic power-law graph is generated by a combination of two synthetic directed graphs. The in-degrees and out-degrees of the two directed graphs are sampled from power-law degree distributions with different power parameters α and β, respectively. 
Such a collection of synthetic graphs is separated into two subsets: one subset with parameter α ≥ β, which is shown in Table 1(a), and the other subset with parameter α < β, which is shown in Table 1(b). The real-world graphs are shown in Table 1(c). Some of the real-world graphs are the same as those in the experiments of PowerGraph, and some additional real-world graphs are from the UF Sparse Matrix Collection [5].\n\nTable 1: Datasets\n\n(a) Synthetic graphs: α ≥ β\nAlias | α | β | |E|\nS1 | 2.2 | 2.2 | 71,334,974\nS2 | 2.2 | 2.1 | 88,305,754\nS3 | 2.2 | 2.0 | 134,881,233\nS4 | 2.2 | 1.9 | 273,569,812\nS5 | 2.1 | 2.1 | 103,838,645\nS6 | 2.1 | 2.0 | 164,602,848\nS7 | 2.1 | 1.9 | 280,516,909\nS8 | 2.0 | 2.0 | 208,555,632\nS9 | 2.0 | 1.9 | 310,763,862\n\n(b) Synthetic graphs: α < β\nAlias | α | β | |E|\nS10 | 2.1 | 2.2 | 88,617,300\nS11 | 2.0 | 2.2 | 135,998,503\nS12 | 2.0 | 2.1 | 145,307,486\nS13 | 1.9 | 2.2 | 280,090,594\nS14 | 1.9 | 2.1 | 289,002,621\nS15 | 1.9 | 2.0 | 327,718,498\n\n(c) Real-world graphs\nAlias | Graph | |V| | |E|\nTw | Twitter [10] | 42M | 1.47B\nArab | Arabic-2005 [5] | 22M | 0.6B\nWiki | Wiki [2] | 5.7M | 130M\nLJ | LiveJournal [16] | 5.4M | 79M\nWG | WebGoogle [12] | 0.9M | 5.1M\n\n5.2 Baselines and Evaluation Metric\n\nIn our experiments, we adopt the Random method of PowerGraph [6] and the Grid method of GraphBuilder [8]1 as baselines for empirical comparison. The Hybrid method of PowerLyra [4] is not adopted for comparison because it combines both edge-cut and vertex-cut, and hence is not a pure vertex-cut method.\nOne important metric is the replication factor, which reflects the communication cost. To test the speedup for real applications, we use the total execution time for PageRank, which is forced to take 100 iterations. The speedup is defined as: speedup = 100% × (γ_Alg − γ_DBH)/γ_Alg, where γ_Alg is the execution time of PageRank with the method Alg. Here, Alg can be Random or Grid. 
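The speedup metric is a one-line computation; a trivial sketch with made-up timings (not the paper's measurements):

```python
def speedup(gamma_alg, gamma_dbh):
    # speedup = 100% * (gamma_Alg - gamma_DBH) / gamma_Alg
    return 100.0 * (gamma_alg - gamma_dbh) / gamma_alg

# Hypothetical timings (seconds) for one PageRank run of 100 iterations:
assert speedup(200.0, 100.0) == 50.0  # DBH twice as fast -> 50% speedup
```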
Because all the methods can achieve good workload balance in our experiments, we do not report it here.\n\n1GraphLab 2.2, released in July 2013, has used PowerGraph as its engine, and the Grid GP method has been adopted by GraphLab 2.2 to replace the original Random GP method. Detailed information can be found at: http://graphlab.org/projects/index.html\n\n7\n\n\f5.3 Results\n\nFigure 3 shows the replication factor on the two subsets of synthetic graphs. We can find that our DBH method achieves a much lower replication factor than Random and Grid. The replication factor of DBH is reduced by up to 80% compared to Random and 60% compared to Grid.\n\n(a) Replication Factor\n\n(b) Replication Factor\n\nFigure 3: Experiments on two subsets of synthetic graphs. The X-axis denotes different datasets in Table 1(a) and Table 1(b). The number of machines is 48.\n\nFigure 4 (a) shows the replication factor on the real-world graphs. We can also find that DBH achieves the best performance. Figure 4 (b) shows that the relative speedup of DBH is up to 60% over Random and 25% over Grid on the PageRank computation.\n\n(a) Replication Factor\n\n(b) Execution Speedup\n\nFigure 4: Experiments on real-world graphs. The number of machines is 48.\n\nFigure 5 shows the replication factor and execution time for PageRank on the Twitter graph when the number of machines ranges from 8 to 64. We can find that our DBH achieves the best performance in all cases.\n\n(a) Replication Factor\n\n(b) Execution Time\n\nFigure 5: Experiments on the Twitter graph. The number of machines ranges from 8 to 64.\n\n6 Conclusion\nIn this paper, we have proposed a new vertex-cut graph partitioning method called degree-based hashing (DBH) for distributed graph-computing frameworks. DBH can effectively exploit the power-law degree distributions in natural graphs to achieve promising performance. Both theoretical and empirical results show that DBH can outperform the state-of-the-art methods. 
In our future work, we will apply DBH to more big data machine learning tasks.\n\n7 Acknowledgements\n\nThis work is supported by the NSFC (No. 61100125, No. 61472182), the 863 Program of China (No. 2012AA011003), and the Fundamental Research Funds for the Central Universities.\n\n8\n\n\fReferences\n[1] Lada A. Adamic and Bernardo A. Huberman. Zipf's law and the internet. Glottometrics, 3(1):143–150, 2002.\n[2] Paolo Boldi and Sebastiano Vigna. The WebGraph framework I: compression techniques. In Proceedings of the 13th International Conference on World Wide Web (WWW), 2004.\n[3] Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopalan, Raymie Stata, Andrew Tomkins, and Janet Wiener. Graph structure in the web. Computer Networks, 33(1):309–320, 2000.\n[4] Rong Chen, Jiaxin Shi, Yanzhe Chen, Haibing Guan, and Haibo Chen. PowerLyra: Differentiated graph computation and partitioning on skewed graphs. Technical Report IPADSTR-2013-001, Institute of Parallel and Distributed Systems, Shanghai Jiao Tong University, 2013.\n[5] Timothy A. Davis and Yifan Hu. The University of Florida sparse matrix collection. ACM Transactions on Mathematical Software, 38(1):1, 2011.\n[6] Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin. PowerGraph: Distributed graph-parallel computation on natural graphs. 
In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2012.\n[7] Joseph E. Gonzalez, Reynold S. Xin, Ankur Dave, Daniel Crankshaw, Michael J. Franklin, and Ion Stoica. GraphX: Graph processing in a distributed dataflow framework. In Proceedings of the 11th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2014.\n[8] Nilesh Jain, Guangdeng Liao, and Theodore L. Willke. GraphBuilder: scalable graph ETL framework. In Proceedings of the First International Workshop on Graph Data Management Experiences and Systems, 2013.\n[9] George Karypis and Vipin Kumar. Multilevel graph partitioning schemes. In Proceedings of the International Conference on Parallel Processing (ICPP), 1995.\n[10] Haewoon Kwak, Changhyun Lee, Hosung Park, and Sue Moon. What is Twitter, a social network or a news media? In Proceedings of the 19th International Conference on World Wide Web (WWW), 2010.\n[11] Aapo Kyrola, Guy E. Blelloch, and Carlos Guestrin. GraphChi: Large-scale graph computation on just a PC. In Proceedings of the 10th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2012.\n[12] Jure Leskovec. Stanford large network dataset collection. URL http://snap.stanford.edu/data/index.html, 2011.\n[13] Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, and Joseph M. Hellerstein. GraphLab: A new framework for parallel machine learning. In Proceedings of the Conference on Uncertainty in Artificial Intelligence (UAI), 2010.\n[14] Yucheng Low, Joseph Gonzalez, Aapo Kyrola, Danny Bickson, Carlos Guestrin, and Joseph M. Hellerstein. Distributed GraphLab: A framework for machine learning in the cloud. In Proceedings of the International Conference on Very Large Data Bases (VLDB), 2012.\n[15] Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, and Grzegorz Czajkowski. Pregel: a system for large-scale graph processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD), 2010.\n[16] Alan Mislove, Massimiliano Marcon, Krishna P. Gummadi, Peter Druschel, and Bobby Bhattacharjee. Measurement and analysis of online social networks. In Proceedings of the 7th ACM SIGCOMM Conference on Internet Measurement, 2007.\n[17] Martin Raab and Angelika Steger. "Balls into bins": a simple and tight analysis. In Randomization and Approximation Techniques in Computer Science, pages 159–170. Springer, 1998.\n[18] Isabelle Stanton and Gabriel Kliot. Streaming graph partitioning for large distributed graphs. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2012.\n[19] Charalampos Tsourakakis, Christos Gkantsidis, Bozidar Radunovic, and Milan Vojnovic. FENNEL: Streaming graph partitioning for massive scale graphs. In Proceedings of the 7th ACM International Conference on Web Search and Data Mining (WSDM), 2014.\n[20] Lu Wang, Yanghua Xiao, Bin Shao, and Haixun Wang. How to partition a billion-node graph. In Proceedings of the International Conference on Data Engineering (ICDE), 2014.\n\n9\n\n\f", "award": [], "sourceid": 876, "authors": [{"given_name": "Cong", "family_name": "Xie", "institution": "Shanghai Jiao Tong University"}, {"given_name": "Ling", "family_name": "Yan", "institution": "Shanghai Jiao Tong University"}, {"given_name": "Wu-Jun", "family_name": "Li", "institution": "Nanjing University"}, {"given_name": "Zhihua", "family_name": "Zhang", "institution": "Shanghai Jiao Tong University"}]}