{"title": "Graph Matching via Multiplicative Update Algorithm", "book": "Advances in Neural Information Processing Systems", "page_first": 3187, "page_last": 3195, "abstract": "As a fundamental problem in computer vision, graph matching problem can usually be formulated as a Quadratic Programming (QP) problem with doubly stochastic and discrete (integer) constraints. Since it is NP-hard, approximate algorithms are required. In this paper, we present a new algorithm, called Multiplicative Update Graph Matching (MPGM), that develops a multiplicative update technique to solve the QP matching problem. MPGM has three main benefits: (1) theoretically, MPGM solves the general QP problem with doubly stochastic constraint naturally whose convergence and KKT optimality are guaranteed. (2) Em- pirically, MPGM generally returns a sparse solution and thus can also incorporate the discrete constraint approximately. (3) It is efficient and simple to implement. Experimental results show the benefits of MPGM algorithm.", "full_text": "Graph Matching via Multiplicative Update Algorithm\n\nBo Jiang\n\nJin Tang\n\nSchool of Computer Science\n\nSchool of Computer Science\n\nChris Ding\n\nCSE Department,\n\nUniversity of Texas at\n\nArlington, Arlington, USA\n\nand Technology\n\nAnhui University, China\njiangbo@ahu.edu.cn\n\nand Technology\n\nAnhui University, China\n\ntj@ahu.edu.cn\n\nchqding@uta.edu\n\nYihong Gong\n\nSchool of Electronic\n\nand Information Engineering\n\nXi\u2019an Jiaotong University, China\nygong@mail.xjtu.edu.cn\n\nBin Luo\n\nSchool of Computer Science\n\nand Technology,\n\nAnhui University, China\nluobin@ahu.edu.cn\n\nAbstract\n\nAs a fundamental problem in computer vision, graph matching problem can\nusually be formulated as a Quadratic Programming (QP) problem with doubly\nstochastic and discrete (integer) constraints. Since it is NP-hard, approximate\nalgorithms are required. 
In this paper, we present a new algorithm, called Multiplicative Update Graph Matching (MPGM), that develops a multiplicative update technique to solve the QP matching problem. MPGM has three main benefits: (1) theoretically, MPGM solves the general QP problem with the doubly stochastic constraint naturally, with guaranteed convergence and KKT optimality. (2) Empirically, MPGM generally returns a sparse solution and thus can also incorporate the discrete constraint approximately. (3) It is efficient and simple to implement. Experimental results show the benefits of the MPGM algorithm.\n\n1 Introduction\n\nIn the computer vision and machine learning areas, many problems of interest can be formulated as graph matching. Previous approaches [3\u20135, 15, 16] have formulated graph matching as a Quadratic Programming (QP) problem with both doubly stochastic and discrete constraints. Since it is known to be NP-hard, many approximate algorithms have been developed to find approximate solutions for this problem [8, 16, 21, 24, 20, 13].\n\nOne kind of approximate method generally first relaxes the discrete constraint to obtain a continuous problem and aims to find the optimal solution of this continuous problem. After that, a final discrete solution is obtained by a discretization step such as the Hungarian or a greedy algorithm [3, 15, 16]. Obviously, the discretization step of these methods is generally independent of the matching objective optimization process, which may lead to weak local optima. Another kind of method aims to obtain a discrete solution of the QP matching problem directly [16, 1, 24]. For example, Leordeanu et al. [16] proposed an iterative matching method (IPFP) which optimizes the QP matching problem in a discrete domain. Zhou et al. 
[24, 25] proposed an effective graph matching method (FGM) which optimizes the QP matching problem approximately using a convex-concave relaxation technique [21] and thus returns a discrete solution. From the optimization aspect, the core optimization algorithm used in both IPFP [16] and FGM [24] is related to the Frank-Wolfe [9] algorithm, and FGM [24, 25] further uses a path-following procedure to alleviate the local-optimum problem more carefully. The core of the Frank-Wolfe [9] algorithm is to optimize the quadratic problem by sequentially optimizing linear approximations of the QP problem. In addition to optimization-based methods, probabilistic methods can also be used for solving graph matching problems [3, 19, 23].\n\n31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA.\n\nIn this paper, we propose a new algorithm, called Multiplicative Update Graph Matching (MPGM), that develops a multiplicative update technique for the general QP problem with the doubly stochastic constraint. MPGM has the following three main aspects. First, MPGM solves the general QP problem with the doubly stochastic constraint directly and naturally. In the MPGM algorithm, each update step has a closed-form solution, and the convergence of the algorithm is guaranteed. Moreover, the converged solution is guaranteed to satisfy Karush-Kuhn-Tucker (KKT) optimality. Second, empirically, MPGM can generate a sparse solution and thus incorporates the discrete constraint naturally in optimization. Therefore, MPGM can obtain a locally optimal discrete solution for the QP matching problem. Third, it is efficient and simple to implement. 
Experimental results on both synthetic and real-world matching tasks demonstrate the effectiveness and benefits of the proposed MPGM algorithm.\n\n2 Problem Formulation and Related Works\n\nProblem Formulation. Assume G = (V, E) and G\u2032 = (V\u2032, E\u2032) are two attributed graphs to be matched, where each node vi \u2208 V or edge eik \u2208 E has an attribute vector ai or rik. The aim of graph matching is to establish the correct correspondences between V and V\u2032. For each correspondence (vi, v\u2032j), there is an affinity Sa(ai, a\u2032j) that measures how well node vi \u2208 V matches node v\u2032j \u2208 V\u2032. Also, for each correspondence pair (vi, v\u2032j) and (vk, v\u2032l), there is an affinity Sr(rik, r\u2032jl) that measures the compatibility between node pair (vi, vk) and (v\u2032j, v\u2032l). One can define an affinity matrix W whose diagonal term W_{ij,ij} represents Sa(ai, a\u2032j), and whose non-diagonal element W_{ij,kl} contains Sr(rik, r\u2032jl). The one-to-one correspondences can be represented by a permutation matrix X \u2208 {0, 1}^{n\u00d7n}, where n = |V| = |V\u2032| (footnote 1). Here, X_{ij} = 1 implies that node vi in G corresponds to node v\u2032j in G\u2032, and X_{ij} = 0 otherwise. In this paper, we denote x = (X_{11}...X_{n1}, ..., X_{1n}...X_{nn})^T as a column-wise vectorized replica of X. 
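As a small concrete illustration of the column-wise vectorization just described (a hypothetical 3-node example of ours, not from the paper):

```python
import numpy as np

# Hypothetical 3-node example of the column-wise vectorization
# x = (X_11...X_n1, ..., X_1n...X_nn)^T used in the formulation.
n = 3
X = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])       # permutation: v_1<->v'_2, v_2<->v'_1, v_3<->v'_3
x = X.reshape(-1, order='F')    # Fortran order stacks columns, matching the definition
# Entry x[i + j*n] equals X[i, j], so pair indices ij in W_{ij,kl}
# can use the same column-wise convention.
print(x)                        # -> [0 1 0 1 0 0 0 0 1]
```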
The graph matching problem is generally formulated as a Quadratic Programming (QP) problem with doubly stochastic and discrete constraints [16, 3, 10], i.e.,\n\nx* = arg max_x (x^T W x)  s.t. x \u2208 P,  (1)\n\nwhere P is defined as,\n\nP = {x | \u2200i \u2211_j x_{ij} = 1, \u2200j \u2211_i x_{ij} = 1, x_{ij} \u2208 {0, 1}}.  (2)\n\nThe above QP problem is NP-hard and thus approximate relaxations are usually required. One popular way is to relax the permutation domain P to the doubly stochastic domain D,\n\nD = {x | \u2200i \u2211_j x_{ij} = 1, \u2200j \u2211_i x_{ij} = 1, x_{ij} \u2265 0}.  (3)\n\nThat is, one solves the following relaxed matching problem [21, 20, 10],\n\nx* = arg max_x (x^T W x)  s.t. x \u2208 D.  (4)\n\nSince W is not necessarily positive (or negative) semi-definite, this problem is generally neither concave nor convex.\n\nRelated Works. Many algorithms have been proposed to find a local optimal solution of the above QP matching problem (Eq.(4)). One kind of popular method uses constraint relaxation and projection, such as GA [10] and RRWM [3]. Generally, these methods iteratively conduct the following two steps: (a) searching for a solution while ignoring the doubly stochastic constraint temporarily; (b) projecting the current solution onto the desired doubly stochastic domain to obtain a feasible solution. Note that the projection step (b) is generally independent of the optimization step (a) and thus may lead to weak local optima. Another kind of important method uses objective function approximation and thus solves the problem approximately, such as the Frank-Wolfe algorithm [9]. 
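Projection step (b) above is commonly realized by alternating row and column normalization in the spirit of the normalizations of [20, 22]; a minimal Sinkhorn-style sketch of ours, assuming a strictly positive input matrix:

```python
import numpy as np

def sinkhorn_normalize(Z, iters=50):
    # Alternately normalize rows and columns of a positive matrix so it
    # approaches the doubly stochastic domain D (a common projection choice;
    # this helper is our illustration, not a routine from the paper).
    Z = Z.copy()
    for _ in range(iters):
        Z /= Z.sum(axis=1, keepdims=True)   # make rows sum to 1
        Z /= Z.sum(axis=0, keepdims=True)   # make columns sum to 1
    return Z
```

After the loop the columns sum to one exactly and the rows approximately, with the residual shrinking as `iters` grows.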
The Frank-Wolfe algorithm aims to optimize the above quadratic problem by sequentially solving approximate linear problems. This algorithm has been widely adopted in many recent matching methods [16, 24, 21], such as IPFP [16] and FGM [24].\n\n1Here, we focus on the equal-size graph matching problem. For graphs with different sizes, one can add dummy isolated nodes into the smaller graph and transform the problem to the equal-size case [21, 10].\n\n3 Algorithm\n\nOur aim in this paper is to develop a new algorithm to solve the general QP matching problem Eq.(4). We call it Multiplicative Update Graph Matching (MPGM). Formally, starting with an initial solution vector x(0), MPGM solves the problem Eq.(4) by iteratively updating a current solution vector x(t), t = 0, 1, ..., as follows,\n\nx(t+1)_{kl} = x(t)_{kl} [ (2(Wx(t))_{kl} + \u039b^\u2212_k + \u0393^\u2212_l) / (\u039b^+_k + \u0393^+_l) ]^{1/2},  (5)\n\nwhere \u039b^+_k = (|\u039b_k| + \u039b_k)/2, \u039b^\u2212_k = (|\u039b_k| \u2212 \u039b_k)/2, \u0393^+_l = (|\u0393_l| + \u0393_l)/2, \u0393^\u2212_l = (|\u0393_l| \u2212 \u0393_l)/2, and the Lagrangian multipliers (\u039b, \u0393) are computed as,\n\n\u0393 = 2 (I \u2212 X(t)^T X(t))^{\u22121} [ diag(K(t)^T X(t)) \u2212 X(t)^T diag(K(t) X(t)^T) ],  \u039b = 2 diag(K(t) X(t)^T) \u2212 X(t) \u0393,  (6)\n\nwhere K(t), X(t) are the matrix forms of the vectors Wx(t) and x(t), respectively, i.e., K(t), X(t) \u2208 R^{n\u00d7n} with K(t)_{kl} = (Wx(t))_{kl} and X(t)_{kl} = x(t)_{kl}, and \u039b = (\u039b_1, ..., \u039b_n)^T \u2208 R^{n\u00d71}, \u0393 = (\u0393_1, ..., \u0393_n)^T \u2208 R^{n\u00d71}. The iteration starts with an initial x(0) and is repeated until convergence.\n\nComplexity. The main cost in each iteration is computing Wx(t). Thus, the total computational complexity of MPGM is at most O(MN^2), where N = n^2 is the length of the vector x(t) and M is the maximum number of iterations. 
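To make the update concrete, here is a minimal NumPy sketch of one iteration of Eqs.(5)-(6). The pseudo-inverse used for (I \u2212 X^T X), the clamping of the numerator, and the small eps guard are implementation choices of ours, not details specified by the paper:

```python
import numpy as np

def mpgm_step(x, W, n, eps=1e-12):
    # One MPGM iteration (Eqs. (5)-(6)); x is the column-wise
    # vectorization of the n-by-n matching matrix X.
    X = x.reshape(n, n, order='F')          # X_kl = x_kl
    K = (W @ x).reshape(n, n, order='F')    # K_kl = (W x)_kl
    # Lagrangian multipliers, Eq. (6); pinv guards the singular direction
    # of (I - X^T X) along the all-ones vector (our choice).
    rhs = np.diag(K.T @ X) - X.T @ np.diag(K @ X.T)
    G = 2 * np.linalg.pinv(np.eye(n) - X.T @ X) @ rhs      # Gamma
    L = 2 * np.diag(K @ X.T) - X @ G                       # Lambda
    Lp, Lm = (np.abs(L) + L) / 2, (np.abs(L) - L) / 2      # Lambda^+, Lambda^-
    Gp, Gm = (np.abs(G) + G) / 2, (np.abs(G) - G) / 2      # Gamma^+, Gamma^-
    num = 2 * K + Lm[:, None] + Gm[None, :]  # k indexes rows, l indexes columns
    den = Lp[:, None] + Gp[None, :]
    X_new = X * np.sqrt(np.maximum(num, 0) / (den + eps))  # Eq. (5)
    return X_new.reshape(-1, order='F')
```

Iterating `mpgm_step` from a (near) doubly stochastic start reproduces the multiplicative structure of the algorithm: entries that reach zero stay at zero, which is the mechanism behind the sparsity discussed later.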
Our experience is that the algorithm converges quickly and the average number of iterations M is generally less than 200. Theoretically, the complexity of MPGM is the same as that of RRWM [3] and IPFP [16], and clearly lower than that of GA [10] and FGM [24].\n\nComparison with Related Works. Multiplicative update algorithms have been studied for solving matching problems [6, 13, 11, 12]. Our work is significantly different from previous works in the following aspects. Previous works [6, 13, 11] generally first develop a kind of approximation (or relaxation) of the QP matching problem by ignoring the doubly stochastic constraint, and then aim to find the optimum of the relaxed problem. In contrast, our work focuses on the general and challenging QP problem with the doubly stochastic constraint (Eq.(4)), and derives a simple multiplicative algorithm to solve problem Eq.(4) directly. Note that the proposed algorithm is not limited to QP matching; it can also be used for other QP (or general continuous objective) problems with doubly stochastic constraints (e.g., MAP inference, clustering) in the machine learning area. In this paper, we focus on the graph matching problem.\n\nStarting Point. To alleviate local optima, given an initial vector x(0), we first apply the simple projection x(0) = P(Wx(0)) several times to obtain a feasible starting point for the MPGM algorithm. Here P denotes the projection [22] or normalization [20] that makes x(0) satisfy the doubly stochastic constraint.\n\n4 Theoretical Analysis\n\nTheorem 1. Under update Eq.(5), the Lagrangian function L(x) is monotonically increasing, where\n\nL(x) = x^T W x \u2212 \u2211_i \u039b_i (\u2211_j x_{ij} \u2212 1) \u2212 \u2211_j \u0393_j (\u2211_i x_{ij} \u2212 1),  (7)\n\nand \u039b, \u0393 are Lagrangian multipliers.\n\nProof. 
To prove it, we use the auxiliary function approach [7, 14]. An auxiliary function \u03a6(x, ~x) of the Lagrangian function L(x) satisfies\n\n\u03a6(x, x) = L(x),  \u03a6(x, ~x) \u2264 L(x).  (8)\n\nUsing the auxiliary function \u03a6(x, ~x), we define\n\nx(t+1) = arg max_x \u03a6(x, x(t)).  (9)\n\nThen, by construction of \u03a6(x, ~x), we have\n\nL(x(t)) = \u03a6(x(t), x(t)) \u2264 \u03a6(x(t+1), x(t)) \u2264 L(x(t+1)).  (10)\n\nThis proves that L(x(t)) is monotonically increasing. The main remaining steps of the proof are to provide an appropriate auxiliary function and to find its global maximum. We rewrite Eq.(7) as\n\nL(x) = \u2211_{ijkl} W_{ij,kl} x_{ij} x_{kl} \u2212 \u2211_i \u039b_i (\u2211_j x_{ij} \u2212 1) \u2212 \u2211_j \u0393_j (\u2211_i x_{ij} \u2212 1).  (11)\n\nWe show that one auxiliary function \u03a6(x, ~x) of L(x) is\n\n\u03a6(x, ~x) = \u2211_{ijkl} W_{ij,kl} ~x_{ij} ~x_{kl} (1 + log (x_{ij} x_{kl})/(~x_{ij} ~x_{kl})) \u2212 \u2211_i \u039b^+_i [ \u2211_j (1/2)(x^2_{ij}/~x_{ij} + ~x_{ij}) \u2212 1 ] + \u2211_i \u039b^\u2212_i [ \u2211_j ~x_{ij}(1 + log x_{ij}/~x_{ij}) \u2212 1 ] \u2212 \u2211_j \u0393^+_j [ \u2211_i (1/2)(x^2_{ij}/~x_{ij} + ~x_{ij}) \u2212 1 ] + \u2211_j \u0393^\u2212_j [ \u2211_i ~x_{ij}(1 + log x_{ij}/~x_{ij}) \u2212 1 ].  (12)\n\nUsing the inequalities z \u2265 1 + log z, ab \u2264 (1/2)(a^2 + b^2) and a \u2264 (1/2)(a^2/b + b) (for b > 0), one can prove that Eq.(12) is a lower bound of Eq.(11). Thus, \u03a6(x, ~x) is an auxiliary function of L(x). 
According to Eq.(9), we need to find the global maximum of \u03a6(x, ~x) over x. The gradient is\n\n\u2202\u03a6(x, ~x)/\u2202x_{kl} = 2(W~x)_{kl} (~x_{kl}/x_{kl}) \u2212 (\u039b^+_k + \u0393^+_l)(x_{kl}/~x_{kl}) + (\u039b^\u2212_k + \u0393^\u2212_l)(~x_{kl}/x_{kl}).\n\nNote that, for the graph matching problem, we have W^T = W. Thus, the second derivative is\n\n\u2202^2\u03a6(x, ~x)/\u2202x_{kl}\u2202x_{ij} = \u2212[ (2(W~x)_{kl} + \u039b^\u2212_k + \u0393^\u2212_l)(~x_{kl}/x^2_{kl}) + (\u039b^+_k + \u0393^+_l)(1/~x_{kl}) ] \u03b4_{ki} \u03b4_{lj} \u2264 0.  (13)\n\nTherefore, \u03a6(x, ~x) is a concave function in x and has a unique global maximum. It can be obtained by setting the first derivative to zero (\u2202\u03a6(x, ~x)/\u2202x_{kl} = 0), which gives\n\nx_{kl} = ~x_{kl} [ (2(W~x)_{kl} + \u039b^\u2212_k + \u0393^\u2212_l) / (\u039b^+_k + \u0393^+_l) ]^{1/2}.  (14)\n\nTherefore, we obtain the update rule in Eq.(5) by setting x(t+1) = x and x(t) = ~x. \u25a1\n\nTheorem 2. Under update Eq.(5), the converged solution x* satisfies the Karush-Kuhn-Tucker (KKT) optimality conditions.\n\nProof. The standard Lagrangian function is\n\nL(x) = x^T W x \u2212 \u2211_i \u039b_i (\u2211_j x_{ij} \u2212 1) \u2212 \u2211_j \u0393_j (\u2211_i x_{ij} \u2212 1) \u2212 \u2211_i \u2211_j \u2206_{ij} x_{ij}.  (15)\n\nHere, we use this Lagrangian function to derive the KKT optimality conditions. 
Using Eq.(15), we have\n\n\u2202L(x)/\u2202x_{kl} = 2(Wx)_{kl} \u2212 \u039b_k \u2212 \u0393_l \u2212 \u2206_{kl}.  (16)\n\nThe corresponding KKT conditions are\n\n\u2202L(x)/\u2202x_{kl} = 2(Wx)_{kl} \u2212 \u039b_k \u2212 \u0393_l \u2212 \u2206_{kl} = 0,  (17)\n\n\u2202L(x)/\u2202\u039b_k = \u2212(\u2211_l x_{kl} \u2212 1) = 0,  (18)\n\n\u2202L(x)/\u2202\u0393_l = \u2212(\u2211_k x_{kl} \u2212 1) = 0,  (19)\n\n\u2206_{kl} x_{kl} = 0.  (20)\n\nThis leads to the following KKT complementary slackness condition,\n\n[2(Wx)_{kl} \u2212 \u039b_k \u2212 \u0393_l] x_{kl} = 0.  (21)\n\nBecause \u2211_l x_{kl} = 1 and \u2211_k x_{kl} = 1, summing Eq.(21) over the indexes l and k respectively, we obtain the following two groups of equations,\n\n2 \u2211_l x_{kl}(Wx)_{kl} \u2212 \u2211_l \u0393_l x_{kl} \u2212 \u039b_k = 0,  (22)\n\n2 \u2211_k x_{kl}(Wx)_{kl} \u2212 \u2211_k \u039b_k x_{kl} \u2212 \u0393_l = 0.  (23)\n\nEqs.(22, 23) can be equivalently reformulated in the following matrix forms,\n\n2 diag(KX^T) \u2212 \u039b \u2212 X\u0393 = 0,  (24)\n\n2 diag(K^T X) \u2212 \u0393 \u2212 X^T \u039b = 0,  (25)\n\nwhere k = 1, 2, ..., n, l = 1, 2, ..., n, and K, X are the matrix forms of the vectors Wx and x, respectively, i.e., K, X \u2208 R^{n\u00d7n} with K_{kl} = (Wx)_{kl} and X_{kl} = x_{kl}. 
Thus, we can obtain the values of \u0393 and \u039b as\n\n\u0393 = 2 (I \u2212 X^T X)^{\u22121} [ diag(K^T X) \u2212 X^T diag(KX^T) ],  (26)\n\n\u039b = 2 diag(KX^T) \u2212 X\u0393.  (27)\n\nOn the other hand, from update Eq.(5), at convergence,\n\nx*_{kl} = x*_{kl} [ (2(Wx*)_{kl} + \u039b^\u2212_k + \u0393^\u2212_l) / (\u039b^+_k + \u0393^+_l) ]^{1/2}.  (28)\n\nThus, we have (2(Wx*)_{kl} \u2212 \u039b_k \u2212 \u0393_l) x*^2_{kl} = 0, which is identical to the following KKT condition,\n\n[2(Wx*)_{kl} \u2212 \u039b_k \u2212 \u0393_l] x*_{kl} = 0.  (29)\n\nSubstituting the values of \u039b_k, \u0393_l from Eqs.(26, 27) into Eq.(28), we obtain the update rule Eq.(5). \u25a1\n\nRemark. Similar to the above analysis, we can also derive another similar update,\n\nx(t+1)_{kl} = x(t)_{kl} (2(Wx(t))_{kl} + \u039b^\u2212_k + \u0393^\u2212_l) / (\u039b^+_k + \u0393^+_l).  (30)\n\nThe optimality and convergence of this update are also guaranteed; we omit further discussion due to lack of space. In real applications, one can use both of these two update algorithms (Eq.(5), Eq.(30)) to obtain better results.\n\n5 Sparsity and Discrete Solution\n\nOne property of the proposed MPGM is that it can produce a sparse optimal solution, although the discrete binary constraint has been dropped in the MPGM optimization process. This suggests that MPGM searches for an optimal solution nearly on the permutation domain P, i.e., the boundary of the doubly stochastic domain D. Unfortunately, we cannot yet provide a theoretical proof of the sparsity of the MPGM solution, but we demonstrate it experimentally.\n\nFigure 1 (a) shows the solution x(t) across different iterations. Note that, regardless of initialization, as the iteration count increases, the solution vector x(t) of MPGM becomes more and more sparse and converges to a discrete binary solution. 
Note that, in the MPGM update Eq.(5), once x(t)_{kl} is close to zero, it remains close to zero in the following updates because of the multiplicative form of the update. Therefore, as the iteration count increases, the solution vector x(t+1) is guaranteed to be sparser than x(t). Figure 1 (b) shows the objective and sparsity (see footnote 2) of the solution vector x(t). We can observe that (1) the objective of x(t) increases and converges after some iterations, demonstrating the convergence of the MPGM algorithm; (2) the sparsity of the solution x(t) increases and converges to the baseline, which demonstrates the ability of the MPGM algorithm to maintain the discrete constraint in the converged solution.\n\n2Sparsity measures the percentage of zero (close-to-zero) elements in Z. First, set the threshold \u03f5 = 0.001 \u00d7 mean(Z), then set Z_{ij} = 0 if Z_{ij} \u2264 \u03f5. Finally, the sparsity is defined as the percentage of zero elements in the renewed Z.\n\nFigure 1: (a) Solution vector x(t) of MPGM across different iterations (top: start from uniform solution; middle: start from SM solution; bottom: start from RRWM solution). (b) Objective and sparsity of x(t) across iterations.\n\n6 Experiments\n\nWe have applied the MPGM algorithm to several matching tasks and compared it with other state-of-the-art methods, including SM [15], IPFP [16], SMAC [5], RRWM [3] and FGM [24]. We implemented IPFP [16] in two versions: (1) IPFP-U, initialized with the uniform solution; (2) IPFP-S, initialized with the SM method [15]. In the experiments, we initialize MPGM with the uniform solution and obtain similar results when initializing with the SM solution.\n\n6.1 Synthetic Data\n\nSimilar to the works [3, 24], we randomly generated data sets of n_in 2D points as inlier nodes for G. 
We obtain the corresponding nodes in graph G\u2032 by transforming the whole point set with a random rotation and translation and then adding Gaussian noise N(0, \u03c3) to the point positions from graph G. In addition, we also added n_out outlier nodes to both graphs at random positions. The affinity matrix W has been computed as W_{ij,kl} = exp(\u2212\u2225r_{ik} \u2212 r\u2032_{jl}\u2225^2_F / 0.0015), where r_{ik} is the Euclidean distance between two nodes in G, and similarly for r\u2032_{jl}.\n\nFigure 2 summarizes the comparison results. We can note that: (1) similar to IPFP [16] and FGM [24], which return discrete matching solutions, MPGM always generates sparse solutions on the doubly stochastic domain. (2) MPGM returns higher objective scores and accuracy than IPFP [16] and FGM [24], which demonstrates that MPGM finds sparse solutions that are more optimal than those of these methods. (3) MPGM generally performs better than the continuous-domain methods, including SM [15], SMAC [5] and RRWM [3]; compared with these methods, MPGM incorporates the doubly stochastic constraint more naturally and thus finds more optimal solutions. (4) MPGM generally has a time cost similar to RRWM [3]. We have not shown the time cost of the FGM [24] method in Fig.2, because FGM uses a hybrid optimization method and has a clearly higher time cost than the other methods.\n\n6.2 Image Sequence Data\n\nIn this section, we perform feature matching on the CMU and YORK house sequences [3, 2, 18]. For the CMU \"hotel\" sequence, we matched all image pairs spaced by 5, 10, ..., 75 and 80 frames and computed the average performance per separation gap. For the YORK house sequence, we matched all image pairs spaced by 1, 2, ..., 8 and 9 frames and computed the average performance per separation gap. 
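The pairwise affinity construction used in the synthetic setting can be sketched as follows (a naive O(n^4) loop version of ours for clarity; the function and variable names are our own, and the bandwidth default is the synthetic value 0.0015):

```python
import numpy as np

def build_affinity(P, Q, bandwidth=0.0015):
    # W[ij, kl] = exp(-(r_ik - r'_jl)^2 / bandwidth), where r_ik is the
    # Euclidean distance between points i, k of G (likewise r'_jl for G'),
    # and ij = i + j*n follows the column-wise vectorization of X.
    n = len(P)
    R = np.linalg.norm(P[:, None, :] - P[None, :, :], axis=-1)   # r_ik
    Rp = np.linalg.norm(Q[:, None, :] - Q[None, :, :], axis=-1)  # r'_jl
    W = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for l in range(n):
                    W[i + j * n, k + l * n] = np.exp(-(R[i, k] - Rp[j, l]) ** 2 / bandwidth)
    return W
```

With this convention W is symmetric, and x^T W x sums the pairwise affinities of all matched pairs; the paper additionally places node affinities S_a on the diagonal, which this distance-only sketch leaves at exp(0) = 1.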
The affinity matrix has been computed as W_{ij,kl} = exp(\u2212\u2225r_{ik} \u2212 r\u2032_{jl}\u2225^2_F / 1000), where r_{ik} is the Euclidean distance between two points.\n\nFigure 3 summarizes the performance results. It can be noted that MPGM outperforms the other methods in both objective score and matching accuracy, indicating the effectiveness of the MPGM method. Also, MPGM can generate sparse solutions. These results are generally consistent with those of the synthetic data experiments and further demonstrate the benefits of the MPGM algorithm.\n\nFigure 2: Comparison results of different methods on synthetic point sets matching.\n\nFigure 3: Comparison results of different methods on CMU and YORK image sequences. Top: CMU images; Bottom: YORK images.\n\n6.3 Real-world Image Data\n\nIn this section, we tested our method on real-world image datasets. We evaluate MPGM on the dataset [17] whose images are selected from Pascal 2007 (see footnote 3). This dataset contains 30 pairs of car images and 20 pairs of motorbike images. For each image pair, feature points and ground-truth matches were manually marked, and each pair contains 30-60 ground-truth correspondences. The affinity between two nodes is computed as W_{ij,ij} = exp(\u2212|p_i \u2212 p\u2032_j| / 0.05), where p_i is the orientation of the normal vector at the sampled point (node) i on the contour, and similarly for p\u2032_j. 
3http://www.pascalnetwork.org/challenges/VOC/voc2007/workshop/index.html\n\n[Figures 2-3 plots: accuracy, objective score, sparsity and time curves of FGM, RRWM, SM, IPFP-U, IPFP-S, SMAC and MPGM under varying deformation noise \u03c3, numbers of outliers n_out, and frame separations.]\n\n
Figure 4: Some examples of image matching on the Pascal 2007 dataset (LEFT: original image pair, MIDDLE: FGM result, RIGHT: MPGM result. Incorrect matches are marked by red lines).\n\nFigure 5: Comparison results of different graph matching methods on the Pascal 2007 dataset.\n\nAlso, the affinity between two correspondences has been computed as W_{ij,kl} = exp(\u2212|d_{ik} \u2212 d\u2032_{jl}| / 0.15), where d_{ik} denotes the Euclidean distance between feature points i and k, and similarly for d\u2032_{jl}. Some matching examples are shown in Figure 4. To test the performance against outlier noise, we randomly added 0-20 outlier features to each image pair. The overall matching accuracy across different numbers of outlier features is summarized in Figure 5. From Figure 5, we can note that MPGM outperforms the other competing methods, including RRWM [3] and FGM [24], which further demonstrates the effectiveness and practicality of MPGM on real-world image matching tasks.\n\n7 Conclusions and Future Work\n\nThis paper presents an effective algorithm, Multiplicative Update Graph Matching (MPGM), that develops a multiplicative update technique to solve the QP matching problem with the doubly stochastic mapping constraint. The KKT optimality and convergence properties of the MPGM algorithm are theoretically guaranteed. We show experimentally that the MPGM solution is sparse and thus approximately incorporates the discrete constraint in optimization naturally. 
In future work, the theoretical analysis of the sparsity of MPGM needs to be studied further. We also plan to incorporate MPGM into a path-following strategy to find better solutions for the matching problem, and to adapt the proposed algorithm to other optimization problems with doubly stochastic constraints in the machine learning and computer vision areas.\n\nAcknowledgment\n\nThis work is supported by the NBRPC 973 Program (2015CB351705); National Natural Science Foundation of China (61602001, 61671018, 61572030); Natural Science Foundation of Anhui Province (1708085QF139); Natural Science Foundation of Anhui Higher Education Institutions of China (KJ2016A020); Co-Innovation Center for Information Supply & Assurance Technology, Anhui University; and the Open Projects Program of National Laboratory of Pattern Recognition.\n\nReferences\n\n[1] K. Adamczewski, Y. Suh, and K. M. Lee. Discrete tabu search for graph matching. In ICCV, pages 109\u2013117, 2015.\n\n[2] T. S. Caetano, J. J. McAuley, L. Cheng, Q. V. Le, and A. J. Smola. Learning graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(6):1048\u20131058, 2009.\n\n[3] M. Cho, J. Lee, and K. M. Lee. Reweighted random walks for graph matching. In European Conference on Computer Vision, pages 492\u2013505, 2010.\n\n[4] D. Conte, P. Foggia, C. Sansone, and M. Vento. Thirty years of graph matching in pattern recognition. International Journal of Pattern Recognition and Artificial Intelligence, pages 265\u2013298, 2004.\n\n[5] T. Cour, P. Srinivasan, and J. Shi. Balanced graph matching. In Neural Information Processing Systems, pages 313\u2013320, 2006.\n\n[6] C. Ding, T. Li, and M. I. Jordan. Nonnegative matrix factorization for combinatorial optimization: Spectral clustering, graph matching and clique finding. In IEEE International Conference on Data Mining, pages 183\u2013192, 2008.\n\n[7] C. Ding, T. Li, and M. I. Jordan. 
Convex and semi-nonnegative matrix factorization. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(1):45\u201355, 2010.\n\n[8] O. Enqvist, K. Josephson, and F. Kahl. Optimal correspondences from pairwise constraints. In IEEE International Conference on Computer Vision, pages 1295\u20131302, 2009.\n\n[9] M. Frank and P. Wolfe. An algorithm for quadratic programming. Naval Research Logistics Quarterly, 3(1-2):95\u2013110, 1956.\n\n[10] S. Gold and A. Rangarajan. A graduated assignment algorithm for graph matching. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(4):377\u2013388, 1996.\n\n[11] B. Jiang, J. Tang, C. Ding, and B. Luo. A local sparse model for matching problem. In AAAI, pages 3790\u20133796, 2015.\n\n[12] B. Jiang, J. Tang, C. Ding, and B. Luo. Nonnegative orthogonal graph matching. In AAAI, 2017.\n\n[13] B. Jiang, H. F. Zhao, J. Tang, and B. Luo. A sparse nonnegative matrix factorization technique for graph matching problems. Pattern Recognition, 47(1):736\u2013747, 2014.\n\n[14] D. D. Lee and H. S. Seung. Algorithms for nonnegative matrix factorization. In Neural Information Processing Systems, pages 556\u2013562, 2001.\n\n[15] M. Leordeanu and M. Hebert. A spectral technique for correspondence problems using pairwise constraints. In IEEE International Conference on Computer Vision, pages 1482\u20131489, 2005.\n\n[16] M. Leordeanu, M. Hebert, and R. Sukthankar. An integer projected fixed point method for graph matching and MAP inference. In Neural Information Processing Systems, pages 1114\u20131122, 2009.\n\n[17] M. Leordeanu, R. Sukthankar, and M. Hebert. Unsupervised learning for graph matching. International Journal of Computer Vision, 95(1):1\u201318, 2011.\n\n[18] B. Luo, R. C. Wilson, and E. R. Hancock. Spectral embedding of graphs. Pattern Recognition, 36(10):2213\u20132230, 2003.\n\n[19] J. J. McAuley and T. S. Caetano. Fast matching of large point sets under occlusions. 
Pattern Recognition, 45(1):563\u2013569, 2012.\n\n[20] B. J. van Wyk and M. A. van Wyk. A POCS-based graph matching algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(11):1526\u20131530, 2004.\n\n[21] M. Zaslavskiy, F. Bach, and J. P. Vert. A path following algorithm for the graph matching problem. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31(12):2227\u20132242, 2009.\n\n[22] R. Zass and A. Shashua. Doubly stochastic normalization for spectral clustering. In Neural Information Processing Systems, pages 1569\u20131576, 2006.\n\n[23] Z. Zhang, Q. Shi, J. McAuley, W. Wei, Y. Zhang, and A. van den Hengel. Pairwise matching through max-weight bipartite belief propagation. In CVPR, pages 1202\u20131210, 2016.\n\n[24] F. Zhou and F. De la Torre. Factorized graph matching. In IEEE Conference on Computer Vision and Pattern Recognition, pages 127\u2013134, 2012.\n\n[25] F. Zhou and F. De la Torre. Deformable graph matching. In IEEE Conference on Computer Vision and Pattern Recognition, 2013.\n", "award": [], "sourceid": 1806, "authors": [{"given_name": "Bo", "family_name": "Jiang", "institution": "Anhui University"}, {"given_name": "Jin", "family_name": "Tang", "institution": null}, {"given_name": "Chris", "family_name": "Ding", "institution": "University of Texas at Arlington"}, {"given_name": "Yihong", "family_name": "Gong", "institution": null}, {"given_name": "Bin", "family_name": "Luo", "institution": null}]}