{"title": "New Rules for Domain Independent Lifted MAP Inference", "book": "Advances in Neural Information Processing Systems", "page_first": 649, "page_last": 657, "abstract": "Lifted inference algorithms for probabilistic first-order logic frameworks such as Markov logic networks (MLNs) have received significant attention in recent years. These algorithms use so called lifting rules to identify symmetries in the first-order representation and reduce the inference problem over a large probabilistic model to an inference problem over a much smaller model. In this paper, we present two new lifting rules, which enable fast MAP inference in a large class of MLNs. Our first rule uses the concept of single occurrence equivalence class of logical variables, which we define in the paper. The rule states that the MAP assignment over an MLN can be recovered from a much smaller MLN, in which each logical variable in each single occurrence equivalence class is replaced by a constant (i.e., an object in the domain of the variable). Our second rule states that we can safely remove a subset of formulas from the MLN if all equivalence classes of variables in the remaining MLN are single occurrence and all formulas in the subset are tautology (i.e., evaluate to true) at extremes (i.e., assignments with identical truth value for groundings of a predicate). We prove that our two new rules are sound and demonstrate via a detailed experimental evaluation that our approach is superior in terms of scalability and MAP solution quality to the state of the art approaches.", "full_text": "New Rules for Domain Independent\n\nLifted MAP Inference\n\nHappy Mittal, Prasoon Goyal\nDept. of Comp. Sci. & Engg.\n\nI.I.T. Delhi, Hauz Khas\nNew Delhi, 110016, India\nhappy.mittal@cse.iitd.ac.in\n\nprasoongoyal13@gmail.com\n\nVibhav Gogate\n\nDept. of Comp. Sci.\nUniv. of Texas Dallas\n\nRichardson, TX 75080, USA\n\nParag Singla\n\nDept. of Comp. Sci. & Engg.\n\nI.I.T. 
Delhi, Hauz Khas\nNew Delhi, 110016, India\n\nvgogate@hlt.utdallas.edu\n\nparags@cse.iitd.ac.in\n\nAbstract\n\nLifted inference algorithms for probabilistic \ufb01rst-order logic frameworks such as\nMarkov logic networks (MLNs) have received signi\ufb01cant attention in recent years.\nThese algorithms use so called lifting rules to identify symmetries in the \ufb01rst-order\nrepresentation and reduce the inference problem over a large probabilistic model\nto an inference problem over a much smaller model. In this paper, we present\ntwo new lifting rules, which enable fast MAP inference in a large class of MLNs.\nOur \ufb01rst rule uses the concept of single occurrence equivalence class of logical\nvariables, which we de\ufb01ne in the paper. The rule states that the MAP assignment\nover an MLN can be recovered from a much smaller MLN, in which each logical\nvariable in each single occurrence equivalence class is replaced by a constant (i.e.,\nan object in the domain of the variable). Our second rule states that we can safely\nremove a subset of formulas from the MLN if all equivalence classes of variables\nin the remaining MLN are single occurrence and all formulas in the subset are\ntautology (i.e., evaluate to true) at extremes (i.e., assignments with identical truth\nvalue for groundings of a predicate). We prove that our two new rules are sound and\ndemonstrate via a detailed experimental evaluation that our approach is superior in\nterms of scalability and MAP solution quality to the state of the art approaches.\n\nIntroduction\n\n1\nMarkov logic [4] uses weighted \ufb01rst order formulas to compactly encode uncertainty in large,\nrelational domains such as those occurring in natural language understanding and computer vision. At\na high level, a Markov logic network (MLN) can be seen as a template for generating ground Markov\nnetworks. 
Therefore, a natural way to answer inference queries over MLNs is to construct a ground\nMarkov network and then use standard inference techniques (e.g., Loopy Belief Propagation) for\nMarkov networks. Unfortunately, this approach is not practical because the ground Markov networks\ncan be quite large, having millions of random variables and features.\nLifted inference approaches [17] avoid grounding the whole Markov logic theory by exploiting\nsymmetries in the first-order representation. Existing lifted inference algorithms can be roughly\ndivided into two types: algorithms that lift exact solvers [2, 3, 6, 17], and algorithms that lift\napproximate inference techniques such as belief propagation [12, 20] and sampling-based methods [7,\n21]. Another line of work [1, 5, 9, 15] attempts to characterize the complexity of lifted inference\nindependent of the specific solver being used. Despite the presence of a large literature on lifting,\nthere has been limited focus on exploiting the specific structure of the MAP problem. Some recent\nwork [14, 16] has looked at exploiting symmetries in the context of LP formulations for MAP\ninference. Sarkhel et al. [19] show that the MAP problem can be propositionalized in the limited\nsetting of non-shared MLNs. But largely, the question is still open as to whether there can be a greater\nexploitation of the structure for lifting MAP inference.\n\n1\n\n\fIn this paper, we propose two new rules for lifted inference specifically tailored for MAP queries.\nWe identify equivalence classes of variables which are single occurrence, i.e., they have at most a\nsingle variable from the class appearing in any given formula. Our first rule for lifting states that\nMAP inference over the original theory can be equivalently formulated over a reduced theory where\nevery single occurrence class has been reduced to a unary sized domain. 
This leads to a general\nframework for transforming the original theory into a (MAP) equivalent reduced theory. Any existing\n(propositional or lifted) MAP solver can be applied over this reduced theory. When every equivalence\nclass is single occurrence, our approach is domain independent, i.e., the complexity of MAP inference\ndoes not depend on the number of constants in the domain. Existing lifting constructs such as the\ndecomposer [6] and the non-shared MLNs [19] are special cases of our single occurrence rule.\nWhen the MLN theory is single occurrence, one of the MAP solutions lies at an extreme, namely all\ngroundings of any given predicate have identical values (true/false) in the MAP assignment. Our\nsecond rule for lifting states that formulas which become tautologies (i.e., evaluate to true) at extreme\nassignments can be ignored for the purpose of MAP inference when the remaining theory is single\noccurrence. Many difficult-to-lift formulas such as symmetry and transitivity are easy to handle in our\nframework because of this rule. Experiments on three benchmark MLNs clearly demonstrate that our\napproach is more accurate and scalable than the state of the art approaches for MAP inference.\n2 Background\nA first order logic [18] theory is constructed using constant, variable, function and\npredicate symbols. Predicates are defined over terms as arguments, where each term is either\na constant, a variable, or a function applied to a term. A formula is constructed by combining\npredicates using operators such as \u00ac, \u2227 and \u2228. Variables in a first-order theory are often referred\nto as Logical Variables. Variables in a formula can be universally or existentially quantified.\nA Knowledge Base (KB) is a set of formulas. A theory is in Conjunctive Normal Form\n(CNF) if it is expressed as a conjunction of disjunctive formulas. 
The process of (partial)\ngrounding corresponds to replacing some (or all) of the free variables in a predicate or a formula\nwith constants in the theory. In this paper, we assume a function-free first order logic theory with\nHerbrand interpretations [18], and that variables in the theory are implicitly universally quantified.\nMarkov Logic [4] is defined as a set of pairs (fi, wi), where fi is a formula in first-order logic and\nwi is its weight. The weight wi signifies the strength of the constraint represented by the formula fi.\nGiven a set of constants, an MLN can be seen as a template for constructing ground Markov networks.\nThere is a node in the network for every ground atom and a feature for every ground formula. The\nprobability distribution specified by an MLN is:\n\nP (X = x) = (1/Z) exp( \u2211_{i : fi \u2208 F} wi ni(x) )    (1)\n\nwhere X = x specifies an assignment to the ground atoms, the sum in the exponent is taken over the\nindices of the first order formulas (denoted by F) in the theory, wi is the weight of the ith formula,\nni(x) denotes the number of true groundings of the ith formula under the assignment x, and Z is\nthe normalization constant. A formula f in an MLN with weight w can be equivalently replaced by the\nnegation of the formula, i.e., \u00acf with weight \u2212w. Hence, without loss of generality, we will assume\nthat all the formulas in our MLN theory have non-negative weights. Also for convenience, we will\nassume that each formula is either a conjunction or a disjunction of literals.\nThe MAP inference task is defined as the task of finding an assignment (there could be multiple such\nassignments) having the maximum probability. 
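To make Equation (1) concrete, here is a minimal sketch in Python; the two-predicate theory, domain, and assignment below are hypothetical toy choices (not from the paper), and n_i(x) is computed by enumerating groundings directly.

```python
import itertools

# Toy illustration of Equation (1): n_i(x) counts the true groundings of a
# formula f_i under assignment x.  The theory below (one formula,
# f1 = P(X) v Q(X, Y), weight 1.5, domain {a, b}) is an assumed example.
domain = ["a", "b"]

# x assigns a truth value to every ground atom, keyed by (predicate, args).
x = {("P", ("a",)): True, ("P", ("b",)): False,
     ("Q", ("a", "a")): False, ("Q", ("a", "b")): True,
     ("Q", ("b", "a")): False, ("Q", ("b", "b")): False}

def n_f1(x):
    # one grounding of f1 per (X, Y) pair over the domain
    return sum(1 for X, Y in itertools.product(domain, repeat=2)
               if x[("P", (X,))] or x[("Q", (X, Y))])

def log_unnormalized_prob(x, w1=1.5):
    # the exponent of Equation (1): sum_i w_i * n_i(x) (one formula here)
    return w1 * n_f1(x)

print(n_f1(x), log_unnormalized_prob(x))  # 2 true groundings, score 3.0
```

Only the groundings with X = a are satisfied here (via P(a)), so n_1(x) = 2 and the exponent is 1.5 * 2 = 3.0.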
Since Z is a constant and exp is a monotonically\nincreasing function, the MAP problem for MLNs can be written as:\n\narg max_x P (X = x) = arg max_x \u2211_{i : fi \u2208 F} wi ni(x)    (2)\n\nOne of the ways to find the MAP solution in MLNs is to ground the whole theory and then reformulate\nthe problem as a MaxSAT problem [4]. Given a set of weighted clauses (constraints), the goal in\nMaxSAT is to find an assignment which maximizes the sum of the weights of the satisfied clauses.\nAny standard solver such as MaxWalkSAT [10] can be used over the ground theory to find the MAP\nsolution. This can be wasteful when there is rich structure present in the network and lifted inference\ntechniques can exploit this structure [11]. In this paper, we assume an MLN theory for the ease of\n\n2\n\n\fexposition. But our ideas are easily generalizable to other similar representations such as weighted\nparfactors [2], probabilistic knowledge bases [6] and WFOMC [5].\n\n3 Basic Framework\n3.1 Motivation\nMost existing work on lifted MAP inference adapts the techniques for lifting marginal inference. One\nof the key ideas used in lifting is to exploit the presence of a decomposer [2, 6, 9]. A decomposer\nsplits the theory into identical but independent sub-theories and therefore only one of them needs to\nbe solved. A counting argument can be used when a decomposer is not present [2, 6, 9]. For theories\ncontaining up to two logical variables in each clause, there exists a polynomial time lifted inference\nprocedure [5]. Specifically exploiting the structure of MAP inference, Sarkhel et al. [19] show that\nMAP inference in non-shared MLNs (with no self-joins) can be reduced to a propositional problem.\nDespite all these lifting techniques, there is a larger class of MLN formulas where it is still not clear\nwhether there exists an efficient lifting algorithm for MAP inference. 
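The ground MAP-as-weighted-MaxSAT view of Equation (2) can be sketched with a tiny brute-force solver; the atoms, clauses, and weights below are hypothetical toy values, and the exhaustive enumeration is purely illustrative (a real solver such as MaxWalkSAT or an ILP would be used instead).

```python
import itertools

# Brute-force weighted MaxSAT over an assumed tiny ground theory.
atoms = ["P_a", "P_b", "Q_ab"]
# each clause: (weight, list of (atom, required truth value)) -- a disjunction
clauses = [
    (1.5, [("P_a", True), ("Q_ab", True)]),   # P(a) v Q(a,b)
    (2.0, [("P_b", False)]),                  # ~P(b)
    (0.5, [("P_b", True), ("Q_ab", False)]),  # P(b) v ~Q(a,b)
]

def satisfied_weight(assign):
    # total weight of clauses with at least one satisfied literal
    return sum(w for w, lits in clauses
               if any(assign[a] == sign for a, sign in lits))

def brute_force_map():
    # enumerate all 2^n assignments and keep the best (exponential, for
    # illustration only)
    best = max((dict(zip(atoms, vals)) for vals in
                itertools.product([False, True], repeat=len(atoms))),
               key=satisfied_weight)
    return best, satisfied_weight(best)

assignment, value = brute_force_map()
print(assignment, value)  # the optimum satisfies all three clauses (4.0)
```

Here the unique optimum sets P(a) true and P(b), Q(a,b) false, satisfying all three clauses for a total weight of 4.0.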
For instance, consider the single\nrule MLN theory:\n\nw1 Parent(X, Y) \u2227 Friend(Y, Z) \u21d2 Knows(X, Z)\n\nThis rule is hard to lift for any of the existing algorithms since neither the decomposer nor the counting\nargument is directly applicable. The counting argument can be applied after (partially) grounding X\nand as a result lifted inference on this theory will be more efficient than ground inference. However,\nconsider adding transitivity to the above theory:\n\nw2 Friend(X, Y) \u2227 Friend(Y, Z) \u21d2 Friend(X, Z)\n\nThis makes the problem even harder because in order to process the new MLN formula via lifted\ninference, one has to at least ground both the arguments of Friend. In this work, we exploit specific\nproperties of MAP inference and develop two new lifting rules, which are able to lift the above theory.\nIn fact, as we will show, MAP inference for the MLN containing (exactly) the two formulas given above\nis domain independent, namely, it does not depend on the domain size of the variables.\n\n3.2 Notation and Preliminaries\nWe will use the upper case letters X, Y, Z etc. to denote variables. We will use the lower case\nletters a, b, c etc. to denote constants. Let \u2206X denote the domain of a variable X. We will assume\nthat the variables in the MLN are standardized apart, namely, no two formulas contain the same\nvariable symbol. Further, we will assume that the input MLN is in normal form [9]. An MLN is\nsaid to be in normal form if: a) whenever X and Y are two variables appearing at the same argument position\nin a predicate P in the MLN theory, then \u2206X = \u2206Y; b) there are no constants in any formula. Any\ngiven MLN can be converted into normal form by a series of mechanical operations in time that\nis polynomial in the size of the MLN theory and the evidence. We will require normal forms for\nsimplicity of exposition. 
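The standardizing-apart step above can be sketched as a small renaming pass; the clause encoding (a formula as a list of (predicate, argument-tuple) literals, with upper-case strings as variables) is an assumption of ours for illustration, not the paper's representation.

```python
import itertools

# A sketch of "standardizing apart": rename logical variables so that no
# two formulas share a variable symbol.  Encoding conventions are assumed.
def standardize_apart(formulas):
    counter = itertools.count()
    renamed = []
    for lits in formulas:
        mapping = {}  # per-formula renaming, so formulas never share names
        new_lits = []
        for pred, args in lits:
            new_args = []
            for a in args:
                if a.isupper():  # convention: upper-case strings are variables
                    if a not in mapping:
                        mapping[a] = "V%d" % next(counter)
                    new_args.append(mapping[a])
                else:
                    new_args.append(a)  # constants are left untouched
            new_lits.append((pred, tuple(new_args)))
        renamed.append(new_lits)
    return renamed

theory = [[("P", ("X",)), ("Q", ("X", "Y"))],   # P(X) v Q(X,Y)
          [("P", ("X",)), ("Q", ("U", "V"))]]   # P(X) v Q(U,V)
out = standardize_apart(theory)
```

After the pass, the repeated symbol X in the two formulas has been renamed to distinct fresh variables, while the repetition of X *within* the first formula is preserved.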
For lack of space, proofs of all the theorems and lemmas marked by (*) are\npresented in the extended version of the paper (see the supplementary material).\nFollowing Jha et al. [9] and Broeck [5], we define a symmetric and transitive relation over the\nvariables in the theory as follows. X and Y are related if either a) they appear in the same position\nof a predicate P, or b) \u2203 a variable Z such that X, Z and Y, Z are related. We refer to the relation\nabove as the binding relation [5]. Being symmetric and transitive, the binding relation splits the variables\ninto a set of equivalence classes. We say that X and Y bind to each other if they belong to\nthe same equivalence class under the binding relation. We denote this by writing X \u223c Y. We will\nuse the notation \u00afX to refer to the equivalence class to which variable X belongs. As an example,\nthe MLN theory consisting of two rules: 1) P(X) \u2228 Q(X, Y) 2) P(Z) \u2228 Q(U, V) has two variable\nequivalence classes given by {X, Z, U} and {Y, V}.\nBroeck [5] introduces the notion of domain lifted inference. An inference procedure is domain\nlifted if it is polynomial in the size of the variable domains. Note that the notion of domain lifted\ndoes not impose any condition on how the complexity depends on the size of the MLN theory. Along\nsimilar lines, we introduce the notion of domain independent inference.\nDefinition 3.1. An inference procedure is domain independent if its time complexity is independent\nof the domain size of the variables. As in the case of domain lifted inference, the complexity\ncan still depend arbitrarily on the size of the MLN theory.\n\n3\n\n\f4 Exploiting Single Occurrence\nWe show that the domains of equivalence classes satisfying certain desired properties can be reduced\nto unary sized domains for the MAP inference task. This forms the basis of our first inference rule.\nDefinition 4.1. 
Given an MLN theory M, a variable equivalence class \u00afX is said to be single\noccurrence with respect to M if for any two variables X, Y \u2208 \u00afX, X and Y do not appear\ntogether in any formula in the MLN. In other words, every formula in the MLN has at most a single\noccurrence of variables from \u00afX. A predicate is said to be single occurrence if each of the equivalence\nclasses of its argument variables is single occurrence. An MLN is said to be single occurrence\nif each of its variable equivalence classes is single occurrence.\nConsider the MLN theory with two formulas as earlier: 1) P(X) \u2228 Q(X, Y) 2) P(Z) \u2228 Q(U, V).\nHere, {Y, V} is a single occurrence equivalence class while {X, Z, U} is not. Next, we show that\nthe MAP tuple of an MLN can be recovered from a much smaller MLN in which the domain size of\neach variable in each single occurrence equivalence class is reduced to one.\n\n4.1 First Rule for Lifting MAP\nTheorem 4.1. Let M be an MLN theory represented by the set of pairs {(fi, wi)}^m_{i=1}. Let \u00afX be\na single occurrence equivalence class with domain \u2206_\u00afX. Then, the MAP inference problem in M can\nbe reduced to the MAP inference problem over a simpler MLN M^r_\u00afX represented by a set of pairs\n{(fi, w'_i)}^m_{i=1} where the domain of \u00afX has been reduced to a single constant.\nProof. We will prove the above theorem by constructing the desired theory M^r_\u00afX. Note that M^r_\u00afX has\nthe same set of formulas as M with a set of modified weights. Let F_\u00afX be the set of formulas in M\nwhich contain a variable from the equivalence class \u00afX. Let F_\u2212\u00afX be the set of formulas in M which\ndo not contain a variable from the equivalence class \u00afX. Let {a1, a2, . . . , ar} be the domain of \u00afX.\nWe will split the theory M into r equivalent theories {M1, M2, . . . , Mr} such that for each Mj:^1\n1. For every formula fi \u2208 F_\u00afX with weight wi, Mj contains fi with weight wi.\n2. For every formula fi \u2208 F_\u2212\u00afX with weight wi, Mj contains fi with weight wi/r.\n3. The domain of \u00afX in Mj is reduced to a single constant {aj}.\n4. All other equivalence classes have domains identical to those in M.\nThis divides the set of weighted constraints in M across the r sub-theories. Formally:\n\nLemma 4.1. * The set of weighted constraints in M is a union of the set of weighted constraints in\nthe sub-theories {Mj}^r_{j=1}.\nCorollary 4.1. Let x be an assignment to the ground atoms in M. Let the function W_M(x) denote the\nweight of satisfied ground formulas under the assignment x in theory M. Further, let x_j denote\nthe assignment x restricted to the ground atoms in theory Mj. Then: W_M(x) = \u2211^r_{j=1} W_{Mj}(x_j).\n\nIt is easy to see that the Mj's are identical to each other up to the renaming of the constants aj. Hence,\nexploiting symmetry, there is a one to one correspondence between the assignments across the\nsub-theories. In particular, there is a one to one correspondence between MAP assignments across the\nsub-theories {Mj}^r_{j=1}.\nLemma 4.2. If x^MAP_j is a MAP assignment to the theory Mj, then there exists a MAP assignment\nx^MAP_l to Ml such that x^MAP_l is identical to x^MAP_j with the difference that each occurrence of the\nconstant aj (in ground atoms of Mj) is replaced by the constant al (in ground atoms of Ml).\nProof of this lemma follows from the construction of the sub-theories M1, M2, . . . , Mr. Next, we\nwill show that a MAP solution for the theory M can be read off from the MAP solution for any of the\ntheories {Mj}^r_{j=1}. Without loss of generality, let us consider the theory M1. Let x^MAP_1 be some\nMAP assignment for M1. Using Lemma 4.2 there are MAP assignments x^MAP_2, x^MAP_3, . . . , x^MAP_r\nfor M2, M3, . . . , Mr which are identical to x^MAP_1 up to renaming of the constant a1. We construct an\nassignment x^MAP for the original theory M as follows.\n\n^1 The supplement presents an example of splitting an MLN theory based on the following procedure.\n\n4\n\n\f1. For each predicate P which does not contain any occurrence of the variables from the equivalence\nclass \u00afX, read off the assignment to its groundings in x^MAP from x^MAP_1. Note that the assignments of the\ngroundings of P are identical in each of the x^MAP_j because of Lemma 4.2.\n2. The (partial) groundings of each predicate P whose arguments contain a variable X \u2208 \u00afX are split\nacross the sub-theories {Mj}_{1\u2264j\u2264r} corresponding to the substitutions {X = aj}_{1\u2264j\u2264r}, respectively.\nWe assign the groundings of P in x^MAP the values from the assignments x^MAP_1, x^MAP_2, . . . , x^MAP_r\nfor the respective partial groundings. Because of Lemma 4.2, these partial groundings have identical\nvalues across the sub-theories up to renaming of the constant aj and hence, can be read off from either\nof the sub-theory assignments, and more specifically, x^MAP_1.\nBy construction, the assignment x^MAP restricted to the ground atoms in sub-theory Mj corresponds to\nthe assignment x^MAP_j for each j, 1 \u2264 j \u2264 r.\nThe only thing remaining to show is that x^MAP is indeed a MAP assignment for M. Suppose it\nis not; then there is another assignment x^alt such that W_M(x^alt) > W_M(x^MAP). Using Corollary\n4.1, W_M(x^alt) > W_M(x^MAP) \u21d2 \u2211^r_{j=1} W_{Mj}(x^alt_j) > \u2211^r_{j=1} W_{Mj}(x^MAP_j). This means that \u2203j\nsuch that W_{Mj}(x^alt_j) > W_{Mj}(x^MAP_j). But this would imply that x^MAP_j is not a MAP assignment\nfor Mj, which is a contradiction. Hence, x^MAP is indeed a MAP assignment for M.\nDefinition 4.2. Application of Theorem 4.1 to transform the MAP problem over an MLN theory M\ninto the MAP problem over a reduced theory M^r_\u00afX is referred to as the Single Occurrence Rule for lifted MAP.\nDecomposer [6] is a very powerful construct for lifted inference. The next theorem states that\na decomposer is a single occurrence equivalence class (and therefore, the single occurrence rule\nincludes the decomposer rule as a special case).\nTheorem 4.2. * Let M be an MLN theory and let \u00afX be an equivalence class of variables. If \u00afX is a\ndecomposer for M, then \u00afX is single occurrence in M.\n\n4.2 Domain Independent Lifted MAP\nA simple procedure for lifted MAP inference which utilizes the property of MLN reduction for\nsingle occurrence equivalence classes is given in Algorithm 1. Here, the MLN theory is successively\nreduced with respect to each of the single occurrence equivalence classes.\n\nAlgorithm 1 Reducing all the single occurrence equivalence classes in an MLN\nreduce(MLN M)\n  M^r \u2190 M\n  for all Equivalence-Class \u00afX do\n    if (isSingleOccurrence(\u00afX)) then\n      M^r \u2190 reduceEQ(M^r, \u00afX)\n    end if\n  end for\n  return M^r\n\nreduceEQ(MLN M, class \u00afX)\n  M^r_\u00afX \u2190 {}; size \u2190 |\u2206_\u00afX|; \u2206_\u00afX \u2190 {a^\u00afX_1}\n  for all Formulas fi \u2208 F_\u00afX do\n    Add (fi, wi) to M^r_\u00afX\n  end for\n  for all Formulas fi \u2208 F_\u2212\u00afX do\n    Add (fi, wi/size) to M^r_\u00afX\n  end for\n  return M^r_\u00afX\n\nTheorem 4.3. * MAP inference in a single occurrence MLN is domain independent.\nIf an MLN theory contains a combination of both single occurrence and non-single occurrence\nequivalence classes, we can first reduce all the single occurrence classes to unary domains using\nAlgorithm 1. Any existing (lifted or propositional) solver can be applied on this reduced theory to\nobtain the MAP solution. Revisiting the single rule example from Section 3.1: Parent(X, Y) \u2227\nFriend(Y, Z) \u21d2 Knows(X, Z), we have 3 equivalence classes {X}, {Y}, and {Z}, all of which\nare single occurrence. 
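Algorithm 1 can be sketched in Python under a toy clause representation (each formula a list of (predicate, argument-tuple) literals with variables standardized apart); the encoding and all function names are our own assumptions, not the paper's implementation.

```python
# Equivalence classes via union-find, the single occurrence test, and the
# weight rescaling of Algorithm 1, all on an assumed toy representation.
def equivalence_classes(formulas):
    # variables at the same argument position of the same predicate bind
    parent = {}
    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    def union(u, v):
        parent[find(u)] = find(v)
    position_var = {}  # (predicate, argument index) -> first variable seen
    for lits in formulas:
        for pred, args in lits:
            for i, a in enumerate(args):
                find(a)  # register the variable
                if (pred, i) in position_var:
                    union(a, position_var[(pred, i)])
                else:
                    position_var[(pred, i)] = a
    classes = {}
    for v in parent:
        classes.setdefault(find(v), set()).add(v)
    return list(classes.values())

def is_single_occurrence(cls, formulas):
    # no formula may mention two *distinct* variables of the class
    return all(len({a for _, args in lits for a in args if a in cls}) <= 1
               for lits in formulas)

def reduce_theory(formulas, weights, domain_size):
    # reduceEQ for every single occurrence class: its domain shrinks to a
    # single constant, and formulas that do not mention the class get
    # weight w_i / |domain| (the F_-X branch of Algorithm 1).
    weights = list(weights)
    reduced = {}
    for cls in equivalence_classes(formulas):
        key = frozenset(cls)
        if is_single_occurrence(cls, formulas):
            reduced[key] = 1
            for j, lits in enumerate(formulas):
                if not any(a in cls for _, args in lits for a in args):
                    weights[j] /= domain_size[key]
        else:
            reduced[key] = domain_size[key]
    return reduced, weights

theory = [[("P", ("X",)), ("Q", ("X", "Y"))],   # P(X) v Q(X,Y)
          [("P", ("Z",)), ("Q", ("U", "V"))],   # P(Z) v Q(U,V)
          [("R", ("W",))]]                      # R(W)
sizes = {frozenset({"X", "Z", "U"}): 10, frozenset({"Y", "V"}): 10,
         frozenset({"W"}): 5}
reduced, new_w = reduce_theory(theory, [1.0, 2.0, 3.0], sizes)
```

On this theory {Y, V} and {W} are single occurrence while {X, Z, U} is not (Z and U share a formula), so only the first class keeps its original domain, and the weights of formulas outside each reduced class are divided by that class's domain size.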
Hence, MAP inference for this MLN theory is domain independent.\n5 Exploiting Extremes\nEven when a theory does not contain single occurrence variables, we can reduce it effectively if a)\nthere is a subset of formulas all of whose groundings are satisfied at extremes, i.e., at assignments\nwith identical truth value for all the groundings of a predicate, and b) the remaining theory with these\nformulas removed is single occurrence. This is the key idea behind our second rule for lifted MAP.\nWe will first formalize the notion of an extreme assignment followed by the description of our second\nlifting rule.\n\n5\n\n\f5.1 Extreme Assignments\n\nDefinition 5.1. Let M be an MLN theory. Given an assignment x to the ground atoms in M, we say\nthat predicate P is at extreme in x if all the groundings of P take the same value (either true or\nfalse) in x. We say that x is at extreme if all the predicates in M are at extreme in x.\nTheorem 5.1. * Given an MLN theory M, let PS be the set of predicates which are single occurrence\nin M. Then there is a MAP assignment x^MAP such that \u2200P \u2208 PS, P is at extreme in x^MAP.\nCorollary 5.1. A single occurrence MLN admits a MAP solution which is at extreme.\n\nSarkhel et al. [19] show that non-shared MLNs (with no self-joins) have a MAP solution at the\nextreme. This turns out to be a special case of single occurrence MLNs.\nTheorem 5.2. * If an MLN theory M is non-shared and has no self-joins, then M is single occurrence.\n\n5.2 Second Rule for Lifting MAP\nConsider the MLN theory with a single formula as in Section 3.1: w1 Parent(X, Y) \u2227\nFriend(Y, Z) \u21d2 Knows(X, Z). This is a single occurrence MLN and hence, by Corollary 5.1, a\nMAP solution lies at extreme. Consider adding the transitivity constraint to the theory: w2\nFriend(X, Y) \u2227 Friend(Y, Z) \u21d2 Friend(X, Z). All the groundings of the second formula\nare satisfied at any extreme assignment of the Friend predicate groundings. 
Since the MAP\nsolution to the original theory with the single formula is at extreme, it satisfies all the groundings of the\nsecond formula. Hence, it is a MAP for the new theory as well. We introduce the notion of tautology\nat extremes:\nDefinition 5.2. An MLN formula f is said to be a tautology at extremes if all of its groundings are\nsatisfied at any of the extreme assignments of its predicates.\n\nIf an MLN theory becomes single occurrence after removing all the tautologies at extremes in it, then\nMAP inference in such a theory is domain independent.\nTheorem 5.3. * Let M be an MLN theory with the set of formulas denoted by F. Let Fte denote a set\nof formulas in M which are tautologies at extremes. Let M' be a new theory with formulas F \u2212 Fte\nand formula weights as in M. Let the variable domains in M' be the same as in M. If M' is single\noccurrence, then the MAP inference for the original theory M can be reduced to the MAP inference\nproblem over the new theory M'.\nCorollary 5.2. Let M be an MLN theory. Let M' be a single occurrence theory (with variable\ndomains identical to M) obtained after removing a subset of formulas in M which are tautologies at\nextremes. Then, MAP inference in M is domain independent.\nDefinition 5.3. Application of Theorem 5.3 to transform the MAP problem over an MLN theory\nM into the MAP problem over the remaining theory M' after removing (a subset of) tautologies at\nextremes is referred to as the Tautology at Extremes Rule for lifted MAP.\n\nClearly, Corollary 5.2 applies to the two-rule MLN theory considered above (and in Section 3.1)\nand hence, MAP inference for the theory is domain independent. A necessary and sufficient condition\nfor a clausal formula to be a tautology at extremes is to have both positive and negative occurrences\nof the same predicate symbol. 
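The tautology-at-extremes test can be sketched directly: at an extreme assignment all groundings of a predicate share one truth value, so every predicate collapses to a single propositional variable and the clause is checked under all 2^k truth assignments. The clause encoding below, a list of (predicate name, positive?) literals, is a toy assumption of ours.

```python
import itertools

# Check whether a clause is satisfied under every extreme assignment.
def is_tautology_at_extremes(clause):
    preds = sorted({p for p, _ in clause})
    for vals in itertools.product([False, True], repeat=len(preds)):
        assign = dict(zip(preds, vals))
        if not any(assign[p] == positive for p, positive in clause):
            return False  # some extreme assignment falsifies the clause
    return True

# transitivity as a clause: ~Friend(X,Y) v ~Friend(Y,Z) v Friend(X,Z)
transitivity = [("Friend", False), ("Friend", False), ("Friend", True)]
# Parent(X,Y) ^ Friend(Y,Z) => Knows(X,Z) as a clause
rule = [("Parent", False), ("Friend", False), ("Knows", True)]
print(is_tautology_at_extremes(transitivity), is_tautology_at_extremes(rule))
```

The results match the condition just stated: transitivity mentions Friend with both polarities and passes, while the Parent/Friend/Knows rule does not.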
Many difficult-to-lift but important formulas such as symmetry and\ntransitivity are tautologies at extremes and hence, can be handled by our approach.\n\n5.3 A Procedure for Identifying Tautologies\n\nIn general, we only need the equivalence classes of variables appearing in Fte to be single occurrence\nin the remaining theory for Theorem 5.3 to hold.^2 Algorithm 2 describes a procedure to identify the\nlargest set of tautologies at extremes such that all the variables in them are single occurrence with\nrespect to the remainder of the theory. The algorithm first identifies all the tautologies at extremes.\nIt then successively removes from the set those formulas having a variable that is not single occurrence in\nthe remainder of the theory. The process stops when all the tautologies have only single occurrence\nvariables appearing in them. We can then apply the procedure in Section 4 to find the MAP solution\nfor the remainder of the theory. This is also the MAP for the whole theory by Theorem 5.3.\n\n^2 Theorem 5.3g in the supplement gives a more general version of Theorem 5.3.\n\n6\n\n\fAlgorithm 2 Finding Tautologies at Extremes with Single Occurrence Variables\ngetSingleOccurTautology(MLN M)\n  Fte \u2190 getAllTautologyAtExtremes(M)\n  F' \u2190 F \u2212 Fte; fixpoint \u2190 False\n  while (fixpoint == False) do\n    EQVars \u2190 getSingleOccurVars(F')\n    fixpoint \u2190 True\n    for all formulas f \u2208 Fte do\n      if (!(Vars(f) \u2286 EQVars)) then\n        F' \u2190 F' \u222a {f}; fixpoint \u2190 False\n      end if\n    end for\n  end while\n  return F \u2212 F'\n\ngetAllTautologyAtExtremes(MLN M)\n  // Iterate over all the formulas in M and return the\n  // subset of formulas which are tautologies at extremes\n  // Pseudocode omitted due to lack of space\n\nisTautologyAtExtreme(Formula f)\n  f' \u2190 Clone(f)\n  PU \u2190 set of unique predicates in f'\n  for all P \u2208 PU do\n    ReplaceByNewPropositionalPred(P, f')\n  end for\n  // f' is a propositional formula at this point\n  return isTautology(f')\n\n6 Experiments\nWe compared the performance of our algorithm against Sarkhel et al. [19]\u2019s non-shared MLN\napproach and the purely grounded version on three benchmark MLNs. For both the lifted approaches,\nwe used them as pre-processing algorithms to reduce the MLN domains. We applied the ILP-based\nsolver Gurobi [8] as the base solver on the reduced theory to find the MAP assignment. In principle,\nany MAP solver could be used as the base solver.^3 For the ground version, we directly applied\nGurobi on the grounded theory. We will refer to the grounded version as GRB. We will refer to our\nand Sarkhel et al. [19]\u2019s approaches as SOLGRB (Single Occurrence Lifted GRB) and NSLGRB\n(Non-shared Lifted GRB), respectively.\n\n6.1 Datasets and Methodology\nWe used the following benchmark MLNs for our experiments (results on the Student network [19]\nare presented in the supplement):\n1) Information Extraction (IE): This theory is available from the Alchemy [13] website. We pre-processed\nthe theory using the pure literal elimination rule described by Sarkhel et al. [19]. The resulting\nMLN had 7 formulas, 5 predicates and 4 variable equivalence classes.\n2) Friends & Smokers (FS): This is a standard MLN used earlier in the literature [20]. The MLN\nhas 2 formulas, 3 predicates and 1 variable equivalence class. We also introduced singletons for each\npredicate.\nFor each algorithm, we report:\n1) Time: Time to reach the optimal as the domain size is varied from 25 to 1000.^4,5\n2) Cost: Cost of the unsatisfied clauses as the running time is varied for a fixed domain size (500).\n3) Theory Size: Ground theory size as the domain size is varied.\nAll the experiments were run on an Intel four-core i3 processor with 4 GB of RAM.\n6.2 Results\nFigures 1a-1c plot the results for the IE domain. Figure 1a shows the time taken to reach the\noptimal. 
^6 This theory has a mix of single occurrence and non-single occurrence variables. Hence,\nevery algorithm needs to ground some or all of the variables. SOLGRB only grounds the variables\nwhose domain size was kept constant. Hence, varying the domain size has no effect on SOLGRB and\nit reaches the optimal instantaneously for all the domain sizes. NSLGRB partially grounds this theory\nand its time to optimal gradually increases with increasing domain size. GRB performs significantly\nworse due to grounding of the whole theory.\nFigure 1b (log scale) plots the total cost of unsatisfied formulas with varying time at a domain size\nof 500. SOLGRB reaches the optimal right at the beginning because of a very small ground theory.\nNSLGRB takes about 15 seconds. GRB runs out of memory. Figure 1c (log scale) shows the size\nof the ground theory with varying domain size. As expected, SOLGRB stays constant whereas the\n\n^3 Using MaxWalkSAT [10] as the base solver resulted in sub-optimal results.\n^4 For IE, two of the variable domains were varied and the other two were kept constant at 10 as done in [19].\n^5 Reported results are averaged over 5 runs.\n^6 NSLGRB and GRB ran out of memory at domain sizes 800 and 100, respectively.\n\n7\n\n\fground theory size increases polynomially for both NSLGRB and GRB with differing degrees (due\nto the different number of variables grounded).\nFigure 2 shows the results for FS. This theory is not single occurrence but the tautology at extremes\nrule applies and our theory does not need to ground any variable. NSLGRB is identical to the\ngrounded version in this case. Results are qualitatively similar to the IE domain. 
The time taken to reach the optimal is much higher in FS for NSLGRB and GRB at larger domain sizes.
These results clearly demonstrate the scalability as well as the superior performance of our approach compared to the existing algorithms.

Figure 1: IE. (a) Time taken vs. domain size; (b) cost at domain size 500; (c) number of groundings vs. domain size.

Figure 2: Friends & Smokers. (a) Time taken vs. domain size; (b) cost at domain size 500; (c) number of groundings vs. domain size.

7 Conclusion and Future Work
We have presented two new rules for lifting MAP inference, which are applicable to a wide variety of MLN theories and result in highly scalable solutions. The MAP inference problem becomes domain independent when every equivalence class is single occurrence. In the current framework, our rules are used as a pre-processing step that generates a reduced theory, over which any existing MAP solver can be applied. This leaves open the question of effectively combining our rules with existing lifting rules in the literature.
Consider the theory with two rules: S(X) ∨ R(X) and S(Y) ∨ R(Z) ∨ T(U). Here, the equivalence class {X, Y, Z} is not single occurrence, and our algorithm will only be able to reduce the domain of the equivalence class {U}. But if we apply the binomial rule [9] on S, we get a new theory in which {X, Z} becomes a single occurrence equivalence class, and we can resort to domain independent inference.⁷ Therefore, applying the binomial rule before the single occurrence rule would lead to larger savings. In general, there could be arbitrary orderings for applying lifted inference rules, leading to different inference complexities. Exploring the properties of these orderings and coming up with an optimal one (or heuristics for the same) is a direction for future work.

8 Acknowledgements
Happy Mittal was supported by the TCS Research Scholar Program.
Vibhav Gogate was partially supported by the DARPA Probabilistic Programming for Advanced Machine Learning Program under AFRL prime contract number FA8750-14-C-0005. We are grateful to Somdeb Sarkhel and Deepak Venugopal for sharing their code and for helpful discussions.

⁷ A decomposer does not apply even after conditioning on S.

References
[1] H. Bui, T. Huynh, and S. Riedel. Automorphism groups of graphical models and lifted variational inference. In Proc. of UAI-13, pages 132–141, 2013.
[2] R. de Salvo Braz, E. Amir, and D. Roth. Lifted first-order probabilistic inference. In Proc. of IJCAI-05, pages 1319–1325, 2005.
[3] R. de Salvo Braz, E. Amir, and D. Roth. MPE and partial inversion in lifted probabilistic variable elimination. In Proc. of AAAI-06, pages 1123–1130, 2006.
[4] P. Domingos and D. Lowd. Markov Logic: An Interface Layer for Artificial Intelligence. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2009.
[5] G. Van den Broeck. On the completeness of first-order knowledge compilation for lifted probabilistic inference. In Proc. of NIPS-11, pages 1386–1394, 2011.
[6] V. Gogate and P. Domingos. Probabilistic theorem proving. In Proc. of UAI-11, pages 256–265, 2011.
[7] V. Gogate, A. Jha, and D. Venugopal. Advances in lifted importance sampling. In Proc. of AAAI-12, pages 1910–1916, 2012.
[8] Gurobi Optimization Inc. Gurobi Optimizer Reference Manual, 2013. http://gurobi.com.
[9] A. Jha, V. Gogate, A. Meliou, and D. Suciu. Lifted inference seen from the other side: The tractable features. In Proc. of NIPS-10, pages 973–981, 2010.
[10] H. Kautz, B. Selman, and M. Shah. ReferralWeb: Combining social networks and collaborative filtering. Communications of the ACM, 40(3):63–66, 1997.
[11] K. Kersting. Lifted probabilistic inference. In Proc. of ECAI-12, pages 33–38, 2012.
[12] K. Kersting, B. Ahmadi, and S. Natarajan. Counting belief propagation. In Proc. of UAI-09, pages 277–284, 2009.
[13] S. Kok, M. Sumner, M. Richardson, P. Singla, H. Poon, D. Lowd, J. Wang, and P. Domingos. The Alchemy system for statistical relational AI. Technical report, University of Washington, 2008. http://alchemy.cs.washington.edu.
[14] M. Mladenov, K. Kersting, and A. Globerson. Efficient lifting of MAP LP relaxations using k-locality. In Proc. of AISTATS-14, pages 623–632, 2014.
[15] M. Niepert and G. Van den Broeck. Tractability through exchangeability: A new perspective on efficient probabilistic inference. In Proc. of AAAI-14, pages 2467–2475, 2014.
[16] J. Noessner, M. Niepert, and H. Stuckenschmidt. RockIt: Exploiting parallelism and symmetry for MAP inference in statistical relational models. In Proc. of AAAI-13, pages 739–745, 2013.
[17] D. Poole. First-order probabilistic inference. In Proc. of IJCAI-03, pages 985–991, 2003.
[18] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach (3rd edition). Pearson Education, 2010.
[19] S. Sarkhel, D. Venugopal, P. Singla, and V. Gogate. Lifted MAP inference for Markov logic networks. In Proc. of AISTATS-14, pages 895–903, 2014.
[20] P. Singla and P. Domingos. Lifted first-order belief propagation. In Proc. of AAAI-08, pages 1094–1099, 2008.
[21] D. Venugopal and V. Gogate. On lifting the Gibbs sampling algorithm. In Proc. of NIPS-12, pages 1664–1672, 2012.