{"title": "CPR for CSPs: A Probabilistic Relaxation of Constraint Propagation", "book": "Advances in Neural Information Processing Systems", "page_first": 1113, "page_last": 1120, "abstract": "This paper proposes constraint propagation relaxation (CPR), a probabilistic approach to classical constraint propagation that provides another view on the whole parametric family of survey propagation algorithms SP(ρ), ranging from belief propagation (ρ = 0) to (pure) survey propagation(ρ = 1). More importantly, the approach elucidates the implicit, but fundamental assumptions underlying SP(ρ), thus shedding some light on its effectiveness and leading to applications beyond k-SAT.", "full_text": "CPR for CSPs: A Probabilistic Relaxation of\n\nConstraint Propagation\n\nLuis E. Ortiz\n\nECE Dept, Univ. of Puerto Rico, Mayag\u00a8uez, PR 00681-9042\n\nleortiz@ece.uprm.edu\n\nAbstract\n\nThis paper proposes constraint propagation relaxation (CPR), a probabilistic ap-\nproach to classical constraint propagation that provides another view on the whole\nparametric family of survey propagation algorithms SP(\u03c1). More importantly, the\napproach elucidates the implicit, but fundamental assumptions underlying SP(\u03c1),\nthus shedding some light on its effectiveness and leading to applications beyond\nk-SAT.\n\n1 Introduction\n\nSurvey propagation (SP) is an algorithm for solving k-SAT recently developed in the physics com-\nmunity [1, 2] that exhibits excellent empirical performance on \u201chard\u201d instances. To understand the\nbehavior of SP and its effectiveness, recent work (see Maneva et al. [3] and the references therein)\nhas concentrated on establishing connections to belief propagation (BP) [4], a well-known approxi-\nmation method for computing posterior probabilities in probabilistic graphical models. 
Instead, this paper argues that it is perhaps more natural to establish connections to constraint propagation (CP), another message-passing algorithm tailored to constraint satisfaction problems (CSPs) that is well known in the AI community. The ideas behind CP were first proposed by Waltz [5].1 Yet, CP has received considerably less attention than BP lately.\n\nThis paper reconnects BP to CP in the context of CSPs by proposing a probabilistic relaxation of CP that generalizes it. Through the approach, it is easy to see the exact, implicit underlying assumptions behind the entire family of survey propagation algorithms SP(\u03c1). (Here, the approach is presented in the context of k-SAT; it will be described in full generality in a separate document.) In short, the main point of this paper is that survey propagation algorithms are instances of a natural generalization of constraint propagation and have simple interpretations in that context.\n\n2 Constraint Networks and Propagation\n\nThis section presents a brief introduction to the graphical representation of CSPs and CP, and concentrates on the aspects that are relevant to this paper.2\n\nA constraint network (CN) is the graphical model for CSPs used in the AI community. Of interest here is the CN based on the hidden transformation. (See Bacchus et al. [9] for more information on the different transformations and their properties.) It has a bipartite graph where every variable and constraint is represented by a node or vertex in the graph and there is an edge between a variable i and a constraint a if and only if a is a function of i (see figure 1). 
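The bipartite graph just described is easy to build explicitly. The following minimal Python sketch assumes a DIMACS-style clause list (a positive integer for a positive literal, a negative one for a negated literal); the function name `build_cn` is an illustrative choice, not notation from the paper. The example is the formula of figure 1.

```python
def build_cn(clauses):
    """Bipartite constraint-network graph (hidden transformation):
    one node per variable, one per clause, and an edge (a, i) whenever
    clause a is a function of variable i.  The edge value records the
    literal sign, i.e. whether the edge would be drawn solid or dashed."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    edges = {(a, abs(lit)): lit > 0
             for a, clause in enumerate(clauses) for lit in clause}
    return variables, edges

# f(x) = (x1 v x2 v x3) ^ (x2 v -x3 v x4); clauses a and b get indices 0 and 1
variables, edges = build_cn([[1, 2, 3], [2, -3, 4]])
# variable 3 appears negated in clause b, so edge (1, 3) carries sign False
```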
From now on, a CN with a tree graph is referred to as a tree CN, and a CN with an arbitrary graph as an arbitrary CN.\n\n1See also Pearl [4], section 4.1.1, and the first paragraph of section 4.1.2.\n\n2Please refer to Russell and Norvig [6] for a general introduction, Kumar [7] for a tutorial and Dechter [8] for a more comprehensive treatment of these topics and additional references.\n\nConstraint propagation is typically used as part of a depth-first search algorithm for solving CSPs. The search algorithm works by extending partial assignments, usually one variable at a time, during the search. The algorithm is called backtracking search because one can backtrack and change the value of a previously assigned variable when the search reaches an illegal assignment.\n\nCP is often applied either as a preprocessing step or after an assignment to a variable is made. The objective is to reduce the domains of the variables by making them locally consistent with the current partial assignment. The propagation process starts with the belief that for every value assignment v_i in the domain of each variable i there exists a solution with v_i assigned to i. The process then attempts to correct this a priori belief by locally propagating constraint information. It is well known that CP, unlike BP, always converges, regardless of the structure of the CN graph. This is because no possible solution is ignored at the start and none is ever removed during the process. In the end, CP produces potentially reduced variable domains that are in fact locally consistent. In turn, the resulting search space is at worst no larger than the original but potentially smaller while still containing all possible solutions. 
The computational efficiency and effectiveness of CP in practice have made it a popular algorithm in the CSP community.\n\nFigure 1: The graph of the constraint network corresponding to the 3-SAT formula f(x) = (x1 \u2228 x2 \u2228 x3) \u2227 (x2 \u2228 \u00afx3 \u2228 x4), which has four variables and two clauses; the first and second clause are denoted in the figure by a and b, respectively. Following the convention of the SP community, clause and variable nodes are drawn as boxes and circles, respectively; also, if a variable appears as a negative literal in a clause (e.g., variable 3 in clause b), the edge between them is drawn as a dashed line.\n\n3 Terminology and Notation\n\nLet V(a) be the set of variables that appear in constraint a and C(i) the set of constraints in which variable i appears. Let also V_i(a) \u2261 V(a) \u2212 {i} and C_a(i) \u2261 C(i) \u2212 {a}. In k-SAT, the constraints are the clauses, each variable is binary, with domain {0, 1}, and a solution corresponds to a satisfying assignment. If i \u2208 V(a), denote by s_{a,i} the value assignment to variable i that guarantees the satisfiability of clause a, and denote the other possible assignment to i by u_{a,i}. Finally, let C^s_a(i) and C^u_a(i) be the sets of clauses in C_a(i) where variable i appears in the same and different literal form, respectively, as it does in clause a.\n\nThe k-SAT formula under consideration is denoted by f. It is convenient to introduce notation for formulae associated to the CN that results from removing variables or constraints from f. Let f_a be the function that results from removing clause a from f (see figure 2), and similarly, abusing notation, let f_i be the function that results from removing variable i from f. 
Let f_{a\u2192i} be the function that corresponds to the connected component of the CN graph for f_a that contains variable i \u2208 V(a), and let f_{i\u2192a} be the function that corresponds to the connected component of the CN graph for f_i that contains a \u2208 C(i). (Naturally, if node a is not a separator of the CN graph for f, f_a has a single connected component, which leads to f_{a\u2192i} = f_a; similarly for f_i.)\n\nFigure 2: The graph inside the continuous curve is the CN graph for the formula f_b that results from removing clause b from f. The graph inside the dashed curve is the CN graph for f_{b\u21922}, which corresponds to the formula for the connected component of the CN graph for f_b that contains variable 2.\n\nIt is convenient to use a simple, if perhaps unusual, representation of sets in order to track the domains of the variables during the propagation process. Each subset A of a set S of size m is represented as a bit array of m elements where component k in the array is set to 1 if k is in A and to 0 otherwise. For instance, if S = {0, 1}, then the array [00] represents \u2205, and similarly, [01], [10] and [11] represent {0}, {1} and {0, 1}, respectively.\n\nIt is also useful to introduce the concept of (globally) consistent domains of variables and SAT functions. Let S_f = {x | x satisfies f} be the set of assignments that satisfy f. Given a complete assignment x, denote by x_{\u2212i} the assignments to all the variables except i; thus, x = (x_1, . . . , x_n) = (x_i, x_{\u2212i}). Let the set W_i be the consistent domain of variable i in f if W_i = {x_i | x = (x_i, x_{\u2212i}) \u2208 S_f for some x_{\u2212i}}; that is, W_i contains the set of all possible values that variable i can take in an assignment that satisfies f. 
Let the set W be the consistent domain of f if W = \u00d7^n_{i=1} W_i and, for all i, W_i is the consistent domain of variable i in f.\n\nFinally, some additional terminology classifies the variables of a SAT function given a satisfying assignment. Given a function f and a satisfying assignment x, let variable i be fixed if changing only its assignment x_i in x does not produce another satisfying assignment for f, and be free otherwise.\n\n4 Propagation Algorithms for Satisfiability\n\nConstraint Propagation. In CP for k-SAT, the message M_{a\u2192i} that clause a sends to variable i is an array of binary values indexed by the elements of the domain of i; similarly for the message M_{i\u2192a} that variable i sends to clause a. Intuitively, for all x_i \u2208 {0, 1}, M_{i\u2192a}(x_i) = 1 if and only if assigning value x_i to variable i is \u201cok\u201d with all clauses other than a. Formally, M_{i\u2192a}(x_i) = 1 if and only if f_{a\u2192i} has a satisfying assignment with x_i assigned to variable i (or, in other words, x_i is in the consistent domain of i in f_{a\u2192i}). Similarly, M_{a\u2192i}(x_i) = 1 if and only if clause a is \u201cok\u201d with assigning value x_i to variable i; or, formally, M_{a\u2192i}(x_i) = 1 if and only if f_{i\u2192a} has a satisfying assignment with x_i assigned to variable i, or assigning x_i to variable i by itself satisfies a. It is convenient to denote M_{i\u2192a}(x_i) and M_{a\u2192i}(x_i) by M^{x_i}_{i\u2192a} and M^{x_i}_{a\u2192i}, respectively. In addition, M^{s_{a,i}}_{i\u2192a}, M^{u_{a,i}}_{i\u2192a}, M^{s_{a,i}}_{a\u2192i} and M^{u_{a,i}}_{a\u2192i} are simply denoted by M^s_{i\u2192a}, M^u_{i\u2192a}, M^s_{a\u2192i} and M^u_{a\u2192i}, respectively.\n\nIn summary, we can write CP for k-SAT as follows.\n\n\u2022 Messages that clause a sends to variable i: M^{x_i}_{a\u2192i} = 1 if and only if x_i = s_{a,i} or there exists j \u2208 V_i(a) s.t. 
M^s_{j\u2192a} = 1. (1)\n\n\u2022 Messages that variable i sends to clause a: M^{x_i}_{i\u2192a} = 1 if and only if for all b \u2208 C_a(i), M^{x_i}_{b\u2192i} = 1. (2)\n\nIt is convenient to express CP mathematically as follows.\n\n\u2022 Messages that clause a sends to variable i: M^{x_i}_{a\u2192i} = 1 if x_i = s_{a,i}, and M^{x_i}_{a\u2192i} = 1 \u2212 \u220f_{j\u2208V_i(a)} (1 \u2212 M^s_{j\u2192a}) if x_i = u_{a,i}.\n\n\u2022 Messages that variable i sends to clause a: M^{x_i}_{i\u2192a} = \u220f_{b\u2208C_a(i)} M^{x_i}_{b\u2192i}.\n\nIn order to guarantee convergence, the message values in CP are initialized as M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1, and naturally, M^s_{a\u2192i} = 1, M^u_{a\u2192i} = 1. This initialization encodes the a priori belief that every assignment is a solution. CP attempts to \u201ccorrect\u201d or update this belief through the local propagation of constraint information. In fact, the expressions in CP force the messages to be locally consistent. By being initially conservative about the consistent domains, no satisfying assignment is discarded during the propagation process.\n\nOnce CP converges, the locally-consistent domain of each variable i becomes {x_i | \u220f_{a\u2208C(i)} M^{x_i}_{a\u2192i} = 1} = {x_i | \u220f_{a\u2208C(i): x_i = u_{a,i}} M^u_{a\u2192i} = 1} \u2208 2^{{0,1}}. For general CSPs, CP is usually very effective because it can significantly reduce the original domain of the variables, leading to a smaller search space of possible assignments. 
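As a concrete rendering of these update equations, the sketch below implements the 0/1 message passing directly over a DIMACS-style clause list. Only M^u_{a\u2192i} needs storing, since M^s_{a\u2192i} is identically 1 and the variable-to-clause messages are products of clause-to-variable ones; the function and helper names are illustrative choices of this sketch, not from the paper.

```python
def cp_ksat(clauses, n):
    """Classical CP for k-SAT with 0/1 messages.  Stores only
    Mu[(a, i)] = M^u_{a->i}; variable-to-clause messages are recomputed
    from it.  Messages start at 1 (every assignment presumed a solution)
    and can only decrease, so the loop always terminates."""
    lits = [{abs(l): l > 0 for l in cl} for cl in clauses]  # var -> sign in clause
    occ = {i: [a for a, m in enumerate(lits) if i in m] for i in range(1, n + 1)}
    Mu = {(a, i): 1 for a, m in enumerate(lits) for i in m}

    def msg_s(j, a):
        # M^s_{j->a}: value s_{a,j} is ok with every other clause, i.e.
        # product of M^u_{b->j} over b in C^u_a(j) (opposite literal form).
        return min([Mu[(b, j)] for b in occ[j]
                    if b != a and lits[b][j] != lits[a][j]] or [1])

    changed = True
    while changed:
        changed = False
        for a, m in enumerate(lits):
            for i in m:
                # M^u_{a->i} = 1 iff some j in V_i(a) has M^s_{j->a} = 1
                new = 1 if any(msg_s(j, a) for j in m if j != i) else 0
                if new != Mu[(a, i)]:
                    Mu[(a, i)], changed = new, True
    # x_i stays in the locally consistent domain iff every clause a for
    # which x_i = u_{a,i} reports M^u_{a->i} = 1
    return {i: {v for v in (0, 1)
                if all(Mu[(a, i)] for a in occ[i] if lits[a][i] != (v == 1))}
            for i in range(1, n + 1)}
```

On a unit clause, the empty product makes M^u_{a\u2192i} = 0, so `cp_ksat([[1], [-1, 2]], 2)` prunes both domains to {1}; a formula with no unit clauses, such as that of figure 1, is left untouched, matching the observation that CP only prunes k-SAT domains once boundary conditions appear.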
It should be noted that in the particular case of k-SAT with arbitrary CNs, CP is usually only effective after some variables have already been assigned during the search, because those (partial) assignments can lead to \u201cboundary conditions.\u201d Without such boundary conditions, CP never reduces the domain of the variables in k-SAT, as can easily be seen from the expressions above.\n\nOn the other hand, when CP is applied to tree CNs, it exhibits additional special properties. For example, convergence is actually guaranteed regardless of how the messages are initialized, because of the boundary conditions imposed by the leaves of the tree. Also, the final messages are in fact globally consistent (i.e., all the messages are consistent with their definition). Therefore, the locally-consistent domains are in fact the consistent domains. Whether or not the formula is satisfiable can be determined immediately after applying CP. If the formula is not satisfiable, the consistent domains will be empty sets. If the formula is in fact satisfiable, applying depth-first search always finds a satisfying assignment without the need to backtrack.\n\nWe can express CP in a way that looks closer to SP and BP. Using the reparametrization \u0393_{a\u2192i} = 1 \u2212 M^u_{a\u2192i}, we get the following expression of CP.\n\n\u2022 Message that clause a sends to variable i: \u0393_{a\u2192i} = \u220f_{j\u2208V_i(a)} (1 \u2212 M^s_{j\u2192a}).\n\n\u2022 Message that variable i sends to clause a: M^s_{i\u2192a} = \u220f_{b\u2208C^u_a(i)} (1 \u2212 \u0393_{b\u2192i}).\n\nSurvey Propagation. Survey propagation has become a very popular propagation algorithm for k-SAT. It was developed in the physics community by M\u00e9zard et al. [2]. 
The excitement around SP comes from its excellent empirical performance on hard satisfiability problems; that is, k-SAT formulae with a ratio \u03b1 of the number of clauses to the number of variables near the so-called satisfiability threshold \u03b1_c.\n\nThe following is a description of an SP-inspired family of message-passing procedures, parametrized by \u03c1 \u2208 [0, 1]. It is often denoted by SP(\u03c1), and contains BP (\u03c1 = 0) and (pure) SP (\u03c1 = 1).\n\n\u2022 Message that clause a sends to variable i:\n\n\u03b7_{a\u2192i} = \u220f_{j\u2208V_i(a)} \u03a0^u_{j\u2192a} / (\u03a0^u_{j\u2192a} + \u03a0^s_{j\u2192a} + \u03a0^*_{j\u2192a})\n\n\u2022 Messages that variable i sends to clause a:\n\n\u03a0^u_{i\u2192a} = (1 \u2212 \u03c1 \u220f_{b\u2208C^u_a(i)} (1 \u2212 \u03b7_{b\u2192i})) \u220f_{b\u2208C^s_a(i)} (1 \u2212 \u03b7_{b\u2192i})\n\n\u03a0^s_{i\u2192a} = (1 \u2212 \u220f_{b\u2208C^s_a(i)} (1 \u2212 \u03b7_{b\u2192i})) \u220f_{b\u2208C^u_a(i)} (1 \u2212 \u03b7_{b\u2192i})\n\n\u03a0^*_{i\u2192a} = \u220f_{b\u2208C^s_a(i)} (1 \u2212 \u03b7_{b\u2192i}) \u220f_{b\u2208C^u_a(i)} (1 \u2212 \u03b7_{b\u2192i}) = \u220f_{b\u2208C_a(i)} (1 \u2212 \u03b7_{b\u2192i})\n\nSP was originally derived via arguments and concepts from physics. A simple derivation based on a probabilistic interpretation of CP is given in the next section of the paper. The derivation presented here elucidates the assumptions that SP algorithms make about the satisfiability properties and structure of k-SAT formulae. 
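Reading the SP(\u03c1) equations as stated, a direct fixed-point iteration might look as follows; the function name, seeded random initialization, and sweep-based stopping rule are choices of this sketch rather than part of SP(\u03c1) itself.

```python
import random

def sp_rho(clauses, n, rho=1.0, sweeps=200, tol=1e-9):
    """Fixed-point iteration for the SP(rho) messages eta_{a->i};
    rho=0 gives BP, rho=1 pure SP.  Clauses use DIMACS-style literals."""
    lits = [{abs(l): l > 0 for l in cl} for cl in clauses]
    occ = {i: [a for a, m in enumerate(lits) if i in m] for i in range(1, n + 1)}
    rng = random.Random(0)
    eta = {(a, i): rng.random() for a, m in enumerate(lits) for i in m}
    for _ in range(sweeps):
        delta = 0.0
        for a, m in enumerate(lits):
            for i in m:
                new = 1.0
                for j in m:
                    if j == i:
                        continue
                    ps = pu = 1.0   # p^s_{j->a} and p^u_{j->a}
                    for b in occ[j]:
                        if b == a:
                            continue
                        if lits[b][j] != m[j]:      # b in C^u_a(j)
                            ps *= 1.0 - eta[(b, j)]
                        else:                       # b in C^s_a(j)
                            pu *= 1.0 - eta[(b, j)]
                    Pu = (1.0 - rho * ps) * pu      # Pi^u_{j->a}
                    Ps = (1.0 - pu) * ps            # Pi^s_{j->a}
                    Pstar = ps * pu                 # Pi^*_{j->a}
                    Z = Pu + Ps + Pstar
                    new *= Pu / Z if Z > 0 else 0.0
                delta = max(delta, abs(new - eta[(a, i)]))
                eta[(a, i)] = new
        if delta < tol:
            break
    return eta
```

On a single 3-clause, every other-clause product is empty, so \u03c1 = 1 drives every \u03b7 to 0 (a lone clause emits no warnings), while \u03c1 = 0 yields \u03b7 = (1/2)\u00b2 = 1/4.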
However, it is easy to establish strong equivalence relations between the different propagation algorithms even at the basic level, before introducing the probabilistic interpretation (details omitted).\n\n5 A Probabilistic Relaxation of Constraint Propagation for Satisfiability\n\nThe main idea behind constraint propagation relaxation (CPR) is to introduce a probabilistic model for the k-SAT formula and view the messages as random variables in that model. If the formula f has n variables, the sample space \u2126 = (2^{{0,1}})^n is the set of n-tuples whose components are subsets of the set of possible values that each variable i can take (i.e., subsets of {0, 1}). The \u201ctrue probability law\u201d P_f of a SAT formula f that corresponds to CP is defined in terms of the consistent domain of f: for all W \u2208 \u2126, P_f(W) = 1 if W is the consistent domain of f, and P_f(W) = 0 otherwise.\n\nClearly, if we could compute the consistent domains of the remaining variables after each variable assignment during the search, there would be no need to backtrack. But, while it is easy to compute consistent domains for tree CNs, it is actually hard in general for arbitrary CNs. Thus, it is generally hard to compute P_f. (CNs with graphs of bounded tree-width are a notable exception.)\n\nHowever, the probabilistic interpretation will allow us to introduce \u201cbias\u201d on \u2126, which leads to a heuristic for dynamically ordering both the variables and their values during search. 
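For intuition, the consistent domains, and hence the single atom where P_f puts its mass, can be computed by brute-force enumeration; this is exactly the computation that is intractable for large arbitrary CNs, and the sketch below (with illustrative names) is for small instances only.

```python
from itertools import product

def consistent_domains(clauses, n):
    """W_i = set of values x_i takes in some satisfying assignment of f.
    Exponential enumeration over all 2^n assignments; the point of CPR is
    precisely to avoid this computation on arbitrary CNs."""
    def sat(x):
        return all(any((x[abs(l)] == 1) == (l > 0) for l in cl) for cl in clauses)
    sols = [x for x in ({i + 1: v for i, v in enumerate(bits)}
                        for bits in product((0, 1), repeat=n)) if sat(x)]
    return {i: {x[i] for x in sols} for i in range(1, n + 1)}
```

An unsatisfiable formula yields empty consistent domains, in agreement with the behavior of CP on tree CNs described in the previous section.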
As shown in this section, it turns out that for arbitrary CNs, survey propagation algorithms attempt to compute different \u201capproximations\u201d or \u201crelaxations\u201d of P_f by making different assumptions about its \u201cprobabilistic structure.\u201d\n\nLet us now view each message M^s_{a\u2192i}, M^u_{a\u2192i}, M^s_{i\u2192a}, and M^u_{i\u2192a} for each variable i and clause a as a (Bernoulli) random variable in some probabilistic model with sample space \u2126 and a, now arbitrary, probability law P.3 Formally, for each clause a, variable i and possible assignment value x_i \u2208 {0, 1}, we define\n\nM^{x_i}_{a\u2192i} \u223c Bernoulli(p^{x_i}_{a\u2192i}) and M^{x_i}_{i\u2192a} \u223c Bernoulli(p^{x_i}_{i\u2192a}),\n\nwhere p^{x_i}_{a\u2192i} = P(M^{x_i}_{a\u2192i} = 1) and p^{x_i}_{i\u2192a} = P(M^{x_i}_{i\u2192a} = 1). This is a distribution over all possible subsets (i.e., the power set) of the domain of each variable, not just over the variable\u2019s domain itself. Also, clearly we do not need to worry about p^s_{a\u2192i} because it is always 1, by the definition of M^s_{a\u2192i}.\n\nThe following is a description of how we can use those probabilities during search. In the SP community, the resulting heuristic search is called \u201cdecimation\u201d [1, 2]. If we believe that P \u201cclosely approximates\u201d P_f, and know the probability p^{x_i}_i \u2261 P(M^{x_i}_{a\u2192i} = 1 for all a \u2208 C(i)) that x_i is in the consistent domain for variable i of f, for every variable i, clause a and possible assignment x_i, we can use them to dynamically order both the variables and the values they can take during search. Specifically, we first compute p^1_i = P(M^u_{a\u2192i} = 1 for all a \u2208 C\u2212(i)) and p^0_i = P(M^u_{a\u2192i} = 1 for all a \u2208 C+(i)) for each variable i, where C+(i) and C\u2212(i) are the sets of clauses where variable i appears as a positive and a negative literal, respectively. 
Using those probability values, we then compute what the SP community calls the \u201cbias\u201d of i: |p^1_i \u2212 p^0_i|. The variable to assign next is the one with the largest bias.4 We would set that variable to the value of largest probability; for instance, if variable i has the largest bias, then we set i next, to 1 if p^1_i > p^0_i, and to 0 if p^1_i < p^0_i. The objective is then to compute or estimate those probabilities.\n\nThe following are (independence) assumptions about the random variables (i.e., messages) used in this section. The assumptions hold for tree CNs and, as formally shown below, are inherent to the survey propagation process.\n\nAssumption 1. For each clause a and variable i, the random variables M^s_{j\u2192a} for all j \u2208 V_i(a) are independent.\n\nAssumption 2. For each clause a and variable i, the random variables M^u_{b\u2192i} for all clauses b \u2208 C^u_a(i) are independent.\n\nAssumption 3. For each clause a and variable i, the random variables M^u_{b\u2192i} for all clauses b \u2208 C^s_a(i) are independent.\n\nWithout any further assumptions, we can derive the following, by applying assumption 1 and the expression for M^u_{a\u2192i} that results from (1):\n\np^u_{a\u2192i} = P(M^u_{a\u2192i} = 1) = 1 \u2212 \u220f_{j\u2208V_i(a)} P(M^s_{j\u2192a} = 0) = 1 \u2212 \u220f_{j\u2208V_i(a)} (1 \u2212 p^s_{j\u2192a}).\n\nSimilarly, by assumption 2 and the expression for M^s_{i\u2192a} that results from (2), we derive\n\np^s_{i\u2192a} = P(M^s_{i\u2192a} = 1) = \u220f_{b\u2208C^u_a(i)} P(M^u_{b\u2192i} = 1) = \u220f_{b\u2208C^u_a(i)} p^u_{b\u2192i}.\n\nUsing the reparametrization \u03b7_{a\u2192i} = P(M^u_{a\u2192i} = 0) = 1 \u2212 p^u_{a\u2192i}, we obtain the following message-passing procedure.\n\n3Given clause a and variable i of SAT formula f, let D^j_{a\u2192i} be the (globally) consistent domain of f_{a\u2192i} for variable j. 
The random variables corresponding to the messages from variable i to clause a are defined as M^{x_i}_{i\u2192a}(W) = 1 iff W_j \u2282 D^j_{a\u2192i} for every variable j of f_{a\u2192i}, and x_i \u2208 D^i_{a\u2192i}. The other random variables are then defined as M^s_{a\u2192i}(W) = 1 and M^u_{a\u2192i}(W) = 1 \u2212 \u220f_{j\u2208V_i(a)} (1 \u2212 M^s_{j\u2192a}(W)) for all W.\n\n4For both variable and value ordering, we can break ties uniformly at random. Also, the commonly used description of SP(\u03c1) sets a fraction \u03b2 of the variables that remain unset during search. While this clearly speeds up the process of getting a full assignment, the effect that heuristic might have on the completeness of the search procedure is unclear, even in practice.\n\n\u2022 Message that clause a sends to variable i: \u03b7_{a\u2192i} = \u220f_{j\u2208V_i(a)} (1 \u2212 p^s_{j\u2192a}).\n\n\u2022 Message that variable i sends to clause a: p^s_{i\u2192a} = \u220f_{b\u2208C^u_a(i)} (1 \u2212 \u03b7_{b\u2192i}).\n\nWe can then use assumption 3 to estimate p^u_{i\u2192a} as \u220f_{b\u2208C^s_a(i)} (1 \u2212 \u03b7_{b\u2192i}).\n\nNote that this message-passing procedure is exactly \u201cclassical\u201d CP if we initialize \u03b7_{a\u2192i} = 0 and p^s_{i\u2192a} = 1 for all variables i and clauses a. However, the version here allows the messages to be in [0, 1]. At the same time, for tree CNs, this algorithm is the same as classical CP (i.e., produces the same result), regardless of how the messages \u03b7_{a\u2192i} and p^s_{i\u2192a} are initialized. In fact, in the tree case, the final messages uniquely identify P = P_f.\n\nMaking Assumptions about Satisfiability. Let us make the following assumption about the \u201cprobabilistic satisfiability structure\u201d of the k-SAT formula.\n\nAssumption 4. 
For some \u03c1 \u2208 [0, 1], for each clause a and variable i,\n\nP(M^s_{i\u2192a} = 0, M^u_{i\u2192a} = 0) = (1 \u2212 \u03c1) P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1).\n\nFor \u03c1 = 1, the last assumption essentially says that f_{a\u2192i} has a satisfying assignment; i.e., P(M^s_{i\u2192a} = 0, M^u_{i\u2192a} = 0) = 0. For \u03c1 = 0, it essentially says that the likelihood that f_{a\u2192i} does not have a satisfying assignment is the same as the likelihood that f_{a\u2192i} has a satisfying assignment where variable i is free. Formally, in this case, we have P(M^s_{i\u2192a} = 0, M^u_{i\u2192a} = 0) = P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1), which, interestingly, is equivalent to the condition P(M^s_{i\u2192a} = 1) + P(M^u_{i\u2192a} = 1) = 1.\n\nLet us introduce a final assumption about the random variables associated to the messages from variables to clauses.\n\nAssumption 5. For each clause a and variable i, the random variables M^s_{i\u2192a} and M^u_{i\u2192a} are independent.\n\nNote that assumptions 2, 3 and 5 hold (simultaneously) if and only if for each clause a and variable i, the random variables M^u_{b\u2192i} for all clauses b \u2208 C_a(i) are independent.\n\nThe following theorem is the main result of this paper.\n\nTheorem 1. (Sufficient Assumptions) Let assumptions 1, 2 and 3 hold. The message-passing procedure that results from CPR as presented above is\n\n1. belief propagation (i.e., SP(0)), if assumption 4, with \u03c1 = 0, holds, and\n\n2. a member of the family of survey propagation algorithms SP(\u03c1), with 0 < \u03c1 \u2264 1, if assumption 4, with the given \u03c1, and assumption 5 hold.\n\nThese assumptions are also necessary in a strong sense (details omitted). Assumptions 1, 2, 3, and even 5 might be obvious to some readers, but assumption 4 might not be, and it is essential.\n\nProof. 
As in the last subsection, assumption 1 leads to p^u_{a\u2192i} = 1 \u2212 \u220f_{j\u2208V_i(a)} (1 \u2212 p^s_{j\u2192a}), while assumptions 2 and 3 lead to p^s_{i\u2192a} = \u220f_{b\u2208C^u_a(i)} p^u_{b\u2192i} and p^u_{i\u2192a} = \u220f_{b\u2208C^s_a(i)} p^u_{b\u2192i}.\n\nNote also that assumption 4 is equivalent to p^s_{i\u2192a} + p^u_{i\u2192a} \u2212 \u03c1 P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1) = 1. This allows us to express\n\nP(M^s_{i\u2192a} = 1) = p^s_{i\u2192a} = p^s_{i\u2192a} / (p^s_{i\u2192a} + p^u_{i\u2192a} \u2212 \u03c1 P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1)),\n\nwhich implies\n\nP(M^s_{i\u2192a} = 0) = (p^u_{i\u2192a} \u2212 \u03c1 P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1)) / (p^s_{i\u2192a} + p^u_{i\u2192a} \u2212 \u03c1 P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1)).\n\nIf \u03c1 = 0, then the last expression simplifies to P(M^s_{i\u2192a} = 0) = p^u_{i\u2192a} / (p^u_{i\u2192a} + p^s_{i\u2192a}).\n\nUsing the reparametrization \u03b7_{a\u2192i} \u2261 P(M^u_{a\u2192i} = 0) = 1 \u2212 p^u_{a\u2192i}, and \u03a0^s_{i\u2192a} + \u03a0^*_{i\u2192a} = P(M^s_{i\u2192a} = 1) = p^s_{i\u2192a}, leads to BP (i.e., SP(0)).\n\nOtherwise, if 0 < \u03c1 \u2264 1, then using the reparametrization \u03b7_{a\u2192i} \u2261 P(M^u_{a\u2192i} = 0),\n\n\u03a0^u_{i\u2192a} \u2261 P(M^u_{i\u2192a} = 1) \u2212 \u03c1 P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1) = P(M^s_{i\u2192a} = 0, M^u_{i\u2192a} = 1) + (1 \u2212 \u03c1) P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1),\n\n\u03a0^s_{i\u2192a} \u2261 P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 0), and\n\n\u03a0^*_{i\u2192a} \u2261 P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1),\n\nand applying assumption 5 leads to SP(\u03c1).\n\nThe following are some remarks that can be easily derived using CPR.\n\nOn the Relationship Between SP and BP. SP essentially assumes that every sub-formula f_{a\u2192i} has a satisfying assignment, while BP assumes that 
for every clause a and variable i \u2208 V(a), f_{a\u2192i} is equally likely to have no satisfying assignment as to have a satisfying assignment in which variable i is free, as is easy to see from assumption 4. The parameter \u03c1 just modulates the relative scaling of those two likelihoods. While the same statement about pure SP is not novel, the statement about BP, and more generally, the class SP(\u03c1) for 0 \u2264 \u03c1 < 1, seems to be.\n\nOn the Solutions of SAT formula f. Note that P_f may not satisfy all or any of the assumptions. Yet, satisfying an assumption imposes constraints on what P_f actually is and thus on the solution space of f. For example, if P_f satisfies assumption 4 for any \u03c1 < 1, which includes BP when \u03c1 = 0, and for all clauses a and variables i, then P_f(M^s_{i\u2192a} = 0, M^u_{i\u2192a} = 0) = P_f(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1) = 0, and therefore either P_f(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 0) = 1 or P_f(M^s_{i\u2192a} = 0, M^u_{i\u2192a} = 1) = 1 holds, but not both of course. That implies f must have a unique solution!\n\nOn SP. This result provides additional support to previous informal conjectures as to why SP is so effective near the satisfiability threshold: SP concentrates all its efforts on finding a satisfying assignment when they are scarce and \u201cscattered\u201d across the space of possible assignments. Thus, SP assumes that the set of satisfying assignments has in fact special structure.\n\nTo see that, note that assumptions 4, with \u03c1 = 1, and 5 imply that P(M^s_{i\u2192a} = 0) = 0 or P(M^u_{i\u2192a} = 0) = 0 must hold. This says that in every assignment that satisfies f_{a\u2192i}, variable i is either free or always has the same value assignment. 
This observation is relevant because it has been argued that as we approach the satisfiability threshold, the set of satisfying assignments decomposes into many \u201clocal\u201d or disconnected subsets. It follows easily from the discussion here that SP assumes such a structure, therefore potentially making it most effective under those conditions (see Maneva et al. [3] for more information).\n\nSimilarly, it has also been empirically observed that SP is more effective for \u03c1 close to, but strictly less than, 1. The CPR approach suggests that such behavior might be because, with respect to any P that satisfies assumption 4, unlike pure SP, for such values of \u03c1 < 1, SP(\u03c1) guards against the possibility that f_{a\u2192i} is not satisfiable, while still being somewhat optimistic by giving more weight to the event that variable i is free in f_{a\u2192i}. Naturally, BP, which is the case of \u03c1 = 0, might be too pessimistic in this sense.\n\nOn BP. For BP (\u03c1 = 0), making the additional assumption that the formula f_{a\u2192i} is satisfiable (i.e., P(M^s_{i\u2192a} = 0, M^u_{i\u2192a} = 0) = 0) implies that there are no assignments with free variables (i.e., P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 1) = 0). Therefore, the only possible consistent domain is the singleton {s_{a,i}} or {u_{a,i}} (i.e., P(M^s_{i\u2192a} = 1, M^u_{i\u2192a} = 0) + P(M^s_{i\u2192a} = 0, M^u_{i\u2192a} = 1) = 1). Thus, either 0 or 1 can possibly be a consistent value assignment, but not both. This suggests that BP is concentrating its efforts on finding satisfying assignments without free variables.\n\nOn Variable and Value Ordering. 
To complete the picture of the derivation of SP(\u03c1) via CPR, we need to compute p^0_i and p^1_i for all variables i to use for variable and value ordering during search. We can use the following, slightly stronger versions of assumptions 2 and 3 for that.\n\nAssumption 6. For each variable i, the random variables M^u_{a\u2192i} for all clauses a \u2208 C\u2212(i) are independent.\n\nAssumption 7. For each variable i, the random variables M^u_{a\u2192i} for all clauses a \u2208 C+(i) are independent.\n\nUsing assumptions 6 and 7, we can easily derive that p^1_i = \u220f_{a\u2208C\u2212(i)} (1 \u2212 \u03b7_{a\u2192i}) and p^0_i = \u220f_{a\u2208C+(i)} (1 \u2212 \u03b7_{a\u2192i}), respectively.\n\nOn Generalizations. The approach provides a general, simple and principled way to introduce possibly uncertain domain knowledge into the problem by making assumptions about the structure of the set of satisfying assignments and incorporating them through P. That can lead to more effective propagation algorithms for specific contexts.\n\nRelated Work. Dechter and Mateescu [10] also connect BP to CP but in the context of the inference problem of assessing zero posterior probabilities. Hsu and McIlraith [11] give an intuitive explanation of the behavior of SP and BP from the perspective of traditional local search methods. They provide a probabilistic interpretation, but the distribution used there is over the biases.\n\nBraunstein and Zecchina [12] showed that pure SP is equivalent to BP on a particular MRF over an extended domain on the variables of the SAT formula, which adds a so-called \u201cjoker\u201d state. Maneva et al. [3] generalized that result by showing that SP(\u03c1) is only one of many families of algorithms that are equivalent to performing BP on a particular MRF. 
In both cases, one can easily interpret those MRFs as ultimately imposing a distribution over \u2126, as defined here, where the joker state corresponds to the domain {0, 1}. Here, the only particular distribution explicitly defined is P_f, the \u201coptimal\u201d distribution. This paper does not make any explicit statements about any specific distribution P for which applying CPR leads to SP(\u03c1).\n\n6 Conclusion\n\nThis paper strongly connects survey and constraint propagation. In fact, the paper shows how survey propagation algorithms are instances of CPR, the probabilistic generalization of classical constraint propagation proposed here. The general approach presented not only provides a new view on survey propagation algorithms, which can lead to a better understanding of them, but can also be used to easily develop potentially better algorithms tailored to specific classes of CSPs.\n\nReferences\n\n[1] A. Braunstein, M. M\u00e9zard, and R. Zecchina. Survey propagation: An algorithm for satisfiability. Random Structures and Algorithms, 27:201, 2005.\n\n[2] M. M\u00e9zard, G. Parisi, and R. Zecchina. Analytic and algorithmic solution of random satisfiability problems. Science, 297(5582):812\u2013815, 2002.\n\n[3] E. Maneva, E. Mossel, and M. J. Wainwright. A new look at survey propagation and its generalizations. Journal of the ACM, 54(4):2\u201341, July 2007.\n\n[4] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.\n\n[5] D. L. Waltz. Generating semantic descriptions from drawings of scenes with shadows. Technical Report 271, MIT AI Lab, Nov. 1972. PhD thesis.\n\n[6] S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach, chapter 5, pages 137\u2013160. Prentice Hall, second edition, 1995.\n\n[7] V. Kumar. Algorithms for constraint-satisfaction problems: A survey. AI Magazine, 13(1):32\u201344, 1992.\n\n[8] R. Dechter. Constraint Processing. 
Morgan Kaufmann, 2003.\n\n[9] F. Bacchus, X. Chen, P. van Beek, and T. Walsh. Binary vs. non-binary constraints. Artificial Intelligence, 140(1-2):1\u201337, Sept. 2002.\n\n[10] R. Dechter and R. Mateescu. A simple insight into iterative belief propagation\u2019s success. In UAI, 2003.\n\n[11] E. I. Hsu and S. A. McIlraith. Characterizing propagation methods for boolean satisfiability. In SAT, 2006.\n\n[12] A. Braunstein and R. Zecchina. Survey propagation as local equilibrium equations. JSTAT, 2004.\n", "award": [], "sourceid": 988, "authors": [{"given_name": "Luis", "family_name": "Ortiz", "institution": null}]}