__ Summary and Contributions__: The authors propose a new approximate optimization method for the constrained max. a posteriori inference problem in discrete graphical models. The single constraint has a form f(x_1, ... , x_n) <= c, where x_1,..., x_n are the discrete variables of the considered graphical model. The proposed method consists in
- considering a combined graph, which includes all edges from the original graphical model and the edges corresponding to the function f, if one considers it as an energy of a different graphical model;
- fixing the values of a subset of variables such that the remaining graph falls into multiple small connected components, up to a bounded k number of variables each.
- a reduction of the problem on the remaining graph to the multiple choice knapsack problem, where each connected component is treated as a single variable.
- for solving the multiple choice knapsack problem an existing algorithm is used.

__ Strengths__: The paper is theoretically sound and mathematical statements are rigorously proven.
The problem is
- important, as a number of existing problems can be reduced to it
- NP-hard, as the max. a posteriori inference is NP-hard by itself
Therefore, any progress in this direction is important. The proposed reduction is new for me. The discrete graphical models is one of the important topics for the NeurIPS community.

__ Weaknesses__: The proposed reduction is practically limited to the case when the initial graph and the graph defined by the function f are sparse and the number of variables, which must be fixed to obtain small connected components, is relatively small.
As a consequence, the empirical evaluation is limited to the small-sized sparse problems from UAI 2010 and 2014 competitions.

__ Correctness__: The claims and the method are correct. The empirical evaluation is sensible, I was unable to propose a different evaluation methodology on the fly.

__ Clarity__: The paper is written very clearly and rigirously.

__ Relation to Prior Work__: The relation to the previous work is clearly stated.

__ Reproducibility__: Yes

__ Additional Feedback__: Major:
- Why the SCIP solver was used as a baseline and not Gurobi? My experience suggests that Gurobi typically performs notably better and the academic license is free.
- Proposition 1 does not contain a claim. It seems you unintentionally deleted a part of it.
Minor:
- I would suggest to use the term "volume" instead of "cost", as it makes more sense to restrict the volume and maximize the profit than to restrict the costs and maximize the profit, especially when one speaks about a knapsack (with a predefined volume).
l117: m is the number of log-potentials. It becomes more readable, if you would write "m is the number of log-potentials, i.e. m=|\mathbf(f)|".
x^l = argmax ... , x^u = argmin ...
l155: the sentence "and a real number q such that no two functions ... share any variable..." - is ambiguous. Put "and a real number q" right before "we can construct" instead.
Section 4.3. Please specify the meaning of "n" again to simplify understanding. It is the number of connected components, is not it?
=============Post rebuttal:================================
I've read the other reviews and the rebuttal of the authors. It seems to me that authors properly addressed the main concerns of all reviewers. Therefore, I leave my positive evaluation of the paper unchanged.
As an additional remark to Proposition 1: 'argmax' instead of 'max' at the very beginning of the statement would eliminate my confusion.

__ Summary and Contributions__: The paper considers a class of discrete optimisation problems which requires to maximise the (negative) energy of a discrete probabilistic graphical model under the constraint that the energy of the solution w.r.t. a second model is bounded by some given value. The authors propose an algorithm that reduces this task to a set of multiple choice knapsack problems which are then solved by a greedy algorithm.

__ Strengths__: The authors discuss several application relevant problem classes that can be reduced to the studied problem. The main idea of the proposed approach is quite unexpected and consists in following. The first step is to find a separator set in the combined graph of the two models, such that the remaining graph decomposes into connected components with at most k elements. After fixing a variable assignment for the separator set, the initial task can be reduced to a multiple choice knapsack problem, where the choices in the bins are the variable assignments of the connected components. The authors show that their approach outperforms of the shelf mixed ILP solvers considerably for several benchmark graphical models.

__ Weaknesses__: The paper is clearly structured and concise. I see almost no weakness. Of course, all the sub-problems, i.e. (1) finding an optimal separator set, (2) enumerating over all possible label assignments on the separator set and (3) solving the multiple choice knapsack problem are themselves hard and require heuristic approaches. However, the paper shows that using well studied approximations for each of them leads, altogether, to an algorithm that outperforms standard solvers by orders of magnitude.

__ Correctness__: The claims and proposed methods are correct. The experimental validation is convincing.

__ Clarity__: The paper is clearly structured and well written.

__ Relation to Prior Work__: Relation to prior work is clearly discussed. The proposed method is unexpected and novel.

__ Reproducibility__: Yes

__ Additional Feedback__: Post rebuttal:
I have read the other reviews and the rebuttal of the authors. It seems to me that they have addressed the concerns raised in the other reviews. I will therefore keep my positive recommendation for accepting the paper.

__ Summary and Contributions__: The authors define and study a combinatorial optimization task called the "constrained most probable explanation" (CMPE) in discrete PGMs, to find the MPE configuration x of some distribution p(x) given a bound on another, q(x). The authors' approach is essentially to find a cutset of the graph of p & q such that the resulting independent components are small enough to be solved (approximately) using methods from multiple choice knapsack problems.

__ Strengths__: The most interesting aspect of the work is the problem formulation itself, and its interpretation as a multiple choice knapsack problem. This was new to me, and seems like it would be of interest generally. The authors argue that the CMPE problem is important, since several tasks (m-best, nearest assignment, and decision-preserving MAP) can be framed as CMPE.

__ Weaknesses__: The significance of the work is unclear, in my opinion. The authors do not study any realistic problems corresponding to CMPE tasks, or use the tasks that can be framed as CMPE to compare their approach with other methods for solving existing problems. The experiments look at the performance of synthetic CMPE problems defined on UAI benchmark instances, some of which are built from real data, but not for the purpose of a CMPE task. It would be much more compelling to see this applied to something more realistic.
In terms of methodology, the approach is mainly a combination of two existing techniques (cutsets and MCKP solvers) and is thus mostly innovative in their application to the new task rather than as a novel approach.

__ Correctness__: Yes. Empirical methodology appears correct, with the criticism that it is very synthetic in the manner noted.

__ Clarity__: Yes, the paper is clear and well written.

__ Relation to Prior Work__: Yes; the work is defining a new problem type.

__ Reproducibility__: Yes

__ Additional Feedback__: I think this work would be greatly strengthened by showing that these types of queries can be useful for something practical, and evaluating on real models, or models meant to simulate that type of task.

__ Summary and Contributions__: The paper presents a new constrained MAP inference task for graphical models. Specifically, given two graphical models defined over the same set of variables and a real number q, the task is to find a MAP assignment for the first graphical model such that the cost of the assignment is less than q with respect to the second graphical model. The proposed algorithm combines (local/exhaustive) search over a subset of variables while encoding the remaining conditioned subproblem as a multiple-choice knapsack problem and solving that with a specialised solver. The empirical evaluation on standard benchmarks for graphical models shows that the proposed algorithm performs well in practice and outperforms a MIP encoding of the problem that is subsequently solved with an open-source MIP solver.

__ Strengths__: The constrained inference task defined in this paper appears to be relevant to other domains as well. Therefore, I think the paper presents a fairly strong contribution. Furthermore, the empirical evaluation is performed on standard benchmarks for graphical models and therefore I think the results can be easily reproduced.

__ Weaknesses__: My main concern is that the paper is not really self contained. The proposed anytime algorithm relies on state-of-the-art MCKP solvers. So I think the background section should contain a brief description of such solvers.

__ Correctness__: The technical contribution appears to be correct. I didn't see any obvious mistakes.

__ Clarity__: The quality of the presentation is overall quite good. The technical details are discuss in a fairly clear manner and therefore the paper is relatively easy to follow.

__ Relation to Prior Work__: The paper appears to be positioned well in the context of related work. However, I think there is a close connection with Dechter's "mixed networks" framework that also defines a constrained inference task over graphical models.
Robert Mateescu and Rina Dechter. "Mixed deterministic and probabilistic networks". In Annals of Mathematics and Artificial Intelligence. Special Issue: Probabilistic Relational Learning, Volume 54 (1-3), pages 3-51, 2008.

__ Reproducibility__: Yes

__ Additional Feedback__: 1. I think it's important to present in the empirical section the MIP encoding of the proposed inference task.
2. I was surprised that no established MAP inference algorithms for graphical models have been pursued in the paper. For example, a straightforward approach is to run a depth-first branch and bound search algorithm for MAP that enumerates the MAP assignments of M1 and evaluates them on M2 to determine if they exceed the q cost or not.
3. It is well known that solving MAP as an integer program is not the most efficient way, so I'm not that surprised that SCIP doesn't perform that well.
*** Post rebuttal ***
I've read the other reviews and the author response. I'm satisfied with the clarifications provided by the authors.