
Submitted by Assigned_Reviewer_1
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
Online binary vertex classification over a graph. In the problem, initially the learner is provided with a graph $G$ and on each round, possibly adversarially a node from this graph is picked and the learner provides a label in $\{0,1\}$ for this node. Nature then reveals the true label $y_t$ for this node. The prediction made by the learner is based on the ising model when the temperature parameter approaches 0. However this is a compitationally hard problem to solve and so the authors propose an approximate scheme based on constructing PicardQueyranne graphs that preserve combinatorial structure of minimum cuts. Now the learner makes predictions by playing an online game on the PQ graph instead. The authors provide a mistake bound for the algorithm. In the case when the graph is a tree the mistake bounds are comparable to treeopt algortihm
I really liked the paper but it definitely was a hard read and the paper is not all that well written. Even the basic problem setup and description is staggered around in the first two sections. It would nice to give a bullet point type list to give a concise but complete description of the basic problem. The authors should also consider giving an informal proof outline and high level results in one of the first two sections. The 0ising model is a very natural although intractable benchmark model to compare against. The comparison of the treeopt to the proposed algorithm is encouraging.The reduction to PG graphs and the prediciton PQ prediction game is interesting and novel
Overall I believe the work is defintiely worth publishing.
Q2: Please summarize your review in 12 sentences
The problem considered is a very interesting one and has practical applications. The time complexity of the algorithm is order square of number of vertices which makes it suitable only for small to medium scale problems.
Submitted by Assigned_Reviewer_2
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
Summary:  The paper studies prediction of labels on a graph, where the nodes are data points, the edges indicate similarity in label, and getting different labels on the sides of an edge is penalized. The authors analyze the problem in an online setting, and provide a general mistake bound (Thm. 4) that holds for any algorithm that has three properties: monotonicity (observing a label can only result in more predictions of that label), permutationinvariance (order of observations does not matter), and a "Markov" property (predictions on a node are influenced only by its immediate connections). Next, the authors provide two prediction strategies that satisfy these properties (Thm. 9), one of which (0Ising) is computationally inefficient while the other (longestpath) has polynomial running time in the size of the graph. Finally, the authors show a percluster mistake bound for these two strategies (Thm. 10) that, combined with the general mistake bound of Thm. 4, results in a concrete bound for these two algorithms. The new bound retrieves the previous optimal bound on trees, while it still gives meaningful (in cases improved) bounds when the graph is not a tree.
Quality and Clarity:  Generally, the paper is wellwritten: the setting, definitions, and logical steps are laid out and explained clearly, which makes the paper easy to read even for those who are not directly working in this area. There are some problems to fix, which I have mentioned at the end.
The theory seems to work, although I have not verified all the proofs in detail.
Originality and Significance:  The contribution that the paper lays out seems interesting and I think it will be useful for the community. However, I am not working in this area directly, so I leave a deeper comparison with the related work to other reviewers that are more familiar with the previous work in this particular area.
Other issues: 
 It is not specified what is the format of the maxflow returned in step 2 of Figure 1, which propagates to the definition of graph $$I$$ in step 3. In particular, in Figure 2(c), has 1>3>2 been in the max flow? If no, then why has 11>10>12 been there? If yes, then why are the edges 13 and 32 still in I?  On page 3, paragraph 2, line 7, please fix the definition of the set of edges of the quotient graph.  Perhaps the first sentence of Sec. 2 needs to be moved to the first page to become the first sentence of the last paragraph on that page. Then I think the phrase "this graph" on the thirdtolast line on page 1 will have a closeby reference.  perhaps the references [3] and [4] on line 4 of the first paragraph are swapped.
Q2: Please summarize your review in 12 sentences
This is a wellwritten paper. The theory seems to work and I think the contributions, as laid out in the paper, will be interesting to the community. However, I am not directly working in this specific area, so I leave it to the other reviewers to comment on the significance.
Submitted by Assigned_Reviewer_3
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
** Summary of paper
This paper presents a new online algorithm to predict label on a
graph. The graph labeling prediction problem is motivated by
semisupervised learning where labeled and unlabeled are vertices
on a graph and edges represent closeness of these data. This work
uses an Ising model and seeks to optimize a bound on number of
mistake made by the learner online given a constraint on the
complexity of true labeling on the graph, such as the number of
edges connecting disagreeing vertex in the graph.
The paper exploits a transformation of the graph to PicardQueyranne
graph and analyses the mistake bounds for two prediction strategies
via analysis of mistake bounds in PQgames and percluster mistakes
bounds. The final mistake bounds are compared with mistake bounds
in existing literature.
** Quality
The final results reproduce the optimal result for online
labelling problem on trees. In comparison to mistake bounds in
existing literature, this paper's result is better when the graph
can be covered by labelconsistent clusters of different
diameter. This is because the analysis is done percluster and this
should be a better bound in most natural cases.
** Clarity
This paper is well written. The background material required to
understand the PQ graphs is sufficiently covered.
** Originality
This paper makes novel use of PQgraph and percluster analysis to
achieve the final mistake bounds.
** Significance
The algorithms proposed and theoretical results constitute significant
technical contributions.
Q2: Please summarize your review in 12 sentences
This a high quality paper that makes novel use of a PicardQueyranne graph to achieve new mistake bounds in the online graph label prediction problem. The new bounds are arguably better than existing ones for most natural graph labelling.
Submitted by Assigned_Reviewer_4
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The paper discusses the problem of semisupervised learning where data points are vertices on a graph and edges represent our belief that two vertices are likely to have the same label. The authors analyze a new method (longestpath) for predicting vertex labels incrementally (online) and characterize the number of mistakes made depending on the behaviour of the adversary (who chooses the next vertex and its label). The setup is assuming the labeling follows an underlying Ising distribution. The proposed method is analyzed with the help of a combinatorial structure (PQ graph) that collapses vertices of the same label. This allows for the development of a tractable algorithm (quadratic in the size of the graph) that avoids counting of the label cuts (sets of sedge connecting vertices with different labels).
Quality: The results and analysis appear to be correct. The proposed new algorithm has tractable runtime and improves the error bound in certain settings.
Clarity: I found the paper poorly structures and hard to follow. The introductory sections are well written and set the stage. However, the analysis does not follow smoothly. It is unclear where the paper is going and why the intermediate results on regular graphs are presented. Furthermore, the discussion section appears to give a lot of new information (e.g., summarizing the 4+1 proposed) methods for label prediction). The paper could benefit a lot by given all that information in the paper and not at the end to help evaluate the context and significance of this work.
Originality, Significance: The paper proposes a new algorithm that makes use of the PQgraph to reduce complexity. The longestpath algorithm can achieve lower error bounds in certain cases, and maths the optimal bound on trees. Unfortunately, The paper is outside my area so my ability to judge its significance and originality is limited.
Q2: Please summarize your review in 12 sentences
The paper discusses error bounds for a new algorithm for predicting incrementally the labels of a graph in a semisupervised setting. I found the flow of the paper not very smooth and thus hard to follow and evaluate the significance of the results.
Q1:Author
rebuttal: Please respond to any concerns raised in the reviews. There are
no constraints on how you want to argue your case, except for the fact
that your text should be limited to a maximum of 5000 characters. Note
however, that reviewers and area chairs are busy and may not read long
vague rebuttals. It is in your own interest to be concise and to the
point.
Dear Reviewers, we would like to thank you for your
comments. Below we respond to the specific comments of reviewers 1 and
3.
Reviewer 1:
 "It is not specified what is the format of
the maxflow returned in step 2 of Figure 1, which propagates to the
definition of graph $$I$$ in step 3. In particular, in Figure 2(c), has 1
 > 3  > 2 been in the max flow? If no, then why has 11  > 10 
> 12 been there? If yes, then why are the edges 13 and 32 still in
I?"
* The maxflow F returned in step 2 is just the set of directed
edges in a 01 max flow (Note generally there may be more than one such
max flow). The logic of step 3 which creates graph I, is that all the
directed edges of the max flow F are in I and every (undirected) edge in H
is also in I if it does not have an orientation which is in F.
*
Thus 1  > 3  > 2 is not in the max flow.
 ". . . then why
has 11  > 10  > 12 been there?"
* Note that either "1 1
> 10  > 12" or "11  > 12" must be in the max flow but not both.
 ". . . then why are the edges 13 and 32 still in I?"
* By the definition of I since their orientations are not in the
max flow they are included as undirected edges.
 On page 3,
paragraph 2, line 7, please fix the definition of the set of edges of the
quotient graph.
* The definition is now fixed.
 Perhaps
the first sentence of Sec. 2 needs to be moved to the first page to become
the first sentence of the last paragraph on that page. Then I think the
phrase "this graph" on the thirdtolast line on page 1 will have a
closeby reference.
* Now fixed.
 perhaps the
references [3] and [4] on line 4 of the first paragraph are
swapped.
* Now swapped.
Reviewer 3:
 ". . .
characterize the number of mistakes made depending on the behavior of the
adversary (who chooses the next vertex and its label). The setup is
assuming the labeling follows an underlying Ising distribution."
*
The setup is in fact completely adversarial, i.e., the labeling is chosen
by the adversary.
 " It is unclear where the paper is going and
why the intermediate results on regular graphs are presented."
*
The structure of the mistake bound analysis is discussed in the beginning
of Section 3 in lines 218233.
* Note that there are no results
presented on "regular graphs" however, an analysis is given of "regular
graph label prediction algorithms" (permutationinvariant,
labelmonotone, Markov) is presented [Theorem 4]. The two algorithms
considered in this paper longestpath and 0Ising are both regular
algorithms [Theorem 9]. [Theorem 10] in conjunction with [Theorem 4]
completes the mistake bound analysis.

