
Submitted by Assigned_Reviewer_1
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
In this paper, a generalization of known results on the Stochastic Block Model is presented for a different type of network generative model: namely, the fact that spectral clustering based on the Laplacian works well in the large-degree regime (degree of order log(n)).
I think that the paper does not present the current knowledge on the SBM accurately, and the discussion of the state of the art is flawed: it ignores all the spectacular development of spectral clustering for the Stochastic Block Model in the last 3 years, where it has already been proven that a large class of networks are recoverable, either exactly or with a small error, above a sharp bound; see e.g.:
For detectability: http://arxiv.org/abs/1306.5550 http://arxiv.org/abs/1311.3085 http://arxiv.org/abs/1311.4115
For exact recovery: http://arxiv.org/abs/1405.3267
Given all these strong results, the novelty of the presented paper seems rather weak, and both the motivation and the title are misleading. In fact, rather than giving new results about the detectability or identification of clusters in the plethora of models for networks (SBM, degree-corrected, preferential attachment), the authors instead invented yet another model for which their proofs work.
Q2: Please summarize your review in 1-2 sentences
A generalization of known results on the Stochastic Block Model to a different type of network generative model. Interesting, but only the weakest results known for the SBM are adapted to the new class of models, while all the recent progress is ignored. Because of the lack of a proper depiction of the current state of the art, the reviewer also finds the presentation to be misleading. Also, it seems that a new kind of generative model is invented only so that the proof could apply.
Submitted by Assigned_Reviewer_2
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The term KPFM is never well defined.
After reading line 66, I thought I understood: "A random graph family over node set V admits a K-preference frame H, and is called a Preference Frame Model (KPFM), if the edges i, j, i < j are sampled independently from Bernoulli distributions with parameters S_ij."
Then, line 104 says, "We note that the KPFM model does not specifically require that the edges are sampled independently."
This is fine, but what appeared to be a definition on line 66 is no longer a definition.
This is super frustrating.
Given that the aim of this paper is to "increase the clarity of the parameterizations", it seems that the paper should excel in defining its model.
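For what it is worth, the independent-edge reading of line 66 is easy to state operationally. A minimal sketch (the matrix S of edge probabilities and the function name are illustrative choices of this referee, not the paper's code):

```python
import numpy as np

def sample_graph(S, rng=None):
    """Sample an undirected graph whose edge (i, j), i < j, is present
    independently with probability S[i, j] (S symmetric, entries in [0, 1])."""
    rng = np.random.default_rng(rng)
    n = S.shape[0]
    A = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            # independent Bernoulli(S[i, j]) draw for each edge
            A[i, j] = A[j, i] = int(rng.random() < S[i, j])
    return A

# Illustrative two-block edge-probability matrix
S = np.array([[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.1],
              [0.1, 0.1, 0.0]])
A = sample_graph(S, rng=0)
```

Line 104's remark then reads as replacing this independent sampling with some other joint distribution over edges, which is exactly what the quoted definition does not allow.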
This referee agrees that there may be space to generalize the Stochastic Blockmodel and its degree correction in a way that more closely aligns with spectral clustering.
For example, I'd suggest the paper start by saying that ''spectral clustering is not a model-based algorithm, but can we derive a model that more naturally aligns with its properties?''
This referee is eager to up his quality score.
This paper has a lot of promise, but the organization of the paper is sorely lacking.
Where is the definition of misclustered?
Where is the definition of the model? Are edges independent or not?
Minor point: Because the algorithm uses a normalization step, I believe that the correct citation for this version of the algorithm is not [13] and [21], but rather [16].
Q2: Please summarize your review in 1-2 sentences
This is an interesting problem and the approach in this paper might be exactly correct.
However, the organization and clarity of the paper is very frustrating.
Submitted by Assigned_Reviewer_3
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The authors present an analysis that establishes the applicability of the classical spectral clustering algorithm to a wide variety of generalized stochastic block models (varying numbers of communities, sizes, transition probabilities, etc.). Their results consider the weak recovery regime, where "most of the nodes" are clustered correctly; certainly a more realistic setup.
The ideas presented seem novel; however, I am not an expert in the area. The paper is well-written; however, it is extremely heavy in notation, making the technical result inaccessible.
One example where the technical discussion is confusing: it is not clear how the parameter kappa should be thought of. For example, what is a maximum degree bound with a reasonable kappa, for which the probability bounds are meaningful?
Also, there are many relevant and important recent results on the subject, and a comparison against them would be very useful:
- Results by Mossel, Neeman, Sly et al.: http://arxiv.org/abs/1407.1591 http://arxiv.org/abs/1404.6325 http://arxiv.org/abs/1311.4115
- Results by Abbe, Bandeira, Sandon et al.: http://arxiv.org/abs/1506.03729 http://arxiv.org/abs/1503.00609 http://arxiv.org/abs/1405.3267
- Results by Lelarge, Massoulie et al.: http://arxiv.org/abs/1506.08621 http://arxiv.org/abs/1406.6897
Finally, a minor comment: the authors mention at some point that "there are strong reasons to believe that the KPFM is the most general class for which community recovery by direct spectral clustering is possible." Why is this true?
Overall, it seems that the contribution could be a valuable addition to the literature on community detection in SBMs; however, it is not clear how it compares against the state-of-the-art spectral methods for extended SBMs noted above.
Some typos:
- [19,18,9].In particular -> space missing
- overfill in line 090
- "degree model"[8] -> space missing
Q2: Please summarize your review in 1-2 sentences
The authors establish that the spectral clustering algorithm that works for the SBM is also applicable to a more general class of stochastic community models. The ideas seem novel; however, the technical material is not very accessible, and more comparisons to prior art seem necessary.
Submitted by Assigned_Reviewer_4
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
Typos:
Line 91/92: Equation has indexing typo
Line 116: This type of interactions
Line 125: consists with
Line 160/161: recover(y)
Q2: Please summarize your review in 1-2 sentences
The paper analyzes stochastic block models and related models under an umbrella model of preference frame models, focusing particularly on recovery using spectral methods. The generalization seems useful for the overall problem. The paper is well written and the assumptions clearly stated.
Q1: Author rebuttal: Please respond to any concerns raised in the reviews. There are no constraints on how you want to argue your case, except for the fact that your text should be limited to a maximum of 5000 characters. Note however, that reviewers and area chairs are busy and may not read long vague rebuttals. It is in your own interest to be concise and to the point.
REV 1: Apologies for the egregious and frustrating mistake(s). The KPFM random graph model is as defined in lines 66-69, where ([n], S) is a weighted graph that admits a pref. frame (l. 57-58). What we **should have** said at the bottom of p. 2 is that we use independent sampling of edges only to prove concentration of hat{L}. The result extends to other graph models with dependent edges (e.g. graph lifts) if one can prove concentration.
"Misclustered" is defined the usual way: p_err = (1/n) * (minimum over all permutations of cluster labels of the Hamming distance between the label vectors).
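For concreteness, this definition of the misclustering rate can be computed directly; a brute-force sketch over the K! label permutations (the function and variable names are ours, not the paper's notation):

```python
import numpy as np
from itertools import permutations

def misclustering_rate(labels_true, labels_pred, K):
    """p_err = (1/n) * min over all permutations of the K cluster labels
    of the Hamming distance between the true and predicted label vectors."""
    labels_true = np.asarray(labels_true)
    labels_pred = np.asarray(labels_pred)
    n = len(labels_true)
    best = n
    for perm in permutations(range(K)):
        # relabel the predicted clusters according to this permutation
        mapped = np.array([perm[l] for l in labels_pred])
        best = min(best, int(np.sum(mapped != labels_true)))
    return best / n

# A pure relabelling of the same partition has zero error:
# misclustering_rate([0, 0, 1, 1], [1, 1, 0, 0], 2) -> 0.0
```

The minimum over permutations makes p_err invariant to how the clusters happen to be numbered, which is why it is the standard error measure for clustering.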
Minor: [16] normalizes the _rows_ in step 3 to unit length. Alg 1 is almost that of [13,21] (there, columns are normalized after step 3, not before). This variation amounts to the redefinition of c_max, c_min (l. 252).
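The two normalization conventions differ only in the axis along which the n x K eigenvector matrix is rescaled; a small illustration (the matrix V and function names are ours, for clarity only):

```python
import numpy as np

def normalize_rows(V):
    # [16]-style: each node's K-dimensional embedding rescaled to unit length
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    return V / np.where(norms == 0, 1.0, norms)

def normalize_columns(V):
    # column variant: each eigenvector rescaled to unit length
    norms = np.linalg.norm(V, axis=0, keepdims=True)
    return V / np.where(norms == 0, 1.0, norms)
```

Row normalization changes the geometry of the node embeddings that are clustered, while column normalization only fixes the scale of each eigenvector; hence the different constants c_max, c_min in the analysis.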
Minor correction: lambda_2 of block M^kl (Thm 6) is the third eigenvalue (the first 2 eigenvalues of M^kl have the same magnitude and opposite signs).
"spectral clustering is not model-based but can we derive...": this could well be the opening statement of our paper.
We hope that with these clarifications the reviewer will be motivated to continue reading the paper. It is a privilege to have an expert reviewer for our paper and we would appreciate more in-depth feedback.
Rev 4: Thank you for the feedback and insightful questions. kappa is needed to prove concentration of hat{L} (Supplement, Prop. 4). We suspect kappa could also grow SLOWLY with n. For reasonable p_err, kappa (eq. 7) is not the critical factor in our experiments. One needs the expression [C_0....] to get small enough, which requires fairly large n.
Regarding the 8 references:
[Mossel, Neeman, Sly, et al.]
** Indeed state of the art, but for the *planted bisection* problem. This model has 2 parameters (p, q) whereas our model has O(n^2) parameters. DCSBM has O(n). One of our contributions is to show that the vast majority of these are nuisance params (lines 149-155 and proof of Prop. 2).
** These results are in the very sparse regime d_i = O(1) and on the problem of "detection"/"correlated partition". The recovery regime of our Thm 3 is impossible below the d_i = Omega(log n) threshold.
1506.03729 and 1503.00609 (Abbe & Sandon) deal strictly with the SBM, i.e. O(K^2) params. The sharp threshold obtained does not apply to KPFM or even HKPFM.
1405.3267 (Abbe, Bandeira, Hall) is on planted bisection.
1506.08621 (posted 3 weeks after the NIPS submission): recovery of degree-corrected SBM (i.e. HKPFM), under very weak conditions (more below). Implemented on our example, it got p_err = 0.26; lambda_K of hat{H} has multiplicity 6.
1406.6897: an impressively general model. Technically, this model is hierarchical (parameters are random, then edges are random) while our model has fixed parameters. Besides, the bound obtained (Thm 1, eq. 9) vanishes only when epsilon_r vanishes, i.e. when the relevant operator T (properly normalized) tends to a rank-K operator. Our KPFM does not require this assumption. If we were to construct the T operator for a KPFM, the residual spectrum would stay bounded away from 0, so the recovery error would not vanish asymptotically.
To put things in perspective, the models cited above rely more or less on a similar condition (e.g. that there is an "ideal" matrix of rank K). This is true for the DCSBM (and exploited implicitly by e.g. Gulikers et al.). The KPFM departs from this assumption; we use eigenvalue separation instead (much weaker than controlling epsilon_r). In Theorem 6 we show how one can control the eigenvalue separation using only the natural condition that clusters cannot be further partitioned (see bottom of page 6). This is a novel result by itself.
"Finally a minor comment...": a precise answer requires more space. Very simplistically, the rows in Alg. 1, step 3 will concentrate iff the matrix hat{P} is close to satisfying (1), which essentially defines the KPFM.
Rev 5:
1306.5550 (Krzakala et al.), 1311.3085 (Massoulie), 1311.4115 (see Rev 3), 1405.3267 (see Rev 3): remarkable results, yet they apply only to the SBM (K^2 params or fewer). We agree with the reviewer that the non-backtracking random walk holds great promise and will include this + citations in the next version. But we are not aware of this approach having been applied to anything beyond the SBM.
In a NIPS paper there is little room to cover progress that is not _directly_ related to our contribution. Thus accusations of "ignoring" the state of the art and being "misleading" should wait until we post on arxiv. What is fair to say is that work on planted bisection is making spectacular progress, but that more complex models, and in particular the geometric aspects that matter once we move away from O(K^2) models, have been neglected, and our paper advances the state of the art in that direction.
The KPFM is much more flexible than the above models, yet we don't make stronger assumptions for recovery. In fact our assumptions are a little simpler, clearer, and in practice weaker, as we show by the comparison in Sec. 4.2.

