Submitted by Assigned_Reviewer_1
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
This paper is about hypothesis testing of Gaussian Hidden Clique problem. It shows that, for any \epsilon, the case where there exists a planted clique of size L = (1-\epsilon) \sqrt{n} cannot be distinguished against the case where there is no hidden clique, using spectral methods. This complements the positive result of [12], where it is possible to distinguish the two cases using a spectral method, when L > (1+\epsilon) \sqrt{n}. The main technical contribution is a general analysis based on rank-1 perturbation of order-k symmetric Gaussian tensors, which may be of independent interest.
Quality: The result is technically sound as far as I have checked.
Clarity: The paper is well-written overall; I am confused about Equation (24) -- why does it follow directly from Lemma 5?
Originality: Although the proof technique based on Lemma 2 is quite interesting, to be honest, I am not able to evaluate the novelty since I am not familiar with the relevant literature of proving such lower bound.
Significance: the paper establishes an interesting result, which nicely complements the previous positive result on spectral methods. But the downside is that this is only a lower bound for ``spectral methods", which makes the usage rather restrictive. It seems that extending this lower bound to any efficient detection algorithm is not easy, since the technique heavily relies on the fact that the eigenvalues are conjugation-invariant.
Q2: Please summarize your review in 1-2 sentences
This paper is about hypothesis testing of Gaussian Hidden Clique problem using spectral methods. The techniques are general, the results are interesting and nicely complements the existing lower bound.
Submitted by Assigned_Reviewer_2
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The authors consider the following problem:
Given a symmetric matrix X of dimension n, design a spectral test (a statistical test that depends only on the eigenvalues of X)
to distinguish between the following two hypotheses:
H_0 = all elements are drawn from N(0,1)
H_{1,L} = there exists a submatrix that is drawn from N(1,1)
It is known that when L >= (1-\epsilon)\sqrt(n) there is a simple test (involving checking the top eigenvalue) to distinguish H_0 and H_{1,L}.
What the authors show is that the result is tight and that no spectral test is reliable for L \leq (1-\epsilon)\sqrt(n).
The authors also prove results regarding the analogous tensor variant in particular distinguishing among H_0: X = Z H_1: X = \beta v^{\otimes k} + Z
The papers' contribution is primarily theoretical and there are no experiments.
Questions/Concerns:
(1) It's not entirely clear to me what the machine learning applications are for such a problem.
(2) Could the authors explain why the eigenvalue distribution of \beta * vv^T + Z is the same as Z if the norm of v is fixed?
(3) Is there some intuition to why examining the top eigenvalue is sufficient for the gaussian clique problem (if L is large enough?)
(3) I don't have an intuitive understanding of Theorem 1. In particular, why spectral contiguity of H_{1,L} with respect to H_0 implies that no spectral test is reliable for L <= (1-\epsilon)\sqrt(n).
Q2: Please summarize your review in 1-2 sentences
The authors prove some theoretical lower bounds for the gaussian hidden
clique problem in both matrices and tensors.
The contributions of the paper are primarily theoretical (i.e. no experiments).
I am not familiar with the related work and have some questions about some of the concepts that I detail below.
Submitted by Assigned_Reviewer_3
Q1: Comments to author(s). First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. (For detailed reviewing guidelines, see http://nips.cc/PaperInformation/ReviewerInstructions)
The paper provides a result on how the eigenvalues of a random matrix can be exploited to answer the detection problem "Gaussian hidden clique problem". The authors provide a tight result on how for small size of planted clique, the eigenvalues are not enough for the detection problem.
Revise the first sentence in the abstract. It is a very long sentence, and it is hard to read and understand it. I also suggest to move some of the related works discussion provided at the end of paper to the beginning in the introduction section.
Q2: Please summarize your review in 1-2 sentences
The paper has a reasonable contribution, and it is clearly written.
Q1:Author
rebuttal: Please respond to any concerns raised in the reviews. There are
no constraints on how you want to argue your case, except for the fact
that your text should be limited to a maximum of 5000 characters. Note
however, that reviewers and area chairs are busy and may not read long
vague rebuttals. It is in your own interest to be concise and to the
point.
We thank the reviewers for their useful and
insightful comments. Our responses to some of the issues are summarized
below:
1) Reviewer 2 asks for further pointers of applications of
our result to the NIPS
community and notes that perturbations of
matrices of
rank greater than one is of interest to researchers in
machine learning.
Of course, spectral clustering provides another
broad set of
application for spectral methods. Our approach may
provide insights into the
limitations of spectral clustering. The
main technical is to deal with higher-rank
structures, which we
believe to be feasible.
2) Reviewer 2 correctly observes that our
results are asymptotic, hence the
question arises to what extent
they apply to finite instances that are encountered
in
practice.
All calculations are explicit, and in principle we can
derive explicit bounds on the
total variation distance for each
fixed k. For instance in Eq. (27) the integral yields
E\Lambda^2 \le 1 + C'/n^{(k/2)-1}
for k\ge 3
Hence allready for $k=4$ the total variation decays
at a linear rate.
3) Reviewer 2, raises the issue of spectral
algorithms that use eigenvectors in
addition to eigenvalues. We
believe that establishing the limitations
of such methods is of
great interest. However, It should be noted that
proving
unconditional hardness results for detection problems which
consider both
eigenvectors and eigenvalues is unlikely, as the
whole matrix is determined by its
eigenvalues and eigenvectors (up
to permutations).
4) We thank reviewer 3 for his suggestion
regarding the first sentence. We will make the first sentence shorter and
easier to parse.
5) Reviewer 5 asks about Equation (24) -- why does
it follow directly from Lemma 5?
This is a large deviations
calculation. Eq. (24) follows from Lemma 5 applying Varadhan's
lemma.
We agree that this point deserves a clarification in the
text-we will add such a clarification. |