Sun Dec 8th through Sat the 14th, 2019 at Vancouver Convention Center
The paper is well written and clearly organized, although a smoother introduction of equation (2) would help non-experts. The quality of the theoretical part is undeniable. The elegant interpretations given for each step of the method make it compelling and the arguments convincing. As a minor comment, it is a shame that the PGD interpretation of the LISTA network is lost in practice when the ReLU activations are replaced by tanh. The originality of the submission lies in adapting a known technique (the LISTA network) to the particular application of real-time acoustic imaging. This could be of great significance to practitioners and researchers in the field, given the strong performance claims and how disruptive the ideas are compared to state-of-the-art approaches. For this reason, I believe the experimental results section should have been larger, with more results to support the authors' claims. In particular, I am surprised that 2760 data points were enough to train (without overfitting) and assess the 5-layer neural network; more explanation on this point would be welcome. I also wonder what the spherical map resolution N is for the experiment, and whether the stated improvements in runtime, resolution and contrast are averages over the 2760 * 0.2 = 552 test data points. If so, some confidence measure for these scores would help in interpreting the results. Other experiments from the supplementary material are cited in a short paragraph and, in my opinion, should be discussed further in the main document. -- The authors have addressed most of my concerns. I am increasing my score on the condition that they add more detail and results to their experimental section.
This work proposes an RNN for real-time reconstruction of acoustic camera spherical maps. It belongs to the wider topic of neural networks inspired by optimization algorithms, e.g., LISTA, which is based on the iterative soft-thresholding algorithm. The authors did a good job of explaining the logic behind the network they designed. They carefully reviewed the connection between the network and PGD, and how, in theory and by construction, the network was developed and trained. The analysis is solid. However, I don't think the paper offers enough contribution and novelty. Firstly, the network was proposed to solve the acoustic imaging problem and was not designed for a wider range of applications. Secondly, the formulation of the network is largely based on the natural settings of acoustic imaging, e.g., those related to spherical microphone arrays. Based on these settings, the authors discuss theoretical justifications and solutions that do not seem to transfer well to other areas.
+ Traditional acoustic camera methods advanced significantly with the advent of compressed sensing techniques, which successfully reconstruct the original signals by means of hand-crafted features or functions combined with nonlinear optimization, e.g., proximal gradient descent. However, the reconstruction process has been slow because of the nonlinear optimization steps. This paper proposes a new approach that replaces the traditional nonlinear optimization with a recurrent network architecture, i.e., it unrolls the iterative convex optimization algorithm into a neural network. The paper takes a two-layered design, in which a bias and back-projection gradient, as well as a deblurring matrix, are learned.
+ As described in the paper, recurrent architectures have already been proposed to replace the signal reconstruction step in other applications of compressive sensing, such as compressive imaging. I'm not an expert in the acoustic camera field, but I could not find any existing work of this kind in that field. I therefore assume that this work is novel.
+ This submission is very carefully prepared. The supplemental document provides a thorough derivation of the backpropagation and gradient descent details. The writing and soundness of this work should meet the standard of NeurIPS.
+ This work also tested the proposed method with eight real participants by recording their conversation.
+ I could not find any problem with the algorithms.
- I could not find any ablation study of the parameters of the proposed method. The paper would be stronger if it included an ablation study over its hyper-parameters or learnable parameters.
- In addition, I could not find any quantitative comparison of the proposed method with state-of-the-art methods. For this reason, I cannot give a higher score to this work.
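As an illustration of what "unrolling" an iterative algorithm into network layers means, here is a minimal sketch of proximal gradient descent for a non-negative sparse least-squares problem, written so that each iteration has the form of a layer, relu(D x + B y - tau). This is a generic LISTA-style construction under stated assumptions, not the architecture of the submission: the names `build_layer` and `unrolled_pgd`, the toy matrix `A`, the step size, and the penalty weight are all hypothetical. Here `D`, `B`, and `tau` are derived from `A`; in a learned network they would be trained instead.

```python
def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def relu(v):
    return [max(x, 0.0) for x in v]


def build_layer(A, step, lam):
    """Return (D, B, tau) such that one PGD step is relu(D x + B y - tau).

    Objective (assumed): 0.5 * ||A x - y||^2 + lam * ||x||_1 with x >= 0.
    D = I - step * A^T A, B = step * A^T, tau = step * lam.
    """
    At = transpose(A)
    n = len(At)
    AtA = [[sum(a * b for a, b in zip(At[i], At[j])) for j in range(n)]
           for i in range(n)]
    D = [[(1.0 if i == j else 0.0) - step * AtA[i][j] for j in range(n)]
         for i in range(n)]
    B = [[step * a for a in row] for row in At]
    return D, B, step * lam


def unrolled_pgd(y, D, B, tau, n_layers):
    """Apply n_layers identical layers, starting from x = 0."""
    x = [0.0] * len(D)
    By = matvec(B, y)  # the back-projected data is shared by all layers
    for _ in range(n_layers):
        Dx = matvec(D, x)
        # ReLU is the proximal operator of the non-negative l1 penalty,
        # which is why ReLU layers keep the PGD interpretation.
        x = relu([d + b - tau for d, b in zip(Dx, By)])
    return x


# Toy problem (assumption): a 2x2 operator and a sparse non-negative source.
A = [[1.0, 0.5], [0.0, 1.0]]
y = matvec(A, [2.0, 0.0])
D, B, tau = build_layer(A, step=0.5, lam=0.01)

print(unrolled_pgd(y, D, B, tau, n_layers=5))    # a few layers: rough estimate
print(unrolled_pgd(y, D, B, tau, n_layers=200))  # many layers: near the optimum
```

With fixed weights, a handful of layers gives only a rough estimate; the appeal of learning the layer parameters is that a short network can approach the quality of many plain iterations. Replacing `relu` with tanh would still define a network, but the layer would no longer be a proximal step for this objective.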