Review for NeurIPS paper: MeshSDF: Differentiable Iso-Surface Extraction

NeurIPS 2020

MeshSDF: Differentiable Iso-Surface Extraction

Review 1

Summary and Contributions: Extracting a surface from a deep signed distance field, e.g. with Marching Cubes, was previously not differentiable. This paper presents a differentiable alternative, called MeshSDF. This opens the door to applying DeepSDF to various downstream tasks that require an extracted surface.

Strengths: 1. First, the paper is well written and easy to follow. 2. The presented iso-surface differentiation is of high significance, both in theory and in applications. It is also well grounded in functional analysis. 3. The differentiation is applied only in the backward pass and is independent of how the surface is sampled in the forward pass, so the method is generally applicable.

Weaknesses: I have some questions about the current manuscript. 1. In Sec. 3.2, after the SDF changes slightly, the new surface sample v' is defined as the point on the new SDF's surface that is closest to the original sample v. Is this a principled definition or a heuristic? As far as I understand, nothing constrains how v is sampled in a way that carries over to both the old and the new SDF; such a constraint would be needed to establish a correspondence between v and v'. Thus, one cannot find the v' corresponding to v unless some constraint is imposed, e.g. the closest point used here. 2. Following the above definition of v' as the closest point: when we update z via backpropagation and run MC in the next iteration, is the new sample actually produced by MC really the v' defined above? This is an important question, because it determines whether the actual update of v is guided by the computed derivative. 3. Regarding minimizing equation (7), how is the loss made differentiable with respect to discrete values? Judging from line 201, DR_silhouette(M(z)) is a binary variable. I imagine that as z changes continuously, DR_silhouette(M(z)) switches between 0 and 1 and is thus a discrete function.
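
To make questions 1 and 2 concrete, here is a toy check that is not taken from the paper. It assumes a 2D unit circle whose center is shifted by a small amount, and it compares the first-order update v - n * delta_s with the true closest point on the shifted circle. The function names and the choice of a circle are illustrative assumptions.

```python
import math

def sdf(p, c, r=1.0):
    """Signed distance from point p to a circle with center c and radius r."""
    return math.hypot(p[0] - c[0], p[1] - c[1]) - r

def max_gap(eps, n=64):
    """Largest distance between the first-order prediction and the true
    closest point, over n samples, when the center moves by (eps, 0)."""
    c_old, c_new = (0.0, 0.0), (eps, 0.0)
    worst = 0.0
    for k in range(n):
        t = 2.0 * math.pi * k / n
        v = (math.cos(t), math.sin(t))        # sample on the old surface
        normal = v                            # SDF gradient of the unit circle at v
        ds = sdf(v, c_new) - sdf(v, c_old)    # change of the field at v
        # First-order prediction: move along the normal by -ds.
        pred = (v[0] - ds * normal[0], v[1] - ds * normal[1])
        # True closest point to v on the shifted circle.
        d = math.hypot(v[0] - c_new[0], v[1] - c_new[1])
        closest = (c_new[0] + (v[0] - c_new[0]) / d,
                   c_new[1] + (v[1] - c_new[1]) / d)
        gap = math.hypot(pred[0] - closest[0], pred[1] - closest[1])
        worst = max(worst, gap)
    return worst

gap_small = max_gap(1e-3)
gap_large = max_gap(1e-2)
print(gap_small, gap_large, gap_large / gap_small)
```

In this toy setting the gap shrinks roughly quadratically with the size of the shift, so the normal-direction update and the closest-point definition agree to first order. Whether the vertices re-extracted by MC in the next iteration follow the same correspondence is a separate matter that this check does not address.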

Correctness: The claims, method, and empirical methodology are correct.

Clarity: Yes

Relation to Prior Work: Yes

Reproducibility: Yes

Additional Feedback:

Review 2

Summary and Contributions: The authors address the task of deriving a surface mesh representation from a deep signed distance function with a differentiable iso-surface extraction approach. In contrast to Marching Cubes differentiation (where the position of a vertex along an edge is determined via linear interpolation and, hence, effective backpropagation is not possible for topology changes), iso-surface differentiation relies on a continuous model that describes how perturbations of the signed distance function locally influence the surface geometry. This yields a differentiable method for computing surface vertices from the signed distance field. In detail, the forward pass evaluates the signed distance field on a grid and uses Marching Cubes to extract the iso-surface. The backward pass backpropagates the gradients via the chain rule (where a forward pass is used to get normals). The method's potential is demonstrated through quantitative and qualitative results; however, some more results would strengthen the paper.
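
The forward/backward split described above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the "network" is an analytic sphere SDF s(x; r) = |x| - r whose radius r plays the role of the latent code, surface samples come from a Fibonacci lattice instead of Marching Cubes, and the backward pass uses the rule that a surface point moves along its normal by minus the change of the field (dv/ds = -n). All function names are hypothetical.

```python
import math

def sample_surface(r, n=200):
    """Forward pass stand-in: n points on the zero level set of |x| - r.
    (A real pipeline would run Marching Cubes on a grid of SDF values.)"""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        y = 1.0 - 2.0 * (i + 0.5) / n
        ring = math.sqrt(1.0 - y * y)
        phi = golden * i
        pts.append((r * ring * math.cos(phi), r * y, r * ring * math.sin(phi)))
    return pts

def loss(pts):
    """Toy mesh loss: sum of squared vertex norms."""
    return sum(x * x + y * y + z * z for x, y, z in pts)

def grad_wrt_radius(pts):
    """Backward pass: dL/dr = sum_v -(dL/dv . n(v)) * ds/dr(v)."""
    total = 0.0
    for v in pts:
        norm = math.sqrt(sum(c * c for c in v))
        normal = tuple(c / norm for c in v)   # gradient of the SDF at v
        dl_dv = tuple(2.0 * c for c in v)     # gradient of the toy loss
        ds_dr = -1.0                          # d/dr of (|x| - r)
        total += -sum(a * b for a, b in zip(dl_dv, normal)) * ds_dr
    return total

r = 0.7
pts = sample_surface(r)
analytic = grad_wrt_radius(pts)

# Finite-difference check: re-extract the surface at perturbed radii.
eps = 1e-5
numeric = (loss(sample_surface(r + eps)) - loss(sample_surface(r - eps))) / (2.0 * eps)
print(analytic, numeric)
```

Both values should agree with 2 * r * N for N samples, since the toy loss equals N * r^2. The point of the sketch is only that the backward pass never differentiates through the sampling routine itself.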

Strengths: Evaluation: - The authors demonstrate that their approach can handle differentiable topology changes, which would not be possible with Marching Cubes. - The authors show the potential of their approach for single-view reconstruction based on quantitative and qualitative evaluations, including a comparison to other methods. - In addition, they provide a practical application scenario for optimizing shapes in the scope of aerodynamic design. Exposition: The paper is well written and easy to follow. Figures and captions are informative. In addition, the approach is reasonable. Reproducibility: The paper seems reproducible from the details it provides. In addition, the authors provide a detailed supplement with proofs regarding the non-differentiability of Marching Cubes, the acceleration of iso-surface extraction, as well as further details on the approach and its use in the different application scenarios.

Weaknesses: Evaluation: - A direct comparison to the results of Deep Marching Cubes, i.e. the differentiable Marching Cubes approach, would be interesting. As the authors also discuss in Section 2, problems regarding resolution are to be expected, but that approach would also handle topological changes. - Figure 4 and Table 2 only demonstrate the performance for the chair category. The visual evaluation would be strengthened by providing results for different types of objects. - A detailed discussion of failure cases and limitations is missing. - It is not clear whether the change to a completely different car model/type is always the desirable solution (Figure 5). Would it be possible to constrain the underlying car type more strongly? - A comparison of the computational burden and training times would be interesting for practical applications. Exposition: - The value of the regularization strength lambda_reg in equation 1 does not seem to be discussed. What value was chosen, and how sensitive is the approach to this parameter choice? Typos: line 70/194: wrt. -> w.r.t.; line 127: represent -> represents; Table 1: AtlasNett -> AtlasNet; line 221: "so that to encourage"; line 260: Fig. 5c) -> there is no c) in Figure 5

Correctness: The paper seems correct and the method is reasonable as also verified in the results.

Clarity: The paper is well written and easy to follow. The approach is well motivated, and figures and captions are informative. Furthermore, the information in the paper allows the technique to be reproduced.

Relation to Prior Work: The relation to the related Marching Cubes is sufficiently discussed, but the differentiable version Deep Marching Cubes seems to represent a relevant method to be considered for the comparison.

Reproducibility: Yes

Additional Feedback: See comments above. Post-rebuttal comments: After reading the rebuttal and the other reviews, I am increasing my rating to accept. I would like to see the following revisions: - inclusion of the comparison to Deep Marching Cubes, as the increase in resolution is a clear benefit - inclusion of a comparison to differentiable rendering - discussion of limitations and failure cases in the main paper - at least a reference in the main paper to the comparison of computational burden and training times in the supplemental - an example that better demonstrates the learning of an explicit mesh representation with variable topology, which would strengthen the evaluation (here, I agree with R3)

Review 3

Summary and Contributions: The paper proposes a way to differentiate through a (non-differentiable) marching cubes step when training deep implicit surfaces (e.g., DeepSDF, etc.). This makes it possible to optimize a loss function on the explicit mesh representation (as extracted using marching cubes) for a deep implicit surface, avoiding the usual topological constraints of mesh-based learning methods. The authors demonstrate their method experimentally on single view reconstruction using a differentiable mesh renderer and CFD-based optimization.

Strengths: State-of-the-art 3D deep learning methods typically must navigate the tradeoff between interpretable explicit parameterization (e.g., meshes) and variable topological structure (e.g., implicit surfaces). Typically, trying to combine the two induces a difficult combinatorial problem. The authors propose a novel way to circumvent this issue, taking advantage of the spatial gradients of a learned implicit field to allow the computation of "explicit" objective functions while learning an implicit representation. The experiments show convincing applications for why this is a useful capability, and, overall, the paper makes progress towards variable topology 3D learning.

Weaknesses: The theoretical derivative of a mesh vertex with respect to the implicit field is only valid under the assumption that the vertex lies precisely on the zero level set of the implicit surface. Because of the discretization in marching cubes, this is not the case in practice. It would be helpful to include some analysis, empirical and theoretical, of the stability of these gradients and of the overall method with respect to the discretization. Relatedly, what grid resolution is used for marching cubes in the experiments, and how is this value determined? While this method enables the computation of mesh-based objectives without fixing surface topology, the representation that is ultimately learned is still an implicit function. For applications where mesh quality and resolution are important, there is no way to learn parameterizations that are better suited. Furthermore, it seems that in some cases, the use of loss functions on the explicit geometry obtained through marching cubes, rather than on the implicit surface, would actually negatively impact reconstruction due to the discretization. Is there a reasonable way to combine the explicit losses with some of the described implicit losses (used in DeepSDF, DISN, etc.)? It is also a bit surprising that without fine-tuning ("MeshSDF Raw"), the proposed method consistently underperforms the deep implicit function baseline (DISN). What is the reason for this? It would be interesting to see to what extent this is due to the inability to capture certain details of the predicted surfaces because of the marching cubes discretization.
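
The kind of empirical analysis asked for here could start from something as small as the following toy measurement, which is not from the paper. It assumes an exact circle SDF and a single grid line that misses the circle's center, places a vertex by the linear interpolation that marching cubes uses along an edge, and reports how far that vertex is from the true zero level set at two grid spacings. The function names and numbers are illustrative.

```python
import math

def sdf(x, y=0.5, r=1.0):
    """Exact SDF of a circle of radius r, evaluated on the grid line at height y."""
    return math.hypot(x, y) - r

def interpolated_crossing(h, x_max=2.0):
    """Find the first edge [x0, x0 + h] where the SDF changes sign and
    place the vertex by linear interpolation of the two node values."""
    steps = int(round(x_max / h))
    for i in range(steps):
        x0, x1 = i * h, (i + 1) * h
        s0, s1 = sdf(x0), sdf(x1)
        if s0 < 0.0 <= s1:
            return x0 + h * s0 / (s0 - s1)
    raise ValueError("no sign change found on this grid line")

# Residual |s(v)| of the interpolated vertex at a coarse and a fine spacing.
res_coarse = abs(sdf(interpolated_crossing(0.2)))
res_fine = abs(sdf(interpolated_crossing(0.05)))
print(res_coarse, res_fine)
```

In this example the residual is nonzero at both spacings and drops by more than the factor of 4 by which the spacing was reduced, which is consistent with a second-order interpolation error. How such residuals propagate into the gradients for a learned, only approximately signed-distance field is the open question raised above.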

Correctness: Besides some of the issues described under "Weaknesses" above, the theoretical and experimental components of the paper are well-designed.

Clarity: Overall the paper is clearly written and easy to follow. In equation 5, how are samples s \in S and t \in T for meshes S and T computed? Are they sampled randomly at each iteration? "AtlasNet" is misspelled as "AtlasNett" in a few places.

Relation to Prior Work: Relevant methods are discussed and contextualized, and there is sufficient comparison to prior work.

Reproducibility: Yes

Additional Feedback: Post-rebuttal update: Thank you to the authors for the rebuttal. After reading the rebuttal and the other reviews, I'm keeping my original score. I agree that this is a nice contribution and a step towards truly variable topology in 3D learning. I would love to see a bit more evidence that this method allows the use of explicit shape information during training, i.e., loss terms on the explicit geometry.

Review 4

Summary and Contributions: The paper introduces a new approach to extracting 3D surface meshes from Deep Signed Distance Functions while preserving end-to-end differentiability. To tackle the non-differentiability of traditional iso-surface extraction algorithms, e.g. Marching Cubes, the paper derives a closed-form expression for the derivative of a surface sample with respect to the underlying implicit field. With this theoretically well-grounded technique for differentiating through iso-surface extraction, it is possible to train a network end-to-end with a loss over the mesh, generating surface meshes of arbitrary topology and unlimited resolution.
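
For reference, the closed-form expression mentioned above can be sketched with the standard first-order argument. This is a reconstruction under assumptions rather than a quotation from the paper: s is taken to be a signed distance field with unit-norm gradient, n(v) = \nabla s(v) is the surface normal at a sample v, z is the latent code, L is a loss defined on the mesh vertices, and the displacement is restricted to the normal direction.

```latex
% v lies on the current surface, v + \Delta v on the perturbed one:
s(v) = 0, \qquad (s + \Delta s)(v + \Delta v) = 0, \qquad \Delta v = \alpha\, n(v)

% First-order expansion, using \nabla s(v) \cdot n(v) = 1:
\nabla s(v) \cdot \alpha\, n(v) + \Delta s(v) \approx 0
\;\Longrightarrow\; \alpha = -\Delta s(v)

% Derivative of the surface sample with respect to the field value:
\frac{\partial v}{\partial s}(v) = -n(v)

% Chain rule for a loss L on the mesh vertices:
\frac{\partial L}{\partial z}
  = \sum_{v} -\left( \frac{\partial L}{\partial v} \cdot n(v) \right)
    \frac{\partial s}{\partial z}(v)
```

The argument needs only field values and normals at the sampled vertices, not the internals of the extraction algorithm.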

Strengths: The technical contributions are sound. The evaluations and comparisons are thorough.

Weaknesses: Experiments No. 1 and No. 3 show that MeshSDF allows for differentiable topology changes and is applicable to shape optimization with explicit surface modeling. However, experiment No. 2, single view reconstruction, is not sufficient. While all competing methods directly predict the reconstructed mesh, this experiment only uses MeshSDF as a post-processing step to refine the reconstruction results. Another experiment is needed: starting from a continuous implicit-field-based single-view reconstruction method (e.g. the MeshSDF(raw) mentioned in the paper), various differentiable SDF surface extraction methods (such as the differentiable rendering methods [1,2,3,4]) should be compared on refining the reconstruction results with a 2D silhouette loss. If MeshSDF does not obtain clearly better results in such an experiment, the claim that MeshSDF is better at harnessing the power of deep implicit surface representations is not convincing. [1] DIST: Rendering Deep Implicit Signed Distance Function with Differentiable Sphere Tracing [2] Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision [3] SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization [4] Learning to Infer Implicit Surfaces without 3D Supervision

Correctness: Correct.

Clarity: Very clear.

Relation to Prior Work: Yes, the DeepSDF and Deep Marching Cubes works are well discussed in the paper.

Reproducibility: Yes

Additional Feedback: