Reviews: Is Deeper Better only when Shallow is Good?

This paper investigates the effect of depth on expressivity and learnability when the data distribution is generated by an iterated function system. In particular, the authors show that shallow networks need a number of neurons exponential in the depth of the fractal distribution to realize it, while deep networks require only a number of neurons that is linear in that depth. The results are interesting and could shed some light on the theoretical understanding of deep learning. The reviewers therefore support this paper, even though it studies a mathematically narrow case whose practical value is not very clear. The impact of the work would be greatly improved if the authors could extend their study to more general cases.
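As a rough illustration of why depth matters here, the sketch below builds a fractal with an iterated function system and counts its pieces at each depth. The middle-thirds Cantor set and the names `MAPS` and `ifs_intervals` are assumptions chosen for illustration and are not taken from the paper. The piece count is only a proxy for the intuition and is not the paper's argument.

```python
# Two contractions defining the middle-thirds Cantor set
# (an assumed example, not the construction used in the paper).
MAPS = [
    lambda x: x / 3.0,              # maps [0, 1] onto [0, 1/3]
    lambda x: x / 3.0 + 2.0 / 3.0,  # maps [0, 1] onto [2/3, 1]
]


def ifs_intervals(depth):
    """Return the intervals obtained after `depth` rounds of the IFS on [0, 1]."""
    intervals = [(0.0, 1.0)]
    for _ in range(depth):
        # Each round applies every map to every current interval.
        intervals = [(f(a), f(b)) for f in MAPS for (a, b) in intervals]
    return intervals


if __name__ == "__main__":
    for depth in range(1, 6):
        # The number of pieces doubles with each round: 2, 4, 8, 16, 32.
        print(depth, len(ifs_intervals(depth)))
```

Each round reuses the same two maps, so describing the set compositionally costs a fixed amount per level, whereas listing its pieces one by one costs 2^depth. This mirrors, informally, the linear versus exponential gap summarized above.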

Paper ID:	3466
Title:	Is Deeper Better only when Shallow is Good?