This paper initially received mixed recommendations: three positive and one negative. The reviewers agree that the paper addresses an important problem in neural network architecture design. The experiments are comprehensive and the results are good. However, one reviewer raised concerns about the experimental justification and gave a weak reject. The additional experiments in the authors' response addressed this concern. In the end, all reviewers agree on acceptance. The AC concurs.