Smaller generalization error derived for deep compared to shallow residual neural networks