We derive risk bounds for fitting deep neural networks to data generated
from the multivariate nonparametric regression model. It is shown that
estimators based on sparsely connected deep neural networks with ReLU
activation function and properly chosen network architecture achieve the
minimax rates of convergence (up to logarithmic factors) under a
general composition assumption on the regression function. The framework
includes many well-studied structural constraints such as (generalized)
additive models. While there is a lot of flexibility in the network
architecture, the tuning parameter is the sparsity of the network.
Specifically, we consider large networks in which the number of potential
parameters is much larger than the sample size. We also discuss some
theoretical results that compare the performance with other methods such
as wavelet and spline-type estimators. This is joint work with K. Eckle
(Leiden).
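
For readers unfamiliar with the setting, the following is a minimal sketch of the standard nonparametric regression setup and the kind of composition assumption the abstract refers to; the symbols $f_0$, $g_i$, $t_i$, $\beta_i$ and $s$ are illustrative notation and are not taken from the abstract itself. We observe $n$ i.i.d. pairs $(X_i, Y_i)$ from
\[
  Y_i = f_0(X_i) + \varepsilon_i, \qquad X_i \in [0,1]^d, \quad \varepsilon_i \sim \mathcal{N}(0,1),
\]
and the composition assumption states that the regression function factors as
\[
  f_0 = g_q \circ g_{q-1} \circ \cdots \circ g_0,
\]
where each component function of $g_i$ depends on at most $t_i$ of its arguments and is $\beta_i$-H\"older smooth. The estimator is then an empirical risk minimizer over deep ReLU networks of the form
\[
  f(x) = W_L \, \sigma_{v_L} W_{L-1} \, \sigma_{v_{L-1}} \cdots W_1 \, \sigma_{v_1} W_0 x,
  \qquad \sigma_v(y) = \max(y - v, 0) \ \text{componentwise},
\]
with the total number of nonzero entries of the weight matrices and shift vectors bounded by a sparsity level $s$, the tuning parameter mentioned above. Under this structure, rates of the form
\[
  \max_{0 \le i \le q} n^{-2\beta_i^*/(2\beta_i^* + t_i)},
  \qquad \beta_i^* = \beta_i \prod_{l=i+1}^{q} \min(\beta_l, 1),
\]
arise (up to logarithmic factors), so the effective dimensions $t_i$ rather than the ambient dimension $d$ drive the convergence rate.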