Given any countable collection of regression procedures (e.g., kernel, spline, wavelet, local polynomial, neural nets), we show that a single adaptive procedure can be constructed to share their advantages to a great extent in terms of global squared L_2 risk. The combined procedure essentially pays a price of order only 1/n for adaptation over the collection. An interesting consequence is that, for a countable collection of classes of regression functions (possibly of completely different characteristics), a minimax-rate adaptive estimator can be constructed that automatically converges at the right rate for each of the classes being considered. A demonstration is given for high-dimensional regression. In that setting, to overcome the well-known curse of dimensionality in accuracy, it is advantageous to seek different ways of characterizing a high-dimensional function (e.g., using neural nets or additive modeling) that reduce the influence of the input dimension found in the traditional theory of approximation (e.g., in terms of series expansion). In general, however, it is difficult to assess which characterization works well for the unknown regression function, so adaptation over different modeling approaches is desired. For example, we show by combining various regression procedures that a single estimator can be constructed to be minimax-rate adaptive over Besov classes of unknown smoothness and interaction order, to converge at rate o(n^{-1/2}) when the regression function has a neural net representation, and at the same time to be consistent over all bounded regression functions.
Keywords: adaptive estimation, combined procedures, minimax rate, nonparametric regression