In this paper, we propose a unified two-phase scheme to accelerate any high-order regularized tensor approximation approach applied to the smooth part of a composite convex optimization model. The proposed scheme has the advantage of requiring no prior knowledge of the Lipschitz constants for the gradient, the Hessian, and/or higher-order derivatives. This is achieved by tuning the algorithm's parameters adaptively as it progresses, a strategy that has been successfully incorporated in high-order nonconvex optimization [C. Cartis, N. I. M. Gould, and Ph. L. Toint, Found. Comput. Math., 18 (2018), pp. 1073-1107; E. G. Birgin et al., Math. Program., 163 (2017), pp. 359-368]. By adopting a similar approximation measure for the subproblem as in [E. G. Birgin et al., Math. Program., 163 (2017), pp. 359-368] for nonconvex optimization, we establish overall iteration complexity bounds for three specific algorithms to obtain an ϵ-optimal solution of composite convex problems. In general, we show that the adaptive high-order method has an iteration bound of O(1/ϵ^{1/(p+1)}) when derivative information up to order p is used in the approximation, matching the iteration complexity in [M. Baes, Estimate Sequence Methods: Extensions and Approximations, Institute for Operations Research, ETH, Zürich, 2009; Y. Nesterov, Math. Program., published online Nov. 21, 2019, https://doi.org/10.1007/s10107-019-01449-1], where the Lipschitz constants are assumed to be known and the subproblems are assumed to be solved exactly. Thus, our results partially address the problem of incorporating adaptive strategies into high-order accelerated methods raised by Nesterov in [Math. Program., published online Nov. 21, 2019, https://doi.org/10.1007/s10107-019-01449-1], although our strategies cannot ensure the convexity of the auxiliary problem, and such adaptive strategies are already popular in high-order nonconvex optimization [C. Cartis, N. I. M. Gould, and Ph. L. Toint, Found. Comput. Math., 18 (2018), pp. 1073-1107; E. G. Birgin et al., Math. Program., 163 (2017), pp. 359-368]. Specifically, we show that the gradient method achieves an iteration complexity on the order of O(1/ϵ^{1/2}), which is known to be best possible (cf. [Y. Nesterov, Lectures on Convex Optimization, 2nd ed., Springer, 2018]), while the adaptive cubic regularization methods with the exact/inexact Hessian matrix achieve an iteration complexity on the order of O(1/ϵ^{1/3}), which matches that of the original accelerated cubic regularization method presented in [Y. Nesterov, Math. Program., 112 (2008), pp. 159-181]. Numerical experiments on a set of regularized logistic regression instances clearly show the acceleration effect of the adaptive Newton's method with cubic regularization.
Received by the editors September 9, 2019; accepted for publication (in revised form) July 2, 2020; published electronically October 8, 2020. https://doi.org/10.1137/19M1286025
Funding: This work was supported in part by NSFC grants 11771269 and 11831002, in part by National Science Foundation grant CMMI-1462408, in part by Shenzhen Research Fund grant KQTD2015033114415450, and in part by the Program for Innovative Research Team of Shanghai University of Finance and Economics.
© 2020 Society for Industrial and Applied Mathematics.
- Adaptive method
- Convex optimization
- Iteration complexity
- Tensor method