Kernel ridge regression plays a central role in various signal processing and machine learning applications. Suitable kernels are often chosen as linear combinations of 'basis kernels' by optimizing criteria under regularization constraints. Although such approaches offer reliable generalization performance, solving the associated min-max optimization problems poses major challenges, especially with big data inputs. After analyzing the key properties of a convex reformulation, the present paper introduces an efficient algorithm based on a generalization of Nesterov's acceleration method, which achieves an order-optimal convergence rate among first-order methods. Closed-form updates are derived for common regularizers. Experiments on real datasets demonstrate considerable speedups over competing algorithms.
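As a minimal sketch of the setting (not the paper's algorithm), the following illustrates kernel ridge regression where the kernel is a fixed convex combination of two RBF basis kernels; all data, bandwidths, and combination weights are assumed for illustration only.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Gaussian (RBF) kernel from pairwise squared distances.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Synthetic data (assumed for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)

# Kernel as a convex combination of two basis kernels; in the paper's
# setting these weights would themselves be optimized.
weights = np.array([0.7, 0.3])
gammas = [0.5, 2.0]
K = sum(w * rbf_kernel(X, X, g) for w, g in zip(weights, gammas))

# Closed-form kernel ridge regression solution: (K + lam*I) alpha = y.
lam = 0.1
alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)

y_hat = K @ alpha
mse = float(np.mean((y_hat - y) ** 2))
print(f"training MSE: {mse:.4f}")
```

In multiple-kernel-learning formulations such as the one the abstract describes, the inner minimization over `alpha` has this closed form, while an outer maximization (or minimization, after convex reformulation) tunes the combination weights.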