Analysis of an approximate gradient projection method with applications to the backpropagation algorithm

Zhi Quan Luo, Paul Tseng

Research output: Contribution to journal › Article › peer-review

46 Scopus citations

Abstract

We analyze the convergence of an approximate gradient projection method for minimizing the sum of continuously differentiable functions over a nonempty closed convex set. In this method, the functions are aggregated and, at each iteration, a succession of gradient steps, one for each of the aggregate functions, is applied and the result is projected onto the convex set. We show that if the gradients of the functions are bounded and Lipschitz continuous over a certain level set and the stepsizes are chosen to be proportional to a certain residual squared or to be square summable, then every cluster point of the iterates is a stationary point. We apply these results to the backpropagation algorithm to obtain new deterministic convergence results for this algorithm. We also discuss the issues of parallel implementation and give a simple criterion for choosing the aggregation.
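The iteration described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the component functions are simple quadratics f_i(x) = 0.5 * ||x - c_i||^2, the convex set is the box [0, 1]^n, the aggregation is a fixed grouping into two blocks, and the stepsize rule alpha_k = c / (k + 1) is just one square-summable choice. The residual-based stepsize rule mentioned in the abstract is not shown. In a backpropagation setting, each f_i would instead be the error on one training example.

```python
def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (a closed convex set)."""
    return [min(max(xi, lo), hi) for xi in x]


def aggregate_gradient(z, centers):
    """Gradient of one aggregate g(z) = sum over c of 0.5 * ||z - c||^2."""
    return [sum(zi - c[i] for c in centers) for i, zi in enumerate(z)]


def approx_gradient_projection(groups, x0, num_iters=2000, c=0.5):
    """One gradient step per aggregate function, then a single projection.

    groups    -- hypothetical aggregation: a list of blocks of centers c_i
    x0        -- starting point
    num_iters -- number of outer iterations
    c         -- stepsize constant (illustrative value)
    """
    x = list(x0)
    for k in range(num_iters):
        alpha = c / (k + 1)  # assumed square-summable stepsize rule
        z = list(x)
        # Succession of gradient steps, one for each aggregate function.
        for centers in groups:
            g = aggregate_gradient(z, centers)
            z = [zi - alpha * gi for zi, gi in zip(z, g)]
        # Project the result of the whole pass back onto the convex set.
        x = project_box(z)
    return x


# Four component functions aggregated into two blocks of two.
groups = [
    [(2.0, 0.5), (3.0, 0.3)],
    [(1.0, 0.1), (2.0, 0.3)],
]
x_final = approx_gradient_projection(groups, x0=[0.0, 0.0])
print(x_final)
```

For this toy problem the unconstrained minimizer is the mean of the centers, (2.0, 0.3), so the constrained stationary point is (1.0, 0.3). The iterates should approach it as the stepsizes shrink. Because the gradient of each aggregate is evaluated at the intermediate point z rather than at x, the pass is only an approximation of a true gradient projection step, which is why the stepsize must decrease.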

Original language: English (US)
Pages (from-to): 85-101
Number of pages: 17
Journal: Optimization Methods and Software
Volume: 4
Issue number: 2
State: Published - Jan 1 1994

Bibliographical note

Funding Information:
The research of the first author is supported by the Natural Sciences and Engineering Research Council of Canada, Grant No. OPG0090391, and the research of the second author is supported by the National Science Foundation, Grant No. CCR-9103804.

Keywords

  • Backpropagation
  • Gradient projection
  • Neural networks
