A linearly convergent doubly stochastic Gauss–Seidel algorithm for solving linear equations and a certain class of over-parameterized optimization problems

Meisam Razaviyayn, Mingyi Hong, Navid Reyhanian, Zhi-Quan Luo

Research output: Contribution to journal › Article

Abstract

Consider the classical problem of solving a general linear system of equations Ax = b. It is well known that the (successively over-relaxed) Gauss–Seidel scheme and many of its variants may not converge when A is neither diagonally dominant nor symmetric positive definite. Can we have a linearly convergent G–S-type algorithm that works for any A? In this paper we answer this question affirmatively by proposing a doubly stochastic G–S algorithm that is provably linearly convergent (in the mean-square-error sense) for any feasible linear system of equations. The key to the algorithm design is a nonuniform doubly stochastic scheme for picking the equation and the variable in each update step, together with a stepsize rule. These techniques also generalize to certain iterative alternating projection algorithms for solving the linear feasibility problem Ax ≤ b with an arbitrary A, as well as to high-dimensional minimization problems for training over-parameterized models in machine learning. Our results demonstrate that a carefully designed randomization scheme can make an otherwise divergent G–S algorithm converge.
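
The abstract specifies neither the exact sampling probabilities nor the stepsize rule, so the following is a minimal sketch of one plausible instantiation rather than the paper's method: sample the pair (i, j) with probability proportional to A_ij^2 and take an importance-weighted step on equation i in variable j, which makes each update an unbiased stochastic-gradient step on f(x) = (1/2)||Ax - b||^2. The function name, the sampling rule, and the constant stepsize gamma are all illustrative assumptions.

import numpy as np

def doubly_stochastic_gs(A, b, gamma=0.5, iters=2000, seed=0):
    """One plausible doubly stochastic G-S-type iteration for Ax = b.

    Hypothetical instantiation: the pair (i, j) is sampled with probability
    proportional to A[i, j]**2, and the step below is an importance-weighted
    stochastic gradient step on f(x) = 0.5 * ||Ax - b||^2. The paper's exact
    sampling probabilities and stepsize rule may differ.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    p = (A ** 2).ravel()
    p = p / p.sum()                 # nonuniform distribution over pairs (i, j)
    x = np.zeros(n)
    for _ in range(iters):
        k = rng.choice(m * n, p=p)  # pick equation i and variable j jointly
        i, j = divmod(k, n)
        r_i = A[i] @ x - b[i]       # residual of the sampled equation
        # Zero entries have zero sampling probability, so A[i, j] != 0 here.
        # In expectation this step equals -(gamma / ||A||_F^2) * grad f(x)[j].
        x[j] -= gamma * r_i / A[i, j]
    return x

# A 2 x 2 system on which the classical cyclic G-S sweep diverges
# (the error in x2 is multiplied by 4 in magnitude per sweep):
A = np.array([[1.0, 2.0],
              [-2.0, 1.0]])
b = np.array([3.0, -1.0])           # exact solution: x = (1, 1)
x = doubly_stochastic_gs(A, b)
print(np.linalg.norm(A @ x - b))    # residual shrinks with more iterations

This A is neither diagonally dominant nor symmetric positive definite, and the deterministic sweep diverges on it, whereas the randomized scheme contracts the expected squared error at a constant rate for small enough gamma; that is the qualitative behavior the abstract describes.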

Original language: English (US)
Journal: Mathematical Programming
DOI: 10.1007/s10107-019-01404-0
State: Published - Jan 1 2019

Keywords

  • Gauss–Seidel algorithm
  • Linear systems of equations
  • Nonuniform block coordinate descent algorithm
  • Over-parameterized optimization

Cite this

@article{bee0ca1bb9d4447cb693c6803048f102,
title = "A linearly convergent doubly stochastic Gauss–Seidel algorithm for solving linear equations and a certain class of over-parameterized optimization problems",
abstract = "Consider the classical problem of solving a general linear system of equations Ax = b. It is well known that the (successively over-relaxed) Gauss–Seidel scheme and many of its variants may not converge when A is neither diagonally dominant nor symmetric positive definite. Can we have a linearly convergent G–S-type algorithm that works for any A? In this paper we answer this question affirmatively by proposing a doubly stochastic G–S algorithm that is provably linearly convergent (in the mean-square-error sense) for any feasible linear system of equations. The key to the algorithm design is a nonuniform doubly stochastic scheme for picking the equation and the variable in each update step, together with a stepsize rule. These techniques also generalize to certain iterative alternating projection algorithms for solving the linear feasibility problem Ax ≤ b with an arbitrary A, as well as to high-dimensional minimization problems for training over-parameterized models in machine learning. Our results demonstrate that a carefully designed randomization scheme can make an otherwise divergent G–S algorithm converge.",
keywords = "Gauss–Seidel algorithm, Linear systems of equations, Nonuniform block coordinate descent algorithm, Over-parameterized optimization",
author = "Meisam Razaviyayn and Mingyi Hong and Navid Reyhanian and Zhi-Quan Luo",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/s10107-019-01404-0",
language = "English (US)",
journal = "Mathematical Programming",
issn = "0025-5610",
publisher = "Springer-Verlag GmbH and Co. KG",

}
