
Technical Report
TR-2004-028
Numerical Solution of Saddle Point Problems
by
Michele Benzi, Gene H. Golub, Jörg Liesen
Mathematics and Computer Science
EMORY UNIVERSITY

NUMERICAL SOLUTION OF SADDLE POINT PROBLEMS
MICHELE BENZI, GENE H. GOLUB, AND JÖRG LIESEN
We dedicate this paper to Gil Strang on the occasion of his 70th birthday
Abstract. Large linear systems of saddle point type arise in a wide variety of applications throughout computational science and engineering. Due to their indefiniteness and often poor spectral properties, such linear systems represent a significant challenge for solver developers. In recent years there has been a surge of interest in saddle point problems, and numerous solution techniques have been proposed for this type of system. The aim of this paper is to present and discuss a large selection of solution methods for linear systems in saddle point form, with an emphasis on iterative methods for large and sparse problems.
CONTENTS
1 Introduction
2 Applications leading to saddle point problems
3 Properties of saddle point matrices
4 Overview of solution algorithms
5 Schur complement reduction
6 Null space methods
7 Coupled direct solvers
8 Stationary iterations
9 Krylov subspace methods
10 Preconditioners
11 Multilevel methods
12 Available software
13 Concluding remarks
References
1. Introduction. In recent years, a large amount of work has been devoted to the problem of solving large linear systems in saddle point form. The reason for this interest is that such problems arise in a wide variety of technical and scientific applications. For example, the ever increasing popularity of mixed finite element methods in engineering fields such as fluid and solid mechanics has been a major source of saddle point systems [79, 170]. Another reason for this surge in interest is the extraordinary success of interior point algorithms in both linear and nonlinear optimization, which require at their heart the solution of a sequence of systems in saddle point form [371, 506, 507].
Because of the ubiquitous nature of saddle point systems, methods and results on their numerical
solution have appeared in a wide variety of books, journals and conference proceedings, justifying
This draft dated 13 December 2004. To appear in Acta Numerica 2005.
Department of Mathematics and Computer Science, Emory University, Atlanta, Georgia 30322, USA (benzi@mathcs.emory.edu). The work of this author was supported in part by the National Science Foundation grant DMS-0207599.
Scientific Computing and Computational Mathematics Program, Stanford University, Stanford, California 94305-9025, USA (golub@sccm.stanford.edu).
Institut für Mathematik, Technische Universität Berlin, D-10623 Berlin, Germany (liesen@math.tu-berlin.de).
the need for a comprehensive survey of the subject. The purpose of this article is to review many of the most promising solution methods, with an emphasis on iterative methods for large and sparse problems. Although many of these solvers have been developed with specific applications in mind (for example, Stokes-type problems in fluid dynamics), it is possible to discuss them in a fairly general setting using standard numerical linear algebra concepts, the most prominent being perhaps the Schur complement. Nevertheless, when choosing a preconditioner (or developing a new one), knowledge of the origin of the particular problem at hand is essential. We therefore devote some space to a discussion of saddle point problems arising in a few selected applications.
It is hoped that the present survey will prove useful to practitioners who are looking for guidance
in the choice of a solution method for their own application, to researchers in numerical linear algebra
and scientific computing, and especially to graduate students as an introduction to this very rich
and important subject.
1.1. Problem statement and classification. The subject of this paper is the solution of block 2 × 2 linear systems of the form

\[
\begin{bmatrix} A & B_1^T \\ B_2 & -C \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix},
\quad \text{or} \quad \mathcal{A}\,u = b, \tag{1.1}
\]
\[
A \in \mathbb{R}^{n \times n}, \quad B_1,\, B_2 \in \mathbb{R}^{m \times n}, \quad C \in \mathbb{R}^{m \times m}, \quad \text{with } n \ge m. \tag{1.2}
\]
It is obvious that, under suitable partitioning, any linear system can be cast in the form (1.1)–(1.2). We explicitly exclude the case where A or one or both of B_1, B_2 are zero. When the linear system describes a (generalized) saddle point problem, the constituent blocks A, B_1, B_2 and C satisfy one or more of the following conditions:
C1 A is symmetric: A = A^T
C2 The symmetric part of A, H ≡ (1/2)(A + A^T), is positive semidefinite
C3 B_1 = B_2 = B
C4 C is symmetric (C = C^T) and positive semidefinite
C5 C = O (the zero matrix)
Note that C5 implies C4. The most basic case is obtained when all the above conditions are satisfied. In this case A is symmetric positive semidefinite and we have a symmetric linear system of the form

\[
\begin{bmatrix} A & B^T \\ B & O \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix}. \tag{1.3}
\]
This system arises as the first order optimality conditions for the following equality-constrained quadratic programming problem:

\[
\min\; J(x) = \tfrac{1}{2}\, x^T A x - f^T x \tag{1.4}
\]
\[
\text{subject to} \quad Bx = g. \tag{1.5}
\]
In this case the variable y represents the vector of Lagrange multipliers. Any solution (x^*, y^*) of (1.3) is a saddle point for the Lagrangian

\[
L(x, y) = \tfrac{1}{2}\, x^T A x - f^T x + (Bx - g)^T y,
\]

hence the name "saddle point problem" given to (1.3). Recall that a saddle point is a point (x^*, y^*) ∈ R^{n+m} that satisfies

\[
L(x^*, y) \le L(x^*, y^*) \le L(x, y^*) \qquad \forall\, x \in \mathbb{R}^n,\; \forall\, y \in \mathbb{R}^m,
\]

or, equivalently,

\[
\min_x \max_y L(x, y) = L(x^*, y^*) = \max_y \min_x L(x, y).
\]
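Spelling out the first order optimality computation behind (1.3) (a standard step, made explicit here for convenience): setting the partial gradients of the Lagrangian to zero gives

\[
\nabla_x L(x, y) = Ax - f + B^T y = 0, \qquad
\nabla_y L(x, y) = Bx - g = 0,
\]

which are precisely the first and second block rows of (1.3).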
Systems of the form (1.3) also arise in nonlinearly constrained optimization (sequential quadratic programming and interior point methods), in fluid dynamics (Stokes problem), incompressible elasticity, circuit analysis, structural analysis, and so forth; see the next section for a discussion of applications leading to saddle point problems.
Another important special case is when conditions C1–C4 are satisfied, but not C5. In this case we have a block linear system of the form

\[
\begin{bmatrix} A & B^T \\ B & -C \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix}. \tag{1.6}
\]

Problems of this kind frequently arise in the context of stabilized mixed finite element methods. Stabilization is used whenever the discrete variables x and y belong to finite element spaces that do not satisfy the Ladyzhenskaya–Babuška–Brezzi (or inf-sup) condition [79]. Another situation leading to a nonzero C is the discretization of the equations describing slightly compressible fluids or solids [69, Chapter 6.3]. Systems of the form (1.6) also arise from regularized, weighted least-squares problems [49] and from certain interior point methods in optimization [506, 507]. Often the matrix C has small norm compared to the other blocks.
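Since the Schur complement plays a central role in what follows, it may help to record here the block elimination that produces it (a standard identity; the sign convention matches the Schur complement S = -(B A^{-1} B^T + C) that appears later in the paper). When A is nonsingular, solving the first block row of (1.6) for x and substituting into the second gives

\[
x = A^{-1}(f - B^T y), \qquad -(B A^{-1} B^T + C)\, y = g - B A^{-1} f,
\]

so that the coupled system reduces to one solve with S = -(B A^{-1} B^T + C) and two solves with A.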
In the literature, the phrase generalized saddle point problem has been used primarily to allow for the possibility of a nonsymmetric coefficient matrix A in (1.1). In such problems either A ≠ A^T (with condition C2 usually satisfied), or B_1 ≠ B_2, or both. The most important example is perhaps that of the linearized Navier–Stokes equations, where linearization has been obtained by Picard iteration or by some variant of Newton's method. See [111, 370] and [456] for additional examples. We note that our definition of generalized saddle point problem as a linear system of the form (1.1)–(1.2) where the blocks A, B_1, B_2 and C satisfy one or more of the conditions C1–C5 is the most general possible, and it contains previous definitions as special cases.
In the vast majority of cases, linear systems of saddle point type have real coefficients, and in
this paper we restrict ourselves to the real case. Complex coefficient matrices, however, do arise in
some cases; see, e.g., [61, 345] and [449, page 117]. Most of the results and algorithms reviewed in
this paper admit straightforward extensions to the complex case.
1.2. Sparsity, structure and size. Although saddle point systems come in all sizes and
with widely different structural and sparsity properties, in this paper we are mainly interested in
problems that are both large and sparse. This justifies our emphasis on iterative solvers. Direct
solvers, however, are still the preferred method in optimization and other areas. Furthermore, direct
methods are often used in the solution of subproblems, for example as part of a preconditioner solve.
Some of the algorithms considered in this paper are also applicable if one or more of the blocks in A
happen to be dense, as long as matrix-vector products with A can be performed efficiently, typically
in O(n + m) time. This means that if a dense block is present, it must have a special structure (e.g.,
Toeplitz, as in [49, 285]) or it must be possible to approximate its action on a vector with (nearly)
linear complexity, as in the fast multipole method [345].
Frequently, the matrices that arise in practice have quite a bit of structure. For instance, the A block is often block diagonal, with each diagonal block endowed with additional structure. Many of the algorithms discussed in this paper are able to exploit the structure of the problem to gain efficiency and save on storage. Sometimes the structure of the problem suggests solution algorithms that have a high degree of parallelism. This last aspect, however, is not emphasized in this paper. Finally we mention that in most applications n is larger than m, often much larger.
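As a small illustration of the matrix-free point of view mentioned above (our own sketch, not code from the paper; the block sizes, the random test blocks, and the use of SciPy's LinearOperator and minres are all illustrative choices), the following applies MINRES to a system of the form (1.3) while accessing the coefficient matrix only through block matrix-vector products:

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import LinearOperator, minres

# Hypothetical toy sizes; real applications have n, m far larger.
n, m = 200, 50
rng = np.random.default_rng(0)

# A: symmetric positive definite (1,1) block; B: full-rank constraint block.
A = sp.diags(rng.uniform(1.0, 2.0, size=n))
B = (sp.eye(m, n) + sp.random(m, n, density=0.05, random_state=0)).tocsr()

def matvec(u):
    # Apply [[A, B^T], [B, O]] to u = [x; y] without forming the matrix;
    # the cost is O(n + m) plus the sparse products with A and B.
    x, y = u[:n], u[n:]
    return np.concatenate([A @ x + B.T @ y, B @ x])

K = LinearOperator((n + m, n + m), matvec=matvec, dtype=float)
b = np.concatenate([np.ones(n), np.zeros(m)])  # right-hand side [f; g]

# MINRES needs only matvecs and handles symmetric indefinite systems.
u, info = minres(K, b)
x, y = u[:n], u[n:]
print("converged:", info == 0, " residual norm:", np.linalg.norm(K @ u - b))

Replacing the explicit blocks inside matvec by application-specific operators, for example a fast multipole evaluation for a dense A block as in [345], leaves the solver loop unchanged.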
2. Applications leading to saddle point problems. As already mentioned, large-scale
saddle point problems occur in many areas of computational science and engineering. The following
is a list of some fields where saddle point problems naturally arise, together with some references:
Computational fluid dynamics [213, 407, 459, 469, 499]
Constrained and weighted least squares estimation [59, 222]
Constrained optimization [210, 506, 507]
Economics [18, 143, 320, 456]
Electrical circuits and networks [51, 109, 449, 467]
Electromagnetism [67, 390, 392]
Finance [348, 349]
Image reconstruction [255]
Image registration [248, 362]
Interpolation of scattered data [342, 435]
Linear elasticity [69, 110]
Mesh generation for computer graphics [324]
Mixed finite element approximations of elliptic PDEs [78, 79, 407]
Model order reduction for dynamical systems [194, 263, 453]
Optimal control [36, 37, 56, 57, 369]
Parameter identification problems [86, 246, 247]
Quite often, saddle point systems arise when a certain quantity (such as the energy of a physical system) has to be minimized, subject to a set of linear constraints. In this case the Lagrange multiplier y usually has a physical interpretation and its computation is also of interest. For example, in incompressible flow problems x is a vector of velocities and y a vector of pressures. In structural mechanics x is the vector of internal forces and y represents the nodal displacements of the structure. For resistive electrical networks y represents the nodal potentials, x being the vector of currents.
In some cases, such as fluid dynamics or linear elasticity, saddle point problems result from the discretization of systems of partial differential equations with constraints. Typically the constraints represent some basic conservation law, such as mass conservation in fluid dynamics. In other cases, such as resistive electrical networks or structural analysis, the equations are discrete to begin with. Now the constraints may correspond to the topology (connectivity) of the system being studied. Because saddle point equations can be derived as equilibrium conditions for a physical system, they are sometimes called equilibrium equations. See [449] for a very nice discussion of equilibrium equations throughout applied mathematics. Another popular name for saddle point systems, especially in the optimization literature, is "KKT system," from the Karush–Kuhn–Tucker constraint qualification conditions; see [371, page 328] for precise definitions, and [219, 300] for historical notes.
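To make the network interpretation concrete, here is a small worked illustration (our own, in the spirit of the equilibrium equations discussed in [449], not taken verbatim from the paper). For a resistive network with n edges and m ungrounded nodes, let R = diag(R_1, ..., R_n) collect the edge resistances and let E ∈ R^{m×n} be the reduced node-edge incidence matrix. With x the vector of edge currents and y the vector of nodal potentials, Ohm's law on each edge and Kirchhoff's current law at each node combine into

\[
\begin{bmatrix} R & E^T \\ E & O \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix} f \\ g \end{bmatrix},
\]

where f contains the voltage sources and g the external current injections. This is exactly the form (1.3), with A = R symmetric positive definite and the network topology entering only through B = E.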
Systems of the form (1.1)–(1.2) also arise from non-overlapping domain decomposition when
interface unknowns are numbered last, as well as from FETI-type schemes when Lagrange multipliers
are used to ensure continuity at the interfaces; see for instance [95, 175, 275, 408].
It is of course not possible for us to cover all these different applications here. We choose instead to give some details about three classes of problems leading to saddle point systems. The first comes from the field of computational fluid dynamics, the second from least squares estimation, and the third from interior point methods in constrained optimization.
