
Stable signal recovery from incomplete and inaccurate measurements

TLDR
In this paper, the authors consider the problem of recovering a vector x ∈ R^m from incomplete and contaminated observations y = Ax + e, where e is an error term.
Abstract
Suppose we wish to recover a vector x_0 ∈ R^m (e.g., a digital signal or image) from incomplete and contaminated observations y = Ax_0 + e; A is an n by m matrix with far fewer rows than columns (n ≪ m) and e is an error term. Is it possible to recover x_0 accurately based on the data y? To recover x_0, we consider the solution x♯ to the ℓ1-regularization problem min ‖x‖_{ℓ1} subject to ‖Ax − y‖_{ℓ2} ≤ ε, where ε is the size of the error term e. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the vector x_0 is sufficiently sparse, then the solution is within the noise level: ‖x♯ − x_0‖_{ℓ2} ≤ C · ε. As a first example, suppose that A is a Gaussian random matrix; then stable recovery occurs for almost all such A's provided that the number of nonzeros of x_0 is of about the same order as the number of observations. As a second instance, suppose one observes few Fourier samples of x_0; then stable recovery occurs for almost any set of n coefficients provided that the number of nonzeros is of the order of n/[log m]^6. In the case where the error term vanishes, the recovery is of course exact, and this work actually provides novel insights into the exact recovery phenomenon discussed in earlier papers. The methodology also explains why one can also very nearly recover approximately sparse signals.



arXiv:math/0503066v2 [math.NA] 7 Dec 2005
Stable Signal Recovery from
Incomplete and Inaccurate Measurements
Emmanuel Candès, Justin Romberg, and Terence Tao
Applied and Computational Mathematics, Caltech, Pasadena, CA 91125
Department of Mathematics, University of California, Los Angeles, CA 90095
February, 2005; Revised June 2005
Abstract
Suppose we wish to recover a vector x_0 ∈ R^m (e.g. a digital signal or image) from incomplete and contaminated observations y = Ax_0 + e; A is an n by m matrix with far fewer rows than columns (n ≪ m) and e is an error term. Is it possible to recover x_0 accurately based on the data y?

To recover x_0, we consider the solution x♯ to the ℓ1-regularization problem

min ‖x‖_{ℓ1} subject to ‖Ax − y‖_{ℓ2} ≤ ε,

where ε is the size of the error term e. We show that if A obeys a uniform uncertainty principle (with unit-normed columns) and if the vector x_0 is sufficiently sparse, then the solution is within the noise level

‖x♯ − x_0‖_{ℓ2} ≤ C · ε.

As a first example, suppose that A is a Gaussian random matrix; then stable recovery occurs for almost all such A's provided that the number of nonzeros of x_0 is of about the same order as the number of observations. As a second instance, suppose one observes few Fourier samples of x_0; then stable recovery occurs for almost any set of n coefficients provided that the number of nonzeros is of the order of n/[log m]^6.

In the case where the error term vanishes, the recovery is of course exact, and this work actually provides novel insights on the exact recovery phenomenon discussed in earlier papers. The methodology also explains why one can also very nearly recover approximately sparse signals.
Keywords. ℓ1-minimization, basis pursuit, restricted orthonormality, sparsity, singular values of random matrices.
Acknowledgments. E. C. is partially supported by a National Science Foundation grant DMS 01-40698 (FRG) and by an Alfred P. Sloan Fellowship. J. R. is supported by National Science Foundation grants DMS 01-40698 and ITR ACI-0204932. T. T. is supported in part by grants from the Packard Foundation.
1 Introduction
1.1 Exact recovery of sparse signals
Recent papers [2–5,10] have developed a series of powerful results about the exact recovery of a finite signal x_0 ∈ R^m from a very limited number of observations. As a representative result from this literature, consider the problem of recovering an unknown sparse signal x_0(t) ∈ R^m; that is, a signal x_0 whose support T_0 = {t : x_0(t) ≠ 0} is assumed to have small cardinality. All we know about x_0 are n linear measurements of the form

y_k = ⟨x_0, a_k⟩,  k = 1, . . . , n,   or   y = Ax_0,

where the a_k ∈ R^m are known test signals. Of special interest is the vastly underdetermined case, n ≪ m, where there are many more unknowns than observations. At first glance, this may seem impossible. However, it turns out that one can actually recover x_0 exactly by solving the convex program¹

(P_1)   min ‖x‖_{ℓ1}   subject to   Ax = y,   (1)

provided that the matrix A ∈ R^{n×m} obeys a uniform uncertainty principle.
The uniform uncertainty principle, introduced in [5] and refined in [4], essentially states that the n × m measurement matrix A obeys a "restricted isometry hypothesis." To introduce this notion, let A_T, T ⊂ {1, . . . , m}, be the n × |T| submatrix obtained by extracting the columns of A corresponding to the indices in T. Then [4] defines the S-restricted isometry constant δ_S of A which is the smallest quantity such that

(1 − δ_S) ‖c‖²_{ℓ2} ≤ ‖A_T c‖²_{ℓ2} ≤ (1 + δ_S) ‖c‖²_{ℓ2}   (2)

for all subsets T with |T| ≤ S and coefficient sequences (c_j)_{j∈T}. This property essentially requires that every set of columns with cardinality less than S approximately behaves like an orthonormal system. It was shown (also in [4]) that if S verifies

δ_S + δ_{2S} + δ_{3S} < 1,   (3)

then solving (P_1) recovers any sparse signal x_0 with support size obeying |T_0| ≤ S.
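Computing δ_S exactly is combinatorial, since (2) must hold for every support of size at most S. As a rough numerical probe (not a certificate that condition (3) holds), one can sample random supports and examine the extreme singular values of the corresponding submatrices. The Python sketch below, with illustrative sizes, does exactly that and returns a lower bound on δ_S.

import numpy as np

def probe_delta_S(A, S, trials=500, seed=0):
    # Sample random supports T of size S and track how far the squared singular
    # values of A_T stray from 1; this lower-bounds the constant delta_S in (2).
    rng = np.random.default_rng(seed)
    m = A.shape[1]
    worst = 0.0
    for _ in range(trials):
        T = rng.choice(m, S, replace=False)
        s = np.linalg.svd(A[:, T], compute_uv=False)
        worst = max(worst, abs(s[0] ** 2 - 1.0), abs(s[-1] ** 2 - 1.0))
    return worst

A = np.random.default_rng(1).standard_normal((60, 256)) / np.sqrt(60)
print("lower bound on delta_S for S = 8:", probe_delta_S(A, 8))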
1.2 Stable recovery from imperfect measurements
This paper develops results for the "imperfect" (and far more realistic) scenarios where the measurements are noisy and the signal is not exactly sparse. Everyone would agree that in most practical situations, we cannot assume that Ax_0 is known with arbitrary precision. More appropriately, we will assume instead that one is given "noisy" data y = Ax_0 + e, where e is some unknown perturbation bounded by a known amount ‖e‖_{ℓ2} ≤ ε. To be broadly applicable, our recovery procedure must be stable: small changes in the observations should result in small changes in the recovery. This wish, however, may be quite hopeless. How can we possibly hope to recover our signal when not only the available information is severely incomplete but, in addition, the few available observations are also inaccurate?
¹(P_1) can even be recast as a linear program [6].
Consider nevertheless (as in [12] for example) the convex program searching, among all signals consistent with the data y, for that with minimum ℓ1-norm

(P_2)   min ‖x‖_{ℓ1}   subject to   ‖Ax − y‖_{ℓ2} ≤ ε.   (4)
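Numerically, (P_2) is a second-order cone program and can be handed to standard convex solvers. A minimal sketch using the cvxpy modeling package (a tooling choice of this note, not of the paper; A, y, and ε are assumed given):

import cvxpy as cp

def solve_p2(A, y, eps):
    # min ||x||_1  subject to  ||A x - y||_2 <= eps, as in (P2).
    x = cp.Variable(A.shape[1])
    problem = cp.Problem(cp.Minimize(cp.norm1(x)), [cp.norm(A @ x - y, 2) <= eps])
    problem.solve()
    return x.value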
The first result of this paper shows that, contrary to the belief expressed above, the solution to (P_2) recovers an unknown sparse object with an error at most proportional to the noise level. Our condition for stable recovery again involves the restricted isometry constants.
Theorem 1  Let S be such that δ_{3S} + 3δ_{4S} < 2. Then for any signal x_0 supported on T_0 with |T_0| ≤ S and any perturbation e with ‖e‖_{ℓ2} ≤ ε, the solution x♯ to (P_2) obeys

‖x♯ − x_0‖_{ℓ2} ≤ C_S · ε,   (5)

where the constant C_S may only depend on δ_{4S}. For reasonable values of δ_{4S}, C_S is well behaved; e.g. C_S ≈ 8.82 for δ_{4S} = 1/5 and C_S ≈ 10.47 for δ_{4S} = 1/4.
It is interesting to note that for S obeying the condition of the theorem, the reconstruction
from noiseless data is exact. It is quite possible that for some matrices A, this condition
tolerates larger values of S than (3).
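The flavor of the theorem can be checked numerically. Continuing the earlier sketches (same illustrative names; solve_p2 is the hypothetical helper defined after (P_2) above), the recovery error stays on the order of the noise level ε:

import numpy as np

rng = np.random.default_rng(0)
n, m, S = 60, 256, 8
A = rng.standard_normal((n, m)) / np.sqrt(n)       # columns have norm close to 1
x0 = np.zeros(m)
x0[rng.choice(m, S, replace=False)] = rng.standard_normal(S)
eps = 0.05
e = rng.standard_normal(n)
e *= eps / np.linalg.norm(e)                       # perturbation with ||e||_2 = eps
y = A @ x0 + e
x_sharp = solve_p2(A, y, eps)                      # helper sketched after (P2) above
print("error:", np.linalg.norm(x_sharp - x0), "  noise level:", eps)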
We would like to offer two comments. First, the matrix A is rectangular with many more
columns than rows. As such, most of its singular values are zero. As emphasized earlier,
the fact that the severely ill-posed matrix inversion keeps the perturbation from “blowing
up” is rather remarkable and perhaps unexpected.
Second, no recovery method can perform fundamentally better for arbitrary perturbations of size ε. To see why this is true, suppose one had available an oracle letting us know, in advance, the support T_0 of x_0. With this additional information, the problem is well-posed and one could reconstruct x_0 by the method of Least-Squares, for example,

x̂ = (A*_{T_0} A_{T_0})^{−1} A*_{T_0} y on T_0, and x̂ = 0 elsewhere.

In the absence of any other information, one could easily argue that no method would exhibit a fundamentally better performance. Now of course, x̂ − x_0 = 0 on the complement of T_0 while on T_0

x̂ − x_0 = (A*_{T_0} A_{T_0})^{−1} A*_{T_0} e,

and since by hypothesis, the eigenvalues of A*_{T_0} A_{T_0} are well-behaved²,

‖x̂ − x_0‖_{ℓ2} ≈ ‖A*_{T_0} e‖_{ℓ2} ≈ ε,

at least for perturbations concentrated in the row space of A_{T_0}. In short, obtaining a reconstruction with an error term whose size is guaranteed to be proportional to the noise level is the best one can hope for.
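For concreteness, the oracle benchmark described above can be written in a few lines; this is only an illustration, since the support T_0 is of course unknown in practice and the helper name is ours:

import numpy as np

def oracle_least_squares(A, y, T0):
    # Least-squares fit on the columns indexed by the known support T0, zero elsewhere.
    x_hat = np.zeros(A.shape[1])
    x_hat[T0], *_ = np.linalg.lstsq(A[:, T0], y, rcond=None)
    return x_hat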
Remarkably, not only can we recover sparse input vectors but one can also stably recover
approximately sparse vectors, as we have the following companion theorem.
²Observe the role played by the singular values of A_{T_0} in the analysis of the oracle error.
Theorem 2  Suppose that x_0 is an arbitrary vector in R^m and let x_{0,S} be the truncated vector corresponding to the S largest values of x_0 (in absolute value). Under the hypothesis of Theorem 1, the solution x♯ to (P_2) obeys

‖x♯ − x_0‖_{ℓ2} ≤ C_{1,S} · ε + C_{2,S} · ‖x_0 − x_{0,S}‖_{ℓ1} / √S.   (6)

For reasonable values of δ_{4S} the constants in (6) are well behaved; e.g. C_{1,S} ≈ 12.04 and C_{2,S} ≈ 8.77 for δ_{4S} = 1/5.
Roughly speaking, the theorem says that minimizing ℓ1 stably recovers the S largest entries of an m-dimensional unknown vector x from n measurements only.
We now specialize this result to a commonly discussed model in mathematical signal processing, namely, the class of compressible signals. We say that x_0 is compressible if its entries obey a power law

|x_0|_(k) ≤ C_r · k^{−r},   (7)

where |x_0|_(k) is the kth largest value of x_0 (|x_0|_(1) ≥ |x_0|_(2) ≥ . . . ≥ |x_0|_(m)), r > 1, and C_r is a constant which depends only on r. Such a model is appropriate for the wavelet coefficients of a piecewise smooth signal, for example. If x_0 obeys (7), then

‖x_0 − x_{0,S}‖_{ℓ1} / √S ≤ C'_r · S^{−r+1/2}.
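To see where this tail estimate comes from, bound the sum by an integral: under (7),

‖x_0 − x_{0,S}‖_{ℓ1} = Σ_{k>S} |x_0|_(k) ≤ C_r Σ_{k>S} k^{−r} ≤ (C_r/(r − 1)) · S^{1−r},

and dividing by √S gives the claimed S^{−r+1/2} decay (with this crude bound, one may take C'_r = C_r/(r − 1)).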
Observe now that in this case

‖x_0 − x_{0,S}‖_{ℓ2} ≤ C''_r · S^{−r+1/2},
and for generic elements obeying (7), there are no fundamentally better estimates available. Hence, we see that with n measurements only, we achieve an approximation error which is almost as good as that one would obtain by knowing everything about the signal x_0 and selecting its S largest entries.
As a last remark, we would like to point out that in the noiseless case, Theorem 2 improves
upon an earlier result from Candès and Tao, see also [8]; it is sharper in the sense that 1)
this is a deterministic statement and there is no probability of failure, 2) it is universal in
that it holds for all signals, 3) it gives upper estimates with better bounds and constants,
and 4) it holds for a wider range of values of S.
1.3 Examples
It is of course of interest to know which matrices obey the uniform uncertainty principle with good isometry constants. Using tools from random matrix theory, [3,5,10] give several examples of matrices such that (3) holds for S on the order of n to within log factors. Examples include (proofs and additional discussion can be found in [5]):

Random matrices with i.i.d. entries. Suppose the entries of A are i.i.d. Gaussian with mean zero and variance 1/n; then [5,10,17] show that the condition for Theorem 1 holds with overwhelming probability when

S ≤ C · n/log(m/n).
In fact, [4] gives numerical values for the constant C as a function of the ratio n/m. The same conclusion applies to binary matrices with independent entries taking values ±1/√n with equal probability.
Fourier ensemble. Suppose now that A is obtained by selecting n rows from the m × m discrete Fourier transform and renormalizing the columns so that they are unit-normed. If the rows are selected at random, the condition for Theorem 1 holds with overwhelming probability for S ≤ C · n/(log m)^6 [5]. (For simplicity, we have assumed that A takes on real-valued entries although our theory clearly accommodates complex-valued matrices, so that our discussion holds for both complex and real-valued Fourier transforms.)

This case is of special interest as reconstructing a digital signal or image from incomplete Fourier data is an important inverse problem with applications in biomedical imaging (MRI and tomography), astrophysics (interferometric imaging), and geophysical exploration.
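As an illustration, a randomly subsampled Fourier measurement matrix of the kind described above might be formed as follows (Python/NumPy sketch with illustrative sizes; the complex-valued DFT is used here, which the theory accommodates):

import numpy as np

rng = np.random.default_rng(0)
n, m = 60, 256
F = np.fft.fft(np.eye(m)) / np.sqrt(m)              # unitary m x m DFT matrix
rows = rng.choice(m, n, replace=False)               # n frequencies chosen at random
A = F[rows, :]
A = A / np.linalg.norm(A, axis=0, keepdims=True)     # renormalize to unit-normed columns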
General orthogonal measurement ensembles. Suppose A is obtained by selecting n rows from an m by m orthonormal matrix U and renormalizing the columns so that they are unit-normed. Then [5] shows that if the rows are selected at random, the condition for Theorem 1 holds with overwhelming probability provided

S ≤ C · (1/µ²) · n/(log m)^6,   (8)

where µ := √m max_{i,j} |U_{i,j}|. Observe that for the Fourier matrix, µ = 1, and thus (8) is an extension of the Fourier ensemble.
This fact is of significant practical relevance because in many situations, signals of interest may not be sparse in the time domain but rather may be (approximately) decomposed as a sparse superposition of waveforms in a fixed orthonormal basis Ψ; e.g. in a nice wavelet basis. Suppose that we use as test signals a set of n vectors taken from a second orthonormal basis Φ. We then solve (P_1) in the coefficient domain

(P_1)   min ‖α‖_{ℓ1}   subject to   Aα = y,

where A is obtained by extracting n rows from the orthonormal matrix U = ΦΨ*. The recovery condition then depends on the mutual coherence µ between the measurement basis Φ and the sparsity basis Ψ, which measures the similarity between Φ and Ψ:

µ(Φ, Ψ) = √m max |⟨φ_k, ψ_j⟩|,  φ_k ∈ Φ, ψ_j ∈ Ψ.
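As a small illustration of this quantity, the coherence between the spike basis and an orthonormal DCT basis (an example pair chosen for this sketch, not one discussed in the paper) can be computed directly:

import numpy as np
from scipy.fft import dct

m = 256
Phi = np.eye(m)                              # measurement basis: spikes
Psi = dct(np.eye(m), axis=0, norm="ortho")   # sparsity basis: orthonormal DCT
mu = np.sqrt(m) * np.max(np.abs(Phi.T @ Psi))
print("mutual coherence:", mu)               # close to sqrt(2) for this pair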
1.4 Prior work and innovations
The problem of recovering a sparse vector by minimizing ℓ1 under linear equality constraints has recently received much attention, mostly in the context of Basis Pursuit, where the goal is to uncover sparse signal decompositions in overcomplete dictionaries. We refer the reader to [11,13] and the references therein for a full discussion.

We would especially like to note two works by Donoho, Elad, and Temlyakov [12], and Tropp [18] that also study the recovery of sparse signals from noisy observations by solving (P_2) (and other closely related optimization programs), and give conditions for stable recovery. In [12], the sparsity constraint on the underlying signal x_0 depends on the magnitude of the maximum entry of the Gram matrix M(A) = max_{i,j: i≠j} |(A*A)_{i,j}|.
References

Convex Optimization
TL;DR: A comprehensive introduction to the subject, with an emphasis on recognizing convex optimization problems and then finding the most appropriate technique for solving them.

Compressed sensing
TL;DR: It is possible to design n = O(N log(m)) nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients, and a good approximation to those N important coefficients is extracted from the n measurements by solving a linear program (Basis Pursuit in signal processing).

Nonlinear total variation based noise removal algorithms
TL;DR: A constrained optimization type of numerical algorithm for removing noise from images is presented, in which the total variation of the image is minimized subject to constraints involving the statistics of the noise.

Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information
TL;DR: For the model problem of reconstructing an object from incomplete frequency samples, it is shown that with probability at least 1 − O(N^(−M)), f can be reconstructed exactly as the solution to the ℓ1 minimization problem.

Decoding by linear programming
TL;DR: f can be recovered exactly by solving a simple convex optimization problem (which one can recast as a linear program), and numerical experiments suggest that this recovery procedure works unreasonably well; f is recovered exactly even in situations where a significant fraction of the output is corrupted.