
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 2, FEBRUARY 2006
Robust Uncertainty Principles: Exact Signal Reconstruction From Highly Incomplete Frequency Information
Emmanuel J. Candès, Justin Romberg, Member, IEEE, and Terence Tao
Abstract—This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discrete-time signal $f \in \mathbb{C}^N$ and a randomly chosen set of frequencies $\Omega$. Is it possible to reconstruct $f$ from the partial knowledge of its Fourier coefficients on the set $\Omega$?

A typical result of this paper is as follows. Suppose that $f$ is a superposition of $|T|$ spikes $f(t) = \sum_{\tau \in T} f(\tau)\,\delta(t - \tau)$ obeying
$$|T| \le C_M \cdot (\log N)^{-1} \cdot |\Omega|$$
for some constant $C_M > 0$. We do not know the locations of the spikes nor their amplitudes. Then with probability at least $1 - O(N^{-M})$, $f$ can be reconstructed exactly as the solution to the $\ell_1$ minimization problem
$$\min_g \sum_{t=0}^{N-1} |g(t)| \quad \text{s.t. } \hat g(\omega) = \hat f(\omega) \text{ for all } \omega \in \Omega.$$
In short, exact recovery may be obtained by solving a convex optimization problem. We give numerical values for $C_M$ which depend on the desired probability of success. Our result may be interpreted as a novel kind of nonlinear sampling theorem. In effect, it says that any signal made out of $|T|$ spikes may be recovered by convex programming from almost every set of frequencies of size $O(|T| \cdot \log N)$. Moreover, this is nearly optimal in the sense that any method succeeding with probability $1 - O(N^{-M})$ would in general require a number of frequency samples at least proportional to $|T| \cdot \log N$.

The methodology extends to a variety of other situations and higher dimensions. For example, we show how one can reconstruct a piecewise constant (one- or two-dimensional) object from incomplete frequency samples, provided that the number of jumps (discontinuities) obeys the condition above, by minimizing other convex functionals such as the total variation of $f$.

Index Terms—Convex optimization, duality in optimization, free probability, image reconstruction, linear programming, random matrices, sparsity, total-variation minimization, trigonometric expansions, uncertainty principle.
Manuscript received June 10, 2004; revised September 9, 2005. The work of E. J. Candès is supported in part by the National Science Foundation under Grant DMS 01-40698 (FRG) and by an Alfred P. Sloan Fellowship. The work of J. Romberg is supported by the National Science Foundation under Grants DMS 01-40698 and ITR ACI-0204932. The work of T. Tao is supported in part by a grant from the Packard Foundation.

E. J. Candès and J. Romberg are with the Department of Applied and Computational Mathematics, California Institute of Technology, Pasadena, CA 91125 USA (e-mail: emmanuel@acm.caltech.edu; jrom@acm.caltech.edu).

T. Tao is with the Department of Mathematics, University of California, Los Angeles, CA 90095 USA (e-mail: tao@math.ucla.edu).

Communicated by A. Høst-Madsen, Associate Editor for Detection and Estimation.

Digital Object Identifier 10.1109/TIT.2005.862083
I. INTRODUCTION
In many applications of practical interest, we often wish to reconstruct an object (a discrete signal, a discrete image, etc.) from incomplete Fourier samples. In a discrete setting, we may pose the problem as follows: let $\hat f$ be the Fourier transform of a discrete object $f(t)$, $t = 0, 1, \dots, N - 1$,
$$\hat f(\omega) = \sum_{t=0}^{N-1} f(t)\, e^{-2\pi i \omega t / N}.$$
The problem is then to recover $f$ from partial frequency information, namely, from $\hat f(\omega)$, where $\omega$ belongs to some set $\Omega$ of cardinality less than $N$, the size of the discrete object.

In this paper, we show that we can recover $f$ exactly from observations $\hat f|_\Omega$ on a small set of frequencies provided that $f$ is sparse. The recovery consists of solving a straightforward optimization problem that finds $f^\sharp$ of minimal complexity with $\hat f^\sharp(\omega) = \hat f(\omega)$ for all $\omega \in \Omega$.
A. A Puzzling Numerical Experiment

This idea is best motivated by an experiment with surprisingly positive results. Consider a simplified version of the classical tomography problem in medical imaging: we wish to reconstruct a two-dimensional image $f(t_1, t_2)$ from samples $\hat f|_\Omega$ of its discrete Fourier transform on a star-shaped domain $\Omega$ [1]. Our choice of domain is not contrived; many real imaging devices collect high-resolution samples along radial lines at relatively few angles. Fig. 1(b) illustrates a typical case where one gathers 512 samples along each of 22 radial lines.

Frequently discussed approaches in the literature of medical imaging for reconstructing an object from polar frequency samples are the so-called filtered backprojection algorithms. In a nutshell, one assumes that the Fourier coefficients at all of the unobserved frequencies are zero (thus reconstructing the image of "minimal energy" under the observation constraints). This strategy does not perform very well, and could hardly be used for medical diagnostics [2]. The reconstructed image, shown in Fig. 1(c), has severe nonlocal artifacts caused by the angular undersampling. A good reconstruction algorithm, it seems, would have to guess the values of the missing Fourier coefficients. In other words, one would need to interpolate $\hat f(\omega)$. This seems highly problematic, however; predictions of Fourier coefficients from their neighbors are very delicate, due to the global and highly oscillatory nature of the Fourier transform.
Fig. 1. Example of a simple recovery problem. (a) The Logan–Shepp phantom test image. (b) Sampling domain $\Omega$ in the frequency plane; Fourier coefficients are sampled along 22 approximately radial lines. (c) Minimum-energy reconstruction obtained by setting unobserved Fourier coefficients to zero. (d) Reconstruction obtained by minimizing the total variation, as in (1.1). The reconstruction is an exact replica of the image in (a).
Going back to the example in Fig. 1, we can see the problem immediately. To recover frequency information near the outer parts of the frequency domain, we would need to interpolate $\hat f$ at the Nyquist rate $2\pi/N$. However, the angular spacing between the 22 radial sampling lines is about $\pi/22$, so near the boundary of the domain the observed samples are spaced almost 50 times farther apart than the Nyquist rate!
We propose instead a strategy based on convex optimization. Let $\|g\|_{TV}$ be the total-variation norm of a two-dimensional (2D) object $g$. For discrete data $g(t_1, t_2)$, $0 \le t_1, t_2 \le N - 1$,
$$\|g\|_{TV} = \sum_{t_1, t_2} \sqrt{|D_1 g(t_1, t_2)|^2 + |D_2 g(t_1, t_2)|^2}$$
where $D_1$ is the finite difference $D_1 g = g(t_1, t_2) - g(t_1 - 1, t_2)$ and $D_2 g = g(t_1, t_2) - g(t_1, t_2 - 1)$. To recover $f$ from partial Fourier samples, we find a solution $g^\sharp$ to the optimization problem
$$\min \|g\|_{TV} \quad \text{subject to } \hat g(\omega) = \hat f(\omega) \text{ for all } \omega \in \Omega. \tag{1.1}$$
In a nutshell, given partial observation $\hat f|_\Omega$, we seek a solution $g^\sharp$ with minimum complexity, called here the total variation (TV), and whose "visible" coefficients match those of the unknown object $f$. Our hope here is to partially erase some of the artifacts that classical reconstruction methods exhibit (which tend to have large TV norm) while maintaining fidelity to the observed data via the constraints on the Fourier coefficients of the reconstruction. (Note that the TV norm is widely used in image processing; see [31] for example.)

When we use (1.1) for the recovery problem illustrated in Fig. 1 (with the popular Logan–Shepp phantom as a test image), the results are surprising. The reconstruction is exact; that is, $g^\sharp = f$. This numerical result is also not special to this phantom. In fact, we performed a series of experiments of this type and obtained perfect reconstruction on many similar test phantoms.
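The optimization (1.1) can be prototyped directly with a generic convex solver. The sketch below assumes the cvxpy package is available; the image is a toy piecewise-constant square rather than the Logan–Shepp phantom, the sampling set is uniformly random rather than radial, and cvxpy's built-in `tv` atom implements the isotropic total variation above up to boundary conventions:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
n = 32
f = np.zeros((n, n))
f[8:20, 10:24] = 1.0                              # toy piecewise-constant image
mask = (rng.random((n, n)) < 0.3).astype(float)   # observed frequency locations
f_hat = np.fft.fft2(f)

# cvxpy has no FFT atom, so express the 2D DFT through the DFT matrix F.
F = np.fft.fft(np.eye(n))
g = cp.Variable((n, n))
constraints = [cp.multiply(mask, F @ g @ F.T) == mask * f_hat]
cp.Problem(cp.Minimize(cp.tv(g)), constraints).solve()
print(np.max(np.abs(g.value - f)))                # small when recovery is exact
```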
B. Main Results

This paper is about a quantitative understanding of this very special phenomenon. For which classes of signals/images can we expect perfect reconstruction? What are the tradeoffs between complexity and number of samples? In order to answer these questions, we first develop a fundamental mathematical understanding of a special one-dimensional (1D) model problem. We then exhibit reconstruction strategies which are shown to exactly reconstruct certain unknown signals, and can be extended for use in a variety of related and sophisticated reconstruction applications.
For a signal $f \in \mathbb{C}^N$, we define the classical discrete Fourier transform $\mathcal{F} f = \hat f$ by
$$\hat f(\omega) := \sum_{t=0}^{N-1} f(t)\, e^{-2\pi i \omega t / N}, \qquad \omega = 0, 1, \dots, N - 1. \tag{1.2}$$
If we are given the value of the Fourier coefficients $\hat f(\omega)$ for all frequencies $\omega$, then one can obviously reconstruct $f$ exactly via the Fourier inversion formula
$$f(t) = \frac{1}{N} \sum_{\omega=0}^{N-1} \hat f(\omega)\, e^{2\pi i \omega t / N}.$$
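For reference, numpy's FFT routines use exactly these conventions: `np.fft.fft` computes the unnormalized sum (1.2) and `np.fft.ifft` carries the $1/N$ factor of the inversion formula. A small sanity check with arbitrary test data:

```python
import numpy as np

N = 17
f = np.random.default_rng(2).standard_normal(N) + 0j
f_hat = np.array([sum(f[t] * np.exp(-2j * np.pi * w * t / N) for t in range(N))
                  for w in range(N)])         # the transform (1.2), term by term
assert np.allclose(f_hat, np.fft.fft(f))      # matches numpy's convention
assert np.allclose(f, np.fft.ifft(f_hat))     # Fourier inversion formula
```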
Now suppose that we are only given the Fourier coefficients $\hat f|_\Omega$ sampled on some partial subset $\Omega \subsetneq \mathbb{Z}_N$ of all frequencies. Of course, this is not enough information to reconstruct $f$ exactly in general; $f$ has $N$ degrees of freedom and we are only specifying $|\Omega|$ of those degrees (here and below $|\Omega|$ denotes the cardinality of $\Omega$).

Suppose, however, that we also specify that $f$ is supported on a small (but a priori unknown) subset $T$ of $\mathbb{Z}_N$; that is, we assume that $f$ can be written as a sparse superposition of spikes
$$f(t) = \sum_{\tau \in T} f(\tau)\,\delta(t - \tau), \qquad \delta(t) := \begin{cases} 1, & t = 0 \\ 0, & t \neq 0. \end{cases}$$
In the case where $N$ is prime, the following theorem tells us that it is possible to recover $f$ exactly if $|T|$ is small enough.

Theorem 1.1: Suppose that the signal length $N$ is a prime integer. Let $\Omega$ be a subset of $\{0, \dots, N - 1\}$, and let $f$ be a vector supported on $T$ such that
$$|T| \le \tfrac{1}{2}\,|\Omega|. \tag{1.3}$$
Then $f$ can be reconstructed uniquely from $\Omega$ and $\hat f|_\Omega$. Conversely, if $\Omega$ is not the set of all frequencies, then there exist distinct vectors $f_1$, $f_2$ such that $|\mathrm{supp}(f_1)|, |\mathrm{supp}(f_2)| \le \tfrac{1}{2}|\Omega| + 1$ and such that $\hat f_1|_\Omega = \hat f_2|_\Omega$.
Proof: We will need the following lemma [3], from which we see that with knowledge of $T$, we can reconstruct $f$ uniquely (using linear algebra) from $\hat f|_\Omega$.

Lemma 1.2 ([3, Corollary 1.4]): Let $N$ be a prime integer and $T$, $\Omega$ be subsets of $\mathbb{Z}_N$. Put $\ell_2(T)$ (resp., $\ell_2(\Omega)$) to be the space of signals that are zero outside of $T$ (resp., $\Omega$). The restricted Fourier transform $\mathcal{F}_{T \to \Omega} : \ell_2(T) \to \ell_2(\Omega)$ is defined as
$$\mathcal{F}_{T \to \Omega} f := \hat f|_\Omega \quad \text{for all } f \in \ell_2(T).$$
If $|T| = |\Omega|$, then $\mathcal{F}_{T \to \Omega}$ is a bijection; as a consequence, we thus see that $\mathcal{F}_{T \to \Omega}$ is injective for $|T| \le |\Omega|$ and surjective for $|T| \ge |\Omega|$. Clearly, the same claims hold if the Fourier transform $\mathcal{F}$ is replaced by the inverse Fourier transform $\mathcal{F}^{-1}$.
To prove Theorem 1.1, assume that $|T| \le \tfrac12 |\Omega|$. Suppose for contradiction that there were two objects $f$, $g$ such that $\hat f|_\Omega = \hat g|_\Omega$ and $|\mathrm{supp}(f)|, |\mathrm{supp}(g)| \le \tfrac12 |\Omega|$. Then the Fourier transform of $f - g$ vanishes on $\Omega$, and $|\mathrm{supp}(f - g)| \le |\Omega|$. By Lemma 1.2, we see that $\mathcal{F}_{\mathrm{supp}(f-g) \to \Omega}$ is injective, and thus $f - g = 0$. The uniqueness claim follows.

We now examine the converse claim. Since $|\Omega| < N$, we can find disjoint subsets $T_1$, $T_2$ of $\mathbb{Z}_N$ such that $|T_1|, |T_2| \le \tfrac12 |\Omega| + 1$ and $|T_1| + |T_2| = |\Omega| + 1$. Let $\omega_0$ be some frequency which does not lie in $\Omega$. Applying Lemma 1.2, we have that $\mathcal{F}_{T_1 \cup T_2 \to \Omega \cup \{\omega_0\}}$ is a bijection, and thus we can find a vector $h$ supported on $T_1 \cup T_2$ whose Fourier transform vanishes on $\Omega$ but is nonzero on $\omega_0$; in particular, $h$ is not identically zero. The claim now follows by taking $f_1 := h|_{T_1}$ and $f_2 := -h|_{T_2}$.
Note that if $N$ is not prime, the lemma (and hence the theorem) fails, essentially because of the presence of nontrivial subgroups of $\mathbb{Z}_N$ with addition modulo $N$; see Sections I-C and I-D for concrete counterexamples, and [3], [4] for further discussion. However, it is plausible to think that Lemma 1.2 continues to hold for nonprime $N$ if $T$ and $\Omega$ are assumed to be generic; in particular, they are not subgroups of $\mathbb{Z}_N$, or cosets of subgroups. If $T$ and $\Omega$ are selected uniformly at random, then it is expected that the theorem holds with probability very close to one; one can indeed presumably quantify this statement by adapting the arguments given above, but we will not do so here. However, we refer the reader to Section I-G for a rapid presentation of informal arguments pointing in this direction.
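Lemma 1.2 can be probed numerically: for prime $N$, every square minor of the DFT matrix with rows indexed by $\Omega$ and columns by $T$ is nonsingular, while for composite $N$ a subgroup yields a singular minor. A small illustrative check (the sizes are hand-picked for speed and are not from the paper):

```python
import numpy as np
from itertools import combinations

def minor(N, T, Omega):
    """Restricted Fourier transform F_{T -> Omega} as a matrix."""
    F = np.fft.fft(np.eye(N))
    return F[np.ix_(list(Omega), list(T))]

# Prime N = 7: every 2x2 minor is invertible, as Lemma 1.2 asserts.
dets = [np.linalg.det(minor(7, T, W))
        for T in combinations(range(7), 2) for W in combinations(range(7), 2)]
print(min(abs(d) for d in dets))              # bounded away from zero

# Composite N = 8: the subgroup {0, 4} of Z_8 produces a singular minor.
print(abs(np.linalg.det(minor(8, (0, 4), (1, 5)))))   # numerically zero
```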
A refinement of the argument in Theorem 1.1 shows that for fixed subsets $T_1$, $T_2$ in the time domain and $\Omega$ in the frequency domain, the space of pairs of vectors $f_1$, $f_2$ supported on $T_1$, $T_2$ such that $\hat f_1|_\Omega = \hat f_2|_\Omega$ has dimension $|T_1| + |T_2| - |\Omega|$ when $|T_1| + |T_2| \ge |\Omega|$, and has dimension $0$ otherwise. In particular, if we let $\Sigma(N_t)$ denote those vectors whose support has size at most $N_t$, then the set of vectors in $\Sigma(N_t)$ which cannot be reconstructed uniquely in this class from the Fourier coefficients sampled at $\Omega$ is contained in a finite union of linear spaces of dimension at most $2N_t - |\Omega|$. Since $\Sigma(N_t)$ itself is a finite union of linear spaces of dimension $N_t$, we thus see that recovery of $f$ from $\hat f|_\Omega$ is in principle possible generically whenever $|\mathrm{supp}(f)| = N_t < |\Omega|$; once $N_t \ge |\Omega|$, however, it is clear from simple degrees-of-freedom arguments that unique recovery is no longer possible. While our methods do not quite attain this theoretical upper bound for correct recovery, our numerical experiments suggest that they do come within a constant factor of this bound (see Fig. 2).
Theorem 1.1 asserts that one can reconstruct $f$ from $2|T|$ frequency samples (and that, in general, there is no hope to do so from fewer samples). In principle, we can recover $f$ exactly by solving the combinatorial optimization problem
$$(P_0) \quad \min_{g \in \mathbb{C}^N} \|g\|_{\ell_0} \quad \text{s.t. } \hat g|_\Omega = \hat f|_\Omega \tag{1.4}$$
where $\|g\|_{\ell_0}$ is the number of nonzero terms $\#\{t : g(t) \neq 0\}$. This is a combinatorial optimization problem, and solving (1.4) directly is infeasible even for modest-sized signals. To the best of our knowledge, one would essentially need to let $T$ vary over all subsets of $\{0, \dots, N-1\}$ of cardinality $|T| \le \tfrac12 |\Omega|$, checking for each one whether $\hat f|_\Omega$ is in the range of $\mathcal{F}_{T \to \Omega}$ or not, and then invert the relevant minor of the Fourier matrix to recover $f$ once $T$ is determined. Clearly, this is computationally very expensive since there are exponentially many subsets to check; for instance, if $|\Omega| \sim N/2$, then the number of subsets scales like $4^N \cdot 3^{-3N/4}$! As an aside, note that it is also not clear how to make this algorithm robust, especially since the results in [3] do not provide any effective lower bound on the determinant of the minors of the Fourier matrix; see Section VI for a discussion of this point.
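Writing the search down makes the cost concrete. The following sketch (with arbitrary toy sizes) solves $(P_0)$ by exhaustively testing supports in order of increasing cardinality; it is correct but useless beyond very small $N$, which is precisely the point:

```python
import numpy as np
from itertools import combinations

def l0_recover(obs, Omega, N, kmax):
    """Brute-force (P_0): for each candidate support T, solve the restricted
    system F_{T -> Omega} x = obs and accept the first consistent fit."""
    F = np.fft.fft(np.eye(N))
    for k in range(1, kmax + 1):
        for T in combinations(range(N), k):
            A = F[np.ix_(list(Omega), list(T))]
            x = np.linalg.lstsq(A, obs, rcond=None)[0]
            if np.allclose(A @ x, obs):
                g = np.zeros(N, dtype=complex)
                g[list(T)] = x
                return g
    return None
```

Even at $N = 512$ and $k = 20$, the loop would have to visit more than $10^{35}$ candidate supports.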
A more computationally efficient strategy for recovering $f$ from $\Omega$ and $\hat f|_\Omega$ is to solve the convex problem
$$(P_1) \quad \min_{g \in \mathbb{C}^N} \|g\|_{\ell_1} := \sum_{t=0}^{N-1} |g(t)| \quad \text{s.t. } \hat g|_\Omega = \hat f|_\Omega. \tag{1.5}$$
The key result in this paper is that the solutions to $(P_0)$ and $(P_1)$ are equivalent for an overwhelming percentage of the choices for $T$ and $\Omega$ with $|T| \le \alpha \cdot |\Omega| / \log N$ ($\alpha > 0$ is a constant): in these cases, solving the convex problem $(P_1)$ recovers $f$ exactly.

To establish this upper bound, we will assume that the observed Fourier coefficients are randomly sampled. Given the number $N_\omega$ of samples to take in the Fourier domain, we choose the subset $\Omega$ uniformly at random from all sets of this size; i.e., each of the $\binom{N}{N_\omega}$ possible subsets is equally likely. Our main theorem can now be stated as follows.
Theorem 1.3: Let $f \in \mathbb{C}^N$ be a discrete signal supported on an unknown set $T$, and choose $\Omega$ of size $|\Omega| = N_\omega$ uniformly at random. For a given accuracy parameter $M$, if
$$|T| \le C_M \cdot (\log N)^{-1} \cdot |\Omega| \tag{1.6}$$
then with probability at least $1 - O(N^{-M})$, the minimizer to the problem (1.5) is unique and is equal to $f$.

Notice that (1.6) essentially says that $T$ is of size $|\Omega|$, modulo a constant and a logarithmic factor. Our proof gives an explicit value of $C_M$, namely, $C_M \approx 1/[23(M + 1)]$ (valid for $|\Omega| \le N/4$, $N \ge 20$, and $M \ge 1$, say), although we have not pursued the question of exactly what the optimal value might be.
In Section V, we present numerical results which suggest that in practice, we can expect to recover most signals $f$ more than 50% of the time if the size of the support obeys $|T| \le |\Omega|/4$. By "most signals," we mean that we empirically study the success rate for randomly selected signals, and do not search for the worst case signal $f$, that which needs the most frequency samples. For $|T| \le |\Omega|/8$, the recovery rate is above 90%. Empirically, the constants $1/4$ and $1/8$ do not seem to vary for $N$ in the range of a few hundred to a few thousand.
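The program $(P_1)$ is straightforward to set up with an off-the-shelf convex solver; a single random trial of the kind tallied in these success rates might look as follows (a sketch assuming cvxpy; the sizes, which put $|T|$ near $|\Omega|/4$, and the recovery tolerance are arbitrary choices):

```python
import numpy as np
import cvxpy as cp

def one_trial(N=128, k=8, n_omega=40, seed=0):
    rng = np.random.default_rng(seed)
    f = np.zeros(N, dtype=complex)
    T = rng.choice(N, size=k, replace=False)
    f[T] = rng.standard_normal(k) + 1j * rng.standard_normal(k)
    Omega = rng.choice(N, size=n_omega, replace=False)
    A = np.fft.fft(np.eye(N))[Omega, :]       # partial DFT as a matrix

    g = cp.Variable(N, complex=True)
    cp.Problem(cp.Minimize(cp.norm1(g)), [A @ g == A @ f]).solve()
    return np.max(np.abs(g.value - f)) < 1e-6 # declare exact recovery

print(np.mean([one_trial(seed=s) for s in range(20)]))  # empirical success rate
```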
C. For Almost Every $\Omega$

As the theorem allows, there exist sets $\Omega$ and functions $f$ for which the $\ell_1$-minimization procedure does not recover $f$ correctly, even if $|\mathrm{supp}(f)|$ is much smaller than $|\Omega|$. We sketch two counterexamples.

A discrete Dirac comb. Suppose that $N$ is a perfect square and consider the picket-fence signal which consists of $\sqrt N$ spikes of unit height and with uniform spacing equal to $\sqrt N$. This signal is often used as an extremal point for uncertainty principles [4], [5], as one of its remarkable properties is its invariance through the Fourier transform. Hence, suppose that $\Omega$ is the set of all frequencies but the multiples of $\sqrt N$, namely, $|\Omega| = N - \sqrt N$. Then $\hat f|_\Omega = 0$, and obviously the reconstruction is identically zero. Note that the problem here does not really have anything to do with $\ell_1$-minimization per se; $f$ cannot be reconstructed from its Fourier samples on $\Omega$, thereby showing that Theorem 1.1 does not work as is for arbitrary sample sizes.
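Both the comb's invariance under the DFT and the resulting total loss of information are easy to confirm numerically (a sketch; $N = 64$ is an arbitrary perfect square):

```python
import numpy as np

N = 64
s = int(np.sqrt(N))
comb = np.zeros(N)
comb[::s] = 1.0                                  # sqrt(N) spikes, spacing sqrt(N)

print(np.allclose(np.fft.fft(comb), s * comb))   # the DFT is again a comb
Omega = np.setdiff1d(np.arange(N), np.arange(0, N, s))
print(np.abs(np.fft.fft(comb)[Omega]).max())     # every observed coefficient is 0
```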
Boxcar signals. The example above suggests that in some sense $|\mathrm{supp}(f)|$ must not be greater than about $\sqrt{|\Omega|}$. In fact, there exist more extreme examples. Assume the sample size $N$ is large and consider, for example, the indicator function $f$ of a short interval $I$ containing $t = 0$, and let $\Omega$ be the band $\{\omega : N/3 < \omega < 2N/3\}$. Let $h$ be a function whose Fourier transform $\hat h$ is a nonnegative bump function adapted to the interval $\{\omega : -N/6 < \omega < N/6\}$ which equals $1$ when $-N/12 < \omega < N/12$. Then $|h|^2$ has Fourier transform vanishing in $\Omega$, and $|h(t)|^2$ is rapidly decreasing away from $t = 0$; in particular, we have $|h(t)|^2 = O(N^{-100})$ for $t \notin I$. On the other hand, one easily computes that $|h(0)|^2 \ge c$ for some absolute constant $c > 0$. Because of this, the signal $f - \varepsilon |h|^2$ will have smaller $\ell_1$-norm than $f$ for $\varepsilon > 0$ sufficiently small (and $N$ sufficiently large), while still having the same Fourier coefficients as $f$ on $\Omega$. Thus, in this case, $f$ is not the minimizer to the problem $(P_1)$, despite the fact that the support of $f$ is much smaller than that of $f - \varepsilon |h|^2$.
The above counterexamples relied heavily on the special choice of $\Omega$ (and, to a lesser extent, of $\mathrm{supp}(f)$); in particular, they needed the fact that the complement of $\Omega$ contained a large interval (or, more generally, a long arithmetic progression). But for most sets $\Omega$, large arithmetic progressions in the complement do not exist, and the problem largely disappears. In short, Theorem 1.3 essentially says that for most sets $\Omega$ of size about $|T| \cdot \log N$, there is no loss of information.
D. Optimality

Theorem 1.3 states that for any signal $f$ supported on an arbitrary set $T$ in the time domain, $(P_1)$ recovers $f$ exactly, with high probability, from a number of frequency samples that is within a constant of $M \cdot |T| \cdot \log N$. It is natural to wonder whether this is a fundamental limit. In other words, is there an algorithm that can recover an arbitrary signal from far fewer random observations, and with the same probability of success?

It is clear that the number of samples needs to be at least proportional to $|T|$; otherwise, $\mathcal{F}_{T \to \Omega}$ will not be injective. We argue here that it must also be proportional to $|T| \cdot \log N$ to guarantee recovery of certain signals from the vast majority of sets $\Omega$ of a certain size.

Suppose $f$ is the Dirac comb signal discussed in the previous section. If we want to have a chance of recovering $f$, then at the very least, the observation set $\Omega$ and the frequency support $W = \mathrm{supp}(\hat f)$ must overlap at one location; otherwise, all of the observations are zero, and nothing can be done. Choosing $\Omega$ uniformly at random, the probability that it includes none of the members of $W$ is
$$\Pr(\Omega \cap W = \varnothing) = \frac{\binom{N - \sqrt N}{N_\omega}}{\binom{N}{N_\omega}} \ge \left(1 - \frac{2 N_\omega}{N}\right)^{\sqrt N}$$
where we have used the assumption that $\sqrt N \le N/2$. Then for $\Pr(\Omega \cap W = \varnothing)$ to be smaller than $N^{-M}$, it must be true that
$$\sqrt N \cdot \log\!\left(1 - \frac{2 N_\omega}{N}\right) \le -M \log N$$
and if we make the restriction that $N_\omega$ cannot be as large as $N/2$, meaning that $\log(1 - 2N_\omega/N) \approx -2N_\omega/N$, we have
$$N_\omega \ge \mathrm{Const} \cdot M \cdot \sqrt N \cdot \log N.$$
For the Dirac comb, then, any algorithm must have $|\Omega| \gtrsim M \cdot |T| \cdot \log N$ observations for the identified probability of success.
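The hypergeometric probability in this argument can be evaluated exactly, so the resulting requirement on $N_\omega$ is easy to check numerically (a sketch; $N = 10^4$ and the trial sample sizes are arbitrary, and for $M = 1$ the bound above is roughly $\sqrt N \log N \approx 920$):

```python
from math import comb, sqrt

N = 10_000
W = int(sqrt(N))            # size of the frequency support of the Dirac comb
M = 1

def p_miss(n_omega):
    """Exact probability that a uniform size-n_omega set avoids all of W."""
    return comb(N - W, n_omega) / comb(N, n_omega)

for n_omega in (100, 500, 1000, 2000):
    print(n_omega, p_miss(n_omega) < N ** (-M))   # True once n_omega ~ 10^3
```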
Examples for larger supports $T$ exist as well. If $N$ is an even power of two, we can superimpose $2^m$ Dirac combs at dyadic shifts to construct signals with time-domain support $|T| = 2^m \sqrt N$ and frequency-domain support $|W| = 2^{-m} \sqrt N$ for $0 \le m \le \tfrac12 \log_2 N$. The same argument as above would then dictate that
$$N_\omega \ge \mathrm{Const} \cdot M \cdot \frac{N}{|W|} \cdot \log N = \mathrm{Const} \cdot M \cdot |T| \cdot \log N.$$
In short, Theorem 1.3 identifies a fundamental limit. No recovery can be successful for all signals using significantly fewer observations.
E. Extensions

As mentioned earlier, results for our model problem extend easily to higher dimensions and alternate recovery scenarios. To be concrete, consider the problem of recovering a one-dimensional piecewise-constant signal via
$$\min_g \sum_{t=0}^{N-1} |g(t) - g(t-1)| \quad \text{subject to } \hat g|_\Omega = \hat f|_\Omega \tag{1.7}$$
where we adopt the convention that $g(-1) := g(N-1)$. In a nutshell, model (1.5) is obtained from (1.7) after differentiation. Indeed, let $\delta$ be the vector of first differences $\delta(t) := g(t) - g(t-1)$, and note that $\sum_t \delta(t) = 0$. Obviously,
$$\hat\delta(\omega) = \left(1 - e^{-2\pi i \omega / N}\right) \hat g(\omega) \quad \text{for all } \omega$$
and, therefore, with $\hat d(\omega) := (1 - e^{-2\pi i \omega / N})\, \hat f(\omega)$, the problem is identical to
$$\min_\delta \|\delta\|_{\ell_1} \quad \text{s.t. } \hat\delta(\omega) = \hat d(\omega), \ \omega \in \Omega$$
which is precisely what we have been studying.

Corollary 1.4: Put $T := \{t : f(t) \neq f(t-1)\}$. Under the assumptions of Theorem 1.3, the minimizer to the problem (1.7) is unique and is equal to $f$ with probability at least $1 - O(N^{-M})$, provided that $\Omega$ be adjusted so that it contains the zero frequency, $0 \in \Omega$.
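The reduction is mechanical enough that (1.7) can also be handed directly to a generic solver. The sketch below (assuming cvxpy; toy sizes; $\Omega$ is forced to contain the zero frequency so that the additive constant is determined, as in the corollary) recovers a piecewise-constant signal from partial Fourier data:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)
N = 128
f = np.zeros(N)
for t in rng.choice(N, size=6, replace=False):
    f[t:] += rng.standard_normal()            # piecewise constant, 6 jumps

Omega = np.union1d([0], rng.choice(N, size=50, replace=False))
A = np.fft.fft(np.eye(N))[Omega, :]

g = cp.Variable(N)
tv = cp.norm1(g - cp.hstack([g[N-1:N], g[:N-1]]))   # sum_t |g(t) - g(t-1)|
cp.Problem(cp.Minimize(tv), [A @ g == A @ f]).solve()
print(np.max(np.abs(g.value - f)))            # small when recovery is exact
```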
We now explore versions of Theorem 1.3 in higher dimensions. To be concrete, consider the 2D situation (statements in arbitrary dimensions are exactly of the same flavor).

Theorem 1.5: Put $N = n^2$. We let $f(t_1, t_2)$, $0 \le t_1, t_2 \le n - 1$, be a discrete real-valued image and $\Omega$ of a certain size be chosen uniformly at random. Assume that for a given accuracy parameter $M$, $f$ is supported on $T$ obeying (1.6). Then with probability at least $1 - O(N^{-M})$, the minimizer to the problem (1.5) is unique and is equal to $f$.

We will not prove this result as the strategy is exactly parallel to that of Theorem 1.3. Letting $D_1 f$ be the horizontal finite differences $D_1 f(t_1, t_2) = f(t_1, t_2) - f(t_1 - 1, t_2)$ and $D_2 f$ be the vertical analog, we have just seen that we can think about the data as the properly renormalized Fourier coefficients of $D_1 f$ and $D_2 f$. Now put $d := D_1 f + i D_2 f$, where $i = \sqrt{-1}$. Then the minimum total-variation problem may be expressed as
$$\min \|\delta\|_{\ell_1} \quad \text{subject to } \mathcal{F}_\Omega \delta = \mathcal{F}_\Omega d \tag{1.8}$$
where $\mathcal{F}_\Omega$ is a partial Fourier transform. One then obtains a statement for piecewise constant 2D functions, which is similar to that for sparse one-dimensional (1D) signals, provided that the support of $f$ be replaced by $\{(t_1, t_2) : |D_1 f(t_1, t_2)|^2 + |D_2 f(t_1, t_2)|^2 \neq 0\}$. We omit the details.

The main point here is that there actually are a variety of results similar to Theorem 1.3. Theorem 1.5 serves as another recovery example, and provides a precise quantitative understanding of the surprising result discussed at the beginning of this paper.
To be complete, we would like to mention that for complex-valued signals, the minimum-$\ell_1$ problem (1.5) and, therefore, the minimum-TV problem (1.1) can be recast as special convex programs known as second-order cone programs (SOCPs). For example, (1.8) is equivalent to
$$\min \sum_t u(t) \quad \text{subject to} \quad \sqrt{\delta_1(t)^2 + \delta_2(t)^2} \le u(t), \quad \mathcal{F}_\Omega (\delta_1 + i \delta_2) = \mathcal{F}_\Omega d \tag{1.9}$$
with variables $u$, $\delta_1$, and $\delta_2$ in $\mathbb{R}^N$ ($\delta_1$ and $\delta_2$ are the real and imaginary parts of $\delta$). If, in addition, the unknown signal is real-valued, then (1.5) is a linear program. Much progress has been made in the past decade on algorithms to solve both linear and second-order cone programs [6], and many off-the-shelf software packages exist for solving problems such as $(P_1)$ and (1.9).
F. Relationship to Uncertainty Principles

From a certain point of view, our results are connected to the so-called uncertainty principles [4], [5], which say that it is difficult to localize a signal $f$ both in time and frequency.

Theorem 1.3 states that for any signal supported on an arbitrary set in the time domain, recovers exactly—with high probability— from a number of frequency samples that is within a constant of .