
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 2, FEBRUARY 2006
Robust Uncertainty Principles: Exact Signal Reconstruction From Highly Incomplete Frequency Information
Emmanuel J. Candès, Justin Romberg, Member, IEEE, and Terence Tao
Abstract—This paper considers the model problem of reconstructing an object from incomplete frequency samples. Consider a discrete-time signal $f \in \mathbb{C}^N$ and a randomly chosen set of frequencies $\Omega$. Is it possible to reconstruct $f$ from the partial knowledge of its Fourier coefficients on the set $\Omega$?

A typical result of this paper is as follows. Suppose that $f$ is a superposition of $|T|$ spikes $f(t) = \sum_{\tau \in T} f(\tau)\,\delta(t - \tau)$ obeying
$$|T| \le C_M \cdot (\log N)^{-1} \cdot |\Omega|$$
for some constant $C_M > 0$. We do not know the locations of the spikes nor their amplitudes. Then with probability at least $1 - O(N^{-M})$, $f$ can be reconstructed exactly as the solution to the $\ell_1$ minimization problem
$$\min_g \sum_{t=0}^{N-1} |g(t)| \quad \text{s.t. } \hat g(\omega) = \hat f(\omega) \text{ for all } \omega \in \Omega.$$
In short, exact recovery may be obtained by solving a convex optimization problem. We give numerical values for $C_M$ which depend on the desired probability of success. Our result may be interpreted as a novel kind of nonlinear sampling theorem. In effect, it says that any signal made out of $|T|$ spikes may be recovered by convex programming from almost every set of frequencies of size $O(|T| \cdot \log N)$. Moreover, this is nearly optimal in the sense that any method succeeding with probability $1 - O(N^{-M})$ would in general require a number of frequency samples at least proportional to $|T| \cdot \log N$.

The methodology extends to a variety of other situations and higher dimensions. For example, we show how one can reconstruct a piecewise constant (one- or two-dimensional) object from incomplete frequency samples, provided that the number of jumps (discontinuities) obeys the condition above, by minimizing other convex functionals such as the total variation of $f$.

Index Terms—Convex optimization, duality in optimization, free probability, image reconstruction, linear programming, random matrices, sparsity, total-variation minimization, trigonometric expansions, uncertainty principle.
Manuscript received June 10, 2004; revised September 9, 2005. The work of E. J. Candès is supported in part by the National Science Foundation under Grant DMS 01-40698 (FRG) and by an Alfred P. Sloan Fellowship. The work of J. Romberg is supported by the National Science Foundation under Grants DMS 01-40698 and ITR ACI-0204932. The work of T. Tao is supported in part by a grant from the Packard Foundation.

E. J. Candès and J. Romberg are with the Department of Applied and Computational Mathematics, California Institute of Technology, Pasadena, CA 91125 USA (e-mail: emmanuel@acm.caltech.edu; jrom@acm.caltech.edu).

T. Tao is with the Department of Mathematics, University of California, Los Angeles, CA 90095 USA (e-mail: tao@math.ucla.edu).

Communicated by A. Høst-Madsen, Associate Editor for Detection and Estimation.

Digital Object Identifier 10.1109/TIT.2005.862083
I. INTRODUCTION
In many applications of practical interest, we often wish to reconstruct an object (a discrete signal, a discrete image, etc.) from incomplete Fourier samples. In a discrete setting, we may pose the problem as follows: let $\hat f$ be the Fourier transform of a discrete object $f(t)$, $t = 0, 1, \dots, N - 1$,
$$\hat f(\omega) = \sum_{t=0}^{N-1} f(t)\, e^{-2\pi i \omega t / N}.$$
The problem is then to recover $f$ from partial frequency information, namely, from $\hat f(\omega)$, where $\omega$ belongs to some set $\Omega$ of cardinality less than $N$, the size of the discrete object.

In this paper, we show that we can recover $f$ exactly from observations $\hat f|_\Omega$ on a small set of frequencies provided that $f$ is sparse. The recovery consists of solving a straightforward optimization problem that finds $f^\sharp$ of minimal complexity with $\hat f^\sharp(\omega) = \hat f(\omega)$ for all $\omega \in \Omega$.
A. A Puzzling Numerical Experiment

This idea is best motivated by an experiment with surprisingly positive results. Consider a simplified version of the classical tomography problem in medical imaging: we wish to reconstruct a two-dimensional image $f(t_1, t_2)$ from samples $\hat f|_\Omega$ of its discrete Fourier transform on a star-shaped domain $\Omega$ [1]. Our choice of domain is not contrived; many real imaging devices collect high-resolution samples along radial lines at relatively few angles. Fig. 1(b) illustrates a typical case where one gathers 512 samples along each of 22 radial lines.

Frequently discussed approaches in the literature of medical imaging for reconstructing an object from polar frequency samples are the so-called filtered backprojection algorithms. In a nutshell, one assumes that the Fourier coefficients at all of the unobserved frequencies are zero (thus reconstructing the image of "minimal energy" under the observation constraints). This strategy does not perform very well, and could hardly be used for medical diagnostics [2]. The reconstructed image, shown in Fig. 1(c), has severe nonlocal artifacts caused by the angular undersampling. A good reconstruction algorithm, it seems, would have to guess the values of the missing Fourier coefficients. In other words, one would need to interpolate $\hat f(\omega)$. This seems highly problematic, however; predictions of Fourier coefficients from their neighbors are very delicate, due to the global and highly oscillatory nature of the Fourier transform.
Fig. 1. Example of a simple recovery problem. (a) The Logan–Shepp phantom test image. (b) Sampling domain $\Omega$ in the frequency plane; Fourier coefficients are sampled along 22 approximately radial lines. (c) Minimum-energy reconstruction obtained by setting unobserved Fourier coefficients to zero. (d) Reconstruction obtained by minimizing the total variation, as in (1.1). The reconstruction is an exact replica of the image in (a).
Going back to the example in Fig. 1, we can see the problem immediately. To recover frequency information near the outer parts of the frequency domain, we would need to interpolate $\hat f$ at the Nyquist rate $2\pi/N$. However, the angular spacing between the 22 radial sampling lines is about $\pi/22$, so near the boundary of the domain the observed samples are spaced almost 50 times farther apart than the Nyquist rate!
We propose instead a strategy based on convex optimization. Let $\|g\|_{TV}$ be the total-variation norm of a two-dimensional (2D) object $g$. For discrete data $g(t_1, t_2)$, $0 \le t_1, t_2 \le N - 1$,
$$\|g\|_{TV} = \sum_{t_1, t_2} \sqrt{|D_1 g(t_1, t_2)|^2 + |D_2 g(t_1, t_2)|^2}$$
where $D_1$ is the finite difference $D_1 g = g(t_1, t_2) - g(t_1 - 1, t_2)$ and $D_2 g = g(t_1, t_2) - g(t_1, t_2 - 1)$. To recover $f$ from partial Fourier samples, we find a solution $g^\sharp$ to the optimization problem
$$\min \|g\|_{TV} \quad \text{subject to } \hat g(\omega) = \hat f(\omega) \text{ for all } \omega \in \Omega. \tag{1.1}$$
In a nutshell, given partial observation $\hat f|_\Omega$, we seek a solution $g^\sharp$ with minimum complexity, called here the total variation (TV), and whose "visible" coefficients match those of the unknown object $f$. Our hope here is to partially erase some of the artifacts that classical reconstruction methods exhibit (which tend to have large TV norm) while maintaining fidelity to the observed data via the constraints on the Fourier coefficients of the reconstruction. (Note that the TV norm is widely used in image processing; see [31] for example.)

When we use (1.1) for the recovery problem illustrated in Fig. 1 (with the popular Logan–Shepp phantom as a test image), the results are surprising. The reconstruction is exact; that is, $g^\sharp = f$. This numerical result is also not special to this phantom. In fact, we performed a series of experiments of this type and obtained perfect reconstruction on many similar test phantoms.
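The optimization (1.1) can be prototyped directly with a generic convex solver. The sketch below assumes the cvxpy package is available; the image is a toy piecewise-constant square rather than the Logan–Shepp phantom, the sampling set is uniformly random rather than radial, and cvxpy's built-in `tv` atom implements the isotropic total variation above up to boundary conventions:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
n = 32
f = np.zeros((n, n))
f[8:20, 10:24] = 1.0                              # toy piecewise-constant image
mask = (rng.random((n, n)) < 0.3).astype(float)   # observed frequency locations
f_hat = np.fft.fft2(f)

# cvxpy has no FFT atom, so express the 2D DFT through the DFT matrix F.
F = np.fft.fft(np.eye(n))
g = cp.Variable((n, n))
constraints = [cp.multiply(mask, F @ g @ F.T) == mask * f_hat]
cp.Problem(cp.Minimize(cp.tv(g)), constraints).solve()
print(np.max(np.abs(g.value - f)))                # small when recovery is exact
```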
B. Main Results

This paper is about a quantitative understanding of this very special phenomenon. For which classes of signals/images can we expect perfect reconstruction? What are the tradeoffs between complexity and number of samples? In order to answer these questions, we first develop a fundamental mathematical understanding of a special one-dimensional (1D) model problem. We then exhibit reconstruction strategies which are shown to exactly reconstruct certain unknown signals, and can be extended for use in a variety of related and sophisticated reconstruction applications.
For a signal $f \in \mathbb{C}^N$, we define the classical discrete Fourier transform $\mathcal{F} f = \hat f$ by
$$\hat f(\omega) := \sum_{t=0}^{N-1} f(t)\, e^{-2\pi i \omega t / N}, \qquad \omega = 0, 1, \dots, N - 1. \tag{1.2}$$
If we are given the value of the Fourier coefficients $\hat f(\omega)$ for all frequencies $\omega$, then one can obviously reconstruct $f$ exactly via the Fourier inversion formula
$$f(t) = \frac{1}{N} \sum_{\omega=0}^{N-1} \hat f(\omega)\, e^{2\pi i \omega t / N}.$$
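For reference, numpy's FFT routines use exactly these conventions: `np.fft.fft` computes the unnormalized sum (1.2) and `np.fft.ifft` carries the $1/N$ factor of the inversion formula. A small sanity check with arbitrary test data:

```python
import numpy as np

N = 17
f = np.random.default_rng(2).standard_normal(N) + 0j
f_hat = np.array([sum(f[t] * np.exp(-2j * np.pi * w * t / N) for t in range(N))
                  for w in range(N)])         # the transform (1.2), term by term
assert np.allclose(f_hat, np.fft.fft(f))      # matches numpy's convention
assert np.allclose(f, np.fft.ifft(f_hat))     # Fourier inversion formula
```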
Now suppose that we are only given the Fourier coefficients $\hat f|_\Omega$ sampled on some partial subset $\Omega \subsetneq \mathbb{Z}_N$ of all frequencies. Of course, this is not enough information to reconstruct $f$ exactly in general; $f$ has $N$ degrees of freedom and we are only specifying $|\Omega|$ of those degrees (here and below $|\Omega|$ denotes the cardinality of $\Omega$).

Suppose, however, that we also specify that $f$ is supported on a small (but a priori unknown) subset $T$ of $\mathbb{Z}_N$; that is, we assume that $f$ can be written as a sparse superposition of spikes
$$f(t) = \sum_{\tau \in T} f(\tau)\,\delta(t - \tau), \qquad \delta(t) := \begin{cases} 1, & t = 0 \\ 0, & t \neq 0. \end{cases}$$
In the case where $N$ is prime, the following theorem tells us that it is possible to recover $f$ exactly if $|T|$ is small enough.

Theorem 1.1: Suppose that the signal length $N$ is a prime integer. Let $\Omega$ be a subset of $\{0, \dots, N - 1\}$, and let $f$ be a vector supported on $T$ such that
$$|T| \le \tfrac{1}{2}\,|\Omega|. \tag{1.3}$$
Then $f$ can be reconstructed uniquely from $\Omega$ and $\hat f|_\Omega$. Conversely, if $\Omega$ is not the set of all frequencies, then there exist distinct vectors $f_1$, $f_2$ such that $|\mathrm{supp}(f_1)|, |\mathrm{supp}(f_2)| \le \tfrac{1}{2}|\Omega| + 1$ and such that $\hat f_1|_\Omega = \hat f_2|_\Omega$.
Proof: We will need the following lemma [3], from which we see that with knowledge of $T$, we can reconstruct $f$ uniquely (using linear algebra) from $\hat f|_\Omega$.

Lemma 1.2 ([3, Corollary 1.4]): Let $N$ be a prime integer and $T$, $\Omega$ be subsets of $\mathbb{Z}_N$. Put $\ell_2(T)$ (resp., $\ell_2(\Omega)$) to be the space of signals that are zero outside of $T$ (resp., $\Omega$). The restricted Fourier transform $\mathcal{F}_{T \to \Omega} : \ell_2(T) \to \ell_2(\Omega)$ is defined as
$$\mathcal{F}_{T \to \Omega} f := \hat f|_\Omega \quad \text{for all } f \in \ell_2(T).$$
If $|T| = |\Omega|$, then $\mathcal{F}_{T \to \Omega}$ is a bijection; as a consequence, we thus see that $\mathcal{F}_{T \to \Omega}$ is injective for $|T| \le |\Omega|$ and surjective for $|T| \ge |\Omega|$. Clearly, the same claims hold if the Fourier transform $\mathcal{F}$ is replaced by the inverse Fourier transform $\mathcal{F}^{-1}$.
To prove Theorem 1.1, assume that $|T| \le \tfrac12 |\Omega|$. Suppose for contradiction that there were two objects $f$, $g$ such that $\hat f|_\Omega = \hat g|_\Omega$ and $|\mathrm{supp}(f)|, |\mathrm{supp}(g)| \le \tfrac12 |\Omega|$. Then the Fourier transform of $f - g$ vanishes on $\Omega$, and $|\mathrm{supp}(f - g)| \le |\Omega|$. By Lemma 1.2, we see that $\mathcal{F}_{\mathrm{supp}(f-g) \to \Omega}$ is injective, and thus $f - g = 0$. The uniqueness claim follows.

We now examine the converse claim. Since $|\Omega| < N$, we can find disjoint subsets $T_1$, $T_2$ of $\mathbb{Z}_N$ such that $|T_1|, |T_2| \le \tfrac12 |\Omega| + 1$ and $|T_1| + |T_2| = |\Omega| + 1$. Let $\omega_0$ be some frequency which does not lie in $\Omega$. Applying Lemma 1.2, we have that $\mathcal{F}_{T_1 \cup T_2 \to \Omega \cup \{\omega_0\}}$ is a bijection, and thus we can find a vector $h$ supported on $T_1 \cup T_2$ whose Fourier transform vanishes on $\Omega$ but is nonzero on $\omega_0$; in particular, $h$ is not identically zero. The claim now follows by taking $f_1 := h|_{T_1}$ and $f_2 := -h|_{T_2}$.
Note that if $N$ is not prime, the lemma (and hence the theorem) fails, essentially because of the presence of nontrivial subgroups of $\mathbb{Z}_N$ with addition modulo $N$; see Sections I-C and I-D for concrete counterexamples, and [3], [4] for further discussion. However, it is plausible to think that Lemma 1.2 continues to hold for nonprime $N$ if $T$ and $\Omega$ are assumed to be generic; in particular, they are not subgroups of $\mathbb{Z}_N$, or cosets of subgroups. If $T$ and $\Omega$ are selected uniformly at random, then it is expected that the theorem holds with probability very close to one; one can indeed presumably quantify this statement by adapting the arguments given above, but we will not do so here. However, we refer the reader to Section I-G for a rapid presentation of informal arguments pointing in this direction.
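Lemma 1.2 can be probed numerically: for prime $N$, every square minor of the DFT matrix with rows indexed by $\Omega$ and columns by $T$ is nonsingular, while for composite $N$ a subgroup yields a singular minor. A small illustrative check (the sizes are hand-picked for speed and are not from the paper):

```python
import numpy as np
from itertools import combinations

def minor(N, T, Omega):
    """Restricted Fourier transform F_{T -> Omega} as a matrix."""
    F = np.fft.fft(np.eye(N))
    return F[np.ix_(list(Omega), list(T))]

# Prime N = 7: every 2x2 minor is invertible, as Lemma 1.2 asserts.
dets = [np.linalg.det(minor(7, T, W))
        for T in combinations(range(7), 2) for W in combinations(range(7), 2)]
print(min(abs(d) for d in dets))              # bounded away from zero

# Composite N = 8: the subgroup {0, 4} of Z_8 produces a singular minor.
print(abs(np.linalg.det(minor(8, (0, 4), (1, 5)))))   # numerically zero
```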
A refinement of the argument in Theorem 1.1 shows that for fixed subsets $T_1$, $T_2$ in the time domain and $\Omega$ in the frequency domain, the space of pairs of vectors $f_1$, $f_2$ supported on $T_1$, $T_2$ such that $\hat f_1|_\Omega = \hat f_2|_\Omega$ has dimension $|T_1| + |T_2| - |\Omega|$ when $|T_1| + |T_2| \ge |\Omega|$, and has dimension $0$ otherwise. In particular, if we let $\Sigma(N_t)$ denote those vectors whose support has size at most $N_t$, then the set of vectors in $\Sigma(N_t)$ which cannot be reconstructed uniquely in this class from the Fourier coefficients sampled at $\Omega$ is contained in a finite union of linear spaces of dimension at most $2N_t - |\Omega|$. Since $\Sigma(N_t)$ itself is a finite union of linear spaces of dimension $N_t$, we thus see that recovery of $f$ from $\hat f|_\Omega$ is in principle possible generically whenever $|\mathrm{supp}(f)| = N_t < |\Omega|$; once $N_t \ge |\Omega|$, however, it is clear from simple degrees-of-freedom arguments that unique recovery is no longer possible. While our methods do not quite attain this theoretical upper bound for correct recovery, our numerical experiments suggest that they do come within a constant factor of this bound (see Fig. 2).
Theorem 1.1 asserts that one can reconstruct $f$ from $2|T|$ frequency samples (and that, in general, there is no hope to do so from fewer samples). In principle, we can recover $f$ exactly by solving the combinatorial optimization problem
$$(P_0) \quad \min_{g \in \mathbb{C}^N} \|g\|_{\ell_0} \quad \text{s.t. } \hat g|_\Omega = \hat f|_\Omega \tag{1.4}$$
where $\|g\|_{\ell_0}$ is the number of nonzero terms $\#\{t : g(t) \neq 0\}$. This is a combinatorial optimization problem, and solving (1.4) directly is infeasible even for modest-sized signals. To the best of our knowledge, one would essentially need to let $T$ vary over all subsets of $\{0, \dots, N-1\}$ of cardinality $|T| \le \tfrac12 |\Omega|$, checking for each one whether $\hat f|_\Omega$ is in the range of $\mathcal{F}_{T \to \Omega}$ or not, and then invert the relevant minor of the Fourier matrix to recover $f$ once $T$ is determined. Clearly, this is computationally very expensive since there are exponentially many subsets to check; for instance, if $|\Omega| \sim N/2$, then the number of subsets scales like $4^N \cdot 3^{-3N/4}$! As an aside, note that it is also not clear how to make this algorithm robust, especially since the results in [3] do not provide any effective lower bound on the determinant of the minors of the Fourier matrix; see Section VI for a discussion of this point.
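Writing the search down makes the cost concrete. The following sketch (with arbitrary toy sizes) solves $(P_0)$ by exhaustively testing supports in order of increasing cardinality; it is correct but useless beyond very small $N$, which is precisely the point:

```python
import numpy as np
from itertools import combinations

def l0_recover(obs, Omega, N, kmax):
    """Brute-force (P_0): for each candidate support T, solve the restricted
    system F_{T -> Omega} x = obs and accept the first consistent fit."""
    F = np.fft.fft(np.eye(N))
    for k in range(1, kmax + 1):
        for T in combinations(range(N), k):
            A = F[np.ix_(list(Omega), list(T))]
            x = np.linalg.lstsq(A, obs, rcond=None)[0]
            if np.allclose(A @ x, obs):
                g = np.zeros(N, dtype=complex)
                g[list(T)] = x
                return g
    return None
```

Even at $N = 512$ and $k = 20$, the loop would have to visit more than $10^{35}$ candidate supports.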
A more computationally efficient strategy for recovering $f$ from $\Omega$ and $\hat f|_\Omega$ is to solve the convex problem
$$(P_1) \quad \min_{g \in \mathbb{C}^N} \|g\|_{\ell_1} := \sum_{t=0}^{N-1} |g(t)| \quad \text{s.t. } \hat g|_\Omega = \hat f|_\Omega. \tag{1.5}$$
The key result in this paper is that the solutions to $(P_0)$ and $(P_1)$ are equivalent for an overwhelming percentage of the choices for $T$ and $\Omega$ with $|T| \le \alpha \cdot |\Omega| / \log N$ ($\alpha > 0$ is a constant): in these cases, solving the convex problem $(P_1)$ recovers $f$ exactly.

To establish this upper bound, we will assume that the observed Fourier coefficients are randomly sampled. Given the number $N_\omega$ of samples to take in the Fourier domain, we choose the subset $\Omega$ uniformly at random from all sets of this size; i.e., each of the $\binom{N}{N_\omega}$ possible subsets is equally likely. Our main theorem can now be stated as follows.
Theorem 1.3: Let $f \in \mathbb{C}^N$ be a discrete signal supported on an unknown set $T$, and choose $\Omega$ of size $|\Omega| = N_\omega$ uniformly at random. For a given accuracy parameter $M$, if
$$|T| \le C_M \cdot (\log N)^{-1} \cdot |\Omega| \tag{1.6}$$
then with probability at least $1 - O(N^{-M})$, the minimizer to the problem (1.5) is unique and is equal to $f$.

Notice that (1.6) essentially says that $T$ is of size $|\Omega|$, modulo a constant and a logarithmic factor. Our proof gives an explicit value of $C_M$, namely, $C_M \approx 1/[23(M + 1)]$ (valid for $|\Omega| \le N/4$, $N \ge 20$, and $M \ge 1$, say), although we have not pursued the question of exactly what the optimal value might be.
In Section V, we present numerical results which suggest that in practice, we can expect to recover most signals $f$ more than 50% of the time if the size of the support obeys $|T| \le |\Omega|/4$. By "most signals," we mean that we empirically study the success rate for randomly selected signals, and do not search for the worst case signal $f$, that which needs the most frequency samples. For $|T| \le |\Omega|/8$, the recovery rate is above 90%. Empirically, the constants $1/4$ and $1/8$ do not seem to vary for $N$ in the range of a few hundred to a few thousand.
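The program $(P_1)$ is straightforward to set up with an off-the-shelf convex solver; a single random trial of the kind tallied in these success rates might look as follows (a sketch assuming cvxpy; the sizes, which put $|T|$ near $|\Omega|/4$, and the recovery tolerance are arbitrary choices):

```python
import numpy as np
import cvxpy as cp

def one_trial(N=128, k=8, n_omega=40, seed=0):
    rng = np.random.default_rng(seed)
    f = np.zeros(N, dtype=complex)
    T = rng.choice(N, size=k, replace=False)
    f[T] = rng.standard_normal(k) + 1j * rng.standard_normal(k)
    Omega = rng.choice(N, size=n_omega, replace=False)
    A = np.fft.fft(np.eye(N))[Omega, :]       # partial DFT as a matrix

    g = cp.Variable(N, complex=True)
    cp.Problem(cp.Minimize(cp.norm1(g)), [A @ g == A @ f]).solve()
    return np.max(np.abs(g.value - f)) < 1e-6 # declare exact recovery

print(np.mean([one_trial(seed=s) for s in range(20)]))  # empirical success rate
```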
C. For Almost Every $\Omega$

As the theorem allows, there exist sets $\Omega$ and functions $f$ for which the $\ell_1$-minimization procedure does not recover $f$ correctly, even if $|\mathrm{supp}(f)|$ is much smaller than $|\Omega|$. We sketch two counterexamples.

A discrete Dirac comb. Suppose that $N$ is a perfect square and consider the picket-fence signal which consists of $\sqrt N$ spikes of unit height and with uniform spacing equal to $\sqrt N$. This signal is often used as an extremal point for uncertainty principles [4], [5], as one of its remarkable properties is its invariance through the Fourier transform. Hence, suppose that $\Omega$ is the set of all frequencies but the multiples of $\sqrt N$, namely, $|\Omega| = N - \sqrt N$. Then $\hat f|_\Omega = 0$, and obviously the reconstruction is identically zero. Note that the problem here does not really have anything to do with $\ell_1$-minimization per se; $f$ cannot be reconstructed from its Fourier samples on $\Omega$, thereby showing that Theorem 1.1 does not work as is for arbitrary sample sizes.
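Both the comb's invariance under the DFT and the resulting total loss of information are easy to confirm numerically (a sketch; $N = 64$ is an arbitrary perfect square):

```python
import numpy as np

N = 64
s = int(np.sqrt(N))
comb = np.zeros(N)
comb[::s] = 1.0                                  # sqrt(N) spikes, spacing sqrt(N)

print(np.allclose(np.fft.fft(comb), s * comb))   # the DFT is again a comb
Omega = np.setdiff1d(np.arange(N), np.arange(0, N, s))
print(np.abs(np.fft.fft(comb)[Omega]).max())     # every observed coefficient is 0
```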
Boxcar signals. The example above suggests that in some sense $|\mathrm{supp}(f)|$ must not be greater than about $\sqrt{|\Omega|}$. In fact, there exist more extreme examples. Assume the sample size $N$ is large and consider, for example, the indicator function $f$ of a short interval $I$ containing $t = 0$, and let $\Omega$ be the band $\{\omega : N/3 < \omega < 2N/3\}$. Let $h$ be a function whose Fourier transform $\hat h$ is a nonnegative bump function adapted to the interval $\{\omega : -N/6 < \omega < N/6\}$ which equals $1$ when $-N/12 < \omega < N/12$. Then $|h|^2$ has Fourier transform vanishing in $\Omega$, and $|h(t)|^2$ is rapidly decreasing away from $t = 0$; in particular, we have $|h(t)|^2 = O(N^{-100})$ for $t \notin I$. On the other hand, one easily computes that $|h(0)|^2 \ge c$ for some absolute constant $c > 0$. Because of this, the signal $f - \varepsilon |h|^2$ will have smaller $\ell_1$-norm than $f$ for $\varepsilon > 0$ sufficiently small (and $N$ sufficiently large), while still having the same Fourier coefficients as $f$ on $\Omega$. Thus, in this case, $f$ is not the minimizer to the problem $(P_1)$, despite the fact that the support of $f$ is much smaller than that of $f - \varepsilon |h|^2$.
The above counterexamples relied heavily on the special choice of $\Omega$ (and, to a lesser extent, of $\mathrm{supp}(f)$); in particular, they needed the fact that the complement of $\Omega$ contained a large interval (or, more generally, a long arithmetic progression). But for most sets $\Omega$, large arithmetic progressions in the complement do not exist, and the problem largely disappears. In short, Theorem 1.3 essentially says that for most sets $\Omega$ of size about $|T| \cdot \log N$, there is no loss of information.
D. Optimality

Theorem 1.3 states that for any signal $f$ supported on an arbitrary set $T$ in the time domain, $(P_1)$ recovers $f$ exactly, with high probability, from a number of frequency samples that is within a constant of $M \cdot |T| \cdot \log N$. It is natural to wonder whether this is a fundamental limit. In other words, is there an algorithm that can recover an arbitrary signal from far fewer random observations, and with the same probability of success?

It is clear that the number of samples needs to be at least proportional to $|T|$; otherwise, $\mathcal{F}_{T \to \Omega}$ will not be injective. We argue here that it must also be proportional to $|T| \cdot \log N$ to guarantee recovery of certain signals from the vast majority of sets $\Omega$ of a certain size.

Suppose $f$ is the Dirac comb signal discussed in the previous section. If we want to have a chance of recovering $f$, then at the very least, the observation set $\Omega$ and the frequency support $W = \mathrm{supp}(\hat f)$ must overlap at one location; otherwise, all of the observations are zero, and nothing can be done. Choosing $\Omega$ uniformly at random, the probability that it includes none of the members of $W$ is
$$\Pr(\Omega \cap W = \varnothing) = \frac{\binom{N - \sqrt N}{N_\omega}}{\binom{N}{N_\omega}} \ge \left(1 - \frac{2 N_\omega}{N}\right)^{\sqrt N}$$
where we have used the assumption that $\sqrt N \le N/2$. Then for $\Pr(\Omega \cap W = \varnothing)$ to be smaller than $N^{-M}$, it must be true that
$$\sqrt N \cdot \log\!\left(1 - \frac{2 N_\omega}{N}\right) \le -M \log N$$
and if we make the restriction that $N_\omega$ cannot be as large as $N/2$, meaning that $\log(1 - 2N_\omega/N) \approx -2N_\omega/N$, we have
$$N_\omega \ge \mathrm{Const} \cdot M \cdot \sqrt N \cdot \log N.$$
For the Dirac comb, then, any algorithm must have $|\Omega| \gtrsim M \cdot |T| \cdot \log N$ observations for the identified probability of success.
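The hypergeometric probability in this argument can be evaluated exactly, so the resulting requirement on $N_\omega$ is easy to check numerically (a sketch; $N = 10^4$ and the trial sample sizes are arbitrary, and for $M = 1$ the bound above is roughly $\sqrt N \log N \approx 920$):

```python
from math import comb, sqrt

N = 10_000
W = int(sqrt(N))            # size of the frequency support of the Dirac comb
M = 1

def p_miss(n_omega):
    """Exact probability that a uniform size-n_omega set avoids all of W."""
    return comb(N - W, n_omega) / comb(N, n_omega)

for n_omega in (100, 500, 1000, 2000):
    print(n_omega, p_miss(n_omega) < N ** (-M))   # True once n_omega ~ 10^3
```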
Examples for larger supports $T$ exist as well. If $N$ is an even power of two, we can superimpose $2^m$ Dirac combs at dyadic shifts to construct signals with time-domain support $|T| = 2^m \sqrt N$ and frequency-domain support $|W| = 2^{-m} \sqrt N$ for $0 \le m \le \tfrac12 \log_2 N$. The same argument as above would then dictate that
$$N_\omega \ge \mathrm{Const} \cdot M \cdot \frac{N}{|W|} \cdot \log N = \mathrm{Const} \cdot M \cdot |T| \cdot \log N.$$
In short, Theorem 1.3 identifies a fundamental limit. No recovery can be successful for all signals using significantly fewer observations.
E. Extensions

As mentioned earlier, results for our model problem extend easily to higher dimensions and alternate recovery scenarios. To be concrete, consider the problem of recovering a one-dimensional piecewise-constant signal via
$$\min_g \sum_{t=0}^{N-1} |g(t) - g(t-1)| \quad \text{subject to } \hat g|_\Omega = \hat f|_\Omega \tag{1.7}$$
where we adopt the convention that $g(-1) := g(N-1)$. In a nutshell, model (1.5) is obtained from (1.7) after differentiation. Indeed, let $\delta$ be the vector of first differences $\delta(t) := g(t) - g(t-1)$, and note that $\sum_t \delta(t) = 0$. Obviously,
$$\hat\delta(\omega) = \left(1 - e^{-2\pi i \omega / N}\right) \hat g(\omega) \quad \text{for all } \omega$$
and, therefore, with $\hat d(\omega) := (1 - e^{-2\pi i \omega / N})\, \hat f(\omega)$, the problem is identical to
$$\min_\delta \|\delta\|_{\ell_1} \quad \text{s.t. } \hat\delta(\omega) = \hat d(\omega), \ \omega \in \Omega$$
which is precisely what we have been studying.

Corollary 1.4: Put $T := \{t : f(t) \neq f(t-1)\}$. Under the assumptions of Theorem 1.3, the minimizer to the problem (1.7) is unique and is equal to $f$ with probability at least $1 - O(N^{-M})$, provided that $\Omega$ be adjusted so that it contains the zero frequency, $0 \in \Omega$.
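The reduction is mechanical enough that (1.7) can also be handed directly to a generic solver. The sketch below (assuming cvxpy; toy sizes; $\Omega$ is forced to contain the zero frequency so that the additive constant is determined, as in the corollary) recovers a piecewise-constant signal from partial Fourier data:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)
N = 128
f = np.zeros(N)
for t in rng.choice(N, size=6, replace=False):
    f[t:] += rng.standard_normal()            # piecewise constant, 6 jumps

Omega = np.union1d([0], rng.choice(N, size=50, replace=False))
A = np.fft.fft(np.eye(N))[Omega, :]

g = cp.Variable(N)
tv = cp.norm1(g - cp.hstack([g[N-1:N], g[:N-1]]))   # sum_t |g(t) - g(t-1)|
cp.Problem(cp.Minimize(tv), [A @ g == A @ f]).solve()
print(np.max(np.abs(g.value - f)))            # small when recovery is exact
```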
We now explore versions of Theorem 1.3 in higher dimensions. To be concrete, consider the 2D situation (statements in arbitrary dimensions are exactly of the same flavor).

Theorem 1.5: Put $N = n^2$. We let $f(t_1, t_2)$, $0 \le t_1, t_2 \le n - 1$, be a discrete real-valued image and $\Omega$ of a certain size be chosen uniformly at random. Assume that for a given accuracy parameter $M$, $f$ is supported on $T$ obeying (1.6). Then with probability at least $1 - O(N^{-M})$, the minimizer to the problem (1.5) is unique and is equal to $f$.

We will not prove this result as the strategy is exactly parallel to that of Theorem 1.3. Letting $D_1 f$ be the horizontal finite differences $D_1 f(t_1, t_2) = f(t_1, t_2) - f(t_1 - 1, t_2)$ and $D_2 f$ be the vertical analog, we have just seen that we can think about the data as the properly renormalized Fourier coefficients of $D_1 f$ and $D_2 f$. Now put $d := D_1 f + i D_2 f$, where $i = \sqrt{-1}$. Then the minimum total-variation problem may be expressed as
$$\min \|\delta\|_{\ell_1} \quad \text{subject to } \mathcal{F}_\Omega \delta = \mathcal{F}_\Omega d \tag{1.8}$$
where $\mathcal{F}_\Omega$ is a partial Fourier transform. One then obtains a statement for piecewise constant 2D functions, which is similar to that for sparse one-dimensional (1D) signals, provided that the support of $f$ be replaced by $\{(t_1, t_2) : |D_1 f(t_1, t_2)|^2 + |D_2 f(t_1, t_2)|^2 \neq 0\}$. We omit the details.

The main point here is that there actually are a variety of results similar to Theorem 1.3. Theorem 1.5 serves as another recovery example, and provides a precise quantitative understanding of the surprising result discussed at the beginning of this paper.
To be complete, we would like to mention that for complex-valued signals, the minimum-$\ell_1$ problem (1.5) and, therefore, the minimum-TV problem (1.1) can be recast as special convex programs known as second-order cone programs (SOCPs). For example, (1.8) is equivalent to
$$\min \sum_t u(t) \quad \text{subject to} \quad \sqrt{\delta_1(t)^2 + \delta_2(t)^2} \le u(t), \quad \mathcal{F}_\Omega (\delta_1 + i \delta_2) = \mathcal{F}_\Omega d \tag{1.9}$$
with variables $u$, $\delta_1$, and $\delta_2$ in $\mathbb{R}^N$ ($\delta_1$ and $\delta_2$ are the real and imaginary parts of $\delta$). If, in addition, the unknown signal is real-valued, then (1.5) is a linear program. Much progress has been made in the past decade on algorithms to solve both linear and second-order cone programs [6], and many off-the-shelf software packages exist for solving problems such as $(P_1)$ and (1.9).
F. Relationship to Uncertainty Principles

From a certain point of view, our results are connected to the so-called uncertainty principles [4], [5], which say that it is difficult to localize a signal $f$ both in time and frequency.

Theorem 1.3 states that for any signal supported on an arbitrary set in the time domain, recovers exactly—with high probability— from a number of frequency samples that is within a constant of .