The selection of a prior is, in my view, the weakest part of Bayesian
inference, so we will sidestep the debate on the correct choice. Rather,
let us view the situation as an opportunity, a licence to explore the
consequences of different priors on the ``true'' maps which emerge. This is
easily done by simulation: take a plausible map, Fourier transform it,
sample with a function so that some information is now missing, and use
your favourite prior and maximise the ``entropy'' to get a candidate for
the true map (a numerical sketch of such an experiment is given below). It is
this kind of study which was responsible for the great initial interest in
MEM. Briefly, what MEM seemed to do in simple cases was to eliminate the
sidelobes and even resolve pairs of peaks which overlapped in the true
map, i.e.\ it was sometimes ``better'' than the original! This last feature is
called superresolution, and in the same spirit of modesty that prompted us to
use a CLEAN beam, we will not discuss it further. Unlike CLEAN, MEM did not
seem to have a serious problem with extended structure, unless it had a
sharp edge (like the image of a planet). In the latter case, it was found
that MEM actually enhanced the ripples near the edge which were sitting at
high brightness levels, though it controlled the ripples which were close
to zero intensity. This is perhaps not surprising if one looks at the
graph of the function $-I\ln I$. There is much more to be gained by
removing ripples near $I=0$ than at higher values of $I$, since the
derivative of the function is higher near $I=0$.
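For readers who wish to try this out, the following is one possible sketch of
such an experiment in one dimension. It is only illustrative: the grid size,
the sampling function, the penalty weight and the step size are arbitrary
choices made here, and the crude penalised gradient ascent merely stands in
for a proper constrained maximum entropy solver.
\begin{verbatim}
import numpy as np

# A toy one-dimensional version of the experiment described above.
# Grid size, sampling function, penalty weight and step size are all
# arbitrary illustrative choices.

n = 256
true_map = np.full(n, 0.01)            # weak flat background ...
true_map[120] += 1.0                   # ... plus two closely spaced peaks
true_map[126] += 0.7

# Fourier transform and "observe": keep only a limited range of frequencies.
freqs = np.fft.fftfreq(n)
measured = np.abs(freqs) < 0.1         # the sampling function
vis_meas = np.fft.fft(true_map) * measured

# Dirty map: straight inverse transform of the sampled visibilities.
dirty_map = np.fft.ifft(vis_meas).real

# Crude MEM-style estimate: gradient ascent on
#   -sum I ln I - (lam/2)*|residual at measured frequencies|^2,
# keeping the map positive with a small floor.
lam, step, floor = 50.0, 5e-3, 1e-8
I = np.full(n, true_map.sum() / n)     # flat positive starting model
for _ in range(5000):
    resid = vis_meas - np.fft.fft(I) * measured
    data_grad = np.fft.ifft(resid).real        # ascent direction for the fit
    entropy_grad = -(1.0 + np.log(I))          # derivative of -I ln I
    I = np.maximum(I + step * (entropy_grad + lam * data_grad), floor)

print("most negative dirty-map sidelobe:", round(dirty_map.min(), 3))
print("dirty map near the peaks:", np.round(dirty_map[117:130], 2))
print("MEM-style map near peaks:", np.round(I[117:130], 2))
\end{verbatim}
One can then compare the dirty map, with its negative sidelobes around the
two peaks, against the everywhere-positive MEM-style estimate, in the spirit
of the empirical studies described above.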
Fortunately, these empirical studies of the MEM can be backed up by an
analytical/graphical argument due to Ramesh Narayan, which is outlined
below. The full consequences of this viewpoint were developed in a review
article (Annual Review of Astronomy and Astrophysics, 24, 127, 1986), so
they will not be elaborated here, but the basic
reasoning is simple and short enough. Take the expression for the entropy,
\[
S \;=\; -\sum_k I_k \ln I_k ,
\]
where the map is written in terms of the visibilities as
$I_k=\sum_{\mathbf{u}} V(\mathbf{u})\, e^{2\pi i\,\mathbf{u}\cdot\mathbf{x}_k}$,
and differentiate it with respect to the free parameters at our disposal,
namely the unmeasured visibilities, and set the result to zero for
maximisation. The derivative of the entropy taken with respect to a
visibility is denoted by $\partial S/\partial V(\mathbf{u})$, the
understanding being that these spacings $\mathbf{u}$ have not been
measured. The condition for a maximum is
\[
\frac{\partial S}{\partial V(\mathbf{u})}
 \;=\; -\sum_k \bigl(1+\ln I_k\bigr)\, e^{2\pi i\,\mathbf{u}\cdot\mathbf{x}_k}
 \;=\; 0
\]
for every unmeasured $\mathbf{u}$. In other words, $1+\ln I$, and hence
$\ln I$ itself, has no Fourier components at the unmeasured spacings, so
the MEM map is the exponential of a function band-limited to the measured
spacings.
More properties of the MEM solution are given in the
references cited earlier. But one can immediately see that taking the
exponential of a function with only a limited range of spatial frequencies
(those present in the dirty beam) is going to generate all spatial
frequencies, i.e., one is extrapolating and interpolating in the $u$--$v$
plane (a numerical illustration is given below). It is also clear that the
fitting is a nonlinear operation because
of the exponential. Adding two data sets and obtaining the MEM solution
will not give the same answer as finding the MEM solution for each
separately and adding later! A little thought shows that this is equally
true of CLEAN.
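The extrapolation statement is easy to check numerically. In the following
illustrative fragment (the grid size and band limit are arbitrary choices),
a real band-limited function is exponentiated and its spectrum inspected:
the power outside the original band is at round-off level before
exponentiation and clearly non-zero after it.
\begin{verbatim}
import numpy as np

# Illustrative check: exponentiating a band-limited function generates
# power at spatial frequencies outside the original band.

n = 512
freqs = np.fft.fftfreq(n)
band = np.abs(freqs) < 0.05            # the "measured" frequencies

rng = np.random.default_rng(1)
spectrum = (rng.normal(size=n) + 1j * rng.normal(size=n)) * band
b = np.fft.ifft(spectrum).real         # a real, band-limited function b(x)
b /= np.abs(b).max()                   # scale so exp(b) is visibly non-linear

out_of_band_b = np.abs(np.fft.fft(b))[~band].max()
out_of_band_exp = np.abs(np.fft.fft(np.exp(b)))[~band].max()

print("power of b      outside the band:", out_of_band_b)    # round-off level
print("power of exp(b) outside the band:", out_of_band_exp)  # clearly non-zero
\end{verbatim}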
If one has a default image $M_k$ in the definition of the entropy
function, i.e.\ one maximises $-\sum_k I_k\ln(I_k/M_k)$, then the same
algebra shows that the ratio $I_k/M_k$ is the exponential of a
band-limited function. This could be desirable. For example, while imaging
a planet, if the sharp edge is put into $M_k$, then the MEM does not have
to do so much work in generating new spatial frequencies in the ratio
$I_k/M_k$. The spirit is similar to using a window to help CLEAN find
sources in the right place.
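Spelling out the ``same algebra'' with the Fourier convention assumed
above, the stationarity condition with a default image reads
\[
\frac{\partial}{\partial V(\mathbf{u})}
 \Bigl[-\sum_k I_k \ln\frac{I_k}{M_k}\Bigr]
 \;=\; -\sum_k \Bigl(1+\ln\frac{I_k}{M_k}\Bigr)\,
 e^{2\pi i\,\mathbf{u}\cdot\mathbf{x}_k}
 \;=\; 0
\]
for every unmeasured $\mathbf{u}$, so it is now $\ln(I/M)$ that has Fourier
components only at the measured spacings, and the map takes the form
$I = M\,e^{B}$ with $B$ band-limited.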