project.md

---
title: Final Project
---

# Computed Tomography


## Reading

- [Radon Transform](http://www-math.mit.edu/~helgason/Radonbook.pdf) by Sigurdur Helgason
- [Robust Uncertainty Principles: Exact Signal Reconstruction from Highly Incomplete Frequency
  Information](assets/pdf/2004_Candes_Romberg_Tao.pdf) by Candes, Romberg, and Tao
- [Total Variation and Tomographic Imaging from Projections](assets/pdf/2011_Hansen_Jorgensen.pdf)
  by Hansen and J&oslash;rgensen

## Code

All my code is in this [repo](https://gitlab.cba.mit.edu/erik/funky_ct/tree/develop), named in honor
of the [Radon transform](https://en.wikipedia.org/wiki/Radon_transform)'s more eccentric
[cousin](https://en.wikipedia.org/wiki/Funk_transform). Building requires a modern C++ compiler and
cmake. [Libpng](http://www.libpng.org/pub/png/libpng.html) must be installed as well. All told it's
about 1,800 lines (excluding third party code), but there's some unnecessary duplication in there
since I've been favoring velocity over hygiene.


## Fourier Reconstruction


### Fourier Slice Theorem

Let $$f : \mathbb{R}^2 \rightarrow \mathbb{R}$$ be a density function. If we map density to
brightness we can view $$f$$ as describing an image. We'll assume that this function is defined
everywhere, but is always zero outside some finite neighborhood of the origin (say, the bounds of
the image).

The projection of $$f$$ to the x axis is obtained by integrating along y:

$$
p(x) = \int_\mathbb{R} f(x, y) dy
$$

Meanwhile, the [Fourier transform](notes/fourier_transform.html) of $$f$$ is

$$
\hat{f}(u, v)
= \int_\mathbb{R} \int_\mathbb{R} f(x, y) e^{-2 \pi i (u x + v y)} dx dy
$$

Now comes the key insight. The slice along the $$u$$ axis in frequency space is

$$
\begin{align*}
\hat{f}(u, 0)
&= \int_\mathbb{R} \int_\mathbb{R} f(x, y) e^{-2 \pi i u x} dx dy \\
&= \int_\mathbb{R} \left( \int_\mathbb{R} f(x, y) dy \right) e^{-2 \pi i u x} dx \\
&= \int_\mathbb{R} p(x) e^{-2 \pi i u x} dx \\
&= \hat{p}(u)
\end{align*}
$$

So the Fourier transform of the 1d projection is a 1d slice through the 2d Fourier transform of the
image. This result is known as the Fourier Slice Theorem, and is the foundation of most
reconstruction techniques.

Since the x axis is arbitrary (we can rotate the image however we want), this works for other
angles as well:

$$
\hat{p}_\theta (\omega) = \hat{f} (\omega \cos \theta, \omega \sin \theta)
$$

where $$p_\theta (r)$$ is the projection of $$f$$ onto the line that forms an angle $$\theta$$ with
the x axis. In other words, the Fourier transform of the 1D x-ray projection at angle $$\theta$$ is
the slice through the 2D Fourier transform of the image at angle $$\theta$$.

Conceptually this tells us everything we need to know about the reconstruction. First we take the 1D
Fourier transform of each projection. Then we combine them by arranging them radially. Finally we
take the inverse Fourier transform of the resulting 2d function. We'll end up with the reconstructed
image.

It also tells us how to generate the projections, given that we don't have a 1d x-ray machine. First
we take the Fourier transform of the image. Then we extract radial slices from it. Finally we take
the inverse Fourier transform of each slice. These are the projections. This will come in handy for
generating testing data.


### Discretization

Naturally the clean math of the theory has be modified a bit to make room for reality. In
particular, we only have discrete samples of $$f$$ (i.e. pixel values) rather than the full
continuous function (which in our current formalism may contain infinite information). This has two
important implications.

First, we'll want our Fourier transforms to be discrete Fourier transforms (DFTs). Luckily the
continuous and discrete Fourier transforms are effectively interchangeable, as long as the functions
we work with are mostly spatially and bandwidth limited, and we take an appropriate number of
appropriately spaced samples. You can read more about these requirements
[here](notes/fourier_series.html).

Second, since we combine the DFTs of the projections radially, we'll end up with samples (of the 2D
Fourier transform of our image) on a polar grid rather than a cartesian one. So we'll have to
interpolate. This step is tricky and tends to introduce a lot of error. , but there are better
algorithms out there that come closer to the theoretically ideal
[sinc interpolation](https://en.wikipedia.org/wiki/Whittaker%E2%80%93Shannon_interpolation_formula).
The popular [gridrec](https://www.ncbi.nlm.nih.gov/pubmed/23093766) method is one.


### Results

I used [FFTW](http://www.fftw.org/) to compute Fourier transforms. It's written in C and is very
[fast](http://www.fftw.org/benchfft/). I implemented my own polar resampling routine. It uses a
[Catmull-Rom interpolation](http://entropymine.com/imageworsener/bicubic/) kernel.

I started with this image of a
[brain](https://commons.wikimedia.org/wiki/File:FMRI_coronal_scan.jpg) from Wikimedia Commons.

![brain](assets/img/project_brain.png)

Fourier reconstruction produces a nice [sinogram](https://en.wikipedia.org/wiki/Radon_transform).
Each row is one projection, with angle going from 0 at the top to $$\pi$$ at the bottom, and $$r =
0$$ down the middle column. You can clearly see the skull (a roughly circular feature) unwrapped to
a line on the left side of the sinogram.

![sinogram](assets/img/project_sinogram.png)

The reconstruction, however, isn't so clean.

![brain](assets/img/project_fourier_reconstruction.png)

The Fourier library I'm using is solid (and indeed it reproduces images very well even after
repeated applications), so the error must be coming from my interpolation code. Indeed, the high
frequency content looks ok, but there's a lot of error in low frequency content. This is encoded in
the middle of the Fourier transform, which is what is most distorted by the polar resampling. I
could implement a resampling routine specifically designed for polar resampling, or indeed
specifically for polar resampling for CT reconstruction, but there are better algorithms out there
anyway so I'll move on.


### Outtakes

I got a number of interesting failures before getting my code to work correctly.

![outtake](assets/img/project_cool_1.png)
![outtake](assets/img/project_cool_2.png)
![outtake](assets/img/project_cool_3.png)
![outtake](assets/img/project_cool_4.png)

These all started as attempts to project and reconstruct a [disk](notes/fourier_examples.html),
though I did experiment once interesting mistakes started happening. They mostly result from
indexing errors and an issue with my integration with FFTW. The latter problem relates to the
periodicity of discrete Fourier transforms and the resulting ambiguity in frequency interpretation.
In a nutshell, the DFT doesn't give you samples of the continuous Fourier transform; it gives you
samples of the periodic summation of the continuous Fourier transform. So each sample isn't
representative of one frequency, it's representative of a whole equivalence class of frequencies.

For this application it's very important that the center of the polar coordinate system used for
resampling is right at the DC sample. So though the raw Fourier transform of the image looks like
this (real and imaginary parts shown separately),

![transform](assets/img/project_transform_swapped_re.png)
![transform](assets/img/project_transform_swapped_im.png)

we want to permute the quadrants so that it looks like this:

![transform](assets/img/project_transform_re.png)
![transform](assets/img/project_transform_im.png)

Then the center of the image can be the center of the polar coordinate system. (Note: to make these
I linearly mapped the full range of each image to [1, e], then applied the natural logarithm. So
though they aren't the same color, both tend toward zero away from the center.) This is akin to
viewing the sample frequencies not as $$0$$, $$1/N$$, $$\ldots$$, $$(N - 1)/N$$, but as $$0$$,
$$1/N$$, $$\ldots$$, $$(N/2 - 1)/N$$, $$-1/2$$, $$\ldots$$, $$-1/N$$.

Interestingly, you can do this by literally swapping quadrants of the image, or by multiplying the
results element-wise by a checker board of 1s and -1s. This seems like magic until you just write
out the math.


## Filtered Back Projection

It would be nice if we could avoid the interpolation required for Fourier reconstruction. The
simplest way of doing so, and still the most popular method of performing image reconstruction, is
called filtered back projection.


### Theory

Let's hop back to the continuous theory for a moment. The Fourier reconstruction technique is based
on the fact that $$f$$ can be represented in terms of its projections $$p_\theta$$ in polar
coordinates.

$$
\begin{align*}
f(x, y)
&= \int_\mathbb{R} \int_\mathbb{R} \hat{f}(u, v) e^{2 \pi i (ux + vy)} \mathrm{d}u \mathrm{d}v \\
&= \int_0^\pi \int_\mathbb{R} \hat{f}(\omega \cos \theta, \omega \sin \theta)
    e^{2 \pi i \omega (x \cos \theta + y \sin \theta)}
    \vert \omega \vert \mathrm{d} \omega \mathrm{d} \theta \\
&= \int_0^\pi \int_\mathbb{R} \hat{p}_\theta(\omega)
    e^{2 \pi i \omega (x \cos \theta + y \sin \theta)}
    \vert \omega \vert \mathrm{d} \omega \mathrm{d} \theta \\
\end{align*}
$$

Instead of using interpolation to transform the problem back to cartesian coordinates where we can
apply the usual (inverse) DFT, we can directly evaluate the above integral.

To simplify things a bit, note that the integral over $$\omega$$ is itself a one dimensional inverse
Fourier transform. In particular if we define

$$
q_\theta(\omega) = \hat{p}_\theta(\omega) \vert \omega \vert
$$

then the inner integral is just

$$
\mathcal{F}^{-1}[q_\theta](x \cos \theta + y \sin \theta)
$$

So overall

$$
f(x, y) = \int_0^\pi \mathcal{F}^{-1}[q_\theta](x \cos \theta + y \sin \theta) \mathrm{d} \theta
$$

Since multiplication in the frequency domain is equivalent to convolution in the spatial domain,
$$\mathcal{F}^{-1}[q_\theta]$$ is simply a filtered version of $$p_\theta$$. Hence the name filtered
back projection.


### Discretization

We'll replace the continuous Fourier transforms with DFTs as before.

The only remaining integral (i.e. that's not stuffed inside a Fourier transform) is over $$\theta$$,
so most quadrature techniques will want samples of the integrand that are evenly spaced in
$$\theta$$. We want the pixels in our resulting image to lie on a cartesian grid, so our integrand
samples should be evenly spaced in $$x$$ and $$y$$ as well.

How do we compute $$\mathcal{F}^{-1}[q_\theta]$$? We are given samples of $$p_\theta(r)$$ on a polar
grid. Using the DFT, we can get samples of $$\hat{p}_\theta(\omega)$$. They will be for the same
$$\theta$$ values, and evenly spaced frequency values $$\omega$$. So if we just multiply by $$\vert
\omega \vert$$, we get the corresponding samples of $$q_\theta(\omega)$$. Finally we just take the
inverse DFT and we have samples of $$\mathcal{F}^{-1}[q_\theta]$$ on a polar grid -- namely at the
same $$(r, \theta)$$ points we started out with.

This works out perfectly for $$\theta$$: we can take the samples we naturally end up with and use
them directly for quadrature. For $$r$$, on the other hand, the regular samples we end up with won't
line up with the values $$x \cos \theta + y \sin \theta$$ that we want. So we'll still have to do
some interpolation. But now it's a simple 1D interpolation problem that's much easier to do without
introducing as much error. In particular, we only have to do interpolation in the spatial domain, so
errors will only accumulate locally.


### Results

I again use [FFTW](http://www.fftw.org/) for Fourier transforms. For interpolation I just take the
sample to the left of the desired location, and for integration I just use a left Riemann sum. Even
with these lazy techniques, the reconstruction looks quite good.

![brain](assets/img/project_fbp_reconstruction.png)


## Total Variation / Compressed Sensing

Unfortunately this is still a work in progress. I started implementing a [modern
approach](assets/pdf/2011_hansen_jorgensen.pdf), but found I didn't have enough background to
make it work. They gloss over some details that I tried to get around with brute force, only to find
that the resulting problem was computationally intractable. So yesterday I started over,
re-implementing results from one of the original compressed sensing
[papers](assets/pdf/2004_Candes_Romberg_Tao.pdf). However I did learn some things from my failed
attempts...


## Matrix Formulation

Both techniques I've fully implemented so far involve only three types of operations: DFTs,
interpolation, and multiplication (for the $$\vert \omega \vert$$ filter). All these operations are
linear, so we can express them as matrices and reformulate each technique as a single matrix
multiplication with a vector. In practice this fact alone is useless, since the resulting matrix
ends up being enormous. (In fact if your image and sinogram are n by n pixels, the matrix will have
n^4 entries.) But this formulation is used to derive the theory of many total variation versions.

In the process of my first failed total variation implementation, I ended up with most of the parts
of the Fourier and filtered back projection algorithms implemented as matrices. So let's use them
and create a sinogram with a single matrix multiplication. As mentioned the matrices are huge, so
here I'll work with a 32 by 32 brain image.

![dft](assets/img/project_tiny_brain.png)

The DFTs were easy. Very similar to the cosine transform we used in the last problem set.

![dft](assets/img/project_matrix_dft_re.png)
![dft](assets/img/project_matrix_dft_im.png)

To construct the polar interpolation matrix, I rewrote my interpolation routine to drop the weights
it calculated in the appropriate entries in a matrix rather than summing things up as it goes. These
weights depend on the type of interpolation used and the sizes of the images involved.

![dft](assets/img/project_matrix_polar_re.png)
![dft](assets/img/project_matrix_polar_im.png)

Finally it's just some (inverse) DFTs to get the sinogram. They can all be expressed simultaneously
in one matrix.

![dft](assets/img/project_matrix_sinogram.png)

I bump up the pixel count for the polar projections so that the resampling doesn't lose as much
information. Thus the final matrix that performs all three of these operations at once has
$$2 \cdot 64^2$$ rows and $$32^2$$ columns, for a total of 8,388,608 entries. (I could chop this in
half if I didn't bother computing the imaginary part of the sinogram. It should be zero but it's a
nice sanity check.)


## TomoPy

Ultimately there are a lot more algorithms out there than I care to implement myself. Thanks to the
people behind [TomoPy](https://tomopy.readthedocs.io/en/latest/), I don't have to. It's a library of
tomographic imaging algorithms exposed through Python. It's decently documented, and the
[code](https://github.com/tomopy/tomopy) is open source. It also integrates with
[ASTRA](https://www.astra-toolbox.com/) which has some blazing fast GPU implementations.

I scanned a 3d print that I made in [How to Make (almost)
Anything](http://fab.cba.mit.edu/classes/863.18/CBA/people/erik/04_3d_printing_scanning/) last
semester.

![3d print](assets/img/project_3d_print.jpg)

I reconstructed it using TomoPy's gridrec implementation.

![3d print](assets/img/project_3d_print_reconstruction.jpg)

The results aren't that great, so evidently the settings will require some tweaking. Ultimately I'd
like to set up an easy to use TomoPy/ASTRA toolchain to use with CBA's CT scanner; this is just a
first test.

In particular, TomoPy is designed for parallel beam scans, not cone beam scans. In 2D it's not hard
to manipulate fan beam data into a format that parallel beam algorithms can understand, since every
line in a fan beam is a line in some other parallel beam. But this doesn't work in 3D since there's
only one horizontal plane in which the cone beam rays are aligned with the parallel beam rays.
Luckily ASTRA supports cone beams, so once I have that integration figured out we should be able to
get proper reconstructions.