diff --git a/_psets/12.md b/_psets/12.md
index 83aeb66cc9ebec19624951ecad6c246ff469358f..6869f16a452c7c9d8cd4a444e6a2e75a8d51854c 100644
--- a/_psets/12.md
+++ b/_psets/12.md
@@ -211,6 +211,80 @@ one bit?
 What is the SNR due to quantization noise in an 8-bit A/D? 16-bit? How much must the former be
 averaged to match the latter?
 
+This depends on the characteristics of the signal. Assuming it uses the full range of the A/D
+converter (but no more), and over time is equally likely to be at any voltage in that range, the
+SNR in decibels is
+
+$$
+\text{SNR} = 20 \log_{10}(2^n)
+$$
+
+where $$n$$ is the number of bits used.
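+
+As a quick numeric check, this formula is easy to evaluate directly. Here is a small Python sketch
+(the function name `quantization_snr_db` is just an illustrative choice of mine):
+
+```python
+import math
+
+def quantization_snr_db(n_bits):
+    """SNR in dB of an ideal n-bit quantizer driven by a full-range uniform signal."""
+    return 20 * math.log10(2 ** n_bits)
+
+print(quantization_snr_db(8))   # roughly 48.16 dB
+print(quantization_snr_db(16))  # roughly 96.33 dB
+```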
+
+So we have
+
+$$
+\begin{align*}
+\text{8-bit SNR} &= 20 \log_{10}(2^8) \\ &\approx 48.2 \si{dB} \\
+\text{16-bit SNR} &= 20 \log_{10}(2^{16}) \\ &\approx 96.3 \si{dB}
+\end{align*}
+$$
+
+The difference is $$20 \log_{10}(2^8) \approx 48.2 \si{dB}$$. Averaging $$N$$ independent
+measurements reduces the noise power by a factor of $$N$$ (assuming the quantization errors are
+uncorrelated from sample to sample), improving the SNR by $$10 \log_{10} N$$ decibels. So to match
+the 16-bit SNR, the 8-bit samples must be averaged $$N = 2^{16} = 65536$$ at a time.
+
+But where does this come from? Recall that the definition of the SNR is the ratio of the power in
+the signal to the power in the noise. Let's say we have an analog signal $$f(t)$$ within the range
+$$[-A, A]$$. Let its time-averaged distribution be $$p(x)$$. Then the power in the signal is
+
+$$
+P_\text{signal} = \int_{-A}^A x^2 p(x) \mathrm{d} x
+$$
+
+Let's assume that the signal is equally likely to take on any value in its range, so $$p(x) =
+1/(2A)$$. This is exactly true of triangle and sawtooth waves, approximately true for sine waves,
+and not at all true for square waves, so the accuracy of this approximation depends on the signal.
+Under this assumption, the signal power is
+
+$$
+\begin{align*}
+P_\text{signal} &= \frac{1}{2A} \int_{-A}^A x^2 \mathrm{d} x \\
+&= \frac{1}{2A} \frac{1}{3} \left[ x^3 \right]_ {-A}^A \\
+&= \frac{1}{2A} \frac{1}{3} 2A^3 \\
+&= \frac{A^2}{3}
+\end{align*}
+$$
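+
+This uniform-signal power is easy to sanity-check with a Monte Carlo estimate. A Python sketch,
+assuming $$A = 1$$ (the variable names are mine):
+
+```python
+import random
+
+A = 1.0
+random.seed(0)  # fixed seed so the estimate is reproducible
+samples = [random.uniform(-A, A) for _ in range(100_000)]
+p_signal = sum(x * x for x in samples) / len(samples)
+print(p_signal)  # close to A**2 / 3
+```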
+
+When the signal is quantized, each measurement $$f(t)$$ is replaced with a quantized version. The
+most significant bit tells us which half of the signal range we're in: $$[-A, 0]$$ or $$[0, A]$$.
+Note that each of these intervals is $$A$$ long. The next bit tells us which half of that half
+we're in, and each of those intervals is $$A/2$$ long. Continuing down to the least significant
+bit, the final intervals are $$A/2^{n - 1}$$ long (or equivalently $$2A/2^n$$), where $$n$$ is the
+number of bits.
+
+The uncertainty we have about the original value is thus plus or minus half of that smallest
+interval, so the quantization error lies in the range $$[-A/2^n, A/2^n]$$. Since we're assuming our
+signal is equally likely to take any value, the quantization error is equally likely to fall
+anywhere in this range. Thus the power in the quantization noise is
+
+$$
+\begin{align*}
+P_\text{noise} &= \frac{2^n}{2A} \int_{-A/2^n}^{A/2^n} x^2 \mathrm{d} x \\
+&= \frac{2^n}{2A} \frac{1}{3} \left[ x^3 \right]_ {-A/2^n}^{A/2^n} \\
+&= \frac{2^n}{2A} \frac{1}{3} \frac{2 A^3}{2^{3n}} \\
+&= \frac{A^2}{3} \frac{1}{2^{2n}}
+\end{align*}
+$$
+
+Putting it together,
+
+$$
+\begin{align*}
+\text{SNR} &= 10 \log_{10} \left( \frac{P_\text{signal}}{P_\text{noise}} \right) \\
+&= 10 \log_{10} \left( \frac{A^2}{3} \cdot \frac{3}{A^2} \cdot 2^{2n} \right) \\
+&= 10 \log_{10} \left( 2^{2n} \right) \\
+&= 20 \log_{10} \left( 2^n \right)
+\end{align*}
+$$
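+
+We can check the whole derivation numerically by simulating a quantizer on uniform samples and
+measuring the SNR directly. A Python sketch (the quantizer and variable names are my own, and I've
+assumed a mid-rise quantizer; other bin placements change the details but not the half-step error
+bound):
+
+```python
+import math
+import random
+
+def quantize(x, n_bits, A=1.0):
+    """Map x in [-A, A] to the center of one of 2**n_bits uniform bins."""
+    step = 2 * A / 2 ** n_bits
+    i = min(int((x + A) / step), 2 ** n_bits - 1)  # bin index, clamping x == A into the top bin
+    return -A + (i + 0.5) * step
+
+n, A = 8, 1.0
+random.seed(0)  # fixed seed so the estimate is reproducible
+samples = [random.uniform(-A, A) for _ in range(100_000)]
+p_signal = sum(x * x for x in samples) / len(samples)
+p_noise = sum((quantize(x, n, A) - x) ** 2 for x in samples) / len(samples)
+snr_db = 10 * math.log10(p_signal / p_noise)
+print(snr_db)  # close to 20 * log10(2**8), about 48.2 dB
+```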
+
 ## (15.6)
 
 {:.question}
@@ -223,8 +297,9 @@ the convolutional encoder in Figure 15.20, what data were transmitted?
 This problem is harder than the others.
 
 My code for all sections of this problem is
-[here](https://gitlab.cba.mit.edu/erik/compressed_sensing). I wrote it in C++ using
-[Eigen](http://eigen.tuxfamily.org) for vector and matrix operations.
+[here](https://gitlab.cba.mit.edu/erik/compressed_sensing). The sampling and gradient descent are
+done in C++ using [Eigen](http://eigen.tuxfamily.org) for vector and matrix operations. I use Python
+and [matplotlib](https://matplotlib.org/) to generate the plots.
 
 ### (a)