diff --git a/_psets/12.md b/_psets/12.md index ee34eaa37f2e271baa5460415bd2cabe1be1fe9a..83aeb66cc9ebec19624951ecad6c246ff469358f 100644 --- a/_psets/12.md +++ b/_psets/12.md @@ -222,12 +222,20 @@ the convolutional encoder in Figure 15.20, what data were transmitted? {:.question} This problem is harder than the others. +My code for all sections of this problem is +[here](https://gitlab.cba.mit.edu/erik/compressed_sensing). I wrote it in C++ using +[Eigen](http://eigen.tuxfamily.org) for vector and matrix operations. + ### (a) {:.question} Generate and plot a periodically sampled time series {$$t_j$$} of N points for the sum of two sine waves at 697 and 1209 Hz, which is the DTMF tone for the number 1 key. +Here's a plot of 250 samples taken over one tenth of a second. + + + ### (b) {:.question} @@ -244,18 +252,28 @@ D_{ij} = \end{align*} $$ + + ### (c) {:.question} Plot the inverse transform of the {$$f_i$$} by multiplying them by the inverse of the DCT matrix (which is equal to its transpose) and verify that it matches the time series. +The original samples are recovered. + + + ### (d) {:.question} Randomly sample and plot a subset of M points {$$t^\prime_k$$} of the {$$t_j$$}; you’ll later investigate the dependence on the sample size. +Here I've selected 100 samples from the original 250. The plot is recognizable but very distorted. + + + ### (e) {:.question} @@ -270,6 +288,19 @@ $$ {:.question} and plot the resulting estimated coefficients. +Gradient descent very quickly drives the loss function to zero. However it's not reconstructing the +true DCT coefficients. + + + +To make sure I don't have a bug in my code, I plotted the samples we get by performing the inverse +DCT on the estimated coefficients. + + + +Sure enough all samples in the subset are matched exactly. But the others are way off the mark. +We've added a lot of high frequency content, and are obviously overfitting. 
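As a rough illustration of the overfitting described in (e), here is a minimal sketch of unregularized gradient descent on the subset loss. It is not the code from the linked repository: it uses plain `std::vector` instead of Eigen, and every name in it (`dct_entry`, `reconstruct`, `dtmf_one`, `random_subset`, `fit_unregularized`) is hypothetical. The step size and the all-zero starting point are also assumptions, so the iteration count and the off-subset values will not match the plots above exactly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

using Vec = std::vector<double>;

// Orthonormal DCT entry D_ij (row i = frequency index, column j = time index).
double dct_entry(std::size_t i, std::size_t j, std::size_t n) {
    const double pi = std::acos(-1.0);
    if (i == 0) return std::sqrt(1.0 / n);
    return std::sqrt(2.0 / n) * std::cos(pi * (2.0 * j + 1.0) * i / (2.0 * n));
}

// Inverse DCT at a single time index: t_j = sum_i D_ij f_i (D^-1 = D^T).
double reconstruct(const Vec& f, std::size_t j) {
    double t = 0.0;
    for (std::size_t i = 0; i < f.size(); ++i) t += dct_entry(i, j, f.size()) * f[i];
    return t;
}

// n periodic samples of the DTMF "1" tone (697 Hz + 1209 Hz) over `duration` seconds.
Vec dtmf_one(std::size_t n, double duration) {
    const double pi = std::acos(-1.0);
    Vec t(n);
    for (std::size_t j = 0; j < n; ++j) {
        const double x = duration * j / n;
        t[j] = std::sin(2.0 * pi * 697.0 * x) + std::sin(2.0 * pi * 1209.0 * x);
    }
    return t;
}

// m distinct time indices drawn from 0..n-1, returned in increasing order.
std::vector<std::size_t> random_subset(std::size_t n, std::size_t m, unsigned seed) {
    assert(m <= n);
    std::vector<std::size_t> idx(n);
    std::iota(idx.begin(), idx.end(), std::size_t{0});
    std::mt19937 rng(seed);
    std::shuffle(idx.begin(), idx.end(), rng);
    idx.resize(m);
    std::sort(idx.begin(), idx.end());
    return idx;
}

struct Fit { Vec f; double loss; int iterations; };

// Gradient descent on loss(f) = sum_k (t'_k - sum_i D_{i,idx[k]} f_i)^2,
// starting from f = 0 and stopping once the loss drops below `tol`.
Fit fit_unregularized(const Vec& samples, const std::vector<std::size_t>& idx,
                      std::size_t n, double step, double tol, int max_iter) {
    assert(samples.size() == idx.size());
    Fit fit{Vec(n, 0.0), 0.0, 0};
    Vec residual(idx.size());
    for (; fit.iterations < max_iter; ++fit.iterations) {
        fit.loss = 0.0;
        for (std::size_t k = 0; k < idx.size(); ++k) {
            residual[k] = samples[k] - reconstruct(fit.f, idx[k]);
            fit.loss += residual[k] * residual[k];
        }
        if (fit.loss < tol) break;
        for (std::size_t i = 0; i < n; ++i) {
            double grad = 0.0;  // d(loss)/d(f_i) = -2 sum_k r_k D_{i,idx[k]}
            for (std::size_t k = 0; k < idx.size(); ++k)
                grad -= 2.0 * residual[k] * dct_entry(i, idx[k], n);
            fit.f[i] -= step * grad;
        }
    }
    return fit;
}
```

Called with 250 samples over 0.1 s, a 100-point subset, and a step of 0.1, this drives the subset loss to essentially zero while leaving the other 150 samples badly wrong. Because the DCT matrix is orthogonal, starting from zero lands on the minimum-norm solution, which reconstructs the unsampled points as zero; a different starting point would give different (but still unconstrained) values there.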
+ ### (f) {:.question} @@ -287,6 +318,20 @@ $$ {:.question} and plot the resulting estimated coefficients. +With L2 regularization, we remove some of the high frequency content. This makes the real peaks a +little more prominent. + + + +However, it comes at a cost: gradient descent no longer drives the loss to zero. As such the loss +itself isn't a good termination condition. In its place I terminate when the squared norm of the +gradient is less than $$\num{1e-6}$$. The final loss for the coefficients in the plot above is +around 50. + +You can easily see that the loss is nonzero from the reconstructed samples. + + + ### (g) {:.question} @@ -301,3 +346,33 @@ $$ {:.question} Plot the resulting estimated coefficients, compare to the L2 norm estimate, and compare the dependence of the results on M to the Nyquist sampling limit of twice the highest frequency. + +With L1 regularization the DCT coefficients are recovered pretty well. There is no added high +frequency noise. + + + +It still can't drive the loss to zero. Additionally, it's hard to drive the squared norm of the +gradient to zero, since the gradient of the absolute values shows up as 1 or -1. (Though to help +prevent oscillation I actually drop this contribution if the absolute value of the coefficient in +question is less than $$\num{1e-3}$$.) So here I terminate when the relative change in the loss +falls below $$\num{1e-9}$$. + +The final loss is around 40, smaller than we were able to find with L2 regularization. However, it +did take more effort: this version converged after 21,060 iterations, as opposed to 44 (for L2) or +42 (for unregularized). + +The recovered samples are also much more recognizable. The amplitude of our waveform seems overall +a bit diminished, but unlike our previous attempts it looks similar to the original. + + + +This technique can recover the signal substantially below the Nyquist limit. 
The highest frequency +signal is 1209 Hz, so with traditional techniques we'd have to sample at 2418 Hz or faster to avoid +artifacts. Since I'm only plotting over one tenth of a second, I thus need at least 242 samples. +So my original 250 is (not coincidentally) near here. But even with a subset of only 50 samples, the +L1 regularized gradient descent does an admirable job of recovering the DCT coefficients and +samples: + + + diff --git a/assets/img/pset12_fig_a.png b/assets/img/pset12_fig_a.png new file mode 100644 index 0000000000000000000000000000000000000000..9a732c8a7b3f6de4fbab9a098287dcb545387981 Binary files /dev/null and b/assets/img/pset12_fig_a.png differ diff --git a/assets/img/pset12_fig_b.png b/assets/img/pset12_fig_b.png new file mode 100644 index 0000000000000000000000000000000000000000..430727e7d8177f5c35a65cfe70009534735ef275 Binary files /dev/null and b/assets/img/pset12_fig_b.png differ diff --git a/assets/img/pset12_fig_c.png b/assets/img/pset12_fig_c.png new file mode 100644 index 0000000000000000000000000000000000000000..2ab51bbe4e1070fe56621106e7e0f145dbcc6a9b Binary files /dev/null and b/assets/img/pset12_fig_c.png differ diff --git a/assets/img/pset12_fig_d.png b/assets/img/pset12_fig_d.png new file mode 100644 index 0000000000000000000000000000000000000000..ef37320e97e761f81c2cc1c38a660ad4fc1bd3f7 Binary files /dev/null and b/assets/img/pset12_fig_d.png differ diff --git a/assets/img/pset12_fig_e.png b/assets/img/pset12_fig_e.png new file mode 100644 index 0000000000000000000000000000000000000000..a7d0e4ed9ab67096d407361efb1bd782ed630ff6 Binary files /dev/null and b/assets/img/pset12_fig_e.png differ diff --git a/assets/img/pset12_fig_e_2.png b/assets/img/pset12_fig_e_2.png new file mode 100644 index 0000000000000000000000000000000000000000..af87ba881cc9e14be2ab36ae3962dd0cb2ec28b4 Binary files /dev/null and b/assets/img/pset12_fig_e_2.png differ diff --git a/assets/img/pset12_fig_f.png b/assets/img/pset12_fig_f.png new file mode 
100644 index 0000000000000000000000000000000000000000..b8abd0990175f5242c669c4005ab6bc57ef33f01 Binary files /dev/null and b/assets/img/pset12_fig_f.png differ diff --git a/assets/img/pset12_fig_f_2.png b/assets/img/pset12_fig_f_2.png new file mode 100644 index 0000000000000000000000000000000000000000..dc8f7854097ebf4c741d926c80045c205f7ee9cb Binary files /dev/null and b/assets/img/pset12_fig_f_2.png differ diff --git a/assets/img/pset12_fig_g.png b/assets/img/pset12_fig_g.png new file mode 100644 index 0000000000000000000000000000000000000000..3ff1a21f6ec7d278180fca9f0293643e82993063 Binary files /dev/null and b/assets/img/pset12_fig_g.png differ diff --git a/assets/img/pset12_fig_g_2.png b/assets/img/pset12_fig_g_2.png new file mode 100644 index 0000000000000000000000000000000000000000..3dc73ce49fafeb8338bc0d3a58bdbb8942a1b789 Binary files /dev/null and b/assets/img/pset12_fig_g_2.png differ diff --git a/assets/img/pset12_fig_g_3.png b/assets/img/pset12_fig_g_3.png new file mode 100644 index 0000000000000000000000000000000000000000..15e35844d56bcdaa22258fb5c9076589b0f8e644 Binary files /dev/null and b/assets/img/pset12_fig_g_3.png differ diff --git a/assets/img/pset12_fig_g_4.png b/assets/img/pset12_fig_g_4.png new file mode 100644 index 0000000000000000000000000000000000000000..3e94e446a2771d1c89f27426358ace0cd89b65ac Binary files /dev/null and b/assets/img/pset12_fig_g_4.png differ