Problem Set 2
3.1
(a)
{:.question} Derive equation (3.16) (i.e. the Poisson probability density function) from the binomial distribution and Stirling’s approximation.
Fix any (positive, real) \lambda. For any n, the binomial probability mass function for n trials with probability of success p = \lambda / n is
\begin{align*} p(k; n, \lambda/n) &= {n \choose k} \left( \frac{\lambda}{n} \right)^k \left( 1 - \frac{\lambda}{n} \right)^{n - k} \\ &= \frac{n (n - 1) \cdots (n - k + 1)}{k!} \left( \frac{\lambda}{n} \right)^k \left( 1 - \frac{\lambda}{n} \right)^{n - k} \\ &= \frac{n^k + \mathcal{O}\left( n^{k - 1} \right) }{k!} \left( \frac{\lambda}{n} \right)^k \left( 1 - \frac{\lambda}{n} \right)^{n - k} \\ &= \frac{\lambda^k}{k!} \left( 1 - \frac{\lambda}{n} \right)^{n - k} + \frac{\mathcal{O}\left( n^{k - 1} \right) }{k!} \left( \frac{\lambda}{n} \right)^k \left( 1 - \frac{\lambda}{n} \right)^{n - k} \end{align*}
As n approaches infinity, the second term goes to zero: the factor \left( \lambda / n \right)^k contributes n^{-k}, which dominates the \mathcal{O}\left( n^{k - 1} \right) in the numerator (and the last factor is always between 0 and 1). Meanwhile \left( 1 - \lambda/n \right)^{n - k} \to e^{-\lambda}, so the first term approaches \lambda^k e^{-\lambda} / k!, which is the Poisson distribution.
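This limit is easy to check numerically. Here's a minimal sketch using scipy's distributions; the values of \lambda and k are arbitrary choices for illustration.

```python
# Sanity check: binomial(n, lambda/n) converges to Poisson(lambda) as n grows.
from scipy.stats import binom, poisson

lam, k = 3.0, 5  # arbitrary rate and count, chosen for illustration
for n in (10, 100, 1000, 10000):
    b = binom.pmf(k, n, lam / n)
    print(f"n = {n:>5}: binomial pmf = {b:.6f}")
print(f"Poisson pmf:        {poisson.pmf(k, lam):.6f}")
```

The binomial column should agree with the Poisson value to more and more digits as n increases.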
(b)
{:.question} Use it to derive equation (3.18) (\langle k (k - 1) \cdots (k - m + 1) \rangle = \lambda^m) when k follows the Poisson distribution with average number of events \lambda.
First, let's verify that \langle k \rangle = \lambda. This will serve as the base case for an inductive argument.
\begin{align*} \langle k \rangle &= \sum_{k = 0}^\infty k \frac{\lambda^k e^{-\lambda}}{k!} \\ &= \lambda \sum_{k = 1}^\infty \frac{\lambda^{k - 1} e^{-\lambda}}{(k - 1)!} \\ &= \lambda \sum_{k = 0}^\infty \frac{\lambda^k e^{-\lambda}}{k!} \\ &= \lambda \end{align*}
Now assume that \langle k (k - 1) \cdots (k - m + 1) \rangle = \lambda^m.
\begin{align*} \langle k (k - 1) \cdots (k - m + 1) (k - m) \rangle &= \sum_{k = 0}^\infty k \cdots (k - m) \frac{\lambda^k e^{-\lambda}}{k!} \\ &= \lambda \sum_{k = 1}^\infty (k - 1) \cdots (k - m) \frac{\lambda^{k - 1} e^{-\lambda}}{(k - 1)!} \\ &= \lambda \sum_{k = 0}^\infty k \cdots (k - m + 1) \frac{\lambda^k e^{-\lambda}}{k!} \\ &= \lambda \lambda^m \\ &= \lambda^{m + 1} \end{align*}
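As a numerical sanity check of the induction (a sketch; \lambda, the range of m, and the truncation point of the sum are all arbitrary choices):

```python
# Check <k (k-1) ... (k-m+1)> = lambda^m for a Poisson distribution.
import numpy as np
from scipy.stats import poisson

lam = 2.5                        # arbitrary rate
k = np.arange(200)               # truncate the sum; the tail is negligible
pmf = poisson.pmf(k, lam)
for m in range(1, 5):
    falling = np.ones(len(k))
    for j in range(m):
        falling *= k - j         # builds k (k-1) ... (k-m+1)
    print(m, falling @ pmf, lam**m)  # the two columns should match
```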
(c)
{:.question} Use (b) to derive equation (3.19): \sigma / \langle k \rangle = 1 / \sqrt{\lambda}.
To compute \sigma, we need to know what \langle k^2 \rangle is. It can be found using the same index-shifting trick we relied on in part (b).
\begin{align*} \langle k^2 \rangle &= \sum_{k = 0}^\infty k^2 \frac{\lambda^k e^{-\lambda}}{k!} \\ &= \lambda \sum_{k = 1}^\infty k \frac{\lambda^{k - 1} e^{-\lambda}}{(k - 1)!} \\ &= \lambda \sum_{k = 0}^\infty (k + 1) \frac{\lambda^k e^{-\lambda}}{k!} \\ &= \lambda \left( \sum_{k = 0}^\infty k \frac{\lambda^k e^{-\lambda}}{k!} + \sum_{k = 0}^\infty \frac{\lambda^k e^{-\lambda}}{k!} \right) \\ &= \lambda (\lambda + 1) \end{align*}
Thus \sigma^2 = \langle k^2 \rangle - \langle k \rangle^2 = \lambda (\lambda + 1) - \lambda^2 = \lambda, and \sigma / \langle k \rangle = \sqrt{\lambda} / \lambda = 1 / \sqrt{\lambda}.
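The same style of check (again with an arbitrary \lambda) confirms the ratio:

```python
# Check sigma / <k> = 1 / sqrt(lambda) for a Poisson distribution.
import numpy as np
from scipy.stats import poisson

lam = 7.0                                     # arbitrary rate
mean, var = poisson.stats(lam, moments="mv")
print(np.sqrt(var) / mean, 1 / np.sqrt(lam))  # both ~0.378
```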
3.2
{:.question} Assume that photons are generated by a light source independently and randomly with an average rate N per second. How many must be counted by a photodetector per second to be able to determine the rate to within 1%? To within 1 part per million? How many watts do these cases correspond to for visible light?
Note: in the previous problem I used \lambda as the expected number of events of a Poisson process. Here I'll use N to avoid confusion with wavelength.
Since the photons are generated independently and with a constant average rate, it's reasonable to model their creation as a Poisson process. We have already found that \sigma = \sqrt{N}. For large N, the Poisson distribution is very close to the normal distribution, so about two thirds of the probability mass lies between N - \sigma and N + \sigma. Thus if \sigma \leq 0.01 N, in any given second it's more likely than not that the number of photons emitted is within one percent of the true mean. So we need \sqrt{N} \leq 0.01 N, i.e. N \geq 10^4. To have the same probability that the number of observed photons is within 10^{-6} N of the true value, we need N \geq 10^{12}.
The wavelength of visible light is about \num{500e-9} \si{m}, so the energy of each photon will be
\begin{align*} E &= \frac{h c}{\lambda} \\ &= \frac{\num{6.626e-34} \si{J.s} \cdot \num{3e8} \si{m/s}} {\num{500e-9} \si{m}} \\ &= \num{4.0e-19} \si{J} \end{align*}
Thus 10^4 photons per second is \num{4.0e-15} \si{W}, and 10^{12} photons per second is \num{4.0e-7} \si{W}.
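Here's the arithmetic in one place (a sketch; the 500 nm wavelength is the same round value assumed above):

```python
# Photon rates needed for 1% and 1 ppm precision, and the corresponding power.
h = 6.626e-34        # Planck's constant, J s
c = 3.0e8            # speed of light, m/s
wavelength = 500e-9  # assumed visible wavelength, m

E = h * c / wavelength              # energy per photon, ~4.0e-19 J
for frac in (1e-2, 1e-6):
    N = frac**-2                    # from sqrt(N) <= frac * N
    print(f"{frac:.0e}: N = {N:.0e} photons/s, P = {N * E:.1e} W")
```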
Some caveats about this answer. First, the Poisson distribution is strictly greater than zero on its whole domain, so one can never be certain of having found the correct value within any error bounds. That is, even if the real value of N is one, there's still a (vanishingly) slight chance you'll see a million. The best we can do is find an answer that's probably approximately correct.
Second, I only found limits on the probability that the number of emitted photons is near the true (known) average. It's a more subtle question to ask for the probability that a given estimate is within one percent of the true (latent) average. That could be tackled with confidence intervals, p-values, Bayes factors, or direct posterior estimation, though none would be quite so simple. And as an unapologetic Bayesian I'm required to point out that my confidence that I've determined N with some precision depends on my prior for N. This would be very real if we were, for instance, counting photons in a lab module for this course. I would be more confident in my answer if I measured something close to \num{1e12} photons than if I measured something like \num{3.64e12}, since I think it's more likely that the instructor would choose a nice round number with lots of zeros.
3.3
{:.question} Consider an audio amplifier with a 20 kHz bandwidth.
(a)
{:.question} If it is driven by a voltage source at room temperature with a source impedance of 10 kΩ, how large must the input voltage be for the SNR with respect to the source Johnson noise to be 20 dB?
\begin{align*} \langle V_\text{noise}^2 \rangle &= 4 k T R \Delta f \\ &= 4 \cdot \num{1.38e-23} \si{J/K} \cdot 300 \si{K} \cdot 10^4 \si{\ohm} \cdot \num{20e3} \si{Hz} \\ &= \num{3.3e-12} \si{V^2} \end{align*}
So for the input signal power to be 20 dB above the noise, we need
\begin{align*} 10 \log_{10} \left( \frac{V_\text{signal}^2}{\num{3.3e-12} \si{V^2}} \right) &= 20 \\ V_\text{signal}^2 &= 10^2 \cdot \num{3.3e-12} \si{V^2} = \num{3.3e-10} \si{V^2} \end{align*}
Note that I use a factor of 10 since V^2 is proportional to power. A sinusoid with mean square \num{3.3e-10} \si{V^2} has amplitude \sqrt{2 \cdot \num{3.3e-10}} \si{V} \approx \num{2.6e-5} \si{V}, so this signal strength would be achieved, for instance, by a sinusoid that ranges between about \num{-2.6e-5} and \num{2.6e-5} volts.
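A minimal sketch of the same computation (using the values from the problem statement):

```python
# Johnson noise of the source, and the signal needed for a 20 dB SNR.
import numpy as np

k_B, T = 1.38e-23, 300.0   # Boltzmann constant (J/K), room temperature (K)
R, df = 10e3, 20e3         # source impedance (ohm), bandwidth (Hz)

v2_noise = 4 * k_B * T * R * df        # ~3.3e-12 V^2
v2_signal = 10**(20 / 10) * v2_noise   # 20 dB above the noise power
amplitude = np.sqrt(2 * v2_signal)     # sine amplitude with that mean square
print(v2_noise, v2_signal, amplitude)  # ~3.3e-12, ~3.3e-10, ~2.6e-5
```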
(b)
{:.question} What size capacitor has voltage fluctuations that match the magnitude of this Johnson noise?
Since a farad is equal to a second per ohm, purely dimensional considerations imply that the mean square voltage fluctuation of the capacitor is approximately kT/C. Indeed this turns out to be exactly correct. The energy in a capacitor is C V^2 / 2, so by the equipartition theorem, kT/2 = \langle C V^2 / 2 \rangle. This implies that \langle V^2 \rangle = k T / C. Thus the capacitor that matches the voltage fluctuations found in part (a) at room temperature has capacitance
\begin{align*} C &= \frac{k T}{\left \langle V^2 \right \rangle} \\ &= \frac{\num{1.38e-23} \si{J/K} \cdot 300 \si{K}}{\num{3.3e-12} \si{V^2}} \\ &= \num{1.25e-9} \si{F} \end{align*}
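Or as a tiny sketch of the same arithmetic:

```python
# Capacitance whose kT/C noise matches the Johnson noise from part (a).
k_B, T = 1.38e-23, 300.0   # Boltzmann constant (J/K), room temperature (K)
v2 = 3.3e-12               # mean square Johnson noise from part (a), V^2
print(k_B * T / v2)        # ~1.25e-9 F, i.e. about 1.3 nF
```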
(c)
{:.question} If it is driven by a current source, how large must it be for the RMS shot noise to be equal to 1% of that current?
RMS shot noise is the square root of 2 q I \Delta f. So we want \sqrt{2 q \Delta f I} = 0.01 I, which is solved by
\begin{align*} I &= \frac{2q \Delta f}{0.01^2} \\ &= \frac{2 \cdot \num{1.6e-19} \si{C} \cdot \num{20e3} \si{Hz}}{10^{-4}} \\ &= \num{6.4e-11} \si{A} \end{align*}
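And the corresponding sketch:

```python
# Current at which the RMS shot noise equals 1% of the current itself.
q, df = 1.6e-19, 20e3        # electron charge (C), bandwidth (Hz)
frac = 0.01                  # desired noise-to-signal ratio
print(2 * q * df / frac**2)  # from sqrt(2 q I df) = frac * I; ~6.4e-11 A
```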
3.4
{:.question} This problem is much harder than the others. Consider a stochastic process x(t) that randomly switches between x = 0 and x = 1. Let \alpha \mathrm{d}t be the probability that it makes a transition from 0 to 1 during the interval \mathrm{d}t if it starts in x = 0, and let \beta \mathrm{d}t be the probability that it makes a transition from 1 to 0 during \mathrm{d}t if it starts in x = 1.
(a)
{:.question} Write a matrix differential equation for the change in time of the probability p_0(t) to be in the 0 state and the probability p_1(t) to be in the 1 state.
\frac{d}{dt} \begin{bmatrix} p_0(t) \\ p_1(t) \end{bmatrix} = \begin{bmatrix} -\alpha & \beta \\ \alpha & -\beta \end{bmatrix} \begin{bmatrix} p_0(t) \\ p_1(t) \end{bmatrix}
(b)
{:.question} Solve this by diagonalizing the 2 × 2 matrix.
Solving a system of ODEs isn't necessary here, since p_1(t) = 1 - p_0(t). So we just need to solve
\begin{align*} \frac{d}{dt} p_0(t) &= -\alpha p_0(t) + \beta (1 - p_0(t)) \\ &= \beta -(\alpha + \beta) p_0(t) \end{align*}
Since the derivative is an affine function of p_0(t), the general solution is a constant plus an exponential
p_0(t) = A + B e^{-(\alpha + \beta) t}
Then
\begin{align*} \frac{d}{dt} p_0(t) &= -(\alpha + \beta) B e^{-(\alpha + \beta) t} \\ &= (\alpha + \beta) A - (\alpha + \beta) (A + B e^{-(\alpha + \beta) t}) \\ &= (\alpha + \beta) A - (\alpha + \beta) p_0(t) \end{align*}
Comparing with the original equation shows that (\alpha + \beta) A = \beta, i.e. A = \beta / (\alpha + \beta). B is determined by the initial condition p_0(0):
p_0(0) = \frac{\beta}{\alpha + \beta} + B
So putting everything together we have
$$ \begin{align*} p_0(t) &= \frac{\beta}{\alpha + \beta} + \left(p_0(0) - \frac{\beta}{\alpha + \beta} \right) e^{-(\alpha + \beta) t} \\ p_1(t) &= \frac{\alpha}{\alpha + \beta} - \left(p_0(0) - \frac{\beta}{\alpha + \beta} \right) e^{-(\alpha + \beta) t} \\ &= \frac{\alpha}{\alpha + \beta} + \left(p_1(0) - \frac{\alpha}{\alpha + \beta} \right) e^{-(\alpha + \beta) t} \end{align*} $$
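To make sure I haven't dropped a sign, here's a quick sketch comparing this formula against direct numerical integration of the master equation (the rates and initial condition are arbitrary test values):

```python
# Compare the analytic p0(t) against numerical integration of the ODE.
import numpy as np
from scipy.integrate import solve_ivp

alpha, beta, p0_init = 1.3, 0.4, 0.9   # arbitrary test values

def rhs(t, p):  # the master equation for [p0, p1]
    return [-alpha * p[0] + beta * p[1], alpha * p[0] - beta * p[1]]

ts = np.linspace(0.0, 5.0, 11)
sol = solve_ivp(rhs, (0.0, 5.0), [p0_init, 1.0 - p0_init],
                t_eval=ts, rtol=1e-10, atol=1e-12)
p0_exact = (beta / (alpha + beta)
            + (p0_init - beta / (alpha + beta)) * np.exp(-(alpha + beta) * ts))
print(np.max(np.abs(sol.y[0] - p0_exact)))  # tiny, at the solver tolerance
```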
(c)
{:.question} Use this solution to find the autocorrelation function \langle x(t)x(t + \tau) \rangle.
For positive \tau, and assuming the process has run long enough to reach its stationary distribution (so that p_1(t) = \alpha / (\alpha + \beta)),
\begin{align*} \langle x(t) x(t + \tau) \rangle &= \sum_{(i, j) \in \{0, 1\}^2} p(x(t) = i \cap x(t + \tau) = j) \, i j \\ &= p(x(t) = 1 \cap x(t + \tau) = 1) \\ &= p_1(t + \tau | x(t) = 1) p_1(t) \\ &= p_1(\tau | p_1(0) = 1) p_1(t) \\ &= \left( \frac{\alpha}{\alpha + \beta} + \left(1 - \frac{\alpha}{\alpha + \beta} \right) e^{-(\alpha + \beta) \tau} \right) \frac{\alpha}{\alpha + \beta} \\ &= \left( \frac{\alpha}{\alpha + \beta} + \frac{\beta}{\alpha + \beta} e^{-(\alpha + \beta) \tau} \right) \frac{\alpha}{\alpha + \beta} \\ &= \frac{\alpha}{(\alpha + \beta)^2} \left( \alpha + \beta e^{-(\alpha + \beta) \tau} \right) \end{align*}
By symmetry this is an even function. That is, \langle x(t) x(t + \tau) \rangle = \langle x(t) x(t - \tau) \rangle.
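This can also be checked by simulating the telegraph process directly. Here's a sketch; the small-step discretization only approximates the continuous-time process, and all parameter values are arbitrary.

```python
# Estimate <x(t) x(t + tau)> by simulating the random telegraph process.
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 1.0, 2.0      # arbitrary transition rates (1/s)
dt, n = 1e-3, 1_000_000     # time step (s) and number of steps

x = np.empty(n, dtype=np.int8)
x[0] = 0
u = rng.random(n - 1)       # uniform draws for the transition tests
for i in range(1, n):
    p_flip = alpha * dt if x[i - 1] == 0 else beta * dt
    x[i] = 1 - x[i - 1] if u[i - 1] < p_flip else x[i - 1]

tau = 0.5                   # lag (s)
lag = int(tau / dt)
estimate = np.mean(x[:-lag].astype(float) * x[lag:])
exact = alpha / (alpha + beta)**2 * (alpha + beta * np.exp(-(alpha + beta) * tau))
print(estimate, exact)      # should agree to a couple of digits
```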
(d)
{:.question} Use the autocorrelation function to show that the power spectrum is a Lorentzian.
By the Wiener-Khinchin theorem,
\begin{align*} S(f) &= \int_\mathbb{R} \frac{\alpha}{(\alpha + \beta)^2}\left( \alpha + \beta e^{-(\alpha + \beta) |\tau|} \right) e^{-2 \pi i f \tau} \mathrm{d} \tau \\ &= \frac{\alpha \beta}{(\alpha + \beta)^2} \left( \int_\mathbb{R} \frac{\alpha}{\beta} e^{-2 \pi i f \tau} \mathrm{d} \tau + \int_\mathbb{R} e^{-(\alpha + \beta) |\tau| -2 \pi i f \tau} \mathrm{d} \tau \right) \end{align*}
Since \int_\mathbb{R} e^{-2 \pi i f \tau} \mathrm{d} \tau = \delta(f), the first integral evaluates to \frac{\alpha}{\beta} \delta(f). The second can be broken into a sum of two integrals over the positive and negative halves of \mathbb{R}:
\begin{align*} \int_0^\infty e^{-\tau ((\alpha + \beta) + 2 \pi i f)} \mathrm{d} \tau &= \left[ \frac{-1}{(\alpha + \beta) + 2 \pi i f} e^{-\tau ((\alpha + \beta) + 2 \pi i f)} \right]_0^\infty \\ &= \frac{1}{(\alpha + \beta) + 2 \pi i f} \\ \int_{-\infty}^0 e^{\tau ((\alpha + \beta) - 2 \pi i f)} \mathrm{d} \tau &= \left[ \frac{1}{(\alpha + \beta) - 2 \pi i f} e^{\tau ((\alpha + \beta) - 2 \pi i f)} \right]_{-\infty}^0 \\ &= \frac{1}{(\alpha + \beta) - 2 \pi i f} \\ \end{align*}
Putting everything together,
$$ \begin{align*} S(f) &= \frac{\alpha \beta}{(\alpha + \beta)^2} \left( \frac{\alpha}{\beta} \delta(f) + \frac{1}{(\alpha + \beta) + 2 \pi i f} + \frac{1}{(\alpha + \beta) - 2 \pi i f} \right) \\ &= \frac{\alpha \beta}{(\alpha + \beta)^2} \left( \frac{\alpha}{\beta} \delta(f) + \frac{2 (\alpha + \beta)}{(\alpha + \beta)^2 + (2 \pi f)^2} \right) \\ &= \frac{\alpha^2}{(\alpha + \beta)^2} \delta(f) + \frac{\alpha \beta}{(\alpha + \beta)^2} \frac{2 (\alpha + \beta)}{(\alpha + \beta)^2 + (2 \pi f)^2} \\ &= \frac{\alpha^2}{(\alpha + \beta)^2} \delta(f) + \frac{\alpha \beta}{(\alpha + \beta)^2} \frac{2 (\alpha + \beta)^{-1}}{1 + \left( \frac{2 \pi f}{\alpha + \beta} \right)^2} \end{align*} $$
Ignoring the delta function (a DC term that comes from the nonzero mean of x), this is, up to a constant factor, a Lorentzian with time constant \tau = 1 / (\alpha + \beta).
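As a check on the algebra, here's a sketch that computes the Fourier transform of the decaying part of the autocorrelation by direct quadrature (dropping the constant part, which produces the delta function) and compares it with the closed form; the rates and test frequency are arbitrary.

```python
# Fourier transform the decaying part of the autocorrelation numerically
# and compare against the Lorentzian derived above.
import numpy as np
from scipy.integrate import quad

alpha, beta = 1.0, 2.0  # arbitrary rates (1/s)
f = 0.7                 # arbitrary test frequency (Hz)

# decaying part of <x(t) x(t + tau)>; the constant part gives the delta
g = lambda tau: alpha * beta / (alpha + beta)**2 * np.exp(-(alpha + beta) * abs(tau))
# the imaginary part of the Fourier integral vanishes since g is even
numeric, _ = quad(lambda tau: g(tau) * np.cos(2 * np.pi * f * tau), -50, 50)
exact = (alpha * beta / (alpha + beta)**2
         * 2 * (alpha + beta) / ((alpha + beta)**2 + (2 * np.pi * f)**2))
print(numeric, exact)   # should agree closely
```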
(e)
{:.question} At what frequency is the magnitude of the Lorentzian reduced by half relative to its low-frequency value?
Looking only at the Lorentzian portion,
S(0) = \frac{2 \alpha \beta}{(\alpha + \beta)^3}
So we set the Lorentzian equal to half of this value and solve for f:
\begin{align*} \frac{\alpha \beta}{(\alpha + \beta)^2} \frac{2 (\alpha + \beta)^{-1}}{1 + \left( \frac{2 \pi f}{\alpha + \beta} \right)^2} &= \frac{1}{2} \frac{2 \alpha \beta}{(\alpha + \beta)^3} \\ \frac{1}{1 + \left( \frac{2 \pi f}{\alpha + \beta} \right)^2} &= \frac{1}{2} \\ 1 &= \left( \frac{2 \pi f}{\alpha + \beta} \right)^2 \\ f &= \frac{\alpha + \beta}{2 \pi} \end{align*}
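A quick sketch confirms the corner frequency (with the same arbitrary rates as in the earlier snippets):

```python
# Check that the Lorentzian drops to half its f = 0 value at (alpha+beta)/(2 pi).
import numpy as np

alpha, beta = 1.0, 2.0  # arbitrary rates (1/s)

def lorentzian(f):  # the non-delta part of S(f)
    return (alpha * beta / (alpha + beta)**2
            * 2 / (alpha + beta) / (1 + (2 * np.pi * f / (alpha + beta))**2))

f_half = (alpha + beta) / (2 * np.pi)
print(lorentzian(f_half) / lorentzian(0.0))  # exactly 0.5
```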
(f)
{:.question} For a thermally activated process, show that a flat distribution of barrier energies leads to a distribution of switching times p(\tau) \propto 1/\tau, and in turn to S(f) \propto 1/f.
We are assuming the distribution of barrier energies p(E) is constant. According to (3.37), for a thermally activated process the characteristic switching time is a function of the energy: \tau = \tau_0 e^{E/kT}. So to get the distribution p(\tau), we just need to transform p(E) accordingly. In particular,
p(\tau) = p(E) \frac{k T}{\tau}
Let's prove that this is the case. Given any random variable X and monotonically increasing function f, let Y = f(X). (For a decreasing f, the same argument goes through with the absolute value of the derivative.) Then the cumulative distribution functions are related by
\begin{align*} p(Y \leq y) &= p(f(X) \leq y) \\ &= p(X \leq f^{-1}(y)) \end{align*}
By the fundamental theorem of calculus, the probability density function is the derivative of the cumulative distribution function. So by employing the chain rule we find that
\begin{align*} p(y) &= \frac{d}{dy} p(Y \leq y) \\ &= \frac{d}{dy} p(X \leq f^{-1}(y)) \\ &= p(f^{-1}(y)) \frac{d}{dy} f^{-1}(y) \end{align*}
So our result above for p(\tau) follows because E = k T \log(\tau / \tau_0), so dE / d\tau = k T / \tau.
Now we must show that S(f) \propto 1/f.
\begin{align*} S(f) &= \int_0^\infty p(\tau) \frac{2 \tau}{1 + (2 \pi f \tau)^2} \mathrm{d} \tau \\ &= \int_0^\infty \frac{2 p(E) k T}{1 + (2 \pi f \tau)^2} \mathrm{d} \tau \\ &= \frac{1}{2 \pi f} \int_0^\infty \frac{2 p(E) k T}{1 + u^2} \mathrm{d} u \\ &= \frac{p(E) k T}{\pi f} \int_0^\infty \frac{\mathrm{d} u}{1 + u^2} \\ &= \frac{p(E) k T}{\pi f} \int_0^{\pi/2} \frac{\sec^2 \theta}{1 + \tan^2 \theta} \mathrm{d} \theta \\ &= \frac{p(E) k T}{\pi f} \int_0^{\pi/2} \mathrm{d} \theta \\ &= \frac{p(E) k T}{2 f} \end{align*}
Here the substitutions are u = 2 \pi f \tau in the third line and u = \tan \theta in the fifth.
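Numerically (a sketch; I set p(E) k T = 1 and let scipy handle the improper integral), f S(f) should come out constant:

```python
# Check that averaging Lorentzians over p(tau) ~ 1/tau gives S(f) ~ 1/f.
import numpy as np
from scipy.integrate import quad

pE_kT = 1.0  # p(E) * k * T, arbitrary units

def S(f):
    integrand = lambda tau: 2 * pE_kT / (1 + (2 * np.pi * f * tau)**2)
    val, _ = quad(integrand, 0, np.inf)
    return val

for f in (0.1, 1.0, 10.0):
    print(f, f * S(f))  # f * S(f) should equal p(E) k T / 2 = 0.5
```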