Longest Path Search

A non-counting lower bound for the expected distance of a simple random walk

Posted 2023-06-20

It's been a while since I've updated the blog (likely because I've been struggling to get it to work with GitHub Pages...). Anyways, it'll be migrated over at some point, but for now this will have to do.

This post will focus on a particular, nearly silly, proof of a lower bound for the expected distance of an unbiased random walk, defined as

$$ X = \sum_{i=1}^n X_i, $$

where the $X_i \sim \{\pm 1\}$ are drawn uniformly and independently. The quantity we want to find a lower bound for is

$$ \mathbf{E}[|X|], $$

as $n$ becomes large. We know from a basic, if somewhat annoying, counting argument that

$$ \mathbf{E}[|X|] \sim \sqrt{\frac{2}{\pi}}\sqrt{n}, $$

when $n \gg 1$. In general, we're interested in bounds of the form

$$ \mathbf{E}[|X|] \ge \Omega(\sqrt{n}). $$

Bounds like these appear in a number of important lower bounds for online convex optimization (see, e.g., Hazan's lovely overview, section 3.2), though we won't talk too much about the applications in this post.

Additionally, since $\mathbf{E}[X^2] = n$ (which follows by expanding the square and using the fact that the $X_i$ are independent with mean zero), we have

$$ \mathbf{E}[|X|] \le \sqrt{\mathbf{E}[X^2]} = \sqrt{n}, $$

so we know that this bound is tight up to a constant. The inequality here follows from an application of Jensen's inequality to the square root function (which is concave).
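Both facts are easy to sanity-check numerically. Here is a quick Monte Carlo sketch (the walk length and trial count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 1000, 20_000

# Each row is one walk: n independent +/-1 steps (int8 keeps memory small).
steps = 2 * rng.integers(0, 2, size=(trials, n), dtype=np.int8) - 1
X = steps.sum(axis=1, dtype=np.int64)
mean_abs = np.abs(X).mean()

print(f"estimated E|X|: {mean_abs:.2f}")
print(f"sqrt(2n/pi):    {np.sqrt(2 * n / np.pi):.2f}")  # asymptotic value
print(f"sqrt(n):        {np.sqrt(n):.2f}")              # upper bound
```

The estimate should land close to $\sqrt{2n/\pi} \approx 25.2$ and below $\sqrt{n} \approx 31.6$.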

Why a non-counting proof?

Mostly because I'm bad at counting and always end up with a hilarious number of errors. Plus, this proof is easily generalizable to a number of other similar results!

Proof idea

One simple method for lower-bounding the expectation of a variable like $|X|$ is to note that $|X|$ is nonnegative, so we have the following 'silly' bound

$$ \mathbf{E}[|X|] \ge \mathbf{E}[a\mathbf{1}_{|X| \ge a}] = a \mathbf{Pr}(|X| \ge a), $$

for any $a \ge 0$, where $\mathbf{1}_{|X| \ge a}$ is the indicator function for the event $|X| \ge a$; that is, it is 1 if $|X| \ge a$ and zero otherwise. (The bound follows from the fact that $|X| \ge a \mathbf{1}_{|X|\ge a}$ pointwise.) If we have a somewhat tight lower bound on the probability that $|X| \ge a$, then maximizing over $a$ might give us a reasonable lower bound.
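To get a feel for how much this 'silly' bound can recover, we can compute $a\,\mathbf{Pr}(|X| \ge a)$ exactly for a small walk (the pmf of $X$ is a shifted binomial) and maximize over $a$ by brute force; a sketch, with `n = 100` chosen arbitrarily:

```python
import math

def walk_pmf(n):
    # X = 2*B - n where B ~ Binomial(n, 1/2); exact pmf on {-n, ..., n}.
    return {2 * k - n: math.comb(n, k) / 2**n for k in range(n + 1)}

n = 100
pmf = walk_pmf(n)
exact = sum(abs(x) * p for x, p in pmf.items())            # true E|X|
best = max(a * sum(p for x, p in pmf.items() if abs(x) >= a)
           for a in range(n + 1))                          # best a * Pr(|X| >= a)

print(f"E|X| = {exact:.3f}, best a * Pr(|X| >= a) = {best:.3f}")
```

The maximized bound captures a constant fraction of the true expectation, which is all we need for an $\Omega(\sqrt{n})$ result.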

In a very general sense, we want to show that $|X|$ is 'anticoncentrated'; i.e., that it is reasonably 'spread out' away from zero, which would indicate that its expectation cannot be too small, since it is nonnegative.

Attempt #1

The first idea (or, at least, my first idea) would be to note that, since $\mathbf{E}[X^2]$ is on the order of $n$, maybe we can use this fact to construct a bound on $\mathbf{E}[|X|]$ which 'should be' on the order of $\sqrt{n}$, assuming some niceness conditions; for example, that $|X| \le n$ is a bounded variable.

Unfortunately, just these two simple facts are not enough to prove the claim! We can construct a nonnegative random variable $Y \ge 0$ whose second moment is $\mathbf{E}[Y^2] = n$ and which is bounded by $Y \le n$, yet $\mathbf{E}[Y] = 1$. In other words, we want a variable that is very concentrated around $0$, with 'sharp' peaks at larger values.

Of course, the simplest example would be to take $Y = n$ with probability $1/n$ and $Y = 0$ with probability $1 - 1/n$. Clearly, this variable is bounded, and has $n$ as its second moment. On the other hand,

$$ \mathbf{E}[Y] = (1/n)n + (1-1/n)0 = 1, $$

which means that the best bound we can hope for, using just these conditions on a variable (nonnegativity, boundedness, and a second-moment bound), is a constant. (Indeed, a basic argument shows that this is the smallest expectation possible under these constraints.)
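The counterexample's moments are a trivial computation, but exact arithmetic makes the scaling vivid:

```python
from fractions import Fraction

# Y = n with probability 1/n, else 0: bounded by n, E[Y^2] = n, yet E[Y] = 1.
def moments(n):
    p = Fraction(1, n)
    return p * n, p * n**2   # (E[Y], E[Y^2]), computed exactly

for n in (10, 100, 10**6):
    EY, EY2 = moments(n)
    assert EY == 1 and EY2 == n

print("E[Y] = 1 for every n, even as E[Y^2] = n grows without bound")
```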

This suggests that we need a little more control over the tails of X|X|, which gets us to...

Attempt #2 (and solution)

Another easy quantity to compute in this case is $\mathbf{E}[X^4]$. (And, really, any even power of $X$ is easy. On the other hand, since $X$ has a distribution that is symmetric around 0, all odd moments are 0.) Splitting the sum out into each of the possible quartic terms, we find that any term containing an odd power of some $X_i$ will be zero in expectation, as the $X_i$ are independent and symmetric. The surviving terms are the $n$ pure fourth powers $X_i^4$ and the mixed terms $X_i^2X_j^2$ with $i \ne j$, each of which appears $4!/(2!\,2!) = 6$ times per unordered pair, or three times per ordered pair. So, we find

$$ \mathbf{E}[X^4] = \sum_{i} \mathbf{E}[X_i^4] + 3\sum_{i\ne j} \mathbf{E}[X_i^2X_j^2] = n + 3n(n-1) = 3n^2 - 2n \le 3n^2. $$

This quantity will come in handy soon.
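The closed form $\mathbf{E}[X^4] = 3n^2 - 2n$ is easy to confirm by exhaustive enumeration for small $n$ (a brute-force sketch):

```python
from itertools import product

def fourth_moment(n):
    # Exact E[X^4]: average (sum of signs)^4 over all 2^n sign sequences.
    return sum(sum(s) ** 4 for s in product((-1, 1), repeat=n)) / 2**n

for n in range(1, 11):
    assert fourth_moment(n) == 3 * n**2 - 2 * n

print("E[X^4] = 3n^2 - 2n verified for n = 1, ..., 10")
```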

We can, on the other hand, split up the expectation of $X^2$ in a variety of ways. One is particularly handy for getting a tail lower bound like the one we wanted in our proof idea (above):

$$ \mathbf{E}[X^2] = \mathbf{E}[X^2\mathbf{1}_{|X| < a}] + \mathbf{E}[X^2\mathbf{1}_{|X| \ge a}] \le a^2 + \mathbf{E}[X^2\mathbf{1}_{|X| \ge a}]. $$

The latter term can be upper bounded using Cauchy–Schwarz,[1]

$$ \mathbf{E}[X^2\mathbf{1}_{|X| \ge a}] \le \sqrt{\mathbf{E}[X^4]}\sqrt{\mathbf{E}[\mathbf{1}_{|X| \ge a}]}. $$

(Here we have used the fact that $\mathbf{1}_{|X| \ge a}^2 = \mathbf{1}_{|X| \ge a}$.) And, since $\mathbf{E}[\mathbf{1}_{|X| \ge a}] = \mathbf{Pr}(|X| \ge a)$, we finally have:

$$ \mathbf{E}[X^2] \le a^2 + \sqrt{\mathbf{E}[X^4]}\sqrt{\mathbf{Pr}(|X| \ge a)}. $$

Rearranging gives us the desired lower bound (valid whenever $a^2 \le \mathbf{E}[X^2]$, so that both sides are nonnegative before squaring),

$$ \mathbf{Pr}(|X| \ge a) \ge \frac{(\mathbf{E}[X^2] - a^2)^2}{\mathbf{E}[X^4]}. $$

(This is a Paley–Zygmund-style bound, applied to $X^2$ rather than to a nonnegative $X$ directly.)
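We can check this tail bound against the exact distribution of the walk for a moderate $n$ (a sketch; `n = 50` is arbitrary, and we only test $a^2 \le \mathbf{E}[X^2]$, where the derivation applies):

```python
import math

def walk_pmf(n):
    # Exact pmf of X = sum of n independent +/-1 steps.
    return {2 * k - n: math.comb(n, k) / 2**n for k in range(n + 1)}

n = 50
pmf = walk_pmf(n)
EX2 = sum(x**2 * p for x, p in pmf.items())   # = n
EX4 = sum(x**4 * p for x, p in pmf.items())   # = 3n^2 - 2n

for a in range(int(math.isqrt(n)) + 1):       # a^2 <= n = E[X^2]
    tail = sum(p for x, p in pmf.items() if abs(x) >= a)
    assert tail >= (EX2 - a**2) ** 2 / EX4 - 1e-12

print(f"Pr(|X| >= a) >= (E[X^2] - a^2)^2 / E[X^4] holds for a = 0, ..., {math.isqrt(n)}")
```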

Now, since we know that

$$ \mathbf{E}[|X|] \ge a \mathbf{Pr}(|X| \ge a), $$

then we have

$$ \mathbf{E}[|X|] \ge a \frac{(\mathbf{E}[X^2] - a^2)^2}{\mathbf{E}[X^4]}. $$

Parametrizing $a$ by $a = \alpha\sqrt{\mathbf{E}[X^2]}$ for some $0 \le \alpha \le 1$, we then have

$$ \mathbf{E}[|X|] \ge \alpha(1-\alpha^2)^2\frac{\mathbf{E}[X^2]^{5/2}}{\mathbf{E}[X^4]}. $$

The right-hand side is maximized at $\alpha = 1/\sqrt{5}$, which gives the following lower bound:

$$ \mathbf{E}[|X|] \ge \frac{16}{25\sqrt{5}}\frac{\mathbf{E}[X^2]^{5/2}}{\mathbf{E}[X^4]}. $$
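The maximization over $\alpha$ is a short calculus exercise ($\frac{d}{d\alpha}\,\alpha(1-\alpha^2)^2 = (1-\alpha^2)(1-5\alpha^2)$, which vanishes on $[0,1)$ at $\alpha = 1/\sqrt{5}$), but it is also easy to confirm on a grid:

```python
import numpy as np

# Maximize f(alpha) = alpha * (1 - alpha^2)^2 over [0, 1] on a fine grid.
alpha = np.linspace(0.0, 1.0, 1_000_001)
f = alpha * (1 - alpha**2) ** 2

print(f"argmax ~ {alpha[np.argmax(f)]:.6f}  (1/sqrt(5)       = {1/np.sqrt(5):.6f})")
print(f"max    ~ {f.max():.6f}  (16/(25*sqrt(5)) = {16/(25*np.sqrt(5)):.6f})")
```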

And, finally, using the facts that $\mathbf{E}[X^2] = n$ and $\mathbf{E}[X^4] = 3n^2 - 2n \le 3n^2$, we get the final result:

$$ \mathbf{E}[|X|] \ge \frac{16}{25\sqrt{5}}\frac{n^{5/2}}{3n^2} = \frac{16}{75\sqrt{5}}\sqrt{n} \ge \Omega(\sqrt{n}), $$

as required, with no need for combinatorics! Of course, the constant $16/(75\sqrt{5}) \approx 0.095$ is rather weak compared to the true constant $\sqrt{2/\pi} \approx 0.80$, but this is fine for our purposes.

General extensions

Of course, similar constructions also hold rather nicely for things like uniform $[-1, 1]$ variables, or Normally distributed, mean-zero variables. Any variable for which the second and fourth moments can be easily computed admits a lower bound on this expectation. (Expectations of the absolute values of sums of independently drawn copies of these variables can be handled similarly.) These have no obvious combinatorial analogue, so counting techniques cannot be easily generalized, whereas this bound applies immediately.
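As a concrete sketch: the argument above yields $\mathbf{E}[|X|] \ge \frac{16}{25\sqrt{5}}\,\mathbf{E}[X^2]^{5/2}/\mathbf{E}[X^4]$ for any $X$ with finite fourth moment, so we only need the two moments. The moment values below are the standard ones for sums of $n$ independent standard normals and $n$ independent uniform $[-1,1]$ variables; `n = 10_000` is arbitrary:

```python
import numpy as np

def moment_lower_bound(EX2, EX4):
    # E|X| >= 16/(25*sqrt(5)) * E[X^2]^(5/2) / E[X^4], from the argument above.
    return 16 / (25 * np.sqrt(5)) * EX2**2.5 / EX4

n = 10_000
# Sum of n standard normals is N(0, n): E[X^2] = n, E[X^4] = 3n^2.
print(f"normal sum : bound {moment_lower_bound(n, 3 * n**2):.2f}, "
      f"true E|X| = {np.sqrt(2 * n / np.pi):.2f}")

# Sum of n uniform[-1, 1] variables: E[X^2] = n/3, and
# E[X^4] = n * E[U^4] + 3n(n-1) * E[U^2]^2 = n/5 + n(n-1)/3.
print(f"uniform sum: bound {moment_lower_bound(n / 3, n / 5 + n * (n - 1) / 3):.2f}")
```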

[1] Possibly the most elegant proof of Cauchy–Schwarz I know is based on minimizing a quadratic, and goes a little like this. Note that $\mathbf{E}[(X - tY)^2] \ge 0$ for any $t \in \mathbf{R}$. (That this expectation exists can be shown for any $t$ assuming both $X$ and $Y$ have finite second moments. If not, the inequality is trivial.) Expanding gives $\mathbf{E}[X^2] - 2t\mathbf{E}[XY] + t^2\mathbf{E}[Y^2] \ge 0$. Assuming $\mathbf{E}[Y^2] > 0$ (otherwise $Y = 0$ almost surely and the inequality is immediate), the left-hand side is minimized at $t^\star = \mathbf{E}[XY]/\mathbf{E}[Y^2]$, and plugging this in gives

$$ \mathbf{E}[X^2] - \frac{\mathbf{E}[XY]^2}{\mathbf{E}[Y^2]} \ge 0. $$

Multiplying both sides by E[Y2]\mathbf{E}[Y^2] gives the final result.

Built with Franklin.jl and Julia.