Longest Path Search

The S-procedure and small covering ellipsoids

Posted 2019-06-05

Note: This post was inspired by Kunal Shah's question that came up at some point during one of our meetings: is there an efficient way of finding an ellipsoid which covers the intersections and unions of a bunch of other ellipsoids?

While this question has been explored somewhat extensively, the exposition is often more general than necessary and aimed at a relatively mathematical audience. Either way, if you're interested, both papers are fairly well-written—I highly recommend at least a quick skim!

The S-procedure

The S-procedure is a well known lemma in control theory that seeks to answer the following question:

Let's say we have a bunch of quadratic functions f0,f1,f2,,fn:RmRf_0, f_1, f_2, \dots, f_n : \mathbb{R}^m \to \mathbb{R}. When is it true that

fi(x)0  for i=1,,n    f0(x)0, f_i(x) \le 0 ~~ \text{for $i=1, \dots, n$} \implies f_0(x) \le 0,

for xRmx \in \mathbb{R}^m? (Recall that a quadratic is a function of the form f(x)=xTPx+2qTx+rf(x) = x^TPx + 2q^Tx + r for symmetric PRm×mP \in \mathbb{R}^{m\times m}, qRmq \in \mathbb{R}^m, and rRr \in \mathbb{R}).

There are many reasons to attempt to answer this (surprisingly useful) question. The original motivations were to show stability of systems, though the domain of applications is certainly larger. We can use this to show anything from impossibility results (for example, many of the results of this paper can be recast in terms of the S-procedure) to, well, in our case, the construction of a small covering ellipsoid from a bunch of other ellipsoids, which is itself useful for things like filtering (for localizing drones from noisy measurements for example) along with many other applications.

If you're familiar with Lagrange duality, this is mostly an equivalent statement—except that this statement is in the special case of quadratics, where you can say a little more than with general functions.

The n=1n=1 case

We can fully and completely answer this question in the case that n=1n=1: there exists a nonnegative number τ0\tau \ge 0 such that

f0(x)τf1(x) f_0(x) \le \tau f_1(x)

for all xx if, and only if, f1(x)0    f0(x)0f_1(x) \le 0 \implies f_0(x) \le 0.

Why? Well, let say we have a τ0\tau\ge 0 that satisfies the above inequality. Then, if we have an xx such that f1(x)0f_1(x) \le 0, then

f0(x)τf1(x)τ0=0. f_0(x) \le \tau f_1(x) \le \tau0 = 0.

The converse is slightly trickier, so I will defer to B&V's Convex Optimization which has a very readable presentation of the proof (see B.1 and B.2 in the appendix).

The general case

The general case is really only a slight change from the n=1n=1 case (except that the converse of the statement is not true). In particular, if there exist λ0\lambda \ge 0 such that

f0(x)iλifi(x)  for all xRm, f_0(x) \le \sum_i \lambda_i f_i(x) ~~ \text{for all $x \in \mathbb{R}^m$},

then, fi(x)0 for i=1,,n    f0(x)0f_i(x) \le 0 ~ \text{for} ~ i = 1, \dots, n \implies f_0(x) \le 0. Showing this is nearly the same as the n=1n=1 case,

f0(x)iλifi(x)iλi0=0. f_0(x) \le \sum_i \lambda_i f_i(x) \le \sum_i \lambda_i 0 = 0.

So now we have a family of sufficient (but not necessary!) conditions for which we know when fi(x)0 for i=1,,nf_i(x) \le 0 ~ \text{for} ~ i = 1, \dots, n implies that f0(x)0f_0(x) \le 0.

Covering ellipsoids for unions

Definitions and connections

Ellipsoids are a particularly nice family to work with since, as you may have guessed, they are the sets defined by

E={xf(x)0}, \mathcal{E} = \{x \mid f(x) \le 0\},

where f:RmRf: \mathbb{R}^m \to \mathbb{R} is a convex quadratic. This definition gives us a way of translating statements about sets (inclusion, etc) into statements about the functions which generate them. In particular, if we have two ellipsoids E,E0Rn\mathcal{E}, \mathcal{E}_0 \subseteq \mathbb{R}^n defined by the convex quadratics f,f0f, f_0, then

EE0    (f(x)0    f0(x)0). \mathcal{E} \subseteq \mathcal{E}_0 \iff (f(x) \le 0 \implies f_0(x) \le 0).

But wait a minute, we know exactly when this happens! By the previous section, we found that

f(x)0    f0(x)0, f(x) \le 0 \implies f_0(x) \le 0,

if and only if there is some τ0\tau \ge 0 with f0(x)τf(x)f_0(x) \le \tau f(x). Also note that if we have a union of a bunch of ellipsoids (say E1,,Em\mathcal{E}_1, \dots, \mathcal{E}_m) that we want to cover with an ellipsoid E0\mathcal{E}_0, then this is the same as saying

EiE0, for i=1,,m, \mathcal{E}_i \subseteq \mathcal{E}_0, ~\text{for $i=1, \dots, m$},

or, that each ellipsoid is covered by the big one, E0\mathcal{E}_0.

Back to our goal

Ok, to reiterate, we are looking for a small ellipsoid E0={xf0(x)0}\mathcal{E}_0 = \{x \mid f_0(x) \le 0\} such that E0\mathcal{E}_0 contains all of the other ellipsoids Ei={xfi(x)0}\mathcal{E}_i = \{x \mid f_i(x) \le 0\}, where the fif_i and f0f_0 are convex quadratics. In other words, using the results of the previous subsection, we look for a quadratic f0f_0 such that

(fi(x)0    f0(x)0)  for each i (f_i(x) \le 0 \implies f_0(x) \le 0) ~~ \text{for each $i$}

which we know happens only when there exists some τi0\tau_i \ge 0 with

f0(x)τfi(x)  for each i and all x. f_0(x) \le \tau f_i(x) ~~ \text{for each $i$ and all $x$}.

Now remains the final question: given two quadratics, fif_i and f0f_0 and some number τ0\tau \ge 0, how can we check if f0(x)τfi(x)f_0(x) \le \tau f_i(x) for all xx? I won't prove this (though I have written a quick proof of this statement in my notes, found here), but, if we let fi(x)=xTPix+2qiTx+rif_i(x) = x^TP_ix + 2q_i^Tx + r_i and f0(x)=xTPx+2(q)Tx+rf_0(x) = x^TP'x + 2(q')^Tx + r' then f0(x)fi(x)f_0(x) \le f_i(x) for all xx if, and only if,

[Pq(q)Tr]τ[PiqiqiTri], \begin{bmatrix} P' & q\\ (q')^T & r' \end{bmatrix} \le \tau\begin{bmatrix} P_i & q_i\\ q_i^T & r_i \end{bmatrix},

where we say two symmetric matrices A,BA, B satisfy ABA \le B whenever xTAxxTBxx^TAx \le x^TBx for all xx. A straightforward exercise it to verify that the set of matrices A0A \ge 0 is a convex cone (almost universally called the positive semidefinite or PSD cone).

This rewriting is extremely useful, since we've turned a problem over a potentially difficult-to-handle space (the space of quadratics greater than or equal to another) into a problem that is easy to handle (the PSD cone). The best news, though, is that we have efficient algorithms to solve optimization problems whose constraints are PSD constraints.[1]

Corresponding optimization problem

Finally, after enough background, we can get to the final goal: writing an efficiently-solvable optimization problem to give us a small bounding ellipsoid.

There are several ways of defining "small," in the case of ellipsoids, but one of the most common definitions is to pick the ellipsoid with the smallest volume. In the case that E0={xxTPx+2(q)Tx+r0}\mathcal{E}_0 = \{x \mid x^TP'x + 2(q')^Tx + r \le 0\}, the volume of this ellipsoid is given by the determinant, detP\mathop{\mathrm{det}} P', of the matrix PP'. So, we can write—using the conditions given above—an optimization problem corresponding to finding the smallest (in volume) ellipsoid E0\mathcal{E}_0 which contains all ellipsoids, Ei\mathcal{E}_i as

minimizeP,q,r,τdetPsubject to[Pq(q)Tr]τi[PiqiqiTri],i=1,,n. \begin{aligned} & \underset{P', q', r', \tau}{\mathrm{minimize}} & & \mathop{\mathrm{det}} P' \\ & \text{subject to} & & \begin{bmatrix} P' & q\\ (q')^T & r' \end{bmatrix} \le \tau_i\begin{bmatrix} P_i & q_i\\ q_i^T & r_i \end{bmatrix}, \quad i = 1, \dots, n. \end{aligned}

The only problem here (which we can easily fix) is that the determinant is not a convex function. On the other hand, the log determinant is (for a proof, see the Convex book, section 3.1.5), so we can write,

minimizeP,q,r,τlogdetPsubject to[Pq(q)Tr]τi[PiqiqiTri],i=1,,n. \begin{aligned} & \underset{P', q', r', \tau}{\mathrm{minimize}} & & \log \mathop{\mathrm{det}} P' \\ & \text{subject to} & & \begin{bmatrix} P' & q\\ (q')^T & r' \end{bmatrix} \le \tau_i\begin{bmatrix} P_i & q_i\\ q_i^T & r_i \end{bmatrix}, \quad i = 1, \dots, n. \end{aligned}

This is equivalent to the original problem since log(y)\log(y) is an increasing function of yy.

Of course, any convex function (such as, for example the trace) would do here as well.

Covering ellipsoids for intersections and unions

Ok, now we know how to solve the problem where we have a bunch of ellipsoids and we want to find an ellipsoid which covers all of them. How about the problem where we want to find an ellipsoid which also covers the sets {Ni}\{N_i\} for i=1,,ki=1, \dots, k, which are, themselves, intersections of ellipsoids?

In particular, if NiN_i is defined as

Ni=jIiEj, N_i = \bigcap_{j \in I_i} \mathcal{E}_j,

for some index set Ii{1,,n}I_i \subseteq \{1, \dots, n\} and some set of ellipsoids {Ej}\{\mathcal{E}_j\}, each of which are defined as before (Ej={xfj(x)0}\mathcal{E}_j = \{x \mid f_j(x) \le 0 \}), we can perform a similar trick to the one above!

More generally, if we have an ellipsoid E0={xf0(x)0}\mathcal{E}_0 = \{x \mid f_0(x) \le 0\}, then

NiE0    (fj(x)0  for jIi    f0(x)0). N_i \subseteq \mathcal{E}_0 \iff (f_j(x) \le 0 ~~ \text{for $j \in I_i$} \implies f_0(x) \le 0).

(It's a worthwhile exercise to think about why, but it follows the same idea as before.) In other words, E0\mathcal{E}_0 is a superset of NiN_i only when a bunch of quadratic inequalities imply another. (Where have we seen this before...?)

In other words, we know that (by the first section), if there exist λ0\lambda \ge 0 such that

f0(x)jIiλjfj(x), f_0(x) \le \sum_{j \in I_i} \lambda_{j} f_j(x),

then we immediately have that NiE0N_i \subseteq \mathcal{E}_0. Since the converse is not true, we are sadly not guaranteed to actually find the smallest bounding ellipsoid E0\mathcal{E}_0, but this is usually quite a good approximation (if it's not exact).

Following exactly the same steps as in the previous section and using the same definitions, we now get a new program for minimizing the volume for the union and intersection of ellipsoids:

minimizeP,q,r,λlogdetPsubject to[Pq(q)Tr]jIiλij[PjqjqjTrj],i=1,,kλij0,i=1,,k,  jIi. \begin{aligned} & \underset{P', q', r', \lambda}{\mathrm{minimize}} & & \log \mathop{\mathrm{det}} P' \\ & \text{subject to} & & \begin{bmatrix} P' & q\\ (q')^T & r' \end{bmatrix} \le \sum_{j \in I_i}\lambda_{ij}\begin{bmatrix} P_j & q_j\\ q_j^T & r_j \end{bmatrix}, \quad i = 1, \dots, k \\ &&& \lambda_{ij} \ge 0, \quad i=1, \dots, k, ~~ j \in I_i. \end{aligned}

As before, this program does not guarantee actually finding the minimal volume ellipsoid, but it is likely to be quite close! (That is, if it's not spot on, most of the time.)

[1] For more information, see Boyd's Linear Matrix Inequalities book.
Built with Franklin.jl and Julia.