Longest Path Search

A note on "Optimal Design of Controlled Environment Agricultural Systems (...)"

Posted

This is just a basic note on Cetegen and Stuber’s paper (apologies for the paywall) published a few days ago, Optimal Design of Controlled Environment Agricultural Systems Under Market Uncertainty. This post “simplifies” problem (4) from a bilevel optimization problem to a single (convex) optimization problem which can be readily solved. As a side note, I only skimmed the references provided, and some were too complicated for me to understand, especially at first glance.1 It is possible this technique is well known in the specific literature referenced, but hidden under mountains of notation and definitions. I also note that this is not the problem they (approximately) solve, but the method presented here might be useful in that case, too.

The One (Duality) Trick Doctors Don’t Want You To Know™

I’ve used this trick before in a few other papers, with the main example being a paper coauthored with Kunal Shah and Mac Schwager, found here, specifically in equation (8) and below, starting on page 10.

The basic (very general!) idea is to replace a “min-max” optimization problem with a “min” optimization problem. For example, say we are given the following optimization problem

minimizemaxg(y)0f(x,y)subject toh(x)0,

with variables x𝐑m and y𝐑n and functions f:𝐑m×𝐑n𝐑, g:𝐑n𝐑, and h:𝐑m𝐑. Now, consider the usual Lagrangian of the “inner problem” (the one with the max over y), which we know is

L(x,y,λ)=f(x,y)λg(y).

If we define

f(x,λ)=supyL(x,y,λ),

then, for any λ0 and any feasible y (i.e., y that satisfies g(y)0), we have that

f(x,λ)L(x,y,λ)=f(x,y)λg(y)f(x,y).

(The first inequality follows from the definition of sup while the last inequality follows since λ0 and g(y)0 which means that λg(y)0.) So, for any x, we know that, if you give me any λ0, then f(x,λ) is an “overestimator” of f(x,y) for any feasible y.

But, since this is true for any λ0 and x, then certainly

infλ0f(x,λ)supg(y)0f(x,y),

for any x.2

If we use our new overestimator f instead of f, our new problem is now a simple optimization problem

minimizef(x,λ)subject toh(x)0λ0,

that is not in min-max form and requires no other special techniques to solve. The optimal value of this problem need not be the same as that of the original, but is always guaranteed to be at least as large.

Of course, this can help, but it certainly doesn’t solve our problem. We just need one more piece to the puzzle!

The hammer

If you’ve studied a bit of convex analysis the punchline is this: the inequality we have for f and f above holds exactly at equality when f is concave in y and g is convex in y. More specifically

infλ0f(x,λ)=supg(y)0f(x,y),

for any x.3

When this is true, the new problem has the same optimal value as the original and any solution x for the original is a solution to the new problem! (Why?)

I won’t cover more since the specifics don’t matter too much, but the general idea is simple enough, and we now have all the parts to convert the min-max problem (4) of Cetegen and Stuber to a simple convex optimization problem.

The (new and shiny) problem

Things are mostly algebra from here on out, so apologies in advance, I guess. I will leave much of the “hard” work to the reader :)

The complete problem (4) is, as written in the paper, using (mostly) their notation:

minimizemaxMxTMxsubject torTxrmin1Tx=10x1,

with variable x𝐑n, where r is some vector of returns (but the specifics don’t matter) and is:

={M0|MijMijMij+,  i,j=1,,n}.

In other words, is the set of positive semidefinite matrices (M0) whose entries lie between those of M and M+. I’ve also dropped some constant terms in the objective since those don’t change the problem.

In this case, the “inner” optimization problem is the one in the objective, which is just

maximizexTMxsubject toMijMijMij+,  i,j=1,,nM0,

with variable M𝐑n×n. We can easily write a (slightly not canonical) Lagrangian:

L(x,M,Λ+,Λ)=xTMxtr(Λ+(MM+))+tr(Λ(MM)),

where Λ+,Λ𝐑+n×n are elementwise nonnegative. (The Lagrangian is non-canonical because I have not included the constraint M0, which we will enforce below.) It is not hard to show that

supM0L(x,M,Λ+,Λ)={tr(Λ+M+)tr(ΛM)xxTΛ+Λ+otherwise.

As before, the inequality between matrices is with respect to the semidefinite cone.

Plugging this back into the original problem formulation, we now have a convex optimization problem:

minimizetr(Λ+M+)tr(ΛM)subject torTxrminxxTΛ+Λ1Tx=10x1Λij+,Λij0,i,j=1,,n.

The (extra!) variables Λ+,Λ𝐑n×n are included along with the original variable x𝐑n, and the same problem data as before. This problem, by use of the Schur complement applied to the semidefinite inequality, is easily recast into standard SDP form and can be solved by most standard convex optimization problem solvers, such as SCS or Mosek.

Quick edit (3/18/22): Thanks to bodonoghue85 for finding a typo!


Footnotes

  1. See, e.g. page 429 of this paper, where they seem to re-prove basic things like “the supremum of a bunch of convex functions is convex,” and “partial minimization of a convex function is convex,” but I’m honestly not 100% sure.

  2. This is called weak duality, c.f. sections 5.1.3 and 5.2.2 of Convex Optimization.

  3. This is often referred to as strong duality, c.f. section 5.3.2 of Convex Optimization. There are some additional conditions for equality to hold, but these are almost always met in practice.