$\begingroup$

Consider the constrained convex optimization problem $$\min_{(x_1,x_2,x_3)\in \mathbb R^3} \max(x_1-x_2+x_3,-x_1+2x_2+x_3,-x_1-x_2-3x_3) \quad \text{s.t.} \quad x_1+x_2+x_3=1.$$

This problem appears while looking for a portfolio that produces an arbitrage.
People with a finance background routinely solve this problem by claiming that, at a minimizer, the three arguments in the maximum are equal: $$\begin{cases} x_1-x_2+x_3 &= -x_1+2x_2+x_3 \\ -x_1+2x_2+x_3 &= -x_1-x_2-3x_3\end{cases},$$ hence $x_2=\frac 23 x_1$ and $x_3=-\frac34 x_2 = -\frac 12 x_1$; substituting into the constraint $x_1+x_2+x_3=1$ finally gives $(x_1,x_2,x_3)=(\frac 67, \frac 47, -\frac 37)$.
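As a sanity check, the claimed solution can be verified with exact rational arithmetic (a quick sketch in Python; the variable names are mine):

```python
from fractions import Fraction as F

# The claimed minimizer (6/7, 4/7, -3/7), kept as exact fractions.
x1, x2, x3 = F(6, 7), F(4, 7), F(-3, 7)

g1 = x1 - x2 + x3          # first argument of the max
g2 = -x1 + 2 * x2 + x3     # second argument
g3 = -x1 - x2 - 3 * x3     # third argument

print(g1, g2, g3)          # all three arguments coincide
print(x1 + x2 + x3)        # the equality constraint holds
```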

  1. What is the theoretical grounding behind this reasoning?

Here are my thoughts.
I can solve this problem by introducing an auxiliary variable $z$ and adding inequality constraints, hence rewriting the problem as $$\min_{(x_1,x_2,x_3,z)\in \mathbb R^3} z \quad \text{s.t.} \quad \begin{aligned}[t]x_1+x_2+x_3&=1 \\x_1-x_2+x_3-z&\leq 0 \\-x_1+2x_2+x_3-z&\leq 0 \\-x_1-x_2-3x_3-z&\leq 0 \end{aligned}.$$ This last problem can be solved by setting up KKT conditions. While solving the corresponding system of equations, it appears that the multipliers corresponding to the three inequality constraints are nonzero. Therefore, by complementary slackness, it is indeed true that $x_1-x_2+x_3 = -x_1+2x_2+x_3= -x_1-x_2-3x_3$ at the optimum.

  2. Is there a quicker (less computationally intensive) way to show equality at the optimum? Can this be seen at once by just looking at the optimization problem?

  3. A minor technical issue on the side: why is a global minimizer guaranteed to exist? It is unclear to me that the objective is coercive.

$\endgroup$

2 Answers

$\begingroup$

Formal Proof: Perform a change of coordinates: $y_1=x_1-x_2+x_3$, $y_2=-x_1+2x_2+x_3$, and $y_3=-x_1-x_2-3x_3$. This is a valid change of coordinates because $M=\begin{bmatrix}1&-1&1\\-1&2&1\\-1&-1&-3\end{bmatrix}$ has nonzero determinant. Note $\vec y = M \vec x$ and the constraint is $\vec x \cdot(1,1,1)=1$. Thus, we are looking to minimize $\max(y_1,y_2,y_3)$ subject to the constraint $(M^{-1}\vec y)\cdot(1,1,1) = 1$, or equivalently $\vec y \cdot ((M^{-1})^T(1,1,1))=1$. Computing $$ (M^{-1})^T\begin{bmatrix}1\\1\\1\end{bmatrix} = \begin{bmatrix}-3\\-2\\-2\end{bmatrix}, $$ the constraint becomes $-3y_1-2y_2-2y_3 = 1$. Note that $-3y_1-2y_2-2y_3\ge -7\max(y_1,y_2,y_3)$, with equality if and only if $y_1=y_2=y_3$. Therefore $1\ge -7\max(y_1,y_2,y_3)$, so $\max(y_1,y_2,y_3)\ge-\frac17$ for all feasible $\vec y$, and equality is achieved exactly when $y_1=y_2=y_3=-\frac17$; therefore this point is the unique minimizer.
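The key computation $(M^{-1})^T(1,1,1)=(-3,-2,-2)$ is easy to check numerically (a small sketch; solving $M^T v = e$ avoids forming the inverse explicitly):

```python
import numpy as np

M = np.array([[ 1, -1,  1],
              [-1,  2,  1],
              [-1, -1, -3]], dtype=float)
e = np.ones(3)

# v = (M^{-1})^T e, obtained by solving M^T v = e.
v = np.linalg.solve(M.T, e)
print(v)   # (-3, -2, -2)
```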

The theoretical grounding is the fact that $(M^{-1})^T (1,1,1)$ has only negative components.

Handwaving why it makes sense: Let $$ f(x_1,x_2,x_3) =\max(x_1-x_2+x_3,-x_1+2x_2+x_3,-x_1-x_2-3x_3) $$

Note that the function you're optimizing is the maximum of three different affine functions, whose graphs are therefore hyperplanes in $\mathbb{R}^4$. The restriction to $x_1+x_2+x_3=1$, under a change of coordinates, turns it into a maximum of three affine functions $\mathbb{R}^2\rightarrow \mathbb{R}$ whose graphs are simply planes. (To get this, you can substitute in $x_3=1-x_1-x_2$). This is something you might be able to visualize, which can give an idea why it should have a unique minimizer which happens at the point of equality.

One fact about affine functions is that their maximizers and minimizers always occur on the boundary of their domain: the gradient is a constant vector, and the function increases monotonically in the direction of its gradient. This is why the minimizer occurs where all three of the arguments in the max are equal. At a point where not all three arguments are equal, fewer than three of the affine pieces are active, and (given the gradients here) one can move in a direction that decreases every active piece, so such a point cannot be a minimizer.

To get some intuition of why $f$ should have a unique minimizer, and why this should be at the point of equality of the arguments, think about an analogous function $g(x) = \max(ax+b,cx+d)$ from $\mathbb{R}$ to $\mathbb{R}$. If $a$ and $c$ have opposite signs, then $g$ will have a 'V' shape and has a clear, unique minimum, where the two lines $y=ax+b$ and $y=cx+d$ intersect:

(figure: the 'V'-shaped graph of $g$, with its minimum at the intersection of the two lines)

If $a$ and $c$ have the same sign, then the function has no minimum. So for $f$, we need a higher-dimensional analogue of 'having the same sign'. The analogue of slope is the gradient. The three gradients of the three functions in $f$ are $(1,-1,1)$, $(-1,2,1)$, and $(-1,-1,-3)$. Note that the dot product of any pair of these is negative, which can be interpreted as the gradients 'pointing in opposing directions'; consequently, the maximum of any pair of the affine pieces attains its minimum where the two pieces are equal.
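The claimed sign pattern of the gradients can be checked directly (a small sketch):

```python
import numpy as np

# Gradients of the three affine pieces.
grads = np.array([[ 1, -1,  1],
                  [-1,  2,  1],
                  [-1, -1, -3]])

# All pairwise dot products are negative: "opposite signs" in higher dimension.
dots = [grads[i] @ grads[j] for i in range(3) for j in range(i + 1, 3)]
print(dots)   # [-2, -3, -4]
```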

$\endgroup$
$\begingroup$

Let's first show that a global minimizer exists.

Let $a_1 = (1, -1, 1)^T$, $a_2 = (-1, 2, 1)^T$, $a_3=(-1, -1, -3)^T$, $e=(1,1,1)^T$.

We have $$3a_1+2a_2+2a_3=-e \tag{1}.$$

We let $x_0$ be a point that satisfies $e^Tx_0=1$.

Let $d \ne 0$ be any direction that satisfies $e^Td=0$; we want to show that $\max_i a_i^T(x_0+td)\to \infty$ as $t \to \infty$.

This is equivalent to showing that $\max_i a_i^Td>0$.

Suppose, for contradiction, that this is not the case, i.e., $\max_i a_i^Td\le 0$; then $a_i^Td \le 0$ for all $i$.

Taking the dot product of equation $(1)$ with $d$, we get $$3a_1^Td+2a_2^Td+2a_3^Td=-e^Td=0.$$

Since all three terms on the left-hand side are nonpositive, if any $a_i^Td$ were strictly negative the left-hand side would be strictly negative. Thus

$$a_i^Td=0 \quad \text{for all } i.$$

Consider the matrix $M$ whose row $i$ is $a_i^T$. One can check that $M$ is invertible, yet we have just found a nonzero vector $d$ with $Md=0$, a contradiction. Hence $\max_i a_i^Td>0$ for every feasible direction, the objective is coercive on the constraint set, and a global minimizer exists.
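Both identity $(1)$ and the invertibility of $M$ are easy to verify numerically (a quick sketch):

```python
import numpy as np

a1 = np.array([ 1, -1,  1])
a2 = np.array([-1,  2,  1])
a3 = np.array([-1, -1, -3])
e  = np.ones(3)

# Identity (1): 3 a_1 + 2 a_2 + 2 a_3 = -e
print(3 * a1 + 2 * a2 + 2 * a3)   # (-1, -1, -1)

# M (rows a_i) has nonzero determinant, so Md = 0 forces d = 0.
M = np.vstack([a1, a2, a3]).astype(float)
print(np.linalg.det(M))
```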


Now that we know a global minimum exists, linear programming theory tells us it is attained at a basic feasible solution (BFS) of the epigraph formulation. A BFS in $\mathbb R^4$ requires four active constraints: the equality constraint plus all three inequalities. Thus, $z=a_1^Tx=a_2^Tx=a_3^Tx$ at the optimum.

$\endgroup$
