I'm reading "The Elements of Statistical Learning", specifically the chapter on optimal separating hyperplanes (the SVM classification problem), and I'm stuck on the following. I have a function to be maximized:
$$ \sum_{i=1}^{N}a_i - \frac{1}{2}\sum_{i=1}^{N}\sum_{k=1}^{N}a_ia_ky_iy_kx_i^Tx_k, $$
subject to:
$$ \sum_{i=1}^{N}y_ia_i = 0, \quad b = \sum_{i=1}^{N}a_iy_ix_i, \quad a_i \ge 0 \;\; \forall i, \quad a_i[y_i(x_i^Tb + b_0) - 1] = 0 \;\; \forall i $$
where $y_i$ and $x_i$ are known, and $a_i$, $b$, and $b_0$ are to be found.
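To keep the notation straight for myself, here is the quadratic part written out in code. This is just a minimal numpy sketch (the values of `X` and `y` are made up for illustration): it builds the matrix with entries $y_iy_kx_i^Tx_k$ and evaluates the objective for a given $a$:

```python
import numpy as np

# Toy data (made up for illustration): N points in R^2, labels in {-1, +1}
X = np.array([[1.0, 2.0], [2.0, 3.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Q_ik = y_i * y_k * x_i^T x_k
Yx = y[:, None] * X
Q = Yx @ Yx.T

def dual_objective(a):
    # sum_i a_i  -  (1/2) * sum_{i,k} a_i a_k y_i y_k x_i^T x_k
    return a.sum() - 0.5 * a @ Q @ a
```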
As far as I understand, the next step should be to solve this problem using a quadratic programming approach. At the same time, a quadratic programming problem looks like:
$$ \min_X \; \frac{1}{2}X^TQX+C^TX, $$
subject to:
$$ AX \le B $$
And I don't understand how to convert the first problem into the second. As far as I can see, $a_i \ge 0$ can be transformed into the $AX \le B$ form, but what about $a_i[y_i(x_i^Tb + b_0) - 1] = 0$? It looks like a quadratic constraint, and it cannot be transformed into a linear one.
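To make the question concrete, here is how far I got mapping the pieces into a solver's standard form. This is just a sketch of my attempt using `cvxopt` (same made-up toy data as above); it only encodes the two constraints I know how to express, and the complementarity condition is exactly the part that is missing:

```python
import numpy as np
from cvxopt import matrix, solvers

# Same toy data and Q as in the sketch above
X = np.array([[1.0, 2.0], [2.0, 3.0], [-1.0, -1.0], [-2.0, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
N = len(y)
Yx = y[:, None] * X
Q = Yx @ Yx.T

# Maximizing sum(a) - (1/2) a^T Q a  ==  minimizing (1/2) a^T Q a - sum(a)
P = matrix(Q)
q = matrix(-np.ones(N))

# a_i >= 0 rewritten as -I a <= 0  (the "A X <= B" part)
G = matrix(-np.eye(N))
h = matrix(np.zeros(N))

# sum_i y_i a_i = 0 as an equality constraint
A_eq = matrix(y.reshape(1, -1))
b_eq = matrix(np.zeros(1))

# NOTE: a_i [y_i (x_i^T b + b_0) - 1] = 0 appears nowhere in this setup --
# that is exactly the part I don't know how to encode.
sol = solvers.qp(P, q, G, h, A_eq, b_eq)
a = np.array(sol['x']).ravel()
```

Is the complementarity condition really supposed to go into the solver as a constraint, or am I misreading what "subject to" covers here?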