Multiplying the matrices that way does not work.
Instead, using the Java JAMA package for matrices:
import Jama._
Matrix bi = new Matrix(4, 4);
Matrix lp = new Matrix(4, 4);
Matrix lv = new Matrix(4, 4);
Matrix cv = new Matrix(4, 4);
// store the float arrays to matrices
Matrix st = bi.times(lp).times(lv)
// read matrix into float array and transpose.
Interestingly, the shadow texture matrix also had to be transposed.
Still, the resulting image is shown below - it has artifacts. Question is - how to get rid of them?


EDIT:
Apparently, the problem is erroneous self-shadowing.
I realized this while reading the suggested link - thanks for the pointer!
However, the links suggests adjusting the near and far planes - this is not the solution.
The solution is to adjust the scaling factor of the depth matrix. To avoid peter-panning, the scale in the z-axis should be as big as possible, but less than 1.f:
glMatrixMode(GL_TEXTURE)
glLoadIdentity()
glScalef(1.f, 1.f, 0.9999f)
The result:

I figured it out while going through this example.