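PCA minimises the volume of the ellipsoid whose semi-axes are the standard
deviations of the data projected onto each mode.

The volume is proportional to prod_i sqrt(lambda_i), where lambda_i is the
variance along mode i,

ie PCA minimises sum_i log(lambda_i) over all orthogonal sets of modes.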
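
A sketch of why this holds, via Hadamard's inequality. The symbols C (data
covariance, assumed positive definite) and E (matrix whose columns are the
orthonormal modes e_i) are introduced here for illustration only:

```latex
\mathrm{var}_i = e_i^\top C\, e_i = (E^\top C E)_{ii},
\qquad
\prod_i \mathrm{var}_i \;\ge\; \det(E^\top C E) = \det C = \prod_i \lambda_i .
```

Equality holds when E^T C E is diagonal, ie when the e_i are eigenvectors
of C, so the PCA modes give the smallest product and hence the smallest
sum of logs.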


Other component analyzers aim to minimise different objective functions.

For instance, we can encourage sparseness of the modes by minimising

sum_i { log(var_i)  + alpha sum_j | e_ij | }

where var_i is the variance of the data projected onto mode i, with
direction e_i.
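
A minimal sketch of evaluating this objective, assuming one data sample per
row, population (divide by n) variance, and std::vector in place of
vnl_vector. The names projected_variance and sparse_objective are
hypothetical, not part of any existing library.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Population variance of the data projected onto direction e.
// Assumed layout: data[k] is the k-th sample.
double projected_variance(const std::vector<Vec>& data, const Vec& e)
{
  assert(!data.empty());
  double sum = 0.0, sum_sq = 0.0;
  for (const Vec& x : data)
  {
    assert(x.size() == e.size());
    double p = 0.0;
    for (std::size_t j = 0; j < e.size(); ++j) p += x[j] * e[j];
    sum += p;
    sum_sq += p * p;
  }
  const double n = static_cast<double>(data.size());
  const double mean = sum / n;
  return sum_sq / n - mean * mean;
}

// sum_i { log(var_i) + alpha * sum_j |e_ij| }
double sparse_objective(const std::vector<Vec>& data,
                        const std::vector<Vec>& modes, double alpha)
{
  double total = 0.0;
  for (const Vec& e : modes)
  {
    double l1 = 0.0;
    for (double v : e) l1 += std::abs(v);
    total += std::log(projected_variance(data, e)) + alpha * l1;
  }
  return total;
}
```

Each mode contributes its own term to the total, which is what makes the
pairwise scheme described next possible.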

Note that this objective function decomposes into a sum of independent
terms, one per mode.
Thus an elegant way of optimising it is to start with some set
of orthonormal basis vectors { e_i }.
We then choose any pair, say e_1 and e_2.
If we rotate these by an angle A in the plane they span, ie generate two new basis vectors:
u_1 = e_1 cos(A) - e_2 sin(A);
u_2 = e_1 sin(A) + e_2 cos(A);

then u_1 and u_2 are guaranteed to be orthogonal to each other
and to all the other e_i (i>2).
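
The rotation can be sketched as below, again with std::vector standing in
for vnl_vector. rotate_pair and dot are hypothetical helper names; dot is
only there so the orthogonality claim can be checked numerically.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<double>;

// Inner product, used to check orthogonality.
double dot(const Vec& a, const Vec& b)
{
  assert(a.size() == b.size());
  double s = 0.0;
  for (std::size_t j = 0; j < a.size(); ++j) s += a[j] * b[j];
  return s;
}

// u_1 = e_1 cos(A) - e_2 sin(A);  u_2 = e_1 sin(A) + e_2 cos(A)
std::pair<Vec, Vec> rotate_pair(const Vec& e1, const Vec& e2, double A)
{
  assert(e1.size() == e2.size());
  const double c = std::cos(A), s = std::sin(A);
  Vec u1(e1.size()), u2(e1.size());
  for (std::size_t j = 0; j < e1.size(); ++j)
  {
    u1[j] = e1[j] * c - e2[j] * s;
    u2[j] = e1[j] * s + e2[j] * c;
  }
  return std::make_pair(u1, u2);
}
```

If e_1 and e_2 are orthonormal, dot(u_1, u_2) and dot(u_1, e_i) for i>2
stay at zero (up to rounding) for any A, and u_1, u_2 keep unit length.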

Only the two terms of the objective fn belonging to this pair change,
so we can perform a 1D optimisation of the objective fn w.r.t. A.
Then repeat for random pairs until satisfied.



General approach to optimisation:

Apply PCA to get first estimate of { e_i }

Choose two modes at random
  Optimise the objective function generated from the two modes w.r.t. angle
Repeat until happy
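
One possible reading of "repeat until happy", as a sketch: stop once a run
of consecutive pair optimisations each improve the objective by less than
some tolerance. The function name, the parameters tol, patience, max_steps
and seed, and the convention that the pair step returns the reduction in
cost are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <random>

// Repeatedly picks two distinct modes at random and passes their indices to
// optimise_pair(i, j), a caller-supplied callable that rotates that pair and
// returns the resulting reduction in the objective.
// Stops after `patience` consecutive steps that each gain less than tol,
// or after max_steps steps. Returns the number of steps performed.
template <class PairStep>
std::size_t optimise_random_pairs(std::size_t n_modes, PairStep optimise_pair,
                                  double tol, std::size_t patience,
                                  std::size_t max_steps, unsigned seed = 0)
{
  assert(n_modes >= 2);
  std::mt19937 rng(seed);
  std::uniform_int_distribution<std::size_t> pick_i(0, n_modes - 1);
  std::uniform_int_distribution<std::size_t> pick_j(0, n_modes - 2);
  std::size_t stalled = 0, steps = 0;
  while (steps < max_steps && stalled < patience)
  {
    const std::size_t i = pick_i(rng);
    std::size_t j = pick_j(rng);
    if (j >= i) ++j;  // skip over i so that j != i
    const double gain = optimise_pair(i, j);
    stalled = (gain < tol) ? stalled + 1 : 0;
    ++steps;
  }
  return steps;
}
```

The initial PCA estimate of { e_i } is assumed to have been computed
before this loop is entered.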


Simplest approach (not necessarily efficient):
mcal_single_basis_cost   (base class)
  evaluate(vnl_vector<double> data_projection, mode_vector)
  - Given the data projected onto one mode, and the mode vector itself,
    computes that mode's contribution to the cost fn
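
A hedged sketch of what such a base class might look like. This is not the
actual mcal_single_basis_cost declaration: the class names are hypothetical
and std::vector replaces vnl_vector to keep the example free of
dependencies. The derived class implements the plain PCA term, log(var_i),
as the simplest case; it assumes population variance.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the base class described above.
class basis_cost_sketch
{
 public:
  virtual ~basis_cost_sketch() = default;

  // Contribution of one mode to the cost fn, given the data projected
  // onto that mode and the mode vector itself.
  virtual double evaluate(const std::vector<double>& data_projection,
                          const std::vector<double>& mode_vector) const = 0;
};

// Simplest derived cost: log of the variance of the projection.
// The mode vector is ignored, so summing this over modes gives the
// PCA objective.
class log_variance_cost : public basis_cost_sketch
{
 public:
  double evaluate(const std::vector<double>& data_projection,
                  const std::vector<double>& /*mode_vector*/) const override
  {
    assert(!data_projection.empty());
    double sum = 0.0, sum_sq = 0.0;
    for (double p : data_projection)
    {
      sum += p;
      sum_sq += p * p;
    }
    const double n = static_cast<double>(data_projection.size());
    const double mean = sum / n;
    return std::log(sum_sq / n - mean * mean);
  }
};
```

A sparseness-encouraging cost would derive from the same base and add
alpha times the sum of |e_ij| over the elements of mode_vector.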

To optimise a pair of modes:
1) Project data onto the two modes to get initial projections
2) For each angle, perform the same linear combination of the two modes
   and of their projections, then evaluate with mcal_single_basis_cost
   for both.
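
The two steps can be sketched as a coarse grid search over the angle. The
names combine and best_angle, the n_steps parameter and the use of a
generic callable for the cost are assumptions. The search covers only
[0, pi/2), which is enough when the cost is unchanged by flipping the sign
of a mode, since rotating by a further pi/2 just swaps the pair up to sign.
A real implementation would probably refine the result with a proper 1D
minimiser.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Vec = std::vector<double>;

// Returns ca*a + cb*b, element by element.
Vec combine(const Vec& a, double ca, const Vec& b, double cb)
{
  assert(a.size() == b.size());
  Vec r(a.size());
  for (std::size_t k = 0; k < a.size(); ++k) r[k] = ca * a[k] + cb * b[k];
  return r;
}

// p1, p2: data projected onto e1, e2 (step 1, done by the caller).
// cost(projection, mode) returns the cost contribution of a single mode.
// Returns the angle on the grid with the lowest summed cost for the pair.
template <class Cost>
double best_angle(const Vec& p1, const Vec& p2, const Vec& e1, const Vec& e2,
                  Cost cost, std::size_t n_steps)
{
  assert(n_steps > 0);
  const double half_pi = std::acos(0.0);
  double best_A = 0.0, best_cost = 0.0;
  for (std::size_t k = 0; k < n_steps; ++k)
  {
    const double A =
        half_pi * static_cast<double>(k) / static_cast<double>(n_steps);
    const double c = std::cos(A), s = std::sin(A);
    // Projection is linear, so the projections rotate exactly like the modes:
    // u_1 = e_1 cos(A) - e_2 sin(A);  u_2 = e_1 sin(A) + e_2 cos(A)
    const double total =
        cost(combine(p1, c, p2, -s), combine(e1, c, e2, -s)) +
        cost(combine(p1, s, p2, c), combine(e1, s, e2, c));
    if (k == 0 || total < best_cost)
    {
      best_cost = total;
      best_A = A;
    }
  }
  return best_A;
}
```

Each angle costs one pass over the projections per mode, which is why this
is simple but not necessarily efficient.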

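However, we can compute the variance of the projection onto a rotated
version directly from the variances and covariance of the two initial
projections, without needing the full data_projection.
So the variance could be supplied in place of data_projection as an
alternative.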
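
A sketch of that shortcut, assuming population variances and covariance and
the rotation convention used above (u_1 = e_1 cos(A) - e_2 sin(A),
u_2 = e_1 sin(A) + e_2 cos(A)). The struct and function names are
hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Second moments of the two initial projections, computed once per pair.
struct pair_moments
{
  double v11;  // variance of the projection onto e_1
  double v22;  // variance of the projection onto e_2
  double v12;  // covariance of the two projections
};

pair_moments second_moments(const std::vector<double>& p1,
                            const std::vector<double>& p2)
{
  assert(!p1.empty() && p1.size() == p2.size());
  const double n = static_cast<double>(p1.size());
  double m1 = 0.0, m2 = 0.0;
  for (std::size_t k = 0; k < p1.size(); ++k) { m1 += p1[k]; m2 += p2[k]; }
  m1 /= n;
  m2 /= n;
  pair_moments m = {0.0, 0.0, 0.0};
  for (std::size_t k = 0; k < p1.size(); ++k)
  {
    const double d1 = p1[k] - m1, d2 = p2[k] - m2;
    m.v11 += d1 * d1;
    m.v22 += d2 * d2;
    m.v12 += d1 * d2;
  }
  m.v11 /= n;
  m.v22 /= n;
  m.v12 /= n;
  return m;
}

// Variance along u_1: c^2 v11 + s^2 v22 - 2 c s v12
double rotated_var_1(const pair_moments& m, double A)
{
  const double c = std::cos(A), s = std::sin(A);
  return c * c * m.v11 + s * s * m.v22 - 2.0 * c * s * m.v12;
}

// Variance along u_2: s^2 v11 + c^2 v22 + 2 c s v12
double rotated_var_2(const pair_moments& m, double A)
{
  const double c = std::cos(A), s = std::sin(A);
  return s * s * m.v11 + c * c * m.v22 + 2.0 * c * s * m.v12;
}
```

With this, the log(var_i) part of the cost needs one pass over the data per
pair rather than one pass per angle; any term that depends on the mode
vector, such as the sparseness penalty, still has to be evaluated from the
rotated mode itself.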