In obtaining sparsity in eqs. (78)
and (80), we need the inverse Gram matrix of the
set.
In the following the elements of the
set are indexed from 1 to
t. Using the matrix inversion formula, eq. (182), the
addition of a new element is done sequentially. This is well known,
commonly used in the Kalman filter algorithm. We consider the new
element at the end (last row and column) of matrix
Kt + 1. The
matrix
Kt + 1 is decomposed:
Assuming Kt-1 known and applying the matrix inversion lemma for Kt + 1:
where
= k*t + 1 - kt + 1TKt-1kt + 1 is
the squared distance of the last feature vector from the linear span
of all previous ones (see Section 3.1). Using notations
Kt-1kt + 1 =
from eq. (62),
Kt-1 = Qt, and
Kt + 1-1 = Qt + 1 we have the recursion:
and in a more compact matrix notation:
where
et + 1 is the t + 1-th unit vector. With this recursion
equation all matrix inversion is eliminated (this result is general
for block matrices, such implementation, together with an
interpretation of the parameters has been also made
in [10]). The introduction of the rule
> 0 guarantees non-singularity of the Gram matrix (see
Fig. 3.1).
The block-diagonal decomposition of the Gram matrix from eq. (188) allows us to have a recursive expression for the determinant. Using eq. (184), we have
where
is the squared distance of the new input from the
subspace spanned by all previous inputs.
For numerical stability we can use the Cholesky-factorisation of the inverse Gram matrix Q. Using the lower-triangular matrix R with the corresponding indices, and the identity Q = RTR, we have the update for the Cholesky-decomposition
that is a computationally very inexpensive operation, without
additional operations provided that the quantities
and
et + 1 are already computed.
In Chapter 3 the diagonal elements of the inverse
Gram matrix are used in establishing the score for an element of the
set, and the columns of the same Gram matrix are used in updating
the GP elements. Fixing the index of the column to l, we have the
l-th diagonal element and the columns expressed as
where we used the decomposition of Rt + 1 along the l-th column
Rdatat + 1 = ![]() |
and the update of the Cholesky decomposition, when removing the l-th column is written as
with U being the Cholesky decomposition of
IN - l + . This is always positive definite and there
is a fast computation for U (in matlab one can compute it with
U=cholupdate(I,qN/q) with corresponding quantities).