Subsections

Iterative computation of the inverse Gram matrix

In obtaining sparsity in eqs. (78) and (80), we need the inverse Gram matrix of the $\cal {BV}$ set. In the following the elements of the $\cal {BV}$ set are indexed from 1 to t. Using the matrix inversion formula, eq. (182), the addition of a new element is done sequentially. This is well known, commonly used in the Kalman filter algorithm. We consider the new element at the end (last row and column) of matrix K_{t + 1}. The matrix K_{t + 1} is decomposed:

K_{t + 1} = $\displaystyle \begin{bmatrix}{\boldsymbol { K } }_t & {\boldsymbol { k } }_{t+1} \\ {\boldsymbol { k } }_{t+1}^T & k^*_{t+1} \end{bmatrix}$

(188)

Assuming K_t^-1 known and applying the matrix inversion lemma for K_{t + 1}:

$\begin{displaymath}\begin{split}{\boldsymbol { K } }_{t+1}^{-1} &= \begin{bmatri... ...\gamma_{t+1}^{-1} & \gamma_{t+1}^{-1} \end{bmatrix} \end{split}\end{displaymath}$

(189)

where $\gamma_{t+1}^{}$ = k^*_{t + 1} - k_{t + 1}^TK_t^-1k_{t + 1} is the squared distance of the last feature vector from the linear span of all previous ones (see Section 3.1). Using notations K_t^-1k_{t + 1} = $\hat{{\boldsymbol { e } }}_{t+1}^{}$ from eq. (62), K_t^-1 = Q_t, and K_{t + 1}^-1 = Q_{t + 1} we have the recursion:

Q_{t + 1} = $\displaystyle \begin{bmatrix}{\boldsymbol { Q } }_t + \gamma_{t+1}^{-1}\hat{{\b... ...{t+1}^{-1} \hat{{\boldsymbol { e } }}_{t+1}^T & \gamma_{t+1}^{-1} \end{bmatrix}$

(190)

and in a more compact matrix notation:

Q_{t + 1} = Q_t + $\displaystyle \gamma^{-1}_{t+1}$ ( $\displaystyle \hat{{\boldsymbol { e } }}_{t+1}^{}$ - e_{t + 1})( $\displaystyle \hat{{\boldsymbol { e } }}_{t+1}^{}$ - e_{t + 1})^T

(191)

where e_{t + 1} is the t + 1-th unit vector. With this recursion equation all matrix inversion is eliminated (this result is general for block matrices, such implementation, together with an interpretation of the parameters has been also made in [10]). The introduction of the rule $\gamma_{t+1}^{}$ > 0 guarantees non-singularity of the Gram matrix (see Fig. 3.1).

Computing determinants

The block-diagonal decomposition of the Gram matrix from eq. (188) allows us to have a recursive expression for the determinant. Using eq. (184), we have

|K_{t + 1}| = |K_t|| k_{t + 1}^* - k_{t + 1}^TK_t^-1k_{t + 1}| = |K_t| $\displaystyle \gamma_{t+1}^{}$

(192)

where $\gamma_{t+1}^{}$ is the squared distance of the new input from the subspace spanned by all previous inputs.

Updates for the Cholesky factorisation

For numerical stability we can use the Cholesky-factorisation of the inverse Gram matrix Q. Using the lower-triangular matrix R with the corresponding indices, and the identity Q = R^TR, we have the update for the Cholesky-decomposition

R_{t + 1} = $\displaystyle \begin{pmatrix}{\boldsymbol { R } }_t & 0\\ -\gamma_{t+1}^{-1/2}\hat{{\boldsymbol { e } }}_{t+1}^T & \gamma_{t+1}^{-1/2} \end{pmatrix}$

(193)

that is a computationally very inexpensive operation, without additional operations provided that the quantities $\gamma_{t+1}^{}$ and e_{t + 1} are already computed.

In Chapter 3 the diagonal elements of the inverse Gram matrix are used in establishing the score for an element of the $\cal {BV}$ set, and the columns of the same Gram matrix are used in updating the GP elements. Fixing the index of the column to l, we have the l-th diagonal element and the columns expressed as

$\begin{displaymath}\begin{split}\gamma_{l}&=(r^*)^{2}+\tilde{{\boldsymbol { r } ... ...-l}^T\tilde{{\boldsymbol { r } }}_{l} \end{bmatrix} \end{split}\end{displaymath}$

(194)

where we used the decomposition of R_{t + 1} along the l-th column

R_{datat + 1} = $\displaystyle \begin{bmatrix}{\boldsymbol { R } }_{l} & {\boldsymbol { 0 } }_l ... ...{{\boldsymbol { r } }}_{t-l} & \tilde{{\boldsymbol { R } }}_{t-l} \end{bmatrix}$

and the update of the Cholesky decomposition, when removing the l-th column is written as

R_t^\l = $\displaystyle \begin{bmatrix}{\boldsymbol { R } }_{l} & {\boldsymbol { 0 } }_{l... ...mbol { r } }^*}\right) & U^{-1}\tilde{{\boldsymbol { Q } }}_{t-l} \end{bmatrix}$

(195)

with U being the Cholesky decomposition of I_{N - l} + ${\frac{\tilde{{\boldsymbol { q } }}_{t-l}\tilde{{\boldsymbol { q } }}_{t-l}^{T}}{(q^*)^{2}}}$ . This is always positive definite and there is a fast computation for U (in matlab one can compute it with U=cholupdate(I,qN/q) with corresponding quantities).