KL-optimal parameter reduction

We want to find the constrained GP with parameters $(\hat{\boldsymbol{\alpha}}_{t+1}, \hat{\boldsymbol{C}}_{t+1})$, with the last elements all having zero values (see Section 3.3), that minimises the KL-divergence

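$$\begin{aligned}
2\,\mathrm{KL}(\hat{\cal{GP}}_{t+1} \,\|\, {\cal{GP}}_{t+1}) &= (\boldsymbol{\alpha}_{t+1} - \hat{\boldsymbol{\alpha}}_{t+1})^T \left(\boldsymbol{C}_{t+1} + \boldsymbol{K}_{t+1}^{-1}\right)^{-1} (\boldsymbol{\alpha}_{t+1} - \hat{\boldsymbol{\alpha}}_{t+1}) \\
&\quad + \mathrm{tr}\left[\left(\hat{\boldsymbol{C}}_{t+1} - \boldsymbol{C}_{t+1}\right)\left(\boldsymbol{C}_{t+1} + \boldsymbol{K}_{t+1}^{-1}\right)^{-1}\right] \\
&\quad - \ln\left\vert \left(\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{K}_{t+1}^{-1}\right)\left(\boldsymbol{C}_{t+1} + \boldsymbol{K}_{t+1}^{-1}\right)^{-1}\right\vert
\end{aligned}$$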

(196)

and we suppose the GP parameters $(\boldsymbol{\alpha}_{t+1}, \boldsymbol{C}_{t+1})$ and the $\cal {BV}$ set are given (we know $\boldsymbol{K}_{t+1}$ and $\boldsymbol{K}_t$). In the following we use $\boldsymbol{K}_{t+1}^{-1} = \boldsymbol{Q}_{t+1}$ and the decomposition of the GP parameters as presented in Chapter 3, with Fig. 3.3 repeated in Fig. D.1.

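**Figure D.1:** Grouping of the GP parameters (Fig. 3.3 repeated).

As used below, the grouping splits off the last $\cal {BV}$: $\boldsymbol{\alpha}_{t+1} = [\boldsymbol{\alpha}^{(r)T}\ \alpha^*]^T$, $\boldsymbol{C}_{t+1}$ has the upper-left $t \times t$ block $\boldsymbol{C}^{(r)}$ and last column $[\boldsymbol{C}^{*T}\ c^*]^T$, and $\boldsymbol{Q}_{t+1}$ has last column $[\boldsymbol{Q}^{*T}\ q^*]^T$.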

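Differentiating with respect to the parameters $\hat{\alpha}_1, \ldots, \hat{\alpha}_t$ leads to a system of equations that is easily written in matrix form as

$$\begin{aligned}
\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix} \left(\boldsymbol{Q}_{t+1} + \boldsymbol{C}_{t+1}\right)^{-1} \left(\hat{\boldsymbol{\alpha}}_{t+1} - \boldsymbol{\alpha}_{t+1}\right) &= 0 \\
\begin{bmatrix}\boldsymbol{B} & \boldsymbol{a}\end{bmatrix} \left(\hat{\boldsymbol{\alpha}}_{t+1} - \boldsymbol{\alpha}_{t+1}\right) &= 0
\end{aligned}$$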

(197)

where $\boldsymbol{I}_t$ is the identity matrix and $\boldsymbol{0}_t$ is the column vector of length $t$ with zero elements. In the second line the matrix multiplication has been performed and we used the decomposition

$$\left(\boldsymbol{Q}_{t+1} + \boldsymbol{C}_{t+1}\right)^{-1} = \begin{bmatrix}\boldsymbol{B} & \boldsymbol{a}\\ \boldsymbol{a}^T & b\end{bmatrix}$$

(198)

Finally, using the decomposition of the vector $\boldsymbol{\alpha}_{t+1}$ from Fig. 3.3, we have

$$\hat{\boldsymbol{\alpha}}_{t+1} = \boldsymbol{\alpha}^{(r)} + \alpha^* \tilde{\boldsymbol{e}}_{t+1} \quad\text{with}\quad \tilde{\boldsymbol{e}}_{t+1} = \boldsymbol{B}^{-1}\boldsymbol{a}$$

and $\tilde{\boldsymbol{e}}_{t+1}$ is obtained from the matrix inversion lemma for block matrices from eq. (182). Using the matrix inversion lemma for $(\boldsymbol{Q}_{t+1} + \boldsymbol{C}_{t+1})^{-1}$ from eq. (198) we have:

$$\boldsymbol{Q}_{t+1} + \boldsymbol{C}_{t+1} = \begin{bmatrix}\left(\boldsymbol{B} - \frac{\boldsymbol{a}\boldsymbol{a}^T}{b}\right)^{-1} & -\boldsymbol{B}^{-1}\boldsymbol{a}\,\delta \\ -\boldsymbol{a}^T\boldsymbol{B}^{-1}\delta & \delta\end{bmatrix} \quad\text{with}\quad \delta = \left(b - \boldsymbol{a}^T\boldsymbol{B}^{-1}\boldsymbol{a}\right)^{-1}$$

(199)

and using the correspondence $\delta = q^* + c^*$ and $\boldsymbol{Q}^* + \boldsymbol{C}^* = -\boldsymbol{B}^{-1}\boldsymbol{a}\,\delta$ read from eq. (199), we have

$$\tilde{\boldsymbol{e}}_{t+1} = -\frac{1}{q^* + c^*}\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)$$

(200)

and substituting it into the expression for the reduced mean parameters, we have

$$\hat{\boldsymbol{\alpha}}_{t+1} = \boldsymbol{\alpha}^{(r)} - \frac{\alpha^*}{c^* + q^*}\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)$$

(201)

$\qedsymbol$
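As a numerical sanity check of eq. (201), the following minimal sketch compares the closed form with the direct solution $\boldsymbol{\alpha}^{(r)} + \alpha^*\boldsymbol{B}^{-1}\boldsymbol{a}$. The toy RBF kernel matrix, the noise level `sigma2`, the targets `y`, the regression-type choice of $(\boldsymbol{\alpha}, \boldsymbol{C})$ and the helper name `reduce_mean` are illustrative assumptions, not taken from the text.

```python
import numpy as np

def reduce_mean(alpha, C, Q):
    """Eq. (201): mean parameters after removing the last BV (hypothetical helper)."""
    alpha_r, alpha_star = alpha[:-1], alpha[-1]
    qc_vec = Q[:-1, -1] + C[:-1, -1]   # Q^* + C^* (t-dimensional)
    qc = Q[-1, -1] + C[-1, -1]         # q^* + c^*
    return alpha_r - alpha_star / qc * qc_vec

# Toy problem (assumed): RBF kernel on four 1-D inputs, regression-type parameters.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 0.3, 1.2])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
sigma2 = 0.1
A = np.linalg.inv(K + sigma2 * np.eye(4))
alpha, C, Q = A @ y, -A, np.linalg.inv(K)

alpha_hat = reduce_mean(alpha, C, Q)

# Direct route: blocks B and a of (Q + C)^{-1}, as in eq. (198).
M = np.linalg.inv(Q + C)
B, a = M[:-1, :-1], M[:-1, -1]
alpha_direct = alpha[:-1] + alpha[-1] * np.linalg.solve(B, a)
print(np.allclose(alpha_hat, alpha_direct))
```

The closed form needs only the last column of $\boldsymbol{Q}_{t+1} + \boldsymbol{C}_{t+1}$, whereas the direct route inverts the full matrix; the comparison is there only to confirm the algebra.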

Before differentiating the KL-divergence with respect to $\hat{\boldsymbol{C}}_{t+1}$, we simplify the terms that include $\hat{\boldsymbol{C}}_{t+1}$ in eq. (196). Firstly we write the constraints for the last row and column of $\hat{\boldsymbol{C}}_{t+1}$ using the extension matrix $[\boldsymbol{I}_t\ \boldsymbol{0}_t]$ as

$$\hat{\boldsymbol{C}}_{t+1} = \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^T \hat{\boldsymbol{C}}_{t+1}^{(r)} \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}$$

(202)

where $\hat{\boldsymbol{C}}_{t+1}^{(r)}$ is a matrix with $t$ rows and columns, and in the following we will use $\hat{\boldsymbol{C}}_{t+1}$ instead of $\hat{\boldsymbol{C}}_{t+1}^{(r)}$. Permuting the elements in the trace term of eq. (196) leads to

$$\begin{aligned}
&\mathrm{tr}\left[\begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^T \hat{\boldsymbol{C}}_{t+1} \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix} \left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1}\right] \\
&\quad = \mathrm{tr}\left[\hat{\boldsymbol{C}}_{t+1} \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix} \left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1} \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^T\right]
\end{aligned}$$

(203)

where the additive term $-\boldsymbol{C}_{t+1}(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1})^{-1}$ is ignored since it will not contribute to the result of the differentiation. Ignoring also the term not depending on $\hat{\boldsymbol{C}}_{t+1}$ in the determinant, and using the replacement of $\hat{\boldsymbol{C}}_{t+1}$ from eq. (202), we simplify the log-determinant

$$\begin{aligned}
\ln\left\vert \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix}^T \hat{\boldsymbol{C}}_{t+1} \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}_t\end{bmatrix} + \boldsymbol{Q}_{t+1}\right\vert
&= \ln\left\vert \begin{bmatrix}\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t + \frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*} & \boldsymbol{Q}^* \\ \boldsymbol{Q}^{*T} & q^*\end{bmatrix}\right\vert \\
&= \ln\left\vert \hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t + \frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*} - \frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*}\right\vert + \ln q^* \\
&= \ln\left\vert \hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t\right\vert + \ln q^*
\end{aligned}$$

(204)

where we used the block decomposition of $\boldsymbol{Q}_{t+1}$ (first line) and the expression for the determinant of a block matrix from eq. (184).

Differentiating the KL-distance with respect to $\hat{\boldsymbol{C}}_{t+1}$ combines the derivatives of eqs. (203) and (204); setting the result to zero gives

$$\left(\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t\right)^{-1} = \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}\end{bmatrix} \left(\boldsymbol{Q}_{t+1} + \boldsymbol{C}_{t+1}\right)^{-1} \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}\end{bmatrix}^T$$

(205)
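For completeness, the step leading to eq. (205) can be sketched as follows. The sketch uses the standard identities $\partial\,\mathrm{tr}[\boldsymbol{X}\boldsymbol{M}]/\partial\boldsymbol{X} = \boldsymbol{M}^T$ and $\partial \ln\vert\boldsymbol{X} + \boldsymbol{A}\vert/\partial\boldsymbol{X} = (\boldsymbol{X} + \boldsymbol{A})^{-T}$ and treats the entries of $\hat{\boldsymbol{C}}_{t+1}$ as free variables, which is an assumption of this sketch; the shorthand $\boldsymbol{M}$ is introduced here only for brevity. Since the log-determinant enters eq. (196) with a negative sign,

$$\boldsymbol{M} \doteq \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}\end{bmatrix} \left(\boldsymbol{Q}_{t+1} + \boldsymbol{C}_{t+1}\right)^{-1} \begin{bmatrix}\boldsymbol{I}_t & \boldsymbol{0}\end{bmatrix}^T, \qquad \frac{\partial}{\partial \hat{\boldsymbol{C}}_{t+1}}\left\{\mathrm{tr}\left[\hat{\boldsymbol{C}}_{t+1}\boldsymbol{M}\right] - \ln\left\vert\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t\right\vert\right\} = \boldsymbol{M} - \left(\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t\right)^{-1} = 0,$$

where the symmetry of $\boldsymbol{M}$ and of $\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t$ removes the transposes.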

We apply the matrix inversion lemma to the RHS, similarly to the case of eq. (204); retaining only the upper-left part leads to

$$\left(\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t\right)^{-1} = \left(\boldsymbol{C}^{(r)} + \boldsymbol{Q}_t + \frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*} - \frac{\left(\boldsymbol{C}^* + \boldsymbol{Q}^*\right)\left(\boldsymbol{C}^* + \boldsymbol{Q}^*\right)^T}{q^* + c^*}\right)^{-1}$$

(206)

and the reduced covariance parameter is

$$\hat{\boldsymbol{C}}_{t+1} = \boldsymbol{C}^{(r)} + \frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*} - \frac{\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)^T}{q^* + c^*}$$

(207)

$\qedsymbol$
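The reduced covariance of eq. (207) can be checked numerically against the stationarity condition of eq. (205). The sketch below is a minimal illustration under assumed toy data (RBF kernel on four 1-D inputs, regression-type $\boldsymbol{C}$ with noise 0.1); the helper name `reduce_cov` is hypothetical.

```python
import numpy as np

def reduce_cov(C, Q):
    """Eq. (207): t x t covariance parameters after removing the last BV (hypothetical helper)."""
    C_r, C_star, c_star = C[:-1, :-1], C[:-1, -1], C[-1, -1]
    Q_star, q_star = Q[:-1, -1], Q[-1, -1]
    v = Q_star + C_star                      # Q^* + C^*
    return C_r + np.outer(Q_star, Q_star) / q_star - np.outer(v, v) / (q_star + c_star)

# Toy problem (assumed values).
x = np.array([0.0, 1.0, 2.0, 3.0])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
C = -np.linalg.inv(K + 0.1 * np.eye(4))
Q = np.linalg.inv(K)
Q_t = np.linalg.inv(K[:-1, :-1])             # inverse kernel matrix of the reduced BV set

C_hat = reduce_cov(C, Q)

# Eq. (205): (C_hat + Q_t)^{-1} equals the upper-left block of (Q + C)^{-1}.
lhs = np.linalg.inv(C_hat + Q_t)
rhs = np.linalg.inv(Q + C)[:-1, :-1]
print(np.allclose(lhs, rhs))
```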

Computing the KL-distance

We assess the error made when pruning the GP by evaluating the KL-divergence from eq. (196) between the process with $(\boldsymbol{\alpha}_{t+1}, \boldsymbol{C}_{t+1})$ and the pruned one with $(\hat{\boldsymbol{\alpha}}_{t+1}, \hat{\boldsymbol{C}}_{t+1})$ from the previous section. We start by writing the pruning equations in terms of $(t+1)$-dimensional vectors: in the following we will use $\boldsymbol{Q}^* \doteq [\boldsymbol{Q}^{*T}\ q^*]^T$ and $\boldsymbol{C}^* \doteq [\boldsymbol{C}^{*T}\ c^*]^T$, and the pruning equations are

$$\begin{aligned}
\hat{\boldsymbol{\alpha}}_{t+1} &= \boldsymbol{\alpha}_{t+1} - \frac{\alpha^*}{q^* + c^*}\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right) \\
\hat{\boldsymbol{C}}_{t+1} &= \boldsymbol{C}_{t+1} + \frac{1}{q^*}\boldsymbol{Q}^*\boldsymbol{Q}^{*T} - \frac{1}{q^* + c^*}\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)^T
\end{aligned}$$

(208)

and it is easy to check that the updates result in the last element of $\hat{\boldsymbol{\alpha}}_{t+1}$ and the last row and column of $\hat{\boldsymbol{C}}_{t+1}$ being all zeros. In computing the KL-divergence we will use the following identities from matrix algebra:

$$\left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1}\left(\boldsymbol{C}^* + \boldsymbol{Q}^*\right) = \boldsymbol{e}_{t+1} \quad\text{and}\quad \boldsymbol{K}_{t+1}\boldsymbol{Q}^* = \boldsymbol{e}_{t+1}$$

where $\boldsymbol{e}_{t+1}$ is the $(t+1)$-th unit vector.

Based on the first identity, the term containing the mean is

$$(\boldsymbol{\alpha}_{t+1} - \hat{\boldsymbol{\alpha}}_{t+1})^T \left(\boldsymbol{C}_{t+1} + \boldsymbol{K}_{t+1}^{-1}\right)^{-1} (\boldsymbol{\alpha}_{t+1} - \hat{\boldsymbol{\alpha}}_{t+1}) = \frac{\alpha^{*2}}{q^* + c^*}$$

(209)
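The intermediate step behind eq. (209) is short enough to spell out. From eq. (208), $\boldsymbol{\alpha}_{t+1} - \hat{\boldsymbol{\alpha}}_{t+1} = \frac{\alpha^*}{q^* + c^*}(\boldsymbol{Q}^* + \boldsymbol{C}^*)$, and the last element of the $(t+1)$-dimensional vector $\boldsymbol{Q}^* + \boldsymbol{C}^*$ is $q^* + c^*$, so the first identity gives

$$(\boldsymbol{\alpha}_{t+1} - \hat{\boldsymbol{\alpha}}_{t+1})^T \left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1} (\boldsymbol{\alpha}_{t+1} - \hat{\boldsymbol{\alpha}}_{t+1}) = \left(\frac{\alpha^*}{q^* + c^*}\right)^2 \left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)^T \boldsymbol{e}_{t+1} = \frac{\alpha^{*2}}{q^* + c^*}.$$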

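The logarithm of the determinants is transformed using the determinant of a block matrix from eq. (184). Inside the determinants, $\boldsymbol{Q}^*$ and $\boldsymbol{C}^*$ denote the $t$-dimensional blocks:

$$\left\vert \hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_{t+1}\right\vert = \left\vert \boldsymbol{C}^{(r)} + \boldsymbol{Q}_t + \frac{1}{q^*}\boldsymbol{Q}^*\boldsymbol{Q}^{*T} - \frac{1}{q^* + c^*}\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)^T\right\vert \; q^*$$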

(210)

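and using a similar decomposition for the denominator we have

$$\left\vert \boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right\vert = \left\vert \boldsymbol{C}^{(r)} + \boldsymbol{Q}_t + \frac{1}{q^*}\boldsymbol{Q}^*\boldsymbol{Q}^{*T} - \frac{1}{q^* + c^*}\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)^T\right\vert \; (q^* + c^*)$$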

(211)

and the logarithm of the ratio has the simple expression

$$\ln\left\vert \left(\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_{t+1}\right)\left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1}\right\vert = \ln\frac{q^*}{q^* + c^*}$$

(212)

Finally, using the invariance of the trace of a product with respect to circular permutation of its elements, the trace term is:

$$\begin{aligned}
&\mathrm{tr}\left[\left(\frac{\boldsymbol{Q}^*\boldsymbol{Q}^{*T}}{q^*} - \frac{\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)^T}{q^* + c^*}\right)\left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1}\right] \\
&\quad = \frac{1}{q^*}\boldsymbol{Q}^{*T}\left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1}\boldsymbol{Q}^* - \frac{1}{q^* + c^*}\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right)^T\left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1}\left(\boldsymbol{Q}^* + \boldsymbol{C}^*\right) \\
&\quad = \frac{1}{q^*}\boldsymbol{Q}^{*T}\left[\boldsymbol{K}_{t+1} - \boldsymbol{K}_{t+1}\left(\boldsymbol{C}_{t+1}^{-1} + \boldsymbol{K}_{t+1}\right)^{-1}\boldsymbol{K}_{t+1}\right]\boldsymbol{Q}^* - 1 \\
&\quad = 1 - \frac{1}{q^*}\boldsymbol{e}_{t+1}^T\left(\boldsymbol{C}_{t+1}^{-1} + \boldsymbol{K}_{t+1}\right)^{-1}\boldsymbol{e}_{t+1} - 1 = -\frac{s^*}{q^*}
\end{aligned}$$

(213)

where $s^*$ is the last diagonal element of the matrix $(\boldsymbol{C}_{t+1}^{-1} + \boldsymbol{K}_{t+1})^{-1}$. Combining eqs. (209), (212), and (213) with the signs of eq. (196), we have the minimum KL-distance

$$2\,\mathrm{KL}(\hat{\cal{GP}}_{t+1} \,\|\, {\cal{GP}}_{t+1}) = \frac{\alpha^{*2}}{q^* + c^*} - \frac{s^*}{q^*} + \ln\left(1 + \frac{c^*}{q^*}\right)$$

(214)
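The closed-form score of eq. (214) can be compared with a term-by-term evaluation of eq. (196) using the pruned parameters of eq. (208). The following is a minimal sketch under assumed toy data; the function names `kl_score` and `kl_direct` are hypothetical, and the explicit inverses are used only to verify the algebra, not as a recommended way to compute the score.

```python
import numpy as np

def kl_score(alpha, C, K):
    """Eq. (214): twice the KL-distance incurred by removing the last BV."""
    Q = np.linalg.inv(K)
    S = np.linalg.inv(np.linalg.inv(C) + K)
    q, c, s = Q[-1, -1], C[-1, -1], S[-1, -1]
    return alpha[-1] ** 2 / (q + c) - s / q + np.log1p(c / q)

def kl_direct(alpha, C, K):
    """Eq. (196), evaluated term by term with the pruned parameters of eq. (208)."""
    Q = np.linalg.inv(K)
    q_vec = Q[:, -1]                 # (t+1)-dimensional Q^*
    v = Q[:, -1] + C[:, -1]          # (t+1)-dimensional Q^* + C^*
    alpha_hat = alpha - alpha[-1] / v[-1] * v
    C_hat = C + np.outer(q_vec, q_vec) / q_vec[-1] - np.outer(v, v) / v[-1]
    M = np.linalg.inv(C + Q)
    d = alpha - alpha_hat
    logdet_hat = np.linalg.slogdet(C_hat + Q)[1]
    logdet = np.linalg.slogdet(C + Q)[1]
    return d @ M @ d + np.trace((C_hat - C) @ M) - (logdet_hat - logdet)

# Toy problem (assumed): RBF kernel on four 1-D inputs, regression-type parameters.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.5, -1.0, 0.3, 1.2])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
A = np.linalg.inv(K + 0.1 * np.eye(4))
alpha, C = A @ y, -A

score = kl_score(alpha, C, K)
direct = kl_direct(alpha, C, K)
print(np.isclose(score, direct))
```

Only the scalars $\alpha^*$, $q^*$, $c^*$ and $s^*$ enter the score, which is why the next section concentrates on keeping the matrix $\boldsymbol{S}$ up to date.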

Updates for $\boldsymbol{S}_{t+1} = (\boldsymbol{C}_{t+1}^{-1} + \boldsymbol{K}_{t+1})^{-1}$

Matrix inversion is numerically sensitive, so we try to avoid it. In computing the score for a given $\cal {BV}$ in the previous section, eq. (214), we need the corresponding diagonal element of the matrix $\boldsymbol{S} = (\boldsymbol{C}^{-1} + \boldsymbol{K})^{-1}$. In this section we sketch an iterative update rule for the matrix $\boldsymbol{S}$, and an update for when the KL-optimal removal of the last $\cal {BV}$ element is performed.

First we establish the update rule for the inverse of the matrix $\boldsymbol{C}_{t+1}$. By using the matrix inversion lemma and the update from eq. (57), the matrix $\boldsymbol{C}_{t+1}^{-1}$ is

$$\boldsymbol{C}_{t+1}^{-1} = \begin{bmatrix}\boldsymbol{C}_t^{-1} & -\boldsymbol{k}_{t+1} \\ -\boldsymbol{k}_{t+1}^T & (r^{(t+1)})^{-1} + \boldsymbol{k}_{t+1}^T\boldsymbol{C}_t\boldsymbol{k}_{t+1}\end{bmatrix}$$

(215)

Then we combine the above relation with the block decomposition of the kernel matrix; observing that the $t \times 1$ off-diagonal column vector of the sum is zero, we have

$$\left(\boldsymbol{C}_{t+1}^{-1} + \boldsymbol{K}_{t+1}\right)^{-1} = \begin{bmatrix}\left(\boldsymbol{C}_t^{-1} + \boldsymbol{K}_t\right)^{-1} & \boldsymbol{0} \\ \boldsymbol{0}^T & a^{-1}\end{bmatrix} \quad\text{where}\quad a = (r^{(t+1)})^{-1} + \boldsymbol{k}_{t+1}^T\boldsymbol{C}_t\boldsymbol{k}_{t+1} + k^*$$

This shows that the update for the matrix $\boldsymbol{S}_{t+1}$ is particularly simple: we only need to append the value $a^{-1}$ as a new last diagonal element.
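The growth step can be illustrated with a short numerical sketch that builds $\boldsymbol{C}_{t+1}^{-1}$ from eq. (215) and compares the block-diagonal update with a brute-force inverse. The toy kernel, the noise level, the value chosen for $r^{(t+1)}$ and the helper name `grow_S` are assumptions made only for this illustration.

```python
import numpy as np

def grow_S(S_t, C_t, k, k_star, r):
    """Append a BV: S_{t+1} = blockdiag(S_t, 1/a), with a as defined above (hypothetical helper)."""
    a = 1.0 / r + k @ C_t @ k + k_star
    t = S_t.shape[0]
    S_next = np.zeros((t + 1, t + 1))
    S_next[:t, :t] = S_t
    S_next[t, t] = 1.0 / a
    return S_next

# Toy problem (assumed values).
x = np.array([0.0, 1.0, 2.0, 3.0])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
K_t, k, k_star = K[:-1, :-1], K[:-1, -1], K[-1, -1]
C_t = -np.linalg.inv(K_t + 0.1 * np.eye(3))
S_t = np.linalg.inv(np.linalg.inv(C_t) + K_t)
r = -0.5   # hypothetical value of r^{(t+1)}

S_next = grow_S(S_t, C_t, k, k_star, r)

# Brute-force reference: C_{t+1}^{-1} assembled from eq. (215), then inverted with K_{t+1}.
C_inv_next = np.block([
    [np.linalg.inv(C_t), -k[:, None]],
    [-k[None, :], np.array([[1.0 / r + k @ C_t @ k]])],
])
S_brute = np.linalg.inv(C_inv_next + K)
print(np.allclose(S_next, S_brute))
```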

When removing a $\cal {BV}$, however, the resulting matrix will no longer be diagonal. To have an update quadratic in the size of $\boldsymbol{S}$, we use the matrix inversion lemma

$$\boldsymbol{S}_{t+1} = \boldsymbol{Q}_{t+1} - \boldsymbol{Q}_{t+1}\left(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1}\right)^{-1}\boldsymbol{Q}_{t+1}$$

and after the pruning we are looking for the $t \times t$ matrix $\hat{\boldsymbol{S}}_{t+1} = (\hat{\boldsymbol{C}}_{t+1}^{-1} + \boldsymbol{K}_t)^{-1}$. We can obtain this by using eq. (206): the pruned $(\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t)^{-1}$ is the matrix obtained by cutting the last row and column from $(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1})^{-1}$. The computation of the updated matrix $\hat{\boldsymbol{S}}_{t+1}$ thus has three steps:

1. compute $(\boldsymbol{C}_{t+1} + \boldsymbol{Q}_{t+1})^{-1} = \boldsymbol{K}_{t+1} - \boldsymbol{K}_{t+1}\boldsymbol{S}_{t+1}\boldsymbol{K}_{t+1}$
2. compute the reduced matrix $(\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t)^{-1}$ by trimming, using eq. (206)
3. compute the updated $\hat{\boldsymbol{S}}_{t+1}$ using $\hat{\boldsymbol{S}}_{t+1} = \boldsymbol{Q}_t - \boldsymbol{Q}_t(\hat{\boldsymbol{C}}_{t+1} + \boldsymbol{Q}_t)^{-1}\boldsymbol{Q}_t$
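The three steps above can be sketched in NumPy as follows. The helper name `prune_S` and the toy data are assumptions; $\boldsymbol{Q}_t$ is obtained from the blocks of $\boldsymbol{Q}_{t+1}$ using the decomposition in the first line of eq. (204), so the update itself contains no matrix inversion. The brute-force reference at the end inverts explicitly and is there only to check the result.

```python
import numpy as np

def prune_S(S, K, Q):
    """Three-step update of S after the KL-optimal removal of the last BV (hypothetical helper)."""
    Q_star, q_star = Q[:-1, -1], Q[-1, -1]
    Q_t = Q[:-1, :-1] - np.outer(Q_star, Q_star) / q_star  # K_t^{-1} from the blocks of Q_{t+1}
    M = K - K @ S @ K                # step 1: (C_{t+1} + Q_{t+1})^{-1}
    M_hat = M[:-1, :-1]              # step 2: trim the last row and column
    return Q_t - Q_t @ M_hat @ Q_t   # step 3: S_hat

# Toy problem (assumed values).
x = np.array([0.0, 1.0, 2.0, 3.0])
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2)
C = -np.linalg.inv(K + 0.1 * np.eye(4))
Q = np.linalg.inv(K)
S = np.linalg.inv(np.linalg.inv(C) + K)

S_hat = prune_S(S, K, Q)

# Brute-force reference: prune C with eq. (207), then invert explicitly.
v = Q[:-1, -1] + C[:-1, -1]
C_hat = (C[:-1, :-1] + np.outer(Q[:-1, -1], Q[:-1, -1]) / Q[-1, -1]
         - np.outer(v, v) / (Q[-1, -1] + C[-1, -1]))
S_brute = np.linalg.inv(np.linalg.inv(C_hat) + K[:-1, :-1])
print(np.allclose(S_hat, S_brute, atol=1e-6))
```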
