Toggle Light / Dark / Auto color theme
Toggle table of contents sidebar
Analytical example with greedy kernel points
Step-by-step usage of greedy kernel points on an analytical example.
See GreedyKernelPoints for a description of the
method.
We start with the supervised dataset of three points in \(\mathbb{R}^2\)
\[\begin{split}X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 2 & 1 \end{pmatrix}\end{split}\]
with supervision
\[\begin{split}y^{(1)} = \begin{pmatrix} 0 \\ 1 \\ 5 \end{pmatrix} .\end{split}\]
With the LinearKernel with
output_scale \(= 1\) and
constant \(= 0\) and using the
GreedyKernelPoints algorithm with zero
regularisation (\(\lambda = 0\) ), we seek a coreset of size two without duplicates.
The feature Gramian is
\[\begin{split}K^{(11)} &= X^T X \\
&= \begin{pmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 2 & 1 & 5 \end{pmatrix} .\end{split}\]
Following the notation in GreedyKernelPoints , let
\(\mathcal{D}^{(2)} = \{(x_i, y_i)\}_{i=1}^m\) be the set of points already selected
for the coreset. Then, \(X^{(2)} \in \mathbb{R}^{m \times 2}\) is the selected data,
\(y^{(2)} \in \mathbb{R}^m\) is the vector of responses, \(K^{(12)}\) is the
cross-matrix of kernel evaluations (a subset of columns of the feature Gramian), and
\(K^{(22)}\) is the kernel matrix on \(\mathcal{D}^{(2)}\) (a subset of rows and
columns of the feature Gramian). With zero regularisation, the predictions for a given
coreset are given by \(z = K^{(12)} {K^{(22)}}^{-1} y^{(2)}\) .
First iteration
At the first iteration, there are no data points already in the coreset, so we consider
the three candidate coresets, each of size one.
Candidate (0)
\[\begin{split}X^{(2)} &= \begin{pmatrix} 1 & 0 \end{pmatrix} ; \\
y^{(2)} &= \begin{pmatrix} 0 \end{pmatrix} ; \\
K^{(12)} &= \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} ; \\
K^{(22)} &= \begin{pmatrix} 1 \end{pmatrix} .\end{split}\]
Then, the inverse of the kernel matrix is
\[{K^{(22)}}^{-1} = \begin{pmatrix} 1 \end{pmatrix} ,\]
and the prediction is
\[\begin{split}z = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} ,\end{split}\]
so the loss is
\[\begin{split}L &= \left\| y^{(1)} - z \right\|^2 \\
&= 0^2 + 1^2 + 5^2 \\
&= 26 .\end{split}\]
Candidate (1)
\[\begin{split}X^{(2)} &= \begin{pmatrix} 0 & 1 \end{pmatrix} ; \\
y^{(2)} &= \begin{pmatrix} 0 \end{pmatrix} ; \\
K^{(12)} &= \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} ; \\
K^{(22)} &= \begin{pmatrix} 1 \end{pmatrix} .\end{split}\]
Then, the inverse of the kernel matrix is
\[{K^{(22)}}^{-1} = \begin{pmatrix} 1 \end{pmatrix} ,\]
and the prediction is
\[\begin{split}z = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} ,\end{split}\]
so the loss is
\[\begin{split}L &= 0^2 + 0^2 + 4^2 \\
&= 16 .\end{split}\]
Candidate (2)
\[\begin{split}X^{(2)} &= \begin{pmatrix} 2 & 1 \end{pmatrix} ; \\
y^{(2)} &= \begin{pmatrix} 5 \end{pmatrix} ; \\
K^{(12)} &= \begin{pmatrix} 2 \\ 1 \\ 5 \end{pmatrix} ; \\
K^{(22)} &= \begin{pmatrix} 5 \end{pmatrix} .\end{split}\]
Then, the inverse of the kernel matrix is
\[{K^{(22)}}^{-1} = \begin{pmatrix} \frac{1}{5} \end{pmatrix} ,\]
and the prediction is
\[\begin{split}z = \begin{pmatrix} 2 \\ 1 \\ 5 \end{pmatrix} ,\end{split}\]
so the loss is
\[\begin{split}L &= 2^2 + 0^2 + 0^2 \\
&= 4 .\end{split}\]
Selection
Index 2 has the lowest loss, so joins the coreset.
Second iteration
We consider two candidate coresets of size two, each containing data corresponding to
index 2 as the first element with another index in the second element.
Candidate (2 0)
\[\begin{split}X^{(2)} &= \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix} ; \\
y^{(2)} &= \begin{pmatrix} 5 \\ 0 \end{pmatrix} ; \\
K^{(12)} &= \begin{pmatrix} 2 & 1 \\ 1 & 0 \\ 5 & 2 \end{pmatrix} ; \\
K^{(22)} &= \begin{pmatrix} 5 & 2 \\ 2 & 1 \end{pmatrix} .\end{split}\]
Then, the inverse of the kernel matrix is
\[\begin{split}{K^{(22)}}^{-1} = \begin{pmatrix} 1 & -2 \\ -2 & 5 \end{pmatrix} ,\end{split}\]
and the prediction is
\[\begin{split}z &= K^{(12)} {K^{(22)}}^{-1} y^{(2)} \\
&= \begin{pmatrix} 2 & 1 \\ 1 & 0 \\ 5 & 2 \end{pmatrix}
\begin{pmatrix} 1 & -2 \\ -2 & 5 \end{pmatrix}
\begin{pmatrix} 5 \\ 0 \end{pmatrix} \\
&= \begin{pmatrix} 2 & 1 \\ 1 & 0 \\ 5 & 2 \end{pmatrix}
\begin{pmatrix} 5 \\ -10 \end{pmatrix} \\
&= \begin{pmatrix} 0 \\ 5 \\ 5 \end{pmatrix} ,\end{split}\]
so the loss is
\[\begin{split}L &= 0^2 + 4^2 + 0^2 \\
&= 16 .\end{split}\]
Candidate (2 1)
\[\begin{split}X^{(2)} &= \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix} ; \\
y^{(2)} &= \begin{pmatrix} 5 \\ 1 \end{pmatrix} ; \\
K^{(12)} &= \begin{pmatrix} 2 & 0 \\ 1 & 1 \\ 5 & 1 \end{pmatrix} ; \\
K^{(22)} &= \begin{pmatrix} 5 & 1 \\ 1 & 1 \end{pmatrix} .\end{split}\]
Then, the inverse of the kernel matrix is
\[\begin{split}{K^{(22)}}^{-1} = \frac{1}{4} \begin{pmatrix} 1 & -1 \\ -1 & 5 \end{pmatrix} ,\end{split}\]
and the prediction is
\[\begin{split}z &= K^{(12)} {K^{(22)}}^{-1} y^{(2)} \\
&= \frac{1}{4} \begin{pmatrix} 2 & 0 \\ 1 & 1 \\ 5 & 1 \end{pmatrix}
\begin{pmatrix} 1 & -1 \\ -1 & 5 \end{pmatrix}
\begin{pmatrix} 5 \\ 1 \end{pmatrix} \\
&= \begin{pmatrix} 2 & 0 \\ 1 & 1 \\ 5 & 1 \end{pmatrix}
\begin{pmatrix} 4 \\ 0 \end{pmatrix} \\
&= \begin{pmatrix} 2 \\ 1 \\ 5 \end{pmatrix} ,\end{split}\]
so the loss is
\[\begin{split}L &= 2^2 + 0^2 + 0^2 \\
&= 4 .\end{split}\]
Final selection
The second candidate has the lower loss, so the final coreset consists of indices
\(\begin{pmatrix} 2 & 1 \end{pmatrix}\) . In terms of original data, this can be
expressed as
\[\begin{split}\hat{X} &= \begin{pmatrix} 2 & 1 \\ 0 & 1 \end{pmatrix} ; \\
\hat{y} &= \begin{pmatrix} 5 \\ 1 \end{pmatrix} .\end{split}\]
examples.greedy_kernel_points_analytic. main ( ) [source]
Run the greedy kernel points analytical example.
Return type:
Coresubset [SupervisedData ]
Returns:
Object containing coreset indices and materialised coreset.