Approximations#

Classes and associated functionality to approximate kernels.

When a dataset is very large, methods which have to evaluate all pairwise combinations of the data, such as gramian_row_mean(), can become prohibitively expensive. To reduce this computational cost, such methods can instead be approximated (providing suitable approximation error can be achieved).

The ApproximateKernels in this module provide the functionality required to override specific methods of a base_kernel with their approximate counterparts. Because ApproximateKernels inherit from ScalarValuedKernel, with all functionality provided through composition with a base_kernel, they can be freely used in any place where a standard ScalarValuedKernel is expected.

class coreax.approximation.ApproximateKernel(base_kernel)[source]#

Bases: UniCompositeKernel

Base class for approximated kernels.

Provides approximations of the methods in the base_kernel.

The gramian_row_mean() method is particularly amenable to approximation, with significant performance improvements possible depending on the acceptable levels of error.

Parameters:: base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated

compute_elementwise(x, y)[source]#

Evaluate the kernel on individual input vectors x and y, not-vectorised.

Vectorisation only becomes relevant in terms of computational speed when we have multiple x or y.

Parameters:

x – Vector \(\mathbf{x} \in \mathbb{R}^d\)
y – Vector \(\mathbf{y} \in \mathbb{R}^d\)

Returns:

Kernel evaluated at (x, y)

grad_x_elementwise(x, y)[source]#

Evaluate the element-wise gradient of the kernel function w.r.t. x.

The gradient (Jacobian) of the kernel function w.r.t. x.

Only accepts single vectors x and y, i.e. not arrays. coreax.kernels.ScalarValuedKernel.grad_x() provides a vectorised version of this method for arrays.

Parameters:

x – Vector \(\mathbf{x} \in \mathbb{R}^d\)
y – Vector \(\mathbf{y} \in \mathbb{R}^d\)

Returns:

Jacobian \(\nabla_\mathbf{x} k(\mathbf{x}, \mathbf{y}) \in \mathbb{R}^d\)

grad_y_elementwise(x, y)[source]#

Evaluate the element-wise gradient of the kernel function w.r.t. y.

The gradient (Jacobian) of the kernel function w.r.t. y.

Only accepts single vectors x and y, i.e. not arrays. coreax.kernels.ScalarValuedKernel.grad_y() provides a vectorised version of this method for arrays.

Parameters:

x – Vector \(\mathbf{x} \in \mathbb{R}^d\).
y – Vector \(\mathbf{y} \in \mathbb{R}^d\).

Returns:

Jacobian \(\nabla_\mathbf{y} k(\mathbf{x}, \mathbf{y}) \in \mathbb{R}^d\)

divergence_x_grad_y_elementwise(x, y)[source]#

Evaluate the element-wise divergence w.r.t. x of Jacobian w.r.t. y.

\(\nabla_\mathbf{x} \cdot \nabla_\mathbf{y} k(\mathbf{x}, \mathbf{y})\). Only accepts vectors x and y. A vectorised version for arrays is computed in divergence_x_grad_y().

This is the trace of the ‘pseudo-Hessian’, i.e. the trace of the Jacobian matrix \(\nabla_\mathbf{x} \nabla_\mathbf{y} k(\mathbf{x}, \mathbf{y})\).

Parameters:

x – First vector \(\mathbf{x} \in \mathbb{R}^d\)
y – Second vector \(\mathbf{y} \in \mathbb{R}^d\)

Returns:

Trace of the Laplace-style operator; a real number

class coreax.approximation.RandomRegressionKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)[source]#

Bases: ApproximateKernel

An approximate kernel that requires the attributes for random regression.

Parameters:

base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
random_key (ArrayLike) – Key for random number generation
num_kernel_points (int) – Number of kernel evaluation points
num_train_points (int) – Number of training points used to fit kernel regression

random_key: ArrayLike#

num_kernel_points: int = 10000#

num_train_points: int = 10000#

class coreax.approximation.MonteCarloApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)[source]#

Bases: RandomRegressionKernel

Approximate a base kernel via random subset selection.

Only the Gramian row-mean is approximated here, all other methods are inherited directly from the base_kernel.

Parameters:

base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
random_key (ArrayLike) – Key for random number generation
num_kernel_points (int) – Number of kernel evaluation points
num_train_points (int) – Number of training points used to fit kernel regression

gramian_row_mean(x, **kwargs)[source]#

Approximate the Gramian row-mean by Monte-Carlo sampling.

A uniform random subset of x is used to approximate the base kernel’s Gramian row-mean.

Parameters:: x (Union[Shaped[Array, 'n d'], Shaped[Array, 'd'], Shaped[Array, ''], float, int, Data]) – Data matrix, \(n \times d\)
Return type:: Shaped[Array, 'n']
Returns:: Approximation of the base kernel’s Gramian row-mean

class coreax.approximation.ANNchorApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)[source]#

Bases: RandomRegressionKernel

Approximate a base kernel via random kernel regression on ANNchor selected points.

Only the base kernel’s Gramian row-mean is approximated here, all other methods are inherited directly from the base_kernel.

Parameters:

base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
random_key (ArrayLike) – Key for random number generation
num_kernel_points (int) – Number of kernel evaluation points
num_train_points (int) – Number of training points used to fit kernel regression

gramian_row_mean(x, **kwargs)[source]#

Approximate the Gramian row-mean by random regression on ANNchor points.

A subset of x is selected via the ANNchor approach and random kernel regression used to approximate the base kernel’s Gramian row-mean. The ANNchor implementation used can be found here.

Parameters:: x (Union[Shaped[Array, 'n d'], Shaped[Array, 'd'], Shaped[Array, ''], float, int, Data]) – Data matrix, \(n \times d\)
Return type:: Shaped[Array, 'n']
Returns:: Approximation of the base kernel’s Gramian row-mean

class coreax.approximation.NystromApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)[source]#

Bases: RandomRegressionKernel

Approximate a base kernel via Nystrom approximation.

Only the base kernel’s Gramian row-mean is approximated here, all other methods are inherited directly from the base_kernel.

Parameters:

base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
random_key (ArrayLike) – Key for random number generation
num_kernel_points (int) – Number of kernel evaluation points
num_train_points (int) – Number of training points used to fit kernel regression

gramian_row_mean(x, **kwargs)[source]#

Approximate the Gramian row-mean by Nystrom approximation.

We consider a \(n \times d\) dataset, and wish to use an \(m \times d\) subset of this to approximate the base kernel’s Gramian row-mean. The m points are selected uniformly at random, and the Nystrom estimator, as defined in [chatalic2022nystrom] is computed using this subset.

Parameters:: x (Union[Shaped[Array, 'n d'], Shaped[Array, 'd'], Shaped[Array, ''], float, int, Data]) – Data matrix, \(n \times d\)
Return type:: Shaped[Array, 'n']
Returns:: Approximation of the base kernel’s Gramian row-mean