Approximations
Classes and associated functionality to approximate kernels.
When a dataset is very large, methods that must evaluate all pairwise combinations
of the data, such as gramian_row_mean(), can
become prohibitively expensive. To reduce this computational cost, such methods can
instead be approximated, provided that a suitable approximation error can be achieved.
The ApproximateKernels in this module provide the functionality required to
override specific methods of a base_kernel with their approximate counterparts.
Because ApproximateKernels inherit from
ScalarValuedKernel, with all functionality provided through
composition with a base_kernel, they can be freely used in any place where a
standard ScalarValuedKernel is expected.
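The composition pattern described above can be illustrated with a small NumPy sketch. The class names `ToyKernel` and `ToyApproximateKernel` are hypothetical stand-ins, not coreax classes, and the override shown (averaging over the first `num_points` columns only) is a placeholder approximation chosen for brevity.

```python
import numpy as np


class ToyKernel:
    """Hypothetical squared-exponential kernel; stands in for a base kernel."""

    def __init__(self, length_scale=1.0):
        self.length_scale = length_scale

    def compute(self, x, y):
        # Pairwise kernel matrix between the rows of x (n x d) and y (m x d).
        sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq_dists / (2.0 * self.length_scale**2))

    def gramian_row_mean(self, x):
        # Exact row-mean: needs all n * n pairwise evaluations.
        return self.compute(x, x).mean(axis=1)


class ToyApproximateKernel:
    """Hypothetical wrapper: delegates to base_kernel, overrides one method."""

    def __init__(self, base_kernel, num_points):
        self.base_kernel = base_kernel
        self.num_points = num_points

    def __getattr__(self, name):
        # Called only when normal lookup fails, so every method that is not
        # overridden here falls through to the base kernel.
        return getattr(self.base_kernel, name)

    def gramian_row_mean(self, x):
        # Placeholder approximation: average over the first num_points
        # columns only, i.e. n * m evaluations instead of n * n.
        return self.base_kernel.compute(x, x[: self.num_points]).mean(axis=1)


rng = np.random.default_rng(0)
data = rng.normal(size=(400, 2))
exact_kernel = ToyKernel()
approx_kernel = ToyApproximateKernel(exact_kernel, num_points=100)

# compute() is delegated unchanged; gramian_row_mean() is approximated.
same_compute = np.allclose(
    approx_kernel.compute(data, data[:5]), exact_kernel.compute(data, data[:5])
)
row_mean_error = np.max(
    np.abs(approx_kernel.gramian_row_mean(data) - exact_kernel.gramian_row_mean(data))
)
```

Because the wrapper exposes the same methods as the kernel it wraps, code written against the base kernel can accept either object without modification.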
- class coreax.approximation.ApproximateKernel(base_kernel)
Bases: UniCompositeKernel
Base class for approximated kernels.
Provides approximations of the methods in the base_kernel.
The gramian_row_mean() method is particularly amenable to approximation, with significant performance improvements possible depending on the acceptable levels of error.
- Parameters:
base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
- compute_elementwise(x, y)
Evaluate the kernel on individual input vectors x and y, not vectorised.
Vectorisation only becomes relevant in terms of computational speed when we have multiple x or y.
- Parameters:
x – Vector \(\mathbf{x} \in \mathbb{R}^d\)
y – Vector \(\mathbf{y} \in \mathbb{R}^d\)
- Returns:
Kernel evaluated at (x, y)
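To illustrate the distinction, the sketch below evaluates a kernel one pair at a time and then over whole arrays via broadcasting. The squared-exponential form and the helper names `se_elementwise` and `se_pairwise` are assumptions made for this example rather than coreax code.

```python
import numpy as np


def se_elementwise(x, y, length_scale=1.0):
    # Kernel value for a single pair of vectors x, y in R^d.
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * length_scale**2))


def se_pairwise(xs, ys, length_scale=1.0):
    # Vectorised version: kernel matrix for arrays xs (n x d) and ys (m x d).
    sq_dists = np.sum((xs[:, None, :] - ys[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


rng = np.random.default_rng(0)
xs = rng.normal(size=(4, 3))
ys = rng.normal(size=(5, 3))

# Looping the element-wise function over all pairs gives the same matrix
# as the vectorised call; the loop is simply slower for large inputs.
looped = np.array([[se_elementwise(x, y) for y in ys] for x in xs])
vectorised = se_pairwise(xs, ys)
```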
- grad_x_elementwise(x, y)
Evaluate the element-wise gradient (Jacobian) of the kernel function w.r.t. x.
Only accepts single vectors x and y, i.e. not arrays. coreax.kernels.ScalarValuedKernel.grad_x() provides a vectorised version of this method for arrays.
- Parameters:
x – Vector \(\mathbf{x} \in \mathbb{R}^d\)
y – Vector \(\mathbf{y} \in \mathbb{R}^d\)
- Returns:
Jacobian \(\nabla_\mathbf{x} k(\mathbf{x}, \mathbf{y}) \in \mathbb{R}^d\)
- grad_y_elementwise(x, y)
Evaluate the element-wise gradient (Jacobian) of the kernel function w.r.t. y.
Only accepts single vectors x and y, i.e. not arrays. coreax.kernels.ScalarValuedKernel.grad_y() provides a vectorised version of this method for arrays.
- Parameters:
x – Vector \(\mathbf{x} \in \mathbb{R}^d\)
y – Vector \(\mathbf{y} \in \mathbb{R}^d\)
- Returns:
Jacobian \(\nabla_\mathbf{y} k(\mathbf{x}, \mathbf{y}) \in \mathbb{R}^d\)
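As a concrete instance of the two element-wise gradients, the sketch below writes them out for a squared-exponential kernel and checks them against central finite differences. The kernel choice and all function names here are assumptions for illustration; they are not taken from coreax.

```python
import numpy as np


def se_kernel(x, y, length_scale=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * length_scale**2))


def se_grad_x(x, y, length_scale=1.0):
    # Analytic gradient w.r.t. x for single vectors; the result lies in R^d.
    return -(x - y) / length_scale**2 * se_kernel(x, y, length_scale)


def se_grad_y(x, y, length_scale=1.0):
    # For this kernel the gradient w.r.t. y is the negative of that w.r.t. x.
    return (x - y) / length_scale**2 * se_kernel(x, y, length_scale)


def finite_difference(f, v, eps=1e-6):
    # Central differences, one coordinate at a time.
    grad = np.zeros_like(v)
    for i in range(v.size):
        step = np.zeros_like(v)
        step[i] = eps
        grad[i] = (f(v + step) - f(v - step)) / (2.0 * eps)
    return grad


x = np.array([0.3, -1.2, 0.5])
y = np.array([1.0, 0.4, -0.7])

fd_grad_x = finite_difference(lambda v: se_kernel(v, y), x)
fd_grad_y = finite_difference(lambda v: se_kernel(x, v), y)
```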
- divergence_x_grad_y_elementwise(x, y)
Evaluate the element-wise divergence w.r.t. x of Jacobian w.r.t. y.
\(\nabla_\mathbf{x} \cdot \nabla_\mathbf{y} k(\mathbf{x}, \mathbf{y})\). Only accepts vectors x and y. A vectorised version for arrays is computed in divergence_x_grad_y().
This is the trace of the ‘pseudo-Hessian’, i.e. the trace of the Jacobian matrix \(\nabla_\mathbf{x} \nabla_\mathbf{y} k(\mathbf{x}, \mathbf{y})\).
- Parameters:
x – First vector \(\mathbf{x} \in \mathbb{R}^d\)
y – Second vector \(\mathbf{y} \in \mathbb{R}^d\)
- Returns:
Trace of the Laplace-style operator; a real number
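The trace interpretation can be checked numerically. The sketch below assumes a squared-exponential kernel, for which the divergence has the closed form \(k(\mathbf{x}, \mathbf{y})\,(d/\ell^2 - \lVert\mathbf{x}-\mathbf{y}\rVert^2/\ell^4)\), and compares it with the trace of a finite-difference pseudo-Hessian. The function names are hypothetical and the example is independent of coreax.

```python
import numpy as np


def se_kernel(x, y, length_scale=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * length_scale**2))


def se_grad_y(x, y, length_scale=1.0):
    return (x - y) / length_scale**2 * se_kernel(x, y, length_scale)


def se_divergence_x_grad_y(x, y, length_scale=1.0):
    # Closed form of sum_i d^2 k / (dx_i dy_i) for this kernel.
    d = x.size
    sq_dist = np.sum((x - y) ** 2)
    k = se_kernel(x, y, length_scale)
    return k * (d / length_scale**2 - sq_dist / length_scale**4)


def pseudo_hessian_fd(x, y, eps=1e-5):
    # Row i approximates d/dx_i of grad_y k by central differences
    # (default length scale of 1.0 throughout).
    d = x.size
    hess = np.zeros((d, d))
    for i in range(d):
        step = np.zeros(d)
        step[i] = eps
        hess[i] = (se_grad_y(x + step, y) - se_grad_y(x - step, y)) / (2.0 * eps)
    return hess


x = np.array([0.3, -1.2, 0.5])
y = np.array([1.0, 0.4, -0.7])

divergence = se_divergence_x_grad_y(x, y)
trace_fd = np.trace(pseudo_hessian_fd(x, y))
```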
- class coreax.approximation.RandomRegressionKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)
Bases: ApproximateKernel
An approximate kernel that requires the attributes for random regression.
- Parameters:
base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
random_key (ArrayLike) – Key for random number generation
num_kernel_points (int) – Number of kernel evaluation points
num_train_points (int) – Number of training points used to fit kernel regression
- class coreax.approximation.MonteCarloApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)
Bases: RandomRegressionKernel
Approximate a base kernel via random subset selection.
Only the base kernel’s Gramian row-mean is approximated here; all other methods are inherited directly from the base_kernel.
- Parameters:
base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
random_key (ArrayLike) – Key for random number generation
num_kernel_points (int) – Number of kernel evaluation points
num_train_points (int) – Number of training points used to fit kernel regression
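As a rough illustration of random subset selection, the sketch below estimates the Gramian row-mean by averaging kernel values over `num_points` columns drawn uniformly at random without replacement. This is a plain subsampling estimator written for this example; it omits the regression step implied by num_train_points, so it should not be read as the library's implementation. The function names, the NumPy generator in place of a random key, and the squared-exponential kernel are all assumptions.

```python
import numpy as np


def se_pairwise(xs, ys, length_scale=1.0):
    # Kernel matrix between the rows of xs (n x d) and ys (m x d).
    sq_dists = np.sum((xs[:, None, :] - ys[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def row_mean_exact(data):
    # Full n x n Gramian, then the mean of each row.
    return se_pairwise(data, data).mean(axis=1)


def row_mean_subset(data, num_points, rng):
    # Average each row over a random subset of columns only.
    idx = rng.choice(len(data), size=num_points, replace=False)
    return se_pairwise(data, data[idx]).mean(axis=1)


rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))
exact = row_mean_exact(data)

# Worst-case absolute error over all rows for several subset sizes.
errors = {
    m: np.max(np.abs(row_mean_subset(data, m, rng) - exact))
    for m in (10, 100, 500)
}
```

When the subset contains every data point, the estimate coincides with the exact row-mean up to floating-point rounding; smaller subsets trade accuracy for fewer kernel evaluations.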
- class coreax.approximation.ANNchorApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)
Bases: RandomRegressionKernel
Approximate a base kernel via random kernel regression on ANNchor-selected points.
Only the base kernel’s Gramian row-mean is approximated here; all other methods are inherited directly from the base_kernel.
- Parameters:
base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
random_key (ArrayLike) – Key for random number generation
num_kernel_points (int) – Number of kernel evaluation points
num_train_points (int) – Number of training points used to fit kernel regression
- class coreax.approximation.NystromApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)
Bases: RandomRegressionKernel
Approximate a base kernel via Nystrom approximation.
Only the base kernel’s Gramian row-mean is approximated here; all other methods are inherited directly from the base_kernel.
- Parameters:
base_kernel (ScalarValuedKernel) – a ScalarValuedKernel whose attributes/methods are to be approximated
random_key (ArrayLike) – Key for random number generation
num_kernel_points (int) – Number of kernel evaluation points
num_train_points (int) – Number of training points used to fit kernel regression
- gramian_row_mean(x, **kwargs)
Approximate the Gramian row-mean by Nystrom approximation.
We consider an \(n \times d\) dataset, and wish to use an \(m \times d\) subset of this to approximate the base kernel’s Gramian row-mean. The m points are selected uniformly at random, and the Nystrom estimator, as defined in [chatalic2022nystrom], is computed using this subset.
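A minimal sketch of the idea follows, using the standard Nystrom low-rank surrogate \(K \approx K_{nm} K_{mm}^{+} K_{mn}\) applied to the vector \(\mathbf{1}/n\). The estimator in [chatalic2022nystrom] and the library's implementation may differ in detail (for example in how the inverse is regularised); the function names, the NumPy generator in place of a random key, and the squared-exponential kernel are assumptions made for this example.

```python
import numpy as np


def se_pairwise(xs, ys, length_scale=1.0):
    # Kernel matrix between the rows of xs (n x d) and ys (m x d).
    sq_dists = np.sum((xs[:, None, :] - ys[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def row_mean_nystrom(data, num_points, rng):
    n = len(data)
    # Select the m points uniformly at random, without replacement.
    idx = rng.choice(n, size=num_points, replace=False)
    k_nm = se_pairwise(data, data[idx])  # n x m cross-kernel block
    k_mm = k_nm[idx]                     # m x m block on the subset
    # (1/n) K_mn 1 is the vector of column means of K_nm.
    column_means = k_nm.mean(axis=0)
    # Pseudo-inverse with a cut-off guards against near-singular K_mm.
    weights = np.linalg.pinv(k_mm, rcond=1e-10) @ column_means
    return k_nm @ weights


rng = np.random.default_rng(0)
data = rng.normal(size=(500, 2))

exact = se_pairwise(data, data).mean(axis=1)
approx = row_mean_nystrom(data, num_points=100, rng=rng)
max_error = np.max(np.abs(approx - exact))
```

Only the \(n \times m\) and \(m \times m\) blocks are ever formed inside `row_mean_nystrom`, so the cost scales with \(nm\) kernel evaluations rather than \(n^2\); the exact row-mean is computed here purely for comparison.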