Approximations#
Classes and associated functionality to approximate kernels.
When a dataset is very large, methods which have to evaluate all pairwise combinations
of the data, such as gramian_row_mean(), can become
prohibitively expensive. To reduce this computational cost, such methods can instead be
approximated (providing suitable approximation error can be achieved).
The ApproximateKernels in this module provide the functionality required to
override specific methods of a base_kernel with their approximate counterparts.
Because ApproximateKernels inherit from Kernel, with
all functionality provided through composition with a base_kernel, they can be
freely used in any place where a standard Kernel is expected.
- coreax.approximation._random_indices(key, num_data_points, num_select, mode='kernel')[source]#
Select a random subset of indices.
- Parameters:
- Returns:
A randomly selected subset of indices, of size
num_samples, for a dataset withnum_data_pointsindexable entries.
- coreax.approximation._random_least_squares(key, data, features, num_indices, target_map=<function <lambda>>)[source]#
Solve the least-square problem on a random subset of the system.
A linear system \(Ax = b\), solved via least-squares as \(x = A^+ b\), can be approximated by random least-square as x approx hat{x} = hat{A}^+ hat{b}, where \(\hat{A} = A_i\ \text{and}\ \hat{b} = b_i\, \forall i \in I]\). I is a random subset of indices for the original system of equations.
- Parameters:
key (
ArrayLike) – RNG key for seeding the random selectiondata (
Array) – The data \(z\); yields \(b\) when pushed through the target mapfeatures (
Array) – The feature matrix \(A\)num_indices (
int) – The size of the random subset of indices \(I\)target_map (
Callable[[Array],Array]) – The target map \(\phi\) which defines \(b := \phi(z)\), where \(z\) is the inputdata
- Return type:
- Returns:
The push-forward of the approximate solution \(A\hat{x}\)
- class coreax.approximation.ApproximateKernel(base_kernel)[source]#
Base class for approximated kernels.
Provides approximations of the methods in the
base_kernel.The
gramian_row_mean()method is particularly amenable to approximation, with significant performance improvements possible depending on the acceptable levels of error.- compute_elementwise(x, y)[source]#
Evaluate the kernel on individual input vectors
xandy, not-vectorised.Vectorisation only becomes relevant in terms of computational speed when we have multiple
xory.
- grad_x_elementwise(x, y)[source]#
Evaluate the element-wise gradient of the kernel function w.r.t.
x.The gradient (Jacobian) of the kernel function w.r.t.
x.Only accepts single vectors
xandy, i.e. not arrays.coreax.kernel.Kernel.grad_x()provides a vectorised version of this method for arrays.
- grad_y_elementwise(x, y)[source]#
Evaluate the element-wise gradient of the kernel function w.r.t.
y.The gradient (Jacobian) of the kernel function w.r.t.
y.Only accepts single vectors
xandy, i.e. not arrays.coreax.kernel.Kernel.grad_y()provides a vectorised version of this method for arrays.
- divergence_x_grad_y_elementwise(x, y)[source]#
Evaluate the element-wise divergence w.r.t.
xof Jacobian w.r.t.y.\(\nabla_\mathbf{x} \cdot \nabla_\mathbf{y} k(\mathbf{x}, \mathbf{y})\). Only accepts vectors
xandy. A vectorised version for arrays is computed indivergence_x_grad_y().This is the trace of the ‘pseudo-Hessian’, i.e. the trace of the Jacobian matrix \(\nabla_\mathbf{x} \nabla_\mathbf{y} k(\mathbf{x}, \mathbf{y})\).
- class coreax.approximation.RandomRegressionKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)[source]#
An approximate kernel that requires the attributes for random regression.
- class coreax.approximation.MonteCarloApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)[source]#
Approximate a base kernel via random subset selection.
Only the Gramian row-mean is approximated here, all other methods are inherited directly from the
base_kernel.- Parameters:
- class coreax.approximation.ANNchorApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)[source]#
Approximate a base kernel via random kernel regression on ANNchor selected points.
Only the base kernel’s Gramian row-mean is approximated here, all other methods are inherited directly from the
base_kernel.- Parameters:
- class coreax.approximation.NystromApproximateKernel(base_kernel, random_key, num_kernel_points=10000, num_train_points=10000)[source]#
Approximate a base kernel via Nystrom approximation.
Only the base kernel’s Gramian row-mean is approximated here, all other methods are inherited directly from the
base_kernel.- Parameters:
- gramian_row_mean(x, **kwargs)[source]#
Approximate the Gramian row-mean by Nystrom approximation.
We consider a \(n \times d\) dataset, and wish to use an \(m \times d\) subset of this to approximate the base kernel’s Gramian row-mean. The
mpoints are selected uniformly at random, and the Nystrom estimator, as defined in [chatalic2022nystrom] is computed using this subset.