Data

Data structures for representing weighted and/or supervised data.

class coreax.data.Data(data, weights=None)

Bases: Module

Class for representing unsupervised data.

A dataset of size n consists of a set of pairs \(\{(x_i, w_i)\}_{i=1}^n\) where \(x_i\in\mathbb{R}^d\) are the features or inputs and \(w_i\) are weights.

Note

An n-vector passed as data is interpreted as n points in one dimension and converted to an (n, 1) array.

Parameters:
  • data (Union[Shaped[Array, 'n d *p'], Shaped[Array, 'n'], Shaped[Array, ''], Sequence[Union[Shaped[Array, '_n _d _*p'], Shaped[Array, '_n'], Shaped[Array, '']]]]) – An \(n \times d\) array defining the features of the unsupervised dataset

  • weights (Union[Shaped[Array, 'n'], Shaped[Array, ''], int, float, None]) – An \(n\)-vector of weights, where each element is paired with the corresponding index of the data array, forming the pair \((x_i, w_i)\). A scalar weight is broadcast to an \(n\)-vector. The default value of None sets the weights to the ones vector (equivalent to a scalar weight of one)
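The input conventions above can be illustrated with a small NumPy sketch. It is not coreax's implementation: `prepare_inputs` is a hypothetical helper that only mimics the documented behaviour for vector and matrix inputs (an n-vector becomes an (n, 1) array, a scalar weight is broadcast to an n-vector, and None yields a vector of ones).

```python
import numpy as np


def prepare_inputs(data, weights=None):
    """Hypothetical helper mirroring the documented conventions (not coreax code)."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 1:
        # An n-vector is read as n points in one dimension.
        data = data.reshape(-1, 1)
    n = data.shape[0]
    if weights is None:
        # Default: every point gets a weight of one.
        weights = np.ones(n)
    else:
        # A scalar is broadcast to length n; an n-vector is kept as is.
        weights = np.broadcast_to(np.asarray(weights, dtype=float), (n,))
    return data, weights


x, w = prepare_inputs([1.0, 2.0, 3.0], weights=0.5)
print(x.shape)     # (3, 1)
print(w.tolist())  # [0.5, 0.5, 0.5]
```

Scalar data inputs, which the type signature also permits, are deliberately left out of this sketch.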

data: Shaped[Array, 'n d']
weights: Union[Shaped[Array, 'n'], Shaped[Array, ''], int, float]
normalize(*, preserve_zeros=False)

Return a copy of self with weights that sum to one.

Parameters:

preserve_zeros (bool) – Whether to preserve zero-valued weights; when all weights are zero, the ‘normalized’ copy will sum to zero, not one.

Return type:

Self

Returns:

A copy of ‘self’ with normalized ‘weights’
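A minimal sketch of the normalization semantics, assuming each weight is divided by the sum of all weights. `normalize_weights` is a hypothetical stand-in written with NumPy, not the library's method; it only reproduces the behaviour described above, including the all-zero case under `preserve_zeros=True`.

```python
import numpy as np


def normalize_weights(weights, preserve_zeros=False):
    """Hypothetical stand-in for the weight normalization (not coreax code)."""
    weights = np.asarray(weights, dtype=float)
    if preserve_zeros and not weights.any():
        # Documented edge case: all-zero weights stay zero, so they sum to zero.
        return np.zeros_like(weights)
    # Assumption: otherwise each weight is divided by the total.
    return weights / weights.sum()


print(normalize_weights([1.0, 1.0, 2.0]).tolist())                  # [0.25, 0.25, 0.5]
print(normalize_weights([0.0, 0.0], preserve_zeros=True).tolist())  # [0.0, 0.0]
```

In this sketch, all-zero weights with `preserve_zeros=False` divide zero by zero and give NaN under NumPy; the source does not state what the library returns in that case.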

class coreax.data.SupervisedData(data, supervision, weights=None)

Bases: Data

Class for representing supervised data.

A supervised dataset of size n consists of a set of triples \(\{(x_i, y_i, w_i)\}_{i=1}^n\) where \(x_i\in\mathbb{R}^d\) are the features or inputs, \(y_i\in\mathbb{R}^p\) are the responses or outputs, and \(w_i\) are weights which correspond to the pairs \((x_i, y_i)\).

Note

n-vector inputs for data and supervision are interpreted as n points in one dimension and converted to (n, 1) arrays.

Parameters:
  • data (Union[Shaped[Array, 'n d *p'], Shaped[Array, 'n'], Shaped[Array, ''], Sequence[Union[Shaped[Array, '_n _d _*p'], Shaped[Array, '_n'], Shaped[Array, '']]]]) – An \(n \times d\) array defining the features of the supervised dataset paired with the corresponding index of the supervision

  • supervision (Union[Shaped[Array, 'n d *p'], Shaped[Array, 'n'], Shaped[Array, ''], Sequence[Union[Shaped[Array, '_n _d _*p'], Shaped[Array, '_n'], Shaped[Array, '']]]]) – An \(n \times p\) array defining the responses of the supervised dataset paired with the corresponding index of the data

  • weights (Union[Shaped[Array, 'n'], Shaped[Array, ''], int, float, None]) – An \(n\)-vector of weights, where each element is paired with the corresponding index of the data and supervision arrays, forming the triple \((x_i, y_i, w_i)\). A scalar weight is broadcast to an \(n\)-vector. The default value of None sets the weights to the ones vector (equivalent to a scalar weight of one)
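To make the triple structure concrete, the following NumPy sketch pairs each feature row with its response row and weight. It does not call coreax; the arrays and the `as_column` helper are illustrative assumptions that follow the conventions stated above (an n-vector of responses is treated as an (n, 1) array, so p = 1).

```python
import numpy as np


def as_column(a):
    """Hypothetical helper: read an n-vector as n points in one dimension."""
    a = np.asarray(a, dtype=float)
    return a.reshape(-1, 1) if a.ndim == 1 else a


x = as_column([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])  # features, shape (3, 2)
y = as_column([10.0, 20.0, 30.0])                    # responses, shape (3, 1)
w = np.ones(x.shape[0])                              # default weights of one

# All three must agree on n for the i-th entries to form (x_i, y_i, w_i).
assert x.shape[0] == y.shape[0] == w.shape[0]

triples = [(xi.tolist(), yi.tolist(), float(wi)) for xi, yi, wi in zip(x, y, w)]
print(triples[1])  # ([2.0, 3.0], [20.0], 1.0)
```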

supervision: Shaped[Array, 'n p']
coreax.data.as_data(x)

Cast x to a Data instance.

Return type:

Data

Parameters:

x (Shaped[Array, 'n d'] | Data) – An \(n \times d\) array, or an existing Data instance, to cast

coreax.data.as_supervised_data(xy)

Cast xy to a SupervisedData instance.

Return type:

SupervisedData

Parameters:

xy (tuple[Shaped[Array, 'n d'], Shaped[Array, 'n p']] | SupervisedData) – A (data, supervision) tuple of arrays, or an existing SupervisedData instance, to cast
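One plausible reading of "cast" for the two helpers above is sketched below using stand-in classes. It assumes an existing instance is returned unchanged and anything else is wrapped, which the source does not spell out. `ToyData`, `ToySupervisedData`, `as_toy_data` and `as_toy_supervised_data` are hypothetical names and are not part of coreax.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ToyData:
    """Stand-in for an unsupervised dataset (weights omitted for brevity)."""
    data: Any


@dataclass
class ToySupervisedData(ToyData):
    """Stand-in for a supervised dataset: adds responses to the features."""
    supervision: Any


def as_toy_data(x):
    # Assumption: an existing instance passes through untouched.
    return x if isinstance(x, ToyData) else ToyData(x)


def as_toy_supervised_data(xy):
    # Assumption: an existing instance passes through untouched.
    if isinstance(xy, ToySupervisedData):
        return xy
    # Otherwise unpack the (data, supervision) tuple and wrap it.
    data, supervision = xy
    return ToySupervisedData(data, supervision)


wrapped = as_toy_data([[1.0], [2.0]])
print(as_toy_data(wrapped) is wrapped)  # True
pair = as_toy_supervised_data(([[1.0]], [[0.0]]))
print(pair.supervision)  # [[0.0]]
```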