Data#

Data-structures for representing weighted and/or supervised data.

class coreax.data.Data(data, weights=None)[source]#

Class for representing unsupervised data.

A dataset of size \(n\) consists of a set of pairs \(\{(x_i, w_i)\}_{i=1}^n\) where \(x_i\) are the features or inputs and \(w_i\) are weights.

Parameters:
  • data (Union[Shaped[Array, 'n *d'], Shaped[ndarray, 'n *d']]) – An \(n \times d\) array defining the features of the unsupervised dataset

  • weights (Union[Shaped[Array, 'n'], Shaped[ndarray, 'n'], None]) – An \(n\)-vector of weights, where each element is paired with the corresponding index of the data array, forming the pair \((x_i, w_i)\). If passed a scalar weight, it is broadcast to an \(n\)-vector. The default value of None sets the weights to the ones vector (equivalent to a scalar weight of one).
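
The weight rules described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not coreax's implementation: `resolve_weights` is a hypothetical helper, lists stand in for arrays, and the length check is an assumption about how a mismatched weight vector might be handled.

```python
def resolve_weights(n, weights=None):
    """Return a list of n weights following the documented rules.

    Hypothetical helper for illustration only; it is not part of coreax.
    """
    if weights is None:
        # Default: the ones vector, equivalent to a scalar weight of one.
        return [1.0] * n
    if isinstance(weights, (int, float)):
        # A scalar weight is broadcast to an n-vector.
        return [float(weights)] * n
    weights = [float(w) for w in weights]
    if len(weights) != n:
        # Assumption: a weight vector of the wrong length is rejected.
        raise ValueError(f"expected {n} weights, got {len(weights)}")
    return weights


features = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]  # n = 3 points, d = 2
# Pair each feature row x_i with its weight w_i.
pairs = list(zip(features, resolve_weights(len(features), 0.5)))
```

Each element of `pairs` is one \((x_i, w_i)\) pair, here with every weight equal to 0.5.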

data: Shaped[Array, 'n *d']#
weights: Shaped[Array, 'n']#
normalize(*, preserve_zeros=False)[source]#

Return a copy of ‘self’ with ‘weights’ that sum to one.

Parameters:

preserve_zeros (bool) – Whether to preserve zero-valued weights; when all weights are zero-valued, the ‘normalized’ copy will sum to zero, not one.

Return type:

Self

Returns:

A copy of ‘self’ with normalized ‘weights’
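
The normalization described above can be sketched on a plain list of weights. This is an illustration under stated assumptions rather than the library's code: `normalize_weights` is a hypothetical function, weights are assumed non-negative, and returning NaN for all-zero weights when `preserve_zeros` is false is an assumption, since the text above only specifies the `preserve_zeros=True` case.

```python
import math


def normalize_weights(weights, preserve_zeros=False):
    """Return a new list of weights scaled to sum to one.

    Hypothetical stand-in for the documented behaviour; not coreax code.
    """
    total = sum(weights)
    if total == 0:
        # Assumes non-negative weights, so a zero total means all are zero.
        # With preserve_zeros the zeros are kept, so the result sums to zero.
        # Otherwise 0 / 0 is undefined; NaN is this sketch's assumption.
        fill = 0.0 if preserve_zeros else math.nan
        return [fill for _ in weights]
    # The input list is left untouched; a normalized copy is returned.
    return [w / total for w in weights]


print(normalize_weights([1.0, 1.0, 2.0]))                  # [0.25, 0.25, 0.5]
print(normalize_weights([0.0, 0.0], preserve_zeros=True))  # [0.0, 0.0]
```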

class coreax.data.SupervisedData(data, supervision, weights=None)[source]#

Class for representing supervised data.

A supervised dataset of size \(n\) consists of a set of triples \(\{(x_i, y_i, w_i)\}_{i=1}^n\) where \(x_i\) are the features or inputs, \(y_i\) are the responses or outputs, and \(w_i\) are weights which correspond to the pairs \((x_i, y_i)\).

Parameters:
  • data (Shaped[Array, 'n d']) – An \(n \times d\) array defining the features of the supervised dataset paired with the corresponding index of the supervision

  • supervision (ArrayLike) – An \(n \times p\) array defining the responses of the supervised dataset paired with the corresponding index of the data

  • weights (Optional[Shaped[Array, 'n']]) – An \(n\)-vector of weights, where each element is paired with the corresponding index of the data and supervision arrays, forming the triple \((x_i, y_i, w_i)\). If passed a scalar weight, it is broadcast to an \(n\)-vector. The default value of None sets the weights to the ones vector (equivalent to a scalar weight of one).
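
The index-wise pairing of features, responses and weights can be sketched as follows. This is a plain-Python illustration, not the library's implementation: `Triple` and `make_triples` are hypothetical names, lists stand in for arrays, and the length check is an assumption.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One (x_i, y_i, w_i) element; hypothetical type, not part of coreax."""

    x: list
    y: list
    w: float


def make_triples(data, supervision, weights=None):
    """Pair row i of data with row i of supervision and weight i."""
    n = len(data)
    if len(supervision) != n:
        # Assumption: features and responses must have the same length n.
        raise ValueError("data and supervision must have the same length")
    if weights is None:
        weights = [1.0] * n  # default ones vector
    elif isinstance(weights, (int, float)):
        weights = [float(weights)] * n  # scalar broadcast to an n-vector
    return [Triple(x, y, float(w)) for x, y, w in zip(data, supervision, weights)]


features = [[0.0, 1.0], [2.0, 3.0]]  # n = 2, d = 2
responses = [[10.0], [20.0]]         # n = 2, p = 1
triples = make_triples(features, responses)
```

With no weights given, each triple in this sketch carries the default weight of one.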

supervision: Shaped[Array, 'n *p']#
coreax.data.as_data(x)[source]#

Cast ‘x’ to a data instance.

Return type:

Data

Parameters:

x (Any) – The object to cast to a data instance

coreax.data.is_data(x)[source]#

Return a boolean indicating whether ‘x’ is an instance of ‘coreax.data.Data’.

Return type:

bool

Parameters:

x (Any) – The object to test
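
How a cast helper and an instance check of this kind typically fit together can be sketched with a stand-in class. Everything here is hypothetical: `SketchData`, `is_sketch_data` and `as_sketch_data` are illustrative names, and returning an existing instance unchanged is an assumption about the cast, not something the text above states.

```python
class SketchData:
    """Hypothetical stand-in for a weighted dataset; not the library class."""

    def __init__(self, data, weights=None):
        self.data = data
        # Default weights: the ones vector, one weight per data row.
        self.weights = [1.0] * len(data) if weights is None else weights


def is_sketch_data(x):
    """Return True if x is a SketchData instance."""
    return isinstance(x, SketchData)


def as_sketch_data(x):
    """Cast x to a SketchData instance.

    Assumption: an existing instance is returned unchanged, and anything
    else is wrapped with default weights.
    """
    return x if is_sketch_data(x) else SketchData(x)


raw = [[0.0], [1.0]]
wrapped = as_sketch_data(raw)
```

In this sketch, casting an object that is already an instance returns that same object, so calling the cast repeatedly does not nest wrappers.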