Pounce (Map Reduce)

Example coreset generation using a video of a pouncing cat.

This example showcases how a coreset can be generated from video data. In this context, a coreset is a set of frames that best capture the information in the original video.

First, principal component analysis (PCA) is applied to the video data to reduce its dimensionality. A coreset is then generated using Stein kernel herding, with a SquaredExponentialKernel as the base kernel. The score function (the gradient of the log-density function) required by the Stein kernel is estimated by applying kernel density estimation (KDE) to the data and then taking gradients.
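The two preprocessing steps above can be sketched with plain NumPy. This is an illustrative sketch only, not the code used by this example: the helper names `pca_reduce` and `kde_score`, the Gaussian KDE with a fixed `bandwidth`, and the random stand-in frames are all assumptions. For a Gaussian KDE with bandwidth h, the score at x is a softmax-weighted average of (x_i - x) / h^2 over the data points x_i.

```python
import numpy as np


def pca_reduce(frames: np.ndarray, num_components: int) -> np.ndarray:
    """Project flattened frames onto their leading principal components."""
    flat = frames.reshape(frames.shape[0], -1).astype(float)
    centred = flat - flat.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:num_components].T


def kde_score(x: np.ndarray, data: np.ndarray, bandwidth: float) -> np.ndarray:
    """Gradient of the log of a Gaussian KDE fitted to `data`, at each row of `x`."""
    diffs = data[None, :, :] - x[:, None, :]  # shape (m, n, d), entries x_i - x
    log_w = -np.sum(diffs**2, axis=-1) / (2.0 * bandwidth**2)
    log_w -= log_w.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(log_w)
    weights /= weights.sum(axis=1, keepdims=True)
    return np.einsum("mn,mnd->md", weights, diffs) / bandwidth**2


# Hypothetical stand-in for video data: (num_frames, height, width, channels).
rng = np.random.default_rng(0)
frames = rng.random((20, 8, 8, 3))
reduced = pca_reduce(frames, num_components=5)
scores = kde_score(reduced, reduced, bandwidth=1.0)
print(reduced.shape, scores.shape)
```

Each row of `reduced` represents one frame in the low-dimensional space, and the matching row of `scores` is the estimated score at that frame.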

To reduce computational requirements, a map reduce approach is used: the original dataset is split into distinct segments, and each segment is handled on a different process.
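The map reduce idea can be sketched as follows. Several simplifications here are assumptions made for illustration: plain kernel herding with a squared exponential kernel stands in for Stein kernel herding, the data are split into contiguous index blocks, and the segments are processed sequentially rather than on separate processes. The function names `herd` and `map_reduce_coreset` are hypothetical.

```python
import numpy as np


def sq_exp_kernel(x, y, length_scale=1.0):
    """Squared exponential kernel matrix between the rows of x and y."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def herd(data, size):
    """Greedy kernel herding; returns indices of the selected rows of `data`."""
    gram = sq_exp_kernel(data, data)
    mean_embedding = gram.mean(axis=1)
    penalty = np.zeros(len(data))
    available = np.ones(len(data), dtype=bool)
    selected = []
    for step in range(size):
        # Favour points close to the data on average but far from earlier picks.
        objective = np.where(
            available, mean_embedding - penalty / (step + 1), -np.inf
        )
        choice = int(np.argmax(objective))
        selected.append(choice)
        available[choice] = False  # select without replacement
        penalty += gram[:, choice]
    return np.array(selected)


def map_reduce_coreset(data, coreset_size, num_segments):
    """Return indices into `data` chosen by a two-stage map reduce scheme."""
    segments = np.array_split(np.arange(len(data)), num_segments)
    # Map: build a small coreset on each segment independently. This loop is
    # the part that could be distributed across separate processes.
    partial = [seg[herd(data[seg], min(coreset_size, len(seg)))] for seg in segments]
    candidates = np.concatenate(partial)
    # Reduce: herd once more over the pooled per-segment coresets.
    return candidates[herd(data[candidates], coreset_size)]


rng = np.random.default_rng(0)
data = rng.normal(size=(200, 2))  # stand-in for PCA-reduced frames
coreset_indices = map_reduce_coreset(data, coreset_size=10, num_segments=4)
print(sorted(coreset_indices.tolist()))
```

The saving comes from the kernel matrices: each map step only builds a Gram matrix for its own segment, and the reduce step only builds one for the pooled candidates, so the full 200 by 200 matrix is never formed.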

The coreset obtained from Stein kernel herding is compared with a coreset generated by uniform random sampling. Coreset quality is measured using maximum mean discrepancy (MMD), where a lower value indicates a coreset that better represents the original data.
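As a rough illustration of the quality measure, the sketch below computes a biased (V-statistic) MMD estimate between a dataset and a candidate subset. The squared exponential kernel, its unit length scale, the synthetic data and the helper names are assumptions for illustration. This is not the library's own MMD implementation.

```python
import numpy as np


def sq_exp_kernel(x, y, length_scale=1.0):
    """Squared exponential kernel matrix between the rows of x and y."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * length_scale**2))


def mmd(x, y, length_scale=1.0):
    """Biased (V-statistic) estimate of the MMD between samples x and y."""
    k_xx = sq_exp_kernel(x, x, length_scale).mean()
    k_yy = sq_exp_kernel(y, y, length_scale).mean()
    k_xy = sq_exp_kernel(x, y, length_scale).mean()
    # Clip at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(k_xx + k_yy - 2.0 * k_xy, 0.0)))


rng = np.random.default_rng(0)
data = rng.normal(size=(200, 2))  # stand-in for PCA-reduced frames
random_sample = data[rng.choice(len(data), size=20, replace=False)]
shifted_sample = random_sample + 5.0  # deliberately unrepresentative points

print("random sample MMD: ", mmd(data, random_sample))
print("shifted sample MMD:", mmd(data, shifted_sample))
```

A sample drawn from the data should give a much smaller MMD than the shifted points, which sit far from the bulk of the data.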

examples.pounce_map_reduce.main(in_path=PosixPath('../examples/data/pounce/pounce.gif'), out_path=None)

Run the ‘pounce’ example for video sampling with Stein kernel herding.

Take a video of a pouncing cat, apply PCA and then generate a coreset using Stein kernel herding. Compare the result from this to a coreset generated via uniform random sampling. Coreset quality is measured using maximum mean discrepancy (MMD).

To reduce computational requirements, a map reduce approach is used: the original dataset is split into distinct segments, and each segment is handled on a different process.

Parameters:
  • in_path (Path) – Path to the input video (the default points to a GIF file); assumed relative to this module file unless an absolute path is given

  • out_path (Optional[Path]) – Path to save output to, if not None; assumed relative to this module file unless an absolute path is given

Return type:

tuple[float, float]

Returns:

Coreset MMD, random sample MMD