Pounce (Map Reduce)¶
Example coreset generation using a video of a pouncing cat.
This example showcases how a coreset can be generated from video data. In this context, a coreset is a set of frames that best capture the information in the original video.
Firstly, principal component analysis (PCA) is applied to the video data to reduce dimensionality. Then, a coreset is generated using Stein kernel herding, with a SquaredExponentialKernel base kernel. The score function (gradient of the log-density function) for the Stein kernel is estimated by applying kernel density estimation (KDE) to the data, and then taking gradients.
To reduce computational requirements, a map reduce approach is used, splitting the original dataset into distinct segments, with each segment handled on a different process.
The coreset attained from Stein kernel herding is compared to a coreset generated via uniform random sampling. Coreset quality is measured using maximum mean discrepancy (MMD).
- examples.pounce_map_reduce.main(in_path=PosixPath('../examples/data/pounce/pounce.gif'), out_path=None)[source]¶
Run the ‘pounce’ example for video sampling with Stein kernel herding.
Take a video of a pouncing cat, apply PCA and then generate a coreset using Stein kernel herding. Compare the result from this to a coreset generated via uniform random sampling. Coreset quality is measured using maximum mean discrepancy (MMD).
To reduce computational requirements, a map reduce approach is used, splitting the original dataset into distinct segments, with each segment handled on a different process.
- Parameters:
- Return type:
- Returns:
Coreset MMD, random sample MMD