ML Systems

# Bayesian Online Changepoint Detection | July 27, 2021

By John Hallman - July 27, 2021

Materials

Why BOCD?

• Changepoint detection is a problem setting that focuses on detecting distributional changes in time-series data, such as trend changes or mean and volatility increases.
• BOCD is a changepoint detection algorithm that focuses on the online setting, where we try to model trend changes as data comes in, rather than post hoc when the entire time series can be observed.
• By combining Bayesian probability and exponential family distributions with convenient conjugate priors, BOCD is able to model a diverse set of time series distributions while being far more efficient than a lot of alternative time series models that use computationally expensive offline training.

Nuggets

• The algorithm takes as input:
• (1) a Hazard function which determines how frequent changepoints are.
• (2) an exponential family and default prior parameters.
• The key idea behind BOCD is the following - at time $t$, we track the probability that a changepoint occurred $k$ time points earlier for all values of $k$ from $0$ to $t$, along with a set of exponential family prior parameters for each such value $k$, conditional on a changepoint having occurred at time $t - k$.
• If we are given the time of the last changepoint, then it is easy to compute the posterior parameters since we are using conjugate exponential family distributions. Therefore, the key difficulty is modeling the probability of the most recent changepoint from time 0 to $t$.
• Conveniently, this can be done in an online manner! If we have the distribution up to time $t-1$ and observe the next datapoint $x_t$, the distribution of the most recent changepoint at time $t$ is a simple function of the hazard, the current prior parameters, and the distribution at time $t-1$.
• Finally, when using BOCD in practice, we may be interested in clearly defined timestamps where the changepoints occur, for example if we want to alert when there is a change. The most straightforward approach, which was also used in the paper, is to take the maximum a posteriori probability of the changepoint distribution.

Notes

If you like applying these kinds of methods to practical ML problems, join our team.