Dynamic data are all around us. Changepoint models tell us when changes happen in these data and what they look like. Probabilistic modelling lets us build customizable changepoint models for different data types and provides uncertainty estimates for the position and magnitude of each change, both of which are indispensable for decision-making and hypothesis testing. This tutorial will briefly cover building changepoint models for multivariate data using PyMC, but will primarily focus on the ways in which this “basic” model can be extended.
This tutorial is aimed at academic researchers, data scientists, and anyone who wants to build bespoke models that provide uncertainty estimates for inferred statistics. It will try to remain accessible to beginners, but leans towards intermediate users interested in changepoint modelling. Previous experience with PyMC and a background in statistical modelling are assumed. No libraries other than PyMC and the basic scientific stack (NumPy, SciPy, Matplotlib) will be used.
The tutorial aims to be hands-on. It will discuss enough theory to provide context for the models, and will concentrate on understanding the code that makes up the “guts” of the models: in particular, selecting distributions for the emissions and changepoint locations, and the tensor manipulation needed to put everything together.
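
As a flavour of the tensor manipulation involved, here is a minimal NumPy sketch of one common way to turn changepoint locations into a piecewise-constant mean for several series at once. It is an illustration under stated assumptions, not necessarily the construction used in the tutorial: the function name `state_weights`, the `sharpness` parameter, and the example values of `taus` and `levels` are all hypothetical. In a PyMC model the same arithmetic would be applied to random variables rather than fixed arrays, with an emission distribution placed on top of the resulting mean.

```python
import numpy as np

def state_weights(taus, n_timepoints, sharpness=10.0):
    """Soft indicator of which state is active at each time point.

    taus: sorted changepoint locations, shape (n_states - 1,)
    Returns an array of shape (n_states, n_timepoints) whose columns sum to 1.
    """
    t = np.arange(n_timepoints)
    taus = np.asarray(taus, dtype=float)
    # step[k, t] is ~0 before taus[k] and ~1 after it.
    # The tanh form of the logistic function avoids overflow in exp().
    step = 0.5 * (1.0 + np.tanh(0.5 * sharpness * (t[None, :] - taus[:, None])))
    ones = np.ones((1, n_timepoints))
    zeros = np.zeros((1, n_timepoints))
    entered = np.concatenate([ones, step], axis=0)  # has state k started yet?
    exited = np.concatenate([step, zeros], axis=0)  # has state k ended yet?
    return entered - exited

# Hypothetical example: 2 series, 3 states, changepoints near t = 30 and t = 70.
taus = [29.5, 69.5]
levels = np.array([[1.0, 5.0, 2.0],    # per-state mean of series 0
                   [0.0, -1.0, 3.0]])  # per-state mean of series 1
weights = state_weights(taus, n_timepoints=100)
mean = levels @ weights  # shape (n_series, n_timepoints)
```

The smooth step keeps the mean differentiable with respect to the changepoint locations, which is what allows gradient-based samplers to be used when the locations are modelled as continuous variables; a hard comparison such as `t > tau` would give the same shapes but no useful gradient.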