Animated histograms in Python

A quick guide to animating your histograms

Valentine Chisango
3 min read · May 4, 2022

It is interesting to see how a distribution takes shape and becomes more recognisable as the number of simulations increases. One of the best ways to visualise this is through an animated plot. Having already covered animated time series plots in R in a prior article, I will now walk through how to create animated histograms in Python, such as the one below.

Graph by author

The data used for this demonstration is based on a simulation I made of the popular word game Wordle. In the game, you have 6 attempts to guess the word of the day correctly. In the simulation, the first 300 actual Wordle games are played using HEART as the initial guess. In order to evaluate the effectiveness of HEART as an initial guess, a histogram is plotted showing the frequency distribution of the number of guesses taken to win the game.

A few packages are needed to produce the histogram:

  • pandas & numpy: used to manipulate the data and work with CSV files
  • matplotlib.pyplot: used to plot the histogram
  • matplotlib.animation: used to animate the histogram
  • ImageMagickWriter (from matplotlib.animation): used to save the histogram as a gif
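
As a rough sketch, the imports for these packages could look as follows. The aliases (np, pd, plt, animation) are conventional choices and not something prescribed here.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.animation as animation
from matplotlib.animation import ImageMagickWriter  # needs ImageMagick installed to actually write gifs
```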

First, we need to read in the simulation data from HEART.csv and calculate the number of attempts required to win each of the 300 games as the average of the simulations of that game. The number of games, 300, will also be used as the number of frames for the animation.
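
A minimal sketch of this step is shown below. The layout of HEART.csv is not described here, so the sketch assumes one row per game and one column per simulation run; the column names and the tiny inline dataset are hypothetical stand-ins for the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for HEART.csv (assumed layout: one row per game,
# one column per simulation run, each cell holding the guesses needed to win)
csv_text = """game,sim_1,sim_2,sim_3
1,4,5,3
2,3,3,4
3,6,5,4
"""

# With the real file this would be pd.read_csv("HEART.csv", ...)
sims = pd.read_csv(io.StringIO(csv_text), index_col="game")

# Average number of attempts per game across its simulation runs
attempts = sims.mean(axis=1).to_numpy()

# The number of games doubles as the number of animation frames
n_games = len(attempts)
```

With the real data, n_games would be 300.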

Next, we need to create the update_hist function that will draw out the histogram by adding the data for each game one by one. The if statement is used to end the animation once all 300 games have been included. The histogram has the typical labels and structure one would expect, plus an annotation to indicate the number of games included in the histogram. We make use of a loop to add data labels to each rectangle in the histogram with a frequency greater than 0. Including these annotations (game counter and data labels) in the update_hist function ensures that they are updated as the animation progresses.
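
The function described above might be sketched as follows. The data here is randomly generated placeholder data, and the bin edges, label text, and annotation positions are illustrative assumptions. The guard at the top simply skips drawing; in a full script one could instead stop the timer with the animation's event_source.stop().

```python
import numpy as np
import matplotlib.pyplot as plt

n_games = 300
# Placeholder data (assumption): random guess counts between 2 and 6
attempts = np.random.default_rng(0).integers(2, 7, size=n_games)

fig, ax = plt.subplots()
bins = np.arange(0.5, 7.5, 1.0)  # one bin centred on each of 1..6 guesses

def update_hist(frame):
    """Redraw the histogram using only the first `frame` games."""
    if frame > n_games:
        return  # nothing left to add once every game is included

    ax.clear()
    _, _, patches = ax.hist(attempts[:frame], bins=bins, edgecolor="black")
    ax.set_title("Guesses needed with HEART as the first guess")
    ax.set_xlabel("Number of guesses")
    ax.set_ylabel("Frequency")

    # Game counter, redrawn on every frame
    ax.annotate(f"Games: {frame}", xy=(0.98, 0.95),
                xycoords="axes fraction", ha="right", va="top")

    # Data label above each bar with a frequency greater than 0
    for rect in patches:
        height = rect.get_height()
        if height > 0:
            ax.annotate(f"{int(height)}",
                        xy=(rect.get_x() + rect.get_width() / 2, height),
                        ha="center", va="bottom")
```

Because the axes are cleared and redrawn on each call, the counter and data labels always reflect the games included so far.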

We now create a figure object for drawing the plot and render the animation using the FuncAnimation function. As parameters for the rendering, we supply: a figure object; a function to create the frames; the interval between each frame in milliseconds (default 200); the blit parameter to determine whether or not to optimise drawing (default False); and the number of frames to cache, which is 300 + 1 since the initial frame is drawn before data for any games have been added to the histogram.
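
A hedged sketch of the rendering call is given below, again with placeholder data and a simplified update function. It passes an integer frames argument, in which case matplotlib caches that same number of frames; the figure size, the interval of 100 ms, and repeat=False are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

n_games = 300
attempts = np.random.default_rng(0).integers(2, 7, size=n_games)  # placeholder data
bins = np.arange(0.5, 7.5, 1.0)

fig, ax = plt.subplots(figsize=(8, 5))

def update_hist(frame):
    # Simplified version of the update function: first `frame` games only
    ax.clear()
    ax.hist(attempts[:frame], bins=bins, edgecolor="black")
    ax.set_title(f"Games included: {frame}")

anim = FuncAnimation(
    fig,                 # figure to draw on
    update_hist,         # called once per frame with the frame number
    frames=n_games + 1,  # frames 0..300; frame 0 is the empty histogram
    interval=100,        # milliseconds between frames
    blit=False,          # redraw everything on each frame
    repeat=False,        # do not restart when shown interactively
)
```

The animation object must stay referenced (here as anim) for as long as it is being displayed or saved; otherwise it may be garbage collected.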

Finally, we save our histogram as both a gif and a png, and display the animation. Importantly, for our gif we can specify that the animation should loop only once, so that we can analyse the histogram once the animation is done. By default, the gif loops indefinitely without pausing between loops, which is seldom ideal.
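
The saving step could look roughly like the sketch below. It uses only 20 placeholder games so that it runs quickly, writes to a temporary folder, and uses hypothetical file names. The helper make_gif_writer is an illustrative name, not part of matplotlib: it passes "-loop 1" to ImageMagick when that program is installed, and otherwise falls back to PillowWriter, whose gifs loop indefinitely.

```python
import os
import tempfile

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, ImageMagickWriter, PillowWriter

n_games = 20  # kept small so the sketch saves quickly; the article uses 300
attempts = np.random.default_rng(0).integers(2, 7, size=n_games)  # placeholder data
bins = np.arange(0.5, 7.5, 1.0)

fig, ax = plt.subplots()

def update_hist(frame):
    ax.clear()
    ax.hist(attempts[:frame], bins=bins, edgecolor="black")
    ax.set_title(f"Games included: {frame}")

anim = FuncAnimation(fig, update_hist, frames=n_games + 1,
                     interval=100, repeat=False)

def make_gif_writer(fps=10):
    """Hypothetical helper: prefer a play-once ImageMagick writer."""
    if ImageMagickWriter.isAvailable():
        # "-loop 1" asks ImageMagick to play the gif a single time
        return ImageMagickWriter(fps=fps, extra_args=["-loop", "1"])
    # Fallback (assumption): Pillow ships with matplotlib, but loops forever
    return PillowWriter(fps=fps)

out_dir = tempfile.mkdtemp()
gif_path = os.path.join(out_dir, "HEART.gif")  # hypothetical file names
png_path = os.path.join(out_dir, "HEART.png")

anim.save(gif_path, writer=make_gif_writer())  # animated version
fig.savefig(png_path)  # static image of the last frame drawn
```

In an interactive session, calling plt.show() afterwards displays the animation.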

Graph by author

The code and data required to reproduce these plots are available in the repository here.

Valentine Chisango

Banking & Risk Management Actuary with a passion for data science