A beginner’s attempt . . .
There are many ways to make static graphs in Python, such as with Matplotlib, Pandas, and Seaborn, to name a few. And I think it is safe to say that being able to make a static graph is one crucial tool for a beginner data scientist to have in their toolbox.
After feeling pretty confident in my static graph-making, I stumbled across a graph that takes the art of graph-making to a whole other level: the animated graph. I can admit that I am a sucker for animated visuals, and I find animated graphs both mesmerizing and impactful. There is something about seeing the plot ‘grow’ on its own and do all the visual work for the reader. I also find that it really emphasizes the point that the creator is making. As a new student of data science, I was curious how to create such a graph.
Is it pure magic or is it fairly simple? (In hindsight, I can positively say that it requires a bit of both.) I took to the internet for guidance and opened up my Jupyter notebook to give it a try. Below you will find my process, results and takeaways.
To start, I used a data set that I found on Kaggle, originally published on GitHub by CNN. It is a data set on school shootings in the US over the last 10 years. For this exercise, I just wanted to look at the year and the number of incidents per year.
I first read the CSV data using Pandas and cleaned up the data to create a data frame consisting of two columns, year and total # of shootings from that year.
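As a rough sketch of that cleanup step: the snippet below reads a tiny in-memory stand-in for the CSV and counts rows per year. The column names `date` and `school_name` and all of the rows are made up for illustration, since the real file's layout is not shown here. It also assumes the year ends up as the data frame's index, which is how the plotting code later uses it.

```python
import io

import pandas as pd

# Stand-in for the Kaggle CSV. Column names and rows are hypothetical.
csv_text = """date,school_name
2010-03-02,School A
2010-09-15,School B
2011-01-05,School C
2012-04-20,School D
2012-10-11,School E
2012-12-14,School F
"""

raw = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Count incidents per calendar year and keep the year as the index.
df = (
    raw.groupby(raw["date"].dt.year)
       .size()
       .to_frame("total shootings")
)
df.index.name = "year"
print(df)
```

With a real file, `io.StringIO(csv_text)` would be replaced by the path to the downloaded CSV.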
Next, I imported the necessary libraries to create my graph and animation:
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import seaborn as sns
I used Matplotlib and Seaborn for the plotting. The animation tools that I used center on the matplotlib.animation.Animation class, which provides a framework around which the animation functionality is built. One of the main tools within this framework is FuncAnimation, which is what I used to create my animation. I also ran %matplotlib notebook so that I could see the animation work immediately in my notebook.
After coding in my figure and subplot details as usual for a static graph, I defined my animate function:
# select data range
def animate(i):
    data = df.iloc[:int(i+1)]
    graph = sns.barplot(y=data['total shootings'], x=data.index, data=data, color="blue")
If you think about an animation as a series of frames, the most important piece of the animation code is the animate function, which defines what to draw in each frame. In the function above, i is the frame number, and it serves as the index that selects the range of data visible in that frame. The next line uses a Seaborn barplot to plot that selection. This is one frame.
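To make the frame idea concrete, here is a small sketch that prints which rows each frame would select. The data frame is made up (four years of invented counts, with the year as the index); only the `df.iloc[:i + 1]` slicing mirrors the animate function.

```python
import pandas as pd

# Made-up counts, indexed by year, standing in for the real data frame.
df = pd.DataFrame(
    {"total shootings": [2, 1, 3, 4]},
    index=pd.Index([2010, 2011, 2012, 2013], name="year"),
)

# Frame i selects rows 0..i, so each frame adds one more bar.
for i in range(len(df)):
    data = df.iloc[: i + 1]
    print(f"frame {i}: years {list(data.index)}")
```

Frame 0 holds only the first year, and the last frame holds every year, which is why the bars appear to 'grow' across the plot.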
Finally, to run the animation, I called FuncAnimation from matplotlib.animation. I passed it the figure to update with each frame, my previously defined animate function, the number of frames that the animation should contain (how many times animate(i) is called), and an interval, which sets the delay between frames in milliseconds and so controls the speed.
ani = animation.FuncAnimation(fig, animate, frames=len(df.index), interval=700, repeat=True)
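Putting the pieces together, the following is a minimal self-contained sketch rather than the exact notebook code. It makes a few assumptions: the data is made up, plain Matplotlib `ax.bar` is swapped in for the Seaborn barplot, the axes are cleared and given fixed limits on every frame, and the `Agg` backend is selected so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; in a notebook, drop this line
import matplotlib.pyplot as plt
import matplotlib.animation as animation
import pandas as pd

# Made-up counts, indexed by year, standing in for the real data frame.
df = pd.DataFrame(
    {"total shootings": [2, 1, 3, 4]},
    index=pd.Index([2010, 2011, 2012, 2013], name="year"),
)

fig, ax = plt.subplots()


def animate(i):
    """Draw frame i: one bar for each of the first i + 1 years."""
    data = df.iloc[: i + 1]
    ax.clear()  # start each frame from blank axes
    ax.bar(data.index, data["total shootings"], color="blue")
    # Fixed limits keep the axes from rescaling between frames.
    ax.set_xlim(df.index.min() - 0.5, df.index.max() + 0.5)
    ax.set_ylim(0, df["total shootings"].max() * 1.1)
    ax.set_xticks(list(df.index))
    ax.set_xlabel("year")
    ax.set_ylabel("total shootings")


# interval is the delay between frames in milliseconds.
ani = animation.FuncAnimation(
    fig, animate, frames=len(df), interval=700, repeat=True
)
```

Keeping a reference to the animation object (here `ani`) matters, because Matplotlib stops the animation if the object is garbage collected. Raising `interval` above 700 would slow the playback down.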
And here I had my animated graph (recorded in QuickTime):
Takeaways and further learning to do
While I succeeded in creating a basic animation, there are numerous areas to improve and delve into next. For one, I need to figure out a way to save the MP4 from my notebook. I ended up recording it through QuickTime, but it would have been much easier to write a line of code in my notebook to save the MP4.
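One possible way to do this is the animation's own `save` method. The sketch below is a hedged example, not something tested against the original notebook: the data, the output file names, and the temporary output folder are all made up. It writes a GIF through Pillow, then writes an MP4 only if the external ffmpeg program is available, since Matplotlib relies on ffmpeg for MP4 output.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend; in a notebook, drop this line
import matplotlib.pyplot as plt
import matplotlib.animation as animation

# Made-up values standing in for the real yearly counts.
years = [2010, 2011, 2012, 2013]
counts = [2, 1, 3, 4]

fig, ax = plt.subplots()


def animate(i):
    """Draw frame i: one bar for each of the first i + 1 years."""
    ax.clear()
    ax.bar(years[: i + 1], counts[: i + 1], color="blue")
    ax.set_xlim(years[0] - 0.5, years[-1] + 0.5)
    ax.set_ylim(0, max(counts) + 1)


ani = animation.FuncAnimation(fig, animate, frames=len(years), interval=700)

out_dir = tempfile.mkdtemp()  # hypothetical output location

# GIF output goes through Pillow.
gif_path = os.path.join(out_dir, "animation.gif")
ani.save(gif_path, writer=animation.PillowWriter(fps=1))

# MP4 output needs the external ffmpeg program, so check for it first.
if animation.writers.is_available("ffmpeg"):
    mp4_path = os.path.join(out_dir, "animation.mp4")
    ani.save(mp4_path, writer="ffmpeg", fps=1)
```

The `fps` value controls the playback speed of the saved file, so a low value such as 1 gives one new bar per second.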
My graph also needs a couple of aesthetic improvements to be on par with the animations out there that I admire. I need to work on slowing down the animation and look into making it run ‘smoother’ overall.