Making sense of Covid-19 Graphs
By: Kristin Hunter-Thomson
We all are trying to find our new normal as our lives and schools have been turned upside down. And we all are working to make sense of what Covid-19 is and how it may be progressing across the world and each country. Therefore it is not surprising that there are lots of data visualizations and graphs in the news and online that people can access these days about Covid-19.
But what kinds of claims can we or can we not make from these graphs? How can we help our students make sense of these graphs? Let’s explore some graphs (bar chart, histogram, line graph, scatterplot) and discuss them from a teaching perspective.
COVID-19 Surveillance Dashboard - 1.1.4 by the University of Virginia Biocomplexity Institute. Accessed at https://nssac.bii.virginia.edu/covid-19/dashboard/ on April 8, 2020 at 21:25 CET.
The “COVID-19 Surveillance Dashboard 1.1.4” data dashboard created and updated by the Biocomplexity Institute at University of Virginia has a lot of information packed into it. Let’s just focus on the graph on the left-hand side of the screen.
What kind of graph is it? - This is a Stacked Bar Chart.
What is easier or harder about making sense of this graph?
- The different categories (in this case confirmed cases, deaths, and recovered cases of COVID-19) are placed on top of one another for each day. This helps us get an overall sense of how many known cases of COVID-19 there were each day (i.e., the cumulative total across these three categories) and that the cumulative total has increased over time. But it is hard to get a sense of what the rate of change is or when changes are occurring.
- We are able to get a relative sense, for each day, of how many known cases there were (whether the person was sick (red), died (blue), or recovered (green)). But it is extremely challenging to see how these relative proportions across the categories are changing over time.
- We are able to compare the number (height) of the confirmed cases from day to day, as all of the red bars are anchored to the x-axis and therefore have a common point to compare from. But it is extremely challenging to compare the number of deaths or recovered cases from day to day.
- We are able to make comparisons across the past four weeks (far right of the chart). But it is hard to compare with or make sense of any data before that (center and left side of the chart) due to the formatting of the y-axis and the number of cases in the past four weeks.
What can we and what can we not take away from these data?
- We can take away… the cumulative number of known COVID-19 cases for each day (whether the person was sick, died, or recovered), a sense of how the number of confirmed cases has varied day to day over the past four weeks, and a relative sense for each day of the number of confirmed cases vs deaths vs recovered cases.
- We cannot take away… how the proportion of active vs deaths vs recovered cases is changing over time, nor how the number of deaths or recoveries is changing over time, nor how the current proportions of active vs deaths vs recovered cases compare to those earlier in the epidemic.
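If students have access to the counts behind a stacked bar chart, one way to recover what the chart hides is to compute each category's share of the daily total. The sketch below is only an illustration: the numbers are invented (they are not the dashboard's data), and the names `counts` and `daily_shares` are hypothetical.

```python
# Hypothetical daily counts, invented purely for illustration
# (these are NOT real COVID-19 figures from the dashboard).
days = ["day 1", "day 2", "day 3"]
counts = {
    "confirmed": [80, 150, 240],
    "deaths": [5, 12, 24],
    "recovered": [15, 38, 136],
}


def daily_shares(counts):
    """Return, for each day, each category's share of that day's stacked total."""
    n_days = len(next(iter(counts.values())))
    shares = []
    for i in range(n_days):
        total = sum(series[i] for series in counts.values())
        shares.append({name: series[i] / total for name, series in counts.items()})
    return shares


shares = daily_shares(counts)
for day, share in zip(days, shares):
    # Print the proportions that are hard to read off stacked bars.
    print(day, {name: round(value, 2) for name, value in share.items()})
```

Printing the shares side by side makes a changing proportion visible even when the stacked bars themselves mostly show the growing total.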
COVID-19 cases in the United States by date of illness onset, January 12, 2020, to April 7, 2020, at 4pm ET (n=156,753) by the US Centers for Disease Control and Prevention (CDC). Accessed at https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html on April 9, 2020 at 10:10 CET.
The “COVID-19 cases in the United States by date of illness onset, January 12, 2020, to April 7, 2020, at 4pm ET (n=156,753)” graph was created and is updated by the U.S. Centers for Disease Control and Prevention. It has important information packed into it, along with a great annotation and an extra table below explaining the limitations of the data from the past two weeks.
What kind of graph is it? - This is a Histogram (number of observations of cases across daily bins).
What is easier or harder about making sense of this graph?
- We are able to see how many confirmed cases there were each day by when a person “became sick” (meaning they had symptoms). But, because it can take anywhere from 5-14 days for infected persons to become sick, we cannot make comparisons across the past couple of weeks of data.
- We are able to compare the number of cases from day to day to get a sense of the distribution of case frequency over time. But we are only able to make informal and relative comparisons of the rate of change from one day to the next.
What can we and what can we not take away from these data?
- We can take away… how many people were showing symptoms of COVID-19 each day from the start of the spread up until two weeks ago (March 28th in this instance) and how the number per day changed over that time.
- We cannot take away… how many people are actually infected by COVID-19 each day, nor how the number of people sick from COVID-19 has changed over the past two weeks, nor how the rate of infections has changed over time.
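The idea of counting cases into daily bins, and of treating the most recent bins with caution, can be sketched in a few lines. Everything here is an assumption for illustration: the onset dates are invented, and `lag_days = 10` is a hypothetical reporting lag chosen only so that the cutoff lands on March 28th, as in the example above.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical onset dates, invented for illustration (NOT real CDC data).
onset_dates = [
    date(2020, 3, 20), date(2020, 3, 20), date(2020, 3, 25),
    date(2020, 3, 28), date(2020, 3, 30), date(2020, 3, 30),
    date(2020, 4, 2),
]

report_date = date(2020, 4, 7)
lag_days = 10  # assumed reporting lag, not an official figure

# One bin per day: how many cases have this onset date?
cases_per_day = Counter(onset_dates)

# Bins after the cutoff may still grow as late reports arrive.
cutoff = report_date - timedelta(days=lag_days)
for day in sorted(cases_per_day):
    flag = " (may be incomplete)" if day > cutoff else ""
    print(f"{day.isoformat()}: {cases_per_day[day]}{flag}")
```

Flagging the recent bins in the printout mirrors the annotation on the graph: the counts are real observations, but the last stretch of days is not yet safe to compare with earlier ones.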
Confirmed cases of Covid-19 for selected countries by The Guardian (data source: Johns Hopkins CSSE. Note: the CSSE states that its numbers rely upon publicly available data from multiple sources, which do not always agree). Accessed at https://www.theguardian.com/world/2020/apr/08/coronavirus-world-map-which-countries-have-most-cases-deaths-covid-19 on April 9, 2020 at 10:25 CET.
The “Confirmed cases of Covid-19 for selected countries” graph was created and is updated by The Guardian. This graph includes a lot of information and they have constructed it with colors and annotations to help you make sense of key parts.
What kind of graph is it? - This is a Line Chart/Graph (days since 100th case is an ordinal variable).
What is easier or harder about making sense of this graph?
- We are able to see how many confirmed cases a country has on a given day and/or how that number has changed over time (days) since that country had its 100th case. We are not sure when in terms of the calendar year people within that country had a positive test result.
- We are able to visually compare the number of confirmed cases across different countries at different numbers of days since the 100th case (e.g., on day 10 vs day 20). We are not sure how the number of cases within each country compares to how many people are in the country.
- We are able to get a general sense of the rate of change (increase, decrease, or no change) of the number of confirmed cases over time (days) for a country since that country had its 100th case by looking at the slope of a line. We cannot calculate the actual rate of change.
- We can get a general sense of the rate of change (increase, decrease, or no change) of the number of confirmed cases over time (days) across countries since each country had its 100th case by looking at the slopes of the lines. We can look at the slope at one point in time (e.g., day 15) or over the full time scope (i.e., 0-70+ days). But we cannot know why those changes in slope happen in some places but not others.
What can we and what can we not take away from these data?
- We can take away… how the number of confirmed cases has changed over time since the 100th case for these 9 countries, how the rate of change in confirmed cases has varied over time for each of them, and how the countries compare with one another, both in the number of confirmed cases on a particular day after the 100th case and in how quickly that number was changing at given points in time.
- We cannot take away… how many people are actually infected by COVID-19 each day, nor when these people were sick in the calendar year, nor why the rate of change is or is not changing at different times for these 9 countries, nor what is happening in the other 175 countries with COVID-19 cases as of April 8, 2020.
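With the underlying counts in hand (rather than just the picture), the "days since 100th case" alignment and the day-to-day change that the slope hints at can both be computed directly. This is a minimal sketch under stated assumptions: the two cumulative series are invented, and `since_100th` and `daily_change` are hypothetical helper names, not anything The Guardian or Johns Hopkins CSSE provides.

```python
def since_100th(cumulative, threshold=100):
    """Drop the days before the cumulative count first reaches the threshold."""
    for i, value in enumerate(cumulative):
        if value >= threshold:
            return cumulative[i:]
    return []


def daily_change(series):
    """Difference between consecutive days: the slope of the line, day by day."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


# Hypothetical cumulative confirmed cases, invented for illustration.
country_a = [20, 60, 110, 200, 380, 700]
country_b = [5, 40, 90, 130, 170, 215, 260]

aligned_a = since_100th(country_a)
aligned_b = since_100th(country_b)

# Index 0 is now "day 0 since the 100th case" for both countries,
# regardless of the calendar date on which each reached it.
print("A:", aligned_a, "change per day:", daily_change(aligned_a))
print("B:", aligned_b, "change per day:", daily_change(aligned_b))
```

Notice that the alignment step throws away the calendar dates, which is exactly why the graph cannot tell us when in the year people tested positive.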
Confirmed COVID-19 deaths: Total deaths vs daily deaths, Apr 8, 2020 by Our World in Data. Accessed at https://ourworldindata.org/coronavirus-data on April 9, 2020 at 10:50 CET.
The “Confirmed COVID-19 deaths: Total deaths vs daily deaths, Apr 8, 2020” graph was created and is updated by Our World in Data. This interactive graph packs a lot of information into one graph, but again uses color and annotations to help us. Note that both the x- and y-axes are plotted on a log scale.
What kind of graph is it? - This is a Scatterplot / Scatter Chart.
What is easier or harder about making sense of this graph?
- We are able to see how as the number of confirmed deaths per day (y-axis) increases the total confirmed deaths from COVID-19 (x-axis) also increases. So we can say there is a positive relationship, or correlation, between these two variables. But we cannot say if it is a causal relationship.
- We are able to see that the relationship between the number of confirmed deaths per day and the total confirmed deaths from COVID-19 is close to 1:1, but not exactly. We do not know why it is lower than a 1:1 relationship, though we can hypothesize.
- We are able to see that there is more variability in the relationship for countries that have had fewer than 10 confirmed deaths per day and fewer than 100 total confirmed deaths from COVID-19. We are not sure why there is more variability there but not above these thresholds.
- We are able to see how the number of confirmed deaths per day and the total confirmed deaths from COVID-19 vary among the countries included in the graph. We are also able to see by the color which continent the countries are from. We cannot know why these values differ between the countries.
What can we and what can we not take away from these data?
- We can take away… that there is a positive relationship between the number of confirmed deaths per day and the total confirmed deaths from COVID-19 overall, but that there is more variation in the relationship in nations with fewer deaths.
- We cannot take away… why this relationship exists, nor why the numbers of deaths in the different countries are similar to or different from one another, nor what this means for how COVID-19 is impacting these different countries.
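For students who are ready to put a number on a "positive relationship," here is a rough sketch of computing a correlation on the log scale, matching the log axes of the scatterplot. The data points are invented for illustration (they are not Our World in Data's values), and `pearson` is a hypothetical helper written out by hand. A high correlation here still says nothing about cause.

```python
import math

# Hypothetical (total deaths, deaths per day) pairs, one per imaginary country.
# Invented for illustration; NOT real Our World in Data values.
points = [(10, 2), (100, 15), (1000, 120), (10000, 900)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Both axes of the graph are logarithmic, so take log10 of both variables.
log_total = [math.log10(total) for total, _ in points]
log_daily = [math.log10(daily) for _, daily in points]

r = pearson(log_total, log_daily)
print(f"correlation on the log scale: {r:.3f}")
```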
So where does that leave us with our using these, and many other, graphs to help our students make sense of COVID-19?
We need to help our students be cognizant of what they are actually looking at, both in terms of what kind of graph is being used (and its corresponding strengths and weaknesses in representing the data) AND in terms of what kind of claims we can and cannot make from the data in the graph. It can be helpful to have your students think about what variables are and are not included in the graph and remember that this limits what story the data is telling.
There is no single graph that will tell us everything (this is part of why there are just so many out there right now), and unfortunately there is also no graph that will predict the future for us. But we can help our students make sense of the data and strengthen their data literacy skills as we help them engage with the graphs and data.
I hope you all are able to stay safe, healthy, and sane.
Check out the blog post “Making sense of Covid-19 Maps” if you are interested in a similar discussion about different maps being shared currently online.