Critical Analysis in Data Visualisation
A worked example of how to ensure you’re not spreading incorrect data
As part of this week’s Makeover Monday I spent some time working through the dataset in order to critically evaluate its quality and whether I was happy publishing the information I was visualising.
It would be easy to present this example as a simple step-by-step approach, but the truth is that isn’t the way things worked. The actual process involved several conversations with peers; it was iterative and collaborative. So what I present is the process as it happened, laid out as best I can. I certainly don’t intend to gild the lily or present myself as having all the answers.
Initial Analysis: The Cavalier
Makeover Monday — a weekly project run by Eva Murray and Andy Kriebel (a duo whose untiring work keeps the data visualisation community served with a new dataset every single week) — this week provided a rather simple looking dataset.
It was accompanied by a visualisation and its original data, which is where I started my analysis…
As I started my first approach I was aware that the measurements used were squared, and I was wondering how to deal with this in the visualisation.
To me, squared metres in the data I downloaded, and in its title, meant something very specific. To illustrate the difference with an example: if I were carpet shopping and asked for a carpet to suit a room 2 metres squared, I’d expect a carpet of 4 square metres in return.
At around the same time I received a message from Anna Foard (@stats_ninja) asking me my opinion on best practice for visualising a squared metric. We discussed the pros and cons and I suggested perhaps a square might be better than bars, as a quick mock up I suggested:
As I built out my ideas, I was concerned that the orders of magnitude of areas might be lost, so my final visualisation shows the number of g of protein from pulses that could be grown on land for other food types (click caption for interactive version). Note the numbers here reflect the fact I squared the areas to provide comparison:
Makeover Monday is positioned as an hour-long task, a way to quickly visualise data and work through it, building your visualisation skills along the way. I tackled the problem over a lunch hour and moved on.
Iterative Analysis: The Collaborator
Later, as I looked through the approaches of my fellow Monday-ers (or is that Makeoverists?) I noticed a common theme; one example from @watsonstevenc is shown below.
My reaction was to politely point out that 0.010395 squared meters does not go into one squared meter 96 times but 9254 times, and that perhaps people should consider the squared numbers when making comparisons.
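The gap between the two readings is easy to reproduce. The snippet below is only a minimal sketch of the arithmetic, using the 0.010395 figure quoted above; the variable names are my own and nothing here comes from the original dataset beyond that one number.

```python
# Value quoted for pulses: 0.010395 "sq m" per gram of protein.
quoted_value = 0.010395

# Reading 1: the value is already an area in square metres (m²),
# so the grams of protein per 1 m² is a simple reciprocal.
grams_if_square_metres = 1 / quoted_value

# Reading 2: the value is a length that still has to be squared
# ("metres squared"), so the area per gram is quoted_value ** 2.
grams_if_metres_squared = 1 / quoted_value ** 2

print(round(grams_if_square_metres))   # 96
print(round(grams_if_metres_squared))  # 9254
```

Same number, two readings, and answers that differ by a factor of about 96.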
It was at this point, having commented on a few pieces, that I started to reconsider. Not only did I wonder if my high school maths was correct, but I was also worried about whether I’d interpreted the data correctly.
Cue several conversations, some in public, some in private. Jeff Shafer, Ann Jackson, Neil Richards, Mark Bradbourne, Eva Murray and Matt Francis, and others I’ve missed, all chipped in with opinions, questions, and critique of my thinking.
It seems most people had used the original data which stated sq m.
I’d approached the data from the original source, ourworldindata.org showing squared metres (m²) per gram in the title and axis, and in the data header.
Some have debated about the difference between meters squared and square meters (and also the meaning of sq m presented in this context), and which one we use clearly affects the results and conclusions.
Through our conversations, Eva, who sourced the data, told us that the data was indeed area (i.e. square metres) but my curiosity and critical thinking antenna were piqued, and so I was no longer willing to accept that at face value. Up to this point I’d not been very critical of what I was doing, I’d accepted data as it was presented (or how I thought it was presented) and I’d paid the price; at best, of looking foolish to my peers, and at worst, of spreading incorrect information and risking my reputation as a data visualisation practitioner.
If you’re faced with the same situation my advice is to delete your visualisation immediately and admit your mistake, rebuild it and move on. Don’t leave it out on social media for people to consume. People use social media charts to drive “water cooler” chats and you are helping to spread misinformation and #fakenews by leaving it.
I’m not the only one, many similar mistakes happen each week. Statements like this, derived from incorrect assumptions, are dangerous and we all need to work together to evaluate our output and remove incorrect visualisations as soon as mistakes are spotted.
This data set does not allow you to draw conclusions about total land use for different food purposes across the world. The data given is about the effectiveness of land use, showing how much protein can be generated from 1 m² of land depending on what that land is used for. It’s not possible to say how much land is used, or what percentage of the world’s land is used, for crop farming, or cattle raising, or anything else. Knowing the amount of protein we can get from land doesn’t tell us how much land is actually used.
A correct, but meaningless, statement might be to say 88% of land use (measured in m² per g of protein) when land is shared equally between all food types.
Critical Analysis: The Investigator
I asked several questions to help me get my head around the data as a result of my initial failure to be critical.
1. Does the data result in sensible numbers?
This is the first question worth asking when faced with a new dataset: what numbers result from the data, and do they make logical sense?
Let’s take that piece about how many grams of pulses can be grown in 1 square metre.
If we assume square meters then the answer is 96g of protein. As pulses are a quarter protein (very roughly, from an internet search), this leaves us with about 400g of pulses / sq m.
If we assume meters squared then the answer becomes ~9000g of protein (i.e. 96²), or roughly 36kg of pulses!
I’m no bean grower, but my initial assumption of meters squared is certainly looking more doubtful; I’m sure farmers would kill for that yield. In an article in Farmers Weekly I found that the yield for a certain bean in 2016 was around 4.71 tonnes / hectare, i.e. 471 g / sq m.
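The sanity check above can be written down as a short Python sketch. It assumes, as I did, that pulses are very roughly a quarter protein; the function names are hypothetical and the only inputs are the 96g figure and the 4.71 tonnes / hectare yield mentioned above.

```python
# Assumption from the text: pulses are (very roughly) a quarter protein.
PROTEIN_FRACTION = 0.25


def pulses_g_per_sq_m(protein_g_per_sq_m, protein_fraction=PROTEIN_FRACTION):
    """Grams of pulses that contain the given grams of protein."""
    return protein_g_per_sq_m / protein_fraction


def tonnes_per_hectare_to_g_per_sq_m(tonnes_per_hectare):
    """Convert a yield using 1 tonne = 1,000,000 g and 1 hectare = 10,000 m²."""
    return tonnes_per_hectare * 1_000_000 / 10_000


square_metres_reading = pulses_g_per_sq_m(96)        # 384 g of pulses per m²
metres_squared_reading = pulses_g_per_sq_m(96 ** 2)  # 36,864 g of pulses per m²
reported_yield = tonnes_per_hectare_to_g_per_sq_m(4.71)  # about 471 g per m²

print(square_metres_reading, metres_squared_reading, reported_yield)
```

Only the square metres reading lands in the same ballpark as the reported yield; the meters squared reading is out by almost two orders of magnitude.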
2. What is the original source of the data?
Andy and Eva are fantastic at sourcing each dataset and visualisation, and in most cases that means we can investigate the provenance.
In this case the chart for the makeover was sourced from Our World in Data (OWID), which in turn took its data from a paper by Clark & Tilman (2017), “Comparative analysis of environmental impacts of agricultural production systems, agricultural input efficiency, and food choice”, Environmental Research Letters, Volume 12, Number 6.
This paper was a meta-analysis of several different studies, but provides the data as an Excel sheet. The units in here are listed as m² / g protein.
As this is a meta-analysis the values from several studies have been grouped and averaged (using a mean) to gain a single value used in the OWID chart.
3. How was the data sourced and treated? Is the methodology transparent and reliable? Can I trust it?
Going further into each study in the meta-analysis we can start to examine the methods used to source data.
For example, looking at Wiedemann, 2015 we see an analysis of land occupation for three different types of feed for cattle, converted to m² / kg of retail cut meat. In Clark & Tilman this seems to have been converted to m² / g protein at 13% protein per retail cut.
In Pelletier, 2010 this is presented as a similar comparison in m² / kg, but with live-weight beef, which in Clark & Tilman seems to have been converted at 5.67% protein.
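To see how much the choice of protein percentage matters, here is a minimal sketch of the kind of conversion that seems to have been applied. The formula is my assumption about how m² / kg becomes m² / g protein, not something taken from Clark & Tilman, and the 26.0 m² / kg input is an invented figure used purely for illustration.

```python
def m2_per_g_protein(m2_per_kg_product, protein_fraction):
    """Convert land use per kg of product into land use per gram of protein.

    Assumed conversion: 1 kg of product holds 1000 * protein_fraction grams
    of protein, so the area is spread over that many grams.
    """
    grams_protein_per_kg = 1000 * protein_fraction
    return m2_per_kg_product / grams_protein_per_kg


# Hypothetical land-use figure, NOT from Wiedemann or Pelletier.
example_m2_per_kg = 26.0

retail_cut = m2_per_g_protein(example_m2_per_kg, 0.13)     # 13% protein
live_weight = m2_per_g_protein(example_m2_per_kg, 0.0567)  # 5.67% protein

print(retail_cut, live_weight)
```

With the same m² / kg input, the lower protein percentage more than doubles the m² / g protein figure, so the percentage chosen for each study feeds straight into the final value.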
Understanding how Clark & Tilman treated each of the studies, and the potential for error, helps us understand what trust we can put in the data. This is clearly a judgement call, but here are some questions we might ask ourselves:
How crude are the measures of m² / g protein?
What potential error margin should we attribute to them?
Is this error being communicated to the end user?
4. Do the aggregation methods used represent the data well?
For the data in the OWID visualisation, the Clark & Tilman study has been aggregated as an average (mean) within each food type to give a single value.
If we look at the original Clark & Tilman data as separate data points from each study in the meta-analysis we see a slightly different story (the red values represent the mean presented in OWID).
and Ruminant Meat (Beef and Mutton) on a different scale…
As can be seen, particularly for Ruminant Meats, Fresh Produce and Pork, the red mean value is hugely skewed by outliers.
This is stark when comparing the mean (blue below) to the median (orange):
Questions we need to ask ourselves:
Does a mean truly represent the data?
Is the median a better measure due to this skew?
Should we present the margin of error / range in the individual values in the study?
Are the above issues important to our story / audience? Do they risk the legitimacy of the data we’re presenting (e.g. if we presented the mean)?
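The effect of an outlier on the mean, compared with the median, can be shown with a minimal Python sketch. The values below are hypothetical and chosen only to mimic one extreme study among several similar ones; they are not taken from the Clark & Tilman data.

```python
from statistics import mean, median

# Hypothetical m² per g protein values for one food type (NOT real data):
# four studies that broadly agree and one extreme outlier.
values = [0.8, 1.0, 1.1, 1.2, 9.5]

mean_value = mean(values)      # pulled upwards by the outlier, roughly 2.72
median_value = median(values)  # the middle value, 1.1

print(mean_value, median_value)
print(max(values) - min(values))  # range, one simple way to show the spread
```

Here the mean is more than double the median, even though four of the five values sit close to 1. Whichever measure we pick, showing the range alongside it tells the reader how much the studies disagree.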
5. Why are we using this measure?
The study provides several measures of land use: m² per g, m² per kcal and m² per serving, as well as the one chosen in the chart, m² per g protein. Why focus on protein?
Questions we might ask ourselves:
Is it a natural measure our audience will understand?
Are there better alternatives?
Does the message change when charting alternative measures?
In this case we might choose m² per kcal as a more meaningful measure, but doing so doesn’t alter the message substantially.
Getting an idea of the above questions helps us understand the issues with the data sources and how we should present them — there’s no right answer though. Each decision will be different, but doing so from a position of awareness rather than ignorance is key.
Conclusion
Data visualisation projects that take third-party data are a key part of our community, and rightly so, but we need to be aware that the training opportunities they offer, and the skills we learn from them, depend on our approach.
Approaches like my initial “cavalier” approach help individuals learn tools and techniques. The risk with this approach is that it promotes habits, such as ignoring data quality and critical thinking, that are detrimental to our role as data analysts.
The collaborative approach is key to understanding. It is rare in real-life analytics that we don’t have colleagues and peers to work with, and the same should be true of our personal data projects. We need to use our peers to assess our projects before they are published, and be more critical of visualisations we see published.
Critical data analysis is a key part of the role. Learning the skills to understand a data source and its provenance, as well as developing a natural distrust of data, is a key lesson — and one that can be learned through the likes of Makeover Monday, as much as design and tool techniques.
Personally, this week’s Makeover Monday was quite a journey for me. I learnt key lessons and I hope I’ve been able to show how valuable digging deeper into data can be.
Many thanks to Ann Jackson, Gwilym Lockwood, Jeff Shafer, Anna Foard, Rob Radburn and others who helped with feedback on this article and during the analysis. Continued thanks to Eva and Andy for their untiring efforts with Makeover Monday which provides such a rich seam of data for these conversations.