A series about some pitfalls of data visualisation
This article is also available on Medium.
During my internships and various projects, I’ve gradually learnt that working with data, and especially doing data visualisation, can be tricky. Knowing how to show your data so that others understand it and get your point is necessary, but it’s complicated!
Some people think graphs aren’t that important and that what really matters is the facts and figures you’re talking about. I believe the growing trend of adding data visualisation courses to engineering curricula is proof of the opposite…
The importance of data visualisation
Nowadays, we are flooded with data: it’s everywhere, all the time and often filtered by so many intermediaries it’s difficult to go back to the original content. Every time you search for something, there are so many links, websites and other sources to sort through!
That’s why it’s worth actually observing the data you play with and diving into it deeply enough to be sure you get its message. Of course, extracting the core idea of a dataset is a daunting task, but it’s essential if you wish to use it in the right way. And the way you represent it is, I think, a good indicator of whether or not you understood it correctly.
(Because, yes, what is NOT hard is misrepresenting and, therefore, misinterpreting data…!)
After reading this really nice article on 16 data visualisation examples, I took a peek at the e-book featured at the end and discovered a very nice, well-crafted guide to data visualisation that taught me a lot. What is great about this short report (only 50 pages, with graphics and plots) is that it goes back to the fundamentals and lists the different types of data, the different relationships you can establish between your elements and the common representations. So you really put the basic building blocks together and re-learn the vocabulary.
Moreover, it has a lot of examples: to me, that’s the best way to truly get how visual representation can bring out the soul of your data or crush it completely.
A little puzzle game for you 🙂
Therefore, I’ve decided I would try and give my take on the topic through a basic puzzle game for the reader: the Good, the Bad and the Ugly of Data Visualisation (or GBUDV for short, but that’s weird to say).
For this little series, I designed some fictitious datasets and represented them in several ways to show how fitting or misleading the visualisations can be. For each dataset, I made:
- the Good visualisation, which shows the information contained in the data as accurately as possible
- the Bad visualisation, which kind of gives you the data but is not easily interpretable (you actually have to think about it to get it!)
- the Ugly visualisation, which intentionally plays around with the implicit rules of data visualisation and with our biases to deceive the reader
And since a picture is worth a thousand words:
The series will have 4 episodes after this introduction and I’ll post one every Monday for the weeks to come. Each post will be structured in the same way:
- first, I will roughly synthesise what message each visualisation gets across – or at least the most common reactions I’ve had when showing them around to friends and family.
- then, there will be some paragraphs where I explain how I designed the three visualisations and why I think some work and others don’t. That’s where the real puzzle is: before checking out the explanations, take a minute to think about it for yourself. Why is it OK? What’s strange in this one? Do I even understand what I’m looking at?
The goals of GBUDV
What I want to show is that the type of chart, the colours, the title, the labels, the axes and basic transformations of your data can subtly influence your readers. Whether you would be deliberately misleading them or just making a mistake when creating your visualisation, stay away from the Ugly visualisations!
Another important point is how sticky your first impression is: the first time you see a plot, you get a first feel for the data. This instinctive interpretation can be right on the mark if the representation is a Good one, or very wrong if it is an Ugly one. Either way, it will stay in a corner of your head and no further explanation from the creator will ever completely erase it. That’s one of the reasons that make the Ugly plots so terrible: they actually work on many people because they play with our initial impression and twist our minds so that we can’t unsee it.
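To make the point about axes a bit more concrete, here is a minimal sketch in Python of one classic trick: starting the vertical axis of a bar chart somewhere other than zero. The function name `apparent_ratio` and the numbers are made up for illustration; they don't come from the datasets in this series. The sketch simply compares how tall two bars are drawn, depending on where the axis starts.

```python
def apparent_ratio(small, large, baseline=0.0):
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    if baseline >= small:
        raise ValueError("baseline must be below the smallest value")
    return (large - baseline) / (small - baseline)


# Made-up values, for illustration only
sales_a, sales_b = 50.0, 55.0

honest = apparent_ratio(sales_a, sales_b)            # y-axis starts at 0
truncated = apparent_ratio(sales_a, sales_b, 45.0)   # y-axis starts at 45

print(f"Axis from 0:  B is drawn {honest:.1f}x as tall as A")
print(f"Axis from 45: B is drawn {truncated:.1f}x as tall as A")
```

With these made-up numbers, a 10% difference ends up drawn as a bar twice as tall once the axis starts at 45. The data is the same in both cases; only the picture changes.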
But cheer up! 🙂
Truth is, many of us are more aware of these tricks and deceptions than we think and no doubt most of you won’t get fooled by all of the Uglies. The essential thing is that if you do, you understand why you did and you become a little more prepared for the next one. Because, let’s face it, journals, scientific articles and social networks are filled with infographics that are sometimes more of an Ugly than a Good sort…
I hope you’ll enjoy this series: feel free to react in the comments and tell me what you think of it! 😉
Quick note: all the designs were made with infogr.am, an online tool for creating infographics and charts. I discovered it in the e-book mentioned above and really like it, even if the free version disables some cool features. If you want to do some data visualisation, I encourage you to try it!
Still, it’s not the only option: there are a lot of free and/or open-source data visualisation tools around, as listed in this article by Geoff Hoppe.
- The article that started all this: https://blog.hubspot.com/marketing/great-data-visualization-examples
- The e-book from the article: https://offers.hubspot.com/data-visualization-guide?hsCtaTracking=2f02d8fe-c9b0-4078-a3ae-5831c892fbd0%7Ce67d8bb8-ee0a-4f39-a88a-c900eaaa3d72
- Infogr.am’s website: https://infogram.com/
- G. Hoppe’s nice list of free and/or open-source data visualisation tools: https://blog.capterra.com/free-and-open-source-data-visualization-tools/