The Three Mental Models Model for Data Visualization
A useful way to think more holistically about how data visualization works
If you were to ask me, “What is the single most important neglected concept in visualization?” my answer would probably be “mental models”. What is a mental model? It’s quite simple: it’s the mental representation we have of how something works. For any object or phenomenon that exists in the world, we have a mental representation of its properties and how it works. To understand this more intuitively, try asking different people how something works, and you will see that they describe it in different ways. Each description reflects that person’s own mental model.
If you want to dig deeper, you can consult the Wikipedia page for the “mental model” entry. It starts with this definition: “A mental model is an explanation of someone's thought process about how something works in the real world.” Not too different from what I wrote above.
Why do I believe that mental models are so important in data visualization? There are several reasons, but the strongest one is that understanding how visualization works requires recognizing that it’s all about a) the mental models people already have and b) the way visualization changes those models or helps build new ones. In other words, it is my conviction that thinking more explicitly about mental models can help people think about data visualization better.
When we use a visualization there are (at least) three types of mental models at play:
Model of the visualization
Model of the data
Model of the world
Model of the visualization. The model of the visualization consists of understanding how to read the visualization and what it means. If I show you a bar chart, you know from previous exposure to bar charts that each bar represents some kind of object or property and that the length of the bar represents some kind of quantity you are supposed to compare.
But this is not enough. When you encounter a chart for the first time, you also need to figure out what each element means: that is, what data elements and what real-world objects or phenomena the graphical objects represent. Using bar charts again, if the chart represents last quarter’s sales for the different divisions of a company, you need to connect the concept of “divisions” to the bars and the concept of “sales” in dollar amounts to the length of the bars.
But what happens when you see a new type of chart that you have never encountered before? You start making sense of it using your previous knowledge and the cues that exist in the visualization.
Model of the data. The model of the data consists of understanding the meaning of the data and what real-world phenomena they represent. This is often overlooked, but if we do not know how the data have been generated and what they mean, it’s hard to produce sound reasoning around them. The data gathered during the COVID-19 pandemic are a perfect example: if you don’t understand how cases are counted and what they represent, you may have a very partial understanding of what a given visualization is showing (e.g., cases grow or shrink not only as a function of actual infections but also according to how many tests are performed).
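To make the testing example concrete, here is a minimal sketch with made-up numbers. The function name `reported_cases`, the test counts, and the constant positivity rate are all illustrative assumptions, not real COVID-19 figures, and real reporting is far messier than this.

```python
def reported_cases(tests_performed: int, positivity_rate: float) -> int:
    """Hypothetical model: reported cases = tests performed x share of positive tests."""
    return round(tests_performed * positivity_rate)

# Assumption: the share of positive tests stays at 5% in both weeks,
# and only the number of tests performed changes.
week_1 = reported_cases(tests_performed=10_000, positivity_rate=0.05)
week_2 = reported_cases(tests_performed=20_000, positivity_rate=0.05)

print(week_1, week_2)
```

In this toy setup, a chart of reported cases would show a doubling from one week to the next even though nothing about the underlying positivity changed. A reader without a model of how the data are generated could easily read that as a surge in infections.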
Model of the world. The model of the world consists of understanding the actual phenomena and objects described by the data. In other words, the domain knowledge. This is also often overlooked, but there is a big difference between a layperson and an expert looking at the same visualization. The expert knows the problem represented by the data intimately and can activate connections with their knowledge in a way that is simply impossible for a layperson to do.
There are two important and related observations to make about mental models. The first one is that mental models are always being formed and modified. A mental model is not a static object in our mind that is acquired once and for all and then used whenever it is needed. The second is that almost no knowledge is acquired without connection to pre-existing knowledge. This aspect of how our mind works is often highlighted in learning science and education because it forms the basis of effective teaching and is one of the reasons why analogies, connections, and metaphors are so useful for understanding and memorization.
Of course, these three models are not really separate. Mine is only an abstraction useful to talk about these concepts in a more systematic way. One interesting thing to do is to look at the ways in which these models interact. Let’s consider these pairs and see what we can learn from them.
Visualization-World. I’ve already mentioned that when we form a mental model of a visualization we also implicitly need to connect the graphical elements to the real-world objects they represent. Imagine a visualization with no legend, title, or labels. It’s impossible to derive anything from it because there is no way to create the semantic connection necessary to understand what it represents. It’s surprising to see how little we talk about this crucial connection! If we want to create effective visualizations we have to make sure that they are not only accurate but also meaningfully evocative of the phenomena they depict.
Visualization-Data. Having a good match between the mental model of the data and the mental model of the visualization is also very important. This is somewhat related to the “expressiveness” issue I mentioned in one of my last posts. When I visualize data with a given graphical representation, I implicitly communicate the relevant properties of the data I am depicting. If there is a mismatch between the two, readers can either have problems interpreting the representation (they form conflicting models) or come away with a wrong interpretation of the data (e.g., when a truncated axis is used improperly).
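The truncated-axis case can be illustrated with a little arithmetic. This is only a sketch: the helper `drawn_length_ratio` and the values 95 and 100 are hypothetical, chosen to show how the drawn bar lengths drift away from the data when the axis does not start at zero.

```python
def drawn_length_ratio(smaller: float, larger: float, baseline: float = 0.0) -> float:
    """Ratio of the two bar lengths as drawn when the axis starts at `baseline`."""
    return (larger - baseline) / (smaller - baseline)

# Two hypothetical values that differ by about 5%.
honest = drawn_length_ratio(95, 100)                  # axis starts at 0
truncated = drawn_length_ratio(95, 100, baseline=90)  # axis starts at 90

print(f"axis from 0:  larger bar is {honest:.2f}x the smaller one")
print(f"axis from 90: larger bar is {truncated:.2f}x the smaller one")
```

With the axis starting at 90, the larger bar is drawn twice as long as the smaller one, so a reader applying the usual bar chart model (length encodes quantity) comes away with a wrong model of the data.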
Data-World. As I have mentioned above, it is crucial for a reader to have a good understanding of what real-world objects a given piece of information represents. What is interesting here is what happens when the model of the world and the model of the data do not match (some sort of “data-driven cognitive dissonance”). One can hold a flawed model of the world and realize, when exposed to data, that the data do not match it; a very useful way to update somebody’s knowledge! And one can also have the right model of the world and see that the data do not match it; a very useful way to find problems with the data or the data-generating process!
So, what can we learn from this model of mental models?
The primacy of the world model. The first thing I’d like to highlight is that, out of the three models, the model of the world is the only one that really matters. We do not collect and visualize data for the sake of it; we do it because it has the potential to help people better understand the reality the data represent. We tend to forget that, but the ultimate purpose is really to build better mental models of the world. Data and data visualization are only means we use to achieve this goal.
Graph comprehension. This is another aspect that is often overlooked. How can we help our readers understand what a given representation depicts and form a good mental model of how it works? Visualization is often described as the art of depicting data accurately, but this view is very limited. An effective representation is one that people can learn with a reasonable amount of effort and can use effectively once it’s been learned. A curious example is the connected scatter plot: no matter how much time I spend with them, I always have to relearn how to use them.
In this sense, being mindful of how my readers are going to make sense of the visualization I am developing is an essential aspect of visualization design. Elements like legends and labels may seem like minor details, but they are essential in providing a semantic link between graphical objects and real-world objects. Similarly, when an unconventional representation is used, I need to figure out how to help my readers learn how to use it effectively before they are able to extract any information from it. In this space, a common issue is that people bring the mental models they have previously built, and they can get confused when those old models clash with the new ones. An egregious example is the “burning embers plot” used by the IPCC in many of their reports.
This is another plot that I always have to relearn every time I see it. My mind wants to read it like a bar chart, but it’s not a bar chart.
Domain and data knowledge. If we want people to acquire deeper knowledge through data visualization, we have to meet them where they are in terms of how much they know about the domain and the data. Visualization is not sufficient to generate insights and meaningful inferences if the reader does not understand the nuances of the data shown and the mechanisms that regulate the reality described by the data. In this sense, strategies coming from education can play a big role. Readers need to be guided in acquiring the knowledge they need in order to interpret a visualization correctly. In other words, what is “around” a visualization is as relevant as the visualization itself.
Asking yourself what kind of mental models already exist in your reader and what new models they will build when exposed to your work is a powerful tool to think more productively and more holistically about visualization.
I hope you’ll find this useful! Let me know what you think!
What an awesome weekend read! (I may be a taxonomy and conceptual model freak, so this is just what I needed. Not coffee. This.)
You've managed to take rather common concepts and organize them in a way that helps me think about projects I work on or see around. It makes me look for answers – what models have bigger or smaller focus in the project and how that influences the project, its understanding or just branding.
I would like to ask you about the IPCC's risk visualization. I agree that when it comes to easy readability / understandability of the chart, it's so custom that one needs to learn how to read it and we can't build too much on our existing models. What I wasn't sure about was whether by "an egregious example" you mean just the clash with our existing mental models of how to read visualizations, or whether you find the chart really bad overall. I actually find it really helpful (of course after learning how to read it, hh) and I was wondering if you would see a better way to display such information given the complexity of the topic / data and the uncertainties involved.
Thanks for the insightful post.
Do you think we can achieve a better visualisation model by utilising semantic icons and visual metaphors for laypeople? How do we measure that? Using performance metrics with non-experts who look at visualisations for non-work related tasks is not the best option. How do we balance between the two and measure their understanding without interference with their interaction?