A “Takeaway Message” Model for Data Visualization
A model to think about what messages people derive from a data visualization
A useful model for conceptualizing visualization is to reason about what kind of message readers take away from a visualization once it’s been observed and interpreted. Several researchers have looked into this idea, trying to identify elements that influence the takeaway message.
This lens on visualization is new compared to the traditional communication model we are used to. Traditional visualization research is about how to communicate something (often generically referred to as information) effectively. This is useful, but it’s based on a designer-centric view of visualization: I have something I want to communicate, and I have to find the best way to communicate it. A different lens is a reader-centric one, where the focus is on what information, or more precisely, what message, the reader takes away from a visualization once they are exposed to it. This perspective is useful because it’s not rare for designer intent and reader interpretation to diverge.
Interestingly, to date, we do not have an established framework to think about what influences a takeaway message in visualization. What I know is that the traditional “visual encoding” model, which focuses on which visual channels communicate certain types of information more precisely, is inadequate to capture it.
While preparing some new lectures for my courses, I started developing a small framework to think about this idea more systematically. It’s probably not complete, and it’s not validated, but it’s a good start!
The framework
The framework comprises three main high-level components. Each of these influences what messages people will derive from a data visualization. Let’s analyze each one specifically.
Data shape
Although most visualization literature (understandably) focuses on visual representation, the shape of the data one visualizes is the most influential element. Data shape has two different meanings. The first is about which combination of variable types one decides to use. If I use time and a quantity, I will look into the temporal evolution of the quantity. If I have categories and associated quantities, I will compare these quantities across the categories. It’s more complicated than that, but I think you get the point. The second is about the data distribution. If you have time series data, the values can go up, go down, stay flat, or show a mixed trend over time. Designers can do surprising things with data to influence these trends; for instance, they can decide to focus on a time range or a subset of categories. Still, the way the visualization looks and the message one extracts from it depend heavily on the data distribution, and the distribution is malleable only up to a certain degree.
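To make the point about time ranges concrete, here is a minimal sketch in Python. The series is made up for illustration (it does not come from any study), and the `slope` helper is just a hypothetical stand-in for "the trend a reader would see". It fits a straight line to the full series and to two sub-ranges a designer might choose to show.

```python
# Hypothetical monthly values, invented for illustration only.
series = [10, 12, 15, 19, 24, 22, 19, 17, 16, 15, 14, 13]


def slope(values):
    """Least-squares slope of the values against their position (0, 1, 2, ...)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den


full_trend = slope(series)        # all twelve months
early_trend = slope(series[:5])   # only the first five months
recent_trend = slope(series[4:])  # only the last eight months

for label, value in [("full", full_trend), ("early", early_trend), ("recent", recent_trend)]:
    print(f"{label:>6}: slope {value:+.2f}")
```

With these numbers the early window rises steeply, the recent window falls, and the full series is close to flat. The data are the same in all three cases; only the selected range differs. At the same time, no choice of window can make this particular series look like steady growth over the whole year, which is the sense in which the distribution is malleable only up to a point.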
Visual representation
Visual representation is another “free parameter” one can use to modulate a message starting from a given data set. However, before we discuss the role of visual representation, it’s important to clarify what pertains to this step and what is instead part of the previous step. If we want to be systematic in understanding the role of visual representation, we have to start from the assumption that a change in visual representation is only a change in the way the same information is encoded visually. Too often, we compare charts as if they depict the same information when, in fact, some data transformations are necessary to go from one to the other. This may seem obvious, but it’s not. For example, even a simple change from a bar chart to a pie chart requires transforming the data from absolute values to relative values, expressed as percentages.
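The bar-to-pie example can be sketched in a few lines of Python. The two data sets and the `to_shares` helper are hypothetical, chosen only to show that the transformation to percentages is a data step and not a purely visual one.

```python
def to_shares(counts):
    """Convert absolute values into percentages of the total (what a pie chart encodes)."""
    total = sum(counts.values())
    return {name: 100 * value / total for name, value in counts.items()}


# Two made-up data sets whose absolute values differ by a factor of ten.
small_survey = {"A": 12, "B": 6, "C": 2}
large_survey = {"A": 120, "B": 60, "C": 20}

small_shares = to_shares(small_survey)
large_shares = to_shares(large_survey)

# A bar chart of the raw counts would look very different for the two data sets
# when drawn on a shared axis; a pie chart of the shares would look identical.
same_pie = small_shares == large_shares
print(small_shares, large_shares, same_pie)
```

Because the two data sets collapse to the same percentages, the pie chart and the bar chart are not showing the same information: the magnitude is gone after the transformation. That is why I would treat this change as part of the data shape step, not the visual representation step.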
A systematic analysis of the effect of visual representation is beyond the scope of this short post, and I don't have a complete theory or model to do that yet. However, some initial ideas exist in this space. My student Racquel published a paper in 2023 that starts looking into how different arrangements can make certain interpretations more or less likely. I posted an article on the study here, so you can refer to it if you want more details.
The fact that different mappings may lead to different interpretations is sometimes obvious. A classic example is truncated axis charts, where completely different interpretations can stem from the choice of truncating an axis. Another classic case is whether we decide to order the elements or stack the elements of a visualization or not. This is an example from the paper that gives an idea of how simple changes can lead to different takeaways.
As you can see, ordering, partitioning, spacing, and coloring all can have substantial impacts despite being very simple interventions. A while back, I posted the image below on LinkedIn to provide a simple example of how, from the same data, one can create very different visual arrangements and, consequently, prioritize different interpretations.
There is way more to explore in this space, and I hope I’ll be able to analyze these effects more systematically in the future. In any case, I am sure these simple examples will give you a sense of what the role of representation is here.
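The truncated-axis case mentioned above can also be made concrete with a little arithmetic. This is a rough sketch with made-up values, and `visual_ratio` is a hypothetical helper of mine, not something from the paper: it computes how much taller one bar is drawn than another for a given axis baseline.

```python
def visual_ratio(smaller, larger, baseline=0.0):
    """Ratio of the drawn bar heights when the value axis starts at `baseline`."""
    if baseline >= smaller:
        raise ValueError("baseline must be below the smaller value")
    return (larger - baseline) / (smaller - baseline)


# Made-up values: the second quantity is 10% larger than the first.
a, b = 50, 55

honest = visual_ratio(a, b, baseline=0)      # axis starts at zero
truncated = visual_ratio(a, b, baseline=45)  # axis starts at 45

print(f"axis from 0:  second bar drawn {honest:.1f}x as tall")
print(f"axis from 45: second bar drawn {truncated:.1f}x as tall")
```

With the axis starting at zero, the drawn heights keep the 10% difference in the data. With the axis starting at 45, the bars are drawn 5 and 10 units tall, so the second one looks twice as large. The data and the chart type are unchanged; only the mapping differs.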
Narrative layer
The last component comprises all the elements that contribute to guiding the reader, which I like to call the “narrative layer.” Examples include titles and captions, annotations, reference lines, and animations. Text plays a major role here: most elements of the narrative layer leverage text to guide the reader and to suggest specific interpretations. Titles are especially relevant because they are what most readers read first, and they strongly influence how the viewer interprets a visualization. Some interesting experiments found that titles are among the most salient elements of visualizations and that they can bias the reader in remarkable ways.
Uses of the framework
The framework can be used in different ways. Designers can use it to be mindful of how their designs will impact their readers, and readers can use it to assess the validity of the messages they perceive at first glance.
Data, representation, and narrative together play a role in promoting specific interpretations of the data we are presented with. By recognizing the role of individual components, we can better understand what other interpretations are possible and how the most obvious interpretation depends on the specific elements of the design used for a given visualization project.
This framework is also the basis of the Rhetorical Data Visualization course I am building. If you are interested in the course and want to receive updates about its development and future publication, please add a comment below, and I will add you to the course mailing list. I hope you find this model useful and insightful. If you have any ideas or feedback about it, please let me know!
This distinction between designer-centric and reader-centric has echoes of the distinction between "bottom-up" (perceptual) versus "top-down" (cognitive) in the saliency literature.