“Semantic Clashes” in Data Visualization

When it's hard to figure out what a graph communicates

Jul 01, 2024

Take a look at the images below and try to answer quickly.

If you are like me, the graphs above generate some confusion. It takes a while to understand what is going on. In a way, they are a puzzle. Why?

Over the last few years, I have come across examples like these and started noticing something that we rarely seem to acknowledge in visualization. Some visualization elements carry implicit semantic meaning that can clash with the concepts one wants to communicate with the data. In other words, there can be a conflict between data semantics and graphical semantics.

It’s surprising how little we know about how and when these clashes happen. In visualization, we have a lot of theories about how to match graphical properties to data properties but very little about how to match data semantics to graphical semantics. We don’t even have a good characterization of what kind of semantics different graphical elements carry.

If we break down graphical representations into their parts, can we generate general rules that help us identify and predict these effects?

The patient evolution example above has something to do with the association between direction and positive vs. negative valence. Somehow, a metric going up is perceived as an improvement, whereas one going down is a degradation. Maybe we humans have a spatial or physical bias towards the idea that more of something is better (more resources).
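To make the direction-and-valence idea concrete, here is a minimal sketch of how one might flag this kind of clash before publishing a chart. Everything in it is an assumption for illustration: the `Metric` class, the `higher_is_better` annotation, and the `direction_clash` function are hypothetical names, and the rule "up reads as better" is just the bias described above turned into a check, not an established model.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    """A plotted quantity, annotated by the chart author (hypothetical)."""
    name: str
    higher_is_better: bool  # assumption: the author knows the valence


def direction_clash(metric: Metric, axis_inverted: bool = False) -> bool:
    """Return True when visual 'up' corresponds to a worse outcome.

    Assumes readers interpret upward movement as improvement.
    """
    # On a standard axis, moving up means the value increases.
    up_means_more = not axis_inverted
    # Up reads as better only if "more" and "better" point the same way.
    up_means_better = up_means_more == metric.higher_is_better
    return not up_means_better


# Hypothetical metric where lower values are the good outcome.
pain = Metric("pain score", higher_is_better=False)
print(direction_clash(pain))                      # standard axis
print(direction_clash(pain, axis_inverted=True))  # flipped axis
```

A check like this only works if someone records the valence of each metric, which is exactly the kind of semantic information that data types alone do not capture.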

The runners’ example also involves spatial effects. We associate the idea of “runners” with their position at the finish line. However, the graph also represents time as space, and that graphical association collides with our more intuitive spatial one.

Observing these two examples, one might conclude that these effects are all inherently spatial. However, the example with colored regions contradicts this conclusion because the ambiguity does not derive from spatial arrangements. It stems from the relationship between the foreground and the background. When the background is dark, we perceive bright values as high values; when the background is bright, our perception is reversed.
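The foreground/background effect can be sketched as a small heuristic: whichever end of a color ramp differs more from the background is the one likely to be read as "high." This is only my rough reading of the effect described above, not a validated perceptual model. The function names are hypothetical, the example colors are arbitrary, and the luminance formula is the standard sRGB relative-luminance calculation used in the WCAG guidelines.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as 0-255 integers."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def end_read_as_high(background, ramp_start, ramp_end):
    """Guess which end of a color ramp readers will take as 'high'.

    Assumption: the end that contrasts more with the background wins.
    Returns "start" or "end".
    """
    bg = relative_luminance(background)
    start_contrast = abs(relative_luminance(ramp_start) - bg)
    end_contrast = abs(relative_luminance(ramp_end) - bg)
    return "end" if end_contrast >= start_contrast else "start"


# Arbitrary example ramp: dark blue (start) to near-white (end).
dark_blue, near_white = (8, 48, 107), (247, 251, 255)
print(end_read_as_high((0, 0, 0), dark_blue, near_white))        # dark background
print(end_read_as_high((255, 255, 255), dark_blue, near_white))  # bright background
```

If the end this heuristic picks is not the end that encodes large data values, the map is a candidate for the kind of clash discussed here.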

Is that all? Of course not! I am sure there is way more to be discovered and systematized.

Current visualization theory defines data types (categorical, ordinal, quantitative) and visual channels (position, color, size, etc.), but this is not enough to capture visualization effects that take place at a more conceptual level. For this reason, visualization theory and practice could greatly benefit from work that identifies and systematizes these semantic effects.

My student Racquel Fygenson, as well as other researchers, has been investigating the concept of “visualization affordance,” which could help with this characterization. Barbara Tversky has also done extensive basic research in cognitive science on the idea that humans derive many abstract concepts from spatial cognition. Colin Ware also touches upon the semantic effects of visual representations in his classic visualization book. Still, it would be great to have a more systematic way to describe “semantic clashes” because they are often the basis of confusion and misinterpretation.

—

And you? What do you think? Have you encountered this problem when designing a visualization? Do you think there’s a way we can think more systematically about these effects? Please let me know by leaving a comment on this post.

FILWD
