What is Qualitative Data Visualization?
An initial investigation trying to put some order in this space
This coming month, I’ll be teaching a private data visualization workshop, and the students have expressed interest in learning, among other topics, how to use visualization qualitatively. Their request made me reflect on the fact that traditional data visualization pedagogy rarely addresses this issue explicitly. I realized that there is a significant gap in both the literature and teaching practices regarding the qualitative uses of visualization. More precisely, I have noticed that it’s hard to find a systematic analysis of this space and that the same terms are often used for completely different concepts. Here is my initial attempt to bring some order to this space. Let me know what you think!
Visualizing Qualitative Data or Visualizing Data Qualitatively?
An interesting ambiguity here arises regarding whether qualitative refers to the data or the visual representation. One possibility is to develop data visualizations for qualitative data, such as data from surveys or interviews. Another possibility is to focus on the problem of what kind of visual representations communicate information qualitatively. Let’s explore both concepts.
Qualitative Data
What is qualitative data? One straightforward answer is that qualitative data is any data that is not quantitative. If you can perform mathematical operations on the values, it’s quantitative. Otherwise, it’s qualitative. Categories, labels, concepts, etc., are all qualitative. Prices, temperatures, weights, and counts are quantitative. However, many grey areas exist. We often attach numbers to things that have an inherently qualitative nature. If I ask a person to express the presence of a given emotion on a scale between 1 and 10, is that number qualitative or quantitative data? We also have the common case of ordered categories. You can order them, but they don’t really measure anything. You also have categories that stem from binning quantities into low, medium, and high values. So the distinction is often subtle.
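To make the rule of thumb above a bit more concrete, here is a minimal Python sketch. It assumes a three-way split into nominal, ordinal, and quantitative values, which is my own simplification for illustration; the names `Scale`, `MEANINGFUL_OPS`, `is_qualitative`, and `bin_value` are hypothetical, and the list of operations is not exhaustive.

```python
from enum import Enum


class Scale(Enum):
    NOMINAL = "nominal"            # unordered categories and labels
    ORDINAL = "ordinal"            # ordered categories, e.g. low < medium < high
    QUANTITATIVE = "quantitative"  # prices, temperatures, weights, counts


# Operations that are meaningful at each scale (illustrative assumption).
MEANINGFUL_OPS = {
    Scale.NOMINAL: {"equal"},
    Scale.ORDINAL: {"equal", "compare"},
    Scale.QUANTITATIVE: {"equal", "compare", "add", "average"},
}


def is_qualitative(scale: Scale) -> bool:
    """Rule of thumb from the text: no meaningful arithmetic means qualitative."""
    return "add" not in MEANINGFUL_OPS[scale]


def bin_value(value: float, low: float, high: float) -> str:
    """Turn a quantity into an ordered category by binning it."""
    if value < low:
        return "low"
    if value < high:
        return "medium"
    return "high"


if __name__ == "__main__":
    for scale in Scale:
        print(scale.value, "-> qualitative:", is_qualitative(scale))
    print(bin_value(37.5, low=10, high=30))
```

Note that the sketch does not resolve the grey areas: a 1-to-10 emotion rating could be declared ordinal or quantitative, and that choice is exactly the judgment call discussed above.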
However, when people talk about qualitative data, they often mean something different. It’s less about the nature of the values stored in a structured data table and more about free-form text gathered from surveys, interviews, observations, annotations, and similar sources. This is the core of what most qualitative researchers do: they collect rich, unstructured textual data on a given topic and analyze it to distill concepts, ideas, and themes, often organizing their findings into collections of labels or codes.
Qualitative data can also be more than what we have covered so far. In a way, most unstructured data, such as text, images, and videos, is qualitative in nature. Relational structures such as networks, flows, and hierarchies are also more qualitative than quantitative.
So, as you can see, sorting out qualitative data is not as straightforward as it may seem. Different people may interpret the same term in different ways, and the qualitative nature of data can stem from various aspects, including the nature of the phenomena captured in the data, the format of the data, and the meaning of the values used to represent the information.
Qualitative Representations
As anticipated above, another interpretation of the term “qualitative data visualization” refers to the qualitative nature of the visual representation rather than the data. In many cases, the qualitative nature of the data is reflected in the qualitative nature of the representation. For example, network data is often represented with node-link diagrams, which are inherently qualitative. But is it true in general that qualitative representations necessarily stem only from the qualitative nature of the data? I don’t think so. Often, the same information can be expressed in many different ways, and some ways can be more qualitative than others. After thinking about this idea for a while, I came up with this initial characterization of “qualitativeness.”
Visualization is often described as the procedure that maps data objects to visual symbols (often called marks), and data attributes to visual channels (e.g., color, size, shape, orientation) and spatial position. If we look at these elements through the qualitative lens, we can start analyzing how these elements can be used qualitatively.
Qualitativeness of channels. It is well known that some visual channels are more apt for qualitative than quantitative information. Shape and color hue convey qualities, whereas size, vertical or horizontal position, and color luminance convey quantities. According to this categorization, using a qualitative channel for quantitative information is wrong, and vice versa. However, even within this constraint, the same type of information can be expressed more or less qualitatively depending on which channel is chosen. For example, a quantitative variable can be expressed with position along an axis, bar length, bubble size, or color luminance. All these encodings are legitimate, but they vary in how qualitatively they will be interpreted. Visualization research has traditionally focused on the idea that more precise channels are to be preferred over imprecise ones. Still, a useful complementary model is that less precise channels, like color luminance, are not necessarily always “bad.” Less precise representations may be preferred if we are interested in conveying information more qualitatively.
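The idea can be sketched as a small lookup in Python. This is only an illustration: the channel names follow the paragraph above, but the precision ordering in `PRECISION_ORDER` is an assumption on my part, and `is_legitimate` and `pick_quantitative_channel` are hypothetical helpers, not part of any library.

```python
# Which kind of information each channel is usually considered apt for,
# following the categorization in the paragraph above.
CHANNEL_KIND = {
    "shape": "qualitative",
    "color hue": "qualitative",
    "position": "quantitative",
    "length": "quantitative",
    "size": "quantitative",
    "color luminance": "quantitative",
}

# Assumed ordering of quantitative channels, from most to least precise.
PRECISION_ORDER = ["position", "length", "size", "color luminance"]


def is_legitimate(channel: str, data_kind: str) -> bool:
    """A channel is a legitimate choice if it matches the kind of data."""
    return CHANNEL_KIND[channel] == data_kind


def pick_quantitative_channel(qualitative_reading: bool) -> str:
    """Pick the least precise channel when a more qualitative reading is wanted."""
    return PRECISION_ORDER[-1] if qualitative_reading else PRECISION_ORDER[0]


if __name__ == "__main__":
    print(is_legitimate("shape", "quantitative"))
    print(pick_quantitative_channel(qualitative_reading=False))
    print(pick_quantitative_channel(qualitative_reading=True))
```

The point of the second function is that precision becomes a design parameter rather than a quality score: both ends of the list are valid encodings of the same quantity.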
Qualitativeness of symbols (marks). A more interesting element of qualitative visualization is using symbols that carry meanings. Traditional visualization research categorizes marks according to three main elements: points, lines, and areas. All of these assume that we always want to represent data using abstract symbols, but this is not always true. Sometimes we want to use symbols that carry meaning and, as such, are more qualitative than abstract symbols. I have identified three main cases:
Icons: Icons are simplified figures that carry some meaning. A common example of icons used in visualization is “anthropographics,” the use of anthropomorphic symbols to convey the idea that the data depicted is about human individuals. Icons are qualitative and carry important meaning that can be exploited to convey different types of information.
Text: Individual words and numbers can be used as symbols in a visualization. Richard Brath’s book “Visualizing with Text” is all about this idea. Words can still be scaled, positioned, colored, etc., to encode specific values, but they also carry the meaning of the concept they represent.
Images: Finally, images can also be used as symbols in a visualization. Similarly to words and icons, they can be scaled, positioned, and colored, while carrying the rich qualitative information of the objects they represent.
Qualitativeness of space. Space in visualization can be used in a variety of ways. The most common approach is to represent values along one or more axes. For instance, a horizontal axis in a chart might display quantities or ordered and unordered categories. However, not all visualizations rely on axes, and this is where qualitative uses of space come into play. Space can be used to group elements, establish hierarchies and structures (such as through recursive spatial subdivisions), represent intersections as in Venn diagrams, and much more.
What qualitative concepts can we express?
But maybe a better way to think about the “qualitativeness” of a data visualization is to reason about the combination of concepts it expresses. Here is a partial list of common concepts that visualizations can express:
Who/what
When
After/before
Where
With whom/what
How much/many
If you think about it, the vast majority of visualizations stem from a combination of these factors, where at least one factor is about how much or how many things there are. This is the quantitative part.
Think about the most classic charts. A bar chart integrates information about the who/what with how much or how many. A line chart integrates the when with how much or how many. A choropleth map integrates the where with how much or how many. Etc.
One thing I am wondering is whether qualitative visualizations can be defined as visualizations that express combinations of concepts that are exclusively qualitative.
For example, a chart that integrates who and where, or who and when, without any associated quantity, is inherently qualitative. I need more time to think about this idea, and I’d like to develop some examples to share in future posts, but this seems a productive direction.
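As a first step toward those examples, the tentative definition can be written down as a tiny Python sketch. The concept sets for the three classic charts follow the text above; the "who-where diagram" entry is a hypothetical chart I made up for illustration, and `is_exclusively_qualitative` is just a name for the idea, not an established term.

```python
# The only concept in the list that is about quantity.
QUANTITATIVE_CONCEPTS = {"how much/many"}

# Concepts each chart combines. The first three follow the text above;
# the last one is a hypothetical example with no associated quantity.
CHARTS = {
    "bar chart": {"who/what", "how much/many"},
    "line chart": {"when", "how much/many"},
    "choropleth map": {"where", "how much/many"},
    "who-where diagram": {"who/what", "where"},
}


def is_exclusively_qualitative(concepts: set) -> bool:
    """True when none of the expressed concepts is about quantity."""
    return not (concepts & QUANTITATIVE_CONCEPTS)


if __name__ == "__main__":
    for name, concepts in CHARTS.items():
        print(name, "-> qualitative:", is_exclusively_qualitative(concepts))
```

Under this definition, the classic charts all come out as quantitative, and only the combination without "how much/many" counts as qualitative.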
—
What do you think? Does this initial exploration make sense to you? Have you worked with qualitative data visualization before? I’d be happy to hear what you think.
Nice read and also food for thought, thanks Enrico! Lately I stumbled upon https://huyen-nguyen.github.io/maker/index.html#gallery, the so-called WordStream, "A Lightweight End-to-end Visualization Platform for Qualitative Time-series Data". Great work and paper by Nguyen, in my opinion, which bridges the gap and eases the process of extracting insights from temporal patterns in text data.
In the qualitative-data-equals-categorical-data sense, "visualizations that express combinations of concepts that are exclusively qualitative" possibly corresponds to Michael Friendly's "Visualizing Categorical Data"?
"Symbols that carry meanings" can also arguably include simple data symbols and simple visual channels, e.g., a (possibly outdated) correspondence between blue/pink data points and male/female groups or blue/red for cold/hot.