A Reading List on GenAI for Data Visualization
Enjoy the read!
Hi folks! I hope you are enjoying the last bits of summer. Here I am, busy preparing for the two courses I’ll teach this semester at Northeastern University. One of these courses is new and completely devoted to the intersection of Data Visualization and Generative AI. The first half of the course focuses on reading research papers on this topic. Here is the list I have compiled for the course. I am sure you’ll enjoy reading about this rapidly evolving area of research!
Prompt-to-Vis Systems
LLMs make it possible to generate charts that solve a particular problem directly from a natural-language request, so visualizations can be specified without code, domain-specific languages, or UI interactions. A minimal sketch of the idea follows the paper list below.
ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language
Visualization Generation with Large Language Models: An Evaluation
DynaVis: Dynamically Synthesized UI Widgets for Visualization Editing
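As a rough illustration (not taken from any of these papers), here is a minimal Python sketch of the prompt-to-vis idea: ask an LLM to turn a natural-language request into a Vega-Lite spec. It assumes the OpenAI Python client; the model name, prompt wording, and column list are made up for the example.

```python
# Minimal prompt-to-vis sketch: natural language request -> Vega-Lite JSON spec.
import json
from openai import OpenAI  # assumes the official OpenAI Python client is installed

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def prompt_to_vegalite(request: str, columns: list[str]) -> dict:
    """Ask the model for a Vega-Lite spec that answers `request` over the given columns."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative choice; any capable chat model works
        messages=[
            {"role": "system",
             "content": "Return only a valid Vega-Lite JSON spec, with no prose."},
            {"role": "user",
             "content": f"Columns: {columns}. Task: {request}"},
        ],
    )
    # The system prompt asks for JSON only, so we parse the reply directly.
    return json.loads(response.choices[0].message.content)

spec = prompt_to_vegalite("show average price by neighborhood as a bar chart",
                          ["neighborhood", "price", "room_type"])
print(json.dumps(spec, indent=2))
```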
Image Synthesis for Vis
Most LLM-based data visualization solutions transform prompts into code that generates the desired charts. However, generative AI can also produce images directly, without relying on code. Here are a few papers that do exactly that, followed by a toy illustration of the idea.
Let the Chart Spark: Embedding Semantic Context into Chart with Text-to-Image Generative Model
viz2viz: Prompt-driven stylized visualization generation using a diffusion model
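For flavor, here is a toy sketch of the direct image-generation route using a generic text-to-image endpoint. This is not the method of the papers above, which build chart-aware pipelines on top of diffusion models; the model name and prompt are purely illustrative.

```python
# Toy example: generate a chart-like image directly from a text prompt, with no code step.
from openai import OpenAI

client = OpenAI()

result = client.images.generate(
    model="dall-e-3",  # illustrative model choice
    prompt=("A pictorial bar chart of coffee consumption per country, "
            "bars drawn as stacks of coffee beans, clean infographic style"),
    size="1024x1024",
)
print(result.data[0].url)  # URL of the generated image
```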
Narrative Sequences
Building a sequence of charts and text is the core of data storytelling. LLMs can support both the ideation and the implementation of these sequences, in which the text introduces and describes each chart.
Captioning and Accessibility
Describing charts in terms of their structure and content is crucial for interpretation and accessibility. Can LLMs provide the support these important tasks need? Can visually impaired users leverage GenAI to get easier access to visual content?
Pluto: Authoring Semantically Aligned Text and Charts for Data-Driven Communication
VizAbility: Enhancing Chart Accessibility with LLM-based Conversational Interaction
LLMs as Chart Readers
Can LLMs do some of the evaluative work that humans normally do with data visualizations? These papers examine the capabilities of LLMs and evaluate their performance on a series of interpretation and reasoning tasks. A minimal example of querying a vision model about a chart follows the list below.
Probing the Visualization Literacy of Vision Language Models: The Good, the Bad, and the Ugly
How good (or bad) are LLMs at detecting misleading visualizations?
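To give a sense of the setup these papers study, here is a minimal sketch that sends a chart image and a question to a vision-capable model. It assumes the OpenAI Python client; the model name, file name, and question are placeholders, and the papers above use far more systematic task batteries.

```python
# Minimal "LLM as chart reader" sketch: ask a vision model a question about a chart image.
import base64
from openai import OpenAI

client = OpenAI()

def ask_about_chart(image_path: str, question: str) -> str:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    response = client.chat.completions.create(
        model="gpt-4o",  # any vision-capable chat model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

print(ask_about_chart("scatterplot.png",
                      "Is there a positive correlation between the two variables?"))
```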
User-Driven Output Verification
LLMs often make mistakes in data-related tasks such as data transformation and visual mapping, so user interfaces that help data analysts verify the generated output are essential. These papers offer infrastructure and UI support for this kind of verification; a toy example of one simple check appears below.
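As a toy illustration of what such a check might look like, here is a small Python sketch that assumes the generated output is a Vega-Lite spec and the data is a pandas DataFrame, and flags encoded fields that do not exist in the data. It only sketches the idea and is not drawn from any particular system.

```python
# Toy verification helper: surface encoded fields that are missing from the data.
import pandas as pd

def unknown_fields(spec: dict, df: pd.DataFrame) -> list[str]:
    """Return encoded field names that are not columns of the data."""
    encoded = [enc.get("field")
               for enc in spec.get("encoding", {}).values()
               if isinstance(enc, dict) and enc.get("field")]
    return [f for f in encoded if f not in df.columns]

df = pd.DataFrame({"neighborhood": ["A", "B"], "price": [120, 90]})
spec = {"mark": "bar",
        "encoding": {"x": {"field": "neighbourhood", "type": "nominal"},  # spelling the LLM got wrong
                     "y": {"field": "price", "aggregate": "mean"}}}
print(unknown_fields(spec, df))  # -> ['neighbourhood']; surface this to the analyst
```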
Evaluation and Benchmarks
Evaluating LLMs’ charting capabilities at scale is crucial for validating novel systems and techniques for AI-driven data visualization. These papers provide evaluation methods and benchmark dataset creation to assess the performance of LLM-based data visualization systems; a tiny example of benchmark-style scoring follows the list below.
Automated Data Visualization from Natural Language via Large Language Models: An Exploratory Study
VisEval: A Benchmark for Data Visualization in the Era of Large Language Models
Natural Language Dataset Generation Framework for Visualizations Powered by Large Language Models
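To hint at what benchmark-style scoring can look like, here is a tiny sketch that compares the encodings of a generated Vega-Lite spec with a gold reference and reports field-level agreement. Real benchmarks such as VisEval use much richer criteria; this is purely illustrative.

```python
# Toy benchmark-style scoring: field-level agreement between generated and gold specs.
def encoding_fields(spec: dict) -> set[tuple[str, str]]:
    """Extract (channel, field) pairs from a Vega-Lite-like spec."""
    return {(ch, enc.get("field"))
            for ch, enc in spec.get("encoding", {}).items()
            if isinstance(enc, dict) and enc.get("field")}

def score(generated: dict, gold: dict) -> float:
    """Fraction of gold (channel, field) pairs reproduced by the generated spec."""
    gold_enc = encoding_fields(gold)
    return len(gold_enc & encoding_fields(generated)) / len(gold_enc) if gold_enc else 1.0

gold = {"encoding": {"x": {"field": "month"}, "y": {"field": "sales"}}}
gen  = {"encoding": {"x": {"field": "month"}, "y": {"field": "revenue"}}}
print(score(gen, gold))  # -> 0.5
```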
Understanding Real-World Use
To assess the role of LLMs in data visualization, it is essential to understand how they are used in practice. These studies examine how people use LLMs for data-related tasks, including the challenges people face, the opportunities LLMs open up, and the strategies people employ.
If you have recommendations for topics or specific papers to add, let me know!
Enjoy this list and look out for more posts on this topic. I will have more to report as my university course unfolds.

