Some Guidance on Data Visualization
The importance of developing a solid data visualization skillset cannot be overstated. The ability to translate your analysis into an effective visual that others can quickly interpret can be the difference between leaving your audience with meaningful insights and leaving them confused. This sounds like a straightforward task, but what makes sense to us often isn’t as interpretable to an audience that is further removed from the subject matter. However, there is some general guidance you can follow to start thinking like your audience and craft more meaningful visuals.
Before you can delve into creating a visual, it’s important to define your strategy. Understanding your purpose in creating the graph or chart will help guide you through the entire process while keeping your audience in mind. The Harvard Business Review article “Visualizations That Really Work” suggests asking yourself two questions to start thinking visually:
1. Is the information conceptual or data-driven?
2. Am I declaring something or exploring something?
The first question is about whether you’re visualizing qualitative or quantitative information. Think of the difference between showing an organizational structure and plotting revenues. The second question is about the purpose behind the activity, and it also helps determine the audience. If you’re declaring something, you may be giving a formal presentation about your findings. The audience in this case may be more high-level and therefore require visualizations that are succinct and light on technical jargon. In contrast, exploratory data visualizations help you understand underlying patterns or trends in your data. They may be for yourself during an exploratory data analysis process, or for a small group. This work is much more informal, can go into more depth, and suits a more technical audience.
In addition to defining your purpose, it’s almost always necessary to put yourself in a “less is more” mindset. It may be tempting to make a visualization more intricate by combining several analyses, but graphs and charts become confusing quickly. If it takes the audience more than a couple of seconds to grasp the idea behind the visual, it’s ineffective, and you should try to present the information in a different way.
Hard and Fast Rules
Along with being a bit more strategic in developing a visual mindset, there are some general rules to follow as well. IBM divides charts into categories based on their visual goals. Here’s a quick recap of those groups:
- Trends: Show trends with line charts, area charts, histograms and stream charts. These charts are used to track changes over time, so if there’s a time component to your data, consider using one of these visuals.
- Comparison: Show a comparison between groups with bar charts, bubble charts and radar graphs.
- Part to whole: Show a breakdown of subcategories in a group with pie charts, stacked bar charts, stacked area charts and tree maps.
- Correlations: Correlations can be shown with heatmaps and scatter plots.
- Relationships and Connections: These charts can be used to show hierarchies. Tree diagrams and alluvial diagrams are best for these purposes.
- Maps: Maps are useful when geography is an important factor in the data. These include choropleth maps and connecting lines.
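The groupings above can be sketched as a simple lookup table, which may be handy as a starting checklist when choosing a chart. This is only a minimal illustration: the names `CHARTS_BY_GOAL` and `suggest_charts` are hypothetical, and the table just mirrors the list above rather than any official IBM tool or API.

```python
# Hypothetical lookup table mirroring the chart categories listed above.
# The keys and chart names come from the list; nothing here is an IBM API.
CHARTS_BY_GOAL = {
    "trends": ["line chart", "area chart", "histogram", "stream chart"],
    "comparison": ["bar chart", "bubble chart", "radar graph"],
    "part to whole": ["pie chart", "stacked bar chart", "stacked area chart", "tree map"],
    "correlations": ["heatmap", "scatter plot"],
    "relationships and connections": ["tree diagram", "alluvial diagram"],
    "maps": ["choropleth map", "connecting lines"],
}


def suggest_charts(goal):
    """Return candidate chart types for a visual goal (case-insensitive)."""
    key = goal.strip().lower()
    if key not in CHARTS_BY_GOAL:
        known = ", ".join(sorted(CHARTS_BY_GOAL))
        raise ValueError(f"Unknown goal {goal!r}; expected one of: {known}")
    # Return a copy so callers cannot modify the shared table.
    return list(CHARTS_BY_GOAL[key])


if __name__ == "__main__":
    print(suggest_charts("Part to whole"))
```

A lookup like this only narrows the options. Picking the final chart still depends on the audience and purpose discussed earlier.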
A Few Examples
To end, let’s look at some examples of effective and not-so-effective visualizations:
There are a couple of things that could be improved about Visual A. The first is the use of color: it’s not immediately clear why the colors differ between genders but not between age ranges. The percentages also don’t sum to 100, which may be due to some respondents being unsure, but it would be good to state that or include a third column. The sizes of the bubbles are also a bit misleading, as they don’t seem to be proportional to the percentages listed.
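Two of these issues, bubble sizes that don’t match the values and percentages that don’t add up to 100, can be checked with a little arithmetic. The sketch below is one possible approach, not the method used for Visual A: the function names `bubble_radius` and `unaccounted_share` are hypothetical, and the sample percentages are made up for illustration. It assumes that readers judge a bubble by its area, so the radius should scale with the square root of the value.

```python
import math


def bubble_radius(value, max_value, max_radius):
    """Radius that makes a bubble's AREA proportional to its value.

    Area grows with the square of the radius, so the radius is scaled by
    the square root of the value's share of the maximum.
    """
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    return max_radius * math.sqrt(value / max_value)


def unaccounted_share(percentages):
    """Percentage points missing from 100, e.g. 'unsure' responses."""
    return 100 - sum(percentages)


if __name__ == "__main__":
    # Made-up survey shares, for illustration only.
    shares = [60, 30]
    for share in shares:
        print(share, round(bubble_radius(share, max(shares), 10), 2))
    print("unaccounted:", unaccounted_share(shares))
```

If the unaccounted share is not zero, the chart could label it explicitly (for example as “unsure”) instead of leaving the reader to guess.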
Visual B is a better implementation of the bubble chart. The colors make more sense, since each one represents a different movie, and the size and placement of the shapes make it easy to compare production budgets over time.
Visual C has a bit too much going on, and by the time you sift through it, it might have been quicker to just look at the underlying data in a table. One of the goals of a visual is to summarize information quickly, and this chart tries to do too much by covering three different questions, each broken down by political party and approval percentage. Displaying the percentages in a box plot format also makes them more difficult to interpret.
Visual D is a better representation of American sentiment regarding troop withdrawal. The percentages are easy to read since they’re in a pie chart format, and the colors help in quickly determining approval. It also focuses on only one overall question and then breaks it down by political party, so as not to clutter the chart.