Unimpressed With Your Scatter and Bar Plots? Give These Four Classic Alternatives A Try.
Motivation
If you have ever visualized your data (which I am sure you have), the first plot type that came to mind was probably a scatter, bar, or line plot.
While these plots do cover a wide variety of visualization use cases, I have seen many data scientists overuse them, even where they are a poor fit.
Although they are simple and easy to interpret, they are not the right choice to cover every possible use case.
Therefore, in this post, I will demonstrate a few alternatives to these popular plots and explain when each of them is the better choice.
#1 Hexbin Plot
Scatter plots are extremely useful for visualizing the relationship between two numerical variables.
But when you have, say, thousands of data points, scatter plots can get too dense to interpret.
Hexbins can be a good choice in such cases. As the name suggests, they bin the area of a chart into hexagonal regions.
Each region is assigned a color intensity based on the method of aggregation used (the number of points, for instance).
Hexbins are especially useful for understanding the spread of data, and they are often considered an elegant alternative to a scatter plot.
Binning also makes it easier to identify clusters and see patterns in the data.
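As a minimal sketch of the idea above, here is how a hexbin plot could be drawn with matplotlib's `hexbin`. The synthetic data, the `gridsize` of 40, and the colormap are my own illustrative choices, not values taken from any particular dataset.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt

# Synthetic, correlated data (an assumption, purely for illustration).
rng = np.random.default_rng(0)
n_points = 20_000
x = rng.normal(size=n_points)
y = 0.5 * x + rng.normal(scale=0.7, size=n_points)

fig, ax = plt.subplots(figsize=(6, 5))

# Bin the plane into hexagons; by default each hexagon is colored by the
# number of points that fall inside it. mincnt=1 hides empty hexagons.
hb = ax.hexbin(x, y, gridsize=40, cmap="viridis", mincnt=1)

fig.colorbar(hb, ax=ax, label="points per hexagon")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Hexbin plot of 20,000 points")

# Per-hexagon counts, in case you want to inspect the aggregation.
counts = hb.get_array()
```

Call `plt.show()` or `fig.savefig("hexbin.png")` to view the result. To aggregate something other than the point count, `hexbin` also accepts a value array `C` together with `reduce_C_function` (for example `np.mean`).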
#2 2D Density Plot
As we noticed above, when the number of data points is large, it is very difficult to read the distribution of the data from a scatter plot.
Like a hexbin plot, which depicts the density of points, a 2D density plot illustrates the distribution of a set of points in a two-dimensional space.
A contour is created by connecting points of equal density. In other words, every point along a single contour line has the same estimated density of data points.
So if a scatter plot is hard to interpret, a 2D density plot can be the way to proceed.
It is especially useful when you want to identify patterns and outliers in the data. Scatter plots, on the other hand, are mainly used to depict the relationship between two numeric variables.
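To make the contour idea concrete, here is a rough sketch that estimates the density on a grid with a simple Gaussian kernel and then draws lines of equal density with matplotlib's `contour`. The synthetic data, the hand-picked bandwidth of 0.3, and the grid size are assumptions for illustration; in practice, tools such as `seaborn.kdeplot` or `scipy.stats.gaussian_kde` choose the bandwidth for you.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line for an interactive window
import matplotlib.pyplot as plt

# Synthetic, correlated data (an assumption, purely for illustration).
rng = np.random.default_rng(0)
n_points = 2_000
x = rng.normal(size=n_points)
y = 0.6 * x + rng.normal(scale=0.8, size=n_points)

# Evaluation grid with a margin around the data.
gx = np.linspace(x.min() - 1, x.max() + 1, 80)
gy = np.linspace(y.min() - 1, y.max() + 1, 80)

def gaussian_kernel(grid, samples, bandwidth):
    """1D Gaussian kernel values, shape (len(grid), len(samples))."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    return np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))

bandwidth = 0.3  # hand-picked smoothing width (assumption)
kx = gaussian_kernel(gx, x, bandwidth)
ky = gaussian_kernel(gy, y, bandwidth)

# Product-kernel density estimate: the average over all points of
# Kx * Ky. Rows of `density` follow gy, columns follow gx.
density = ky @ kx.T / n_points

fig, ax = plt.subplots(figsize=(6, 5))
# Each contour line connects grid locations with the same estimated density.
cs = ax.contour(gx, gy, density, levels=8, cmap="viridis")
fig.colorbar(cs, ax=ax, label="estimated density")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("2D density plot")
```

Call `plt.show()` or `fig.savefig("density.png")` to view the result. Swapping `contour` for `contourf` gives filled bands instead of lines, which some readers find easier to scan.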
#3 Dot Plot
Bar plots are extremely useful for comparing a continuous value across the categories of a categorical variable.
But when you have many categories to depict, they can get too dense to interpret.
Moreover, in a bar plot with many bars, we rarely pay attention to the full length of each bar. Instead, we mostly look at the endpoint of each bar, which denotes its total value.