23 July 2020

One simple trick to improve (some) bar graphs

I ran across this figure because it was being used as a good example.

Predictive power (root mean square error: RMSE) of edaphic (dark grey), topo‐climatic (pale grey) and overall (white) predictors calculated on the diversity of protist operational taxonomic units from the overall community and nine broad taxa retrieved from 178 meadow soils in the Swiss western Alps. The RMSE were calculated on 100 cross validation of Generalized Additive Models performed with 20% of the samples as test dataset. The letters on the top of the boxplots represent significantly different groups according to a multiple comparison mean rank sums test (Nemenyi test p < .05) for each of the edaphic, topo‐climatic and overall variables


This is a journal figure, not a poster figure, so there are many things I would want to do differently on a poster. But I just want to focus on one thing:

The most important things to read on this figure are the labels on the x-axis. You’re probably crooking your neck right now trying to do so. Readers should not have to contort themselves to read your graph.

For that matter, all the y-axis labels also require you to crook your neck to read them. Even the numbers, which have no business being vertically aligned.

But luckily, the solution is simple: rotate it!

[Figure: the same graph as above, rotated so that the taxon labels and axis text read horizontally.]

Suddenly, this graph becomes easy to scan. No information is lost. The data is not harder to compare. And by definition, the graph takes up the same amount of space on the page. Some tweaking might be required to fit it to column widths, though.

There are a couple more changes besides rotating the entire image. With a simple rotation, the old y-axis scales end up on the top, so those got moved to the bottom. The legend and the comparison letters, which were the only things oriented horizontally in the original, got “unrotated” back to horizontal.

If your labels are too wide to be horizontally aligned, consider turning your vertical column graph into a horizontal bar graph.
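If you build your graphs in code, the switch can be a small one. Here is a minimal sketch in Python, assuming matplotlib is installed. The function names, the labels and the numbers are all made up for illustration; none of them come from the paper.

```python
def horizontal_layout(labels, values):
    """Return (labels, values) reversed, ready for a horizontal bar graph.

    Matplotlib's barh() draws the first bar at the bottom, so reversing
    keeps the first category at the top, where readers start scanning.
    """
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    return list(reversed(labels)), list(reversed(values))


def plot_horizontal_bars(labels, values, value_label, path):
    """Save a horizontal bar graph whose category labels read horizontally."""
    # Imported here so the layout helper above works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # draw to a file; no display window needed
    import matplotlib.pyplot as plt

    ordered_labels, ordered_values = horizontal_layout(labels, values)
    fig, ax = plt.subplots(figsize=(6, 4))
    # barh() puts categories on the y-axis, so long names stay horizontal.
    ax.barh(ordered_labels, ordered_values, color="lightgrey", edgecolor="black")
    ax.set_xlabel(value_label)  # the value scale now runs along the bottom
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


if __name__ == "__main__":
    # Hypothetical categories and values, invented for this example.
    example_labels = [
        "A category with a long name",
        "Another quite long category name",
        "Short one",
    ]
    example_values = [4.2, 3.1, 5.6]
    plot_horizontal_bars(example_labels, example_values,
                         "Value (made-up units)", "horizontal_bars.png")
```

The only real change from a column graph is calling barh() instead of bar() and labelling the x-axis instead of the y-axis. Reversing the order is a design choice, so that the list reads top to bottom the way the columns read left to right.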

You don’t want to do this in every case. If you are plotting time as a variable, it is almost always better to keep time on the x-axis, because that is such a standard way of portraying time in graphs. Fortunately, you can often use abbreviations for time (“Jan” or even “J” instead of “January”) to avoid vertical text.
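If your tick labels come from code, you can generate the shorter forms rather than typing them out. This is a small sketch using Python's standard calendar module; the function name and the style options are my own invention, and it assumes the default English locale.

```python
import calendar


def month_tick_labels(style="short"):
    """Return twelve month labels for a time axis.

    style="full"    gives "January", "February", ...
    style="short"   gives "Jan", "Feb", ...
    style="initial" gives "J", "F", ...
    Names come from the calendar module, so they follow the active locale
    (English unless the program has changed it).
    """
    months = range(1, 13)
    if style == "full":
        return [calendar.month_name[m] for m in months]
    if style == "short":
        return [calendar.month_abbr[m] for m in months]
    if style == "initial":
        return [calendar.month_name[m][0] for m in months]
    raise ValueError("style must be 'full', 'short' or 'initial'")


if __name__ == "__main__":
    for style in ("full", "short", "initial"):
        print(style, month_tick_labels(style))
```

Single initials are ambiguous on their own (three months start with “J”), so they only work when all twelve appear in order.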

I can’t help you with those taxonomic names, though.

And the moral of the story is: English text wants to be horizontal!

Reference

Seppey CVW et al. 2020. Soil protist diversity in the Swiss western Alps is better predicted by topo‐climatic than by edaphic variables. Journal of Biogeography 47:866–878. https://doi.org/10.1111/jbi.13755