What’s really wrong with bar graphs of mean values: variable and inaccurate communication of evidence on three key dimensions

Authors: Jeremy Bennet Wilmer, Sarah Horan Kerns
Year of publication: 2022
DOI: 10.31219/osf.io/av5ey
Description: Human behavioral data are frequently communicated via bar graphs of mean values. Such “mean bar graphs” are presumed to communicate empirical results effectively to non-experts. Yet direct evidence for or against this presumption remains sparse. Here, we ask how a set of widely consumed scientific mean bar graphs is interpreted by a demographically diverse sample of 133 participants. We use four mean bar graphs of research results, taken from major introductory psychology textbooks, which vary in content (developmental, clinical, social, cognitive), form (unidirectional bars, bidirectional bars), visual aesthetics (four different textbooks’ look and feel), data type (objective performance, survey ratings), and study design (experimental, non-experimental). Participants created a detailed sketch of each graph, adding datapoints for their best guess of individual values that were averaged to produce the mean values. Drawn data values were then analyzed as if they were real data. Results were examined for deviations from the ground truth of the published data and for variability between participants. On three separate dimensions (location of the mean, variability around the mean, and normality of distribution shape), we found large, systematic deviations from ground truth and high inter-participant variability. Together, the combination of systematic deviations and inter-participant variability yielded common, extreme misunderstandings, or fallacies, on all three dimensions. We call these fallacies: (1) the Bar-Tip Limit Error: most or all data plotted inside the bar, as if the bar’s tip represented the outside limit of the data rather than its balanced center point; (2) the Dichotomization Fallacy: little to no overlap between distributions that should show substantial overlap; (3) the Uniformity Fallacy: data distributed uniformly over its entire range, absent the tails that were present in the real data.
These results replicated across the four varying stimulus graphs, suggesting that they are not limited to specific graph form, content, visual aesthetic, data type, or study design. We conclude that the choice to communicate human behavioral data via a mean bar graph carries with it at least two major risks. First, different viewers may walk away from the same graph with widely divergent interpretations of the presented evidence. Second, interpretations may deviate systematically, and, for many viewers, to an extreme degree, from ground truth.
Database: OpenAIRE