With graphics being so easy to add to documents these days, why don’t we show more histograms instead of representing very complicated data with one or two numbers (e.g., average and standard deviation)? Sure, if your data is normally distributed, then those two numbers really are a great distillation of the data. However, lots of things aren’t normally distributed, and I’m lobbying for more use of histograms instead of (or, I suppose, in conjunction with) the numeric summaries of the data set.
Here’s the example that got me thinking about this today. At my school, student evaluations of instructors are very important. We use a seven-point Likert scale on questions such as “The instructor encourages me to learn actively” and “This course was a valuable learning experience.” Quite often, reviews of faculty are peppered with means, and occasionally standard deviations, of the evaluation data for the reviewed faculty member. However, the data is not normally distributed at all! It can be bimodal (some hate me, some love me) or highly skewed in other ways.
I’ve been working lately on an interface for our evaluations to help people on the tenure and promotion committee make wise recommendations. Instead of making them click through to each course, I’ve made a nice table that shows the class average on each question, with one row for each course the faculty member has taught. But while thinking about showing histograms in addition to averages, I hit upon using PHP to dynamically create SVGs of the histograms. Here’s what it looks like:
I feel like you learn a lot by looking at the (tiny) histograms. Take the three “4.44”s that are in the third class. The middle one is much more bimodal than the other two.
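To make the “same average, different story” point concrete, here is a small sketch in Python. The response counts are invented for illustration (they are not the real evaluation data); they are simply chosen so that both sets average 4.44 on a seven-point scale.

```python
from statistics import mean, pstdev


def expand(counts):
    """Turn per-score counts (index 0 = score 1) into a flat list of responses."""
    return [score for score, n in enumerate(counts, start=1) for _ in range(n)]


# Hypothetical counts for scores 1..7, made up for this example.
unimodal = [0, 0, 1, 4, 3, 1, 0]  # most students near the middle
bimodal = [2, 1, 1, 0, 0, 2, 3]   # some hate it, some love it

for name, counts in (("unimodal", unimodal), ("bimodal", bimodal)):
    responses = expand(counts)
    print(f"{name}: mean={mean(responses):.2f} "
          f"sd={pstdev(responses):.2f} counts={counts}")
```

Both lines report a mean of 4.44. The standard deviation hints that the second set is more spread out, but only the counts show that it has two humps.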
What am I lobbying for? I’d love it if many more reports, journal articles, and newspaper stories did this kind of thing. Generating and including the graphics is really not that hard, and I think it communicates the whole story, not just a distilled version.
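As a rough illustration of how little code a tiny inline histogram needs, here is a minimal sketch. It is written in Python rather than the PHP used for the table above, and it is not the code behind that table: the function name `histogram_svg`, the sizes, and the fill color are all assumptions made for the example. It expects a non-empty list of counts, one per score.

```python
def histogram_svg(counts, bar_width=6, gap=1, height=24):
    """Return a small inline SVG bar chart for a non-empty list of per-score counts."""
    tallest = max(counts) or 1  # avoid dividing by zero when every bin is empty
    width = len(counts) * (bar_width + gap) - gap
    bars = []
    for i, n in enumerate(counts):
        bar_height = height * n / tallest
        x = i * (bar_width + gap)
        y = height - bar_height  # SVG's y axis points down, so offset bars from the top
        bars.append(
            f'<rect x="{x}" y="{y:.1f}" width="{bar_width}" '
            f'height="{bar_height:.1f}" fill="steelblue"/>'
        )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
        + "".join(bars)
        + "</svg>"
    )


# Made-up counts for scores 1..7; paste the result into an HTML table cell.
svg = histogram_svg([2, 1, 1, 0, 0, 2, 3])
print(svg)
```

Scaling every bar to the tallest bin keeps the shapes comparable at a glance, at the cost of hiding how many students responded, so the count probably belongs next to the picture.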
One downside is that the data becomes hard to describe in words. I was showing this to my partner and trying to say “this one is different from that one,” and I had to point to them. I couldn’t easily describe them, so I resorted to saying “the 4.44 one . . .” and so on. I suppose this backs up my point that the data sets are complex and resist easy description, but I know my colleagues on the tenure and promotion committee like to discuss these evaluations at length, and they’ll need a way to refer to them.
Here’s another interesting point from a friend of mine (who’ll remain anonymous):
Averages and SDs are **NOT** appropriate for categorical data. They assume the “distance” between each category is equal, as if the numerical choices were locations on a spatial scale. They are not. You’ve got two choices: Report number of responses in each bin (as you’re playing with); or turn to Rasch analysis, which is designed for exactly this problem. But it’s not for the faint of heart…
Your thoughts? Here are some starters for you:
- This is great. I totally agree that representing all of the data is much better than any distillations. I would even go further by suggesting . . .
- This is dumb. We use the distillations for several very good reasons . . .
- Why do you use evaluation data at all? They’ve clearly been shown to be problematic.
- Why a 7-point Likert scale? How about a 2-point Love-ert scale?
- How did you make those SVG histograms in PHP?
- PHP?!!? I’m never reading this blog again.
- Wait, I thought you only knew how to use Mathematica.