If I had to pick a single paper from all the VisWeek papers I’ve read so far, it would easily be “Graphical Inference for InfoVis”, by Wickham, Cook, Hofmann and Buja.
What I like about this paper is that it neatly shows the basic difference between the visualization mindset and the statistical mindset. Visualization folks have the following meta-strategy: “Is there any way that I can build a picture in which the data will tell me something?” Statistics folks, on the other hand, tend to think: “Is there any way that these results about the data are fooling me?” While visualization is immensely helpful in actually finding patterns in the data, it is just as important that we do not fool ourselves, and the fact that VisWeek this year is bringing back the “Vis Lies” session is a testament to how aware we all are of this problem, even if only in the back of our minds.
This paper presents two neat experimental protocols which give you the best of both worlds: you use visualizations to bring out patterns, while making sure that the patterns you do find are in the data itself and not an artifact of the way you decided to encode your variables. In passing, the paper also has a great explanation of statistical testing in general, and the single best, accurate description of what p-values really are (if you’ve never heard a good explanation of them, you’re likely to get them wrong).
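To make the idea concrete, here is a minimal sketch of how I understand one of the protocols, the “lineup”: hide the plot of the real data among plots of data generated under a null hypothesis, and see whether an observer can pick it out. The function names, the choice of 20 panels, and the use of permutation to generate the null data are my assumptions for illustration, not code from the paper.

```python
import random


def make_lineup(x, y, n_panels=20, seed=None):
    """Build a lineup: one real (x, y) panel hidden among permuted decoys.

    Assumption: the null hypothesis is "x and y are unrelated", so each
    decoy keeps x fixed and shuffles y, which destroys any association.
    Returns (panels, real_position); each panel is an (x, y) pair of lists.
    """
    rng = random.Random(seed)
    real_position = rng.randrange(n_panels)
    panels = []
    for i in range(n_panels):
        if i == real_position:
            panels.append((list(x), list(y)))
        else:
            shuffled = list(y)
            rng.shuffle(shuffled)
            panels.append((list(x), shuffled))
    return panels, real_position


def lineup_p_value(n_panels):
    """Chance that a single observer picks the real panel by luck alone."""
    return 1.0 / n_panels


x = list(range(30))
y = [2 * v + random.Random(v).gauss(0, 5) for v in x]  # made-up data
panels, real_position = make_lineup(x, y, n_panels=20, seed=1)
```

You would then draw all 20 panels with the same encoding and show them to someone who has not seen the data. If they pick out the panel at `real_position`, the chance of that happening by luck is `lineup_p_value(20)`, that is, 1 in 20.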
I can’t recommend this paper strongly enough. If I had to decide on a single paper this year which I think people will easily remember in 10 years, this would be it.