Yesterday I went to David McCandless’s talk “Knowledge is Beautiful” at the British Library. It’s part of a series of events in conjunction with the Beautiful Science exhibition and I felt lucky to have snaffled up a last-minute spare ticket as the talk was sold out.
If you work in data and you aren’t already salivating over McCandless’s book and glorious interactive website Information is Beautiful, stop reading this post right now and go check it out. Have you gone? Good. Welcome back.
McCandless is a thought leader in the fields of data journalism and information design. It’s partly him that we have to thank for those increasingly popular data visualisations employed in publications like the Guardian and the New York Times to try and distil immense amounts of data into some sort of intuitive format that can be meaningfully digested. (I suppose that means that we also have McCandless to thank, in a roundabout way, for some of the most egregious examples of poor information design, because if it weren’t popular, people wouldn’t try to jump on the bandwagon and keep failing so spectacularly. Some of the best examples are captured over here on wtfviz.net. It says a lot about my life that I clandestinely pass time by reading updates to this website. It says even more that doing so frequently reduces me to actually, genuinely crying with laughter.)
Part of McCandless’s talk last night touched on the distinctions (or the gradations) between data, information, and knowledge. He is working on a graphical way of representing this for his next book, with data being the most unstructured, information involving slightly more organisation and some structured links between different data sets, and knowledge being a fully integrated system that incorporates contextual information around the data.
During the Q&A I asked McCandless whether he thought qualitative data can be described in the same way. His answer was that it could, and that qualitative information lends itself very well to visualisations (albeit of a different sort than designs of quantitative data). He showed us this infographic of the left and right wings of the political spectrum as an example.
I think that the output of qualitative data, while it may be visual, can’t really be described as a visualisation in the same way. A design, or a piece of art, perhaps, but I don’t think what would classically be called a piece of data visualisation can emerge from qualitative data. McCandless was kind enough to discuss this with me afterwards. He suggested that even with qualitative data, there is some sort of organisational structure which can be fleshed out in a visual map of sorts, as with the work he did on the political spectrum piece. He also mentioned categorising survey questions. I would argue, though, that as soon as you get into that sort of categorisation, you’re already straying into a quantitative framework.
The fundamental distinction between quantitative and qualitative data is that qualitative data is non-numeric. As soon as you start to apply distinct numeric categories to it (even things like coding unstructured interviews to allow common themes or code words to be categorised into groups), you’re already quantifying it. I’m not arguing that this is a bad thing or that data which began life as qualitative shouldn’t be analysed in this way, just that it’s bringing in a whole different range of methodological techniques. Granted, using only traditional qualitative techniques there is a limit to the amount of data that can be collected, retained, and written up. This is why hybrid techniques which allow some structured quantitative elements to organise data which may have been collected in a qualitative framework are so useful: they enable the researcher to handle a much bigger amount of data. (There are also some really interesting digital qualitative techniques emerging right now which allow bots and spiders to collate social media data into what are essentially blogs that look as though they were written by a single person, allowing the qualitative researcher to handle a huge amount of aggregate data but still drill down to the juicy bits, like personal playlists and Foursquare check-ins and things. But that’s beyond the scope of this blog.)
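To make the point about coding concrete, here is a minimal Python sketch of what happens when interview excerpts are tagged with themes. Everything in it is an assumption for illustration: the excerpts are invented, and the `CODEBOOK`, `code_excerpt` and `theme_counts` names are hypothetical rather than taken from any real study or tool. Real coding is done by a researcher reading the text, not by keyword matching; the sketch only shows that once themes are assigned, counts fall out almost immediately.

```python
from collections import Counter

# Hypothetical codebook: theme -> code words (invented for illustration).
CODEBOOK = {
    "cost": {"price", "expensive", "afford"},
    "trust": {"trust", "reliable", "honest"},
}

# Invented interview excerpts standing in for unstructured transcripts.
EXCERPTS = [
    "I just can't afford it any more",
    "They seem honest, so I trust them",
    "Too expensive, and not very reliable",
]


def code_excerpt(text, codebook=CODEBOOK):
    """Return the set of themes whose code words appear in the text."""
    words = {w.strip(".,!?").lower() for w in text.split()}
    return {theme for theme, keys in codebook.items() if words & keys}


def theme_counts(excerpts, codebook=CODEBOOK):
    """Count how many excerpts were tagged with each theme."""
    counts = Counter()
    for text in excerpts:
        counts.update(code_excerpt(text, codebook))
    return counts


if __name__ == "__main__":
    # The qualitative excerpts have now become a small frequency table.
    for theme, n in sorted(theme_counts(EXCERPTS).items()):
        print(theme, n)
```

The moment `theme_counts` runs, the material is a frequency table that could be drawn as a bar chart, which is the sense in which categorisation already strays into a quantitative framework.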
The real value of data visualisation, in my opinion (which I had the actual hubris to argue with David McCandless), is to breathe life into numerical data which is unwieldy or non-intuitive. Qualitative data, on the other hand, may reveal unexpected truths about the subjects of its study, but it can rarely if ever be described as non-intuitive. There are many other challenges faced by qualitative researchers (scope, researcher bias, getting locked in a bedroom on a half-sunken boat in the middle of the Nile) and of course there are many challenges that all researchers share regardless of the methods they employ. But the output of qualitative data, even if visual, couldn’t necessarily be replicated. With numerical data, the same set can be used to generate many types of charts and graphs but ultimately they’re all representing the same thing. Qualitative data is like a kaleidoscope: shake it and each time you’ll find yourself looking at something unique, something that is particular to the moment between the researcher and the people or objects being researched, something that cannot be re-examined through the same eyes once the moment is past. To me, that is no less useful than quantitative data, and no less beautiful.