May 8, 2012
Since the beginning of this blog, we’ve been talking about ways to re-use and mash up data that already exists online. This is the core of what the programmable web is about, and there are many potential data sources to use. Figuring out ways to use them that advance skepticism and critical thinking is the key.
Among the others who noticed the utility of re-using existing data this way were journalists. This is because at the same time these fantastic web APIs and tools have become available, governments and other public institutions have moved to open up many of their massive public-domain databases for use by the public. When these datasets contain information that might bear on policy issues and decisions, they are potential gold mines for journalists.
This has kicked off a trend called data-driven journalism. Simply put, it means journalists use data mining and other data analysis techniques to find the basis for stories. I think skeptics could learn from the techniques of data-driven journalism, and use them for our purposes too. Indeed, I’ve done some very small experiments in that direction in my metrics articles.
Beware: it’s not the easiest thing in the world to get right, and there are many ways to be tripped up. But I think that, used carefully, some of these techniques will be helpful to skeptics.
So let’s explore what it would mean to do data-driven skepticism.