Data undermining privacy
Over the past few years there seems to be a growing trend to collect more and more data about people, place it in databases and then expect data mining to work its magic. The end goal is always presented as justifiable, e.g. stopping terrorism.
Whether a person is innocent or guilty is treated as irrelevant: just record everyone’s data. And if they are innocent, as the vast majority are, well, no harm done.
This approach has two flaws: privacy concerns and the base rate fallacy.
First, privacy concerns always arise from collecting data when there is no obvious need. It was therefore good to see Ontario’s Information and Privacy Commissioner, Ann Cavoukian, ordering a stop to the practice of mining “extensive” information from people selling goods to second-hand stores, cautioning that the practice is a slippery slope toward an Orwellian society where authorities could misuse private data.
She went on to say, “You’re collecting information on law-abiding citizens, which in a free and democratic society, you only do when you have a suspicion of wrongdoing. Here, we’re . . . treating everyone as a potential criminal.”
Secondly, this approach ignores the volume and impact of false positives. There is a great article by Bruce Schneier called Data Mining for Terrorists that explores in detail why it’s simply not possible to prevent rare events, like a terrorist attack, by data mining. This is the base rate fallacy: when the thing you are looking for is extremely rare, even a very accurate test flags far more innocent people than guilty ones.
This article concludes with one of my all-time favourite quotes, “It’s a needle-in-a-haystack problem, and throwing more hay on the pile doesn’t make that problem any easier.”
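To make the base rate fallacy concrete, here is a rough sketch of the arithmetic in Python. Every number in it is my own assumption, picked only for illustration (a population of 300 million, 1,000 real threats, and a system that is 99% accurate in both directions); none of them come from the article or from any real system. The function name is also just a label I made up.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Chance that a flagged person is a real threat, via Bayes' theorem."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


# Hypothetical figures, assumed purely for illustration.
population = 300_000_000      # people whose data is mined
actual_threats = 1_000        # real threats hidden in that population
sensitivity = 0.99            # share of real threats the system flags
false_positive_rate = 0.01    # share of innocent people it flags anyway

prevalence = actual_threats / population
ppv = positive_predictive_value(prevalence, sensitivity, false_positive_rate)

true_hits = actual_threats * sensitivity
false_alarms = (population - actual_threats) * false_positive_rate

print(f"Real threats flagged:    {true_hits:,.0f}")
print(f"Innocent people flagged: {false_alarms:,.0f}")
print(f"Chance a flagged person is a real threat: {ppv:.4%}")
```

Under these assumed numbers the system flags roughly three million innocent people to find fewer than a thousand real threats, so a flagged person is almost certainly innocent. Adding more people's data grows the haystack much faster than it grows the number of needles.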