The technology that much of the world now uses daily is evolving at a remarkable rate. More than half of the global population now has access to the Internet, which makes communication across nations far easier and puts information within reach of people who once lacked access to even the basic facts found in a common encyclopedia. However, this connectivity comes at a price, usually paid in either money or privacy. Governments are widely known to collect data for all sorts of purposes, most often in the name of public safety. Much of the controversy over data mining persists because the practice is a relatively new, publicly visible use of the Internet, and research on it has only been carried out in recent years.
This gathering of loose information from the Internet, popularly called “data mining”, is being questioned today by privacy advocates and average citizens alike. The argument usually splits two ways: acceptance of government-initiated data mining, which trades some privacy for public safety, or outrage at data mining because of that same breach of privacy. This turmoil within such an enormous connected community has caused internal problems in some countries, most recently in the United States, where Congress questioned Facebook CEO Mark Zuckerberg. In the hearing, Zuckerberg was chastised over private Facebook user data ending up in the hands of the political consulting firm Cambridge Analytica. The central concern was that Cambridge Analytica held information on a great many U.S. citizens, including their political stances, demographics, and interests, during the 2016 presidential election. This did not sit well with the public once it was exposed, and people called the firm out for using the data to influence public behavior. The event sparked a great deal of debate and protest over the morality, ethics, and limits of data mining by any entity.
Aspects of data mining that sit well with the general public usually involve its use in producing more accurately targeted ads and search recommendations based on user input, and in tracking suspected criminal or terrorist activity through large databases like the FBI's “Sentinel” program. Beyond these online benefits, the FDA has used programs based on mined data to track the impact of things like drugs or food products before they even enter the public marketplace. Recently, scientists at Columbia and Stanford Universities used said program to analyze several drugs and found that two of them, paroxetine and pravastatin, were reported multiple times to have caused high blood sugar when taken together. Because these reports came from several cases in very different parts of the United States, there was little chance of piecing them together manually to discover the adverse effects. Data mining was simply a useful tool in this situation, as it has been in many others.
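The idea of pooling scattered reports can be illustrated with a minimal Python sketch. It is not the method the researchers or the FDA used; it only shows how grouping reports by the drugs they list can surface a pattern that no single report reveals. The sketch assumes the signal concerns reports that list both drugs together, and the function name `event_rate_by_drug_set` and every report in it are invented for illustration.

```python
from collections import defaultdict


def event_rate_by_drug_set(reports, event):
    """Map each drug combination to the fraction of its reports that mention the event."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for drugs, events in reports:
        key = frozenset(drugs)  # order of drugs in a report does not matter
        totals[key] += 1
        if event in events:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Invented toy reports, NOT real data: (drugs taken, events reported).
reports = [
    (["paroxetine"], ["nausea"]),
    (["paroxetine"], []),
    (["pravastatin"], ["muscle pain"]),
    (["pravastatin"], []),
    (["paroxetine", "pravastatin"], ["high blood sugar"]),
    (["pravastatin", "paroxetine"], ["high blood sugar"]),
    (["paroxetine", "pravastatin"], []),
]

rates = event_rate_by_drug_set(reports, "high blood sugar")
for drug_set, rate in rates.items():
    print(sorted(drug_set), round(rate, 2))
```

In this toy data the event appears only in reports that list both drugs, which is the kind of pattern that is hard to spot when the reports sit in different places.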
The other side of that coin is that many people see data mining as a tool for “state snooping” and “invasion of privacy”. They regard the government's collection of personal data as a totalitarian spying tactic and are absolutely against it. Perhaps the most common objection is that certain personal information, like questions you would ask only the Internet and no one else, could be gathered and used to destroy careers or tarnish public images. Ironically, a great many opponents of data mining willingly give their data to brokers and firms without a second thought. Those who enable geolocation to use GPS applications, click “I agree” on a social media site's privacy terms without reading them, or even play online mobile games are all simply, and legally, handing over their personal data. As a result, a great deal of discussion and work is going into limiting these applications' ability to mine data and into making their in-app warnings about it more open and clear.
Despite the positive and negative attention that data mining has recently attracted, a great deal of work is also being done to satisfy both sides of the argument. One example is privacy-preserving data mining, or PPDM for short. PPDM essentially gathers data under a set of constraints that keep programs from obtaining information that is too personal. For example, one PPDM technique lets the mining application gather the interests and pages a user has liked on Facebook but restricts it from obtaining demographic information such as gender, age, political affiliation, or religion. Other technical PPDM methods include classification mining, clustering, distributed privacy preservation, randomization, and cryptographic mining. Although some forms of PPDM have proven very useful in preserving privacy while still yielding data that can be used for marketing, informational, or commercial purposes, some forms of mining still occasionally reveal identifying data points. Specifically, when PPDM is used in international data mining for travel information, things like passport numbers and pictures are left out of the extracted data, but leftover details such as zip codes, birthdays, and phone call information could still be used to identify an individual.
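The two ideas in this paragraph, withholding sensitive fields and the risk from leftover details, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a real PPDM system: the field names, the sample records, and the helper functions are all hypothetical. The group-size check is in the spirit of k-anonymity, a technique the essay does not name.

```python
from collections import Counter

# Hypothetical field names, chosen for this illustration only.
SENSITIVE_FIELDS = {"gender", "age", "political_affiliation", "religion"}
QUASI_IDENTIFIERS = ("zip_code", "birthday")


def strip_sensitive(record):
    """Return a copy of the record without the sensitive fields."""
    return {key: value for key, value in record.items() if key not in SENSITIVE_FIELDS}


def generalize(record):
    """Coarsen the leftover details: 3-digit zip prefix and birth year only."""
    coarse = dict(record)
    coarse["zip_code"] = record["zip_code"][:3]
    coarse["birthday"] = record["birthday"][:4]
    return coarse


def smallest_group(records, keys=QUASI_IDENTIFIERS):
    """Size of the smallest group of records that share the same leftover details.

    A result of 1 means at least one person is unique on those fields.
    """
    groups = Counter(tuple(record[key] for key in keys) for record in records)
    return min(groups.values()) if groups else 0


# Invented sample records, NOT real data.
raw_records = [
    {"likes": ["hiking"], "age": 34, "religion": "unspecified",
     "zip_code": "10001", "birthday": "1990-01-01"},
    {"likes": ["chess"], "age": 34, "gender": "unspecified",
     "zip_code": "10002", "birthday": "1990-05-12"},
    {"likes": ["cooking"], "age": 33, "political_affiliation": "unspecified",
     "zip_code": "10003", "birthday": "1990-11-30"},
]

filtered = [strip_sensitive(record) for record in raw_records]
coarse = [generalize(record) for record in filtered]

print(smallest_group(filtered))  # 1: each person is still unique by zip code and birthday
print(smallest_group(coarse))    # 3: all three now share the same coarse values
```

Removing the sensitive fields alone leaves every sample record unique on zip code and birthday. Coarsening those two fields is one possible way to reduce that risk, at the cost of less precise data.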
Essentially, you can see data mining, like any other innovative tool, as either a problem for the public or a great benefit. Some believe that limits on government data collection are a barrier to methods that could otherwise prove to be a valuable public safety tool, while others think that their privacy is inherently more important than getting targeted coupons for their favorite grocery store. Either way, there is no denying the power that mined data potentially holds. Regardless of your position, this is all best summed up in a quote by the statistician Andrew Pole, who remarked about data mining, “just wait, we'll be sending you coupons for things you want before you even know you want them”.