What's the Big (Data) Deal?

By Timothy J. Toohey
March 29, 2013

There has been a great deal of discussion in recent years that 'Big Data' is the next big thing in the world of technology and business. In our increasingly data-intensive world, Big Data is proclaimed by its proponents as bringing about a new era of innovation and economic growth. But as increasingly large amounts of data are collected, stored and analyzed about individuals, privacy advocates have also raised concerns that Big Data may endanger, if not end, personal privacy and lead to the world depicted in the Steven Spielberg film Minority Report, where individuals are under constant surveillance and viewed almost entirely as a collection of data bits. Given the strong opinions held by both sides, it is important to unpack the meaning of the Big Data phenomenon and its alleged dangers to personal privacy.

What Is Big Data?

Big Data, as the term is typically used, has several different facets. As technology, Big Data refers to hardware that is capable of sorting and analyzing a massive amount of data in a short period of time. As a process, Big Data refers to finding patterns or other analytical results from such data. In more general terms, Big Data is shorthand for digital data that is so large and unstructured that it cannot be stored or managed by traditional relational database tools.

Big Data is diverse in type and origin. It consists not only of alphanumeric text, such as e-mail messages, spreadsheets and Web pages, but also media such as pictures and audio. The sources of such data encompass online data generated by social networking sites, blogs, Twitter feeds, commercial transactions and search queries, as well as mobile phone and other communications devices, and data from household devices and transportation networks. Much of Big Data consists of personal information about individuals from online activities. Data produced from such sources is sometimes referred to as 'data exhaust,' because it is a byproduct of the process rather than its direct object.

A salient feature of Big Data is that it is very large indeed. In 2011, researchers estimated that the world's data doubles every 1.2 years and that there were approximately two zettabytes of data in 2012, a 'zettabyte' being a sextillion bytes. See, 'New Ways to Exploit Raw Data May Bring Surge of Innovation, a Study Says,' The New York Times, http://nyti.ms/WXoNXb. Although no particular subset of Big Data encompasses all existing data, data sets may be as large as an 'exabyte,' which is a quintillion bytes.

The enormous growth of data is enabled not only by the great improvements in transmitting information seen in the past decade, but also by the even greater capacity for, and decreased expense of, storing massive amounts of digitized data.

Big Data's proponents point to the great potential human, scientific and business benefits that will come from collecting and analyzing massive amounts of data. These benefits include exploring and fulfilling consumer needs, advertising tailored to narrow segments or individuals, and replacing or supplementing human decision making with automated decision making. See, Tene, Omer and Polonetsky, Jules, 'Big Data for All: Privacy and User Control in the Age of Analytics' (Sept. 20, 2012), Northwestern Journal of Technology and Intellectual Property, Forthcoming. Another cited benefit is the ability of scientific researchers to make associations and detect patterns from enormous amounts of unstructured information. One example frequently given by Big Data proponents is the discovery by Stanford researchers of a pattern of adverse interaction between two drugs through analysis of millions of FDA adverse event reports and de-identified search results. Id.

Privacy concerns regarding Big Data are closely related to its purported benefits. One of the most frequently expressed concerns is that the massive collection and use of data will lead to consumers' loss of control over their personal data. Id. As data is increasingly used as a form of digital currency stored remotely and transferred globally, some privacy advocates fear that consumers will be unable to determine how their data is being used and by whom. Indeed, some consumers are already uncomfortable about the use of their personal data, as indicated by a recent Pew Research Center survey that shows that up to 68% of Americans dislike having their online behavior tracked for targeted advertising. See, 'Internet Users Don't Like Targeted Ads,' PewResearchCenter. Privacy advocates also worry about the rise of automated decision-making and the lack of transparency regarding the use and collection of Big Data, which they claim amounts to omnipresent hidden surveillance.

An additional concern of privacy advocates is the lack of legal accountability regarding the gathering and use of Big Data. Unlike many jurisdictions, including Canada, Mexico and the European Union, the United States does not have a comprehensive federal privacy law, but instead relies on legislation regarding particular sectors, such as health care and financial institutions. Indeed, there is a deep-seated preference in the U.S. in technological and other quarters for self-regulation or minimal regulation of data flows. Citing the lack of privacy protection for personal data, privacy advocates and their allies in the White House (see, 'Consumer Data Privacy in a Networked World') and Federal Trade Commission (see, 'Protecting Consumer Privacy in an Era of Rapid Change') have called upon the U.S. to adopt new privacy legislation to address the issues posed not only by Big Data but also by other threats to personal data privacy in the online world.

The challenges posed by Big Data are frequently discussed in terms of how Big Data comports with the Fair Information Practice Principles (FIPPs), first articulated in a 1973 report to the secretary of the Department of Health, Education and Welfare. Concerned by the increased computerization of records and the compiling of computerized 'dossiers,' privacy advocates in the 1960s and '70s developed the FIPPs as a blueprint for providing consumers the right to control and protect their personal data. While no comprehensive federal privacy legislation pertaining to the private sector was adopted in the U.S., the FIPPs have endured as a basis for laws in other jurisdictions and for proposed regulations in the U.S. Indeed, on the eve of their 40th anniversary, the FIPPs formed a basis for significant parts of the 2012 White House and FTC privacy proposals.

At a basic level, the FIPPs are founded on a notice-and-choice model, i.e., providing notice to a consumer that personal data is being collected and conferring a right to consent to the collection and use of the data. At a more granular level, as expressed in the White House's 2012 proposal for a Consumer Privacy Bill of Rights, consumers have rights of 'individual control,' 'transparency,' 'respect for context,' 'access and accuracy' and 'focused collection' regarding their personal data. See, 'Consumer Data Privacy,' supra.

Notwithstanding the FIPPs' longevity, some privacy advocates, including Tene and Polonetsky of the Future of Privacy Forum, argue that at least some of the FIPPs may be irrelevant or impractical in the Big Data era. For example, Tene and Polonetsky are skeptical that the FIPP principles of notice, choice and 'individual control' are realistic where entities automatically collect massive amounts of personal data without the knowledge of consumers. Given doubts that consumers read privacy notices relating to the direct collection of personal data (see, 'The Challenge of "Big Data" for Data Protection,' International Data Privacy Law), they argue that it is unrealistic to expect that consumers can effectively exercise consent regarding collection of massive amounts of unstructured personal data.

Questions have also arisen as to whether data minimization is feasible as applied to Big Data. Under the data minimization principle, as set forth in the White House report, companies are supposed to 'collect only as much personal data as they need to accomplish purposes specified under the Respect for Context principle' and to 'securely dispose of or de-identify personal data once they no longer need it.' See, 'Consumer Data Privacy,' supra. While this may be reasonable when an entity collects limited personal information relating to a specific transaction, data minimization runs counter to the Big Data practice of collecting and retaining large amounts of personal data as the byproduct of other activities, such as searching the Web. Indeed, Big Data's proponents point to the fact that collecting and storing unstructured data may create significant societal benefits, such as the discovery of the adverse interaction between drugs previously mentioned. See, 'The Privacy Paradox: Privacy and Its Conflicting Values,' a symposium held last year, co-sponsored by Stanford Law Review and Stanford Center for Internet and Society.

Even if some of the FIPPs may be outmoded in the Big Data era, it is unlikely that they will disappear altogether as a point of reference for privacy issues. For example, Tene and Polonetsky advocate that, as a counterweight to 'secret' collection of personal data, the FIPP principle of access be strengthened. They argue that consumers should be entitled to access their personal data from online activities, as they now can for mobile telephone applications, such as those making use of friend lists, location information and address books.

Conclusion

Given their stark differences, it is likely that the tensions between privacy advocates and Big Data proponents will continue. While it is risky to predict the outcome of the debate in today's political climate, it is likely that much of the discussion will continue to center on whether increased legislation or self-regulation strikes the right balance between the potential benefits of Big Data and personal privacy.


Timothy J. Toohey is a partner in the Los Angeles office of Snell & Wilmer. His practice concentrates on complex litigation, intellectual property and privacy and data protection matters.

