A Primer on What 'Big Data' Is All About

By Jonathan Bick
June 02, 2015

In May 2015, the streaming music service Pandora acquired the music industry data collection company Next Big Sound, which extensively tracks sales, social and streaming data. Of course, in the Internet era, the entertainment and other industries are awash with data, all of it with varying degrees of copyright protection. The Pandora/Next Big Sound deal presents a good moment for a primer on this copyright protection.

So-called “Big Data” didn't exist until after the Internet was well established. The Big Data-Internet connection is underscored by the fact that the Internet is required for most Big Data transactions, including collection, storage and dissemination. Most of Big Data's content consists of uniquely Internet-related elements, namely users' transactions, meta-tag applicators and Internet content providers.

Most of the other constituents of Big Data, such as metadata, are derivatives of Internet data. By describing the contents and context of data files, metadata increases the value of those files. As a result, metadata facilitates the discovery and transfer of relevant information.

Big Data and Copyright

Big Data is a form of content and thus primarily implicates copyright. It has raised difficult questions about who owns the copyrights to certain data and what protections exist for good-faith intermediaries who store and disseminate Big Data. While the general principles for resolving such difficulties are found in the Copyright Act of 1976, Internet-related copyright statutes provide better guidance.

For example, the Digital Millennium Copyright Act (DMCA), enacted in 1998, was innovative in this regard. Among other things, it created the notice-and-takedown procedure for copyright owners and online intermediaries, a corresponding safe harbor from liability in 17 U.S.C. § 512 and legal protection against the circumvention of technological protection measures in 17 U.S.C. § 1201.

Sources of Big Data

Big Data may be classified in four groups: 1) user data; 2) information data; 3) application data; and 4) platform data. Understanding the nature of the data is useful for assessing which entity has which copyright rights in the data.

The ownership of the data, and the rights related to its use, are initially related to the entity that first converted the subject data to a tangible form. In particular, as soon as the data is susceptible to being seen or heard, a copyright arises. Once a copyright springs into existence, U.S. law grants the creator of the original work exclusive rights to its use and distribution, usually for a limited time, with the intention of enabling the creator to receive compensation for the intellectual effort.

Some sample data sources and, thus, copyright creators include: standard report users; ad hoc query users; remote partners/suppliers; power users; business executives; statisticians/data scientists; executives and the board of directors; and human resources. The user who is responsible for making the data tangible can be considered the owner, though the relationship such a user has with others, if any, determines who has rights associated with the data.

Usage of data from third parties will usually be subject to copyright and/or licensing agreements. Even if data is freely available via the Internet, there may be terms and conditions associated with its use. The copyright is retained under most purchase agreements or licenses.

Some kinds of third-party data may also have additional usage restrictions, such as ethical requirements around data linkage and the identifiability of human subjects. If data involves third-party information about people, a separate set of privacy issues may arise and may require more than meeting the requirements of the data owner.

User data includes all data that is related to individual Internet users.

Information data includes all “types” of data. Sample types of information data include: transactional; master; meta; controlled; reference; reporting; analytical; open; cross-functional; historical; department-specific; and process-specific.

Application data includes all data that results from a user's interface to the data. This data is generated when a user requests and gets access to his or her e-mail, stores data, or develops and executes reports.

Platform data includes data necessary to administer data systems. This data is related to where the data is stored and processed, including the structured data in operating systems and data warehouses, as well as Twitter data stored in the cloud.

Big Data Players

The most significant types of Big Data players are known as data collectors, data partners and data buyers.

Data collectors accumulate data. They are entities that generate data and store it. They may be computer users or firms that communicate with users via the Internet, and the data they gather includes website clicks, customer-managed relationship data, order transactions, social data, and cross-platform and mobile data. This is the best type of data to have, provided a company has permission to have it via explicit or implied consent.

A data partner's data is data that was collected by some other entity. Through a partnership or agreement, two companies share customer data (for a cross-promotional campaign, for example). Alternatively, a site or application allows users to log in with a social profile, such as Google or Facebook.

Finally, a data buyer's data is the collected, aggregated and anonymized data that is typically sold by data brokers. This data is widely available, including to the buyer's competitors. It is the data over which consumers have little to no control and for whose use they have, most likely, not given explicit permission.


Jonathan Bick is of counsel at Brach Eichler in Roseland, NJ. He is also a member of our sibling newsletter Internet Law & Strategy's Board of Editors, an adjunct professor at Pace and Rutgers law schools, and author of 101 Things You Need to Know about Internet Law (Random House). He can be reached at [email protected].
