Disclosing Big Data

Privacy and Security, Patents, Copyright and Trademark and Intellectual Property

Article Snapshot


Michael Mattioli


Minnesota Law Review, Vol. 99, pp. 535-583, 2014


Firms that collect big data rarely disclose their methods for doing so. Without such disclosure, others will not trust the data enough to use it. Trade secret, patent, and copyright laws do not encourage the disclosure of big data methods.

Policy Relevance

A new intellectual property right would encourage disclosure of big data methods. Disclosure would encourage data reuse and spur innovation.

Main Points

  • Big data analysts draw on Internet search histories, social media posts, credit card records, networked devices, and other sources to gain new insights.
    • Researchers spotted an interaction between two drugs by noticing the drugs often appeared with “hypoglycemia” in web searches.
  • Analysts hesitate to use data if they do not know how it was gathered and processed.
  • Case studies show that big data firms use human analysts to select, process, and classify the data, and to rid it of errors.
    • A firm that handles medical data slightly changes patients’ ages, treatment dates, and locations to conceal their identities.
    • Unlike source code, big data methods cannot be discovered by reverse engineering.
  • Existing intellectual property rights do not support big data disclosure.
    • Big data methods are rarely “novel” and so do not qualify for patent protection.
    • Patent protection is expensive.
    • Copyright protection for data is weak and does not require disclosure of the data collection methods.
  • Data publishers could be given a “dataright”: an exclusive right to control downstream uses of data for a limited time, on the condition that the publisher discloses how the data was generated.
    • Other proposals to create protection for electronic databases do not require disclosure.

