
Web Scraping and Intellectual Property Rights

The Internet is glutted with a myriad variety of goods and services. These goods and services, which form the bread and butter of online businesses, are protected content that is categorized as intellectual property and protected under intellectual property laws through copyright, trademark, design, etc. Protecting the intellectual property rights in them is therefore of significant importance to these businesses, which invest heavily in the production and marketing of their products and services. Profits are often not realized until the products are sold through authorized channels of distribution, which makes the protection of the intellectual property in them all the more important.

With global market barriers coming down and cut-throat competition among businesses on the rise, it is more important than ever for businesses that depend on data and data analysis to develop extraordinary and unconventional business strategies in order to excel in their industry, because the data and statistics they collect form the foundation for any administrative, management, or other business-related decision [1].

Web scraping is an emerging and increasingly sophisticated method of extracting data from third-party websites, often in order to use that data for commercial purposes. Data scraping or web scraping software programs are often used to recreate or automate the experience a human would have in exploring the content of a site. Web scraping technology has a myriad of applications, both legal and otherwise.
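To make concrete what "extracting data from a third-party website" can look like, here is a minimal sketch in Python using only the standard library. The page markup, the class names (`product`, `name`, `price`), and the helper names `ProductScraper` and `scrape` are all hypothetical and chosen purely for illustration. A real scraper would first fetch the page over the network, a step omitted here. The point is that the output is a compilation of data copied from someone else's page.

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a third-party product listing page.
SAMPLE_HTML = """
<html><body>
  <div class="product"><span class="name">Widget A</span><span class="price">499</span></div>
  <div class="product"><span class="name">Widget B</span><span class="price">749</span></div>
</body></html>
"""


class ProductScraper(HTMLParser):
    """Collects the text of <span class="name"> and <span class="price"> elements."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # field the parser is currently inside, if any
        self.records = []          # extracted rows, one dict per product

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "div" and css_class == "product":
            self.records.append({})          # start a new product record
        elif tag == "span" and css_class in ("name", "price"):
            self.current_field = css_class   # remember which field the text belongs to

    def handle_data(self, data):
        # Only keep text that sits inside a recognised field of a product.
        if self.current_field and self.records:
            self.records[-1][self.current_field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self.current_field = None


def scrape(html):
    """Return the list of product records found in the given HTML string."""
    parser = ProductScraper()
    parser.feed(html)
    return parser.records


records = scrape(SAMPLE_HTML)
print(records)
```

The resulting `records` list is the kind of compilation whose copyright status the article goes on to discuss: every value in it originated on the source page.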

Because data and information form the foundation for a plethora of business decisions, web scraping is most commonly used for doing business or for gathering intelligence about competitors and their latest updates, promotions, products, information, etc., in a short span of time, so that a business can take greater advantage of a situation when developing market strategies and making business decisions. However, the use of this method may infringe the intellectual property of a company or an individual, particularly copyrightable or copyrighted content on websites, resulting in compilations of infringing content that are more often than not hard, and sometimes impossible, to track down.

Data scraping, extraction, or mining poses legal challenges for both the content creator and the content importer. Data scraping chiefly involves copying data from a source, which is why copyright law comes into the picture. Section 2(o) of the Copyright Act, 1957 provides that a literary work includes compilations, and since the output of data scraping is a compilation, it falls within the category of a literary work.

Section 13(1)(a) of the Copyright Act, 1957 [2] provides that copyright subsists in an “original” literary work, which raises the question of whether the content importer holds copyright in his compilation. At the same time, the content owner may contend that the compilation has been extracted from their site and that they therefore hold copyright in that content/data.

Although it is always preferable to obtain the prior permission of the copyright owner before scraping content/data, the content importer does have a remedy: the Supreme Court of India, in EBC v. D.B. Modak [3], held that copyright subsists in a compilation which fulfills the criteria of the skill and judgment doctrine. This basically implies that, in order to establish copyright in a compilation, the content importer is required to prove that a minimal degree of creativity was involved in making it.

The doctrine in the EBC v. D.B. Modak case provides that a compilation need not be novel or non-obvious, but it must not be merely a product of capital and labour and must involve a minimal degree of creativity in order to attract copyright. Another defence available to the content importer could be that the work in question is being used for a non-commercial purpose or that the use falls under Section 52 of the Copyright Act [4].

On the contrary, as a remedy for the content creator, the creator must establish ownership and consider whether the use falls within the meaning of fair use under Section 52 of the Copyright Act, 1957, before proceeding under Section 51 [5] read with Section 14 [6] of the Act, under which copyright in a work is deemed to be infringed when any of the rights available under Section 14 is violated.

Thanks to such advancements in technology, businesses have been able to achieve benefits that strengthen their position in the market while also increasing customer satisfaction. However, regardless of the benefits that data scraping might provide to businesses, this method of collecting data and information is at the same time causing piracy of media content online. Thus, with the use of such advanced technologies, and of data scraping in particular, a balance needs to be struck between the use of these technologies and the protection of intellectual property against wrongful use, as the issues relating to vigilance, traceability of the data extractor, etc., still remain.

Author: Hiba Nasir, a 5th-year student at the Faculty of Law, Jamia Millia Islamia, and an intern at IIPRD. In case of any queries, please contact/write back to us at [email protected].


[1] Web scraping: A new hero that defends brands and intellectual property, Andrius Palionis, TechRadar

[2] The Copyright Act, 1957, Section 13 – ‘Works in which copyright subsists.’

[3] Eastern Book Company & Ors vs D.B. Modak & Anr on 12 December 2007

[4] The Copyright Act, 1957, Section 52 – ‘Certain acts not to be an infringement of copyright.’

[5] The Copyright Act, 1957, Section 51 – ‘When copyright infringed.’

[6] The Copyright Act, 1957, Section 14 – ‘Meaning of Copyright.’
