RSS Feed Scrapers Become One of The Biggest Copyright Issues Online

By  //  November 29, 2022

According to an Australian technology journalist, RSS feed scrapers are the number one cause of copyright issues.

The main reason is that these scrapers do not simply copy what is in the feed. Instead, they use the feed to find links to the content and then extract the content from the linked pages. This can cause a problem because copyright laws can be confusing, especially if you’re a blogger who uses these scrapers to keep a website updated.
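To make the mechanism concrete, here is a minimal sketch of how any program, whether a legitimate feed reader or a scraper, finds article links in a feed. It assumes a standard RSS 2.0 layout; the sample feed, the example.com URLs and the function name `list_item_links` are invented for illustration. The sketch stops at listing links and does not fetch or copy any article.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed: the site, titles and URLs are invented for illustration.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://example.com/</link>
    <item>
      <title>First story</title>
      <link>https://example.com/first-story</link>
      <description>Short summary only.</description>
    </item>
    <item>
      <title>Second story</title>
      <link>https://example.com/second-story</link>
      <description>Another short summary.</description>
    </item>
  </channel>
</rss>"""


def list_item_links(feed_xml: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs for every <item> in an RSS 2.0 feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="").strip()
        link = item.findtext("link", default="").strip()
        if link:  # skip items that carry no link
            items.append((title, link))
    return items


if __name__ == "__main__":
    for title, link in list_item_links(SAMPLE_FEED):
        print(f"{title}: {link}")
```

A feed reader would show these titles and send the reader to the publisher's page. The scrapers described here instead republish what sits behind each link.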

Since 1997, RSS feeds have offered internet users a simple method to keep track of their preferred content. RSS feeds provide a real-time stream of information, aggregating content from a reader’s preferred news websites and sources.

The problem facing RSS feeds today is that they are being abused by blog operators around the world seeking to fill their own content gaps and increase ad revenue.

Arron Hall, an attorney for business owners, recently told an Australian tech news publication that when blog posts were first made available via RSS feeds, many people mistakenly believed the blog owner was authorizing copies as long as there was a link back to the original author.

“It’s possible the confusion arose from the fact that ‘syndication’ is part of the RSS acronym.”

“There is a belief that website owners who provide RSS feeds have authorised redistribution and reuse of their material. However, having the ability to copy is not the same as being authorised to do so,” said Hall.

“It is vitally crucial to have the copyright owner’s consent in order to avoid copyright infringement, according to U.S. copyright law,” he said.

It’s hard to foresee which spam tactics will be popular in the coming years, but RSS scraping has been a problem for at least six years and is still a problem today.

Mr Giannelis, an author and editor for Tech Business News, says scraper operators are using RSS feeds to lift high-ranking content from major news publications and draw traffic to their own websites, where it earns revenue through ad programs such as Google AdSense.

“It’s not clear if webmasters operating these illegal copyright machines are aware they are breaching the copyright acts of many countries around the globe.”

“Website owners, especially small private publishers, have begun to despise these RSS feed scraper blogs ripping off hard-working and dedicated journalists’ research with an automatic tool as they sleep.”

“While some might think this sounds harmless, search engines can sometimes ‘hiccup’ and place the stolen content much higher than the original, resulting in the offending website being rewarded with web traffic off the back of someone else’s work.”

“It’s also obvious this is what many RSS feed scraper operators are actually counting on, while others are not aware of the damage they are causing to hard-working small publishers,” said Mr Giannelis.

While there are several ways to remove copyrighted content from the likes of Google’s search engine index, such as filing a DMCA takedown notice, by the time the complaint is processed it is often too late, as a news story quickly loses its headline traction.

Additionally, inbound hyperlinks, which are highly prized by website owners, can often go to the blog or news publication that ripped off the story instead of the original publisher.

The web development team at Tech Business News has been monitoring RSS feed scraper blogs since 2021 via logging tools provided by CDN and security services provider Cloudflare.

According to its statistics, RSS feed bots can crawl a news publication more than one thousand times per day. Not all of these crawls come from scrapers, however. Some come from simple RSS feed readers used by people who prefer to get their daily news fix via a feed.
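As a rough illustration of this kind of monitoring, the sketch below counts requests to a feed URL per user agent. It does not reproduce the Tech Business News setup or any Cloudflare log schema: the JSON-lines format, the field names `path` and `user_agent`, the `/feed/` path and the sample records are all assumptions made for the example.

```python
import json
from collections import Counter

# Hypothetical log records, one JSON object per line. The field names
# ("path", "user_agent") and the values are assumptions for illustration,
# not the schema of any particular logging product.
SAMPLE_LOG = """\
{"path": "/feed/", "user_agent": "ExampleReader/1.0"}
{"path": "/feed/", "user_agent": "ExampleReader/1.0"}
{"path": "/feed/", "user_agent": "UnknownBot/0.1"}
{"path": "/about/", "user_agent": "Mozilla/5.0"}
"""


def count_feed_requests(log_text: str, feed_path: str = "/feed/") -> Counter:
    """Count requests to feed_path, grouped by user agent string."""
    counts = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        record = json.loads(line)
        if record.get("path") == feed_path:
            counts[record.get("user_agent", "unknown")] += 1
    return counts


if __name__ == "__main__":
    for agent, hits in count_feed_requests(SAMPLE_LOG).most_common():
        print(f"{hits:5d}  {agent}")
```

A tally like this only shows who requests the feed and how often. Telling an ordinary feed reader from a scraper still means checking whether the content later appears on another site.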

Although RSS scrapers often rely on the “fair use” argument, few copyright lawyers would endorse it. Fair use permits limited use of content for purposes such as commentary, criticism or news reporting. It does not grant the right to republish another’s work wholesale.