Wholesale Blog Plagiarism … Alert

105 thoughts on “Wholesale Blog Plagiarism … Alert”

  1. Om,

    You contact their hosting provider and notify them of copyright infringement. If they are a US provider it should be a simple matter. They might move, but eventually they’ll get tired.

    If you want to pursue more litigious methods, you need to find out who they are.

  2. Om: Make sure to cite the DMCA in all correspondence. This is a case where the DMCA is your friend, in a very big way. Even if the ISP for some reason doesn’t believe or like you, they are required by law to automatically remove the content immediately and put it in safe harbor until the matter is resolved. If they don’t, you can sue *them*, which is why they will automatically do it.

  3. Welcome to the next spam wave. I know some of these guys are making thousands a day by taking XML feeds and repackaging them in high-paying markets.

    The only way you will be able to shut them down is to report them to Google AdSense for breaking their TOS.

  4. LOL. They even have your plagiarism blog published on their site. Reminds me of a stupid boy in my school who always copied from me in exams and sometimes even my name. HaHa!

    Meanwhile, check this website, http://www.techwhack.com: he plagiarises entire news networks’ content as blogs to earn from ads and has a very high ranking in Google News.

    Om, regarding plagiarism, I can’t say much, but a big company is already writing a tool to detect it.

  5. khabri

    thanks for the heads up on the tool. whichever company is doing that, it is doing a lot of us a favor. and i am kinda sick and tired of always fighting this battle. so some automation will come in handy

  6. I don’t see why you seem so pissed! I see the ‘Times of India’ having one page called ‘bloggers park’ where they lift articles from the blogosphere! It’s basically free content for them, and it brings in revenue as they have print ads. Now, will you call that plagiarism? He doesn’t give credit, I agree, but guess what: over a period of time people will know the real source. Try introducing referrals to your own site then and there so that the reader knows where the content is coming from.

    Trying to go after these kinds of people is just a pure waste of time and effort. Even if you are successful in this, after a month it may be someone else!!

  7. Om — I’ve been a hosting provider for 10+ years. We are, as others have stated, required to state the person who is the DMCA-related legal contact on our site. That means you are lucky (temporarily) that these sploggers’ provider is in the US. It won’t be the case next time. As a hoster we always respond to DMCA complaints (though I am not sure we are required to immediately block a site alleged to be a copyright violator); however, a crook hardly ever hosts with a seemingly law-abiding hoster like us. These guys simply have some sort of XML-sucking tool, so their cost of copying and moving the site is almost zero. As we’ve learnt from the spam war, content creators can’t win this lopsided war.

  8. I’ve found posts of mine copied onto splogs before, but it doesn’t seem too harmful. These websites get virtually no traffic, of course, and if you occasionally link back to one of your posts in a post, then when it is copied there, Google will see that as yet another incoming link, which is arguably beneficial. It’s sort of like someone else doing SEO dirty work for you.

    Still, it’s pretty weak.

  9. sanjay

    thanks for your input. i agree, some of these nefarious people are simply ruining it for all the other law abiding folks. i still wish there is something we could do about this. i have followed the instructions from all the folks who have left a comment, and i wonder if that is enough.

    thanks for your words of support.

  10. Om,

    For sites like mine and others that do not rely on ad revenue to pay for themselves, this ends up building our legend. For you and those like Tom Keatings and a few others who do rely on the ad revenue, the stealing of traffic hurts in the wallet, which is why I alerted you.

    Andy

  11. Sanjay: You actually *are* required to immediately place any content which has a DMCA complaint against it in safe harbor (i.e. temporarily remove it from the site, by whatever means you choose). At that point, you are only required to notify the accused party that you have taken it down and that there is a complaint against it.

    This is to protect you as the ISP, and not anybody else.

    From the moment you take it down, there is really no further action you must take. You are not required to make a judgement on whether or not the content is illegal, and in fact, you may allow the content to be re-posted if you feel the accused is correct. At this point, any lawsuits are between the accuser and the accused and you are not a party to it. 99% of the time, it doesn’t get this far though as the accused has been “caught” and usually cowers away into a hole at that point.

  12. It’s something that is happening more and more out there. At least you have presence of mind to have your copyright below and are not relying on just a Creative Commons license.

    But, it does beg the question about other legitimate services that do use your posts and make money off of them. What about the news aggregators that pull your posts, yet have AdSense or other ads there? What about people that are blogging via Del.icio.us – pretty much just link blogs that also have ads? Where does the line get drawn?

    No, I don’t think what the xb90 guy is doing is kosher, but I do think others have been rabidly attacked for doing stupid stuff. This guy is milking the system, while others have just been lazy.

    Just my .02.

  13. While RSS feeds can spark a new take on a subject and we all build upon others’ comments, wholesale plagiarism of blogs shouldn’t be tolerated, just as we wouldn’t tolerate ripping off the content of the AP or a ‘conventional’ online publication. If legal action cannot be taken, these thieves should at least be ‘outed.’

    Ed

  14. “The only way you will be able to shut them down is to report them to google adsense for breaking their TOS.”

    You’re joking, right? Why would Google shut down advertisers who are making them thousands a day? Google has ZERO oversight on the content that runs AdSense. They take no responsibility.

    “Google is a provider of information, not a mediator. We serve ads targeted to certain web pages, but we don’t control the content of these pages. For these kinds of questions or comments, it is best to directly address the webmaster of the page in question.”

    Google is the crack dealer handing out drugs at the playground and then saying “but I’m not forcing anyone to do it”.

  15. Om – to piggyback on pxlated, and if you use IE (unfortunately not available for FF or Opera yet), you can use the Netcraft toolbar to see who is hosting a domain as you visit it (see toolbar.netcraft.com). It also has some anti-phishing stuff there.

  16. Just a thought…
    If they are grabbing a feed and it’s “full”, they get value. If the feed is just a “Summary” with a link to your site, there is little (or less) value. I know some don’t like “summary” feeds but hey, it’s an easy defense against these types.

  17. This is why I’m against full text in feeds. Set your feed to include only the first N words and you’re done. I can’t see how you can complain if your feed contains the full text, you’re practically giving it all away.
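
The "first N words" approach mentioned above can be sketched roughly as follows. This is only an illustration in Python: the function names `summarize` and `summary_item`, the default of 50 words, and the dictionary layout of the feed item are assumptions made for the example, not the settings of any particular blogging tool.

```python
def summarize(text: str, max_words: int = 50, ellipsis: str = "…") -> str:
    """Return the first max_words words of text, marking any truncation."""
    words = text.split()
    if len(words) <= max_words:
        # Short posts are passed through (with whitespace normalized).
        return " ".join(words)
    return " ".join(words[:max_words]) + " " + ellipsis


def summary_item(title: str, link: str, body: str, max_words: int = 50) -> dict:
    """Build a hypothetical feed item carrying an excerpt and a link back."""
    return {
        "title": title,
        "link": link,  # readers (and scrapers) must follow this for the full text
        "description": summarize(body, max_words),
    }


if __name__ == "__main__":
    item = summary_item(
        "Example post",
        "http://example.com/example-post",
        "Some long post body that would normally run to many paragraphs.",
        max_words=5,
    )
    print(item["description"])
```

The trade-off is the one raised elsewhere in this thread: a scraper only gets the excerpt, but so does every legitimate subscriber.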

  18. Publishers using the summary feature of RSS feeds is one side, aggregators using the “excerpt” capacity of most of the RSS parsing software is the other.

    Having twenty words of Om’s content as Memeorandum does at the moment means people will come to this site; having the whole article is theft.

    It is an issue which is going to become much more important as mashups and aggregations try to add value to original content.

  19. Talk about a nasty Christmas surprise…

    If, as seems likely, the ripoff artists are mirroring your posts wholesale using some kind of bot, it is very easy to add some JavaScript code to your page that defaces the plagiarized site big time.

  20. Scraper sites may soon become the Achilles heel of Google’s AdSense program and trigger massive advertiser withdrawal, like what happened to banner advertisements of the Web 1.0 era, when many sites started to reload the page every few seconds to get billions of ad displays and advertisers lost millions…

  21. i know you don’t want to do this, but in august i killed our full text feed because of the text theft. our feeds now just display the summary. i didn’t want to go that route either, but i was getting sick of our content being copied/pasted and placed on a page with adsense ads (we don’t even have adsense on our page, so why should they make money where we don’t). as soon as i killed the full text feed, they completely stopped. i haven’t seen any new content stolen since august.

  22. Om, this is nothing new – webmasters have been complaining of their sites being scraped for short-term Google AdSense monetization for some time now.

    By the way, your blog entries don’t seem to show for this Firefox user:

    Error: [Exception... "Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIXMLHttpRequest.status]" nsresult: "0x80040111 (NS_ERROR_NOT_AVAILABLE)" location: "JS frame :: http://gigaom.com/wp-content/themes/gigaom/javascript/giga.js :: stateHandler :: line 32" data: no]
    Source File: http://gigaom.com/wp-content/themes/gigaom/javascript/giga.js
    Line: 32

  23. Sorry to hear about this Om but I have one question. Have you been hiding under a rock or something? The amazing part to me is that someone who is supposedly so net savvy JUST realized that scraper sites steal blog content for the purpose of displaying ads next to it. Just seems odd you didn’t know about this a long time ago. In the online marketing and advertising business this kind of thing is old news.

  24. I hate to post this, Om, but stopping the scrapers is almost an impossible task from the webmaster’s end: you kill one and hundreds will appear!…

    IMHO, the best way to stop this is to eliminate their financial incentive. Almost 90% of these scrapers monetize with AdSense; if only Google had a strict AdSense spam policy and acted on the spam reports fast enough (meaning they have to deploy more warm bodies), this problem wouldn’t be this big!

  25. Well now you know, how the artistes felt when you downloaded their music using Napster or BitTorrent.

    awww…don’t tell me you never ripped off a song in your life huh?

  26. MIT Dude, I don’t think your analogy is quite right. The main issue here is these guys making money out of your content, which is not something you do when you download music (unless you copy it to a CD and then sell it on the street).

    The worst thing here is plagiarism: when someone takes your content and pretends he is the author (something you are unlikely to do with a Britney Spears song).

  27. If you have ever checked out a “hire a freelancer” website, you know, the ones where people post their Internet and related technologies projects for so-called “professionals” to bid on, you’d find that many, many projects are to “clone” or “scrape” another website. It makes me sick that people bid on these illegal projects!!! I’m not saying that all the buyers and sellers are crooks, but I am saying that this activity is going on in plain sight, yet nothing is being done about it.

  28. I have to agree with AGoToGuy here–this was news 10 months ago. Since then Technorati notifications on my domain name have been running 10-20 a week, all splogs. Wikipedia has had it defined for months.

    http://en.wikipedia.org/wiki/Splog

    If you click on the “Ads by Goooogle” link you can report the offending site, but that’s playing whack-a-mole, you’ll waste more time reporting than the offender does generating the splog.

  29. while ppl are discussing this issue I thought I should check opinion of legal/ethical experts on
    http://sf.getvendors.com (check out the news & views section).. we need to polish it and fix a number of issues (the final version will look quite different and load fast)..but looking at this discussion wondering what you folks think about approach..Feel free to take shots..

  30. Sorry, Duncan, and others suggesting that bloggers drop their full-text feeds… that’s throwing the baby out with the bathwater.

    What next? Gee, I think I’ll stop sending and receiving e-mail and just IM people with a link to view a note to them on my Web site?

    Spammers and other thugs on the Internet shouldn’t force us to make content distribution and accessibility a pain in the ass for the 99% of the people with ethics.

    Instead, people should indeed lobby Google to institute policies and procedures that — while not, unfortunately, likely to increase their revenues — will at least largely make the blogosphere a better, less-scraped place.

    The problem, though, is that Google AND its advertisers have no *economic* incentive to clean this stuff up. Look at how things currently are with splogs:

    1) Asshat creates a splog featuring, say, Viagra links.

    2) Some floppy fella searches Google for the Big V.

    3) He lands on a scraper page and sees a big bold ad for Viagra, along with some scraped info from a medical blog about the topic. He’s happy. He clicks through on the ad to a real Viagra site.

    4) Google’s happy. They just got paid.

    5) The real Viagra site’s happy… they just got a new, valid customer (someone truly interested in their product), albeit in a slightly round-about way (one extra click).

    6) And the scraper site’s happy. They just got AdSense money.

    So, basically, in this typical scenario, EVERYONE is happy except for the folks whose content is being scraped. And unless they’re uber-geeks who check for their links regularly, they probably don’t even KNOW their stuff is being scraped, so no harm no foul, right?

    And those being scraped, so to speak? They’re typically not Google’s customers. Their unhappiness currently isn’t any sort of a liability for Google. Worse yet, making them happy (serving as a copyright policeman) is likely to LOSE Google money.

    I sincerely believe that Google’s engineers are trying to figure out algorithmic ways to blast the sploggers to hell and kill their AdSense revenue largely BECAUSE it’s the right thing to do. But given the lack of economic incentive, I sincerely doubt this is a top-priority project over there.

  31. This site: laptop-notebook.blogspot.com ripped off two of my reviews, word for word (along with ripping off a lot of other sites, including Trusted Reviews, Ziff-Davis Net, PC World, PC Magazine, LAPTOP Magazine, etc).

    It’s hosted by Google. I have contacted Google, both the Adsense and the Blogger sides. Nothing, not a word.

    How do these people get away with it? They don’t even change the wording of any of the reviews they lift (and they do a damn fine job, including lifting all relevant images).

  32. I repackage many RSS feeds on my site and I have Google ads. I don’t see any problem with it. I’m offering a service by aggregating various feeds together. An RSS feed is meant to be redistributed, is it not? (BTW I give clear attribution and links to the original RSS feed and web page.)

  33. Make sure to get a cached copy of one of their offending webpages, including the date and time. You can get one from Google by searching for “cache:” followed by the offending website’s URL. Archive.org’s Wayback Machine may include more than one copy, so you can see how long they have been copying you over time.
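
For anyone collecting this kind of evidence for several pages, a small helper can record when a page was checked and build the Wayback Machine lookup links for it. This is a rough sketch only: the function names and the shape of the record are made up for the example, the URL patterns are the Internet Archive's as best I understand them and should be checked against its own documentation, and nothing here actually fetches or saves the page.

```python
from datetime import datetime, timezone
from typing import Optional
from urllib.parse import urlencode


def wayback_calendar_url(page_url: str) -> str:
    """Link to the Wayback Machine listing of all captures of page_url."""
    return "https://web.archive.org/web/*/" + page_url


def wayback_availability_url(page_url: str) -> str:
    """Link to the Internet Archive availability lookup for page_url."""
    return "https://archive.org/wayback/available?" + urlencode({"url": page_url})


def evidence_record(page_url: str, now: Optional[datetime] = None) -> dict:
    """Bundle the offending URL, a UTC timestamp, and the lookup links."""
    checked_at = now or datetime.now(timezone.utc)
    return {
        "url": page_url,
        "checked_at": checked_at.isoformat(),
        "wayback_calendar": wayback_calendar_url(page_url),
        "wayback_lookup": wayback_availability_url(page_url),
    }


if __name__ == "__main__":
    # example.com stands in for the offending site.
    for key, value in evidence_record("http://example.com/copied-post").items():
        print(key, value)
```

Keeping the timestamp next to the links makes it easier to show a hosting provider how long the copying has been going on.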

  34. The answer is DRM, not DMCA. You need to proactively protect your content using technical means, not reactively throw lawyers at the problem. This would mean changing the way everyone publishes and consumes content. Unfortunately, that’s a huge change, but the only way we can ensure correct attribution and compensation.

  35. It’s a win-win situation. Google makes money, scrapers make money, AdWords publishers get targeted traffic.
    But it’s also a trade-off for Google: it needs to maintain the quality of its results, but also be as profitable as possible. So while other search engines aren’t doing anything to improve their SERPs and ban scrapers, why would Google care?

  36. Om,

    I have an online wholesale auction and have to tell you.. the online wholesale business is full of thieves. If you aren’t careful and blink, your website might be stolen in broad daylight. It happened to me!

    My “How it Works Page” was found on a website in Germany. Word for Word.

    I have two suggestions:

    1. Bookmark http://www.copyscape.com.

    Enter the URL and the software can scout the internet and notify you in seconds of all the websites currently using your content.

    2. Report adsense abuse to Google. The email address is adsense-abuse@google.com

    Hope that helps and if you get a chance, give janesdeals.com a plug in your blog!

    🙂

    Thanks!

  37. If a site robs your content, it is not always a negative. You should add your URL to each post and/or in the text itself. This way, when your content is taken, the thieves would have to edit out all your URL links. If they leave your links, then you not only get the credit but also get another link, which may help your PR in the future.
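
Adding your URL to each post can be automated at publish time. The sketch below is just one way to do it, in Python: the function name `with_attribution`, the footer wording, and the example site name are all invented for illustration and do not come from any blogging platform.

```python
from html import escape


def with_attribution(post_html: str, title: str, permalink: str, site_name: str) -> str:
    """Append a footer linking back to the original post.

    The footer travels with the post body, so a scraper copying the feed
    verbatim also copies the link unless they edit it out.
    """
    footer = '<p>Originally published as <a href="{url}">{title}</a> on {site}.</p>'.format(
        url=escape(permalink, quote=True),  # escape & and quotes inside the href
        title=escape(title),
        site=escape(site_name),
    )
    return post_html + "\n" + footer


if __name__ == "__main__":
    print(with_attribution(
        "<p>Post body goes here.</p>",
        "Example post",
        "http://example.com/example-post",
        "Example Blog",
    ))
```

A scraper that strips links will defeat this, so it is a nudge toward credit rather than real protection.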

  38. I personally use the http://www.copygator.com website to find duplicated content. To me it has a number of benefits over copyscape and copyrightspot:

    1. it’s automated and brings me results instead of me searching for duplicated content. All i had to do was submit my feed and it started monitoring my feed showing me who’s republished my articles on the web.

    2. i get notified by email so it contacts me when it finds copies of my articles online.

    3. i use their image badge feature to alert me directly on my website when my content is being lifted.

    4. it’s a free service as opposed to the “per page” cost of copyscape/copysentry.
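
For a sense of how duplicate detection can work in principle, here is a minimal Python sketch that compares two texts by their overlapping five-word sequences ("shingles") and a Jaccard score. It is not how Copygator or Copyscape actually work, which I don't know; the function names, the shingle size of 5, and the 0.5 threshold are all assumptions picked for the example.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences of length size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or nothing if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def looks_copied(original: str, candidate: str, threshold: float = 0.5, size: int = 5) -> bool:
    """Flag candidate when it shares enough shingles with original."""
    return similarity(original, candidate, size) >= threshold


if __name__ == "__main__":
    mine = "Wholesale plagiarism of blogs should not be tolerated by anyone at all"
    theirs = "wholesale plagiarism of blogs should not be tolerated by anyone at all!"
    print(looks_copied(mine, theirs))
```

A word-for-word copy scores close to 1.0 while unrelated text scores close to 0.0, so a threshold somewhere in between separates wholesale copies from short quotes; the right value would need tuning on real posts.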

  39. Choosing An Online Plagiarism Detector To Check For Plagiarism

    Plagiarism is a growing problem in academia and the work place. The internet has made it easy for nearly anyone to copy written material and pass it off as their own work. Because of the legal and ethical dilemmas associated with plagiarism, plagiarism checking software is now readily available. With so many online plagiarism detectors, choosing one may seem like an overwhelming task, but it can be easy if you know what you’re looking for.
    You can also find more details and services about plagiarism here:

    Thanks

  40. I know this post is several years old, but you might get a kick out of my story.

    I had someone selling my free service on eBay. He had the audacity to a) lift the code, unchanged, directly off my site and use it in his own product description and b) send his customers directly to my site where they would see that what they had just paid for was actually free.

    Well, after an angry customer told me about him, I got creative in my response. I replaced all the images with ones that are illegal on eBay, then reported the seller. eBay suspended his account just a few days later.

  41. I found a site that was my site copied almost verbatim. Every article page was the same; they only changed the site map to an articles page, used a different header, and changed my affiliate links to theirs.

    Apparently someone is selling PLR packs of my articles, according to one guy I contacted to ask why he was using my articles.

    I used these articles to drive traffic to my program ebay affiliate website builder and only noticed it because of a drop in different keywords S.e Traffic came via
