Without getting into the semantics of the underlying technologies, I have believed that RSS is to the written word what TiVo is to television. RSS has been a ruthlessly efficient and rather simple way to keep in touch with breaking developments in the areas that interest me. I have used three main services – PubSub, Feedster and Technorati – and set up more than two dozen keyword searches, so that I end up finding even the most obscure feed that mentions, say, the e911-VoIP debate.
It has been working well, but lately I have observed a few disturbing signs that could quickly turn RSS into yet another technology to avoid. RSS searches are now throwing up classified listings, not just from Craigslist but from other sites as well. This shows the obvious shortcomings of the RSS search engines. It is going to be a bigger problem going forward, as more and more people indiscriminately churn out RSS feeds. Hey, when they set up RSS-only venture funds, you know it is the “loss of innocence.” But there is another evil lurking out there.
I find feeds from some dubious websites whose sole purpose is to output hundreds of RSS items and dupe people into clicking on some of the links. Others are simple commercial messages. For instance, a site like VoIP-Information-Guide is nothing but an affiliate site: a whole lot of links and Google AdSense text ads. The RSS output of these sites includes Google text ads and Moreover news feeds. In many ways I am impressed by the chutzpah of these folks and their ability to game the system, but at the same time it is distressing, because in the end it is going to completely pollute the RSS ecosystem. The RSS search companies like Feedster and Technorati will have to fight this, but how, I am not sure. Will RSS spam become as much a part of life as, say, email spam or spyware?
Technorati has several full-time employees weeding out spam blogs. The others probably do as well.
I hope so. I wonder if this is a problem that can be licked. Not sure how, though.
I think you could eliminate most of the SPAM blogs by removing all blogs hosted at blogspot.com or created with WordPress blogware 😉
I first started to notice spam via RSS back in January of this year:
http://www.hyku.com/blog/archives/000239.html
My post ended up starting a conversation with Bob Wyman of PubSub, who shared his thoughts on what they are doing:
http://hyku.com/blog/archives/000251.html
There are now some Technorati/Feedster searches that produce 86% spam on the first page alone:
http://hyku.com/blog/archives/000497.html
A few of the PubSub searches I monitor are now useless since the majority of the results are spam. Now comes the fun of Tag spam:
http://hyku.com/blog/archives/000653.html
Sorry for all the links, but it’s something I have been following for a while.
Om,
Technorati has a whole group of people who are dedicated to identifying and eliminating spam from our indexes algorithmically, heuristically, and, as a last resort, via human means. We’ve been working with the industry on fighting spam, and organized the first web 2.0 spam squashing summit.
Things are still early, and will take some time to shake out. However, I like to rely on what my friend Cory Doctorow said: “All healthy ecosystems have parasites.” So the question is not, “will there be spam?” but only, “will this be manageable, or will this be the red tide of spam that overtakes the industry?” I for one believe that the former is true. But to be perfectly honest, only time will tell.
Dave
RSS spam isn’t new, but it certainly seems to be growing. And will it overtake the industry? Yeah. Web-based search engines have been fighting spam for nearly 10 years now. They have constantly had to evolve their defenses. Link analysis is just now giving way to personalized search as a way to prevent this. The RSS search engines I’ve used often demonstrate poor text-retrieval relevancy to begin with. Add spam to that, and they’ve got a long slog ahead of them. Making matters worse, several of them work by taking whatever is fed to them rather than trying to include only the “best,” which leaves them even more open to being spammed.
You could use my FooBar Search Alerts (http://www.ypjain.com/notify) to monitor specific sites or RSS/Atom feeds for certain keywords or queries. It doesn’t monitor the entire web (as Technorati and the like do). It just monitors any page or feed that you want it to, and will send you links to the actual entries when it finds matches in the feed.
Om, that is an issue, and will continue to be an issue as RSS develops and evolves.
But we at Nooked also realize that people DO want RSS feeds for marketing, for coupons, for deals, and that is what we are working on with marketing firms: how to do RSS in a smart way that doesn’t devolve into SPAM.
We also have a search engine for RSS feeds on offers, news, deals and coupons, which is SPAM-proof thanks to the community filtering we use.
regards
fergus