It has been a long time coming, but Powerset, a San Francisco-based contextual-semantic search engine, has finally launched. I urge you to try it out, for this is quite an impressive search effort, even though it is currently limited to searching Wikipedia along with some supplementary results from Metaweb’s Freebase. I think it has made Wikipedia much easier to use. I like how you can do more topic-based searches and get a holistic view of the information you’re looking for. Danny Sullivan over on Search Engine Land has an elaborate and fantastic in-depth review of Powerset, and that frankly obviates the need for any other review.
That said, Powerset faces an uphill climb, especially when it comes to consumer mindshare. I think Google has become so synonymous with search that it is virtually impossible for a newcomer to establish a toehold. Powerset’s approach is different, and its tactic of applying its technology to specific content repositories such as Wikipedia is smart. But will web searchers come and use Powerset?
At our recent GigaOM PM event, Chad Walters, director of engineering, search and platform at Powerset, gave a talk about how his company was using Hadoop and other clever technologies to meet its immense infrastructure needs. Here are some bits from OStatic’s live blog coverage of the event:
Powerset applies deep natural language processing (based on technology licensed from Xerox PARC), which means the company needs 100 times more processing horsepower than simple keyword searching and indexing would. Powerset uses a distributed database system called HBase in tandem with Coral, its Document Processing System. Coral uses Hadoop as its job control machine. Powerset uses 92 eight-core machines to do processing.
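To put those numbers in perspective, here is a rough back-of-the-envelope sketch. It assumes the 100x figure applies uniformly to every core, which is my simplification and not something Walters claimed; the function and constant names are purely illustrative.

```python
def total_cores(machines: int, cores_per_machine: int) -> int:
    """Total processing cores in the cluster."""
    return machines * cores_per_machine


def keyword_equivalent_cores(cores: int, nlp_cost_factor: float) -> float:
    """Cores a plain keyword indexer would need for the same workload,
    ASSUMING the NLP cost factor applies uniformly (a simplification)."""
    return cores / nlp_cost_factor


MACHINES = 92             # from the talk
CORES_PER_MACHINE = 8     # from the talk
NLP_COST_FACTOR = 100.0   # "100 times more processing horsepower"

cores = total_cores(MACHINES, CORES_PER_MACHINE)
equivalent = keyword_equivalent_cores(cores, NLP_COST_FACTOR)

print(f"{cores} cores total")
print(f"roughly {equivalent:.2f} cores' worth of keyword-style indexing")
```

In other words, under that assumption the whole 736-core cluster gets through about as much text as seven or so cores doing simple keyword indexing, which gives a sense of why the infrastructure matters so much here.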