March 24, 2011

Meet Mapr, a Competitor to Hadoop Leader Cloudera

Cloudera's Amr Awadallah, Pervasive Software's Mike Hoskins, 10gen's Dwight Merriman, Yahoo's Todd Papaioannou, and DataStax's Ben Werther on the Hadoop and Beyond panel at Structure: Big Data

Hadoop, the open-source file system and MapReduce implementation for massive-scale data, was the talk of our Structure Big Data conference in New York on Wednesday. From new Hadoop distributions to end customers’ plans, Hadoop was all anyone could talk about. One name that cropped up in conversations was Mapr, a stealth-mode company that is building a proprietary version of Hadoop and is likely to launch later this year.

Mapr, based in ~~Saratoga~~ San Jose, Calif., has been in the works for nearly two years. Securities and Exchange Commission filings show the company has raised about $9 million in funding from Barry Eggers of Lightspeed Venture Partners and Peter Sonsini of New Enterprise Associates. On its web site, the company says it’s “engineering game changing Map/Reduce related technologies.” Its ambitions aren’t limited by that somewhat ambiguous statement.

People Behind Mapr:

M.C. ~~Srinivas~~ Srivas, an ex-Googler, is the founder and CTO of the company.
John Schroeder, formerly of Lightspeed VC and former CEO of Calista Technologies (acquired by Microsoft) and Rainfinity (acquired by EMC), is the CEO and co-founder of Mapr.
The company has close to 30 employees, many of them based in India.
Ted Dunning, chief scientist at Site Tuner and Veoh Networks, is the chief application architect at Mapr. He created the recommendation engine for Musicmatch, a music service that was popular before iTunes came on the scene. He is also one of the key guys behind the Apache Mahout data-mining project.

What Is Mapr Doing?

Mapr is said to be building a proprietary replacement for the Hadoop Distributed File System (HDFS) that’s allegedly three times faster than the current open-source version. It comes with snapshots and no NameNode single point of failure (SPOF), and is supposed to be API-compatible with HDFS, so it can serve as a drop-in replacement.

The Road Ahead

Mapr might have an edge over Apache Hadoop in the interim, but Apache is working to improve the HDFS architecture in its distribution, and should have its own snapshot feature sometime in 2012. Also, Appistry sells a NameNode-free HDFS alternative based on its distributed CloudIQ Storage offering. As for the speed advantage, I don’t have any details for now, but if you have some thoughts, please share them with us.

On a broader canvas, I think Mapr is up against a whole lot of major competitors. Cloudera has a lead in the commercial marketplace, and the Apache Hadoop distribution on which its product is based keeps improving thanks to upgrades from contributors like Facebook and Yahoo. Apache Hadoop gives companies more control over their data, as they are not held hostage by a vendor, and surveys and anecdotal evidence alike suggest that Apache Hadoop is still the most widely used version.


10 thoughts on this post

ct says:

March 24, 2011 at 12:51 pm

The MapR website says San Jose, CA (not Saratoga), then again it also says (C) 2009… maybe their site is out of date?

1. Cyndy Aleo says:
  
  March 24, 2011 at 7:10 pm
  
  Thank you for the comment. The site shows San Jose, but the first address listed on the SEC filing is Saratoga. We changed to acknowledge the paperwork was probably filed before the company had office space.
  
2. Om Malik says:
  
  March 28, 2011 at 3:14 pm
  
  I was going with the first/only official address there is on a formal document.
  
ss says:

March 24, 2011 at 7:20 pm

small correction, CTO is M.C. Srivas, not M. C. Srinivas.

1. Cyndy Aleo says:
  
  March 27, 2011 at 10:07 am
  
  Thanks for the comment; we’ve made the change.
  
Mark says:

March 25, 2011 at 9:41 am

No mention of Pervasive datarush, which is an already shipping parallel and Hadoop-compatible programming environment?

Ashwin Jayaprakash says:

March 25, 2011 at 7:32 pm

+1 DataStax’s new Brisk project to integrate Cassandra with Hadoop.

hf says:

March 27, 2011 at 10:42 pm

Couple of companies are already trying out MapR product and loving it

1. Om Malik says:
  
  March 28, 2011 at 3:15 pm
  
  Can you tell us which are those companies?
  
  1. hf says:
    
    May 9, 2011 at 11:32 pm
    
    Now you know 🙂
    
