The summer of 2008 has been the best of times and the worst of times for cloud computing. Many companies, big and small, decided to throw in their lot with cloud computing, betting that it is the future of technology infrastructure. At the same time, cloud computing took its lumps as some of the early large-scale cloud applications hit the skids.
Apple’s MobileMe went on the blink for many, while the Gmail blackout left millions angry and frustrated. Even Amazon’s seemingly foolproof S3 service was down for an extended period of time, impacting thousands of its customers. This isn’t the last we have seen of these outages. As the size and scope of cloud computing grow, so will the problems and the need for tools to monitor the clouds.
Enter San Francisco-based web infrastructure monitoring service provider Hyperic, which recently launched CloudStatus, a hosted real-time monitoring service that keeps an eye on cloud-based services. Thus far CloudStatus is monitoring Amazon’s web services, but sometime later today the company will start monitoring Google’s App Engine infrastructure, a move that has been blessed by the search engine giant. The service, still in beta testing, is free for the foreseeable future, but the company might charge for premium services at a later date.
To describe it in super-simplistic terms, this is how the service works: Hyperic has developed an application that runs on Google App Engine and essentially sends all sorts of information to an agent sitting in Hyperic’s data center, which in turn passes it on to Hyperic’s main offering. Through a web-browser interface, folks can then keep an eye on the status of the cloud.
What will it do for Google App Engine users? Javier A. Soltero, CEO of Hyperic, said that it will help users answer questions like “How fast is the App Engine cache service running?” or “What’s the response time to Facebook’s API from App Engine’s perspective?” and other such questions that can help keep apps healthy.
The company can do this because the service is based on agents that are deployed both inside and outside the “cloud infrastructure” and are specialized for the kind of services they monitor. A storage agent, for instance, may monitor latency, throughput and remaining capacity. A compute engine agent would monitor load, availability and response times.
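As a rough illustration of the kind of measurement such an agent makes, here is a minimal Python sketch that times a probed call and reduces the results to availability and mean latency. It is not Hyperic's code: `run_probe`, `summarize`, `fake_storage_read` and `failing_read` are hypothetical names, and the "storage request" is simulated with a short sleep rather than a real cloud call.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ProbeResult:
    ok: bool          # did the probed call succeed?
    latency_s: float  # wall-clock time the call took, in seconds


def run_probe(check: Callable[[], None]) -> ProbeResult:
    """Time one call to `check`; any exception counts as a failure."""
    start = time.perf_counter()
    try:
        check()
        ok = True
    except Exception:
        ok = False
    return ProbeResult(ok=ok, latency_s=time.perf_counter() - start)


def summarize(results: List[ProbeResult]) -> dict:
    """Reduce raw probe results to availability and mean latency."""
    if not results:
        return {"availability": None, "mean_latency_s": None}
    successes = [r for r in results if r.ok]
    mean_latency: Optional[float] = None
    if successes:
        mean_latency = sum(r.latency_s for r in successes) / len(successes)
    return {
        "availability": len(successes) / len(results),
        "mean_latency_s": mean_latency,
    }


def fake_storage_read() -> None:
    """Hypothetical stand-in for a real storage request."""
    time.sleep(0.001)


def failing_read() -> None:
    """Hypothetical stand-in for a request made during an outage."""
    raise ConnectionError("simulated outage")


if __name__ == "__main__":
    results = [run_probe(fake_storage_read) for _ in range(3)]
    results.append(run_probe(failing_read))
    print(summarize(results))
```

Running the same probe from a machine inside the provider and from one outside it would give the two vantage points the article describes; the sketch leaves that deployment detail out.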
Taking measurements from both sides of the wall (that is, from inside the cloud provider’s operation, and from the outside looking in) gives Hyperic an advantage, says Soltero, who claims that it “picked up the Amazon S3 problems about 20 minutes before Amazon announced (the outage).” He explained that the service is capable of monitoring any number of different clouds, and that the company will add more cloud providers to the list.
For Hyperic, CloudStatus is a chance to stand out amongst its competitors. While companies like StubHub, Comcast and CNET have adopted Hyperic’s web monitoring tools, the company has little traction inside enterprises. On the web side, several companies such as Gomez, Keynote and Webmetrics offer Hyperic-like services.
Cloud computing, however, is a new game where Hyperic can make a play to win, though it would need to add depth and value to its offerings. After all, looming in the background is the distinct possibility that some day Amazon and Google will roll out their own monitoring services.
I just checked and CloudStatus is now monitoring Google App Engine.
Om…you mention the possibility of Amazon or Google rolling out its own monitoring services.
While this is a distinct possibility, it probably would not be in the interest of customers to have their provider also doing the monitoring. It’s a bit too much like grading your own paper 🙂
Companies like AlertSite (and Keynote and Gomez) provide completely independent, external perspectives on web application performance and service delivery. These should be part of the operational toolkit of any company that has important interactions with customers over the internet.
Ken Godskind
http://www.alertsite.com
@Ken Godskind
Good points, but you know it must be something that comes as part of the offering, and you would still need a second opinion, so to speak.
Om,
Another important aspect of such a service is that it helps resolve SLA disputes under a contract. I am sure that buyers of cloud services would want to ensure that SLAs are met, and if they aren’t, that credits are applied to their account.
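To make the SLA point concrete, here is a small Python sketch of the arithmetic involved, assuming monitoring data gives the minutes of downtime in a billing period. The 99.9% target and the 10% credit are hypothetical figures chosen for illustration, not terms from any provider's contract, and `availability_pct` and `credit_pct` are made-up helper names.

```python
def availability_pct(total_minutes: float, downtime_minutes: float) -> float:
    """Percentage of the period during which the service was up."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def credit_pct(measured_pct: float,
               target_pct: float = 99.9,    # hypothetical SLA target
               credit: float = 10.0) -> float:  # hypothetical credit, % of bill
    """Return the credit owed if measured availability misses the target."""
    return credit if measured_pct < target_pct else 0.0


if __name__ == "__main__":
    minutes_in_30_days = 30 * 24 * 60  # 43,200 minutes
    for downtime in (30, 120):
        measured = availability_pct(minutes_in_30_days, downtime)
        print(downtime, round(measured, 2), credit_pct(measured))
```

With these assumed terms, 30 minutes of downtime in a 30-day month stays above the target, while two hours falls below it and triggers the credit. An independent measurement of the downtime figure is what lets a customer make that claim.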
When Netflix had problems last week, its service department proactively notified customers and promised to credit accounts if there were delays. Cloud service providers must learn from this great example and use tools such as CloudStatus for a competitive advantage.
– Ranjit
@Om
It is exactly the fact that it wouldn’t come as part of the product offering that makes it more authentic.
Regardless of whether an entity is a corporate enterprise or a social network, and whether the application is hosted internally, at a large service provider or at one of the cloud offerings, the real stakeholders in any organization need to maintain an objective awareness of how critical online applications are serving users.
This is important for some of the reasons mentioned above and many others like:
– SLA Management
To monitor the service delivery of your service provider or to
deliver proof of performance to your customers
– Operational Management
To proactively be notified when there are issues with key functionality
– Trending and Analysis
To understand performance of the various pieces of application
functionality and how that varies over time and throughout the day
– Rapid triage
To use the diagnostic information captured by these tools to rapidly
identify and address issues affecting users
Ken Godskind
VP, Web Performance Evangelism, AlertSite
http://www.alertsite.com
Hi,
I think folks are getting carried away with the on-demand aspects of cloud computing.
One of the key benefits of clouds is the promise of a simplified management console or API.
Hopefully some of the guys who burn themselves on the large public cloud facilities (Amazon, Google) will not be turned off from the benefits that can accrue if people set up “private” clouds.
Regards,
Alok