VMware Acquisitions – What’s it all mean?


There has been a lot of activity here at VMware with acquisitions and partnerships over the past few months.  A fellow engineer at VMware summarized a lot of these acquisitions and how they are meaningful to VMware as an organization.  I wanted to share this information because I think it provides people with a better understanding of where we are going as a company and the overall strategic vision of VMware (Thanks again Andy!).



RabbitMQ

– What is it? RabbitMQ is open source, message-oriented middleware that implements the Advanced Message Queuing Protocol, or AMQP (originally co-developed by JPMC). You may be familiar with things like JMS or MQSeries, and this is in the same class as those. An important difference is that the AMQP standard defines messaging as a wire-level protocol (in the way HTTP is a protocol), and not as an API like JMS. In short, this means much greater interoperability across platforms other than Java.

– How is it used? RabbitMQ allows distributed components to communicate very rapidly (potentially with sub-millisecond latency) even though the components are completely decoupled and could be running on totally different platforms that are otherwise unaware of each other. In other words, the components do not need to wait for each other: messages can be sent asynchronously. The “Origins” section at http://en.wikipedia.org/wiki/Message_Oriented_Middleware discusses the original problems and requirements that gave rise to message-oriented middleware in general.

– Why did we acquire it? You can probably start to imagine that cloud architectures lend themselves extremely well to component-based, decoupled applications that could be running anywhere, including in an organization’s private cloud or out in a public cloud. Based on that, some sort of messaging is needed between all the various components and the users, wherever they happen to be located.  RabbitMQ provides us this capability, whether the components are Spring-based or not, and does so in a very standards-based, open manner.

– Example: RabbitMQ is currently available as a component on Amazon’s EC2, and this post is a straightforward example of how it was used in one case: http://railsdog.com/blog/2009/12/generating-pdfs-on-ec2-with-ruby/ There is also much more information at http://www.rabbitmq.com/how.html
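
To make the decoupling described above a little more concrete, here is a minimal sketch in Python. It does not use RabbitMQ or AMQP at all: `ToyBroker`, `publish`, `consume`, and the `pdf_jobs` queue name are hypothetical names for an in-process stand-in built on the standard library. The only point is that the producer can publish and move on before any consumer exists.

```python
import queue
import threading


class ToyBroker:
    """Hypothetical in-process stand-in for a broker (not RabbitMQ's API)."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def _queue(self, name):
        # Create the named queue on first use.
        with self._lock:
            return self._queues.setdefault(name, queue.Queue())

    def publish(self, queue_name, body):
        # Returns immediately; the producer never waits for a consumer.
        self._queue(queue_name).put(body)

    def consume(self, queue_name, handler, stop_marker=None):
        # Hand each message to the handler until the stop marker arrives.
        q = self._queue(queue_name)
        while True:
            body = q.get()
            if body is stop_marker:
                break
            handler(body)


def run_demo():
    broker = ToyBroker()
    results = []

    # The producer publishes three jobs before any consumer is running.
    for i in range(3):
        broker.publish("pdf_jobs", f"render invoice {i}")
    broker.publish("pdf_jobs", None)  # stop marker for this demo only

    # The consumer starts later, on its own thread, and drains the queue.
    worker = threading.Thread(
        target=broker.consume,
        args=("pdf_jobs", lambda body: results.append(body.upper())),
    )
    worker.start()
    worker.join()
    return results


if __name__ == "__main__":
    print(run_demo())
```

With a real broker the two sides would be separate processes, possibly on different platforms, talking to the broker over the network rather than sharing memory.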



GemStone

– What is it? GemStone’s flagship product is called GemFire, and it does in-memory distributed data management. There’s that word distributed again – cloud anyone? Because it does everything in memory, it offers very high performance and low latency, while providing availability and horizontal scalability. To me, some of the concepts are similar to those of a directory service. Data is partitioned so it can be placed on different nodes, and is then replicated for availability. Again, this is all in memory, and the ability to partition across nodes keeps latency very low while letting the system scale out in a cost-effective manner.

– How is it used? One common use case for GemFire is rapid decision support, where multiple sets of real-time data are combined to provide live situational awareness, or to inform real-time decisions about risk, pricing, etc. Data that is persisted in relational stores on disk, even if optimized for decision support (like a data warehouse), is ill-suited to these time-sensitive use cases, and this is where GemFire shines. As we saw with the fast stock market a couple of weeks ago, seconds can sometimes be an eternity. GemStone’s customer page references JPMC, which uses GemFire in a continuously updated pricing and risk computation system for derivatives trading. The system consists of hundreds of commodity systems tied together with GemFire. If you have large financial services customers, they are probably using GemFire.

– Why did we acquire it? One of the key issues with the shift to cloud computing is the availability of data, regardless of location. Data access paradigms have to shift to support cloud, where applications may be distributed across highly diverse architectures, yet will still have the same performance, availability, and security requirements. GemFire is a key component here, ensuring that the data these applications and users need is available where it is needed.

– Example: For more on GemFire usage at JPMC and other organizations, see http://gemstone.com/industry and http://gemstone.com/customers
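
The partition-then-replicate idea described above can be sketched in a few lines of Python. This is not GemFire's API or its actual placement algorithm: `ToyPartitionedStore`, the CRC-based placement rule, and the single redundant copy are all assumptions chosen to keep the illustration short.

```python
import zlib


class ToyPartitionedStore:
    """Hypothetical in-memory store: each key lives on a primary node
    plus a configurable number of redundant copies on neighbouring nodes."""

    def __init__(self, node_count=3, redundant_copies=1):
        self.nodes = [dict() for _ in range(node_count)]
        self.alive = [True] * node_count
        self.copies = redundant_copies + 1

    def _owners(self, key):
        # Assumed placement rule: hash the key to pick a primary node,
        # then use the following nodes for the redundant copies.
        primary = zlib.crc32(key.encode("utf-8")) % len(self.nodes)
        return [(primary + i) % len(self.nodes) for i in range(self.copies)]

    def put(self, key, value):
        # Write to every live owner of the key.
        for n in self._owners(key):
            if self.alive[n]:
                self.nodes[n][key] = value

    def get(self, key):
        # Read from the first live owner that holds the key.
        for n in self._owners(key):
            if self.alive[n] and key in self.nodes[n]:
                return self.nodes[n][key]
        raise KeyError(key)

    def fail_node(self, n):
        # Simulate losing a node and everything it held in memory.
        self.alive[n] = False
        self.nodes[n].clear()


if __name__ == "__main__":
    store = ToyPartitionedStore()
    store.put("price:IBM", 132.5)
    primary = store._owners("price:IBM")[0]
    store.fail_node(primary)
    print(store.get("price:IBM"))  # still served from the redundant copy
```

Each node holds only a slice of the data, which is what lets capacity grow by adding nodes, and the extra copy is what keeps a key readable when its primary node is lost.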



Redis

– What is it? Wikipedia’s definition is a mouthful: “Redis is an open-source, networked, in-memory, persistent, journaled, key-value data store.” It is a structured storage system, in contrast to a typical relational database like MySQL, where the structure of the database itself is fixed, consistency is ensured, and data relationships are defined via joins. Put simply, Redis is a key-value store, but it is actually much more powerful than that suggests. An example of a key-value pair is something like “server:type => linux.” However, values can be things besides simple strings, like lists, sets, sorted sets, or hashes. Depending upon the type of data, you can run different commands against it, such as incrementing a value, or pushing values onto and popping values off lists. A lot of these functions can replace things you would typically do in your application code, and the advantage of doing them in the data layer is that you have a single source of truth for the multiple people or processes that are reading and interacting with the data simultaneously.

– How is it used? All of the things mentioned above are done in-memory, but Redis also writes out to disk asynchronously.  This is a key difference between Redis and an RDBMS, which must write data to disk before it is considered committed.  Because of this, Redis is extremely fast for both reads and writes.

– Why did we acquire it? We actually hired the owner of the open source project, and it will remain free open source software. I see a few places where this could play a role. First, Redis enables easy and rapid application development, because it is much more flexible than an RDBMS and doesn’t require developers to go through gyrations to fit their data into a relational paradigm. In that sense, it fits the cloud development model and our entry into the PaaS world quite well. Second, a technology like Redis could provide us a way to scale our infrastructure components themselves. For example, as vSphere or even tc Server hosts are scaled horizontally, something like Redis could be used to ensure configuration consistency and store state information.

– Examples: Redis is used by Craigslist as part of their spam-filtering system, and tools in the same class as Redis are used by such sites as Facebook, Digg, and Twitter.
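
As a rough illustration of the typed values and commands described in this section, here is a toy Python model. It is not a Redis client and does not talk to a Redis server: `ToyRedis` is a hypothetical class whose methods are named after the real `SET`, `GET`, `INCR`, `LPUSH`, `RPOP`, and `SADD` commands and mimic their basic behaviour in memory. One simplification to note: it stores counters as Python integers.

```python
from collections import deque


class ToyRedis:
    """Hypothetical in-memory model of a few Redis commands."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def incr(self, key):
        # A missing key counts as 0, so the first INCR returns 1.
        value = int(self._data.get(key, 0)) + 1
        self._data[key] = value
        return value

    def lpush(self, key, *values):
        # Push onto the head of the list; return the new length.
        items = self._data.setdefault(key, deque())
        for v in values:
            items.appendleft(v)
        return len(items)

    def rpop(self, key):
        # Pop from the tail of the list; None if the list is empty or missing.
        items = self._data.get(key)
        return items.pop() if items else None

    def sadd(self, key, *members):
        # Add to a set; return how many members were actually new.
        s = self._data.setdefault(key, set())
        before = len(s)
        s.update(members)
        return len(s) - before


if __name__ == "__main__":
    r = ToyRedis()
    r.set("server:type", "linux")
    print(r.get("server:type"))
    print(r.incr("page:hits"), r.incr("page:hits"))
    r.lpush("jobs", "first")
    r.lpush("jobs", "second")
    print(r.rpop("jobs"))
```

Pushing on one end and popping from the other turns a list into a simple work queue, and because the counter lives in the data layer, every process that increments it sees the same value.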