I am blogging my way through a seamless strategy for moving workloads and data across cloud boundaries. Along the way I am taking a look at some relevant inter-cloud technologies that currently exist in EMC’s portfolio, including:
- CloudLink as a foundational security technology.
- CloudBoost as a file-system interface to bridge between clouds.
In this post I’d like to introduce a third technology that can serve as a pillar for inter-cloud networking: CloudArray. The CloudArray technology has evolved within EMC since the TwinStrata acquisition in 2014. Before the acquisition I recall discussing with customers the inevitable expansion of EMC’s (in-array) tiering algorithms (known as FAST) in two directions. The first direction would be northbound (moving “hot” data towards the server) and the second direction would be southbound (moving “cold” data into cloud storage). I wrote about this tiering architecture in 2014 and used the following diagram to highlight the expansion.
The TwinStrata acquisition was consistent with the storage gateway vision, but with a surprising twist: the CloudArray technology is being asked to support a new use case, “hot data” in the cloud. Here are some of the primary use cases that the CloudArray technology addresses:
- Capacity Expansion: Migrate older, static data to free up capacity on primary infrastructure and reduce overall cost to store.
- Remote Office Storage: With a tiny infrastructure footprint, enable unlimited scalability into the cloud while locally caching active data.
- Off-site Storage: Use CloudArray as an on-ramp to the cloud as an off-site tier of storage for copies and inactive data.
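To make the "keep active data local, move static data out" idea a bit more concrete, here is a minimal Python sketch. It is not CloudArray's actual algorithm: the `Block` type, the `split_by_temperature` function, and the 30-day threshold are all hypothetical names and values I am assuming purely for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Block:
    # Hypothetical unit of data; a real gateway would track far more metadata.
    name: str
    last_access: datetime


def split_by_temperature(blocks, now, cold_after=timedelta(days=30)):
    """Return (hot, cold): a block is cold if untouched for at least cold_after."""
    hot, cold = [], []
    for block in blocks:
        if now - block.last_access >= cold_after:
            cold.append(block)  # candidate for the cloud tier
        else:
            hot.append(block)   # stays in the local cache
    return hot, cold
```

Under these assumptions, anything in the `cold` list is a candidate for migration to cloud storage, while the `hot` list stays cached on the local footprint.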
The CloudArray functionality, including its ability to connect to both public and private clouds, is depicted below.
The surprising twist is that CloudArray can, in theory, be used as a platform to push “hot” data to the cloud. This could be accomplished by leveraging CloudArray as a snapshot engine that creates cloud-based copies or clones. The cloning could be a one-time event, or it could be a scheduled event that recurs at a chosen interval (e.g. daily or weekly).
The beauty of this approach is that once primary data has been efficiently and securely copied to the cloud, the copy essentially acts as a simulated primary environment with the following benefits:
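The one-time versus scheduled distinction can be sketched in a few lines of Python. This is only an illustration of interval-based scheduling, not how CloudArray schedules snapshots; `snapshot_times` and `is_snapshot_due` are hypothetical helpers I am assuming for the example.

```python
from datetime import datetime, timedelta


def snapshot_times(start, interval, count):
    """Return `count` snapshot times beginning at `start`, spaced `interval` apart."""
    return [start + interval * i for i in range(count)]


def is_snapshot_due(last_snapshot, now, interval):
    """True when at least one full interval has elapsed since the last snapshot."""
    return now - last_snapshot >= interval


# A weekly schedule starting on a hypothetical date.
weekly = snapshot_times(datetime(2015, 1, 5), timedelta(days=7), 3)
```

A one-time clone is simply the `count=1` case; a daily or weekly schedule only changes the `interval` argument.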
- It is much less expensive than creating internal copies of data.
- It is much more available to experts (internal or external) located globally.
- The production systems are unaffected by the access.
In other words, the data can be treated as hot while being accessed in the cloud. Common use cases include:
- staging or recovering in the cloud for disaster scenarios
- separately encapsulating volumes in the cloud for other purposes (including analytic modeling)
The diagram below highlights what the architecture would look like in conjunction with a VMAX system:
CloudArray is another tool in the toolkit for navigating inter-cloud networking use cases. As I move forward in this series of posts I will begin looking at the grouping of technologies below and examine their co-existence and potential uses in combination with each other.
Steve
Twitter: @SteveTodd
EMC Fellow





