Last week I introduced my thoughts on the topic of Data Valuation.
This week, in advance of my discussion at Thursday’s IT@Cork event, I’d like to explore how the emerging Data Insurance phenomenon is driven in large part by hybrid and public cloud use cases for both large enterprises and “born digital” startups.
For large enterprises, the image on the left represents a tablet-based scenario where the bits behind the tablet are being stored in a private cloud configuration, while the bits flowing out of the front of the tablet end up being stored in a public cloud configuration. Often the outflow of enterprise bits happens in a tiered fashion, driven by background algorithms. The diagram below displays the movement of bits from a “hot-edge” flash-based server tier, down to a “warm” flash storage tier, down to a “cold” spinning-disk tier, and then out to a cloud-based storage environment via a cloud gateway solution.
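To make the tiered outflow concrete, here is a minimal sketch of the kind of background policy that could decide where a block of data lives. The tier names follow the description above; the idle-time thresholds and the function name `tier_for` are my own assumptions for illustration, not the behavior of any particular product.

```python
# Tiers in the order described above, hottest first.
TIERS = [
    "hot-edge flash server",
    "warm flash storage",
    "cold spinning disk",
    "public cloud (via gateway)",
]

# Hypothetical demotion thresholds: the maximum number of idle days a
# block may accumulate before it leaves each of the first three tiers.
MAX_IDLE_DAYS = [7, 30, 180]


def tier_for(idle_days: int) -> str:
    """Return the tier a block should live on, given days since last access."""
    for tier, limit in zip(TIERS, MAX_IDLE_DAYS):
        if idle_days <= limit:
            return tier
    # Anything idle longer than the last threshold is pushed out to the cloud.
    return TIERS[-1]


if __name__ == "__main__":
    for days in (0, 10, 90, 365):
        print(days, "->", tier_for(days))
```

A real tiering engine would weigh far more than idle time (access frequency, cost, policy), but even this toy version shows the point that matters for insurance: the last branch silently moves data outside the private cloud.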
In the context of data protection, this last step (movement to the public cloud) places a whole new level of risk on the content being migrated. Trust must be extended from the private cloud infrastructure to (a) the transport layer and (b) the cloud provider.
In the context of data insurance, one starts to ask the following questions:
- What application does this data belong to?
- How critical is the application (and its data) to the business?
- What are the financial/economic impacts of losing the data?
- What are the financial/economic impacts of the data being stolen?
- What are the financial/economic impacts of no longer being able to extract value out of the data?
These questions, in large part, would need to be answered by an economist with expertise in calculating “potential future economic value”. As Bill Schmarzo suggests, this person could play the emerging corporate role of a Chief Data Officer (CDO).
If the economic value of the data is high enough, then it stands to reason that the CDO may suggest approaching a data insurer to insure the overall data set against theft or loss. The data insurer, in turn, would need to audit the IT infrastructure involved with the storage of the data set, calculate the risk, and generate appropriate premiums.
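As a rough illustration of that last step, a premium can be thought of as the expected annual loss plus a margin for the insurer. The sketch below assumes the CDO has supplied a data value and the audit has produced an annual loss probability; the function name, the parameters, and the 25% loading are hypothetical, and a real insurer's model would be considerably more involved.

```python
def annual_premium(
    data_value: float,
    annual_loss_probability: float,
    loading: float = 0.25,
) -> float:
    """Expected annual loss plus a proportional loading (illustrative only).

    data_value: the economic value the CDO assigns to the data set.
    annual_loss_probability: the audited chance of theft or loss in a year.
    loading: hypothetical margin for the insurer's costs and profit.
    """
    if not 0.0 <= annual_loss_probability <= 1.0:
        raise ValueError("annual_loss_probability must be between 0 and 1")
    if data_value < 0 or loading < 0:
        raise ValueError("data_value and loading must be non-negative")
    expected_loss = data_value * annual_loss_probability
    return expected_loss * (1.0 + loading)


if __name__ == "__main__":
    # A data set valued at 1,000,000 with a 2% audited annual loss probability.
    print(annual_premium(1_000_000, 0.02))
```

The interesting input is the probability: it is exactly the number the insurer cannot produce today without visibility into every tier that touches the data.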
I’m over-simplifying, of course, but following this thread would imply that the data insurer would need to understand the trust characteristics of the multiple players involved with storing the data set, including:
- The server-based storage tier
- The flash storage tier
- The spinning disk tier
- The cloud gateway logic
- The cloud storage tier
These trust characteristics don’t yet exist (my EMC colleague Nikhil Sharma has started a public dialogue on this very topic).
The need for cloud gateway logic is clear (and it explains in part why EMC made the TwinStrata acquisition last year). According to Nikhil, the gateway logic should be able to surface its trust capabilities as part of a trust API. Ultimately the results of calling this API should bubble all the way up to a data insurer who evaluates risk.
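No such trust API exists yet, so the following is purely a thought experiment about what "bubbling up" might look like. Every name here (`TrustReport`, `TrustSource`, `collect_reports`, `gaps`) and every field is an assumption of mine, not something defined by Nikhil's proposal or by any EMC product.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class TrustReport:
    """Hypothetical trust characteristics surfaced by one component."""
    component: str
    encrypts_at_rest: bool
    encrypts_in_transit: bool
    replica_count: int


class TrustSource(Protocol):
    """Anything in the data path that can describe its own trust posture."""
    def trust_report(self) -> TrustReport: ...


@dataclass
class StaticComponent:
    """Stand-in for a tier or gateway that returns a fixed report."""
    report: TrustReport

    def trust_report(self) -> TrustReport:
        return self.report


def collect_reports(path: list[TrustSource]) -> list[TrustReport]:
    """Call the trust API on every component the data set passes through."""
    return [component.trust_report() for component in path]


def gaps(reports: list[TrustReport]) -> list[str]:
    """Names of components missing either form of encryption."""
    return [
        r.component
        for r in reports
        if not (r.encrypts_at_rest and r.encrypts_in_transit)
    ]


if __name__ == "__main__":
    path = [
        StaticComponent(TrustReport("cold spinning disk", True, True, 2)),
        StaticComponent(TrustReport("cloud gateway", True, False, 1)),
        StaticComponent(TrustReport("cloud storage", True, True, 3)),
    ]
    print(gaps(collect_reports(path)))
```

The design choice worth noting is that the insurer never inspects a tier directly; it only consumes what each component chooses to report, which is why the honesty and auditability of those reports becomes the real problem.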
Clearly we are a long way from this vision, but my hope for the IT@Cork event is to introduce some of my first thoughts about how to organize an infrastructure to support a data insurance use case in the short term, and overall data valuation in the long term. I will also be looking for research partners who are interested in participating in this discussion with the University of California San Diego.
Steve
EMC Fellow


