When an engineering team puts together architectures, designs, and implementations for products like CLARiiON and VMAX, a strong emphasis is always placed on customer requirements, as sound engineering discipline demands. Technologies and features such as Thick, Thin, FAST, and Green are heavily influenced by customer input, whether it is gathered at a customer site or briefing center, or parsed from an RFP or RFE (request for proposal or request for enhancement).
There is one other important voice coming from the customer sphere: the voice of the SCSI CDB (command descriptor block). With SCSI trace data from a customer site, engineers are able to glean additional insight about customer workloads, insight that even the customers themselves often don’t have. These traces drive internal disk array architectures in a way that augments the verbal and written requirements.
The Symmetrix architecture has long supported an internal trace facility to analyze CDB patterns. With permission from customers, these traces were brought back to the Symmetrix R&D lab and analyzed. As of several years ago, the team had acquired some 500-600 traces, each covering a brief period within the customer’s workday.
At the same time, CLARiiON began to capture day-long traces (e.g. 9-to-5 I/O traffic), and the library grew between the two teams. The traces became more valuable as customer requirements came in for spin-down and tiering functionality. The engineering teams then set a fairly aggressive goal: capture traces that represent weeks’ worth of continuous I/O.
VMAX Example
In the most complex of VMAX deployments, the customer may be running hundreds of servers, connected to scores of VMAX ports. The customer could also be running any number of applications, including Oracle databases, heavily virtualized VMware environments, or SAP configurations. Capturing week-long trace data in any and all of these environments would provide a tremendous set of requirements to the engineering teams.
The trick is to capture, gather, and store the trace data in a customer environment, and then figure out how to get it back to the lab, all done without disrupting the customer’s application.
VMAX uses some fairly clever host software to do this. A Linux server, for example, can be attached to VMAX over a single FC port and gather trace data from all of the front-end directors (before their internal trace buffers fill up). Each director records each CDB, along with the port, the LUN, and a timestamp, and these records are eventually routed up to the Linux server. This process can run for a week or two in the customer’s environment (or in a lab environment), and then a Symmetrix field engineer will visit the site to upload the data (using Iomega of course!).
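To make the captured fields concrete, here is a minimal Python sketch of what one trace record might look like and how a 10-byte read or write CDB can be decoded. The `TraceRecord` layout, the microsecond timestamp, and the director name are my own assumptions for illustration, not the actual VMAX trace format. The READ(10) and WRITE(10) opcodes and byte layout come from the SCSI block command set.

```python
import struct
from dataclasses import dataclass

READ_10, WRITE_10 = 0x28, 0x2A  # standard SCSI opcodes


@dataclass(frozen=True)
class TraceRecord:
    # Hypothetical record layout; the real on-array format is not described here.
    timestamp_us: int  # assumed: microseconds on the director's own clock
    director: str
    port: int
    lun: int
    cdb: bytes


def decode_rw10(cdb: bytes):
    """Return (op, lba, blocks) for READ(10)/WRITE(10), or None for anything else."""
    if len(cdb) != 10:
        return None
    # Byte 0 opcode, byte 1 flags, bytes 2-5 LBA, byte 6 group,
    # bytes 7-8 transfer length, byte 9 control (all big-endian).
    opcode, _flags, lba, _group, blocks, _control = struct.unpack(">BBIBHB", cdb)
    if opcode == READ_10:
        return ("read", lba, blocks)
    if opcode == WRITE_10:
        return ("write", lba, blocks)
    return None


# An 8-block read at LBA 4096, seen on a made-up director and LUN.
rec = TraceRecord(1_000, "FA-7E", 0, 12,
                  bytes([0x28, 0, 0, 0, 0x10, 0, 0, 0, 0x08, 0]))
print(decode_rw10(rec.cdb))
```

The CDB carries only the operation, the block address, and the length. The port, the LUN, and the timestamp are what turn a single command into a workload sample.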
Then the hard part actually begins. How do you analyze traces that are approaching a terabyte in size? What happens if there is clock drift between directors in the VMAX? Effective analysis of the traces is an ongoing process that has gone through years of improvement.
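As a rough illustration of both problems, the Python sketch below merges per-director streams into one time-ordered stream without loading them into memory, after subtracting a per-director clock offset. It is not the method the engineering teams use. The record layout and the offsets are hypothetical, and a single constant offset per director is a simplification, since real drift changes over the course of a capture.

```python
import heapq
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical record layout: (timestamp_us, director, lun)
Record = Tuple[int, str, int]


def corrected(stream: Iterable[Record], offset_us: int) -> Iterator[Record]:
    """Shift one director's timestamps by a constant offset (assumed known)."""
    for ts, director, lun in stream:
        yield (ts - offset_us, director, lun)


def merge_directors(streams: Dict[str, Iterable[Record]],
                    offsets_us: Dict[str, int]) -> Iterator[Record]:
    """Lazily merge time-sorted per-director streams into one sorted stream."""
    adjusted = [corrected(s, offsets_us.get(name, 0))
                for name, s in streams.items()]
    # heapq.merge holds only one pending record per stream in memory.
    return heapq.merge(*adjusted)


streams = {
    "dir-a": [(100, "dir-a", 1), (300, "dir-a", 2)],
    "dir-b": [(250, "dir-b", 1), (450, "dir-b", 3)],  # clock runs 100 us ahead
}
merged = list(merge_directors(streams, {"dir-b": 100}))
print(merged)
```

Because each stream is consumed lazily, the same pattern works when the inputs are file readers rather than in-memory lists. That matters once a trace approaches a terabyte.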
A Sizeable Library
As of 2010, the Trace Library has grown from hundreds of traces to thousands. Here are some of the more interesting statistics:
- Over 54 billion I/Os have been captured
- These I/Os have been targeted at over 150K active volumes
- The traces represent over 1.2 PB of transferred data
- The traces represent a total duration of 88 days
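For a feel of what those totals imply, here is some back-of-the-envelope arithmetic in Python. It assumes decimal units (1 PB = 10^15 bytes) and treats each "over" figure as exact, so the results are rough averages across the whole library rather than measurements of any single customer system.

```python
total_ios = 54e9               # "over 54 billion I/Os"
total_bytes = 1.2e15           # "over 1.2 PB", assuming decimal petabytes
active_volumes = 150e3         # "over 150K active volumes"
duration_s = 88 * 24 * 3600    # 88 days of trace time, in seconds

avg_io_kb = total_bytes / total_ios / 1e3      # average transfer size per I/O
avg_iops = total_ios / duration_s              # I/Os per second of trace time
avg_mb_s = total_bytes / duration_s / 1e6      # throughput per second of trace time
ios_per_volume = total_ios / active_volumes    # I/Os per active volume

print(f"average I/O size : {avg_io_kb:.1f} KB")
print(f"average rate     : {avg_iops:.0f} I/Os per second")
print(f"average bandwidth: {avg_mb_s:.0f} MB/s")
print(f"I/Os per volume  : {ios_per_volume:.0f}")
```

Under those assumptions the library averages roughly 22 KB per I/O and about 7,100 I/Os per second of captured trace time.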
These numbers are changing and growing all the time, with a fairly even distribution of traces from international customers (BRIC, EMEA), U.S.A.-based customers, and internal EMC systems. Each trace is accompanied by the “vertical” that it belongs to:
- Financial Services
- High-Tech
- Health Care
- IT Services
- Process Manufacture
- Telco/Media/Entertainment
- Retail
- Government
It’s a great requirements-gathering tool, and the field engineers are practiced to the point where it’s quite non-disruptive to set up, run, and capture the data. The insight gleaned about working set deltas and spatial usage patterns is usually beyond the knowledge of even our most enterprise-savvy customers.
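One way to picture a working set delta, sketched in Python: bucket the trace into time windows, record which extents of which LUNs were touched in each window, and count how many of those extents were not touched in the previous window. The window size, the extent size, and the definition of "delta" used here are my own assumptions for illustration, not the definitions used by the engineering teams.

```python
from collections import defaultdict


def working_sets(records, window_s, extent_blocks):
    """Map window index -> set of (lun, extent) touched in that window.

    records: iterable of (timestamp_s, lun, lba) tuples (hypothetical layout).
    """
    sets = defaultdict(set)
    for ts, lun, lba in records:
        sets[int(ts // window_s)].add((lun, lba // extent_blocks))
    return dict(sets)


def working_set_deltas(sets):
    """Return (window, working_set_size, newly_touched_extents) per window.

    Each window is compared with the previous window that saw any I/O.
    """
    out, prev = [], set()
    for window in sorted(sets):
        cur = sets[window]
        out.append((window, len(cur), len(cur - prev)))
        prev = cur
    return out


# Six I/Os over two one-hour windows, grouped into 2048-block extents.
records = [(0, 1, 0), (10, 1, 100), (20, 1, 2048),
           (3700, 1, 0), (3800, 1, 4096), (3900, 2, 0)]
deltas = working_set_deltas(working_sets(records, 3600, 2048))
print(deltas)
```

A small delta from one window to the next suggests a stable hot set, which is the kind of signal that matters for tiering and spin-down decisions. The set of touched extents per LUN gives a crude view of spatial usage.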
When the traces are combined with the more traditional form of customer requirements, the end result yields a much more holistic solution.
Steve
Twitter: @SteveTodd

