In the 90s the CLARiiON organization had moved from a text-based management tool (Gridman) to a remote command line interface (CLARCLI) and a remote graphical user interface (ArrayGUIde). I wrote in a recent post that I was offered the opportunity to replace these tools by building a new storage management architecture from scratch.
This new architecture became what is currently known as Navisphere, and yes, it was a complete re-write. Complete re-writes are a pretty drastic step.
But it was necessary.
Here’s why.
A Growing Family
The first CLARiiONs were SCSI disk arrays containing a static number of disks, including 7-slot, 10-slot, 20-slot, and 30-slot systems. The ArrayGUIde GUI was able to manage all of these devices and present a similar look and feel. CLARCLI was also able to manage these devices. The hardware was similar, and the internal software (FLARE) was similar.
In the mid-to-late 90s, however, CLARiiON began moving towards a different architecture: the scalable, building block hardware architecture that it still ships today. We evaluated the ArrayGUIde software architecture and concluded that it would be difficult to represent the new, variable-size, building block CLARiiON product using ArrayGUIde. It would also be difficult to enhance the product to manage multiple CLARiiONs (both new and old), and customers were starting to deploy more than one CLARiiON at a site.
It was an opportunity to create a new storage management product that was specifically designed to oversee a growing family of products. The main challenge was to integrate the new while still supporting the old. What were the differences and/or challenges presented by the new product line?
The DAE
CLARiiON’s disk array enclosure (DAE) concept created one of the bigger challenges for the Navisphere architecture. In the past a CLARiiON would report its “model number” and a GUI could instantly draw a picture of what that model looked like. All of the disks were contained in one enclosure; it was straightforward to represent an image of the system.
The DAE model allows customers to add “trays” of storage to a system. This placed new requirements on the storage management architecture to:
- discover the number of DAEs
- dynamically recognize new DAEs on the fly
- analyze the cabling between DAEs
- “remember” DAEs that were either removed or faulted
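To make these requirements concrete, here is a minimal sketch of how a management tool might track enclosures across repeated polls of an array. It is an illustration only, not how Navisphere was implemented: the `EnclosureInventory` class, the state names, and the shape of the poll result are all assumptions, and cabling analysis is left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class EnclosureInventory:
    """Hypothetical model: remembers every enclosure ever seen on one array."""

    # enclosure id -> last known state ("ok", "faulted", or "missing")
    known: dict = field(default_factory=dict)

    def update(self, reported):
        """Merge one poll result and return a list of (event, enclosure_id).

        `reported` maps enclosure id -> "ok" or "faulted" (assumed format).
        """
        events = []
        for enc_id, state in reported.items():
            if enc_id not in self.known:
                # A DAE that appeared since the last poll.
                events.append(("added", enc_id))
            elif self.known[enc_id] != state:
                # A known DAE whose state changed.
                events.append((state, enc_id))
            self.known[enc_id] = state

        for enc_id in self.known:
            # Keep removed DAEs in the inventory instead of forgetting them.
            if enc_id not in reported and self.known[enc_id] != "missing":
                self.known[enc_id] = "missing"
                events.append(("missing", enc_id))
        return events
```

The point of the sketch is the last loop: an enclosure that stops reporting is kept and marked as missing, so a GUI can still draw it and flag the problem rather than silently shrinking the picture.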
Disk and LUN Counts
The minimum CLARiiON system in the new product line would consist of the DPE (disk processor enclosure) containing 10 disks. The system could scale by adding up to eleven DAEs, for a maximum of 120 disks. The potential for systems with this many disks had its own set of challenges:
- disks were getting bigger, and the need for slicing the disks into pieces became a requirement for the first time.
- when a disk failed, how would a customer know which disk to replace? From which DAE?
- when creating LUNs for presentation to a host system, how would Navi allow the selection of the disks? Should they span DAEs? Be in the same DAE?
- the number of LUNs would grow into the hundreds. How would these LUNs be represented and managed?
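A small sketch shows why disk identity becomes an enclosure-plus-slot problem at this scale. It assumes 10 slots in every enclosure (which is what 120 disks across the DPE and eleven DAEs implies) and a flat, zero-based disk index; both the index scheme and the function names are hypothetical.

```python
SLOTS_PER_ENCLOSURE = 10  # assumption: implied by 120 disks in 12 enclosures
MAX_DAES = 11             # from the text: up to eleven DAEs plus the DPE


def max_disks(daes=MAX_DAES):
    """Total disk slots for one DPE plus `daes` add-on enclosures."""
    return (1 + daes) * SLOTS_PER_ENCLOSURE


def locate(disk_index):
    """Map a flat disk index to (enclosure, slot); enclosure 0 is the DPE."""
    if not 0 <= disk_index < max_disks():
        raise ValueError(f"disk index out of range: {disk_index}")
    return divmod(disk_index, SLOTS_PER_ENCLOSURE)


def same_enclosure(disk_indexes):
    """True if every disk chosen for a LUN sits in a single enclosure."""
    return len({locate(i)[0] for i in disk_indexes}) == 1
```

With a mapping like this, a failed disk can be reported as "enclosure 4, slot 7" instead of "disk 47", and a check such as `same_enclosure` is one way to surface the span-or-not question when a customer picks disks for a LUN.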
Emergence Of SAN
With the emergence of scalable storage and Fibre Channel came the need for sharing storage among multiple servers. This added more twists to the requirements:
- there’s a need to represent servers connecting to the storage
- there’s a need to assign (and name) LUNs to said servers in a protected way
- there’s a need to balance resources so that one server doesn’t dominate (fairness)
- each server operating system connecting to the shared storage may have its own special SCSI protocol “nuances”
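The first two requirements can be sketched as a small access map that records which server owns which LUN. This is a simplified illustration under a strong assumption: each LUN belongs to exactly one server. The `AccessMap` class and its methods are hypothetical and do not describe Navisphere's actual design; fairness and per-OS SCSI differences are not modeled.

```python
class AccessMap:
    """Hypothetical table of LUN assignments for servers sharing one array."""

    def __init__(self):
        self._owner = {}  # lun id -> (server name, user-visible LUN name)

    def assign(self, lun, server, name):
        """Give `server` exclusive access to `lun` under a friendly name."""
        current = self._owner.get(lun)
        if current is not None and current[0] != server:
            # Protection: never hand one server's LUN to another silently.
            raise PermissionError(f"LUN {lun} already assigned to {current[0]}")
        self._owner[lun] = (server, name)

    def visible_to(self, server):
        """Return the sorted LUN ids that `server` is allowed to see."""
        return sorted(
            lun for lun, (owner, _) in self._owner.items() if owner == server
        )

    def name_of(self, lun):
        """Return the user-visible name given to `lun`."""
        return self._owner[lun][1]
```

Even in this toy form, the management tool now has to model servers as first-class objects alongside disks and LUNs, which the older single-host tools never needed to do.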
Next Step: Build It
On this blog I like to talk about building software for the storage industry. We now had a set of issues and requirements for managing a new breed of storage system, on top of the requirements for managing legacy systems and managing multiple systems at once. In my next post I’d like to describe the main issues we encountered building Navisphere from the ground up.
Steve
