I’ve been reading the blogs about office organization with some interest. People organize their offices based on how their brain works and what tasks they work on.
Scott uses the window filtering techniques that Mac provides. Dave organizes his workspace into layers. Stu goes for the multi-monitor approach.
The “timing” of this topic is perfect. I just completed my yearly ritual of creating a new folder on my desktop. The folder is called “2009”.
I organize everything by time.
Here’s the directory structure I use to store everything that I work on. Note that I organize it by EMC’s quarterly calendar (Q1 just started).
I do this because I’ve basically had the same job for nearly 23 years: building software for the storage industry. Inevitably I’ll get a question about something I worked on “a while back”. I’ll lean back in my chair and ask myself “when did I work on that?” and eventually dig it out.
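For the programmers in the audience, the year-then-quarter layout is easy to sketch in a few lines of Python. This is only an illustration: the `quarter_folder` name and the "Desktop" root are made up for the example, and it assumes the quarters line up with calendar quarters (Q1 starting in January).

```python
from datetime import date
from pathlib import PurePosixPath


def quarter_folder(day: date, root: str = "Desktop") -> PurePosixPath:
    """Return the year/quarter folder a piece of work would be filed under.

    Assumption: quarters follow the calendar (Jan-Mar is Q1, and so on).
    The function name and the default root are hypothetical.
    """
    quarter = (day.month - 1) // 3 + 1  # months 1-3 -> 1, 4-6 -> 2, ...
    return PurePosixPath(root) / str(day.year) / f"Q{quarter}"


if __name__ == "__main__":
    # Something worked on in mid-January 2009 lands in the 2009/Q1 folder.
    print(quarter_folder(date(2009, 1, 15)))
```

Answering “when did I work on that?” then amounts to picking a year and a quarter and looking in one folder.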
Another attribute of my workspace is sparseness. As I write this post I have a notebook in my office and that’s it. No other scraps of paper. I typically have 3-4 windows open: email, Firefox, Rhapsody, and whatever project I’m working on. Any more than that and I’ll start closing windows. The reason for this, I believe, is that my daily work consists of focusing on building software architectures and designs. I need to concentrate, so I reduce clutter to help me focus through the many hours of thinking and writing. When I need a break I’ll open Firefox and navigate the tabs for Google Reader, Facebook, LinkedIn, Twitter, EMC ONE, and of course MLB.COM (pitchers and catchers report next month!).
The Centera Angle
Of course I can’t make it through a post without relating this to some product that I’ve built in the past. It’s a nerdy habit but I just can’t help myself.
When I first joined the Centera organization in 2003 the big question was “why does the box slow down as it starts filling up?” The answer, we discovered, was that the content addresses were pure hash values: 26 characters’ worth of human-readable MD5 or SHA. These content addresses were directly translated into filenames and stored in one of many file systems or directories. Given the random nature of these hash values, these Centera objects were distributed all over the place (Centera’s RAIN architecture has multiple nodes, disks, and file systems).
This became a big deal when customers began storing millions of small objects (which was not the initial design point of Centera). The continual writing of these objects had no locality of reference, and subsequent reads for those objects didn’t either. As a result there was no way to take advantage of any of the common performance optimizations such as disk elevator algorithms or buffer caches.
In response, Centera added a new naming scheme for customers: GUIDs that have time characteristics. These GUIDs were appended to the hash, forming a longer content address. As objects were written close together in time, these GUIDs were examined and the objects were grouped in common locations. The Centera team came up with pretty sophisticated algorithms to implement time-slices that focused on storing temporally related objects in the same directory across many disks and nodes in the system.
When an application reads an object written during a certain time frame, there’s a good chance that the app will follow up with requests for objects written at roughly the same time, and the time-based layout results in read-caching and disk head efficiencies.
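To make the contrast concrete, here is a toy Python sketch of the two placement strategies. It is not Centera’s actual algorithm (those details aren’t covered here); the directory count, the one-hour slice length, and all function names are assumptions chosen purely for illustration.

```python
import hashlib

NUM_DIRS = 64         # hypothetical number of directories to spread objects over
SLICE_SECONDS = 3600  # hypothetical time-slice length: one hour


def hash_placement(content: bytes) -> int:
    """Pick a directory from the content hash alone (no locality)."""
    digest = hashlib.md5(content).hexdigest()
    return int(digest, 16) % NUM_DIRS


def time_slice_placement(write_time: float) -> int:
    """Pick a directory from the time slice the object was written in."""
    return int(write_time // SLICE_SECONDS) % NUM_DIRS


def directories_touched(placements) -> int:
    """Count how many distinct directories a batch of objects landed in."""
    return len(set(placements))


if __name__ == "__main__":
    # 100 small objects written half a second apart, starting on an hour boundary.
    start = 1_230_768_000  # 2009-01-01 00:00:00 UTC
    objects = [(f"object-{i}".encode(), start + i * 0.5) for i in range(100)]

    by_hash = [hash_placement(data) for data, _ in objects]
    by_time = [time_slice_placement(ts) for _, ts in objects]

    print("hash placement touches", directories_touched(by_hash), "directories")
    print("time placement touches", directories_touched(by_time), "directories")
```

With hash-only placement the batch scatters across most of the directories, while the time-sliced version keeps the whole batch together, which is the locality that caches and disk heads can exploit.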
These types of improvements are the reasons for Centera’s ability to jam so many objects onto each disk in the system (not to mention more efficient object rebuild speeds).
So my work style tends to be similar to Centera’s. I organize by time, and when I get a request for information I can quickly go find it based on when the content was generated.
Steve

