MIT and Federated Archives

The first week in February is often one of those weeks when we here in New England start asking ourselves, “Why do we live here?” Last week was freezing! Some of us make it through via our love of winter sports. I myself find pond hockey to be one of the greater experiences in life, but there’s usually too much snow to shovel by the first week in February.

During my recent visit to MIT, the wind was whipping off the Charles River and into Cambridge. Across the river, however, the Massachusetts version of Groundhog Day would soon take place. Last Friday, Feb. 6, “the truck” left.

Around here, when someone says “the truck” in February, it’s obvious what they’re talking about.

They’re talking about the Red Sox equipment truck leaving for Fort Myers, Florida.

And that, my friends, is why I’m willing to stick things out for a few more months.

I’d like to share a few details about my visit to MIT. EMC’s Burt Kaliski asked me to meet with MIT’s Stuart Madnick and MacKenzie Smith. Burt runs EMC’s worldwide Innovation Network, which expands the company’s knowledge of emerging technologies by putting EMC at the table with leading researchers from industry and universities.

The project we discussed is called “Data Space”, and a funding proposal for it has already been submitted to the US Government. An earlier proposal was submitted last year (last year’s copy can be found here) but was not approved. The team is hoping that the lessons learned from that attempt will make the difference this time.

You can read the full paper if you’d like, but my interpretation is that Data Space will strive to be a united federation of research archives.

For example, a university or hospital specializing in research related to biology may have mountains of data in a stove-piped archive. The archive may or may not be publicly accessible, and if it is, it may have its own distinctive data types (e.g. MRIs) and access methods.

A different university or hospital may have psychology research stored in their own archive, while yet another may have an archive for chemistry research.

Certain research topics require extensive cooperation across these disciplines. Neuroscience is one example. Energy/environment is another. The Data Space proposal aims to greatly improve interdisciplinary cooperation among distributed research archives.

Research on this topic is already under way. One example (among many) is DSpace, an open-source digital archiving platform. DSpace is used by roughly 350 research institutions worldwide for “access to and long term archiving of research output in digital formats.”

The Data Space proposal hopes to take this type of research to the next level.

I’m hoping the project gets approved. I’d love to contribute to it. I’ll keep people posted.

Steve