Wednesday, February 9, 2011

Complete Master Copy of NARA Research Holdings Now Available at UNC!

The CI-BER project has successfully built, at UNC Chapel Hill, a consolidated master copy of all of the research holdings of the National Archives (NARA). Files from a variety of networked locations were merged into a single master collection.

The initial consolidated CI-BER testbed currently holds over 14 million unique files and 27.5 TB of data. We expect the testbed to grow significantly over the next few months.

One goal of the CI-BER testbed is to enable empirical studies at scale that show how new cyberinfrastructure approaches can be applied. We hope this will lead to a better understanding of how NARA can characterize, analyze, document, and manage its vast holdings of records.

The CI-BER team has already started to explore these holdings using data-intensive approaches and visual analytics. Our next post will present results showing innovative interfaces for navigating geospatial collections.

1 comment:

  1. I was curious to know a bit more about how you facilitated the network transfer, what tools you used for it, and how you made sure the transfers completed successfully. Also, did you write the data out to iRODS or to some other storage system?