NARA - National Archives and Records Administration

Purpose

The National Archives and Records Administration (NARA) stores and maintains the official records and digital collections of the U.S. government, which reached 12 petabytes in 2010. In response to this data deluge, NARA turned to TACC to develop a strategy for computationally assisted archiving.

With support from NARA, a multidisciplinary team at TACC conducted research in computational analysis and visual analytics for big archives. Over a five-year period ending in December 2013, TACC investigated issues in big data, including imagining and exploring the next generation of methods and tools to tackle big digital archives.

This research is particularly relevant in the era of big data, because the possibility of making unprecedented discoveries through data-intensive science depends on the existence of organized, readily available, and documented data and records collections. The research is also relevant to government accountability and to the public's ability to find documents and data produced by federal and state governments.

To address this issue, TACC built a visual analytics framework through which the team tested different archival and data analysis functions, including collection content description, functional analysis, preservation assessment, authenticity and integrity, and evaluation of organizational structure and context.

TACC also designed and constructed Lasso, the first multi-touch tiled display system to explore interactive analysis of records and data.

This project produced 12 publications in highly rated journals and conference proceedings and was presented at more than 30 venues, both nationally and internationally.

Funding Source(s)

  • National Archives and Records Administration (NARA)

Maria Esteva

Research Associate/Data Archivist
maria@tacc.utexas.edu | 512-232-8478

Weijia Xu

Manager, Data Mining & Statistics Group
xwj@tacc.utexas.edu | 512-232-7158