The quantity of data involved in scientific research has exploded in recent years. Simulations, sensor data, and massive archiving efforts each generate hundreds of terabytes of data, which must be stored, accessed, and shared. This data requires specialized storage systems, and TACC has become a leader in the deployment and use of data-intensive computing. With more than 7 Petabytes of dedicated user storage, TACC's rapid-access, ultra high-density storage systems keep data close to the compute nodes, allowing complex comparison and analysis and making big science possible.
Corral is a system deployed in April 2009 by the Texas Advanced Computing Center to support data-centric science at the University of Texas. Corral consists of 6 Petabytes of online disk and a number of servers providing high-performance storage for all types of digital data. It supports MySQL and Postgres databases, a high-performance parallel file system, web-based access, and other network protocols for storage and retrieval of data to and from sophisticated instruments, HPC simulations, and visualization laboratories.
TACC's long-term mass storage solution is an Oracle® StorageTek Modular Library System, named Ranch. Ranch utilizes Oracle's Sun Storage Archive Manager Filesystem (SAM-FS) for migrating files to/from a tape archival system with a current offline storage capacity of 40 PB. Ranch's disk cache is built on Oracle's Sun ST6540 and DataDirect Networks 9550 disk arrays containing approximately 110 TB of usable spinning disk storage. These disk arrays are controlled by an Oracle Sun x4600 SAM-FS Metadata server which has 16 CPUs and 32 GB of RAM.
Rodeo is TACC's general cloud computing and storage system for open science. Building on successful projects that have implemented cloud technologies, including FutureGrid and iPlant Atmosphere, and on its hosting of The Galaxy Project's genomics platform and The Arabidopsis Information Resource (TAIR) web service, TACC is pleased to announce Rodeo. Currently, it is a small set of compute nodes and storage that will offer these services: on-demand cloud computing; persistent cloud storage; and persistent cloud hosting of portals, science gateways, and services. Other services will be introduced throughout 2014. Rodeo will be available to friendly users in December 2013, and to the general community by March 2014.