
Sun Microsystems® StorageTek Mass Storage Facility

Introduction to Ranch
Accessing Ranch
Using TACC tools to interact with Ranch
SAM-FS

Introduction to Ranch

TACC's long-term mass storage solution is a Sun Microsystems® StorageTek Mass Storage Facility, ranch (ranch.tacc.utexas.edu). Ranch uses Sun's Storage Archive Manager Filesystem (SAM-FS) to migrate files to and from a tape archival system with a current storage capacity of 1 Petabyte (PB).

Ranch's disk cache is built on a Sun ST6540 disk array containing approximately 15.5 Terabytes (TB) of spinning disk. This disk array is controlled by a Sun x4600 SAM-FS metadata server with 16 CPUs and 32 GB of RAM.

A single Sun StorageTek SL8500 Automated Tape Library houses all of the offline archival storage. Each SL8500 library contains 10,000 tape slots and 64 tape drive slots. Each tape is capable of holding 500 GB of uncompressed data, so when fully populated a single SL8500 library can house 5 PB. Each SL8500 library also contains 4 handbots to manage tapes and move them to/from the tape drives. If necessary, up to 4 SL8500 libraries can be integrated into a single archival solution, allowing for an offline storage capacity of 20 PB.

The current ranch configuration has 2,000 tapes, and is capable of housing 1 PB of uncompressed data. However, future plans call for further population of the tape slots as well as upgrades of the physical media from 500 GB capacity to 1 TB capacity.
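
The capacity figures above follow from simple arithmetic, sketched here in shell. The sketch assumes decimal units (1 PB = 1,000,000 GB), and the 2 PB figure for upgraded media is our own extrapolation that assumes the tape count stays at 2,000; it is not a number stated by TACC.

```shell
# Capacity arithmetic using the figures quoted in the text.
# Assumption: decimal units, 1 PB = 1,000,000 GB.
slots_per_library=10000
gb_per_tape=500
gb_per_pb=1000000

library_pb=$(( slots_per_library * gb_per_tape / gb_per_pb ))  # one fully populated SL8500
max_pb=$(( 4 * library_pb ))                                   # four integrated libraries
current_pb=$(( 2000 * gb_per_tape / gb_per_pb ))               # 2,000 tapes at 500 GB
upgraded_pb=$(( 2000 * 1000 / gb_per_pb ))                     # same 2,000 tapes at 1 TB (extrapolation)

echo "full library: ${library_pb} PB, four libraries: ${max_pb} PB"
echo "current: ${current_pb} PB, with 1 TB media: ${upgraded_pb} PB"
```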


Because the HPC machines are used primarily for scientific computing, their disk space ($HOME directory size) is limited. This is also true for TACC's visualization systems. The Archival File Server was designed to "serve" HPC and Vis machines by providing a massive, high-performance file system for user files.

Access to Ranch is achieved by making use of the TACC-defined user environment variables $ARCHIVER and $ARCHIVE. These variables define the hostname of Ranch ($ARCHIVER) and each user's personal archival space ($ARCHIVE).
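
Scripts that depend on these variables may want to confirm they are defined before using them. The following is a minimal sketch; the function name check_archive_env is hypothetical and is not a TACC-provided tool.

```shell
# Return 0 only if both TACC archive variables are set and non-empty.
# check_archive_env is an illustrative helper, not part of the TACC environment.
check_archive_env() {
    for var in ARCHIVER ARCHIVE; do
        eval "value=\${$var:-}"          # read the variable whose name is in $var
        if [ -z "$value" ]; then
            echo "error: \$$var is not set" >&2
            return 1
        fi
    done
    return 0
}

# Usage: check_archive_env && echo "archive space: ${ARCHIVER}:${ARCHIVE}"
```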

Currently, users can log in directly to Ranch to create directories and to demigrate files from tape back to the disk subsystem for later transfer to TACC machines or personal computers.

Logging into Ranch
Users can log in to Ranch using rsh (from TACC machines) or ssh (from the outside world) by typing:
lonestar% rsh $ARCHIVER
localhost% ssh ranch.tacc.utexas.edu


rls: remote ls
rls (remote ls) works like a normal ls, except that it lists your remote files on the tape system. Be sure to include the $ARCHIVE variable to give rls the correct path.
Example:
lslogin2$ ./rls -la test*  
  -rw-r--r--    1 eturner  support    30720 Sep  5 08:50 (DUL) /archive/utexas/staff/eturner/test.tar.bz2
lslogin2$

sinc: sinc is not cp
sinc is a tool that aggregates local files into packed tar files that are saved on remote tape backup. Because the tar file is never written to local disk, the tool eliminates the duplication of space associated with creating large tar files locally. sinc automatically places the resulting backup file in the $ARCHIVE directory of the user who executes the command.

unsinc is the counterpart to sinc. It retrieves files from tape storage, automatically migrating them from tape if needed. Remote files are given to unsinc as a path relative to the user's $ARCHIVE directory. sinc and unsinc are available on all production systems and are accessible by loading the sinc module: module load sinc.

Compression Options
-z              Compress using gzip
-j              Compress using bzip2 (default behavior)
-Z              Compress using compress
-n              Do not use compression (create tar only)
Miscellaneous Options
-f              Send/Receive single file only
-v              Verbose mode
sinc Options
-i <filename>   Local file/directory to archive
-o <filename>   Destination file on tape system
-a              Automatically back up the current directory and its subdirectories in tar/bzip2 format. Remote filename will be current path plus .tar.bz2 extension
unsinc Options
-i <filename>   Remote file to retrieve (cannot be a directory)
-o <path>       Destination directory in which to unpack archived contents

SINC Examples:

  • Create a tar/gzip file of the contents of 'lsfstuff' on the archival system:
lslogin2$ du -h lsfstuff/
1.2M    lsfstuff/koomie
3.3M    lsfstuff/
lslogin2$ sinc -z -i lsfstuff -o lsfstuff.tar.gz
lslogin2$ rls -la lsfstuff.tar.gz 
-rw-r--r--    1 eturner  support  1136640 Sep 11 15:38 (REG) /archive/utexas/staff/eturner/lsfstuff.tar.gz
lslogin2$ 
  • Create a tar/bzip2 file of the contents of the current directory on the archival system:
lslogin2$ ls
diff.launcher.lsf  hello.f   launcher.c    launcher.o   output              paramlist        test-job
hello              launcher  launcher.lsf  launcher.sh  Parametric.o380398  README.launcher
lslogin2$ pwd    
/home/utexas/staff/eturner/launcher-test
lslogin2$ sinc -a
lslogin2$ rls -la | grep launcher-test 
-rw-r--r--    1 eturner  support   188492 Sep 11 15:44 (REG) home.utexas.staff.eturner.launcher-test.tar.bz2
lslogin2$ 
  • Create a tar/bzip2 file of the contents of a specified directory on the archival system:
lslogin2$ sinc -i la
lammps.tar.gz  launcher/      launcher-test/ 
lslogin2$ sinc -i launcher-test -o mybackup/launcher.tar.bz2
lslogin2$ rls -la mybackup 
total 384
drwxr-xr-x    2 eturner  support       29 Sep 11 15:49 (REG) .
drwx------   27 eturner  support     4096 Sep 11 15:49 (REG) ..
-rw-r--r--    1 eturner  support   188575 Sep 11 15:49 (REG) launcher.tar.bz2
lslogin2$ 
  • Copy a file to the archival system without creating a tar file (single file ONLY; cannot be a directory):
lslogin2$ du -h bigbinaryfile 
8.0K    bigbinaryfile
lslogin2$ sinc -f -i bigbinaryfile -o mybinaryoutputs/bigbinaryfile
lslogin2$ rls -la mybinaryoutputs 
total 16
drwxr-xr-x    2 eturner  support       26 Sep 11 15:55 (REG) .
drwx------   28 eturner  support     4096 Sep 11 15:55 (REG) ..
-rw-r--r--    1 eturner  support     1922 Sep 11 15:55 (REG) bigbinaryfile
lslogin2$
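
In the sinc -a example above, the remote file name appears to be the working directory path with the leading slash dropped and the remaining slashes replaced by dots. The sketch below reproduces that mapping. It is inferred only from the example output shown here, not from the sinc source, and the function name sinc_auto_name is hypothetical.

```shell
# Derive the remote file name that "sinc -a" appears to use for a directory:
# drop the leading "/", replace each remaining "/" with ".", append ".tar.bz2".
# Inferred from the example output only; sinc itself may handle edge cases differently.
sinc_auto_name() {
    path=${1:-$PWD}
    path=${path#/}
    printf '%s.tar.bz2\n' "$(printf '%s' "$path" | tr '/' '.')"
}

sinc_auto_name /home/utexas/staff/eturner/launcher-test
```

This can be handy for predicting which name to pass to unsinc -i after an automatic backup.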

UNSINC Examples:

  • Unpack an archived tar/gzip file into a specified directory:
lslogin2$ rls -la | grep lsf 
-rw-------    1 eturner  support  1432304 Oct 11  2005 (OFL) lsfstuff-wrangler-10-11-05.tar.gz
-rw-r--r--    1 eturner  support  1085440 Sep  4 23:46 (DUL) lsfstuff.tar.bz2
-rw-r--r--    1 eturner  support  1136640 Sep 11 15:38 (REG) lsfstuff.tar.gz
lslogin2$ 
lslogin2$ unsinc -z -i lsfstuff.tar.gz -o mylsfstuff
lslogin2$ ls -la mylsfstuff/
total 28
drwx------   3 eturner support 4096 Sep 12 15:32 .
drwxr-xr-x  53 eturner support 8192 Sep 12 15:32 ..
drwx------   3 eturner support 4096 Apr 11 14:29 lsfstuff
lslogin2$ du -h mylsfstuff/ 
1.2M    mylsfstuff/lsfstuff/koomie
3.3M    mylsfstuff/lsfstuff
3.3M    mylsfstuff/
lslogin2$ 
  • Unpack an archived tar/bzip2 file into the local directory:
lslogin2$ mkdir launcher-restore
lslogin2$ cd launcher-restore/
lslogin2$ unsinc -i home.utexas.staff.eturner.launcher-test.tar.bz2
lslogin2$ ls
diff.launcher.lsf  launcher      launcher.o   Parametric.o380398  test-job
hello              launcher.c    launcher.sh  paramlist
hello.f            launcher.lsf  output       README.launcher
lslogin2$ 
  • Retrieve an archived file without unpacking (single file ONLY; cannot be a directory):
lslogin2$ rls -la mybinaryoutputs 
total 16
drwxr-xr-x    2 eturner  support       26 Sep 11 15:55 (REG) .
drwx------   28 eturner  support     4096 Sep 11 15:55 (REG) ..
-rw-r--r--    1 eturner  support     1922 Sep 11 15:55 (REG) bigbinaryfile
lslogin2$ unsinc -f -i mybinaryoutputs/bigbinaryfile -o mylonestarfiles
lslogin2$ ls -la mylonestarfiles/
total 28
drwx------   2 eturner support 4096 Sep 12 15:28 .
drwxr-xr-x  52 eturner support 8192 Sep 12 15:28 ..
-rw-r--r--   1 eturner support 1922 Sep 11 15:55 bigbinaryfile
lslogin2$ 

Alternatives to sinc and rls

On systems where sinc and rls are not available, you can use the following alternatives to copy files to Ranch ($ARCHIVER):


  1. tar cvf - <dirname> | ssh ${ARCHIVER} "cat > ${ARCHIVE}/<tarfile.tar>"

    where <dirname> is the path to the directory on Ranger that you want to archive, and <tarfile.tar> is the name of the archive on Ranch.

    You could add the -z option to compress with gzip; however, the transfer will run faster if you do not compress the tar file. Gzip uses a lot of CPU, and the local network is not typically a bottleneck.

    For large amounts of data, we recommend that you create smaller tar files, perhaps breaking the data up by subdirectory. This also makes it more efficient to retrieve portions of your data as needed. If you are concerned about space and need to compress the tar files, please try to do so when the system is not heavily loaded.


  2. scp <file> ${ARCHIVER}:${ARCHIVE}/<subdirectory>/<filename>

    We recommend that small files be tarred together and compressed, but you should try to keep tar files under 10 GB if at all possible (this reduces the chance of file corruption). Binary data typically does not compress well, so you can skip that step.
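
The advice to break large data sets up by subdirectory and keep each tar file under 10 GB can be sketched as a dry-run planner. The function below only prints one tar-over-ssh command per subdirectory and flags any subdirectory whose uncompressed size exceeds the guideline; nothing is transferred. The name plan_archive and the use of du -sk to estimate size are our assumptions, not TACC tooling.

```shell
# Print one tar-over-ssh command per subdirectory of "$1" (dry run; nothing is sent).
# Subdirectories larger than LIMIT_KB are flagged so they can be split further.
LIMIT_KB=$(( 10 * 1024 * 1024 ))   # 10 GB expressed in KB, the guideline from the text

plan_archive() {
    parent=$1
    for dir in "$parent"/*/; do
        [ -d "$dir" ] || continue                 # glob matched nothing
        name=$(basename "$dir")
        size_kb=$(du -sk "$dir" | cut -f1)        # uncompressed size on disk
        if [ "$size_kb" -gt "$LIMIT_KB" ]; then
            echo "# WARNING: $name is ${size_kb} KB, over the 10 GB guideline"
        fi
        echo "tar cf - -C \"$parent\" \"$name\" | ssh \${ARCHIVER} \"cat > \${ARCHIVE}/$name.tar\""
    done
}

# Usage: plan_archive /path/to/results   (review the output, then run the lines you want)
```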

To stage data (begin the process of retrieving it from tape) before transferring it back from Ranch, run the following from Ranger:


ssh $ARCHIVER stage "file list"


This will begin the staging process and return immediately. If you add the stage option -w, it will wait until staging is complete. Then, you can do:


rcp $ARCHIVER:"file list" <destination>


Or, you can log in to Ranch (using ssh) and issue the commands from there.
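
The stage-then-copy sequence described above can be wrapped in a small helper. This is a sketch under stated assumptions: the function names run and stage_and_fetch are hypothetical, the remote path is taken to be relative to $ARCHIVE, and scp is used for the copy in place of rcp. Setting DRY_RUN=1 prints the commands instead of executing them, which is useful for checking paths before touching the tape system.

```shell
# Echo the command when DRY_RUN=1; otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Stage one file on Ranch, wait for staging to finish (-w), then copy it locally.
# $1: remote path relative to $ARCHIVE (assumed layout)
# $2: local destination
stage_and_fetch() {
    remote="${ARCHIVE}/$1"
    run ssh "$ARCHIVER" stage -w "$remote" &&
        run scp "${ARCHIVER}:${remote}" "$2"
}

# Usage: stage_and_fetch mybackup/launcher.tar.bz2 .
```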


Ranch uses the Storage and Archive Manager File System (SAM-FS) as its filesystem. SAM-FS provides several commands for users who want more direct control over their data, or who want to write their own scripts, instead of using the TACC tools sinc and unsinc.

archive   archive files to tape
stage     retrieve files from tape and place in disk cache
sls       similar to ls, with more migration information
sfind     SAM-FS find
release   release a file from disk cache
ssum      set file checksum attributes
sdu       du replacement - size of archived directory/file

File Attributes
SAM-FS uses several file attributes that set the behavior of file archiving. These attributes can be set manually for extra control over file archiving.

Archive Attributes
No Archive (-n)      File will not be archived
Release Attributes
Partial (-p)         First portion of the file will be retained in disk cache
After Archive (-a)   File will be released from disk cache after archiving
Stage Attributes
No Stage (-n)        File will not be staged; data will be read directly from the archive media
Associative (-a)     File is part of an associative stage group
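
Each attribute above is set by passing its flag to the command its table is named for (archive, release, or stage). The lookup below sketches that pairing by printing the command line for a given attribute. The attribute keywords (no-archive, partial, and so on) and the function name attr_command are hypothetical labels chosen for illustration. The commands are printed rather than executed because they exist only on Ranch.

```shell
# Map an attribute from the tables above to the command line that would set it.
# Prints the command instead of running it; the keywords are illustrative labels.
attr_command() {
    attr=$1
    file=$2
    case $attr in
        no-archive)    echo "archive -n $file" ;;
        partial)       echo "release -p $file" ;;
        after-archive) echo "release -a $file" ;;
        no-stage)      echo "stage -n $file" ;;
        associative)   echo "stage -a $file" ;;
        *)             echo "unknown attribute: $attr" >&2; return 1 ;;
    esac
}

attr_command after-archive results.tar
```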
