Migration Information

IMPORTANT: All users are affected by the upgrade. Please read the Migration Guide.


Experienced users -- check out the Quick Start Notes.


Both the lonestar2.tacc.utexas.edu and lonestar.tacc.utexas.edu addresses now take you to the new Lonestar machine.

If you get a MAN IN THE MIDDLE or HOST ID CHANGE notification the first time you use the (reassigned) lonestar.tacc.utexas.edu address, accept the new host identification only if the fingerprint matches the one shown below. Linux and Mac OS X users will first have to remove the offending entry from their known_hosts file.

Windows Machines

First, allow the connection by clicking "Yes" in the popup window shown below (wording may vary depending upon the client).
Second, allow the host certificate to be included in your local database (chain) by clicking "Yes" to the query about saving the new host key. See second popup window below.

lonestar HOST ID NOTIFICATION. Accept the new ID with fingerprint "xuming...".

lonestar HOST ID NOTIFICATION. Allow it to be included in your local database.

Linux and Mac OS X Machines


You will get a message similar to this (with the fingerprint displayed in hexadecimal notation):
     lonestar2% cat /tmp/xx
     @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
     @       WARNING: POSSIBLE DNS SPOOFING DETECTED!          @
     @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
     ...
     @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
     @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @
     @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
     ...
     5c:36:42:99:aa:2d:52:58:70:3a:20:c2:3a:33:e4:2f.
     ...
     Offending key in /Users/tnek/.ssh/known_hosts:1
     RSA host key for lonestar.tacc.utexas.edu has changed...
     Host key verification failed.
Note the line stating "Offending key in ...".
Edit the identified file and remove the line indicated by the number at the end. (In the case above, the file is /Users/tnek/.ssh/known_hosts and the offending line is the first line.) Next, log in to the machine. You will be prompted to continue the connection to lonestar.tacc.utexas.edu, which has the fingerprint 5c:36:42:99:aa:2d:52:58:70:3a:20:c2:3a:33:e4:2f. Accept the connection by typing "yes".
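
The manual edit described above can also be scripted. The following is a minimal sketch, not a TACC-provided tool: the function name remove_known_hosts_line is made up for illustration, and it assumes a POSIX shell with cp and sed. It saves a backup copy before deleting the numbered line.

```shell
# Hypothetical helper (not part of any TACC tooling): delete line N of a
# known_hosts file after saving a backup copy next to it.
remove_known_hosts_line() {
    file=$1      # e.g. $HOME/.ssh/known_hosts
    line=$2      # the number reported after "Offending key in <file>:"
    cp "$file" "$file.bak" || return 1
    sed "${line}d" "$file.bak" > "$file"
}

# For the message shown above one would run (commented out so that
# nothing is changed by accident):
# remove_known_hosts_line "$HOME/.ssh/known_hosts" 1
```

Recent OpenSSH releases also provide "ssh-keygen -R lonestar.tacc.utexas.edu", which removes every key stored for that host name. Either way, compare the fingerprint before typing "yes".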

All documentation below this point now uses "Lonestar" to refer to our new 1300 node (1855 PowerEdge, EM64T) system. All references and documentation for the old 512 node (1750 PowerEdge, IA-32) system have been removed.


Introduction

The TACC Lonestar cluster is one of the largest academic computational resources in the nation. It serves as a computational resource in the NSF TeraGrid partnership, the Texas-wide Computational Grid (HiPCAT), the UT campus grid and the UT research community.

Lonestar contains 1300 processors with a peak performance of 8.3 TFLOPS, 3.9 TB of total memory and 47TB of local disk space. The system supports a 15TB global, parallel file storage, managed by the Lustre file system. Nodes are interconnected with InfiniBand technology in a fat-tree topology with a 1GB/sec point-to-point bandwidth. Also, a 2.8 petabyte archive system and 5TB SAN network storage system are available through the login/development nodes.

Further expansion and an upgrade of the nodes are scheduled for the fall of 2006. The number of nodes will be doubled, and each node will have two dual-core processors and 8GB of memory.


Architecture

The configuration and features for the compute nodes, interconnect and I/O systems are described below, and summarized in Tables 1-3.

  • Compute Nodes: Each Dell PowerEdge 1855 blade runs a Linux (CentOS) 2.6.9 x86_64 operating system. A node contains two Intel EM64T (64-bit) processors on a single board, as an SMP unit. Each CPU runs at 3.2GHz and supports 2 floating-point operations per clock period, for a peak performance of 6.4 GFLOPS per CPU. Each node contains 3GB of memory. The memory subsystem has an 800 MHz Front Side Bus and dual channels with 400 MHz DDR2 modules. Both processors share access to the memory controllers in the memory controller hub (MCH, or North Bridge).

  • Interconnect: The interconnect topology is a fat tree with an oversubscription of 2. The forty nodes of each frame are connected to 3 TopSpin 24-port 120 switches (leaves), with 14, 12, and 14 node connections on the 3 switches. From each TopSpin 120 switch there are 2 uplink connections to 3 TopSpin 96-port 270 switches (cores). Sixteen frames are connected in similar fashion to the cores, supporting 640 nodes. (The additional 10 nodes are connected to 5 of the TopSpin 120 switches that had only 12 node connections.) Figure 3 illustrates the topology connection from the leaf switches to the core switches.

  • Filesystems: The Lonestar storage includes 60GB of usable local SCSI disk on each node. Home directories are NFS mounted on all nodes and limited by quota to 200MB per user. The Work file system, also accessible from all nodes, is a parallel file system supported by Lustre and 38TB of DataDirect storage. Archival storage is directly available from the login node and accessible through rcp. SAN storage is also available to projects with a SAN allocation.
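
The aggregate figures quoted for the system follow from the per-node values given above. As a quick consistency check, here is a small sketch that uses only numbers stated in this section and its tables; the function name lonestar_peak_summary is made up for illustration, and awk is assumed to be available.

```shell
# Recompute the aggregate figures from the per-node values quoted in the text.
lonestar_peak_summary() {
    awk 'BEGIN {
        frames = 16; nodes_per_frame = 40; extra_nodes = 10
        nodes = frames * nodes_per_frame + extra_nodes   # 640 + 10 nodes
        cpus  = nodes * 2                                # two processors per node
        gflops_per_cpu = 3.2 * 2                         # 3.2 GHz x 2 FP results per clock
        printf "nodes=%d cpus=%d\n", nodes, cpus
        printf "peak per CPU = %.1f GFLOPS\n", gflops_per_cpu
        printf "peak system  = %.2f TFLOPS\n", cpus * gflops_per_cpu / 1000
        printf "memory       = %d GB\n", nodes * 3       # 3 GB per node
        printf "local disk   = %d GB\n", nodes * 73      # one 73 GB disk per node
    }'
}

lonestar_peak_summary
```

The system peak comes out at 8.32 TFLOPS, which rounds to the quoted 8.3 TFLOPS, and 650 disks of 73GB give roughly the 47TB of aggregate local disk.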


    Figure 1. Lonestar System: Front row of frames.
    Figure 2. Dell 1855 Blade motherboard.
    Figure 3. InfiniBand Switch Topology.

    Component                       Technology                    Performance/Size
    Peak Floating Point Operations                                8.3 TFLOPS
    Nodes                           Xeon, Dual CPU                650 Nodes / 1300 CPUs
    Memory                          Distributed                   2TB (Aggregate)
    Shared Disk                     Lustre, parallel File System  15TB
    Local Disk                      SCSI                          47TB (Aggregate)
    Interconnect                    InfiniBand Switch             1 GB/s P-2-P Bandwidth

    Component                          Technology
    CPUs Per Node                      2
    Motherboard                        Intel E7520 Chipset
    Memory Per Node                    3 GB
    System Bus                         800 MHz Processor Front Side Bus (FSB): 6.4GB/s
    Memory Bus & Configuration         dual channel, 400MHz, 6x512MB DDR2: 6.4GB/s
    PCI Express                        x4
    73GB Disk                          10K RPM Ultra 320 SCSI single channel, 80p
    InfiniBand Daughtercard (adapter)  Topspin/PCI-X host interface, 4x 10Gbps

    Technology               64-bit (Intel EM64T)
    Clock Speed              3.2GHz
    FP Results/Clock Period  2
    Peak Performance         6.4GFLOPS
    L2 Cache                 2MB
    L1 Cache                 16KB

    Storage Class  Size       Architecture                   Features
    Local          73GB/node  SCSI                           mounted on /tmp
    Parallel       38TB       Lustre, DataDirect S2A9500     16 Dell 1850 I/O data servers, Brocade switch,
                                                             user striping, MPI-IO, mnt on /work
    SAN            15TB       Synergy FS, SUN Storage Tek    QLogic switch, SUN V880 Server, mnt on /san/hpc/<project>
    HOME           400GB      NFS, Raid-5, 200MB/user        Dell 2850 Server, automounted
    Archive        2.8PB      DMF (Data Migration Facility)  SGI Origin 3000 Server, xxx File Cache


    64-Bit Technology

    Intel Extended Memory 64-bit Technology (EM64T) Features

    Users must recompile their applications when migrating from a 32-bit system to Lonestar because of the difference in processor architectures. Lonestar nodes have 64-bit Intel EM64T processors, a 64-bit OS, and support only 64-bit libraries. Common 32-bit to 64-bit porting issues are summarized in Table n.

    The user environment on Lonestar functions nearly identically to the other TACC Linux systems. Except for a difference in the compiler architecture option, the size of long and pointer types (in C), and directory references to 64-bit libraries, the commands for compiling, loading and running applications are the same. To check for potential porting problems between 32-bit and 64-bit modes, include the -Wp64 option when compiling C or C++ codes with Intel's icc compiler. Additional information is available in the compilation section.


    Recommendations and Notice to all users

  • Users migrating from a 32-bit environment (from the old Lonestar to the new Lonestar) should read the Migration Guide.
  • Users who are already familiar with the EM64T architecture need not read the migration guide.
  • All users should read the following overview of the special features and commands that are unique on the Lonestar system.

    Recompile all codes
    The default and recommended compiler for Lonestar is the Intel 9.1 compiler.
    There are some incompatibilities between the 2.4 and 2.6 Linux kernels.
    The Lonestar Linux 2.6 kernel has been patched for the Lustre file system and the PAPI hardware performance counter interface.
    The latest MKL 8.1 and gotoblas libraries have been installed.
    Lonestar uses the MVAPICH version of the MPI libraries.
    Unless you have developed your code on a system with an identical configuration, you MUST recompile and reload your application codes to avoid incompatibility issues.

    Review ALL batch job scripts
    The launch wrapper syntax for MPI-compiled code is:

    ibrun ./a.out
    This command uses the default environment with the correct IB communication drivers.
    Check your job scripts, and replace any other MPI executable launchers, such as "gm-run", "mpirun", or "mvapich_wrapper", with ibrun.
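
To find the job scripts that still need this change, a search along the following lines may help. It is only a sketch: the function name find_old_launchers and the example path are made up for illustration, and it assumes a grep that supports the -w (whole word) option, as GNU and BSD grep do.

```shell
# Hypothetical helper: list the job scripts that still call one of the
# launchers named above instead of ibrun.
find_old_launchers() {
    # -l prints only file names, -E enables alternation, -w matches whole words
    grep -lEw 'gm-run|mpirun|mvapich_wrapper' "$@"
}

# Example (the path is illustrative):
# find_old_launchers "$HOME"/jobs/*.sh
```

The search only reports file names. The launch lines themselves are best edited by hand, because the old launchers take arguments (such as a processor count) that ibrun does not need.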

    Intel Compiler Architecture Option
    To optimize code specifically for the EM64T technology, include the architecture option (highly recommended):

    -xP
    Check the compiler commands in your makefiles and build scripts. Replace any other "-x" type option with the -xP option.

    Module Environment Variables
    All application and library environment variables created by modules now include a TACC prefix to separate the vendor and TACC variable name spaces. Any scripts written for the old Lonestar, Longhorn or Wrangler systems need their variable names updated before they are run on Lonestar. The change is illustrated below:

    Replace $<PACKAGE>_<DIRECTORY> with $TACC_<PACKAGE>_<DIRECTORY>
    e.g. $MKL_LIB --> $TACC_MKL_LIB
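
The renaming can be applied to existing scripts and makefiles with sed. The filter below is a minimal sketch under stated assumptions: the function name add_tacc_prefix is made up for illustration, it handles one variable name at a time, and it only recognizes the $NAME and ${NAME} spellings.

```shell
# Hypothetical filter: rewrite references to one module variable so that
# they carry the TACC_ prefix. Reads stdin, writes stdout.
add_tacc_prefix() {
    name=$1      # variable name without the leading "$", e.g. MKL_LIB
    sed -e 's/\$'"$name"'\([^A-Za-z0-9_]\)/$TACC_'"$name"'\1/g' \
        -e 's/\$'"$name"'$/$TACC_'"$name"'/' \
        -e 's/\${'"$name"'}/${TACC_'"$name"'}/g'
}

# Example: preview the change for a makefile before editing it.
# add_tacc_prefix MKL_LIB < Makefile | diff Makefile -
```

Longer names that merely start with the same letters (for example $MKL_LIBRARY) are left alone, and references that already carry the prefix are not changed again.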

    Lustre File System (WORK)
    The $WORK directory on Lonestar is a Lustre File System.
    This file system looks and feels like any other Unix file system.
    Moreover, it supports parallel I/O (MPI-IO); and for large-file I/O, performance can be optimized by user controlled striping.

    Please Read the Lustre Guide
    Performance on this file system may be twice that of other $WORK file systems at TACC.

  • Last modified: July 14 2009 10:52:17.


    System Access

    SSH

    To ensure a secure login session, users must connect to machines using the secure shell program, ssh. Telnet is no longer allowed because of the security vulnerabilities associated with it. The "r" commands rlogin, rsh, and rcp, as well as ftp, are also disabled on this machine for similar reasons. These commands are replaced by the more secure alternatives included in SSH: ssh, scp, and sftp.

    Before any login sessions can be initiated using ssh, a working SSH client needs to be present in the local machine. Go to the TACC introduction to SSH for information on downloading and installing SSH.

    To initiate an ssh connection to a machine, type the following on the local workstation:

    ssh <login-name>@<machine-name>.tacc.utexas.edu
    Note that the <login-name>@ prefix is needed only if the user names on the local machine and the TACC machine differ.

    Password changes (with the passwd command) are forced to adhere to "strength checking" rules, and users are asked to comply with practices presented in the TACC password guide.



    Login Info

    Login Shell

    The most important component of a user's environment is the login shell that interprets text on each interactive command line and statements in shell scripts. Each login has a line entry in the /etc/passwd file, and the last field contains the shell launched at login. To determine your login shell, execute:

    grep <my_login_name> /etc/passwd {to see your login shell}

    You can use the chsh command to change your login shell; instructions are in the man page. Available shells are listed in the /etc/shells file with their full-path. To change your login shell, execute:

    cat /etc/shells {select a <shell> from list}
    chsh -s <shell> <username> {use full path of the shell}
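
Because grep also matches the login name anywhere else on a line, it can help to pick out the last field explicitly. The function below is a sketch; the name login_shell_of is made up for illustration, and the optional second argument exists only so that the function can be tried on a sample file.

```shell
# Print the login shell (the last colon-separated field) for one user.
# $1 = login name, $2 = passwd-format file (defaults to /etc/passwd).
login_shell_of() {
    awk -F: -v user="$1" '$1 == user { print $NF }' "${2:-/etc/passwd}"
}

# Show the login shell of the current user.
login_shell_of "$(id -un)"
```

On systems where accounts are not stored in the local /etc/passwd file the function prints nothing; on Linux, "getent passwd <my_login_name>" consults the other account sources as well.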


    User Environment

    The next most important component of a user's environment is the set of environment variables. Many of the Unix commands and tools, such as the compilers, debuggers, profilers, editors, and just about all applications that have GUIs (Graphical User Interfaces), look in the environment for variables that specify information they may need to access. To see the variables in your environment execute the command:

    env {to see environment variables}

    The variables are listed as keyword/value pairs separated by an equal (=) sign, as illustrated below by the HOME and PATH variables.

    HOME=/home/utexas/staff/milfeld
    PATH=/bin:/usr/bin:/usr/local/apps:/opt/intel/bin

    (PATH has a colon (:) separated list of paths for its value.) It is important to realize that variables set in the environment (with setenv for C shells and export for Bourne shells) are "carried" to the environment of shell scripts and new shell invocations, while normal "shell" variables (created with the set command) are useful only in the present shell. Only environment variables are seen in the env (or printenv) command; execute set to see the (normal) shell variables.
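
The difference between the two kinds of variables can be seen by starting a child shell. The lines below are a small illustration for Bourne-type shells; the variable names plain_var and EXPORTED_VAR are arbitrary.

```shell
# Bourne-type shell (sh, bash, ksh) illustration.
plain_var="shell variable"             # ordinary shell variable, not exported
EXPORTED_VAR="environment variable"
export EXPORTED_VAR                    # now part of the environment

# A child shell inherits only the exported variable.
sh -c 'echo "child sees plain_var=[$plain_var] EXPORTED_VAR=[$EXPORTED_VAR]"'
```

The child shell prints an empty value for plain_var and the assigned value for EXPORTED_VAR. In the C shells the corresponding commands are set (shell variable) and setenv (environment variable).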


    Startup Scripts

    All Unix systems set up a default environment and provide administrators and users with the ability to execute additional Unix commands to alter the environment. These commands are "sourced"; that is, they are executed by your login shell, and the variables (both normal and environmental) as well as aliases and functions are included in the present environment. We recommend that you customize the login environment by inserting your "startup" commands in .cshrc_user, .login_user, and .profile_user files in your home directory.

    Basic site environment variables and aliases are set in

    /etc/csh.cshrc {C-shell, non-login specific}
    /etc/csh.login {C-shell, specific to login}
    /etc/profile {Bourne-type shells}

    For historical reasons, the C shells source two types of files. The .cshrc type files are sourced first (/etc/csh.cshrc--> $HOME/.cshrc--> /usr/local/cshrc--> $HOME/.cshrc_user). These files are used to set up environments that are to be executed by all scripts and used for access to the machine without a login. For example, the following commands only execute the .cshrc type files on the remote machine:

    scp data lonestar.tacc.utexas.edu: {only .cshrc sourced on lonestar}
    ssh lonestar.tacc.utexas.edu date {only .cshrc sourced on lonestar}

    The .login type files are used to set up environment variables that you commonly use in an interactive session. They are sourced after the .cshrc type files (/etc/csh.login--> $HOME/.login--> /usr/local/login--> $HOME/.login_user). Similarly, if your login shell is a Bourne shell (bash, sh, ksh, ...), the profile files are sourced (/etc/profile--> $HOME/.profile--> /usr/local/profile--> $HOME/.profile_user).

    The commands in the /etc files above are concerned with operating system behavior and set the initial PATH, ulimit, umask, and environment variables such as the HOSTNAME. They also source command scripts in /etc/profile.d -- the /etc/csh.cshrc sources files ending in .csh, and /etc/profile sources files ending in .sh. Many site administrators use these scripts to setup the environments for common user tools (vim, less, etc.) and system utilities (ganglia, modules, Globus, LSF, etc.)

    TACC has to coordinate the environments on platforms of several operating systems: AIX, Linux, IRIX, Solaris, and Unicos. In order to efficiently maintain and create a common environment among these systems, TACC uses its own startup files in /usr/local/etc. A corresponding file in this etc directory is sourced by the .profile, .cshrc, and .login files that reside in your home directory. (Please do not remove these files and the sourcing commands in them, even if you are a Unix guru.) Any commands that you put in your .login_user, .cshrc_user, or .profile_user file are sourced (if the file exists) at the end of the corresponding /usr/local/etc command files. If you accidentally remove your .login, .cshrc, or .profile file, you can copy new ones from /usr/local/etc/start-up or execute

    /usr/local/bin/install_ut_startups

    to get a new copy (your old files are renamed with a date suffix).
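
As an illustration of the kind of commands that belong in a personal startup file, here is a sketch of a $HOME/.profile_user for Bourne-type shells. The contents are examples only, not a recommended configuration: the personal bin directory is an assumption, and fftw stands in for whatever module you use regularly.

```shell
# Sketch of a $HOME/.profile_user for Bourne-type shells (contents are illustrative).

# Put a personal bin directory at the front of PATH, but only once.
case ":$PATH:" in
    *":$HOME/bin:"*) ;;                          # already present
    *) PATH="$HOME/bin:$PATH"; export PATH ;;
esac

# Load a frequently used module if the module command is defined in this shell.
if command -v module >/dev/null 2>&1; then
    module load fftw
fi
```

Because the file is sourced at the end of the site startup files, it should only add to the environment and should not replace PATH or the other variables that the site files have set.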


    Modules

    TACC is constantly including updates and installing revisions for application packages, compilers, communications libraries, and tools and math libraries. To facilitate the task of updating and to provide a uniform mechanism for accessing different revisions of software, TACC uses the modules utility.

    At login, a basic environment for the default compilers, tools, and libraries is set by several modules commands. Your PATH, MANPATH, LIBPATH, directory locations (WORK, ARCHIVE, HOME, ...), aliases (cdw, cda, ...) and license paths are just a few of the environment variables and aliases created for you. This frees you from having to set them initially and update them whenever modifications and updates are made in system and application software.

    Users who need 3rd party applications, special libraries, and tools for their development can quickly tailor their environment with only the applications and tools they need. (Building your own specific application environment through modules allows you to keep your environment free from the clutter of all the other application environments you don't need.)

    Each of the major TACC applications has a modulefile that sets, unsets, appends to, or prepends to environment variables such as $PATH, $LD_LIBRARY_PATH, $INCLUDE_PATH, $MANPATH for the specific application. Each modulefile also sets functions or aliases for use with the application. A user need only invoke a single command,

    module load <application>

    at each login to configure an application/programming environment properly. If you often need an application environment, place the modules command in your .login_user and/or .profile_user shell startup file.

    Most of the package directories are in /usr/local/apps ($APPS) and are named after the package name (<app>). In each package directory there are subdirectories that contain the specific version of the package. The APPS directory structure is shown in the diagram below:


    TACC Applications Directory Structure

    The directory structure for the fftw package is shown below. The directory fftw in /usr/local/apps contains 3 different version directories for the package: 2.1.3, 2.1.5 and version 3.0. Since fftw-2.1.5 is the present default version, a fftw link is created to the default, the fftw-2.1.5 subdirectory.

    Example FFTW Applications Directory Structure


    The directory paths for the different fftw package versions can be constructed easily with the help of the $APPS variable:

    $APPS/<app>/<app.version> {path to specific package version}
    $APPS/<app>/<app> {link to default version}
    $APPS/fftw/fftw {example, default version directory for fftw}
    $APPS/fftw/fftw-2.1.3 {example, directory for earlier version of fftw}
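
The naming pattern above can be wrapped in a small function. This is a sketch only: the function name app_dir is made up for illustration, and the fallback value for APPS is the /usr/local/apps location mentioned in the text.

```shell
# Hypothetical helper: build a package directory path following the layout above.
# With one argument it returns the default-version link; with two arguments,
# the directory of a specific version.
APPS=${APPS:-/usr/local/apps}

app_dir() {
    if [ -n "${2:-}" ]; then
        echo "$APPS/$1/$1-$2"
    else
        echo "$APPS/$1/$1"
    fi
}

app_dir fftw          # default version link
app_dir fftw 2.1.3    # directory of a specific version
```

In most cases the variables set by "module load <app>" are the better way to reach a package; building paths by hand is mainly useful for inspecting which versions are installed.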

    The fftw package requires several environment variables that point to its home, libraries, include files, and documentation. These can be set in your environment by loading the fftw module:

    module load fftw

    The details of the environmental changes are in the modulefile, /usr/local/opt/modules/modulefiles/fftw. To see a list of available modules and a synopsis of a modulefile's operations, execute:

    module available {lists modules}
    module help <app> {lists environment changes performed for <app>}

    During upgrades, new modulefiles are created to reflect the changes made to the environment variables. TACC will always announce upgrades and module changes in advance.

    Another feature of modules is the ease in changing the environment for experimenting with new updates or backing down to older application versions. TACC will often make a link from <app>.new to the updated package modulefile (<app>.<new-version>) that has not become the default version yet. Also, the retired default modulefile is often linked to <app>.old. This makes it easier for users to change to new or old environments with the commands:

    module swap <app> <app>.old
    module swap <app> <app>.new

    (If the app module has not been loaded, then it is only necessary to load the new or old version; e.g. module load <app>.old.)

    For more information on modules and a description of how to build modulefiles, check out the man pages and the following URL:

    http://www.tacc.utexas.edu/resources/userguides/modules/.

    For information on customizing your login, go to the following URL:

    http://www.tacc.utexas.edu/resources/userguides/login/.


    File Systems

    The TACC HPC platforms have several different file systems with distinct storage characteristics. There are predefined, user-owned directories in these file systems for users to store their data. Of course, these file systems are shared with other users, so they are managed by either a quota limit, a purge policy (time-residency) limit, or a migration policy.

    To determine the size of a file system, cd to the directory of interest and execute the "df" command with the syntax:

    df -k .

    or simply execute it without the "dot" to see all file systems. In the example below the file system name appears on the left, and the used and available space (-k, in units of 1KBytes) appear in the middle columns followed by the percent used:

    % df -k .          
    File System 1k-blocks Used Available Use% Mounted on
    /dev/vg/home 8256952 6675732 1161792 86% /home

    To determine the amount of space occupied in a user-owned directory, cd to the directory and execute the du command with the -sb option (s=summary, b=units in bytes):

    du -sb

    To determine quota limits and usage on $HOME, execute the quota command without any options (from any directory):

    quota
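
The du output can be compared with a known limit to see how close a directory is to its quota. The function below is a sketch; its name, usage_report, is made up for illustration, and it uses the portable -k option (units of 1KB) rather than -b.

```shell
# Hypothetical helper: report how much of a size limit a directory uses.
# $1 = directory, $2 = limit in KB (the 200MB home quota is 204800 KB).
usage_report() {
    used_kb=$(du -sk "$1" | awk '{ print $1 }')
    echo "$used_kb KB used of $2 KB ($(( 100 * used_kb / $2 ))%)"
}

# Example for the home directory quota (commented out because du can take
# a while on a large directory tree):
# usage_report "$HOME" 204800
```

The quota command remains the authoritative source for limits and usage; du measures disk usage, which may differ slightly from what the quota system counts.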

    The four major file systems available on lonestar are:

    home directory
    At login, the system automatically changes to your home directory.
    This is the recommended location to store your source codes and build your executables.
    The quota limit on home is 200MB.
    A user's home directory is accessible from the frontend node and any compute node.
    Use $HOME to reference your home directory in scripts.
    Use cd to change to $HOME.

    work directory
    Store large files here.
    Often users change to this directory in their batch scripts and run their jobs in this file system.
    A user's work directory is accessible from the frontend node and any compute node.
    The work file system is approximately 30TB.
    Purge Policy: Files that have not been accessed for more than 10 days are purged.
    This file system is not backed up.
    Use $WORK to reference your work directory in scripts.
    Use cdw to change to $WORK.
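
To see which files are at risk under the purge policy, find can list files by access time. This is a sketch, not the purge tool itself: the function name purge_candidates is made up for illustration, and find counts whole 24-hour periods, so the result only approximates the policy.

```shell
# Hypothetical helper: list regular files under a directory whose last access
# is more than N days old (N defaults to 10, the purge limit quoted above).
purge_candidates() {
    find "$1" -type f -atime +"${2:-10}" -print
}

# Example (commented out; walking a large $WORK tree can take a while):
# purge_candidates "$WORK"
```

Files that should be kept are better copied to the archive than kept alive by touching them periodically.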

    More on Work -- How to do parallel I/O in the Lustre File System

    scratch or temporary directory
    This is a directory in a local disk on each node where you can store files and perform local I/O for the duration of a batch job.
    Often, in batch jobs it is more efficient to use and store files directly in $WORK (to avoid moving files from scratch at the end of a job).
    The scratch file system is approximately 60GB.
    Files within the scratch directory on each node are removed immediately after the job terminates.
    Use $SCRATCH to reference a temporary directory in scripts.

    archive
    Store permanent files here for archival storage.
    The archive tape capacity is 2.5PB; the disk cache size is 2.5TB.
    The access speed is low relative to the work directory.
    This file system is NOT NFS mounted (directly accessible) on any node.
    Use the rcp command to transfer data to this system.
    Use the rsh command to log in to archive from any TACC machine.
      e.g.
          rcp ${ARCHIVER}:$ARCHIVE/myfile  $WORK
          rsh archive

    Use this system for long-term file storage (tape system); it is not appropriate to use it as a staging area.

    More on Archive -- How to use the rcp command



    Programming Models

    There are two distinct memory models for computing: distributed-memory and shared-memory. In the former, the message passing interface (MPI) is employed in programs to communicate between processors that use their own memory address space. In the latter, open multiprocessing (OMP) programming techniques are employed for multiple threads (light weight processes) to access memory in a common address space.

    For distributed memory systems, single-program multiple-data (SPMD) and multiple-program multiple-data (MPMD) programming paradigms are used. In the SPMD paradigm, each processor loads the same program image and executes and operates on data in its own address space (different data). This is illustrated in Figure 4. It is the usual mechanism for MPI code: a single executable (a.out in the figure) is available on each node (through a globally accessible file system such as $WORK or $HOME), and launched on each node (through the batch MPI launch command, "ibrun a.out").

    In the MPMD paradigm, each processor loads and executes a different program image and operates on different data sets, as illustrated in Figure 4. This paradigm is often used by researchers who are investigating the parameter space (parameter sweeps) of certain models, and need to launch tens or hundreds of single-processor executions on different data. (This is a special case of MPMD in which the same executable is used, and there is NO MPI communication.) The executables are launched through the same mechanism as SPMD jobs, but a Unix script is used to assign input parameters for the execution command (through the batch MPI launcher, "ibrun script_command"). Details of the batch mechanism for parameter sweeps are described in the Running Programs section.

    Figure 4. Distributed Memory Paradigm: Single/Multiple-Program Multiple-Data.
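
For the parameter-sweep case, the wrapper script has to map each task to its own input. The sketch below shows one way to do that with a parameter file. Everything in it is an assumption made for illustration: the function name param_for_task, the file params.txt, the program model.exe, and in particular TASK_ID, which stands for whatever zero-based task index the launcher provides and is not a variable defined by ibrun. The actual mechanism is described in the Running Programs section.

```shell
# Hypothetical sweep helper: every task runs the same script but picks its
# own line from a parameter file.
param_for_task() {
    # $1 = zero-based task index, $2 = file with one parameter set per line
    sed -n "$(( $1 + 1 ))p" "$2"
}

# Inside the wrapper script one might then write (all names are illustrative):
# ./model.exe $(param_for_task "$TASK_ID" params.txt) > "out.$TASK_ID"
```

Keeping the parameter sets in a file, one per line, makes it easy to rerun a single case by hand with the same arguments.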

    The shared-memory programming model is used on Symmetric Multi-Processor (SMP) nodes, like the TACC Champion Power5 System. Each node on this system contains 8 CPUs with a single 16GB memory subsystem.

    The programming paradigm for this memory model is called Parallel Vector Processing (PVP) or Shared-Memory Parallel Programming (SMPP). The former name is derived from the fact that vectorizable loops are often employed as the primary structure for parallelization. The main point of SMPP computing is that all of the processors in the same node share data in a single memory subsystem, as shown in Figure 5. There is no need for explicit messaging between processors as with MPI coding.

    Figure 5. Shared-Memory Parallel Processing.

    In the SMPP paradigm either compiler directives (as pragmas in C, and special comments in FORTRAN) or explicit threading calls (e.g. with Pthreads) are employed. The majority of science codes now use OpenMP directives that are understood by most vendor compilers, as well as the GNU compilers.

    In cluster systems that have SMP nodes and a high speed interconnect between them, programmers often treat all CPUs within the cluster as having their own local memory. On a node an MPI executable is launched on each CPU and runs within a separate address space. In this way, all CPUs appear as a set of distributed memory machines, even though each node has CPUs that share a single memory subsystem.

    In clusters with SMPs, hybrid programming is sometimes employed to take advantage of higher performance at the node-level for certain algorithms that use SMPP (OMP) parallel coding techniques. In hybrid programming, OMP code is executed on the node as a single process with multiple threads (or an OMP library routine is called), while MPI programming is used at the cluster-level for exchanging data between the distributed memories of the nodes.

    The number of applications that benefit from hybrid programming on dual-processor nodes (e.g. on Lonestar) is very small. The programming and support of hybrid codes is complicated by compiler and platform support of both paradigms. However, with the new multi-core multi-socket commodity systems on the horizon, there may be a resurgence in hybrid programming if these systems provide better performance with SMPP (OMP) algorithms.

    For further information on OpenMP, MPI and on programming models/paradigms, please see the manuals and packages sections of this document.



    Compilation

    Compiling and Running Serial Programs

    The Lonestar programming environment uses the Intel C++ and Intel Fortran compilers as the default compiler system. The following section highlights the important HPC aspects of using the Intel compilers. The Intel compiler commands can be used for both compiling (making ".o" object files) and linking (making an executable from ".o" object files). The tables below list the syntax for serial and parallel program compilation.

    Compiling Serial Programs

    Compiler  Program Type  Suffix               Example
    icc       C             .c                   icc   [compiler_options] prog.c
    icc       C++           .C, .cc, .cpp, .cxx  icc   [compiler_options] prog.cpp
    ifort     F77           .f, .for, .ftn       ifort [compiler_options] prog.f
    ifort     F90           .f90, .fpp           ifort [compiler_options] prog.f90

    Appropriate program-name suffixes are required for each compiler. By default, the executable name is a.out; and it may be renamed with the -o option. To compile without the link step, use the -c option. The following examples illustrate renaming an executable and the use of two important compiler optimization options:

    C icc    -o flamec.exe -O3 -xP prog.cc
    Fortran ifort -o flamef.exe -O3 -xP prog.f90

    Commonly used options may be placed in a icc.cfg or ifc.cfg file for compiling C and Fortran codes, respectively.

    To run a serial (non-parallel) executable, simply type the name of the executable on the command line (and hit return).

    ./a.out {serial code execution}

    The relative path expression "./" tells the shell to look in the present working directory for the executable. It is often used to make sure that an executable of the same name in another directory (as determined by the PATH environment variable) is not executed. Also, if the "." is not in the PATH variable it is necessary to use "./" for the shell to find the executable.
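
Whether the current directory is searched at all can be checked from the PATH value itself. The function below is a small sketch; the name path_has_dot is made up for illustration. It treats both an explicit "." entry and an empty entry (which Bourne-type shells also interpret as the current directory) as a match.

```shell
# Return success if a PATH-style list includes the current directory,
# either as "." or as an empty entry.
path_has_dot() {
    case ":$1:" in
        *:.:*|*::*) return 0 ;;
        *)          return 1 ;;
    esac
}

if path_has_dot "$PATH"; then
    echo "the current directory is searched; a.out may work without ./"
else
    echo "the current directory is not searched; use ./a.out"
fi
```

Using "./a.out" is the safer habit in either case, because it never picks up an executable of the same name from another directory.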

    A list of all compiler options, their syntax, and a terse explanation is given when the compiler command is executed with the -help option. Also, man pages are available. To see the help or man information, execute one of:

    icc   -help
    ifort -help
    man icc
    man ifort

    Some of the more important options are listed below. Additional documentation, references, and a number of user guides (pdf, html) are available in the /opt/intel/compiler9.x/docs directory.


    The Intel 9.1 Compiler Suite

    The Intel 9.1 compilers are loaded as the default compilers at login with the intel module. (The 8.1 compilers are available for special porting needs.) The gcc 3.4.4 compiler and module are also available, but we recommend using the Intel suite whenever possible. The 9.1 suite is installed with the EM64T 64-bit standard libraries and will compile programs as 64-bit applications (as the default compiler mode). Any programs compiled on 32-bit systems (including the old Lonestar) need to be recompiled to run natively on Lonestar. Any pre-compiled packages should be EM64T (x86-64) compiled or else errors may occur. Since only 64-bit versions of the MPI libraries have been built on Lonestar, programs compiled in 32-bit mode will not execute MPI code.
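
Before running an older executable or a pre-compiled package, it can be worth checking whether it was built as a 32-bit or a 64-bit program. The function below is a minimal sketch that reads the class byte of the ELF header with od; the name elf_class is made up for illustration, and it assumes the usual Linux ELF executable format.

```shell
# Hypothetical helper: report whether an ELF file is a 32-bit or 64-bit object.
elf_class() {
    # Read the first five bytes of the file as unsigned decimal numbers.
    set -- $(od -An -tu1 -N5 "$1")
    # Bytes 1-4 must be the ELF magic number: 0x7f 'E' 'L' 'F'.
    if [ "$1 $2 $3 $4" != "127 69 76 70" ]; then
        echo "not ELF"
        return 1
    fi
    # Byte 5 is the class field: 1 = 32-bit, 2 = 64-bit.
    case "$5" in
        1) echo "32-bit" ;;
        2) echo "64-bit" ;;
        *) echo "unknown class" ;;
    esac
}

# Example (commented out; a.out is whatever executable you want to check):
# elf_class ./a.out
```

Where it is installed, the file command reports the same information in more detail.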

    The Intel Fortran compiler command is ifort (as of Version 8.0). The ifc command is still accepted, but it displays an annoying message about the name being obsolete.

    Web-accessible Intel manuals are available: Intel 9.1 C++ Compiler Documentation and Intel 9.1 Fortran Compiler Documentation.


    Compiling OpenMP Programs

    Since each of the PowerEdge blades (nodes) of the Lonestar cluster is a Xeon dual-processor system, applications can use the shared memory programming paradigm "on node". However, because of the limited number of processors in each node, there are rarely any significant performance benefits to using a shared-memory model on the node.

    The OpenMP compiler options are listed below for those who do need SMP support on the nodes. For hybrid programming, use the mpi-compiler commands, and include the openmp options.
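
    For hybrid codes, the advice above amounts to passing the -openmp option to an MPI compiler wrapper. A minimal sketch, assuming a hypothetical source file named hybrid.c; the command is only printed unless mpicc is found on the PATH and the source file exists:

```shell
# hybrid.c and hybrid.exe are placeholder names used for illustration.
src=hybrid.c
exe=hybrid.exe

# MPI compiler wrapper plus the OpenMP option, as described in the text.
cmd="mpicc -openmp -O3 -xP -o $exe $src"

if command -v mpicc >/dev/null 2>&1 && [ -f "$src" ]; then
    $cmd
else
    echo "would run: $cmd"
fi
```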


    Compiling Parallel Programs with MPI

    The "mpicmds" commands support the compilation and execution of parallel MPI programs for specific interconnects and compilers. At login, the MPI MVAPICH (mvapich) and Intel 9.1 compiler (intel) modules are loaded to produce the default environment, which provides the location of the corresponding mpicmds. The mpicc, mpiCC, mpif77, and mpif90 compiler scripts (wrappers) compile MPI code and automatically link startup and message passing libraries into the executable. The following table lists the compiler wrappers for each language:

    Compiler  Program Type  Suffix                 Example
    mpicc     C             .c                     mpicc  [compiler_options] prog.c
    mpiCC     C++           .cc, .C, .cpp, .cxx    mpiCC  [compiler_options] prog.cc
    mpif77    F77           .f, .for, .ftn         mpif77 [compiler_options] prog.f
    mpif90    F90           .f90, .fpp             mpif90 [compiler_options] prog.f90

    Appropriate program-name suffixes are required for each wrapper. By default, the executable name is a.out; and it may be renamed with the -o option. To compile without the link step, use the -c option. The following examples illustrate renaming an executable and the use of two important compiler optimization options:

    C       mpicc  -o prog.exe -O3 -xP prog.c
    Fortran mpif90 -o prog.exe -O3 -xP prog.f90
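
    The suffix rules in the wrapper table above can be sketched as a small shell function. This is only an illustration: mpi_wrapper_for is a hypothetical helper name, and main.c and solver.f90 are placeholder file names. The loop prints compile-only (-c) commands rather than running them.

```shell
# mpi_wrapper_for is a hypothetical helper that mirrors the wrapper table:
# it prints the wrapper whose accepted suffixes include that of the file.
mpi_wrapper_for() {
    case "$1" in
        *.c)                   echo mpicc  ;;
        *.cc|*.C|*.cpp|*.cxx)  echo mpiCC  ;;
        *.f|*.for|*.ftn)       echo mpif77 ;;
        *.f90|*.fpp)           echo mpif90 ;;
        *)                     echo "unrecognized suffix: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) a compile-only command for each source file.
for src in main.c solver.f90; do
    echo "$(mpi_wrapper_for "$src") -c -O3 -xP $src"
done
```

    Note that the match is case-sensitive, so prog.c selects mpicc while prog.C selects mpiCC.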

    Include linker options such as library paths and library names after the program module names, as explained in the Loading Libraries section below. The Running Code section explains how to execute MPI executables in batch scripts and "interactive batch" runs on compute nodes.

    We recommend that you use the Intel compiler for optimal code performance. TACC does not support the use of the gcc compiler for production codes on the Lonestar system. For those rare cases when gcc is required, for either a module or the main program, you can specify the gcc compiler with the -cc=gcc option for the module requiring gcc. (Since gcc- and Intel-compiled code are binary compatible, you should compile all other modules that don't require gcc with the Intel compiler.) When gcc is used to compile the main program, an additional Intel library is required. The examples below show how to invoke the gcc compiler for the two cases (first gcc for a module, then gcc for the main program):


    mpicc -O3 -xP -c -cc=gcc suba.c
    mpicc -O3 -xP mymain.c suba.o
     
    mpicc -O3 -xP -c suba.c
    mpicc -O3 -xP -cc=gcc -L$ICC_LIB -lirc mymain.c suba.o


    Basic Optimization for Serial and Parallel Programming using OpenMP and MPI

    The MPI compiler wrappers use the same compilers that are invoked for serial code compilation. So, any of the compiler flags used with the icc command can also be used with mpicc; likewise for ifort and mpif90; and iCC and mpiCC. Below are some of the common serial compiler options with descriptions.

    Compiler Option         Description
    -O3                     performs some compile-time and memory-intensive optimizations in addition to those executed with -O2, but may not improve performance for all programs
    -vec_report[0|...|5]    controls the amount of vectorizer diagnostic information
    -xP                     generates specialized code to run exclusively on EM64T processors
    -fast                   DO NOT USE -- static load not allowed
    -g -fp                  produces debugging information and disables use of EBP as a general-purpose register
    -openmp                 enables the parallelizer to generate multi-threaded code based on the OpenMP directives
    -openmp_report[0|1|2]   controls the OpenMP parallelizer diagnostic level
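
    As an illustration of how the options in the table above might be grouped for different kinds of builds, the sketch below defines a hypothetical helper, build_flags, and prints example ifort command lines. The grouping into debug, opt, and openmp builds and the file names are assumptions made for this example, not TACC recommendations.

```shell
# build_flags is a hypothetical helper; the flags come from the table above.
build_flags() {
    case "$1" in
        debug)  echo "-g -fp" ;;
        opt)    echo "-O3 -xP" ;;
        openmp) echo "-O3 -xP -openmp -openmp_report2" ;;
        *)      echo "unknown build type: $1" >&2; return 1 ;;
    esac
}

# Print example command lines (prog.f90 is a placeholder source file).
echo "ifort $(build_flags debug) -o prog_dbg.exe prog.f90"
echo "ifort $(build_flags opt) -vec_report3 -o prog.exe prog.f90"
```

    Because the MPI wrappers accept the same flags, the same strings could be passed to mpif90 or mpicc.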

    Use the -help option with the mpicmds for additional information:

    mpicc   -help
    mpif90  -help
    mpirun -help {use the listed options with the ibrun cmd}
    For details on the MPI standard, see www.mcs.anl.gov/mpi.


    Loading Libraries

    Some of the more useful load flags/options are listed below. For a more comprehensive list, consult the ld man page.

    Last modified: July 14 2009 10:52:19.