Frontera

A new NSF-funded petascale computing system

In 2018, a new NSF-funded petascale computing system, Frontera, was awarded for deployment at the Texas Advanced Computing Center (TACC). The goal of this project and system is to open up new possibilities in science and engineering by providing computational capability that makes it possible for investigators to tackle much larger and more complex research challenges across a wide spectrum of domains. Frontera replaces the soon-to-be decommissioned Blue Waters system, the first deployment in the National Science Foundation's petascale computing program. When deployed, Frontera will be one of the most powerful supercomputers in the world, and the fastest supercomputer on a university campus.

Up to 80% of the available hours on Frontera, more than 55 million node-hours each year, will be made available through the NSF Petascale Computing Resource Allocation (PRAC) program.

Early user access is expected to begin in the late spring of 2019, with full system production anticipated by mid to late summer of 2019. These dates are subject to adjustment as the project proceeds.

About Frontera

This page provides a high-level technical overview of Frontera to support PRAC proposals. The information presented here is limited both by vendor non-disclosure agreements and by our progress through final vendor negotiations; we will add more detail as we proceed with system procurement. Nonetheless, the information here should be sufficient to guide proposers in making requests for compute time.

The service unit to use in requests for Frontera is the "node-hour", which simply represents one wall-clock hour on a single physical node; for example, a job that runs for two hours on 100 nodes consumes 200 node-hours. This is the same unit used on the Blue Waters system.

System Hardware and Software Overview

Frontera will have two computing subsystems: a primary computing system focused on double-precision performance, and a second subsystem focused on single-precision, streaming-memory computing. Frontera will also have multiple storage systems, as well as interfaces to cloud and archive systems, and a set of application nodes for hosting virtual servers. Essential details of each subsystem are provided below.

Estimating Node Hours for Allocation Requests

As with other TACC systems, the fundamental allocation unit on Frontera is a node, and allocations are awarded in node-hours. This is true for both the Xeon and single-precision accelerator-based nodes. A project is charged 1 node-hour for the use of a node irrespective of how much work is done on the node (i.e., a job that uses only half the cores in a node is still charged 1 node-hour for an hour of computation). As is typical of most HPC systems, compute nodes are allocated exclusively to a single job, which has access to all the compute resources provided by its node(s). 38 PFLOPS of the system's total capability will be provided by more than 8,000 nodes based on Intel's next-generation Xeon processor; approximately 8 PFLOPS of additional capability will be provided by a single-precision partition of the system. To assist in writing allocation proposals, we advise teams to assume that their application will run 10% to 15% faster on Frontera than on Stampede2 for a fixed node count.
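As a rough illustration of that guidance, the sketch below converts measured Stampede2 run times into an estimated Frontera node-hour request. The benchmark runtime, node count, job count, and the frontera_node_hours helper are hypothetical placeholders for illustration, not project data or a TACC-provided tool.

    # Rough node-hour estimator for a Frontera allocation request.
    # Assumes the advised 10-15% speedup over Stampede2 at a fixed node count;
    # the benchmark inputs below are hypothetical placeholders.

    def frontera_node_hours(stampede2_hours, nodes, runs, speedup=0.10):
        """Estimate Frontera node-hours for a campaign of identical jobs."""
        frontera_hours = stampede2_hours / (1.0 + speedup)  # shorter wall-clock time per run
        return frontera_hours * nodes * runs                # charged per node, per hour

    # Example: 12-hour Stampede2 runs on 128 nodes, 50 runs over the allocation year.
    low  = frontera_node_hours(12.0, 128, 50, speedup=0.15)  # optimistic: 15% faster
    high = frontera_node_hours(12.0, 128, 50, speedup=0.10)  # conservative: 10% faster
    print(f"Request roughly {low:,.0f} to {high:,.0f} node-hours")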

Primary Compute System

The primary computing system will be provided by Dell EMC and powered by Intel processors, connected by a Mellanox InfiniBand HDR and HDR-100 interconnect. The initial configuration of the system will have 8,008 available compute nodes.

The configuration of each compute node is described below:

Processors: Follow-on to the Intel "Skylake" Xeon processor, roughly equivalent to the Xeon Platinum 8180.
  • Unofficial estimated number of cores: 28 per socket, 56 per node
  • Unofficial estimated clock rate: 2.7 GHz ("headline"), 2.1 GHz (AVX)
  • Unofficial estimated "peak" node performance: 4.8 TF, double precision (see the worked estimate after this list)
Memory: DDR4, 192 GB/node
Local Disk: 480 GB SSD drive
Network: Mellanox InfiniBand HDR-100
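The 4.8 TF peak figure is consistent with a standard peak-FLOPS calculation, sketched below under the assumption of AVX-512 cores with two fused multiply-add units each (32 double-precision FLOPs per core per cycle); the sustained figure at the AVX clock is correspondingly lower. These remain unofficial estimates, not vendor-confirmed numbers.

    # Back-of-the-envelope peak double-precision performance per node.
    # Assumes AVX-512 with 2 FMA units per core:
    #   8 DP lanes * 2 ops per FMA * 2 FMA units = 32 FLOPs per core per cycle.
    cores_per_node  = 56        # 28 cores per socket, 2 sockets
    flops_per_cycle = 8 * 2 * 2
    headline_clock  = 2.7e9     # Hz ("headline" clock)
    avx_clock       = 2.1e9     # Hz (AVX clock)

    peak_headline = cores_per_node * flops_per_cycle * headline_clock
    peak_avx      = cores_per_node * flops_per_cycle * avx_clock

    print(f"Peak at headline clock: {peak_headline / 1e12:.1f} TF")  # ~4.8 TF
    print(f"Peak at AVX clock:      {peak_avx / 1e12:.1f} TF")       # ~3.8 TF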

System Interconnect

Frontera compute nodes will connect to leaf switches with HDR-100 links, and leaf switches will connect to core switches with HDR (200 Gb/s) links. The interconnect will be configured in a fat tree topology with a small oversubscription factor of 11:9.
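For reference, the oversubscription factor is the ratio of a leaf switch's aggregate node-facing bandwidth to its aggregate uplink bandwidth. The sketch below uses one hypothetical port split that reproduces the 11:9 ratio; it is not the actual Frontera leaf-switch configuration.

    # How an oversubscription factor is computed for a fat tree leaf switch.
    # Port counts are hypothetical, chosen only to reproduce the 11:9 ratio;
    # they are not the actual Frontera leaf-switch configuration.
    from math import gcd

    node_links   = 22    # HDR-100 links down to compute nodes
    node_bw_gb   = 100   # Gb/s per HDR-100 link
    uplinks      = 9     # HDR links up to core switches
    uplink_bw_gb = 200   # Gb/s per HDR link

    down = node_links * node_bw_gb   # 2200 Gb/s toward the nodes
    up   = uplinks * uplink_bw_gb    # 1800 Gb/s toward the core
    g = gcd(down, up)
    print(f"Oversubscription {down // g}:{up // g}")  # 11:9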

Storage

Frontera will have multiple filesystems. In addition to a home directory, users will be assigned to one of several scratch filesystems on a rotating basis. Each scratch filesystem will have a disk capacity of approximately 15 usable petabytes and is expected to maintain a bandwidth of more than 100 GB/s; total scratch capacity will exceed 50 PB. Users with very high bandwidth or IOPS requirements will be able to request an allocation on an all-NVMe filesystem with an approximate capacity of 3 PB and a bandwidth of roughly 1.2 TB/s. We intend to limit the number of simultaneous users on this "solid state" scratch component. Users may also request an allocation on the "/work" filesystem, a medium-term filesystem (longer retention than scratch, but shorter than home or archive) that is shared among all TACC platforms. Work is intended for migrating data between systems, or for "semi-persistent" storage (e.g., a copy of reference genomes that is infrequently updated but used by compute jobs over 1-2 years). Storage system hardware is provided by DataDirect Networks.

Users may also request archive space on Ranch, the TACC archive system. Ranch is currently undergoing upgrades which will, among other things, dramatically increase the disk cache of the archive to approximately 30 PB. The goal of this upgrade is to keep on disk all archive data that has been used in recent months, and to keep small files on disk for up to several years if they are occasionally re-used. Behind the disk storage will be multiple tape libraries (with capacity scalable to an exabyte or more); one of the tape options will offer encryption and appropriate compliance for storing controlled unclassified information (CUI). Users who require CUI storage should specify this in their allocation request.

Single Precision Compute

The precise configuration of the single-precision compute system is still in flux. Users may expect some configuration of GPUs that will provide up to 8 PF of single-precision performance. Nodes are projected to contain 4 GPUs each. For the GPU subsystem, 1 node-hour will be assumed to be a single hour on a 4-GPU node. More details will be provided soon.
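Because this partition is also charged in node-hours, a request sized in GPUs must be converted at 4 GPUs per node and rounded up to whole nodes. A minimal sketch of that conversion follows; the job sizes and the gpu_node_hours helper are hypothetical.

    import math

    # Convert GPU-based job sizes into node-hours for the single-precision
    # partition, assuming 4 GPUs per node and exclusive (whole-node) charging.
    # The workload figures are hypothetical placeholders.
    GPUS_PER_NODE = 4

    def gpu_node_hours(gpus_per_job, hours_per_job, jobs):
        """Node-hours charged for a campaign of GPU jobs (whole nodes are charged)."""
        nodes_per_job = math.ceil(gpus_per_job / GPUS_PER_NODE)
        return nodes_per_job * hours_per_job * jobs

    # Example: 200 jobs, each using 8 GPUs (2 nodes) for 6 hours -> 2,400 node-hours.
    print(gpu_node_hours(gpus_per_job=8, hours_per_job=6, jobs=200))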