A Reconfigurable Architecture for Large Scale Machine Learning
Project Catapult is an open research system deployed at TACC in partnership with a Microsoft Research project of the same name. The goal of the project is to investigate the use of field-programmable gate arrays (FPGAs) as data center accelerators to improve performance, reduce power consumption, and open new avenues of research, particularly in the area of machine learning. Catapult is deployed by Microsoft in support of the Bing search engine. There are 384 Project Catapult nodes available for open research. Each Project Catapult node is based on the Microsoft Open CloudServer V1 design, extended to include an Altera FPGA on each node. Users will be provided with access to the servers themselves, as well as access to servers on which to run FPGA tools.
Request an account to use Project Catapult: TACC User Portal
The system consists of 432 two-socket Intel Xeon-based nodes, each with 64 GB of memory and an Altera Stratix V D5 FPGA with 8 GB of local DDR3 memory. Each FPGA communicates with its host CPUs over a PCIe Gen3 x8 connection, which provides at most 8 GB/s of bandwidth, and can read and write data stored on its host node over this connection.
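The 8 GB/s upper bound follows from the PCIe Gen3 signaling rate. As a rough sketch, the calculation below uses the standard Gen3 figures (8 GT/s per lane, 128b/130b line encoding); the function name is hypothetical, and the result ignores packet and protocol overhead, so achievable throughput on a real node will be lower.

```python
def pcie_gen3_peak_gb_per_s(lanes: int) -> float:
    """Peak one-direction PCIe Gen3 bandwidth in GB/s (10^9 bytes/s).

    Accounts only for 128b/130b line encoding, not for packet headers
    or flow-control overhead.
    """
    transfers_per_second = 8e9       # 8 GT/s per lane (PCIe Gen3)
    encoding_efficiency = 128 / 130  # 128 payload bits per 130 line bits
    bits_per_second = transfers_per_second * encoding_efficiency * lanes
    return bits_per_second / 8 / 1e9  # bits -> bytes -> GB


if __name__ == "__main__":
    peak = pcie_gen3_peak_gb_per_s(lanes=8)
    print(f"PCIe Gen3 x8 peak: {peak:.2f} GB/s")
```

For an x8 link this comes out just under the 8 GB/s ceiling quoted above.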
The FPGAs are connected to one another via a dedicated network of high-speed serial links. This network forms a two-dimensional 6x8 torus within each pod of 48 servers and provides low-latency communication between neighboring FPGAs. This design allows multiple FPGAs to work on a single problem while adding resilience to server and FPGA failures.
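To make the torus topology concrete, here is a minimal sketch of neighbor lookup and hop distance on a 6x8 torus. The (row, column) coordinate scheme, the choice of 6 rows by 8 columns, and the function names are illustrative assumptions, not the addressing used by the actual system.

```python
# Assumed layout: 6 rows by 8 columns. The source gives only "6x8",
# so which dimension is which is a guess made for illustration.
ROWS, COLS = 6, 8


def torus_neighbors(row, col, rows=ROWS, cols=COLS):
    """Return the four directly linked neighbors of (row, col).

    Links wrap around at the edges, which is what makes the grid a torus.
    """
    return [
        ((row - 1) % rows, col),  # up
        ((row + 1) % rows, col),  # down
        (row, (col - 1) % cols),  # left
        (row, (col + 1) % cols),  # right
    ]


def torus_hops(a, b, rows=ROWS, cols=COLS):
    """Minimum number of links between nodes a and b."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # In each dimension, take the shorter way around the ring.
    return min(dr, rows - dr) + min(dc, cols - dc)


if __name__ == "__main__":
    print(torus_neighbors(0, 0))
    print(torus_hops((0, 0), (5, 7)))
```

Because of the wraparound links, opposite corners of the grid are only two hops apart, and under these assumptions no pair of FPGAs in a pod is more than 3 + 4 = 7 hops apart.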
- Two Xeon E5-2450, 2.1GHz, 8-core, 20MB Cache, 95W
- 64GB RAM
- Four 2TB 7.2k 3G SATA 3.5"; Two 480GB 6G Micron SATA SSD 2.5"
- Intel 82599 10GbE Mezz Card
- Altera Stratix V FPGA Card
- Operating System: Windows Server 2012