PerfExpert 4.0

An Easy-to-Use Performance Diagnosis Tool for HPC Applications.


HPC systems are notorious for operating at a small fraction of their peak performance, and the ongoing migration to multi-core and multi-socket compute nodes further complicates performance optimization.

The previously available performance optimization tools require considerable effort to learn and use. To enable wide access to performance optimization, TACC and its technology insertion partners have developed PerfExpert, a tool that combines a simple user interface with a sophisticated analysis engine to:

  • Detect and diagnose the causes of any core-, socket-, and node-level performance bottlenecks in each procedure and loop of an application;
  • Apply pattern-based software transformations to the application source code to improve performance at the identified bottlenecks;
  • Provide a performance analysis report and remediation suggestions for those bottlenecks that PerfExpert cannot optimize automatically.

Applying PerfExpert requires only a single command line added to the application's normal job script. PerfExpert automates performance optimization at the core, socket, and node levels as far as possible. PerfExpert is currently available only for the CPU portion of Stampede compute nodes but will be extended to MICs in the near future.

Funding Source(s)

  • Intel Corporation, the NSF Track 2 Ranger grant and the current NSF Stampede grant.

Publications


  • Leonardo Fialho, James Browne, "Framework and Modular Infrastructure for Automation of Architectural Adaptation and Performance Optimization for HPC Systems", International Supercomputing Conference (ISC) 2014
  • Ashay Rane, James Browne, "Enhancing Performance Optimization of Multicore/Multichip Nodes with Data Structure Metrics", ACM Transactions on Parallel Computing 1(1), 3:1-3:20 (May 2014)
  • Ashay Rane, James Browne, Lars Koesterke, "A Systematic Process for Efficient Execution on Intel's Heterogeneous Computation Nodes", Extreme Science and Engineering Discovery Environment (XSEDE) 2012
  • Ashay Rane, James Browne, "Enhancing Performance Optimization of Multicore Chips and Multichip Nodes with Data Structure Metrics", Parallel Architectures and Compilation Techniques (PACT) 2012
  • Ashay Rane, James Browne, Lars Koesterke, "PerfExpert and MACPO: Which code segments should (not) be ported to MIC?", TACC-Intel Highly Parallel Computing Symposium, April 2012
  • Ashay Rane, James Browne, "Performance Optimization of Data Structures Using Memory Access Characterization", CLUSTER 2011, pp. 570-574
  • Ashay Rane, Saurabh Sardeshpande, James Browne, "Determining Code Segments that can Benefit from Execution on GPUs", poster presented at Supercomputing Conference (SC) 2011
  • M. Burtscher, B.D. Kim, J. Diamond, J. McCalpin, L. Koesterke, and J. Browne, "PerfExpert: An Easy-to-Use Performance Diagnosis Tool for HPC Applications", SC 2010 International Conference for High-Performance Computing, Networking, Storage and Analysis, November 2010

  • O. A. Sopeju, M. Burtscher, A. Rane, and J. Browne, "AutoSCOPE: Automatic Suggestions for Code Optimizations Using PerfExpert", 2011 International Conference on Parallel and Distributed Processing Techniques and Applications, July 2011


If you have problems using PerfExpert on Stampede or Lonestar, or suggestions for enhancing PerfExpert, contact us. If you are reporting a problem, please use our mailing list.

Want to Contribute

Version 4 of PerfExpert has been designed to allow third-party contributions. There are several ways to contribute to PerfExpert, such as:

  • Providing new bottleneck alleviation solutions;
  • Creating new strategies to select bottleneck alleviation solutions based on performance metrics;
  • Adding new performance metrics to PerfExpert;
  • Writing modules to modify the source code in order to alleviate the identified bottlenecks.

If you want to contribute to PerfExpert or need help doing research with PerfExpert, contact us. Full directions on how to add to or modify each phase of PerfExpert can be found on the PerfExpert web site. Please tell us how we can help you to help us.

Mailing List

To subscribe, send a message to the mailing list address or visit the list webpage.

When reporting a problem with using or installing PerfExpert, please include in your report a compressed copy of the /.perfexpert-temp.XXXXXX directory generated by the failed execution or, for an installation issue, the config.log file generated by the configure command.
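
One way to prepare such a compressed file is sketched below in Python, using only the standard library. This is an illustration, not part of PerfExpert: the function name archive_directory, the archive name perfexpert-report.tar.gz, and the stand-in directory .perfexpert-temp.example are all assumptions made for the example. Substitute the actual temporary directory left behind by your failed run.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_directory(source_dir, archive_path):
    """Pack source_dir (recursively) into a gzip-compressed tar archive.

    Hypothetical helper for illustration; PerfExpert does not provide it.
    """
    source_dir = Path(source_dir)
    if not source_dir.is_dir():
        raise NotADirectoryError(f"{source_dir} is not a directory")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores only the directory's own name in the archive,
        # so the absolute path on the reporting machine is not included.
        tar.add(source_dir, arcname=source_dir.name)
    return Path(archive_path)


if __name__ == "__main__":
    # Stand-in for a real PerfExpert temporary directory; the suffix
    # "example" and the file below are made up for this demonstration.
    with tempfile.TemporaryDirectory() as workspace:
        temp_dir = Path(workspace) / ".perfexpert-temp.example"
        temp_dir.mkdir()
        (temp_dir / "sample.log").write_text("placeholder\n")

        archive = archive_directory(
            temp_dir, Path(workspace) / "perfexpert-report.tar.gz"
        )
        # List what ended up in the archive before attaching it to a report.
        with tarfile.open(archive) as tar:
            print(tar.getnames())
```

The same helper can be pointed at a single file's parent directory for an installation issue, although attaching config.log directly is usually simpler.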

James Browne

Professor Emeritus of Computer Science, UT Austin

Antonio Gomez

Research Scientist, UT Austin

Ashay Rane

PhD student, UT Austin