An Easy-to-Use Performance Diagnosis Tool for HPC Applications with Suggestions for Bottleneck Remediation
HPC systems are notorious for operating at a small fraction of their peak performance, and the ongoing migration to multi-core and multi-socket compute nodes further complicates performance optimization. The readily available performance evaluation tools require considerable effort to learn and use. As a remedy, TACC has helped to develop PerfExpert, a tool that combines a simple user interface with a sophisticated analysis engine to detect core, socket, and node-level performance bottlenecks in each procedure and loop of an application.
PerfExpert automatically analyzes the performance of programs and suggests optimizations to alleviate the identified bottlenecks. It is intended to make performance assessment easy while providing accurate diagnoses of core, chip, and node-level performance bottlenecks. Because the analysis targets bottlenecks within a single node, scaled-down data sets and resource configurations are sufficient (and recommended) for obtaining accurate analyses.
Professor of Computer Science, UT Austin
Ashay Rane, James Browne, Lars Koesterke, "PerfExpert and MACPO: Which code segments should (not) be ported to MIC?", TACC-Intel Highly Parallel Computing Symposium, April 2012
Ashay Rane, James Browne, "Performance Optimization of Data Structures Using Memory Access Characterization", CLUSTER 2011, pp. 570-574
Ashay Rane, Saurabh Sardeshpande, James Browne, "Determining Code Segments that can Benefit from Execution on GPUs", poster presented at Supercomputing Conference (SC) 2011
M. Burtscher, B.D. Kim, J. Diamond, J. McCalpin, L. Koesterke, and J. Browne. "PerfExpert: An Easy-to-Use Performance Diagnosis Tool for HPC Applications." SC 2010 International Conference for High-Performance Computing, Networking, Storage and Analysis. November 2010
O. A. Sopeju, M. Burtscher, A. Rane, and J. Browne. "AutoSCOPE: Automatic Suggestions for Code Optimizations Using PerfExpert." 2011 International Conference on Parallel and Distributed Processing Techniques and Applications. July 2011
If things don't work as expected or if you have suggestions, contact me at ashay.rane [at] tacc.utexas.edu or by phone at 471-4024. Please try to include the following in your report:
- The experiment.xml file that was passed to PerfExpert
- Any error printed on the console
- The perfexpert.log file (present in the current directory)
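The three items above can be gathered into a single archive before sending a report. The following is only a minimal sketch and is not part of PerfExpert: the function name `bundle_report`, the archive name, and the `console-error.txt` entry are all hypothetical, and the only names taken from this page are `experiment.xml` and `perfexpert.log`.

```python
import io
import tarfile
from pathlib import Path


def bundle_report(experiment_xml, console_error, log_file="perfexpert.log",
                  archive="perfexpert-report.tar.gz"):
    """Pack the requested report items into one .tar.gz archive.

    Hypothetical helper, not shipped with PerfExpert.
    Returns the names of requested files that could not be found.
    """
    missing = []
    with tarfile.open(archive, "w:gz") as tar:
        # Add the experiment file and the log if they exist on disk.
        for path in (Path(experiment_xml), Path(log_file)):
            if path.is_file():
                tar.add(path, arcname=path.name)
            else:
                missing.append(path.name)
        # Store the console error text as an in-memory archive member.
        data = console_error.encode("utf-8")
        info = tarfile.TarInfo("console-error.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return missing
```

Calling `bundle_report("experiment.xml", "text of the console error")` from the directory where PerfExpert was run writes the archive there and returns a list of any requested files it could not find, so a missing log is noticed before the report is sent.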