High Performance Computing
John joined TACC in 2009 as a Research Scientist in the High Performance Computing Group after a twelve-year career in performance analysis and system architecture in the computer industry. His industrial experience includes 3 years at SGI (performance analysis and optimization on the Origin2000 and performance lead on the architecture team for the Altix3000), 6 years at IBM (performance analysis for HPC, processor and system design for Power4/4+ and Power5/5+), and 3 years at AMD (accelerated computing technologies and performance analysis). Prior to his industrial career, John was an oceanographer (Ph.D., Florida State), spending six years as an assistant professor at the University of Delaware engaged in research and teaching on numerical simulation of the large-scale circulation of the oceans.
Performance Analysis in High Performance Computing
Computer System Architecture
Applied Mathematics of Partial Differential Equations
Simulation of Large-Scale Circulation in the Ocean
STREAM: Sustainable Memory Bandwidth in High Performance Computers
ACElab: Advanced Computing Evaluation Laboratory: performance characterization and benchmarking of new computing technologies (co-director)
HPC Challenge Benchmark: a suite of benchmarks to explore different aspects of performance in computer systems and clusters
Selected Journal and Conference Publications
Ardavan Pedram, John D. McCalpin, and Andreas Gerstlauer, 2014: "A Highly Efficient Multicore Floating-Point FFT Architecture Based on Hybrid Linear Algebra/FFT Cores", Journal of Signal Processing Systems, 77(1-2):169-190.
Ardavan Pedram, John McCalpin, and Andreas Gerstlauer, 2013: "Transforming a Linear Algebra Core to an FFT Accelerator". Proceedings of the 2013 IEEE 24th International Conference on Application-Specific Systems, Architectures, and Processors (ASAP), June 2013.
Jeff Diamond, Martin Burtscher, John D. McCalpin, Byoung-Do Kim, Stephen W. Keckler, James C. Browne, 2011: "Evaluation and Optimization of Multicore Performance Bottlenecks in Supercomputing Applications". Proceedings of the 2011 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS). April 10-12, 2011, Austin, TX.
Martin Burtscher, Byoung-Do Kim, Jeff Diamond, John McCalpin, Lars Koesterke, James Browne, 2010: "PerfExpert: An Easy-to-Use Performance Diagnosis Tool for HPC Applications". Proceedings of the ACM/IEEE Supercomputing Conference 2010. November 2010, New Orleans, LA.
McCalpin, J., Moore, C., Hester, P., 2007: "The Role of Multicore Processors in the Evolution of General-Purpose Computing," CTWatch Quarterly, Volume 3, Number 1, February 2007.
H. M. Mathis, A. Mericas, J. D. McCalpin, R. J. Eickemeyer, and S. R. Kunkel, 2005: "Characterization of simultaneous multithreading (SMT) efficiency in POWER5", IBM Journal of Research and Development, 49(4/5):555-564.
United States Patents
Push For Sharing Instruction, US Patent number 8,099,557. Inventors: John D. McCalpin, Patrick N. Conway. Filed 2008-02-26, Granted 2012-01-17.
Method and System for Code Modification Based on Cache Structure, US Patent number 7,530,063. Inventors: Roch Georges Archambault, Robert James Blainey, Yaoqing Gao, John David McCalpin, Francis Patrick O'Connell, Pascal Vezolle, Steven Wayne White. Filed 2004-05-27, Granted 2009-05-05.
Programming Means for Dynamic Specifications of Cache Management Preferences, US Patent number 7,039,760. Inventors: Ravi Kumar Arimilli, John David McCalpin, Francis Patrick O'Connell, William John Starke. Filed 2003-04-28, Granted 2006-05-02.
Localized Cache Block Flush Instruction, US Patent number 7,194,587. Inventors: John David McCalpin, Balaram Sinharoy, Dereck Edward Williams, Kenneth Lee Wright. Filed 2003-04-24, Granted 2007-03-20.
Texas A&M University
Texas A&M University
Florida State University
IEEE Computer Society
American Geophysical Union