List of performance analysis tools
General purpose, language independent
The following tools work based on log files that can be generated from various systems.
- time (Unix) - can be used to determine the run time of a program, separately counting user time vs. system time, and CPU time vs. clock time.
- timem (Unix) - can be used to determine the wall-clock time, CPU time, and CPU utilization similar to time (Unix) but supports numerous extensions.
  - Supports reporting peak resident set size, major and minor page faults, priority and voluntary context switches via getrusage.
  - Supports sampling procfs on supporting systems to report metrics such as page-based resident set size, virtual memory size, read-bytes, and write-bytes.
  - Supports collecting hardware counters when built with PAPI support.
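As a rough illustration of the metrics that time and timem report, the following Python sketch runs a command and reads the child-process counters exposed by getrusage. The helper name `measure` and the sample workload are hypothetical, chosen only for this example. It assumes a Unix-like system where Python's `resource` module is available, and it is not a description of how either tool is implemented.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd and return wall-clock time plus getrusage figures for child processes."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    returncode = subprocess.run(cmd, check=False).returncode
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    user = after.ru_utime - before.ru_utime    # CPU time spent in user mode
    system = after.ru_stime - before.ru_stime  # CPU time spent in the kernel
    return {
        "returncode": returncode,
        "wall_s": wall,
        "user_s": user,
        "sys_s": system,
        "cpu_util": (user + system) / wall if wall > 0 else 0.0,
        # High-water mark over all waited-for children, not a delta.
        # Units differ by platform (kilobytes on Linux, bytes on macOS).
        "max_rss": after.ru_maxrss,
        "minor_faults": after.ru_minflt - before.ru_minflt,
        "major_faults": after.ru_majflt - before.ru_majflt,
        "voluntary_cs": after.ru_nvcsw - before.ru_nvcsw,
        "involuntary_cs": after.ru_nivcsw - before.ru_nivcsw,
    }


if __name__ == "__main__":
    # Hypothetical workload: a short CPU-bound Python one-liner.
    stats = measure([sys.executable, "-c", "sum(i * i for i in range(10**6))"])
    for key, value in stats.items():
        print(f"{key}: {value}")
```

Wall-clock time can exceed the sum of user and system time when the process waits on I/O, and can be lower than that sum when the process uses several cores, which is why the sketch reports the three figures separately.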
The following tools work for multiple languages or binaries.
|Name/Manufacturer||OS||Compiler/Language||What It Does||License|
|Arm MAP||Linux||C, C++, Fortran/Fortran90 and Python applications.||Performance profiler. Shows I/O, communication, floating point operation usage and memory access costs.||Proprietary|
|AppDynamics by Cisco||Linux, Windows, iOS, Android, Azure, AWS, AIX||.NET, Java, PHP, HTML5, Objective-C/iOS, Java/Android, C/C++, Apache, Nginx, Cassandra, databases||See Application Performance Management.||Proprietary|
|AQtime by SmartBear Software||Windows||.NET 1.0 to 4.0 applications (including ASP.NET applications), Silverlight 4.0 applications, Windows 32- and 64-bit applications including C, C++, Delphi for Win32 and VBScript and JScript functions||Performance profiler and memory/resource debugging toolset.||Proprietary|
|CodeAnalyst by AMD||Linux, Windows||C, C++, Objective-C, .NET, Java (works at the executable level)||AMD uProf supersedes CodeAnalyst and CodeXL for CPU and power profiling on AMD processors.||Free/open source (GPL) or proprietary|
|AMD CodeXL by AMD||Linux, Windows||For GPU profiling and debugging: OpenCL.||A tool suite for GPU profiling, GPU debugger and a static kernel analyzer.||Free/open source (MIT)|
|AMD uProf by AMD||Linux, Windows||C, C++, .NET, Java, Fortran||Code profiler, does sampling based profiling on AMD processors.||Proprietary|
|DevPartner by Borland / Micro Focus||.NET, Java||Test suite that automatically detects and diagnoses software defects and performance problems.||Proprietary|
|DTrace by Sun Microsystems||Solaris, Linux, BSD, macOS||Comprehensive dynamic tracing framework for troubleshooting kernel and application problems on production systems in real time.||Free/open source (CDDL)|
|DynamoRIO by RIO||Linux, Windows||Dynamic binary instrumentation framework for the development of dynamic program analysis tools.||Free/open source - BSD|
|Dynatrace||Linux, Windows, iOS, Android, Azure, AWS, AIX, Solaris, HP/UX, zOS, zLinux||.NET, Java, PHP, HTML5, Ajax (for web sites), Objective-C/iOS, Java/Android, C/C++, CICS, Apache, Nginx, Cassandra, Hadoop, MongoDB, HBase||See Application Performance Management.||Proprietary|
|Extrae||Linux, Android||Primarily C/C++/Fortran, but can profile any application linking against supported parallel libraries (e.g. MPI4PY)||HPC performance analysis tool with viewer and supporting utilities. Primarily designed for parallel applications with support for MPI, OpenMP, CUDA, OpenCL, pthreads, and OmpSs. Additional features include user function tracing and hardware event capture via PAPI.||Free/open source - LGPL-2.1|
|FusionReactor||Linux, Windows, macOS, AWS, Azure, Google Cloud||Java, ColdFusion, Apache, MongoDB Works with any Language supported by the JVM||Performs Application Performance Management and Performance and Root Cause Analysis. Combines APM and Low Level Developer Style Tooling; also includes a debugger and Java, memory, thread, and CPU profilers.||Proprietary|
|GlowCode||Windows||64-bit and 32-bit applications, C, C++, .NET, and dlls generated by any language compiler.||Performance and memory profiler that identifies time-intensive functions and detects memory leaks and errors.||Proprietary|
|gprof||Linux/Unix||Any language supported by gcc||Several tools with combined sampling and call-graph profiling. A set of visualization tools, VCG tools, uses the Call Graph Drawing Interface (CGDI) to interface with gprof. Another visualization tool that interfaces with gprof is KProf.||Free/open source - BSD version is part of 4.2BSD and GNU version is part of GNU Binutils (by GNU Project)|
|Instana||Linux, Windows, iOS, Android, Azure, AWS, AIX, Solaris, HP/UX, zOS, zLinux||.NET, .NET Core, Java, PHP, Ruby, Python, Crystal, Scala, Kotlin, Clojure, Haskell, Node.js, Web Browser, Apache, Nginx, Cassandra, Hadoop, MongoDB, Elasticsearch, Kafka||See Application Performance Management.||Proprietary|
|Instruments with Xcode||macOS||C, C++, Objective-C/C++, Swift, Cocoa apps.||Instruments shows a timeline displaying any event occurring in the application, such as CPU activity variation, memory allocation, and network and file activity, together with graphs and statistics. Groups of events are monitored by selecting specific instruments, such as File Activity, Memory Allocations, Time Profiler, and GPU activity. For the system-wide impact of the executable, System Trace, System usage, Network Usage, and Energy log are useful.||Free. Proprietary. Bundled with Xcode, which is also free.|
|Intel Advisor||Linux and Windows. Viewer only on macOS.||C, C++, Data Parallel C++ and Fortran||A collection of design and analysis tools - vectorization (SIMD) optimization, thread prototyping, automated roofline analysis, offload modeling and flow graph analysis||Freeware and Proprietary. Available as part of Intel oneAPI Base Toolkit.|
|Linux Trace Toolkit (LTT)||Linux||Requires patched kernel||Collects data on processes blocking, context switches, and execution time. This helps identify performance problems over multiple processes or threads. Superseded by LTTng.||GPL|
|LTTng (Linux Trace Toolkit Next Generation)||Linux||System software package for correlated tracing of kernel, applications and libraries.||GPL/LGPL/MIT|
|OProfile||Linux||Profiles everything running on the Linux system, including hard-to-profile programs such as interrupt handlers and the kernel itself.||Sampling profiler for Linux that counts cache misses, stalls, memory fetches, etc.||Open Source GPLv2|
|Oracle Solaris Studio Performance Analyzer||Linux, Solaris||C, C++, Fortran, Java; MPI||Performance and memory profiler.||Proprietary freeware|
|perf tools||Linux kernel 2.6.31+||Sampling profiler with support of hardware events on several architectures.||GPL|
|Performance Application Programming Interface (PAPI)||Various||Library for hardware performance counters on modern microprocessors.|
|LIKWID||Linux||C/C++, Fortran, Python, Java and Lua||Toolsuite of command line applications and library for performance oriented programmers (hardware performance monitoring, affinity control, etc.).||GPLv3|
|Pin by Intel||Linux, Windows, macOS, Android||Dynamic binary instrumentation system that allows users to create custom program analysis tools.||Proprietary but free for non-commercial use|
|Rational PurifyPlus||AIX, Linux, Solaris, Windows||Performance profiling tool, memory debugger and code coverage tool.||Proprietary|
|Scalasca||Linux||C/C++, Fortran||Parallel trace analyser.||Free/open source (BSD license)|
|Shark by Apple||macOS (discontinued with 10.7)||Performance analyzer.||Proprietary freeware|
|Superluminal Performance||Windows, Xbox, PlayStation||C, C++, Rust||Hybrid sampling & instrumenting profiler, built with usability and scalability in mind.||Proprietary|
|SystemTap||Linux||Programmable system tracing/probing tool; may be scripted to generate time-, performance-counter-, or function-based profiles of the kernel and/or its userspace.||Open source|
|timemory||Linux, macOS, Windows||C, C++, Python, Fortran||Modular C++ toolkit for creating scalable custom instrumentation and sampling tools for performance analysis. Designed to minimize overhead by adapting to the interface of each performance analysis component at compile-time and simplify adding support for invocation and data storage within multi-threaded and multi-process runtimes. Includes many pre-built components for timing, resource usage, hardware-counters, Roofline Model, and the instrumentation APIs for VTune, Intel Advisor, LIKWID, and Arm MAP, among others. Components can be arbitrarily bundled together into a single handle for collective invocations and input argument broadcasting. Python bindings are provided for every component as a stand-alone class for implementing low-overhead Python profiling tools. Profiling via dynamic instrumentation is available on Linux.||Free/Open-source (MIT)|
|Valgrind||Linux, macOS, Solaris, Android||Any, including assembler||System for debugging and profiling; supports tools to either detect memory management and threading bugs, or profile performance (cachegrind and callgrind). KCacheGrind, valkyrie and alleyoop are front-ends for valgrind.||Free/open source (GPL)|
|VTune Profiler by Intel Corporation (formerly VTune Amplifier)||Linux, Windows, viewer only for macOS||C, C++, C#, Data Parallel C++ (DPC++), Fortran, .NET, Java, Python, Go, assembly||A collection of profiling analyses implemented with sampling, instrumentation and processor trace technologies. Includes Hotspot, Threading, HPC, I/O, FPGA, GPU, System, Throttling and Microarchitecture analyses.||Freeware and Proprietary. Also available as part of Intel oneAPI Base Toolkit.|
|Windows Performance Analysis Toolkit by Microsoft||Windows||Proprietary freeware|
|RotateRight Zoom||Linux, macOS, Viewer Only for Windows||Supports most compiled languages on ARM and x86 processors.||Graphical and command-line statistical (event-based) profiler.|
|VisualSim||Linux, macOS, Microsoft Windows||Supports C/C++/SystemC||Graphical modeling and Simulation platform to select, analyze and validate architecture of complex electronics systems for performance, power and reliability.||Proprietary|
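Several entries in the table above (for example gprof, OProfile and perf tools) are described as sampling profilers. The Python sketch below illustrates the general idea of statistical sampling only: a CPU-time interval timer interrupts the program periodically, and each interrupt is attributed to the function that was running. The names `profile`, `hot`, `cold` and `workload` are hypothetical, the code assumes a Unix-like system, and it does not reflect how any listed tool is implemented; those tools typically rely on kernel support or hardware counters.

```python
import collections
import signal


def profile(func, interval=0.001):
    """Call func() and count which Python function is running at each timer tick."""
    samples = collections.Counter()

    def on_sample(signum, frame):
        # `frame` is the Python frame that was executing when the signal was handled.
        if frame is not None:
            samples[frame.f_code.co_name] += 1

    previous = signal.signal(signal.SIGPROF, on_sample)
    # ITIMER_PROF counts CPU time (user + system) consumed by the process.
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        func()
    finally:
        signal.setitimer(signal.ITIMER_PROF, 0)  # stop the timer
        signal.signal(signal.SIGPROF, previous)  # restore the old handler
    return samples


def hot():
    # Deliberately expensive loop so that most samples land here.
    total = 0
    for i in range(3_000_000):
        total += i * i
    return total


def cold():
    return sum(range(1000))


def workload():
    hot()
    cold()


if __name__ == "__main__":
    for name, count in profile(workload).most_common():
        print(f"{name}: {count} samples")
```

Because the result is a statistical estimate, sample counts vary from run to run; a shorter interval gives finer resolution at the cost of more overhead.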
C and C++
- Arm MAP, a performance profiler supporting Linux platforms.
- AppDynamics, an application performance management solution for C/C++ applications via SDK.
- AQtime Pro, a performance profiler and memory allocation debugger that can be integrated into Microsoft Visual Studio and Embarcadero RAD Studio, or run as a stand-alone application.
- IBM Rational Purify was a memory debugger allowing performance analysis.
- Instruments (bundled with Xcode) is used to profile an executable's memory allocations, time usage, filesystem activity, GPU activity etc.
- Intel Parallel Studio contains Intel VTune Amplifier, which tunes both serial and parallel programs. It also includes Intel Advisor and Intel Inspector. Intel Advisor optimizes vectorization (use of SIMD instructions) and prototypes threading implementations. Intel Inspector detects and debugs races, deadlocks and memory errors.
- Parasoft Insure++ provides a graphical tool that displays and animates memory allocations in real time to expose memory blowout, fragmentation, overuse, bottlenecks and leaks.
- Timemory, a modular C++ toolkit for creating performance analysis tools which provides numerous command-line tools and libraries as a by-product of its flexibility and reusability.
- Visual Studio Team System Profiler, commercial profiler by Microsoft.
Java
- inspectIT is an open-source application performance management (APM) solution for monitoring and analyzing software applications, available under the Apache License, Version 2.0 (ALv2).
- JConsole is the profiler that comes with the Java Development Kit.
- JRockit Mission Control, a profiler with low overhead.
- NetBeans Profiler, a profiler integrated into the NetBeans IDE (internally uses the jvisualvm profiler).
- Plumbr, Java application performance monitoring with automated root cause detection. Links memory leaks, GC inefficiency, slow database and external web service calls, locked threads, and other performance problems to the line in source code that causes them.
- OverOps, a continuous-reliability tool that automatically detects errors and reports their root cause.
- VisualVM is a visual tool integrating several command-line JDK tools and lightweight profiling capabilities. It has been bundled with the Java Development Kit since version 6, update 7.
- FusionReactor, Java application performance monitoring with low-overhead, production-grade tools for production debugging, code profiling, and memory and thread analysis.
.NET
- CLR Profiler is a free memory profiler provided by Microsoft for CLR applications.
- GlowCode is a performance and memory profiler for .NET applications using C# and other .NET languages. It identifies time-intensive functions and detects memory leaks and errors in native, managed and mixed Windows x64 and x86 applications.
- Visual Studio
- DotNetBlackbox is a post-mortem runtime debugger for C# and VB.NET. If necessary, it logs every step of each command in the customer's environment. It can also log variable values.