Intel VTune Amplifier XE is a commercial application for software performance analysis on 32-bit and 64-bit x86-based machines. It has both a GUI and a command-line interface, and is available for Linux and Microsoft Windows. Although basic features work on both Intel and AMD hardware, advanced hardware-based sampling requires an Intel-manufactured CPU.
Code Optimization 
Intel VTune Amplifier XE assists in various kinds of code profiling, including stack sampling, thread profiling and hardware event sampling. The profiler result includes details such as the time spent in each subroutine, which can be drilled down to the instruction level. The time taken by individual instructions can indicate stalls in the pipeline during instruction execution. The tool can also be used to analyze thread performance. The new GUI can filter data based on a selection in the timeline.
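The idea behind stack sampling can be illustrated with a short Python sketch. It is not VTune's implementation: it assumes CPython, whose `sys._current_frames()` exposes the current frame of each thread, and the helper names (`sample_stacks`, `profile`, `self_time_by_function`, `hot_function`, `inner_loop`) are hypothetical. A background thread wakes at a fixed interval, records the call stack of the profiled thread, and counts how often each stack is seen; functions that appear most often at the top of the stack are where most time is spent.

```python
import collections
import sys
import threading
import time


def sample_stacks(target_thread_id, interval_s, stop_event, counts):
    """Record the call stack of one thread at a fixed interval."""
    while not stop_event.is_set():
        frame = sys._current_frames().get(target_thread_id)
        stack = []
        while frame is not None:
            stack.append(frame.f_code.co_name)  # innermost function first
            frame = frame.f_back
        if stack:
            counts[tuple(stack)] += 1
        time.sleep(interval_s)


def profile(workload, *args, interval_s=0.005):
    """Run workload(*args) in the calling thread while sampling its stack."""
    counts = collections.Counter()
    stop_event = threading.Event()
    sampler = threading.Thread(
        target=sample_stacks,
        args=(threading.get_ident(), interval_s, stop_event, counts),
    )
    sampler.start()
    try:
        workload(*args)
    finally:
        stop_event.set()
        sampler.join()
    return counts


def self_time_by_function(counts):
    """Attribute each sample to the function at the top of the stack."""
    totals = collections.Counter()
    for stack, hits in counts.items():
        totals[stack[0]] += hits
    return totals


def inner_loop(n):
    # Hypothetical CPU-bound work used only to give the sampler a hotspot.
    total = 0
    for i in range(n):
        total += i * i
    return total


def hot_function(duration_s):
    deadline = time.perf_counter() + duration_s
    while time.perf_counter() < deadline:
        inner_loop(10_000)


if __name__ == "__main__":
    samples = profile(hot_function, 0.3)
    for name, hits in self_time_by_function(samples).most_common(5):
        print(f"{name}: {hits} samples")
```

Because every sample keeps the full stack, the same data can also be grouped by caller, which is what allows a profiler to report both where time is spent and how execution got there. The sample counts vary from run to run, since they depend on timing.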
- Software sampling: works on x86-compatible processors and reports both the locations where time is spent and the call stack used to reach them.
- JIT profiling support: profiles dynamically generated code.
- Locks and waits analysis: finds long synchronization waits that occur when cores are underutilized.
- Threading timeline: shows thread relationships to identify load balancing and synchronization issues. It can also be used to select a region of time and filter the results, which removes the clutter of data gathered during uninteresting periods such as application startup.
- Source view: displays sampling results line by line on the source or assembly code.
- Hardware event sampling: uses the on-chip performance monitoring unit and requires an Intel processor. It can find specific tuning opportunities such as cache misses and branch mispredictions.
- Performance Tuning Utility (PTU): a separate download that gives VTune Amplifier XE users access to experimental tuning technology, including features such as Data Access Analysis, which identifies memory hotspots and relates them to code hotspots.
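The kind of measurement behind a locks and waits analysis can be sketched in Python as follows. This is an illustration under stated assumptions, not how VTune collects its data: `TimedLock`, `worker` and `run_contended` are hypothetical names, and the wrapper simply times how long each caller blocks before acquiring a standard `threading.Lock`.

```python
import threading
import time


class TimedLock:
    """Lock wrapper that accumulates the time callers spend waiting for it."""

    def __init__(self):
        self._lock = threading.Lock()
        self.total_wait_s = 0.0
        self.acquisitions = 0

    def __enter__(self):
        start = time.perf_counter()
        self._lock.acquire()
        waited = time.perf_counter() - start
        # The counters are updated while the lock is held, so no extra
        # synchronization is needed to protect them.
        self.total_wait_s += waited
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def worker(lock, hold_s, rounds):
    for _ in range(rounds):
        with lock:
            time.sleep(hold_s)  # simulated work done while holding the lock


def run_contended(num_threads=4, hold_s=0.01, rounds=5):
    """Start several threads that compete for one lock and return the lock."""
    lock = TimedLock()
    threads = [
        threading.Thread(target=worker, args=(lock, hold_s, rounds))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return lock


if __name__ == "__main__":
    lock = run_contended()
    print(f"{lock.acquisitions} acquisitions, "
          f"{lock.total_wait_s:.3f} s spent waiting")
```

A large total wait relative to the useful work done under the lock suggests that threads are serialized on it, which is the situation in which cores sit idle while threads wait.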
See also 
- Intel Releases Thread Tools, Library For Multicore CPUs (CRN Test Center article on VTune Performance Analyzer 9.0, Thread Profiler 3.1 and Thread Checker 3.1)
- Product Review: Intel’s VTune 6 Performance Analyzer (Gamasutra web site)
- Performance Analysis Tools: A Look at the Intel VTune Performance Analyzer (Real World Technologies)
- Profiling Runtime Generated and Interpreted Code using the VTune Performance Analyzer (IEEE Xplore, registration required)
- New Intel VTune Performance Analyzer Helps Optimize Web Services Applications
- Lock and Lock-Free Code Compared, Optimized With Intel Thread Profiler (Visual Computing)