Java performance

From Wikipedia, the free encyclopedia
This article gives a general overview of Java platform performance. For criticisms of Java performance, and more generally of the Java language, see Criticism of Java.

In software development, the Java programming language was historically considered slow[1] because compiled Java programs run on the Java Virtual Machine rather than directly on the computer's processor, as C and C++ programs do. In newer Java versions, however, execution performance has been improved significantly, mainly thanks to the introduction of just-in-time compilation. Java performance is a matter of concern because a large amount of business software has been written in Java since the language quickly became popular in the late 1990s and early 2000s. Concerns over its performance led to the development of specialized hardware able to run Java directly, dubbed Java processors.

The performance of a compiled Java program depends on how well its particular tasks are managed by the host Java Virtual Machine (JVM), and on how well the JVM takes advantage of the features of the hardware and OS in doing so. Thus, any Java performance test or comparison should always report the version and vendor of the JVM used, together with the OS and hardware architecture it ran on. In a similar manner, the performance of the equivalent natively compiled program depends on the quality of its generated machine code, so the test or comparison should also report the name, version and vendor of the compiler used, and the optimization options that were enabled.
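As a minimal sketch of the reporting requirement above, a benchmark harness could print the JVM and platform details before running any measurement. The class name BenchmarkEnvironment and the output layout are assumptions made for illustration; the property keys are the standard system properties of the Java platform.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects the details a benchmark report should state.
final class BenchmarkEnvironment {

    // Standard system property keys defined by the Java platform.
    private static final String[] KEYS = {
        "java.version", "java.vendor", "java.vm.name", "java.vm.version",
        "os.name", "os.version", "os.arch"
    };

    static Map<String, String> describe() {
        Map<String, String> info = new LinkedHashMap<>();
        for (String key : KEYS) {
            info.put(key, System.getProperty(key));
        }
        // Number of processors visible to the JVM, as a rough hardware hint.
        info.put("available.processors",
                 Integer.toString(Runtime.getRuntime().availableProcessors()));
        return info;
    }

    public static void main(String[] args) {
        describe().forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

The compiler name, version and optimization options of a native counterpart are not visible from Java and would have to be recorded separately.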

Historically, the execution speed of Java programs improved significantly due to the introduction of just-in-time (JIT) compilation (in 1997/1998 for Java 1.1),[2][3][4] the addition of language features supporting better code analysis, and optimizations in the JVM itself (such as HotSpot becoming the default for Sun's JVM in 2000). Hardware execution of Java bytecode, such as that offered by ARM's Jazelle, can also offer significant performance improvements.

Virtual machine optimization techniques

Many optimizations have improved the performance of the JVM over time. Although the JVM was often the first virtual machine to implement them successfully, they have often been used in other similar platforms as well.

Just-In-Time compilation

Further information: Just-in-time compilation and HotSpot

Early JVMs always interpreted bytecodes. This carried a large performance penalty, a factor of between 10 and 20 for Java versus C in average applications.[5] To combat this, a just-in-time (JIT) compiler was introduced in Java 1.1. Because compilation itself is costly, an additional system called HotSpot was introduced in Java 1.2 and made the default in Java 1.3. Using this framework, the virtual machine continually analyzes the program's performance for "hot spots", code that is executed frequently or repeatedly. These are then targeted for optimization, leading to high-performance execution with a minimum of overhead for less performance-critical code.[6][7] Some benchmarks show a 10-fold speed gain from this technique.[8] However, because compilation takes place while the program is running, the compiler has limited time and cannot fully optimize the program, so the resulting code can be slower than natively compiled alternatives.[9][10]

Adaptive optimization

Further information: Adaptive optimization

Adaptive optimization is a technique in computer science that performs dynamic recompilation of portions of a program based on the current execution profile. With a simple implementation, an adaptive optimizer may simply make a trade-off between Just-in-time compilation and interpreting instructions. At another level, adaptive optimization may take advantage of local data conditions to optimize away branches and to use inline expansion.

A virtual machine like HotSpot is also able to deoptimize previously JIT-compiled code. This allows it to perform aggressive (and potentially unsafe) optimizations, while still being able to deoptimize the code and fall back on a safe path later on.[11][12]
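The kind of situation in which deoptimization matters can be sketched as follows. While only one implementation of an interface has been seen at a call site, a JIT compiler may speculatively inline it; once a second implementation appears, that assumption no longer holds and the specialized code has to be abandoned. The types Shape, Square and Circle are hypothetical, and whether a given JVM actually inlines or deoptimizes here is an assumption that depends on the JVM and its settings. The program's result is the same either way.

```java
// Hypothetical types used only for illustration.
interface Shape {
    double area();
}

final class Square implements Shape {
    private final double side;
    Square(double side) { this.side = side; }
    @Override public double area() { return side * side; }
}

final class Circle implements Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }
    @Override public double area() { return Math.PI * radius * radius; }
}

final class DeoptimizationSketch {

    // The call to shape.area() is the call site a JIT compiler may specialize.
    static double totalArea(Shape shape, int repetitions) {
        double total = 0.0;
        for (int i = 0; i < repetitions; i++) {
            total += shape.area();
        }
        return total;
    }

    public static void main(String[] args) {
        // Phase 1: only Square reaches the call site, so it looks monomorphic.
        double squares = totalArea(new Square(2.0), 1_000_000);
        // Phase 2: a Circle arrives; any code specialized for Square alone
        // is no longer valid for this call site.
        double circles = totalArea(new Circle(1.0), 1_000_000);
        System.out.println(squares + " " + circles);
    }
}
```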

Garbage collection

The 1.0 and 1.1 Virtual Machines used a mark-sweep collector, which could fragment the heap after a garbage collection. Starting with Java 1.2, the Virtual Machines switched to a generational collector, which has a much better defragmentation behaviour.[13] Modern Virtual Machines use a variety of techniques that have further improved the garbage collection performance.[14]

Other optimization techniques

Compressed Oops

Compressed Oops allow Java 5.0+ to address close to 32 GB of heap with 32-bit references. This significantly reduces memory consumption compared with using 64-bit references, as Java uses references much more than some languages such as C++. Java 5.0+ can do this because Java does not support access to individual bytes, only to objects. These objects are 8-byte aligned by default, meaning that the lowest 3 bits of an address are always 0 and do not need to be stored. Java 8 supports larger alignments, such as 16-byte alignment, to support up to 64 GB with 32-bit references.
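The arithmetic behind these figures can be sketched with a toy model: a 32-bit value that counts aligned slots rather than bytes reaches 2^32 × 8 bytes = 32 GB with 8-byte alignment, and 2^32 × 16 bytes = 64 GB with 16-byte alignment. The class and method names below are hypothetical, and the code only models the shifting of heap offsets; it is not how any particular JVM is implemented.

```java
// Toy model of compressed references; not an actual JVM implementation.
final class CompressedReferenceSketch {

    // Heap size addressable with 32-bit references for a given alignment.
    static long addressableBytes(int alignmentBytes) {
        return (1L << 32) * alignmentBytes;
    }

    // Drop the always-zero low bits so the offset fits in 32 bits.
    static int compress(long heapOffset, int shift) {
        return (int) (heapOffset >>> shift);
    }

    // Restore the full offset; the int is treated as an unsigned value.
    static long decompress(int compressed, int shift) {
        return (compressed & 0xFFFFFFFFL) << shift;
    }

    public static void main(String[] args) {
        long gib = 1L << 30;
        System.out.println(addressableBytes(8) / gib + " GB with 8-byte alignment");
        System.out.println(addressableBytes(16) / gib + " GB with 16-byte alignment");

        long offset = 20L * gib + 64;      // an 8-byte-aligned offset above 4 GB
        int narrow = compress(offset, 3);  // 8-byte alignment: shift by 3 bits
        System.out.println(decompress(narrow, 3) == offset);
    }
}
```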

Split bytecode verification

Prior to executing a class, the Sun JVM verifies its bytecodes (see Bytecode verifier). This verification is performed lazily: a class's bytecodes are only loaded and verified when that class is loaded and prepared for use, and not at the beginning of the program. (Note that other verifiers, such as the Java/400 verifier for IBM System i, can perform most verification in advance and cache verification information from one use of a class to the next.) However, as the Java class libraries are also regular Java classes, they must also be loaded when they are used, which means that the start-up time of a Java program is often longer than that of a C++ program, for example.

A technique named split-time verification, first introduced in the J2ME edition of the Java platform, has been used in the Java Virtual Machine since Java version 6. It splits the verification of bytecode into two phases:[15]

  • Design time: during the compilation of the class from source to bytecode
  • Runtime: when loading the class.

In practice this technique works by capturing knowledge that the Java compiler has of class flow and annotating the compiled method bytecodes with a synopsis of the class flow information. This does not make runtime verification appreciably less complex, but does allow some shortcuts.[citation needed]

Escape analysis and lock coarsening

Further information: Lock (computer science) and Escape analysis

Java is able to manage multithreading at the language level. Multithreading is a technique that allows programs to perform multiple tasks concurrently, thereby producing faster programs on computer systems with multiple processors or multiple cores. A multithreaded application can also remain responsive to input, even while it is performing long-running tasks.

However, programs that use multithreading need to take extra care with objects shared between threads, locking access to shared methods or blocks while they are used by one of the threads. Locking a block or an object is a time-consuming operation due to the nature of the underlying operating system-level operation involved (see concurrency control and lock granularity).

As the Java library does not know which methods will be used by more than one thread, its synchronized classes always acquire their locks, whether or not more than one thread is actually involved.

Prior to Java 6, the virtual machine always locked objects and blocks when asked to by the program, even if there was no risk of an object being modified by two different threads at the same time. For example, in the following method a local Vector was locked before each of the add operations to ensure that it would not be modified by other threads (Vector is synchronized), but because it is strictly local to the method this is unnecessary:

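import java.util.Vector;

public String getNames() {
     // v is never visible outside this method, so the locking that
     // each synchronized add() call performs protects nothing.
     Vector<String> v = new Vector<String>();
     v.add("Me");
     v.add("You");
     v.add("Her");
     return v.toString();
}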

Starting with Java 6, code blocks and objects are locked only when necessary,[16] so in the above case, the virtual machine would not lock the Vector object at all.

Since version 6u23, Java includes support for escape analysis.[17]
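What lock coarsening amounts to can be sketched at the source level. The first method below acquires and releases the same monitor once per call, several times in a row; the second shows an equivalent form with one enclosing synchronized block, which is roughly the transformation a JIT compiler may apply on its own. The class and method names are hypothetical, and the rewrite is done by hand here purely for illustration; a real JVM performs it on compiled code, if at all.

```java
import java.util.Vector;

// Hand-written illustration of lock coarsening; names are hypothetical.
final class LockCoarseningSketch {

    // Each synchronized Vector method locks and unlocks the vector's monitor.
    static String fineGrained(Vector<String> shared) {
        shared.add("Me");
        shared.add("You");
        shared.add("Her");
        return shared.toString();
    }

    // Same result for a single caller, with the monitor held once around
    // all the calls. Java monitors are reentrant, so the nested
    // acquisitions inside add() succeed immediately.
    static String coarsened(Vector<String> shared) {
        synchronized (shared) {
            shared.add("Me");
            shared.add("You");
            shared.add("Her");
            return shared.toString();
        }
    }

    public static void main(String[] args) {
        System.out.println(fineGrained(new Vector<String>()));
        System.out.println(coarsened(new Vector<String>()));
    }
}
```

When the Vector is a local object that never escapes the method, as in getNames() above, escape analysis can go one step further and let the virtual machine drop the locking altogether.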

Register allocation improvements

Prior to Java 6, allocation of registers was very primitive in the "client" virtual machine (values did not stay in registers across blocks), which was a problem on architectures with few registers available, such as x86. If no more registers are available for an operation, the compiler must copy from register to memory (or memory to register), which takes time (registers are significantly faster to access). However, the "server" virtual machine used a graph-coloring allocator and did not suffer from this problem.

An optimization of register allocation was introduced in Sun's JDK 6;[18] it was then possible to use the same registers across blocks (when applicable), reducing accesses to the memory. This led to a reported performance gain of approximately 60% in some benchmarks.[19]

Class data sharing

Class data sharing (called CDS by Sun) is a mechanism that reduces the startup time of Java applications and also reduces their memory footprint. When the JRE is installed, the installer loads a set of classes from the system jar file (the jar file containing the entire Java class library, called rt.jar) into a private internal representation, and dumps that representation to a file called a "shared archive". During subsequent JVM invocations, this shared archive is memory-mapped in, saving the cost of loading those classes and allowing much of the JVM's metadata for these classes to be shared among multiple JVM processes.[20]

The corresponding improvement for start-up time is more noticeable for small programs.[21]

History of performance improvements

Further information: Java version history

Apart from the improvements listed here, each release of Java introduced many performance improvements in the JVM and Java API.

JDK 1.1.6 : First Just-in-time compilation (Symantec's JIT-compiler)[2][22]

J2SE 1.2 : Use of a generational collector.

J2SE 1.3 : Just-In-Time compilation by HotSpot.

J2SE 1.4 : See the Sun overview of performance improvements between the 1.3 and 1.4 versions.

Java SE 5.0 : Class Data Sharing[23]

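Java SE 6 : Split bytecode verification, escape analysis and lock coarsening, and register allocation improvements.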

Other improvements:

See also 'Sun overview of performance improvements between Java 5 and Java 6'.[26]

Java SE 6 Update 10

  • Java Quick Starter reduces application start-up time by preloading part of the JRE data into the disk cache at OS startup.[27]
  • When an application is accessed from the web and the JRE is not installed, the parts of the platform needed to execute it are now downloaded first. The entire JRE is 12 MB, but a typical Swing application only needs to download 4 MB to start; the remaining parts are then downloaded in the background.[28]
  • Graphics performance on Windows was improved by using Direct3D extensively by default,[29] and by using shaders on the GPU to accelerate complex Java 2D operations.[30]

Java 7

Several performance improvements have been released for Java 7. The following improvements were planned for an update of Java 6 or for Java 7:[31]

  • Provide JVM support for dynamic languages, following the prototyping work currently done on the Multi Language Virtual Machine,[32]
  • Enhance the existing concurrency library by managing parallel computing on multi-core processors,[33][34]
  • Allow the virtual machine to use both the Client and Server compilers in the same session with a technique called tiered compilation:[35]
    • The Client would be used at startup (because it is good at startup and for small applications),
    • The Server would be used for long-term running of the application (because it outperforms the Client compiler for this).
  • Replace the existing concurrent low-pause garbage collector (also called CMS or Concurrent Mark-Sweep collector) by a new collector called G1 (or Garbage First) to ensure consistent pauses over time.[36][37]
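The concurrency enhancement mentioned in the list above became the fork/join framework in java.util.concurrent. A minimal sketch of its use, assuming a simple array-summing task, could look like this; the class name ParallelSum and the threshold value are arbitrary choices for illustration, not recommended settings.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums an array by recursively splitting it into halves.
final class ParallelSum extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10_000; // arbitrary cut-off
    private final long[] values;
    private final int from;
    private final int to;

    ParallelSum(long[] values, int from, int to) {
        this.values = values;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += values[i];
            }
            return sum; // small enough: sum sequentially
        }
        int mid = (from + to) >>> 1;
        ParallelSum left = new ParallelSum(values, from, mid);
        ParallelSum right = new ParallelSum(values, mid, to);
        left.fork();                      // run the left half asynchronously
        long rightSum = right.compute();  // compute the right half here
        return left.join() + rightSum;    // wait for the left half
    }

    static long sum(long[] values) {
        ForkJoinPool pool = new ForkJoinPool(); // one worker per core by default
        try {
            return pool.invoke(new ParallelSum(values, 0, values.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long[] data = new long[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(data));
    }
}
```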

Comparison to other languages

Objectively comparing the performance of a Java program and another equivalent one written in another programming language such as C++ requires a carefully and thoughtfully constructed benchmark which compares programs expressing algorithms written in as identical a manner as technically possible. The target platform of Java's bytecode compiler is the Java platform, and the bytecode is either interpreted or compiled into machine code by the JVM. Other compilers almost always target a specific hardware and software platform, producing machine code that will stay virtually unchanged during its execution. Very different and hard-to-compare scenarios arise from these two different approaches: static vs. dynamic compilations and recompilations, the availability of precise information about the runtime environment and others.

Java is often Just-in-time compiled at runtime by the Java Virtual Machine, but may also be compiled ahead-of-time, just like C++. When Just-in-time compiled, in micro-benchmarks its performance is generally:[38]

  • slower than or similar to compiled languages such as C or C++,[39]
  • similar to other Just-in-time compiled languages such as C#,[40]
  • much faster than languages without an effective native-code compiler (JIT or AOT), such as Perl, Ruby, PHP and Python.[41]

Program speed

Java is in some cases equal to C++ on low-level and numeric benchmarks.[42]

Benchmarks often measure performance for small, numerically intensive programs. In some real-life programs, Java outperforms C. One example is the benchmark of Jake2 (a clone of Quake 2 written in Java by translating the original GPL C code). The Java 5.0 version performs better in some hardware configurations than its C counterpart.[43] It is not specified how the data was measured (for example, whether the original Quake 2 executable compiled in 1997 was used, which would be a weak baseline because current C compilers may optimize Quake better). The benchmark does, however, show that the same Java source code can gain considerable speed just from an updated VM, something that is impossible to achieve with a fully static approach.

For other programs, the C++ counterpart can, and often does, run significantly faster than the Java equivalent. A benchmark performed by Google in 2011 showed a factor of 10 between C++ and Java.[44] At the other extreme, an academic benchmark performed in 2012 with a 3D modelling algorithm showed the Java 6 JVM being from 1.09 to 1.91 times slower than C++ under Windows.[45]

Some optimizations that are possible in Java and similar languages might not be possible in certain circumstances in C++:[46]

  • C-style pointer usage can hinder optimization in languages that support pointers,
  • The use of escape analysis techniques is limited in C++, for example, because a C++ compiler does not always know if an object will be modified in a particular block of code due to pointers,[note 1]
  • Java can access derived instance methods faster than C++ can access derived virtual methods due to C++'s extra Virtual-Table look-up. However, non-virtual methods in C++ do not suffer from V-Table performance bottlenecks, and thus exhibit performance similar to that of Java.

The JVM is also able to perform processor-specific optimizations and inline expansion. In addition, the ability to deoptimize code that was previously compiled or inlined sometimes allows it to perform more aggressive optimizations than those performed by statically compiled languages when external library functions are involved.[47][48]

Results for microbenchmarks between Java and C++ highly depend on which operations are compared. For example, when comparing with Java 5.0:


Notes
  1. ^ Contention of this nature can be alleviated in C++ programs at the source code level by employing advanced techniques such as custom allocators, exploiting precisely the kind of low-level coding complexity that Java was designed to conceal and encapsulate; however, this approach is rarely practical if not adopted (or at least anticipated) while the program remains under primary development.

Multi-core performance

The scalability and performance of Java applications on multi-core systems is limited by the object allocation rate. This effect is sometimes called an "allocation wall".[55] However, in practice, modern garbage collector algorithms use multiple cores to perform garbage collection, which to some degree alleviates this problem. Some garbage collectors are reported to sustain allocation rates of over a gigabyte per second,[56] and there exist Java-based systems that have no problems scaling to several hundreds of CPU cores and heaps sized several hundreds of GB.[57]

Automatic memory management in Java allows for efficient use of lockless and immutable data structures that are extremely hard or sometimes impossible to implement without some kind of a garbage collection. Java offers a number of such high-level structures in its standard library in the java.util.concurrent package, while many languages historically used for high performance systems like C or C++ are still lacking them.
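As an illustration of why garbage collection helps here, the following is a minimal sketch of a lock-free stack (often called a Treiber stack) built on a compare-and-set loop. Because a node cannot be reclaimed and reused while any thread still holds a reference to it, the code needs no manual reclamation scheme. The class LockFreeStack is written for this example and is not part of the standard library; java.util.concurrent offers ready-made structures such as ConcurrentLinkedQueue.

```java
import java.util.concurrent.atomic.AtomicReference;

// Minimal lock-free stack sketch; hypothetical class, not a library type.
final class LockFreeStack<T> {

    // Immutable node: once published it is never modified.
    private static final class Node<T> {
        final T value;
        final Node<T> next;

        Node(T value, Node<T> next) {
            this.value = value;
            this.next = next;
        }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    void push(T value) {
        Node<T> oldHead;
        Node<T> newHead;
        do {
            oldHead = head.get();
            newHead = new Node<>(value, oldHead);
            // Retry if another thread changed the head in the meantime.
        } while (!head.compareAndSet(oldHead, newHead));
    }

    // Returns null when the stack is empty.
    T pop() {
        Node<T> oldHead;
        do {
            oldHead = head.get();
            if (oldHead == null) {
                return null;
            }
        } while (!head.compareAndSet(oldHead, oldHead.next));
        return oldHead.value;
    }

    public static void main(String[] args) {
        LockFreeStack<String> stack = new LockFreeStack<>();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop() + " " + stack.pop() + " " + stack.pop());
    }
}
```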

Startup time

Java startup is often much slower than that of many other languages, including C, C++, Perl and Python, because a lot of classes (first of all, classes from the platform class libraries) must be loaded before being used.

When compared against similar popular runtimes, for small programs running on a Windows machine, the startup time appears to be similar to Mono's and a little slower than .NET's.[58]

It seems that much of the startup time is due to IO-bound operations rather than JVM initialization or class loading (the rt.jar class data file alone is 40 MB and the JVM must seek a lot of data in this huge file).[27] Some tests showed that although the new Split bytecode verification technique improved class loading by roughly 40%, it only translated to about 5% startup improvement for large programs.[59]

Although the improvement is small, it is more visible in small programs that perform a simple operation and then exit, because loading the Java platform data can represent many times the load of the actual program's operation.

Beginning with Java SE 6 Update 10, the Sun JRE comes with a Quick Starter that preloads class data at OS startup to get data from the disk cache rather than from the disk.

Excelsior JET approaches the problem from the other side. Its Startup Optimizer reduces the amount of data that must be read from the disk on application startup, and makes the reads more sequential.

In November 2004, Nailgun, a "client, protocol, and server for running Java programs from the command line without incurring the JVM startup overhead", was publicly released. It introduced for the first time an option for scripts to use a JVM as a daemon, running one or more Java applications with no JVM startup overhead. The Nailgun daemon is insecure: "all programs are run with the same permissions as the server". Where multi-user security is required, Nailgun is inappropriate without special precautions. Scripts in which per-application JVM startup dominates resource usage see a runtime performance improvement of one to two orders of magnitude.[60]

Memory usage

Java memory usage is much higher than C++'s memory usage because:

  • There is an overhead of 8 bytes for each object and 12 bytes for each array[61] in Java. If the size of an object is not a multiple of 8 bytes, it is rounded up to the next multiple of 8. This means that an object containing a single byte field occupies 16 bytes and requires a 4-byte reference. Note that C++ also allocates a pointer (usually 4 or 8 bytes) for every object that declares virtual functions.[62]
  • Parts of the Java Library must be loaded prior to the program execution (at least the classes that are used "under the hood" by the program).[63] This leads to a significant memory overhead for small applications.[citation needed]
  • Both the Java binary and native recompilations will typically be in memory.
  • The virtual machine itself consumes a significant amount of memory.
  • In Java, a composite object (class A which uses instances of B and C) is created using references to allocated instances of B and C. In C++ the memory and performance cost of these types of references can be avoided when the instance of B and/or C exists within A.
  • Lack of address arithmetic makes creating memory-efficient containers, such as tightly spaced structures and XOR linked lists, impossible.
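The per-object arithmetic from the first item of the list above can be written out as a small sketch. The header sizes and the 8-byte rounding are the figures quoted in that item and are taken as assumptions here; actual sizes vary between JVMs and settings. The class and method names are hypothetical.

```java
// Size estimate using the figures quoted in the text; real JVMs may differ.
final class ObjectSizeEstimate {
    static final int OBJECT_HEADER_BYTES = 8;  // per-object overhead from the text
    static final int ARRAY_HEADER_BYTES = 12;  // per-array overhead from the text
    static final int ALIGNMENT_BYTES = 8;      // sizes are rounded up to this

    // Round a size up to the next multiple of the alignment.
    static long roundUp(long size) {
        return (size + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    // Estimated size of an object whose fields occupy fieldBytes in total.
    static long objectSize(long fieldBytes) {
        return roundUp(OBJECT_HEADER_BYTES + fieldBytes);
    }

    // Estimated size of an array of the given length and element size.
    static long arraySize(long length, int elementBytes) {
        return roundUp(ARRAY_HEADER_BYTES + length * elementBytes);
    }

    public static void main(String[] args) {
        // One byte field: 8 + 1 = 9 bytes, rounded up to 16.
        System.out.println(objectSize(1));
        // byte[10]: 12 + 10 = 22 bytes, rounded up to 24.
        System.out.println(arraySize(10, 1));
    }
}
```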

In most cases a C++ application will consume less memory than the equivalent Java application due to the large overhead of Java's virtual machine, class loading and automatic memory resizing. For applications in which memory is a critical factor for choosing between languages and runtime environments, a cost/benefit analysis is required.

Trigonometric functions

Performance of trigonometric functions can be poor compared to C, because Java has strict specifications for the results of mathematical operations, which may not correspond to the underlying hardware implementation.[64] On the x87, Java since 1.4 does argument reduction for sin and cos in software,[65] causing a big performance hit for arguments outside the range that the hardware handles accurately.[66]

Java Native Interface

The Java Native Interface has a high overhead associated with it, making it costly to cross the boundary between code running on the JVM and native code.[67][68] Java Native Access (JNA) provides Java programs easy access to native shared libraries (DLLs on Windows) without writing anything but Java code; no JNI or native code is required. This functionality is comparable to Windows' Platform Invoke (P/Invoke) and Python's ctypes. Access is dynamic at runtime, without code generation. This convenience comes at a cost, however: JNA is usually slower than JNI.[69]

User interface

Swing has been perceived as slower than native widget toolkits, because it delegates the rendering of widgets to the pure Java 2D API. However, benchmarks comparing the performance of Swing versus the Standard Widget Toolkit, which delegates the rendering to the native GUI libraries of the operating system, show no clear winner, and the results greatly depend on the context and the environments.[70]

In most cases Java suffers greatly from its need to copy image data from one place in memory to another before rendering it to the screen. C++ can usually avoid this large overhead by accessing memory directly. The developers of Java have attempted to overcome this limitation with certain so-called "unsafe" direct memory access classes. However, those attempts fall far short of what C++ natively offers. For example, two major Java OpenGL implementations suffer tremendously from this data duplication problem which is difficult, if not impossible, to avoid with Java.[citation needed]

Use for high performance computing

A 2008 INRIA study found that Java performance for high performance computing (HPC) is similar to Fortran on computation-intensive benchmarks, but that JVMs still have scalability issues when performing intensive communication on a grid network.[71]

However, high performance computing applications written in Java have won benchmark competitions. In 2008[72] and 2009,[73][74] a cluster based on Apache Hadoop (an open-source high performance computing project written in Java) sorted a terabyte and a petabyte of integers faster than any competitor. The hardware setup of the competing systems was not fixed, however.[75][76]

In programming contests

As Java solutions start slower than solutions in other compiled languages,[77][78] it is not uncommon for Chinese university online judges to use greater time limits for Java solutions[79][80][81][82][83] to be fair to contestants using Java.

See also

References

  1. ^ http://www.scribblethink.org/Computer/javaCbenchmark.html
  2. ^ a b "Symantec's Just-In-Time Java Compiler To Be Integrated Into Sun JDK 1.1". 
  3. ^ "Apple Licenses Symantec's Just In Time (JIT) Compiler To Accelerate Mac OS Runtime For Java". [dead link]
  4. ^ "Java gets four times faster with new Symantec just-in-time compiler". 
  5. ^ http://www.shudo.net/jit/perf/
  6. ^ Kawaguchi, Kohsuke (30 March 2008). "Deep dive into assembly code from Java". Retrieved 2 April 2008. 
  7. ^ "Fast, Effective Code Generation in a Just-In-Time Java Compiler". Intel Corporation. Retrieved 22 June 2007. 
  8. ^ This article shows that the performance gain between interpreted mode and Hotspot amounts to more than a factor of 10.
  9. ^ Numeric performance in C, C# and Java
  10. ^ Algorithmic Performance Comparison Between C, C++, Java and C# Programming Languages
  11. ^ "The Java HotSpot Virtual Machine, v1.4.1". Sun Microsystems. Retrieved 20 April 2008. 
  12. ^ Nutter, Charles (28 January 2008). "Lang.NET 2008: Day 1 Thoughts". Retrieved 18 January 2011. "Deoptimization is very exciting when dealing with performance concerns, since it means you can make much more aggressive optimizations...knowing you'll be able to fall back on a tried and true safe path later on" 
  13. ^ IBM DeveloperWorks Library
  14. ^ For example, the duration of pauses is less noticeable now. See for example this clone of Quake 2 written in Java: Jake2.
  15. ^ "New Java SE 6 Feature: Type Checking Verifier". Java.net. Retrieved 18 January 2011. 
  16. ^ Brian Goetz (2005-10-18). "Java theory and practice: Synchronization optimizations in Mustang". IBM. Retrieved 2013-01-26. 
  17. ^ "Java HotSpot Virtual Machine Performance Enhancements". Oracle Corporation. Retrieved 2014-01-14. "Escape analysis is a technique by which the Java Hotspot Server Compiler can analyze the scope of a new object's uses and decide whether to allocate it on the Java heap. Escape analysis is supported and enabled by default in Java SE 6u23 and later." 
  18. ^ Bug report: new register allocator, fixed in Mustang (JDK 6) b59
  19. ^ Mustang's HotSpot Client gets 58% faster! in Osvaldo Pinali Doederlein's Blog at java.net
  20. ^ Class Data Sharing at java.sun.com
  21. ^ Class Data Sharing in JDK 1.5.0 in Java Buzz Forum at artima developer
  22. ^ Mckay, Niali. "Java gets four times faster with new Symantec just-in-time compiler". 
  23. ^ Sun overview of performance improvements between 1.4 and 5.0 versions.
  24. ^ STR-Crazier: Performance Improvements in Mustang in Chris Campbell's Blog at java.net
  25. ^ See here for a benchmark showing an approximately 60% performance boost from Java 5.0 to 6 for the application JFreeChart
  26. ^ Java SE 6 Performance White Paper at http://java.sun.com
  27. ^ a b Haase, Chet (May 2007). "Consumer JRE: Leaner, Meaner Java Technology". Sun Microsystems. Retrieved 27 July 2007. "At the OS level, all of these megabytes have to be read from disk, which is a very slow operation. Actually, it's the seek time of the disk that's the killer; reading large files sequentially is relatively fast, but seeking the bits that we actually need is not. So even though we only need a small fraction of the data in these large files for any particular application, the fact that we're seeking all over within the files means that there is plenty of disk activity. " 
  28. ^ Haase, Chet (May 2007). "Consumer JRE: Leaner, Meaner Java Technology". Sun Microsystems. Retrieved 27 July 2007. 
  29. ^ Haase, Chet (May 2007). "Consumer JRE: Leaner, Meaner Java Technology". Sun Microsystems. Retrieved 27 July 2007. 
  30. ^ Campbell, Chris (7 April 2007). "Faster Java 2D Via Shaders". Retrieved 18 January 2011. 
  31. ^ Haase, Chet (May 2007). "Consumer JRE: Leaner, Meaner Java Technology". Sun Microsystems. Retrieved 27 July 2007. 
  32. ^ "JSR 292: Supporting Dynamically Typed Languages on the Java Platform". jcp.org. Retrieved 28 May 2008. 
  33. ^ Goetz, Brian (4 March 2008). "Java theory and practice: Stick a fork in it, Part 2". Retrieved 9 March 2008. 
  34. ^ Lorimer, R.J. (21 March 2008). "Parallelism with Fork/Join in Java 7". infoq.com. Retrieved 28 May 2008. 
  35. ^ "New Compiler Optimizations in the Java HotSpot Virtual Machine". Sun Microsystems. May 2006. Retrieved 30 May 2008. 
  36. ^ Humble, Charles (13 May 2008). "JavaOne: Garbage First". infoq.com. Retrieved 7 September 2008. 
  37. ^ Coward, Danny (12 November 2008). "Java VM: Trying a new Garbage Collector for JDK 7". Retrieved 15 November 2008. 
  38. ^ "Computer Language Benchmarks Game". benchmarksgame.alioth.debian.org. Retrieved 2 June 2011. 
  39. ^ "Computer Language Benchmarks Game". benchmarksgame.alioth.debian.org. Retrieved 2 June 2011. 
  40. ^ "Computer Language Benchmarks Game". benchmarksgame.alioth.debian.org. Retrieved 2 June 2011. 
  41. ^ "Computer Language Benchmarks Game". benchmarksgame.alioth.debian.org. Retrieved 2 June 2011. 
  42. ^ Computer Language Benchmarks Game
  43. ^ 260/250 frame/s versus 245 frame/s (see benchmark)
  44. ^ Hundt, Robert. Loop Recognition in C++/Java/Go/Scala. Stanford, California: Google. Retrieved 2014-03-23. 
  45. ^ L. Gherardi, D. Brugali, D. Comotti (2012). "A Java vs. C++ performance evaluation: a 3D modeling benchmark". University of Bergamo. Retrieved 2014-03-23. "Using the Server compiler, which is best tuned for long-running applications, have instead demonstrated that Java is from 1.09 to 1.91 times slower(...)In conclusion, the results obtained with the server compiler and these important features suggest that Java can be considered a valid alternative to C++" 
  46. ^ Lewis, J.P.; Neumann, Ulrich. "Performance of Java versus C++". Computer Graphics and Immersive Technology Lab, University of Southern California. 
  47. ^ "The Java HotSpot Performance Engine: Method Inlining Example". Oracle Corporation. Retrieved 11 June 2011. 
  48. ^ Nutter, Charles (3 May 2008). "The Power of the JVM". Retrieved 11 June 2011. "What happens if you've already inlined A's method when B comes along? Here again the JVM shines. Because the JVM is essentially a dynamic language runtime under the covers, it remains ever-vigilant, watching for exactly these sorts of events to happen. And here's the really cool part: when situations change, the JVM can deoptimize. This is a crucial detail. Many other runtimes can only do their optimization once. C compilers must do it all ahead of time, during the build. Some allow you to profile your application and feed that into subsequent builds, but once you've released a piece of code it's essentially as optimized as it will ever get. Other VM-like systems like the CLR do have a JIT phase, but it happens early in execution (maybe before the system even starts executing) and doesn't ever happen again. The JVM's ability to deoptimize and return to interpretation gives it room to be optimistic...room to make ambitious guesses and gracefully fall back to a safe state, to try again later." 
  49. ^ "Microbenchmarking C++, C#, and Java: 32-bit integer arithmetic". Dr. Dobb's Journal. 1 July 2005. Retrieved 18 January 2011. 
  50. ^ "Microbenchmarking C++, C#, and Java: 64-bit double arithmetic". Dr. Dobb's Journal. 1 July 2005. Retrieved 18 January 2011. 
  51. ^ "Microbenchmarking C++, C#, and Java: File I/O". Dr. Dobb's Journal. 1 July 2005. Retrieved 18 January 2011. 
  52. ^ "Microbenchmarking C++, C#, and Java: Exception". Dr. Dobb's Journal. 1 July 2005. Retrieved 18 January 2011. 
  53. ^ "Microbenchmarking C++, C#, and Java: Array". Dr. Dobb's Journal. 1 July 2005. Retrieved 18 January 2011. 
  54. ^ "Microbenchmarking C++, C#, and Java: Trigonometric functions". Dr. Dobb's Journal. 1 July 2005. Retrieved 18 January 2011. 
  55. ^ Yi Zhao, Jin Shi, Kai Zheng, Haichuan Wang, Haibo Lin and Ling Shao, Allocation wall: a limiting factor of Java applications on emerging multi-core platforms, Proceedings of the 24th ACM SIGPLAN conference on Object oriented programming systems languages and applications, 2009.
  56. ^ C4: The Continuously Concurrent Compacting Collector
  57. ^ Azul bullies Java with 768 core machine
  58. ^ "Benchmark start-up and system performance for .Net, Mono, Java, C++ and their respective UI". 2 September 2010. 
  59. ^ "How fast is the new verifier?". 7 February 2006. Retrieved 9 May 2007. 
  60. ^ The Nailgun Background page demonstrates "best case scenario" speedup of 33 times (for scripted "Hello, world!" i.e. short-run programs).
  61. ^ http://www.javamex.com/tutorials/memory/object_memory_usage.shtml
  62. ^ http://www.informit.com/guides/content.aspx?g=cplusplus&seqNum=195
  63. ^ http://www.tommti-systems.de/go.html?http://www.tommti-systems.de/main-Dateien/reviews/languages/benchmarks.html
  64. ^ "Math (Java Platform SE 6)". Sun Microsystems. Retrieved 8 June 2008. 
  65. ^ Gosling, James (27 July 2005). "Transcendental Meditation". Retrieved 8 June 2008. 
  66. ^ W. Cowell-Shah, Christopher (8 January 2004). "Nine Language Performance Round-up: Benchmarking Math & File I/O". Retrieved 8 June 2008. 
  67. ^ Wilson, Steve; Jeff Kesselman (2001). "JavaTM Platform Performance: Using Native Code". Sun Microsystems. Retrieved 15 February 2008. 
  68. ^ Kurzyniec, Dawid; Vaidy Sunderam. "Efficient Cooperation between Java and Native Codes - JNI Performance Benchmark". Retrieved 15 February 2008. 
  69. ^ "How does JNA performance compare to custom JNI?". Sun Microsystems. Retrieved 26 December 2009. 
  70. ^ Igor, Križnar (10 May 2005). "SWT Vs. Swing Performance Comparison". cosylab.com. Retrieved 24 May 2008. "It is hard to give a rule-of-thumb where SWT would outperform Swing, or vice versa. In some environments (e.g., Windows), SWT is a winner. In others (Linux, VMware hosting Windows), Swing and its redraw optimization outperform SWT significantly. Differences in performance are significant: factors of 2 and more are common, in either direction" 
  71. ^ Brian Amedro, Vladimir Bodnartchouk, Denis Caromel, Christian Delbe, Fabrice Huet, Guillermo L. Taboada (August 2008). "Current State of Java for HPC". INRIA. Retrieved 9 September 2008. "We first perform some micro benchmarks for various JVMs, showing the overall good performance for basic arithmetic operations(...). Comparing this implementation with a Fortran/MPI one, we show that they have similar performance on computation intensive benchmarks, but still have scalability issues when performing intensive communications." 
  72. ^ Owen O'Malley - Yahoo! Grid Computing Team (July 2008). "Apache Hadoop Wins Terabyte Sort Benchmark". Retrieved 21 December 2008. "This is the first time that either a Java or an open source program has won." 
  73. ^ "Hadoop Sorts a Petabyte in 16.25 Hours and a Terabyte in 62 Seconds". CNET.com. 11 May 2009. Retrieved 8 September 2010. "The hardware and operating system details are:(...)Sun Java JDK (1.6.0_05-b13 and 1.6.0_13-b03) (32 and 64 bit)" 
  74. ^ "Hadoop breaks data-sorting world records". CNET.com. 15 May 2009. Retrieved 8 September 2010. 
  75. ^ Chris Nyberg and Mehul Shah. "Sort Benchmark Home Page". Retrieved 30 November 2010. 
  76. ^ Czajkowski, Grzegorz (21 November 2008). "Sorting 1PB with MapReduce". google. Retrieved 1 December 2010. 
  77. ^ http://topcoder.com/home/tco10/2010/06/08/algorithms-problem-writing/
  78. ^ http://acm.timus.ru/help.aspx?topic=java&locale=en
  79. ^ http://acm.pku.edu.cn/JudgeOnline/faq.htm#q11
  80. ^ http://acm.tju.edu.cn/toj/faq.html#qj
  81. ^ http://www.codechef.com/wiki/faq#How_does_the_time_limit_work
  82. ^ http://acm.xidian.edu.cn/land/faq
  83. ^ http://poj.org/faq.htm#q9
