Talk:Just-in-time compilation

From Wikipedia, the free encyclopedia
This article is within the scope of WikiProject Java, a collaborative effort to improve the coverage of Java on Wikipedia. It has been rated Start-Class on the project's quality scale and Low-importance on the project's importance scale.

Confusion starts with the very first paragraph

In computing, just-in-time compilation (JIT), also known as dynamic translation, is a method to improve the runtime performance of computer programs. Traditionally, computer programs had two modes of runtime operation, either interpreted or static (ahead-of-time) compilation.[citation needed] Interpreted code is translated from a high-level language to a machine code continuously during every execution, whereas statically compiled code is translated into machine code before execution, and only requires this translation once. JIT compilers represent a hybrid approach, with translation occurring continuously, as with interpreters, but with caching of translated code to minimize performance degradation. It also offers other advantages over statically compiled code at development time, such as handling of late-bound data types and the ability to enforce security guarantees.

A program has modes of operation? Interpreted and static compilation? How is compilation a mode of a program??? 'Interpreted code ... is translated ... to a machine code'??? Interpreters don't translate, they execute. Traditionally many systems have also allowed mixed modes: some parts are interpreted and other parts are executed as compiled code.

First: Interpretation takes some form of source code (be it some programming language or some code of a VM) and executes each statement by dispatching on the type of statement. There is no form of translation involved. That's why we call it an Interpreter and not a Translator/Compiler. Interpreters don't do translation. They execute the code either directly or call library routines which implement some of the instructions.

Second: Compilation translates some form of source code to some executable code (either for a real machine as machine code or for a virtual machine as 'byte code').

Third: if an execution engine runs byte code in a virtual machine, the byte code is interpreted at runtime. JIT compilation can now be used to translate the byte code to machine code before execution and cache the results of this translation. Then the byte code is no longer interpreted, but the translated machine code is run instead.

Fourth: JIT compilers don't offer the claimed advantages at development time over static compilation. Static compilation can be incremental.

Fifth: JIT compilers have a possible advantage over static compilation, because they can use runtime information to guide the code generation. — Preceding unsigned comment added by Joswig (talk • contribs) 12:03, 21 February 2011 (UTC)
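To make the interpreter/translator distinction above concrete, here is a minimal Python sketch. It is not taken from the article or from any real VM: the toy opcodes (push, add, mul), the function names and the cache are all hypothetical, and a Python function stands in for machine code.

```python
def interpret(program, x):
    """Execute each instruction by dispatching on its opcode. Nothing is translated."""
    stack = [x]
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {op!r}")
    return stack.pop()


_translation_cache = {}  # hypothetical cache: program -> translated function


def translate(program):
    """Translate the whole program once and cache the result.

    The generated Python function is only a stand-in for native code.
    """
    key = tuple(program)
    if key not in _translation_cache:
        exprs = ["x"]  # symbolic stack of expression strings
        for op, arg in program:
            if op == "push":
                exprs.append(repr(arg))
            elif op in ("add", "mul"):
                b, a = exprs.pop(), exprs.pop()
                sign = "+" if op == "add" else "*"
                exprs.append(f"({a} {sign} {b})")
            else:
                raise ValueError(f"unknown opcode: {op!r}")
        source = f"lambda x: {exprs.pop()}"
        _translation_cache[key] = eval(compile(source, "<translated>", "eval"))
    return _translation_cache[key]


# Toy program computing (x + 2) * 3
PROGRAM = [("push", 2), ("add", None), ("push", 3), ("mul", None)]
```

Calling interpret(PROGRAM, x) pays the dispatch cost on every instruction of every run, while translate(PROGRAM) pays a one-time translation cost and afterwards returns the cached function.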

Comments

Great work on this article; I found it well-written and very informative. Vazor (talk) 22:01, 7 July 2009 (UTC)

This page is redundant with dynamic translation. Can they be merged? --FOo 04:01, 4 May 2004 (UTC)


I think there's some overlap with dynamic compilation and dynamic recompilation, too. I'm not sure how this would best be handled. --StuartBrady (Talk) 19:44, 2 March 2007 (UTC)

A note on interpretation

Just a comment for future reference: Pure interpreted systems are pretty damn rare these days. Earlier versions of this article held that JIT was a hybrid between interpreters and compilers, and implied that "scripting languages" were typically slow and impractical interpreters. This is inaccurate. With the exception of shells such as bash, old-school interpreters are not so common. Most so-called "interpreted" "scripting" languages (usually an erroneous name) are bytecode compilers -- e.g. Perl and Python. I think it's more likely useful to consider JIT as a bridge between bytecode systems (like these and early JVMs) and full native-code dynamic compilers (like most Common Lisp systems for instance). --FOo 12:54, 26 Apr 2005 (UTC)


In the first paragraph a sentence starts with "In a bytecode-compiled system such as Perl, GNU CLISP, or early versions of Java,..." This gives the impression that current versions of Java are somehow different. Either the distinction between early and current versions of Java or the use of the term "bytecode-compiled system" is misleading. Regards 145.254.68.190 20:41, 24 August 2005 (UTC)

Performance?

Thanks to the article, I finally have a better appreciation of JIT, but one thing I've always heard remains unexplained: performance. I've heard a number of proponents of JIT claim that it is fast, drawing a comparison of JIT and traditionally compiled code (i.e. to machine language).

I'm not sure I get it. I've yet to meet a developer who thinks that compile time is an important code performance metric, so I rule out the compilation to bytecode as the basis of comparison. That just leaves the dynamic translation, as far as I can tell. Since this is merely a bunch of lookups, I can see that this could be fast. But surely machine code is faster, since it requires no translation at all?

What am I missing?

I believe you are missing the value of real-data runtime profiling. While I do not agree with the part of the article where the author asserts that statically compiled code "cannot" be as good as JIT code, I would say that JIT results across small domains of code, such as individual nested loops, can often be better than a similarly written program which is statically compiled. JIT results can be better because the compiler has access to the actual branch history and profiling metrics from previous passes over the same code body using the actual current dataset. Many (most) applications that are statically compiled are never subjected to execution profiling and reoptimization. If they are, the sample data used for a profiling pass is often contrived or non-normal. That said, JIT code can be profiled into truly abysmal performance if the data set is markedly irregular: if the first ten percent of a large data set is very unlike the remaining ninety percent, the profiling can be misled into favoring the non-normal code branches. HotSpot in particular will revisit compiled code domains to re-check its previous decisions and perhaps recompile. With contrived datasets this can (if memory serves) lead to thrashing in the compiler. -- R.White
On re-read of "But surely machine code is faster" etc., you may also be missing the part where the JIT translation isn't compilation to bytecode; it is compilation from bytecode to native executable code. That is, in Java for example, the source was statically compiled to Java bytecode. The runtime system starts by interpreting said bytecode. When it finds some code that is executing repeatedly, it invokes a native optimizing compiler to replace that bytecode sequence with a block of native machine code. That compilation happens in a context where several passes over the code have already been profiled, so the compiler "knows" which alternatives ("ifs" and "cases" and function calls) should be on the non-branching mainline through the code and so on. The result of running in JIT is a patchwork of native code and bytecode that is theoretically self-selected/organized for optimum flow within the actual current data use. That said, I have no information about the relative value of whole-application static optimization vs. inherently regional JIT vs. contrived-dataset profiled optimization etc. for a "typical" application. -- R. White —Preceding unsigned comment added by 24.19.221.47 (talk) 14:46, 17 August 2010 (UTC)
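A rough Python illustration of the profiling idea described above, offered only as a sketch: code runs in a slow, instrumented mode until it becomes "hot", and is then rebuilt with the most frequently taken alternative placed first. The class name, the threshold value and the assumption that the cases are mutually exclusive are all mine; this is not how HotSpot is actually implemented.

```python
from collections import Counter

HOT_THRESHOLD = 3  # hypothetical; chosen small so the example is easy to follow


class ProfiledDispatch:
    """Tests cases in source order while cold, then reorders them by observed frequency.

    Assumes the case predicates are mutually exclusive, so reordering is safe.
    """

    def __init__(self, cases):
        self.cases = list(cases)   # (name, predicate, handler) in source order
        self.hits = Counter()      # profile: how often each case was taken
        self.calls = 0
        self.compiled = None       # reordered cases once the code is "hot"

    def __call__(self, x):
        if self.compiled is not None:
            # "Compiled" mode: no profiling overhead, likeliest case tested first.
            for _name, predicate, handler in self.compiled:
                if predicate(x):
                    return handler(x)
            raise ValueError("no case matched")
        # "Interpreted" mode: source order, with profiling.
        for name, predicate, handler in self.cases:
            if predicate(x):
                self.hits[name] += 1
                self.calls += 1
                if self.calls >= HOT_THRESHOLD:
                    self._compile()
                return handler(x)
        raise ValueError("no case matched")

    def _compile(self):
        # Stable sort: most frequently taken case first, ties keep source order.
        self.compiled = sorted(self.cases, key=lambda case: -self.hits[case[0]])
```

The sketch also shows the weakness mentioned above: the ordering is fixed from the first HOT_THRESHOLD calls, so if those calls are unrepresentative of the rest of the data, the "optimized" order is the wrong one until something triggers a recompile.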
If you can deliver compiled machine code, that's going to be faster to run than delivering bytecode and dynamically compiling (JIT) at runtime, yes. But if the system demands delivering bytecode, then JIT is faster than a bytecode-interpreting virtual machine.
As I understand it, in the Java case the deliverable is usually bytecode that could be deployed onto any architecture, relying on the virtual machine (whether interpreter or JIT compiler) to run it on the actual deployed system.
The Lisp case is a little different. The language requires (and some applications use) the ability to create new functions at runtime. For instance, the user might enter some parameters or expressions that the application would then build into a function. A dynamic compilation environment means that dynamically created functions get compiled to machine code and optimized as they are created -- a major improvement over running dynamically entered code in an interpreter! --FOo 03:30, 26 September 2005 (UTC)
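The Lisp scenario above can be imitated in Python, with the caveat that CPython's built-in compile() produces bytecode rather than machine code, so this only sketches the "build a function from user input at runtime" part. The helper name make_function and its parameters are hypothetical.

```python
def make_function(expression, params, constants):
    """Build a callable from an expression entered at runtime.

    CPython compiles the text to bytecode here, not to machine code.
    Evaluating untrusted input like this is unsafe; the emptied
    __builtins__ is a convenience for the sketch, not a sandbox.
    """
    source = f"lambda {', '.join(params)}: {expression}"
    code = compile(source, "<user-input>", "eval")
    namespace = {"__builtins__": {}, **constants}
    return eval(code, namespace)


# The user supplies the parameters a and b and the expression text at runtime.
line = make_function("a * x + b", ["x"], {"a": 2, "b": 1})
```

The expression is parsed and compiled once, when the function is created; each later call to line(x) reuses the compiled code object instead of re-reading the text.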
It may well even be faster than machine code. See HP's Dynamo for details -- it can run PA-RISC programs on PA-RISC hardware faster than running them natively, sometimes by 20% or more. Basically, the set of optimizations you can do at compile-time and the set of optimizations you can do at run-time are quite different. At runtime, you could inline a shared library call, for example. It's really the same reason a JIT runs bytecode faster, except in this case, you have the overhead of an interpreter for non-hot-spots; as long as the performance gain from the optimizations (which accounts for most of the running time) outweighs the overhead of the interpreter, you win.
That's a slightly misleading reading. Dynamo (and, by extension, JIT compilation) is only faster if the libraries are not statically linked. It may waste disk space, but there's (usually) nothing making it "impossible" to statically link every library your code needs, and inline the routines. Obviously, a traditional compiler will then be able to provide far superior optimization of that code. I of course admit that dynamic linking is very pragmatic, elegant and impresses the chix0rs, but if you're comparing performance in the general case, it's really not fair to only consider the dynamically-linked case (where JIT wins, almost by definition). Incidentally, contrast JIT compilation with MetaOcaml (http://en.wikipedia.org/wiki/Objective_Caml#MetaOCaml) for an example of a statically-linked paradigm that provides performance JIT simply can't touch (runtime compilation--subtly different than JIT compilation). It's late. Sorry this is so disjointed. 72.227.165.191 (talk) 09:52, 15 November 2009 (UTC)
This article is terrible! I think what is needed is a side-by-side (a table, perhaps) comparison of the methods for traditional compiled code, traditional interpreted code, and JIT code. General statements should be made about what kinds of translation and optimizations typically happen at each stage (including at least the following: source code -> bytecode -> machine code) in the sequence, and whether the stage is performed at compile-time or run-time. Then, I won't be so frustrated trying to understand all the loosely-connected comparisons being made in this article. —Preceding unsigned comment added by 70.247.165.213 (talk) 19:55, 30 August 2008 (UTC)
Maybe some notes should be made about the acronym JTL (Just Too Late). For example http://www.thescripts.com/forum/thread16854.html —Preceding unsigned comment added by Doekman (talk • contribs) 12:27, 16 October 2007 (UTC)
Not unless somebody has a reliable source for this usage. I had a quick look, and it didn't seem particularly common, and mainly seems to have been used by one guy in various forums. WolfKeeper 00:30, 31 October 2007 (UTC)

I personally would like to see more information about where JIT is not faster than static compilation. Yes, the dynamic nature means that it's faster in some cases - the key words being some cases. Not all - some. It misses a lot of potential optimizations that may take a long time to complete, since it has to provide functional code quickly. So in many cases it can be slower as well. This leads to the "slow startup" seen in a lot of Java and .NET applications. Even after they're running, they can seem slow while the code is still being optimized and not yet in a completely optimized state. Eventually, yes, they can approach compiled speeds - but they take time to do so. Arguably too much time in some cases, as Sun found out with Java. It's amazing how this issue is nearly completely ignored in this article.  —CobraA1 21:11, 17 December 2008 (UTC)

Memory Impact

Something else that bugs me about this article: there is no mention about how memory is managed in order to accomplish the task of quickly translating all these instructions, and keeping the (essential) cached snippets handy. Are there entire libraries of source code and/or bytecode sitting in memory constantly, along with compiled snippets?

Why don't you find out, and add it to the article?- (User) WolfKeeper (Talk) 20:09, 30 August 2008 (UTC)
Because I was trying to find out, and came here for clarification. 70.247.170.9 (talk) 00:49, 19 January 2009 (UTC)
The answer is yes. The 'source code' (by which I include byte code if it's a byte coded system) is kept in memory and can sometimes be interpreted without compilation, along with zero or more code fragments associated with that code, each of which may be optimised for the particular or general circumstances under which that code may be run.- (User) Wolfkeeper (Talk) 18:51, 8 February 2009 (UTC)

Restructure

I plan to do some work to restructure this article into something clearer for readers unfamiliar with JITs; I only partially understand. I will start by putting together a table here, to clarify what I see in the article. Would anyone here be offended if I do this? Any thoughts? 70.251.243.16 (talk) 03:30, 8 February 2009 (UTC)

All Wikipedia articles are written for people unfamiliar with the subject. That is why you should use the word Just-in-time instead of JIT, in every paragraph. Just as in the Work Breakdown Structure (WBS) article. -- Marcel Douwe Dekker (talk) 15:09, 8 February 2009 (UTC)
You are welcome to suggest that on your own. I don't really see how it relates to what I've said about restructuring the article, though. 70.251.241.131 (talk) 00:11, 8 March 2009 (UTC)

More detailed info?

I was reading Compiler optimization and wishing that it included methods used specifically when JIT-compiling. They don't seem to be there, though it does list JIT itself. So I came to this page hoping to find similar information applying just to JIT, i.e. a detailed list of every strategy used, but I don't really see it. It would be nice to see this page take on more of a form like that other one. Inhahe (talk) 13:52, 8 January 2010 (UTC)

Global optimization - wrong statement

I removed the following incorrect statement from the page (it was slightly different; I had already fixed some uncontroversial details): (Context: The system can do global code optimizations (e.g. inlining of library functions) without losing the advantages of dynamic linking and without the overheads inherent to static compilers and linkers.) Specifically, when doing global inline substitutions, a static compiler must insert run-time checks and ensure that a virtual call would occur if the actual class of the object overrides the inlined method (however, this need not be the case for languages which do not provide inheritance). The point is that such checks always have to be inserted, by both static and dynamic compilers. There is an exception, i.e. a particular case where the text is true (if you have a class without subclasses, but the latter is known only at runtime because the class is not final); however, this is neither general nor relevant.

I think the text should refer to advantages of dynamic linking such as the possibility to introduce new code at runtime - inlining is better done at runtime because of this, and because of the need for up-to-date profiling data. --Blaisorblade (talk) 23:48, 11 November 2010 (UTC)

I'm sorry, but you don't understand. The runtime system is free to flatten out calls and thus duplicate the library function multiple times, and thus in common cases it can safely create code that is able to avoid running the checks at all (or more precisely evaluate them at compile time, and/or find whole categories of cases where it can just move them out of a loop, or even add checks at strategic places that prevents them optimistically being needed in the middle of loops, and forcing recompilation only *if* it turns out that they are needed after all). A static compiler really can't do that, because it doesn't have the global knowledge of how functions are being used, it doesn't have access to the class/type/array bounds information where there is inheritance, but the runtime system does. Rememberway (talk) 00:11, 12 November 2010 (UTC)
Just because something is not general, doesn't mean it's not relevant. A non general trick can nevertheless save an awful lot of processing time, provided it is gated to detect when it can't be used. Rememberway (talk) 00:11, 12 November 2010 (UTC)
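The "gated trick" being argued about can be sketched as a guarded call site: speculate on the receiver type seen so far, run a cheap guard, and throw the speculation away if the guard ever fails. Everything below (the Shape classes, CallSite, the deopts counter) is hypothetical illustration in Python, not code from any real runtime, and Python's own dispatch means there is no actual speedup here.

```python
class Shape:
    def area(self):
        return 0.0


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.0 * self.radius * self.radius  # crude pi, good enough for a sketch


class CallSite:
    """One call site of obj.area() with a single speculated receiver type."""

    def __init__(self):
        self.fast_type = None   # the only receiver class observed so far
        self.fast_body = None   # method body bound directly, standing in for inlined code
        self.deopts = 0         # how many times the speculation was discarded

    def call_area(self, obj):
        if self.fast_type is not None:
            if type(obj) is self.fast_type:      # the guard
                return self.fast_body(obj)       # fast path, no virtual lookup
            # Guard failed: drop the speculation and fall back to a normal call.
            self.fast_type = self.fast_body = None
            self.deopts += 1
            return obj.area()
        result = obj.area()                      # generic virtual call
        self.fast_type = type(obj)               # speculate on what was just seen
        self.fast_body = type(obj).area
        return result
```

A real compiler would go further than this, for example hoisting the guard out of a loop as described above. The sketch only shows why the fast path stays correct: a receiver of any other class fails the guard and takes the ordinary call.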

Intro is wrong

As it stands, the current intro is both factually and logically wrong. Under all possible conditions (except a few cases where the program is immense and the memory is low), a precompiled program runs faster than a program that is compiled during the runtime, since the compilations steal execution power from the non-compiling program execution.

The intent of the article is that VM code runs slower than compiled code, and therefore that compilation from VM code to machine code improves the execution speed. Compiling from VM code to machine code could be performed before the program execution, or in the loading phase of the program, or as in JIT-compilation: during the execution. Whenever it is performed it requires execution power. The previously normal strategy was to either run VM by interpretation, which is slow, or to compile to machine code in the loading phase, which makes loading inconveniently slow. JIT compilation is faster than VM interpretation and removes the time delay when loading the program. Rursus dixit. (mbork3!) 07:55, 31 July 2012 (UTC)

I rewrote. More improvements could be done, but I think the second and third paragraphs (that I didn't modify) were always correct. Rursus dixit. (mbork3!) 08:25, 31 July 2012 (UTC)

How is JIT Compilation achieved with modern code/data separation?

Modern code/data separation (execute bit, etc) forbids data to be executed, and code sections to be written to. How does the JIT Compiler (say HotSpot) cope with this? I.e. it encounters a new section of bytecode which it hadn't already translated to machine code: I understand the JIT Compiler can create the machine code as data, but how does it start executing it? Or does it disable the strict code/data border security? If JIT Compilation systems disable the system's code/data separation, could any arbitrary code execution in the bytecode start executing any machine code, contrary to pure (but slower) interpreters? — Preceding unsigned comment added by 83.134.157.58 (talk) 18:47, 5 August 2012 (UTC)

(A few weeks later now) I found the answer confirming my suspicion: JIT compilation is in direct conflict with processor/OS level code/data separation! See for example JIT and PaX. I hereby propose a new section in the article discussing the security implication, the gist of which is: to compile to native code and run it, one must be able both to write into a memory region and to execute that memory region, which modern data/code separation forbids. Either data/code separation must be disabled (perhaps only during JIT-compilation/execution), such that the JIT-compilation presents a security risk, or JIT compilation is impossible (the processor/OS will deny writing the execute section, or deny executing the write section). Somebody who is WP:BOLD enough feel free to try and insert. — Preceding unsigned comment added by 83.134.160.197 (talk) 17:03, 23 August 2012 (UTC)
Most operating systems with DEP provide an API to flip a particular memory area from writable and not executable to executable and not writable. The program loader uses this API when loading a program, and programs using JIT, such as Flash, Java, and JavaScript virtual machines, have to use this API as well. Apple iOS, however, forbids applications in the App Store from doing this. --Damian Yerrick (talk | stalk) 17:26, 23 August 2012 (UTC)
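On POSIX systems the API in question is mprotect. The sketch below is a Python illustration under stated assumptions: a Linux-like system where ctypes.CDLL(None) exposes libc and the process is allowed to create executable mappings. The function name is hypothetical, and the bytes written are never actually executed.

```python
import ctypes
import mmap


def write_then_make_executable(code: bytes) -> bool:
    """Write bytes into a writable, non-executable page, then flip the page
    to readable and executable (no longer writable).

    Returns True if the OS accepted the protection change.
    Assumes a POSIX system whose libc exports mprotect.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mprotect.argtypes = (ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int)
    libc.mprotect.restype = ctypes.c_int

    pages = max(1, -(-len(code) // mmap.PAGESIZE))  # round up to whole pages
    size = pages * mmap.PAGESIZE

    # Step 1: anonymous mapping that is writable but NOT executable.
    buf = mmap.mmap(-1, size,
                    flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE)
    try:
        buf.write(code)  # the "JIT output" goes in while the page is writable

        # Find the mapping's address; release the ctypes view right away so
        # the mmap object can be closed later.
        view = ctypes.c_char.from_buffer(buf)
        address = ctypes.addressof(view)
        del view

        # Step 2: flip to executable and NOT writable.
        rc = libc.mprotect(address, size, mmap.PROT_READ | mmap.PROT_EXEC)
        return rc == 0
    finally:
        buf.close()
```

At no point is the page both writable and executable, which is the property DEP-style policies care about. A stricter policy of the kind PaX can enforce may refuse the second step, in which case the function returns False.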
You are correct, but this does not contradict what I say. Whether knowingly (to the end user) or not, there is a DEP security risk implication, which can be avoided at the cost of forgoing JIT compilation, or you can have JIT compilation at the cost of security risk... I suggest the section gives examples as you give for iOS/App Store, for example Hardened Gentoo depending on USE flag, or ... DEP remains orthogonal to JIT compilation (DEP still running on other parts of the system does not mean the JIT compiling process will behave nicely, intentionally or not) — Preceding unsigned comment added by 213.49.89.63 (talk) 20:24, 25 August 2012 (UTC)
The difference is that JIT engines are more like kernels or system libraries than like ordinary applications. Like kernels and system libraries, JIT engines act as a program loader, and at least free JIT engines tend to draw a lot more eyeballs than other applications have. There is a security risk in any virtual machine implementation because a virtual machine has to do a lot of the things a kernel does. But there is a similar security risk in the kernel itself, and outside of certain specialized fields where formal verification is common, most people don't avoid using a kernel just because it has unknown risks. --Damian Yerrick (talk | stalk) 15:38, 13 September 2012 (UTC)

Added note at top of page

Some people may expect "Dynamic translation" to bring up the Dynamic equivalence article (I know I just did)... AnonMoos (talk) 20:45, 23 December 2012 (UTC)

Bytecode isn't a necessary step for JIT

The article insists heavily on "bytecode" being a starting point for JIT, but that's incorrect: it's possible to perform JIT compilation straight from an abstract syntax tree, a register machine or a stack machine. 217.128.255.181 (talk) 10:48, 9 April 2013 (UTC)
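As a small illustration of going straight from a tree to executable form with no bytecode step, here is a Python sketch. The tuple-based tree format and the name compile_node are hypothetical, and nested closures stand in for the machine code a real JIT would emit.

```python
def compile_node(node):
    """Translate one tree node into a closure that takes an environment dict.

    The tree is walked once, at translation time. Running the result calls
    the closures directly and never inspects the tree again.
    """
    kind = node[0]
    if kind == "const":
        value = node[1]
        return lambda env: value
    if kind == "var":
        name = node[1]
        return lambda env: env[name]
    if kind in ("add", "mul"):
        left = compile_node(node[1])
        right = compile_node(node[2])
        if kind == "add":
            return lambda env: left(env) + right(env)
        return lambda env: left(env) * right(env)
    raise ValueError(f"unknown node kind: {kind!r}")


# Tree for (x + 2) * 3
TREE = ("mul", ("add", ("var", "x"), ("const", 2)), ("const", 3))
compiled = compile_node(TREE)
```

The dispatch on node kind happens once in compile_node; calling compiled({"x": 4}) afterwards only runs the closures that were selected for this particular tree.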

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on Just-in-time compilation. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

Archived sources still need to be checked

Cheers. —cyberbot II (Talk to my owner: Online) 10:20, 17 October 2015 (UTC)

External links modified

Hello fellow Wikipedians,

I have just modified one external link on Just-in-time compilation. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

Archived sources still need to be checked

Cheers. —cyberbot II (Talk to my owner: Online) 03:35, 2 May 2016 (UTC)