Source-to-source compiler: Difference between revisions

From Wikipedia, the free encyclopedia

Revision as of 12:18, 11 February 2015

A source-to-source compiler, transcompiler, or transpiler is a type of compiler that takes the source code of a program written in one programming language as its input and produces equivalent source code in another programming language. A source-to-source compiler translates between programming languages that operate at approximately the same level of abstraction, while a traditional compiler translates from a higher-level programming language to a lower-level one. For example, a source-to-source compiler may translate a program from Pascal to C. An automatic parallelizing compiler will frequently take a high-level language program as input, then transform the code and annotate it with parallel code annotations (e.g., OpenMP) or language constructs (e.g., Fortran's forall statements).[1]
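The idea of translating between languages at a similar level of abstraction can be sketched with a toy example. The following is only an illustration, not how any particular transcompiler works: it assumes a hypothetical prefix-notation input language (for instance `(+ 1 (* x 2))`) and emits a fully parenthesized infix expression of the kind accepted by C-like languages. The function names `tokenize`, `parse`, `emit`, and `translate` are invented for this sketch.

```python
def tokenize(src):
    """Split a prefix expression into parentheses and atoms."""
    return src.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens):
    """Build a nested list (a tiny syntax tree) from the token list."""
    tok = tokens.pop(0)
    if tok == "(":
        node = []
        while tokens[0] != ")":
            node.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return node
    return tok  # an atom: a number or an identifier


def emit(node):
    """Emit infix source text for a syntax tree produced by parse()."""
    if isinstance(node, str):
        return node
    op, *args = node
    return "(" + f" {op} ".join(emit(arg) for arg in args) + ")"


def translate(src):
    """Translate one prefix expression into an infix expression."""
    return emit(parse(tokenize(src)))


if __name__ == "__main__":
    print(translate("(+ 1 (* x 2))"))  # prints (1 + (x * 2))
```

Both the input and the output remain human-readable source text; only the surface syntax changes, which is what distinguishes this kind of translation from compiling to machine code.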

Another purpose of source-to-source compiling is translating legacy code to use the next version of the underlying programming language or an API that breaks backward compatibility. Here the transcompiler performs automatic code refactoring, which is useful when the programs to refactor are outside the control of the original implementer (for example, converting programs from Python 2 to Python 3, or converting programs from an old API to the new API) or when the size of the program makes it impractical or time-consuming to refactor by hand.
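An automated API migration of this kind can be sketched with Python's standard `ast` module. This is a minimal illustration under stated assumptions, not the method used by any specific tool: the function names `old_open` and `new_open` are hypothetical, the helper names `RenameCall` and `migrate` are invented here, and `ast.unparse` requires Python 3.9 or later. Note that regenerating code from the syntax tree discards comments and original formatting, which production refactoring tools typically try to preserve.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to one plain function name into calls to another."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Call(self, node):
        self.generic_visit(node)  # also rewrite calls nested in arguments
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            new_func = ast.Name(id=self.new, ctx=ast.Load())
            node.func = ast.copy_location(new_func, node.func)
        return node


def migrate(source, old, new):
    """Return source with every call to old() replaced by a call to new()."""
    tree = ast.parse(source)
    tree = RenameCall(old, new).visit(tree)
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


if __name__ == "__main__":
    legacy = "data = old_open(path)\nprint(old_open)"
    print(migrate(legacy, "old_open", "new_open"))
```

Only call sites are rewritten; a bare reference such as `print(old_open)` is left alone, which shows why automatically migrated code can still need manual review.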

Transcompilers may either keep translated code as close to the source code as possible to ease development and debugging of the original source code, or change the structure of the original code so much that the translated code no longer resembles the source code.[2] There are also debugging utilities that map the transpiled source code back to the original code; for example, JavaScript source maps allow the JavaScript code executed by a web browser to be mapped back to the original source in a transpiled-to-JavaScript language.[3]
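The principle behind mapping generated code back to its source can be sketched as follows. This is a simplified illustration, not the JavaScript source map format, which records positions at column granularity in a compact encoding. Here the "translation" is a placeholder that merely drops blank lines and `#` comment lines, and the names `translate_with_map` and `original_line` are invented for this sketch.

```python
def translate_with_map(source):
    """Return (generated_text, line_map).

    line_map[i] holds the 1-based original line number from which
    generated line i + 1 was produced.
    """
    out_lines = []
    line_map = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # placeholder "translation": drop blanks and comments
        out_lines.append(stripped)
        line_map.append(lineno)
    return "\n".join(out_lines), line_map


def original_line(line_map, generated_line):
    """Map a 1-based generated line number back to the original line."""
    return line_map[generated_line - 1]


if __name__ == "__main__":
    src = "# header\n\na = 1\n# note\nb = 2\n"
    generated, line_map = translate_with_map(src)
    print(generated)
    print(original_line(line_map, 2))  # line 2 of the output came from line 5
```

A debugger or error reporter that knows a failure occurred on a given generated line can use such a table to point the developer at the corresponding line of the code they actually wrote.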

History

One of the earliest programs of this kind was Digital Research's XLT86 in 1981, a program written by Gary Kildall, which translated .ASM source code for the Intel 8080 processor into .A86 source code for the Intel 8086. Using global data flow analysis on 8080 register usage, the translator would also optimize the output for code size and take care of calling conventions, so that CP/M-80 and MP/M-80 programs could be ported to the CP/M-86 and MP/M-86 platforms automatically. XLT86 itself was written in PL/I-80 and was available for CP/M-80 platforms as well as for DEC VMS (for VAX 11/750 or 11/780).[4]

A similar but much less sophisticated program was TRANS.COM, written by Tim Paterson in 1980 as part of 86-DOS. It could translate some Z80 assembly source code into .ASM source code for the 8086, but supported only a subset of opcodes, registers and modes, often still requiring significant manual correction and rework afterwards. It also did not carry out any register or jump optimizations.[5][6]

Programming language implementation

The first implementations of some programming languages started as transcompilers, and the default implementations of some of those languages are still transcompilers. In addition to the table below, a CoffeeScript maintainer provides a list of languages that compile to JavaScript.[7]

Porting a codebase

When developers want to switch to a different language while retaining most of an existing codebase, using a transcompiler may be preferable to rewriting the whole software by hand. In this case, the translated code often needs manual correction because the automated translation might not work in all cases.

Examples

DMS Software Reengineering Toolkit

DMS Software Reengineering Toolkit is a source-to-source program transformation tool, parameterized by explicit source and target computer language definitions (which may be the same). It can be used for translating from one computer language to another, for compiling domain-specific languages to a general-purpose language, or for carrying out optimizations or massive modifications within a specific language. DMS has a library of language definitions for most widely used computer languages (including full C++) and a means for defining other languages that it does not yet support.

LLVM

LLVM can translate from any language supported by gcc 4.2.1 (Ada, C, C++, Fortran, Java, Objective-C, or Objective-C++) or by clang to any of C, C++, or MSIL by way of the -march option of llc, applied to LLVM bitcode emitted by the front end:

% llvm-g++ -emit-llvm x.cpp -o program.bc -c
% llc -march=c program.bc -o x.c
% cc x.c -lstdc++

% llvm-g++ -emit-llvm x.cpp -o program.bc -c
% llc -march=msil program.bc -o program.msil

Translation to C was removed from LLVM in version 3.1. It had numerous problems, to the point of being unable to compile any nontrivial program.[26]

Emscripten

Emscripten is a C/C++/LLVM-to-JavaScript source-to-source compiler that converts applications written to run natively on Linux so that they run as JavaScript in a web page.

Example of using the Emscripten C compiler:

% emcc helloworld.c -o helloworld.html

Example of using a makefile with Emscripten:

% emmake make

Example of using a configure script with Emscripten:

% emconfigure ./configure

Emscripten can compile most large, system-independent applications with almost no modifications to the source code. Examples include the HTML5 Epic Citadel demo,[27] the BananaBread demo,[28] and ammo.js.[29]

Refactoring tools

Refactoring tools automate the transformation of source code into another form:

  • Python's 2to3 tool transforms non-forward-compatible Python 2 code into Python 3 code.
  • Qt's qt3to4 tool converts non-forward-compatible usage of the Qt3 API into Qt4 API usage.
  • Coccinelle uses semantic patches to describe the refactoring to apply to C code. It has been applied successfully to refactor the drivers of the Linux kernel in response to kernel API changes.[30]
  • pfff can do the same things as Coccinelle, but targets more languages. It also features a bug finder and offers structural searches and source code visualization. There is good support for C, Java, PHP, JavaScript, HTML, and CSS, and preliminary support for many more languages.[31]
  • RefactoringNG is a NetBeans module for refactoring Java code, in which transformation rules are written against a program's abstract syntax tree.

References

  1. ^ "Types of compilers". compilers.net. 1997–2005. Retrieved 28 October 2010.
  2. ^ Fowler, Martin (February 12, 2013). "Transparent Compilation". Retrieved February 13, 2013.
  3. ^ Seddon, Ryan (21 March 2012). "Introduction to JavaScript Source Maps". html5rocks.com. Retrieved 21 January 2015.
  4. ^ Digital Research (1981): XLT86 - 8080 to 8086 Assembly Language Translator - User's Guide. Digital Research Inc, Pacific Grove ([1]).
  5. ^ Seattle Computer Products (1980): 86-DOS - Disk Operating System for the 8086. User's manual, version 0.3 - Preliminary. Seattle Computer Products, Seattle ([2]).
  6. ^ Paterson, Tim (2013-12-19) [1982]. "Microsoft DOS V1.1 and V2.0: Z80 to 8086 Translator version 2.21 /msdos/v11source/TRANS.ASM". Computer History Museum, Microsoft. Retrieved 2014-03-25. (NB. While the publishers claim this would be MS-DOS 1.1 and 2.0, it actually is SCP MS-DOS 1.25 and TeleVideo PC DOS 2.11.)
  7. ^ "List of languages that compile to JS". Retrieved December 15, 2014.
  8. ^ "IntelLabs/julia". GitHub.
  9. ^ "Google Groups". google.com.
  10. ^ "6to5 turns ES6+ code into vanilla ES5, so you can use next generation features today". 6to5.org. Retrieved 2015-01-26.
  11. ^ "Traceur is a JavaScript.next-to-JavaScript-of-today compiler". github.com. Retrieved 2014-07-02.
  12. ^ "Script# by nikhilk". Scriptsharp.com. Retrieved 2013-08-02.
  13. ^ "Smart Mobile Studio". SmartMobileStudio.com. Retrieved 2014-03-09.
  14. ^ "ktg / ParenJS — Bitbucket". Bitbucket.org. Retrieved 2014-07-08.
  15. ^ "efene programming language". Marianoguerra.com.ar. Retrieved 2014-07-08.
  16. ^ "Xtend, modernized Java". Eclipse project. Retrieved 2014-10-01.
  17. ^ Maptastic Maple (3.3.9). "Sass: Syntactically Awesome Style Sheets". Sass-lang.com. Retrieved 2014-07-08.
  18. ^ "ktg / L++ — Bitbucket". Bitbucket.org. Retrieved 2014-07-08.
  19. ^ "j2objc - A Java to iOS Objective-C translation tool and runtime. - Google Project Hosting". Code.google.com. 2012-10-15. Retrieved 2013-08-02.
  20. ^ Peter van Eerten. "BaCon - A free BAsic CONverter for Unix, BSD and MacOSX". Basic-converter.org. Retrieved 2014-07-08.
  21. ^ "Shed Skin, An experimental (restricted-Python)-to-C++ compiler". Retrieved 2014-10-01.
  22. ^ "java2c-transcompiler - A simple source-to-source from Java to C - Google Project Hosting". Retrieved 8 October 2014.
  23. ^ "Js_of_ocaml". Retrieved 8 October 2014.
  24. ^ "J2Eif Research Page - Chair of Software Engineering". Se.inf.ethz.ch. doi:10.1007/978-3-642-21952-8_4. Retrieved 2014-07-08.
  25. ^ "C2Eif Research Page - Chair of Software Engineering". Se.inf.ethz.ch. Retrieved 2014-07-08.
  26. ^ "LLVM 3.1 Release Notes". llvm.org.
  27. ^ Epic Games; Mozilla. "HTML5 Epic Citadel".
  28. ^ Mozilla. "BananaBread Demo".
  29. ^ Alon Zakai. "ammo.js".
  30. ^ Valerie Henson (January 20, 2009). "Semantic patching with Coccinelle". lwn.net. Retrieved 28 October 2010.
  31. ^ pfff project. "Welcome to the pfff Wiki!". github.com. Retrieved 2014-10-01.