In computing, an executable file or executable program, or sometimes simply an executable, causes a computer "to perform indicated tasks according to encoded instructions," as opposed to a data file that must be parsed by a program to be meaningful. These instructions are traditionally machine code instructions for a physical CPU. However, in a more general sense, a file containing instructions (such as bytecode) for a software interpreter may also be considered executable; even a scripting language source file may therefore be considered executable in this sense. The exact interpretation depends upon the use; while the term often refers only to machine code files, in the context of protection against computer viruses all files which cause potentially hazardous instruction execution, including scripts, are lumped together for convenience.
The term executable code describes sequences of executable instructions that do not necessarily constitute an executable file, for example sections within a program.
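As a small illustration of executable code that is not an executable file, the following sketch uses Python's built-in compile and exec functions to turn a source string into bytecode and run it in a software interpreter. The source string and variable names are hypothetical, chosen only for this example.

```python
# Compile a source string into a code object holding CPython bytecode.
source = "result = 6 * 7"  # hypothetical example program
code = compile(source, "<example>", "exec")

# The bytecode is a plain byte string: instructions for the interpreter,
# not machine code for a physical CPU.
bytecode = code.co_code

# Execute the code object in an isolated namespace.
namespace = {}
exec(code, namespace)
print(namespace["result"])  # prints 42
```

The code object exists only in memory here, yet it is executable in the general sense described above.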
Generation of executable files
While an executable file can be hand-coded in machine language, it is far more usual to develop software as source code in a high-level language easily understood by humans or, in some cases, in an assembly language, which is harder for humans to read but corresponds more closely to machine code instructions. High-level source code is compiled into either an executable machine code file or a non-executable machine code object file of some sort; the equivalent process on assembly language source code is called assembly. Several object files are linked to create the executable. Object files, executable or not, are typically stored in a container format, such as Executable and Linkable Format (ELF). The format gives structure to the generated machine code, for example by dividing it into sections such as .text (executable code), .data (initialized static variables), and .rodata (static constants).
To be executed by a system (such as an operating system, firmware, or boot loader), an executable file must conform to the system's application binary interface (ABI). In the simplest case, a file is executed by loading it into memory, jumping to the start of the address space, and executing from there; in more complicated interfaces, executable files carry additional metadata that specifies a separate entry point. For example, in ELF, the entry point is specified in the header's e_entry field, which gives the (virtual) memory address at which to start execution. In the GNU Compiler Collection this field is set by the linker based on the
Executable files typically also include a runtime system, which implements runtime language features (such as task scheduling, exception handling, and calling static constructors and destructors) and interactions with the operating system, notably passing arguments and the environment and returning an exit status, together with other startup and shutdown tasks such as releasing resources like file handles. For C, this is done by linking in the crt0 object, which contains the actual entry point and performs setup and shutdown by calling the runtime library.
Executable files thus normally contain significant additional machine code beyond that directly generated from the specific source code. In some cases it is desirable to omit this, for example for embedded systems development or simply to understand how compilation, linking, and loading work. In C this can be done by omitting the usual runtime and instead explicitly specifying a linker script, which generates the entry point and handles startup and shutdown, such as calling main to start and returning the exit status to the kernel at the end.
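The ELF entry point field discussed above can be read directly from the file header. The following is a minimal sketch, not a full ELF parser: it assumes the standard header layout, in which the 16-byte identification block is followed by a 2-byte type, a 2-byte machine, and a 4-byte version field, so that e_entry begins at byte offset 24. The function names read_elf_entry and read_entry_from_file are hypothetical.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_elf_entry(header: bytes) -> int:
    """Return the e_entry address from the leading bytes of an ELF file."""
    if len(header) < 32 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF header")
    elf_class = header[4]  # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    encoding = header[5]   # EI_DATA: 1 = little-endian, 2 = big-endian
    if elf_class not in (1, 2) or encoding not in (1, 2):
        raise ValueError("unsupported ELF class or data encoding")
    byte_order = "<" if encoding == 1 else ">"
    # e_entry is 4 bytes wide in 32-bit files and 8 bytes in 64-bit files.
    entry_format = "I" if elf_class == 1 else "Q"
    (entry,) = struct.unpack_from(byte_order + entry_format, header, 24)
    return entry


def read_entry_from_file(path: str) -> int:
    """Read enough of the file at 'path' to extract its entry point."""
    with open(path, "rb") as f:
        return read_elf_entry(f.read(64))
```

On a system that uses ELF, calling read_entry_from_file on a compiled program should return the virtual address at which execution starts; the exact value depends on the toolchain and on how the file was linked.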
The same source code can in general be compiled to run under different computer architectures and operating systems. Sometimes this requires no changes to the source code: the compiler simply outputs different machine code (targeting a different instruction set) and the result is linked to a different runtime (because of operating system differences). In other cases the source code must be changed, using either compile-time changes (conditional compilation) or run-time changes (checking the environment at run time). Conversion of existing source code for a different platform is called porting.
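The run-time approach can be sketched in a few lines. This example assumes only that sys.platform begins with "win" on Windows, which holds for CPython; the function name executable_name is hypothetical, and the example covers just one platform difference, the conventional file name of a program.

```python
import sys


def executable_name(base: str, platform: str = sys.platform) -> str:
    """Return the conventional file name of a program on the given platform."""
    # Windows identifies executables by the .exe extension; Unix-like
    # systems rely on a permission bit, so the name stays unchanged.
    if platform.startswith("win"):
        return base + ".exe"
    return base


print(executable_name("tool"))  # result depends on the current platform
```

The same source runs unchanged on every platform and adapts when it is executed, whereas conditional compilation would select one branch when the program is built.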
Interaction with computing platforms
An executable comprises machine code for a particular processor or family of processors. Machine code instructions differ between processor families, so an executable built for one family will not run natively on another. Within a family, processors may be backwards compatible; for example, a 2014 x86-64 family processor can execute most code for x86 family processors from 1978, but the converse is not true.
Some dependence on particular hardware, such as a particular graphics card, may be coded into the executable. For executable programs designed to run on a variety of hardware, it is usual to remove such dependencies as far as possible and instead install hardware-dependent device drivers on the computer, with which the program interacts in a standardised way.
Some operating systems designate executable files by filename extension (such as .exe), while others record the designation in the file's metadata (such as the "execute" permission in Unix-like operating systems). Most also check that the file has a valid executable file format, to safeguard against random bit sequences inadvertently being run as instructions. Modern operating systems retain control over the computer's resources, requiring that individual programs make system calls to access privileged resources. Since each operating system family features its own system call architecture, executable files are generally tied to specific operating systems, or families of operating systems.
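Both kinds of check can be sketched in user space. The example below is illustrative only and is not how any particular operating system implements its loader: it inspects the leading "magic" bytes of a file for a few well-known formats and tests the Unix execute permission bits. The table covers only a small subset of formats, and the names MAGIC_NUMBERS, sniff_format, and has_execute_permission are hypothetical.

```python
import os
import stat

# Leading bytes of a few common executable formats (an incomplete list).
MAGIC_NUMBERS = {
    b"\x7fELF": "ELF",
    b"MZ": "DOS/PE",
    b"\xcf\xfa\xed\xfe": "Mach-O (64-bit, little-endian)",
    b"#!": "script with interpreter line",
}


def sniff_format(header: bytes):
    """Return a format name if the header starts with a known magic number."""
    for magic, name in MAGIC_NUMBERS.items():
        if header.startswith(magic):
            return name
    return None


def has_execute_permission(path: str) -> bool:
    """Report whether any execute bit is set in the file's mode."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))
```

A file whose first bytes match none of the known signatures yields None, which corresponds to the case of random bit sequences that a loader would refuse to run.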
Many tools make executable files built for one operating system work on another by implementing a similar or compatible application binary interface. An example is Wine, which implements a Win32-compatible library for x86 processors. In other cases, multiple executables for different targets are packaged together in a fat binary.
When the binary interface of the hardware for which an executable was compiled differs from that of the hardware on which it is run, a program that translates between the two is called an emulator. Files that can be executed but do not conform to a specific hardware binary interface or instruction set can be represented either as bytecode, for just-in-time compilation, or as source code for a scripting language (see Shebang (Unix)).
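The interpreter line referred to above can be parsed with a short function. This is a simplified sketch: it assumes that the text after the interpreter path is treated as a single optional argument, which is the common behaviour on Linux but varies between systems. The function name parse_shebang is hypothetical.

```python
def parse_shebang(first_line: bytes):
    """Split a '#!' line into (interpreter, optional argument).

    Returns None if the line is not an interpreter line.
    """
    if not first_line.startswith(b"#!"):
        return None
    rest = first_line[2:].strip().decode("utf-8", errors="replace")
    if not rest:
        return None
    # Everything after the first space is kept as one argument (assumption).
    interpreter, _, argument = rest.partition(" ")
    return interpreter, (argument.strip() or None)


print(parse_shebang(b"#!/usr/bin/env python3\n"))  # ('/usr/bin/env', 'python3')
```

A script beginning with such a line is handed to the named interpreter rather than being run as machine code, which is why it can be treated as executable without targeting any instruction set.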