Entry point

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Skrike (talk | contribs) at 11:08, 4 September 2018 (Haskell: Fix broken link (unfortunately this link is now tied to a specific version of the base library).). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer programming, an entry point is where control is transferred from the operating system to a computer program: the place at which the processor enters a program or a code fragment and execution begins. In some operating systems or programming languages, the initial entry is not part of the program but of the runtime library, in which case the runtime library initializes the program and then enters it. In other cases, the program may call the runtime library before doing anything else when it is first entered, and the actual code of the program begins to execute after the runtime library returns. This marks the transition from load time (and dynamic link time, if present) to run time.

In simple layouts, programs begin execution at the beginning, which is common in scripting languages, simple binary executable formats, and boot loaders. In other cases, the entry point is at some other fixed point: a memory address that can be either absolute or relative (an offset).

Alternatively, execution of a program can begin at a named point, either with a conventional name defined by the programming language or operating system, or at a caller-specified name. In many programming languages, notably C, this named point is a function called main; as a result, the entry point is often called the main function.
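As a rough illustration of a caller-specified named entry point, the following Python sketch resolves a string to a callable and returns it for the caller to invoke. The helper name resolve_entry_point and the "module:function" spelling are assumptions made for this example; they are not prescribed by any particular operating system or language.

```python
import importlib


def resolve_entry_point(spec):
    """Return the callable named by a 'module:function' string (hypothetical format)."""
    module_name, _, func_name = spec.partition(":")
    if not module_name or not func_name:
        raise ValueError(f"expected 'module:function', got {spec!r}")
    module = importlib.import_module(module_name)
    entry = getattr(module, func_name)
    if not callable(entry):
        raise TypeError(f"{spec!r} does not name a callable")
    return entry


if __name__ == "__main__":
    # The caller chooses the entry point by name at run time.
    entry = resolve_entry_point("os.path:basename")
    print(entry("/usr/bin/env"))
```

The point of the sketch is only that the starting function is looked up by name rather than by position in the file.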

Usage

Entry points apply both to source code and to executable files. However, in day-to-day software development, programmers specify the entry points only in source code, which makes them much better known. Entry points in executable files depend on the application binary interface (ABI) of the actual operating system, and are generated by the compiler or linker (if not fixed by the ABI). Non-executable object files may also have entry points, which are used later by the linker when generating entry points of an executable file.

Contemporary

In most of today's popular programming languages and operating systems, a computer program usually has only a single entry point.

In C, C++, D, Rust and Kotlin programs this is a function named main; in Java it is a static method named main (although the class must be specified at the invocation time), and in C# it is a static method named Main.[1][2]

In major operating systems, the standard executable format has a single entry point. In the Executable and Linkable Format (ELF), used in Unix and Unix-like systems such as Linux, the entry point is specified in the e_entry field of the ELF header. In the GNU Compiler Collection (gcc), the entry point used by the linker is the _start symbol. Similarly, in the Portable Executable format, used in Microsoft Windows, the entry point is specified by the AddressOfEntryPoint field, which is inherited from COFF. In COM files, the entry point is at the fixed offset of 0100h.
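The e_entry field mentioned above can be read with a few lines of Python. This is a minimal sketch, assuming only the fixed layout of the start of the ELF header (16 bytes of e_ident, then e_type, e_machine and e_version, then e_entry); the function name read_elf_entry and the header bytes built in the demo are made up for illustration.

```python
import struct

ELF_MAGIC = b"\x7fELF"


def read_elf_entry(header):
    """Return the e_entry address stored in the leading bytes of an ELF file."""
    if header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = header[4], header[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported EI_CLASS or EI_DATA value")
    byte_order = "<" if ei_data == 1 else ">"    # 1 = little-endian, 2 = big-endian
    addr_format = "I" if ei_class == 1 else "Q"  # 1 = 32-bit, 2 = 64-bit
    # e_entry follows e_ident (16 bytes), e_type (2), e_machine (2), e_version (4).
    (entry,) = struct.unpack_from(byte_order + addr_format, header, 24)
    return entry


if __name__ == "__main__":
    # Synthetic 64-bit little-endian header with a made-up entry address.
    e_ident = ELF_MAGIC + bytes([2, 1, 1]) + bytes(9)
    header = e_ident + struct.pack("<HHIQ", 2, 0x3E, 1, 0x401000)
    print(hex(read_elf_entry(header)))
```

For a real executable, passing the first 64 bytes of the file (read in binary mode) should be enough for this function.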

One notable modern exception to the single-entry-point paradigm is Android. Unlike applications on most other operating systems, Android applications do not have a single entry point – there is no main function, for example. Instead of a single entry point, they have essential components (which include activities and services) which the system can instantiate and run as needed.[3]

An occasionally used technique is the fat binary, which consists of several executables for different targets packaged in a single binary. Most commonly, this is implemented by a single overall entry point, which is compatible with all targets and branches to the target-specific entry point. Alternative techniques include storing separate executables in separate forks, each with its own entry point, which is then selected by the operating system.

Historical

Historically, and in some contemporary legacy systems, such as VMS and OS/400, computer programs have a multitude of entry points, each corresponding to the different functionalities of the program. The usual way to denote entry points, as used system-wide in VMS and in PL/I and MACRO programs, is to append them at the end of the name of the executable image, delimited by a dollar sign ($), e.g. directory.exe$make.

The Apple I computer also used this to some degree. For example, an alternative entry point in Apple I's BASIC would keep the BASIC program useful when the reset button was accidentally pushed.

Exit point

In general, programs can exit at any time in an unstructured way, by returning to the operating system or crashing. Scripting languages typically end by reaching the end of the program, but for binaries the control must return to the operating system or it will simply run off the end of the process's memory, either executing whatever code is there or (in modern operating systems) resulting in a memory access violation and termination by the operating system.

Usually, there is not a single exit point specified in a program. However, in other cases runtimes ensure that programs always terminate in a structured way via a single exit point, which is guaranteed unless the runtime itself crashes; this allows cleanup code to be run, such as atexit handlers. This can be done by either requiring that programs terminate by returning from the main function, by calling a specific exit function, or by the runtime catching exceptions or operating system signals.
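The single-exit-point approach can be sketched in Python as a small wrapper that a runtime might place around the program's main function. The names run and cleanup_handlers are hypothetical, and the mapping of outcomes to status codes is one possible choice rather than a description of any specific runtime.

```python
import sys


def run(main, argv, cleanup_handlers):
    """Call main(argv) and funnel every outcome through one exit path."""
    try:
        result = main(argv)
        status = 0 if result is None else int(result)
    except SystemExit as exc:
        # main called sys.exit(); reuse its status where it is an integer.
        if isinstance(exc.code, int):
            status = exc.code
        else:
            status = 0 if exc.code is None else 1
    except Exception as exc:
        print(f"unhandled error: {exc}", file=sys.stderr)
        status = 1
    finally:
        # Cleanup runs whichever branch was taken, most recently added first.
        for handler in reversed(cleanup_handlers):
            handler()
    return status


if __name__ == "__main__":
    log = []
    status = run(lambda argv: 1 / 0, [], [lambda: log.append("closed")])
    print(status, log)
```

Returning normally, calling an exit function, and raising an exception all end in the same place, so the cleanup handlers run in each case.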

Programming languages

In many programming languages, the main function is where a program starts its execution. It enables high-level organization of the program's functionality, and typically has access to the command arguments given to the program when it was executed.

The main function is generally the first programmer-written function that runs when a program starts, and is invoked directly from the system-specific initialization contained in the runtime environment (crt0 or equivalent). However, some languages can execute user-written functions before main runs, such as the constructors of C++ global objects.

In other languages, notably scripting languages, execution simply begins at the start of the program.

A non-exhaustive list of programming languages follows, describing their way of defining the main entry point:

APL

In APL, when a workspace is loaded, the contents of the "quad LX" (latent expression) variable are interpreted as an APL expression and executed.

C and C++

In C and C++, the function prototype of the main function looks like one of the following:

int main(void);
int main();

int main(int argc, char **argv);
int main(int argc, char *argv[]);
int main(int argc, char **argv, char **env);

// C only: not one of the forms listed in ISO C 5.1.2.2.1,
// but used in embedded programming, depending on the microcontroller.
void main(void);

The parameters argc, argument count, and argv, argument vector,[4] respectively give the number and values of the program's command-line arguments. The names of argc and argv may be any valid identifier in C, but it is common convention to use these names. In C++, the names are to be taken literally, and the "void" in the parameter list is to be omitted, if strict conformance is desired.[5] Other platform-dependent formats are also allowed by the C and C++ standards, except that in C++ the return type must always be int;[6] for example, Unix (though not POSIX.1) and Windows have a third argument giving the program's environment, otherwise accessible through getenv in stdlib.h:

int main(int argc, char **argv, char **envp);

Darwin-based operating systems, such as macOS, have a fourth parameter containing arbitrary OS-supplied information, such as the path to the executing binary:[7]

int main(int argc, char **argv, char **envp, char **apple);

The value returned from the main function becomes the exit status of the process, though the C standard only ascribes specific meaning to two values: EXIT_SUCCESS (traditionally 0) and EXIT_FAILURE. The meaning of other possible return values is implementation-defined. If the programmer does not specify a return value, reaching the end of main() behaves as an implicit return 0;. This behavior is required by the C++ standard, and by C99 and later when main returns int.
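How an exit status reaches the parent process can be observed from the outside. The sketch below uses Python for the parent and, as a stand-in for a compiled C program, a short Python child in which sys.exit plays the role that returning from main plays in C; the helper name exit_status_of is an assumption for this example.

```python
import subprocess
import sys


def exit_status_of(source):
    """Run a short Python program in a child process and return its exit status."""
    completed = subprocess.run([sys.executable, "-c", source])
    return completed.returncode


if __name__ == "__main__":
    print(exit_status_of("import sys; sys.exit(3)"))  # child exits with status 3
    print(exit_status_of("pass"))                     # child ends normally with status 0
```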

It is guaranteed that argc is non-negative and that argv[argc] is a null pointer. By convention, the command-line arguments specified by argc and argv include the name of the program as the first element if argc is greater than 0; if a user types a command of "rm file", the shell will initialise the rm process with argc = 2 and argv = {"rm", "file", NULL}. As argv[0] is the name that processes appear under in ps, top etc., some programs, such as daemons or those running within an interpreter or virtual machine (where argv[0] would be the name of the host executable), may choose to alter their argv to give a more descriptive argv[0], usually by means of the exec system call.
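The "rm file" example can be reproduced in Python, whose standard shlex module splits a command line using POSIX shell quoting rules. This is only a sketch of the convention: build_argv is a hypothetical helper, and a real shell also performs expansions that are not modelled here.

```python
import shlex


def build_argv(command_line):
    """Split a command line the way a POSIX shell would and return (argc, argv)."""
    argv = shlex.split(command_line)
    return len(argv), argv


if __name__ == "__main__":
    argc, argv = build_argv("rm file")
    print(argc, argv)
```

Unlike the C array, the Python list carries its own length and has no terminating null entry.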

The main() function is special; normally every C and C++ program must define it exactly once.

If declared, main() must be declared as if it has external linkage; it cannot be declared static or inline.

In C++, main() must be in the global namespace (i.e. ::main), cannot be overloaded, and cannot be a member function, although the name is not otherwise reserved, and may be used for member functions, classes, enumerations, or non-member functions in other namespaces. In C++ (unlike C) main() cannot be called recursively and cannot have its address taken.

C#

When executing a program written in C#, the CLR searches for a static method marked with the .entrypoint IL directive, which takes either no arguments, or a single argument of type string[], and has a return type of void or int, and executes it.[8]

static void Main();
static void Main(string[] args);
static int Main();
static int Main(string[] args);

Command-line arguments are passed in args, similar to how it is done in Java. For versions of Main() returning an integer, similar to both C and C++, it is passed back to the environment as the exit status of the process.

Since C# 7.1 there are four more possible signatures of the entry point, which allow asynchronous execution in the Main() method.[9]

static Task Main();
static Task<int> Main();
static Task Main(string[] args);
static Task<int> Main(string[] args);

The Task and Task<int> types are the asynchronous equivalents of void and int.

Clean

Clean is a functional programming language based on graph rewriting. The initial node is called Start and is of type *World -> *World if it changes the world or some fixed type if the program only prints the result after reducing Start.

Start :: *World -> *World
Start world = startIO ...

Or, even simpler:

Start :: String
Start = "Hello, world!"

The programmer tells the compiler which of these two options to use when generating the executable file.

Common Lisp

ANSI Common Lisp does not define a main function; instead, the code is read and evaluated from top to bottom in a source file. However, the following code will emulate a main function.

(defun hello-main ()
  (format t "Hello World!~%"))

(hello-main)

D

In D, the function prototype of the main function looks like one of the following:

void main();
void main(string[] args);
int main();
int main(string[] args);

Command-line arguments are passed in args, similar to how it is done in C# or Java. For versions of main() returning an integer, similar to both C and C++, it is passed back to the environment as the exit status of the process.

FORTRAN

FORTRAN does not have a main subroutine or function. Instead a PROGRAM statement as the first line can be used to specify that a program unit is a main program, as shown below. The PROGRAM statement cannot be used for recursive calls.[10]

      PROGRAM HELLO
      PRINT *, "Hello, World!"
      END PROGRAM HELLO

Some versions of Fortran, such as those on the IBM System/360 and successor mainframes, do not support the PROGRAM statement. Many compilers from other software manufacturers allow a Fortran program to be compiled without a PROGRAM statement. In these cases, any program unit that contains a non-comment statement and in which no SUBROUTINE, FUNCTION or BLOCK DATA statement occurs is considered to be the main program.

GNAT

Using GNAT, the programmer is not required to write a function called main; a source file containing a single subprogram can be compiled to an executable. The binder will however create a package ada_main, which will contain and export a C-style main function.

Go

In the Go programming language, program execution starts with the main function of the package main:

package main

import "fmt"

func main() {
 fmt.Println("Hello, World!")
}

There is no way to access arguments or a return code outside of the standard library in Go. These can be accessed via os.Args and os.Exit respectively, both of which are included in the "os" package.

Haskell

A Haskell program must contain a name called main bound to a value of type IO t for some type t;[11] this type is usually IO (). IO is a monad, which organizes side effects in terms of purely functional code.[12] The main value represents the side-effecting computation done by the program. The result of the computation represented by main is discarded; that is why main usually has type IO (), which indicates that the type of the result of the computation is (), the unit type, which contains no information.

main :: IO ()
main = putStrLn "Hello, World!"

Command line arguments are not given to main; they must be fetched using another IO action, such as System.Environment.getArgs.

Java

Java programs start executing at the main method, which has the following method heading:

public static void main(String[] args)
public static void main(String... args)
public static void main(String args[])

Command-line arguments are passed in args. As in C and C++, the name "main()" is special. Java's main methods do not return a value directly, but one can be passed by using the System.exit() method.

Unlike C, the name of the program is not included in args, because the name of the program is exactly the name of the class that contains the main method called, so it is already known. Also unlike C, the number of arguments need not be included, since arrays in Java have a field that keeps track of how many elements there are.

Another aspect unique to Java is that the main function must be included within a class, and then called manually by the runtime. This is because in Java everything has to be contained within a class. For instance, a hello world program in Java may look like so:

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}

To run this program, one must call java HelloWorld in the directory where the compiled class file (which itself must be named HelloWorld.class) exists. Alternatively, executable JAR files use a manifest file to specify the entry point in a manner that is filesystem-independent from the user's perspective.
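The manifest mechanism can be inspected without a Java runtime, since a JAR file is a ZIP archive whose manifest is stored at META-INF/MANIFEST.MF and names the entry class in its Main-Class header. The Python sketch below is a simplified reader: the helper names are hypothetical, and continuation lines in long manifest values are not handled.

```python
import io
import zipfile


def read_main_class(jar_bytes):
    """Return the Main-Class value from a JAR's manifest, or None if absent."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        manifest = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
    for line in manifest.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "main-class":
            return value.strip()
    return None


def make_demo_jar(main_class):
    """Build an in-memory archive holding only a manifest (no class files)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr(
            "META-INF/MANIFEST.MF",
            f"Manifest-Version: 1.0\r\nMain-Class: {main_class}\r\n",
        )
    return buffer.getvalue()


if __name__ == "__main__":
    print(read_main_class(make_demo_jar("HelloWorld")))
```

The demo archive contains no class files, so it shows only where the entry point is recorded, not a runnable program.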

Logo

In FMSLogo, procedures do not execute when they are loaded. To make them execute, it is necessary to use this code:

to procname
 ...                 ; Startup commands (such as print [Welcome])
end
make "startup [procname]

Note that the variable startup is used for the startup list of actions, but the convention is that this calls another procedure that runs the actions. That procedure may be of any name.

OCaml

OCaml has no main function. Programs are evaluated from top to bottom.

Command-line arguments are available in an array named Sys.argv and the exit status is 0 by default.

Example:

print_endline "Hello World"

Pascal

In Pascal, the main procedure is the only unnamed procedure in the program. Because Pascal programs have the procedures and functions in a more rigorous top-down order than C, C++ or Java programs, the main procedure is usually the last procedure in the program. Pascal does not have a special meaning for the name "main" or any similar name.

program Hello(Output);
begin
  writeln('Hello, world!');
end.

Command-line arguments are counted in ParamCount and accessible as strings by ParamStr(n), with n between 0 and ParamCount.

Note that "unit" or "module" based versions of Pascal start the main module with the PROGRAM keyword, while other separately compiled modules start with UNIT (UCSD/Borland) or MODULE (ISO). The unnamed function in modules is often module initialization, which runs before the main program starts.

Perl

In Perl, there is no main function. Statements are executed from top to bottom.

Command-line arguments are available in the special array @ARGV. Unlike C, @ARGV does not contain the name of the program, which is $0.

PHP

PHP does not have a "main" function. Starting from the first line of a PHP script, any code not encapsulated by a function header is executed as soon as it is seen.

Pike

Pike's syntax is similar to that of C and C++. Execution begins at main. The "argc" variable holds the number of arguments passed to the program, and the "argv" variable holds their values.

Example:

 int main(int argc, array(string) argv)

Python

Python programs are evaluated top-to-bottom, as is usual in scripting languages: the entry point is the start of the source code. Since definitions must precede use, programs are typically structured with definitions at the top and the code to execute at the bottom (unindented), similar to code for a one-pass compiler, such as in Pascal.

Alternatively, a program can be structured with an explicit main function containing the code to be executed when a program is executed directly, but which can also be invoked by importing the program as a module and calling the function. This can be done by the following idiom, which relies on the internal variable __name__ being set to __main__ when a program is executed, but not when it is imported as a module (in which case it is instead set to the module name); there are many variants of this structure:[13][14][15]

import sys

def main(argv):
    n = int(argv[1])
    print(n + 1)

if __name__ == '__main__':
    sys.exit(main(sys.argv))

In this idiom, the call to the named entry point main is explicit, and the interaction with the operating system (receiving the arguments, calling system exit) is done explicitly by library calls, which are ultimately handled by the Python runtime. This contrasts with C, where these are done implicitly by the runtime, based on convention.
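One of the many variants has main return an explicit exit status instead of relying on exceptions, which also makes the function easy to call from tests. The following is a sketch only; the status values 1 and 2 and the usage message are choices made for this example.

```python
import sys


def main(argv):
    """Parse one integer argument, print its successor, and return an exit status."""
    if len(argv) != 2:
        program = argv[0] if argv else "prog"
        print(f"usage: {program} N", file=sys.stderr)
        return 2
    try:
        n = int(argv[1])
    except ValueError:
        print(f"not an integer: {argv[1]!r}", file=sys.stderr)
        return 1
    print(n + 1)
    return 0
```

The guard from the idiom above, with its call to sys.exit(main(sys.argv)), works unchanged with this function, and an importing module can call main(["prog", "41"]) directly and inspect the returned status.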

QB64

The QB64 language has no main function; code that is not within a function or subroutine is executed first, from top to bottom:

print "Hello World! a =";
a = getInteger(1.8d): print a

function getInteger(n as double)
    getInteger = int(n)
end function

Command line arguments (if any) can be read using the COMMAND$ function:

dim shared commandline as string
commandline = COMMAND$

'Several space-separated command line arguments can be read using COMMAND$(n)
commandline1 = COMMAND$(2)

Ruby

In Ruby, there is no distinct main function. Code written outside any "class .. end" or "module .. end" enclosure is executed directly, step by step, in the context of a special "main" object. This object can be referenced using:

irb(main):001:0> self
=> main

and has the following properties:

irb(main):002:0> self.class
=> Object
irb(main):003:0> self.class.ancestors
=> [Object, Kernel, BasicObject]

Methods defined without additional classes/modules are defined as private methods of the "main" object, and, consequently, as private methods of almost any other object in Ruby:

irb(main):004:0> def foo
irb(main):005:1>   42
irb(main):006:1> end
=> nil
irb(main):007:0> foo
=> 42
irb(main):008:0> [].foo
NoMethodError: private method `foo' called for []:Array
	from (irb):8
	from /usr/bin/irb:12:in `<main>'
irb(main):009:0> false.foo
NoMethodError: private method `foo' called for false:FalseClass
	from (irb):9
	from /usr/bin/irb:12:in `<main>'

The number and values of command-line arguments can be determined using the single ARGV constant array:

$ irb /dev/tty foo bar

tty(main):001:0> ARGV
ARGV
=> ["foo", "bar"]
tty(main):002:0> ARGV.size
ARGV.size
=> 2

Note that the first element of ARGV, ARGV[0], contains the first command-line argument, not the name of the program executed as it would in C. The name of the program is available using $0 or $PROGRAM_NAME.[16]

Similar to Python, one could use:

if __FILE__ == $PROGRAM_NAME
  # Put "main" code here
end

Rust

In Rust, the entry point of a program is a function named main.

fn main() {
}

Visual Basic

In Visual Basic, when a project contains no forms, the startup object may be the Main() procedure. The Command$ function can be optionally used to access the argument portion of the command line used to launch the program:

Sub Main()
    Debug.Print "Hello World!"
    MsgBox "Arguments if any are: " & Command$
End Sub

Xojo

In Xojo, there are two different project types, each with a different main entry point. Desktop (GUI) applications start with the App.Open event of the project's Application object. Console applications start with the App.Run event of the project's ConsoleApplication object. In both instances, the main function is automatically generated, and cannot be removed from the project.

References

  1. ^ "The main() function". ibm.com. IBM. Retrieved 2014-05-08.
  2. ^ "Main() and Command-Line Arguments (C# Programming Guide)". Msdn.microsoft.com. Retrieved 2014-05-08.
  3. ^ "Application Fundamentals". Android Development. linuxtopia.org. Retrieved 2014-02-19.
  4. ^ argv: the vector term in this variable's name is used in the traditional sense to refer to strings.
  5. ^ Parameter types and names of main
  6. ^ Section 3.6.1.2, Standard C++ 2011 edition.
  7. ^ The char *apple Argument Vector
  8. ^ "Console Applications in .NET, or Teaching a New Dog Old Tricks". Msdn.microsoft.com. 2003-06-12. Retrieved 2013-08-19.
  9. ^ https://github.com/dotnet/csharplang/blob/master/proposals/csharp-7.1/async-main.md
  10. ^ XL FORTRAN for AIX. Language Reference. Third Edition, 1994. IBM
  11. ^ "The Haskell 98 Report: Modules". Haskell.org. Retrieved 2013-08-19.
  12. ^ Some Haskell Misconceptions: Idiomatic Code, Purity, Laziness, and IO — on Haskell's monadic IO
  13. ^ Guido van Rossum (May 15, 2003). "Python main() functions", comments
  14. ^ Code Like a Pythonista: Idiomatic Python—on Python scripts used as modules
  15. ^ Ned Batchelder (6 June 2003). "Python main() functions".
  16. ^ Programming Ruby: The Pragmatic Programmer's Guide, Ruby and Its World — on Ruby ARGV