Entry point

From Wikipedia, the free encyclopedia
  (Redirected from Main function (programming))
Jump to navigation Jump to search

In computer programming, an entry point is where the first instructions in a program are executed, and where the program has access to command line arguments.

To start a program's execution, the loader or operating system passes control to its entry point. (During booting, the operating system itself is the program). This marks the transition from load time (and dynamic link time, if present) to run time.

For some operating systems and programming languages, the entry point is in a runtime library, a set of support functions for the language. The library code initializes the program and then passes control to the program proper. In other cases, the program may initialize the runtime library itself.

In simple systems, execution begins at the first statement, which is common in interpreted languages, simple executable formats, and boot loaders. In other cases, the entry point is at some other known memory address which can be an absolute address or relative address (offset).

Alternatively, execution of a program can begin at a named point, either with a conventional name defined by the programming language or operating system or at a caller-specified name. In many C-family languages, this is a function named main; as a result, the entry point is often known as the main function.

In JVM languages such as Java the entry point is a static method named main; in CLI languages such as C# the entry point is a static method named Main.[1]

Example of the main function, in C#.
How the Main() might look in C# source code. Different parts are labeled for reference.

Usage[edit]

Entry points apply both to source code and to executable files. However, in day-to-day software development, programmers specify the entry points only in source code, which makes them much better known. Entry points in executable files depend on the application binary interface (ABI) of the actual operating system, and are generated by the compiler or linker (if not fixed by the ABI). Other linked object files may also have entry points, which are used later by the linker when generating entry points of an executable file.

Entry points are capable of passing on command arguments, variables, or other information as a local variable used by the Main() method. This way, specific options may be set upon execution of the program, and then interpreted by the program. Many programs use this as an alternative way to configure different settings, or perform a set variety of actions using a single program.

Contemporary[edit]

In most of today's popular programming languages and operating systems, a computer program usually only has a single entry point.

In C, C++, D, Rust and Kotlin programs this is a function named main; in Java it is a static method named main (although the class must be specified at the invocation time), and in C# it is a static method named Main.[2][3]

In many major operating systems, the standard executable format has a single entry point. In the Executable and Linkable Format (ELF), used in Unix and Unix-like systems such as Linux, the entry point is specified in the e_entry field of the ELF header. In the GNU Compiler Collection (gcc), the entry point used by the linker is the _start symbol. Similarly, in the Portable Executable format, used in Microsoft Windows, the entry point is specified by the AddressOfEntryPoint field, which is inherited from COFF. In COM files, the entry point is at the fixed offset of 0100h.

One exception to the single-entry-point paradigm is Android. Android applications do not have a single entry point – there is no special main function. Instead, they have essential components (activities and services) which the system can load and run as needed.[4]

An occasionally used technique is the fat binary, which consists of several executables for different targets packaged in a single file. Most commonly, this is implemented by a single overall entry point, which is compatible with all targets and branches to the target-specific entry point. Alternative techniques include storing separate executables in separate forks, each with its own entry point, which is then selected by the operating system.

Historical[edit]

Historically, and in some contemporary legacy systems, such as VMS and OS/400, computer programs have a multitude of entry points, each corresponding to the different functionalities of the program. The usual way to denote entry points, as used system-wide in VMS and in PL/I and MACRO programs, is to append them at the end of the name of the executable image, delimited by a dollar sign ($), e.g. directory.exe$make.

The Apple I computer also used this to some degree. For example, an alternative entry point in Apple I's BASIC would keep the BASIC program useful when the reset button was accidentally pushed.[clarification needed]

Exit point[edit]

In general, programs can exit at any time by returning to the operating system or crashing. Programs in interpreted languages return control to the interpreter, but programs in compiled languages must return to the operating system, otherwise the processor will simply continue executing beyond the end of the program, resulting in undefined behavior.

Usually, there is not a single exit point specified in a program. However, in other cases runtimes ensure that programs always terminate in a structured way via a single exit point, which is guaranteed unless the runtime itself crashes; this allows cleanup code to be run, such as atexit handlers. This can be done by either requiring that programs terminate by returning from the main function, by calling a specific exit function, or by the runtime catching exceptions or operating system signals.

Programming languages[edit]

In many programming languages, the main function is where a program starts its execution. It enables high-level organization of the program's functionality, and typically has access to the command arguments given to the program when it was executed.

The main function is generally the first programmer-written function that runs when a program starts, and is invoked directly from the system-specific initialization contained in the runtime environment (crt0 or equivalent). However, some languages can execute user-written functions before main runs, such as the constructors of C++ global objects.

In other languages, notably many interpreted languages, execution begins at the first statement in the program.

A non-exhaustive list of programming languages follows, describing their way of defining the main entry point:

APL[edit]

In APL, when a workspace is loaded, the contents of "quad LX" (latent expression) variable is interpreted as an APL expression and executed.

C and C++[edit]

In C and C++, the function prototype of the main function looks like one of the following:

int main(void);
int main();

int main(int argc, char **argv);
int main(int argc, char *argv[]);
int main(int argc, char **argv, char **env);

// more specifically in C
// NOT according to the ISO C standard 5.1.2.2.1
// BUT in embedded programming depending on the µC, this form is also used
void main (void);

The parameters argc, argument count, and argv, argument vector,[5] respectively give the number and values of the program's command-line arguments. The names of argc and argv may be any valid identifier in C, but it is common convention to use these names. In C++, the names are to be taken literally, and the "void" in the parameter list is to be omitted, if strict conformance is desired.[6] Other platform-dependent formats are also allowed by the C and C++ standards, except that in C++ the return type must always be int;[7] for example, Unix (though not POSIX.1) and Windows have a third argument giving the program's environment, otherwise accessible through getenv in stdlib.h:

int main(int argc, char **argv, char **envp);

Darwin-based operating systems, such as macOS, have a fourth parameter containing arbitrary OS-supplied information, such as the path to the executing binary:[8]

int main(int argc, char **argv, char **envp, char **apple);

The value returned from the main function becomes the exit status of the process, though the C standard only ascribes specific meaning to two values: EXIT_SUCCESS (traditionally 0) and EXIT_FAILURE. The meaning of other possible return values is implementation-defined. In case a return value is not defined by the programmer, an implicit return 0; at the end of the main() function is inserted by the compiler; this behavior is required by the C++ standard.

It is guaranteed that argc is non-negative and that argv[argc] is a null pointer. By convention, the command-line arguments specified by argc and argv include the name of the program as the first element if argc is greater than 0; if a user types a command of "rm file", the shell will initialise the rm process with argc = 2 and argv = {"rm", "file", NULL}. As argv[0] is the name that processes appear under in ps, top etc., some programs, such as daemons or those running within an interpreter or virtual machine (where argv[0] would be the name of the host executable), may choose to alter their argv to give a more descriptive argv[0], usually by means of the exec system call.

The main() function is special; normally every C and C++ program must define it exactly once.

If declared, main() must be declared as if it has external linkage; it cannot be declared static or inline.

In C++, main() must be in the global namespace (i.e. ::main), cannot be overloaded, and cannot be a member function, although the name is not otherwise reserved, and may be used for member functions, classes, enumerations, or non-member functions in other namespaces. In C++ (unlike C) main() cannot be called recursively and cannot have its address taken.

C#[edit]

When executing a program written in C#, the CLR searches for a static method marked with the .entrypoint IL directive, which takes either no arguments, or a single argument of type string[], and has a return type of void or int, and executes it.[9]

static void Main();
static void Main(string[] args);
static int Main();
static int Main(string[] args);

Command-line arguments are passed in args, similar to how it is done in Java. For versions of Main() returning an integer, similar to both C and C++, it is passed back to the environment as the exit status of the process.

Since C#7.1 there are four more possible signatures of the entry point, which allow asynchronous execution in the Main() Method.[10]

static Task Main()
static Task<int> Main()
static Task Main(string[])
static Task<int> Main(string[])

The Task and Task<int> types are the asynchronous equivalents of void and int.

Clean[edit]

Clean is a functional programming language based on graph rewriting. The initial node is named Start and is of type *World -> *World if it changes the world or some fixed type if the program only prints the result after reducing Start.

Start :: *World -> *World
Start world = startIO ...

Or even simpler

Start :: String
Start = "Hello, world!"

One tells the compiler which option to use to generate the executable file.

Common Lisp[edit]

ANSI Common Lisp does not define a main function; instead, the code is read and evaluated from top to bottom in a source file. However, the following code will emulate a main function.

(defun hello-main ()
  (format t "Hello World!~%"))

(hello-main)

D[edit]

In D, the function prototype of the main function looks like one of the following:

void main();
void main(string[] args);
int main();
int main(string[] args);

Command-line arguments are passed in args, similar to how it is done in C# or Java. For versions of main() returning an integer, similar to both C and C++, it is passed back to the environment as the exit status of the process.

FORTRAN[edit]

FORTRAN does not have a main subroutine or function. Instead a PROGRAM statement as the first line can be used to specify that a program unit is a main program, as shown below. The PROGRAM statement cannot be used for recursive calls.[11]

      PROGRAM HELLO
      PRINT *, "Cint!"
      END PROGRAM HELLO

Some versions of Fortran, such as those on the IBM System/360 and successor mainframes, do not support the PROGRAM statement. Many compilers from other software manufacturers will allow a fortran program to be compiled without a PROGRAM statement. In these cases, whatever module that has any non-comment statement where no SUBROUTINE, FUNCTION or BLOCK DATA statement occurs, is considered to be the Main program.

GNAT[edit]

Using GNAT, the programmer is not required to write a function named main; a source file containing a single subprogram can be compiled to an executable. The binder will however create a package ada_main, which will contain and export a C-style main function.

Go[edit]

In Go programming language, program execution starts with the main function of the package main

package main

import "fmt"

func main() {
 fmt.Println("Hello, World!")
}

There is no way to access arguments or a return code outside of the standard library in Go. These can be accessed via os.Args and os.Exit respectively, both of which are included in the "os" package.

Haskell[edit]

A Haskell program must contain a name main bound to a value of type IO t, for some type t;[12] which is usually IO (). IO is a monad, which organizes side-effects in terms of purely functional code.[13] The main value represents the side-effects-ful computation done by the program. The result of the computation represented by main is discarded; that is why main usually has type IO (), which indicates that the type of the result of the computation is (), the unit type, which contains no information.

main :: IO ()
main = putStrLn "Hello, World!"

Command line arguments are not given to main; they must be fetched using another IO action, such as System.Environment.getArgs.

Java[edit]

Java programs start executing at the main method of a class, which has the following method heading:

public static void main(String[] args)
public static void main(String... args)
public static void main(String args[])

Command-line arguments are passed in args. As in C and C++, the name "main()" is special. Java's main methods do not return a value directly, but one can be passed by using the System.exit() method.

Unlike C, the name of the program is not included in args, because it is the name of the class that contains the main method, so it is already known. Also unlike C, the number of arguments need not be included, since arrays in Java have a field that keeps track of how many elements there are.

The main function must be included within a class. This is because in Java everything has to be contained within a class. For instance, a hello world program in Java may look like:

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello, world!");
    }
}

To run this program, one must call java HelloWorld in the directory where the compiled class file HelloWorld.class) exists. Alternatively, executable JAR files use a manifest file to specify the entry point in a manner that is filesystem-independent from the user's perspective.

[edit]

In FMSLogo, the procedures when loaded do not execute. To make them execute, it is necessary to use this code:

to procname
 ...                 ; Startup commands (such as print [Welcome])
end
make "startup [procname]

The variable startup is used for the startup list of actions, but the convention is that this calls another procedure that runs the actions. That procedure may be of any name.

OCaml[edit]

OCaml has no main function. Programs are evaluated from top to bottom.

Command-line arguments are available in an array named Sys.argv and the exit status is 0 by default.

Example:

print_endline "Hello World"

Pascal[edit]

In Pascal, the main procedure is the only unnamed block in the program. Because Pascal programs define procedures and functions in a more rigorous bottom-up order than C, C++ or Java programs, the main procedure is usually the last block in the program. Pascal does not have a special meaning for the name "main" or any similar name.

program Hello(Output);
begin
  writeln('Hello, world!');
end.

Command-line arguments are counted in ParamCount and accessible as strings by ParamStr(n), with n between 0 and ParamCount.

Versions of Pascal that support units or modules may also contain an unnamed block in each, which is used to initialize the module. These blocks are executed before the main program entry point is called.

Perl[edit]

In Perl, there is no main function. Statements are executed from top to bottom.

Command-line arguments are available in the special array @ARGV. Unlike C, @ARGV does not contain the name of the program, which is $0.

PHP[edit]

PHP does not have a "main" function. Starting from the first line of a PHP script, any code not encapsulated by a function header is executed as soon as it is seen.

Pike[edit]

In Pike syntax is similar to that of C and C++. The execution begins at main. The "argc" variable keeps the number of arguments passed to the program. The "argv" variable holds the value associated with the arguments passed to the program.

Example:

 int main(int argc, array(string) argv)

Python[edit]

Python programs are evaluated top-to-bottom, as is usual in scripting languages: the entry point is the start of the source code. Since definitions must precede use, programs are typically structured with definitions at the top and the code to execute at the bottom (unindented), similar to code for a one-pass compiler, such as in Pascal.

Alternatively, a program can be structured with an explicit main function containing the code to be executed when a program is executed directly, but which can also be invoked by importing the program as a module and calling the function. This can be done by the following idiom, which relies on the internal variable __name__ being set to __main__ when a program is executed, but not when it is imported as a module (in which case it is instead set to the module name); there are many variants of this structure:[14][15][16]

import sys

def main(argv):
    n = int(argv[1])
    print(n + 1)

if __name__ == '__main__':
    sys.exit(main(sys.argv))

In this idiom, the call to the named entry point main is explicit, and the interaction with the operating system (receiving the arguments, calling system exit) are done explicitly by library calls, which are ultimately handled by the Python runtime. This contrast with C, where these are done implicitly by the runtime, based on convention.

QB64[edit]

The QB64 language has no main function, the code that is not within a function, or subroutine is executed first, from top to bottom:

print "Hello World! a =";
a = getInteger(1.8d): print a

function getInteger(n as double)
    getInteger = int(n)
end function

Command line arguments (if any) can be read using the COMMAND$ function:

dim shared commandline as string
commandline = COMMAND$

'Several space-separared command line arguments can be read using COMMAND$(n)
commandline1 = COMMAND$(2)

Ruby[edit]

In Ruby, there is no distinct main function. The code written without additional "class .. end", "module .. end" enclosures is executed directly, step by step, in context of special "main" object. This object can be referenced using:

irb(main):001:0> self
=> main

and contain the following properties:

irb(main):002:0> self.class
=> Object
irb(main):003:0> self.class.ancestors
=> [Object, Kernel, BasicObject]

Methods defined without additional classes/modules are defined as private methods of the "main" object, and, consequently, as private methods of almost any other object in Ruby:

irb(main):004:0> def foo
irb(main):005:1>   42
irb(main):006:1> end
=> nil
irb(main):007:0> foo
=> 42
irb(main):008:0> [].foo
NoMethodError: private method `foo' called for []:Array
	from (irb):8
	from /usr/bin/irb:12:in `<main>'
irb(main):009:0> false.foo
NoMethodError: private method `foo' called for false:FalseClass
	from (irb):9
	from /usr/bin/irb:12:in `<main>'

Number and values of command-line arguments can be determined using the single ARGV constant array:

$ irb /dev/tty foo bar

tty(main):001:0> ARGV
ARGV
=> ["foo", "bar"]
tty(main):002:0> ARGV.size
ARGV.size
=> 2

The first element of ARGV, ARGV[0], contains the first command-line argument, not the name of program executed, as in C. The name of program is available using $0 or $PROGRAM_NAME.[17]

Similar to Python, one could use:

if __FILE__ == $PROGRAM_NAME
  # Put "main" code here
end

Rust[edit]

fn main() {
}

Swift[edit]

When run in an Xcode Playground,[18] Swift behaves like a scripting language, executing statements from top to bottom; top-level code is allowed.

// HelloWorld.playground

let hello = "hello"
let world = "world"

let helloWorld = hello + " " + world

print(helloWorld) // hello world

Cocoa- and Cocoa Touch-based applications written in Swift are usually initialized with the @NSApplicationMain and @UIApplicationMain attributes, respectively. Those attributes are equivalent in their purpose to the main.m file in Objective-C projects: they implicitly declare the main function that calls UIApplicationMain(_:_:_:_:)[19] which creates an instance of UIApplication.[20] The following code is the default way to initialize a Cocoa Touch-based iOS app and declare its application delegate.

// AppDelegate.swift

import UIKit

@UIApplicationMain
class AppDelegate: UIResponder, UIApplicationDelegate {
    
    var window: UIWindow?
    
    func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
        return true
    }
    
}

Visual Basic[edit]

In Visual Basic, when a project contains no forms, the startup object may be the Main() procedure. The Command$ function can be optionally used to access the argument portion of the command line used to launch the program:

Sub Main()
    Debug.Print "Hello World!"
    MsgBox "Arguments if any are: " & Command$
End Sub

Xojo[edit]

In Xojo, there are two different project types, each with a different main entry point. Desktop (GUI) applications start with the App.Open event of the project's Application object. Console applications start with the App.Run event of the project's ConsoleApplication object. In both instances, the main function is automatically generated, and cannot be removed from the project.

See also[edit]

References[edit]

  1. ^ Wagner, Bill (2017-08-01). "Main() / Entry Points (C# Programming Guide) - Microsoft Developer Network". docs.microsoft.com. Retrieved 2018-12-03.
  2. ^ "The main() function". ibm.com. IBM. Retrieved 2014-05-08.
  3. ^ "Main() and Command-Line Arguments (C# Programming Guide)". Msdn.microsoft.com. Retrieved 2014-05-08.
  4. ^ "Application Fundamentals". Android Development. linuxtopia.org. Retrieved 2014-02-19.
  5. ^ argv: the vector term in this variable's name is used in traditional sense to refer to strings.
  6. ^ Parameter types and names of main
  7. ^ Section 3.6.1.2, Standard C++ 2011 edition.
  8. ^ The char *apple Argument Vector
  9. ^ "Console Applications in .NET, or Teaching a New Dog Old Tricks". Msdn.microsoft.com. 2003-06-12. Retrieved 2013-08-19.
  10. ^ https://github.com/dotnet/csharplang/blob/master/proposals/csharp-7.1/async-main.md. Missing or empty |title= (help)
  11. ^ XL FORTRAN for AIX. Language Reference. Third Edition, 1994. IBM
  12. ^ "The Haskell 98 Report: Modules". Haskell.org. Retrieved 2013-08-19.
  13. ^ Some Haskell Misconceptions: Idiomatic Code, Purity, Laziness, and IO — on Haskell's monadic IO>
  14. ^ Guido van Rossum (May 15, 2003). "Python main() functions", comments
  15. ^ Code Like a Pythonista: Idiomatic Python—on Python scripts used as modules
  16. ^ Ned Batchelder (6 June 2003). "Python main() functions".
  17. ^ Programming Ruby: The Pragmatic Programmer's Guide, Ruby and Its World — on Ruby ARGV
  18. ^ Not to be confused with Swift Playgrounds, an Apple-developed iPad app for learning the Swift programming language.
  19. ^ "UIApplicationMain(_:_:_:_:) — UIKit". Apple Developer Documentation. Retrieved 2019-01-12.
  20. ^ "UIApplication — UIKit". Apple Developer Documentation. Retrieved 2019-01-12.

External links[edit]