System call

In computing, a system call is the mechanism used by an application program to request service from the operating system.

Background

A system call is a request made by any arbitrary program to the operating system for performing tasks -- picked from a predefined set -- which the said program does not have required permissions to execute in its own flow of execution. Most operations interacting with the system require permissions not available to a user level process, i.e. any I/O performed with any arbitrary device present on the system or any form of communication with other processes requires the use of system calls.

The fact that improper use of the system can easily cause a system crash necessitates some level of control. The design of the microprocessor architecture on practically all modern systems (except some embedded systems) offers a series of privilege levels -- the (low) privilege level in which normal applications execute limits the address space of the program so that it cannot access or modify other running applications nor the operating system itself. It also prevents the application from using any system devices (e.g. the frame buffer or network devices). But obviously any normal application needs these abilities; thus it can call the operating system. The OS executes at the highest level of privilege and allows the applications to request services via system calls, which are often implemented through interrupts. If allowed, the system enters a higher privilege level, executes a specific set of instructions which the interrupting program has no direct control over, then returns control to the former flow of execution. This concept also serves as a way to implement security.

With the development of separate operating modes with varying levels of privilege, a mechanism was needed for transferring control safely from lesser privileged modes to higher privileged modes. Less privileged code could not simply transfer control to more privileged code at any arbitrary point and with any arbitrary processor state. To allow it to do so would allow it to break security. For instance, the less privileged code could cause the higher privileged code to execute in the wrong order, or provide it with a bad stack.

The library as an intermediary

Generally, operating systems provide a library that sits between normal programs and the rest of the operating system, usually an implementation of the C library (libc), such as glibc. This library handles the low-level details of passing information to the kernel and switching to supervisor mode, as well as any data processing and preparation which does not need to be done in privileged mode. Ideally, this reduces the coupling between the operating system and the application, and increases portability.

On exokernel based systems, the library is especially important as an intermediary. On exokernels, OSes shield user applications from the very low level kernel API, and provide abstractions and resource management.

Examples and tools

On Unix-based and POSIX-based systems, popular system calls are open, read, write, close, wait, exec, fork, exit, and kill. Many of today's operating systems have hundreds of system calls. For example, Linux has 319 different system calls. FreeBSD has about the same (almost 330).

Tools such as strace and truss report the system calls made by a running process.

Typical implementations

Implementing system calls requires a control transfer which involves some sort of architecture specific feature. A typical way to implement this is to use a software interrupt or trap. Interrupts transfer control to the kernel so software simply needs to set up some register with the system call number they want and execute the software interrupt.

For many RISC processors this is the only feasible implementation, but CISC architectures such as x86 support additional techniques. One example is SYSCALL/SYSRET which is very similar to SYSENTER/SYSEXIT (the two mechanisms were created by Intel and AMD independently, but do basically the same thing). These are "fast" control transfer instructions that are designed to quickly transfer control to the kernel for a system call without the overhead of an interrupt. Linux 2.5 began using this on the x86, where available; formerly it used the INT instruction, where the system call number was placed in the EAX register before interrupt 0x80 was executed.^[1]

An older x86 mechanism is called a call gate and is a way for a program to literally call a kernel function directly using a safe control transfer mechanism the kernel sets up in advance. This approach has been unpopular, presumably due to the requirement of a far call which uses x86 memory segmentation and the resulting lack of portability it causes, and existence of the faster instructions mentioned above.

References

^ Anonymous (2002-12-19). "Linux 2.5 gets vsyscalls, sysenter support". KernelTrap. Retrieved 2008-01-01.

External links

Linux system calls - system calls for Linux kernel 2.2, with IA32 calling conventions
How System Calls Work on Linux/i86
Sysenter Based System Call Mechanism in Linux 2.6
Kernel command using Linux system calls - IBM developerWorks
HOWTO for Implementing a System Call on Linux 2.6 - Amit Choudhary

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.

[1] Anonymous (2002-12-19). "Linux 2.5 gets vsyscalls, sysenter support". KernelTrap. Retrieved 2008-01-01.

[1]