Linux namespaces
Revision as of 04:24, 21 November 2016
Original author(s) | Al Viro |
---|---|
Developer(s) | Eric W. Biederman, Pavel Emelyanov, Al Viro, Cyrill Gorcunov et al. |
Initial release | 2002 |
Written in | C |
Operating system | Linux |
Type | System software |
License | GPL and LGPL |
Namespaces are a feature of the Linux kernel that isolates and virtualizes system resources of a collection of processes. Examples of resources that can be virtualized include process IDs, hostnames, user IDs, network access, interprocess communication, and filesystems.
Linux developers use the term namespace to refer both to the namespace kinds and to specific instances of these kinds.
A Linux system is initialized with a single namespace of each kind, shared by all processes. After initialization, processes can create additional namespaces or join existing ones.
History
Linux namespaces were inspired by the more general namespace functionality used heavily throughout Plan 9 from Bell Labs.[1]
Linux namespaces originated in 2002 in the 2.4.19 kernel with work on the mount namespace kind.
Most of the work on the namespace kinds described below was finished in kernel version 3.8.
Namespace kinds
As of kernel 3.8, there are six kinds of namespaces. Namespace functionality is the same across all kinds: each process is associated with one namespace of each kind and can only see the resources associated with that namespace. This way each process (or group of processes) can have its own view of the resource. Which resource is isolated depends on the kind of namespace.
Mount (mnt)
Mount namespaces control mount points. Upon creation, the mounts from the current mount namespace are copied to the new namespace, but mount points created afterwards do not propagate between namespaces. Shared subtrees make it possible to propagate mount points between namespaces.[2]
The mount namespace was the first kind to be introduced, at a time when no other kinds were anticipated, which is why its clone flag has the generic name CLONE_NEWNS.
Process ID (pid)
A PID namespace gives its processes an independent set of process IDs and allows for a separate init process (PID 1) inside the namespace.
- a process has a PID inside its own namespace and can also be seen, under a different PID, from processes in the parent namespace
- PID namespaces can be nested
- they aid in process migration between different hosts
Network (net)
- virtualizes the whole network stack
- cannot be nested; each network namespace is attached to a user namespace
Interprocess Communication (ipc)
- System V IPC identifiers
- POSIX message queue filesystem
UTS
- hostname
- domainname
User ID (user)
- uids and gids
- permissions for namespaces of the other kinds are checked against the user namespace in which they were created
Proposed namespaces
time namespace
Other namespace kinds have been proposed. For example, a time namespace patch was proposed but, as of this revision, not merged into the kernel.
syslog namespace
Implementation details
The kernel assigns each process a symbolic link per namespace kind in /proc/<pid>/ns/. The inode number pointed to by this symlink is the same for every process in the namespace, so the inode number uniquely identifies the namespace.
Reading the symlink via readlink returns a string containing the namespace kind name and the inode number of the namespace.
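The identification scheme described above can be sketched in Python. This is a minimal sketch, not a reference implementation: the function names parse_ns_link, namespace_id and same_namespace are illustrative, and the kind:[inode] link format (for example pid:[4026531836]) is assumed to be what the running kernel reports.

```python
import os
import re

# Assumed link target format: "<kind>:[<inode>]", e.g. "pid:[4026531836]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def parse_ns_link(target):
    """Split a namespace symlink target into (kind, inode)."""
    match = _NS_LINK.match(target)
    if match is None:
        raise ValueError(f"not a namespace link: {target!r}")
    return match.group("kind"), int(match.group("inode"))


def namespace_id(pid, kind):
    """Return (kind, inode) for one namespace of a process (Linux only)."""
    return parse_ns_link(os.readlink(f"/proc/{pid}/ns/{kind}"))


def same_namespace(pid_a, pid_b, kind):
    """True if both processes belong to the same namespace of this kind."""
    return namespace_id(pid_a, kind) == namespace_id(pid_b, kind)
```

On a Linux system, namespace_id("self", "uts") returns the identifier of the calling process's UTS namespace, and same_namespace(1, "self", "net") tells whether the caller shares a network namespace with PID 1 (reading another process's links may require sufficient privileges).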
Syscalls
Three syscalls can directly manipulate namespaces:
- clone: creates a new process; flags specify which new namespaces the new process should be placed in.
- unshare: flags specify which new namespaces the current process should be moved to.
- setns: enters the namespace specified by a file descriptor.
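As a rough illustration of how the flags are combined, the following Python sketch builds a flag mask for a set of namespace kinds and passes it to the unshare syscall through the C library. The helper names namespace_flags and unshare and the CLONE_FLAGS table are illustrative; the numeric values are the CLONE_NEW* constants from the kernel headers. Calling unshare normally requires elevated privileges (except for a user namespace), so treat this as a sketch under those assumptions.

```python
import ctypes
import os

# CLONE_NEW* flag values from the kernel headers, one per namespace kind.
CLONE_FLAGS = {
    "mnt": 0x00020000,   # CLONE_NEWNS
    "uts": 0x04000000,   # CLONE_NEWUTS
    "ipc": 0x08000000,   # CLONE_NEWIPC
    "user": 0x10000000,  # CLONE_NEWUSER
    "pid": 0x20000000,   # CLONE_NEWPID
    "net": 0x40000000,   # CLONE_NEWNET
}


def namespace_flags(kinds):
    """OR together the clone flags for the requested namespace kinds."""
    flags = 0
    for kind in kinds:
        flags |= CLONE_FLAGS[kind]
    return flags


def unshare(kinds):
    """Call unshare(2) for the given kinds on the calling process.

    Note: for "pid", the caller itself keeps its PID namespace; only
    children created afterwards are placed in the new one.
    """
    libc = ctypes.CDLL(None, use_errno=True)  # the already-loaded C library
    if libc.unshare(namespace_flags(kinds)) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

For example, unshare(["uts"]) gives the calling process a private copy of the hostname and domain name, provided it has the required privileges; without them the call raises OSError.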
Destruction
If a namespace is no longer referenced, it is deleted; how the contained resource is handled depends on the namespace kind. Namespaces can be referenced in three ways:
- a process belonging to the namespace
- an open file descriptor to the namespace's file (/proc/<pid>/ns/<ns-kind>)
- a bind mount of the namespace's file (/proc/<pid>/ns/<ns-kind>)
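The second kind of reference, an open file descriptor, can be sketched as follows. The helper names open_namespace_handle and handle_inode are hypothetical, and the proc_root parameter exists only so the sketch can be exercised against a directory other than /proc; it is not part of any kernel interface.

```python
import os


def open_namespace_handle(pid, kind, proc_root="/proc"):
    """Open a process's namespace file and return the file descriptor.

    While the descriptor stays open it counts as a reference, so the
    namespace is kept even if every process in it exits.
    """
    path = os.path.join(proc_root, str(pid), "ns", kind)
    return os.open(path, os.O_RDONLY)


def handle_inode(fd):
    """Inode number behind an open handle, which identifies the namespace."""
    return os.fstat(fd).st_ino
```

Closing the descriptor with os.close drops that reference again; if no process, descriptor or bind mount refers to the namespace afterwards, the kernel can delete it.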
Adoption
Various container software, including Docker[3] and LXC, uses Linux namespaces in combination with cgroups to isolate processes.
The util-linux package also provides an unshare wrapper. An example of its use is:
SHELL=/bin/sh unshare --fork --pid chroot "${chrootdir}" "$@"
See also
References
- [1] "The Use of Name Spaces in Plan 9". 1992.
- [3] "Docker security". docker.com. Retrieved 2016-03-24.