| author | Michael Kerrisk <mtk.manpages@gmail.com> | 2013-01-14 04:49:29 +0100 |
|---|---|---|
| committer | Michael Kerrisk <mtk.manpages@gmail.com> | 2014-09-13 20:15:57 -0700 |
| commit | 9d005472a828d0bd015b33d8e17381c1d26403fd (patch) | |
| tree | cd4eb13271461dbcb4265ade295fcdc699ee7ca6 | |
| parent | 3dd2331ce714fbb92c788f3a109a9e610e1e4d2d (diff) | |
| download | man-pages-9d005472a828d0bd015b33d8e17381c1d26403fd.tar.gz | |
clone.2, namespaces.7: Move some CLONE_NEWUSER text from clone.2 to namespaces.7
Signed-off-by: Michael Kerrisk <mtk.manpages@gmail.com>
| -rw-r--r-- | man2/clone.2 | 165 |
| -rw-r--r-- | man7/namespaces.7 | 146 |
2 files changed, 159 insertions, 152 deletions
diff --git a/man2/clone.2 b/man2/clone.2
index 14b4bb1150..72cdd66ec0 100644
--- a/man2/clone.2
+++ b/man2/clone.2
@@ -380,6 +380,33 @@ in the same call.
 .TP
+.BR CLONE_NEWPID " (since Linux 2.6.24)"
+.\" This explanation draws a lot of details from
+.\" http://lwn.net/Articles/259217/
+.\" Authors: Pavel Emelyanov <xemul@openvz.org>
+.\" and Kir Kolyshkin <kir@openvz.org>
+.\"
+.\" The primary kernel commit is 30e49c263e36341b60b735cbef5ca37912549264
+.\" Author: Pavel Emelyanov <xemul@openvz.org>
+If
+.B CLONE_NEWPID
+is set, then create the process in a new PID namespace.
+If this flag is not set, then (as with
+.BR fork (2))
+the process is created in the same PID namespace as
+the calling process.
+This flag is intended for the implementation of containers.
+
+For further information on PID namespaces, see
+.BR namespaces (7).
+
+Use of this flag requires
+that the process be privileged
+.RB ( CAP_SYS_ADMIN ).
+This flag can't be specified in conjunction with
+.BR CLONE_THREAD .
+
+.TP
 .BR CLONE_NEWUSER
 (This flag first became meaningful for
 .BR clone ()
@@ -397,39 +424,9 @@ If this flag is not set, then (as with
 .BR fork (2))
 the process is created in the same user namespace as
 the calling process.
-A user namespace provides an isolated environment for
-security related identifiers, in particular,
-user IDs, group IDs, keys (see
-.BR keyctl (2)),
-and capabilities.
-
-When a user namespace is created,
-it starts out without a mapping of user IDs (group IDs)
-to the parent user namespace.
-The desired mapping of user IDs (group IDs) to the parent user namespace
-may be set by writing into
-.IR /proc/[pid]/uid_map
-.RI ( /proc/[pid]/gid_map );
-see
-.BR proc (5).
-
-The first process in a user namespace starts out with a complete set
-of capabilities with respect to the new user namespace.
-
-System calls that return user IDs (group IDs) will return
-either the user ID (group ID) mapped into the current
-user namespace if there is a mapping, or the overflow user ID (group ID);
-the default value for the overflow user ID (group ID) is 65534.
-See the descriptions of
-.IR /proc/sys/kernel/overflowuid
-and
-.IR /proc/sys/kernel/overflowgid
-in
-.BR proc (5).
+For further information on user namespaces, see
+.BR namespaces (7).
 
-Use of this flag requires a kernel configured with the
-.BR CONFIG_USER_NS
-option.
 Before Linux 3.8, use of
 .BR CLONE_NEWUSER
 required that the caller have three capabilities:
@@ -439,111 +436,9 @@ and
 .BR CAP_SETGID .
 .\" Before Linux 2.6.29, it appears that only CAP_SYS_ADMIN was needed
 Starting with Linux 3.8,
-no privileges are needed to create a user namespace,
-and mount, PID, IPC, network, and UTS namespaces can be created with just the
-.B CAP_SYS_ADMIN
-capability in the caller's user namespace.
-
-If
-.BR CLONE_NEWUSER
-is specified along with other
-.B CLONE_NEW*
-flags in a single
-.BR clone()
-call, the user namespace is guaranteed to be created first,
-giving the caller privileges over the remaining
-namespaces created by the call.
-Thus, it possible for an unprivileged caller to specify this combination
-of flags.
-
-Over the years, there have been a lot of features that have been added
-to the Linux kernel that are only available to privileged users
-because of their potential to confuse set-user-ID-root applications.
-In general, it becomes safe to allow the root user in a user namespace to
-use those features because it is impossible, while in a user namespace,
-to gain more privilege than the root user of a user namespace has.
+no privileges are needed to create a user namespace.
 
 .TP
-.BR CLONE_NEWPID " (since Linux 2.6.24)"
-.\" This explanation draws a lot of details from
-.\" http://lwn.net/Articles/259217/
-.\" Authors: Pavel Emelyanov <xemul@openvz.org>
-.\" and Kir Kolyshkin <kir@openvz.org>
-.\"
-.\" The primary kernel commit is 30e49c263e36341b60b735cbef5ca37912549264
-.\" Author: Pavel Emelyanov <xemul@openvz.org>
-If
-.B CLONE_NEWPID
-is set, then create the process in a new PID namespace.
-If this flag is not set, then (as with
-.BR fork (2))
-the process is created in the same PID namespace as
-the calling process.
-This flag is intended for the implementation of containers.
-
-A PID namespace provides an isolated environment for PIDs:
-PIDs in a new namespace start at 1,
-somewhat like a standalone system, and calls to
-.BR fork (2),
-.BR vfork (2),
-or
-.BR clone ()
-will produce processes with PIDs that are unique within the namespace.
-
-The first process created in a new namespace
-(i.e., the process created using the
-.BR CLONE_NEWPID
-flag) has the PID 1, and is the "init" process for the namespace.
-Children that are orphaned within the namespace will be reparented
-to this process rather than
-.BR init (8).
-Unlike the traditional
-.B init
-process, the "init" process of a PID namespace can terminate,
-and if it does, all of the processes in the namespace are terminated.
-
-PID namespaces form a hierarchy.
-When a new PID namespace is created,
-the processes in that namespace are visible
-in the PID namespace of the process that created the new namespace;
-analogously, if the parent PID namespace is itself
-the child of another PID namespace,
-then processes in the child and parent PID namespaces will both be
-visible in the grandparent PID namespace.
-Conversely, the processes in the "child" PID namespace do not see
-the processes in the parent namespace.
-The existence of a namespace hierarchy means that each process
-may now have multiple PIDs:
-one for each namespace in which it is visible;
-each of these PIDs is unique within the corresponding namespace.
-(A call to
-.BR getpid (2)
-always returns the PID associated with the namespace in which
-the process lives.)
-
-After creating the new namespace,
-it is useful for the child to change its root directory
-and mount a new procfs instance at
-.I /proc
-so that tools such as
-.BR ps (1)
-work correctly.
-.\" mount -t proc proc /proc
-(If
-.BR CLONE_NEWNS
-is also included in
-.IR flags ,
-then it isn't necessary to change the root directory:
-a new procfs instance can be mounted directly over
-.IR /proc .)
-
-Use of this flag requires: a kernel configured with the
-.B CONFIG_PID_NS
-option and that the process be privileged
-.RB ( CAP_SYS_ADMIN ).
-This flag can't be specified in conjunction with
-.BR CLONE_THREAD .
-.TP
 .BR CLONE_NEWUTS " (since Linux 2.6.19)"
 If
 .B CLONE_NEWUTS
diff --git a/man7/namespaces.7 b/man7/namespaces.7
index 850a5e2c14..089bf2df91 100644
--- a/man7/namespaces.7
+++ b/man7/namespaces.7
@@ -292,27 +292,88 @@ PID namespaces isolate the process ID number space,
 meaning that processes in different PID namespaces can have the same PID.
 PID namespaces allow containers to migrate to a new hosts
 while the processes inside the container maintain the same PIDs.
-Each PID namespace has its own init (PID 1, see
-.BR init (1)),
-the "ancestor of all processes" that
-manages various system initialization tasks and
-reaps orphaned child processes when they terminate.
-
-From the point of view of a particular PID namespace instance,
-a process has two PIDs: the PID inside the namespace,
-and the PID outside the namespace on the host system.
-PID namespaces can be nested:
-a process will have one PID for each of the layers of the hierarchy
-starting from the PID namespace in which it resides
-through to the root PID namespace.
-A process can see (e.g., send signals with
+
+PIDs in a new PID namespace start at 1,
+somewhat like a standalone system, and calls to
+.BR fork (2),
+.BR vfork (2),
+or
+.BR clone (2)
+will produce processes with PIDs that are unique within the namespace.
+
+The first process created in a new namespace
+(i.e., the process created using
+.BR clone (2)
+with the
+.BR CLONE_NEWPID
+flag, or the first child created by a process after a call to
+.BR unshare (2)
+using the
+.BR CLONE_NEWPID
+flag) has the PID 1, and is the "init" process for the namespace (see
+.BR init (1)).
+Children that are orphaned within the namespace will be reparented
+to this process rather than
+.BR init (8).
+Unlike the traditional
+.B init
+process, the "init" process of a PID namespace can terminate,
+and if it does, all of the processes in the namespace are terminated.
+
+PID namespaces can be nested.
+When a new PID namespace is created,
+the processes in that namespace are visible
+in the PID namespace of the process that created the new namespace;
+analogously, if the parent PID namespace is itself
+the child of another PID namespace,
+then processes in the child and parent PID namespaces will both be
+visible in the grandparent PID namespace.
+Conversely, the processes in the "child" PID namespace do not see
+the processes in the parent namespace.
+More succinctly: a process can see (e.g., send signals with
 .BR kill(2))
-only processes contained in its own PID namespace
+only to processes contained in its own PID namespace
 and the namespaces nested below that PID namespace.
+A process will have one PID for each of the layers of the hierarchy
+starting from the PID namespace in which it resides
+through to the root PID namespace.
+A call to
+.BR getpid (2)
+always returns the PID associated with the namespace in which
+the process resides.
+
+After creating a new PID namespace,
+it is useful for the child to change its root directory
+and mount a new procfs instance at
+.I /proc
+so that tools such as
+.BR ps (1)
+work correctly.
+.\" mount -t proc proc /proc
+(If
+.BR CLONE_NEWNS
+is also included in the
+.IR flags
+argument of
+.BR clone (2)
+or
+.BR unshare (2)),
+then it isn't necessary to change the root directory:
+a new procfs instance can be mounted directly over
+.IR /proc .)
+
+Use of PID namespaces requires a kernel that is configured with the
+.B CONFIG_PID_NS
+option.
+
 .SS User namespaces (CLONE_NEWUSER)
-User namespaces isolate the user and group ID number spaces.
+User namespaces isolate
+security related identifiers, in particular,
+user IDs, group IDs, keys (see
+.BR keyctl (2)),
+and capabilities.
 In other words, a process's user and group IDs can be different
 inside and outside a user namespace.
 A process can have a normal unprivileged user ID outside a user namespace
@@ -321,7 +382,58 @@ in other words,
 the process has full privileges for operations inside the user namespace,
 but is unprivileged for operations outside the namespace.
 
-Starting in Linux 3.8, unprivileged processes can create user namespaces.
+When a user namespace is created,
+it starts out without a mapping of user IDs (group IDs)
+to the parent user namespace.
+The desired mapping of user IDs (group IDs) to the parent user namespace
+may be set by writing into
+.IR /proc/[pid]/uid_map
+.RI ( /proc/[pid]/gid_map );
+see below.
+
+The first process in a user namespace starts out with a complete set
+of capabilities with respect to the new user namespace.
+
+System calls that return user IDs (group IDs) will return
+either the user ID (group ID) mapped into the current
+user namespace if there is a mapping, or the overflow user ID (group ID);
+the default value for the overflow user ID (group ID) is 65534.
+See the descriptions of
+.IR /proc/sys/kernel/overflowuid
+and
+.IR /proc/sys/kernel/overflowgid
+in
+.BR proc (5).
+
+Starting in Linux 3.8, unprivileged processes can create user namespaces,
+and mount, PID, IPC, network, and UTS namespaces can be created with just the
+.B CAP_SYS_ADMIN
+capability in the caller's user namespace.
+
+If
+.BR CLONE_NEWUSER
+is specified along with other
+.B CLONE_NEW*
+flags in a single
+.BR clone (2)
+or
+.BR unshare (2)
+call, the user namespace is guaranteed to be created first,
+giving the caller privileges over the remaining
+namespaces created by the call.
+Thus, it possible for an unprivileged caller to specify this combination
+of flags.
+
+Use of user namespaces requires a kernel that is configured with the
+.B CONFIG_USER_NS
+option.
+
+Over the years, there have been a lot of features that have been added
+to the Linux kernel that are only available to privileged users
+because of their potential to confuse set-user-ID-root applications.
+In general, it becomes safe to allow the root user in a user namespace to
+use those features because it is impossible, while in a user namespace,
+to gain more privilege than the root user of a user namespace has.
 
 The
 .IR /proc/[pid]/uid_map
