aboutsummaryrefslogtreecommitdiffstats
path: root/man5/proc_sys.5
diff options
context:
space:
mode:
authorAlejandro Colomar <alx@kernel.org>2023-08-17 22:47:16 +0200
committerAlejandro Colomar <alx@kernel.org>2023-08-17 23:12:31 +0200
commit0569afbbccd6de28d1bacd13471a679ad2674aa1 (patch)
treee0286e2d8dee88465fc9e2b31697573af0aa10cf /man5/proc_sys.5
parent29597f1e7ecd58e8239a1650c6bdea0517f913af (diff)
parent92cdcec79df039146e5ed42cac23cd4b7e3f9e25 (diff)
downloadman-pages-0569afbbccd6.tar.gz
proc*.5: Make sashimi
[Merge tag 'proc-sashimi-v1' of <git://www.alejandro-colomar.es/src/alx/linux/man-pages/man-pages.git>] proc(5) was a huge page, which was quite hard to maintain, extend, read, and refer to. Split the page into small pages for the different directories and files within /proc. Some pages are still too large (e.g., proc_sys(5)), and will some day be split even more. This split keeps the contents of the original page, without modifying anything; not even the formatting. The only thing that has been modified in this patches, is that directories are consistently represented with a trailing slash. For the file name of the pages, we've used the name of the interface (e.g., /proc/pid/), removing the leading and trailing '/'s and then translating the remaining ones as `tr / _` (e.g., proc_pid.5). The title of the pages (TH) is consistent with this. The NAME of the pages, however, is the actual path name of the interfaces. The man page references have not been updated, as that was a more complex and tedious work, so I expect that they'll be slowly updated as we and users find out. Link: <https://lore.kernel.org/linux-man/e3a5bc09-e835-9819-4aaa-12959495ac59@kernel.org/T/> Acked-by: Oskari Pirhonen <xxc3ncoredxx@gmail.com> Acked-by: Günther Noack <gnoack@google.com> Acked-by: "G. Branden Robinson" <g.branden.robinson@gmail.com> Cc: Brian Inglis <Brian.Inglis@Shaw.ca> Cc: Ingo Schwarze <schwarze@usta.de> Cc: Colin Watson <cjwatson@debian.org> Signed-off-by: Alejandro Colomar <alx@kernel.org>
Diffstat (limited to 'man5/proc_sys.5')
-rw-r--r--man5/proc_sys.51623
1 files changed, 1623 insertions, 0 deletions
diff --git a/man5/proc_sys.5 b/man5/proc_sys.5
new file mode 100644
index 0000000000..78f0c192c2
--- /dev/null
+++ b/man5/proc_sys.5
@@ -0,0 +1,1623 @@
+'\" t
+.\" Copyright (C) 1994, 1995, Daniel Quinlan <quinlan@yggdrasil.com>
+.\" Copyright (C) 2002-2008, 2017, Michael Kerrisk <mtk.manpages@gmail.com>
+.\" Copyright (C) , Andries Brouwer <aeb@cwi.nl>
+.\" Copyright (C) 2023, Alejandro Colomar <alx@kernel.org>
+.\"
+.\" SPDX-License-Identifier: GPL-3.0-or-later
+.\"
+.TH proc_sys 5 (date) "Linux man-pages (unreleased)"
+.SH NAME
+/proc/sys/ \- system information, and sysctl pseudo-filesystem
+.SH DESCRIPTION
+.TP
+.I /proc/sys/
+This directory (present since Linux 1.3.57) contains a number of files
+and subdirectories corresponding to kernel variables.
+These variables can be read and in some cases modified using
+the \fI/proc\fP filesystem, and the (deprecated)
+.BR sysctl (2)
+system call.
+.IP
+String values may be terminated by either \[aq]\e0\[aq] or \[aq]\en\[aq].
+.IP
+Integer and long values may be written either in decimal or in
+hexadecimal notation (e.g., 0x3FFF).
+When writing multiple integer or long values, these may be separated
+by any of the following whitespace characters:
+\[aq]\ \[aq], \[aq]\et\[aq], or \[aq]\en\[aq].
+Using other separators leads to the error
+.BR EINVAL .
+.TP
+.IR /proc/sys/abi/ " (since Linux 2.4.10)"
+This directory may contain files with application binary information.
+.\" On some systems, it is not present.
+See the Linux kernel source file
+.I Documentation/sysctl/abi.rst
+(or
+.I Documentation/sysctl/abi.txt
+before Linux 5.3)
+for more information.
+.TP
+.I /proc/sys/debug/
+This directory may be empty.
+.TP
+.I /proc/sys/dev/
+This directory contains device-specific information (e.g.,
+.IR dev/cdrom/info ).
+On
+some systems, it may be empty.
+.TP
+.I /proc/sys/fs/
+This directory contains the files and subdirectories for kernel variables
+related to filesystems.
+.TP
+.IR /proc/sys/fs/aio\-max\-nr " and " /proc/sys/fs/aio\-nr " (since Linux 2.6.4)"
+.I aio\-nr
+is the running total of the number of events specified by
+.BR io_setup (2)
+calls for all currently active AIO contexts.
+If
+.I aio\-nr
+reaches
+.IR aio\-max\-nr ,
+then
+.BR io_setup (2)
+will fail with the error
+.BR EAGAIN .
+Raising
+.I aio\-max\-nr
+does not result in the preallocation or resizing
+of any kernel data structures.
+.TP
+.I /proc/sys/fs/binfmt_misc
+Documentation for files in this directory can be found
+in the Linux kernel source in the file
+.I Documentation/admin\-guide/binfmt\-misc.rst
+(or in
+.I Documentation/binfmt_misc.txt
+on older kernels).
+.TP
+.IR /proc/sys/fs/dentry\-state " (since Linux 2.2)"
+This file contains information about the status of the
+directory cache (dcache).
+The file contains six numbers,
+.IR nr_dentry ,
+.IR nr_unused ,
+.I age_limit
+(age in seconds),
+.I want_pages
+(pages requested by system) and two dummy values.
+.RS
+.IP \[bu] 3
+.I nr_dentry
+is the number of allocated dentries (dcache entries).
+This field is unused in Linux 2.2.
+.IP \[bu]
+.I nr_unused
+is the number of unused dentries.
+.IP \[bu]
+.I age_limit
+.\" looks like this is unused in Linux 2.2 to Linux 2.6
+is the age in seconds after which dcache entries
+can be reclaimed when memory is short.
+.IP \[bu]
+.I want_pages
+.\" looks like this is unused in Linux 2.2 to Linux 2.6
+is nonzero when the kernel has called shrink_dcache_pages() and the
+dcache isn't pruned yet.
+.RE
+.TP
+.I /proc/sys/fs/dir\-notify\-enable
+This file can be used to disable or enable the
+.I dnotify
+interface described in
+.BR fcntl (2)
+on a system-wide basis.
+A value of 0 in this file disables the interface,
+and a value of 1 enables it.
+.TP
+.I /proc/sys/fs/dquot\-max
+This file shows the maximum number of cached disk quota entries.
+On some (2.4) systems, it is not present.
+If the number of free cached disk quota entries is very low and
+you have some awesome number of simultaneous system users,
+you might want to raise the limit.
+.TP
+.I /proc/sys/fs/dquot\-nr
+This file shows the number of allocated disk quota
+entries and the number of free disk quota entries.
+.TP
+.IR /proc/sys/fs/epoll/ " (since Linux 2.6.28)"
+This directory contains the file
+.IR max_user_watches ,
+which can be used to limit the amount of kernel memory consumed by the
+.I epoll
+interface.
+For further details, see
+.BR epoll (7).
+.TP
+.I /proc/sys/fs/file\-max
+This file defines
+a system-wide limit on the number of open files for all processes.
+System calls that fail when encountering this limit fail with the error
+.BR ENFILE .
+(See also
+.BR setrlimit (2),
+which can be used by a process to set the per-process limit,
+.BR RLIMIT_NOFILE ,
+on the number of files it may open.)
+If you get lots
+of error messages in the kernel log about running out of file handles
+(open file descriptions)
+(look for "VFS: file\-max limit <number> reached"),
+try increasing this value:
+.IP
+.in +4n
+.EX
+echo 100000 > /proc/sys/fs/file\-max
+.EE
+.in
+.IP
+Privileged processes
+.RB ( CAP_SYS_ADMIN )
+can override the
+.I file\-max
+limit.
+.TP
+.I /proc/sys/fs/file\-nr
+This (read-only) file contains three numbers:
+the number of allocated file handles
+(i.e., the number of open file descriptions; see
+.BR open (2));
+the number of free file handles;
+and the maximum number of file handles (i.e., the same value as
+.IR /proc/sys/fs/file\-max ).
+If the number of allocated file handles is close to the
+maximum, you should consider increasing the maximum.
+Before Linux 2.6,
+the kernel allocated file handles dynamically,
+but it didn't free them again.
+Instead the free file handles were kept in a list for reallocation;
+the "free file handles" value indicates the size of that list.
+A large number of free file handles indicates that there was
+a past peak in the usage of open file handles.
+Since Linux 2.6, the kernel does deallocate freed file handles,
+and the "free file handles" value is always zero.
+.TP
+.IR /proc/sys/fs/inode\-max " (only present until Linux 2.2)"
+This file contains the maximum number of in-memory inodes.
+This value should be 3\[en]4 times larger
+than the value in
+.IR file\-max ,
+since \fIstdin\fP, \fIstdout\fP
+and network sockets also need an inode to handle them.
+When you regularly run out of inodes, you need to increase this value.
+.IP
+Starting with Linux 2.4,
+there is no longer a static limit on the number of inodes,
+and this file is removed.
+.TP
+.I /proc/sys/fs/inode\-nr
+This file contains the first two values from
+.IR inode\-state .
+.TP
+.I /proc/sys/fs/inode\-state
+This file
+contains seven numbers:
+.IR nr_inodes ,
+.IR nr_free_inodes ,
+.IR preshrink ,
+and four dummy values (always zero).
+.IP
+.I nr_inodes
+is the number of inodes the system has allocated.
+.\" This can be slightly more than
+.\" .I inode\-max
+.\" because Linux allocates them one page full at a time.
+.I nr_free_inodes
+represents the number of free inodes.
+.IP
+.I preshrink
+is nonzero when the
+.I nr_inodes
+>
+.I inode\-max
+and the system needs to prune the inode list instead of allocating more;
+since Linux 2.4, this field is a dummy value (always zero).
+.TP
+.IR /proc/sys/fs/inotify/ " (since Linux 2.6.13)"
+This directory contains files
+.IR max_queued_events ", " max_user_instances ", and " max_user_watches ,
+that can be used to limit the amount of kernel memory consumed by the
+.I inotify
+interface.
+For further details, see
+.BR inotify (7).
+.TP
+.I /proc/sys/fs/lease\-break\-time
+This file specifies the grace period that the kernel grants to a process
+holding a file lease
+.RB ( fcntl (2))
+after it has sent a signal to that process notifying it
+that another process is waiting to open the file.
+If the lease holder does not remove or downgrade the lease within
+this grace period, the kernel forcibly breaks the lease.
+.TP
+.I /proc/sys/fs/leases\-enable
+This file can be used to enable or disable file leases
+.RB ( fcntl (2))
+on a system-wide basis.
+If this file contains the value 0, leases are disabled.
+A nonzero value enables leases.
+.TP
+.IR /proc/sys/fs/mount\-max " (since Linux 4.9)"
+.\" commit d29216842a85c7970c536108e093963f02714498
+The value in this file specifies the maximum number of mounts that may exist
+in a mount namespace.
+The default value in this file is 100,000.
+.TP
+.IR /proc/sys/fs/mqueue/ " (since Linux 2.6.6)"
+This directory contains files
+.IR msg_max ", " msgsize_max ", and " queues_max ,
+controlling the resources used by POSIX message queues.
+See
+.BR mq_overview (7)
+for details.
+.TP
+.IR /proc/sys/fs/nr_open " (since Linux 2.6.25)"
+.\" commit 9cfe015aa424b3c003baba3841a60dd9b5ad319b
+This file imposes a ceiling on the value to which the
+.B RLIMIT_NOFILE
+resource limit can be raised (see
+.BR getrlimit (2)).
+This ceiling is enforced for both unprivileged and privileged process.
+The default value in this file is 1048576.
+(Before Linux 2.6.25, the ceiling for
+.B RLIMIT_NOFILE
+was hard-coded to the same value.)
+.TP
+.IR /proc/sys/fs/overflowgid " and " /proc/sys/fs/overflowuid
+These files
+allow you to change the value of the fixed UID and GID.
+The default is 65534.
+Some filesystems support only 16-bit UIDs and GIDs, although in Linux
+UIDs and GIDs are 32 bits.
+When one of these filesystems is mounted
+with writes enabled, any UID or GID that would exceed 65535 is translated
+to the overflow value before being written to disk.
+.TP
+.IR /proc/sys/fs/pipe\-max\-size " (since Linux 2.6.35)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/pipe\-user\-pages\-hard " (since Linux 4.5)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/pipe\-user\-pages\-soft " (since Linux 4.5)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/protected_fifos " (since Linux 4.19)"
+The value in this file is/can be set to one of the following:
+.RS
+.TP 4
+0
+Writing to FIFOs is unrestricted.
+.TP
+1
+Don't allow
+.B O_CREAT
+.BR open (2)
+on FIFOs that the caller doesn't own in world-writable sticky directories,
+unless the FIFO is owned by the owner of the directory.
+.TP
+2
+As for the value 1,
+but the restriction also applies to group-writable sticky directories.
+.RE
+.IP
+The intent of the above protections is to avoid unintentional writes to an
+attacker-controlled FIFO when a program expected to create a regular file.
+.TP
+.IR /proc/sys/fs/protected_hardlinks " (since Linux 3.6)"
+.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
+When the value in this file is 0,
+no restrictions are placed on the creation of hard links
+(i.e., this is the historical behavior before Linux 3.6).
+When the value in this file is 1,
+a hard link can be created to a target file
+only if one of the following conditions is true:
+.RS
+.IP \[bu] 3
+The calling process has the
+.B CAP_FOWNER
+capability in its user namespace
+and the file UID has a mapping in the namespace.
+.IP \[bu]
+The filesystem UID of the process creating the link matches
+the owner (UID) of the target file
+(as described in
+.BR credentials (7),
+a process's filesystem UID is normally the same as its effective UID).
+.IP \[bu]
+All of the following conditions are true:
+.RS 4
+.IP \[bu] 3
+the target is a regular file;
+.IP \[bu]
+the target file does not have its set-user-ID mode bit enabled;
+.IP \[bu]
+the target file does not have both its set-group-ID and
+group-executable mode bits enabled; and
+.IP \[bu]
+the caller has permission to read and write the target file
+(either via the file's permissions mask or because it has
+suitable capabilities).
+.RE
+.RE
+.IP
+The default value in this file is 0.
+Setting the value to 1
+prevents a longstanding class of security issues caused by
+hard-link-based time-of-check, time-of-use races,
+most commonly seen in world-writable directories such as
+.IR /tmp .
+The common method of exploiting this flaw
+is to cross privilege boundaries when following a given hard link
+(i.e., a root process follows a hard link created by another user).
+Additionally, on systems without separated partitions,
+this stops unauthorized users from "pinning" vulnerable set-user-ID and
+set-group-ID files against being upgraded by
+the administrator, or linking to special files.
+.TP
+.IR /proc/sys/fs/protected_regular " (since Linux 4.19)"
+The value in this file is/can be set to one of the following:
+.RS
+.TP 4
+0
+Writing to regular files is unrestricted.
+.TP
+1
+Don't allow
+.B O_CREAT
+.BR open (2)
+on regular files that the caller doesn't own in
+world-writable sticky directories,
+unless the regular file is owned by the owner of the directory.
+.TP
+2
+As for the value 1,
+but the restriction also applies to group-writable sticky directories.
+.RE
+.IP
+The intent of the above protections is similar to
+.IR protected_fifos ,
+but allows an application to
+avoid writes to an attacker-controlled regular file,
+where the application expected to create one.
+.TP
+.IR /proc/sys/fs/protected_symlinks " (since Linux 3.6)"
+.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
+When the value in this file is 0,
+no restrictions are placed on following symbolic links
+(i.e., this is the historical behavior before Linux 3.6).
+When the value in this file is 1, symbolic links are followed only
+in the following circumstances:
+.RS
+.IP \[bu] 3
+the filesystem UID of the process following the link matches
+the owner (UID) of the symbolic link
+(as described in
+.BR credentials (7),
+a process's filesystem UID is normally the same as its effective UID);
+.IP \[bu]
+the link is not in a sticky world-writable directory; or
+.IP \[bu]
+the symbolic link and its parent directory have the same owner (UID)
+.RE
+.IP
+A system call that fails to follow a symbolic link
+because of the above restrictions returns the error
+.B EACCES
+in
+.IR errno .
+.IP
+The default value in this file is 0.
+Setting the value to 1 avoids a longstanding class of security issues
+based on time-of-check, time-of-use races when accessing symbolic links.
+.TP
+.IR /proc/sys/fs/suid_dumpable " (since Linux 2.6.13)"
+.\" The following is based on text from Documentation/sysctl/kernel.txt
+The value in this file is assigned to a process's "dumpable" flag
+in the circumstances described in
+.BR prctl (2).
+In effect,
+the value in this file determines whether core dump files are
+produced for set-user-ID or otherwise protected/tainted binaries.
+The "dumpable" setting also affects the ownership of files in a process's
+.IR /proc/ pid
+directory, as described above.
+.IP
+Three different integer values can be specified:
+.RS
+.TP
+\fI0\ (default)\fP
+.\" In kernel source: SUID_DUMP_DISABLE
+This provides the traditional (pre-Linux 2.6.13) behavior.
+A core dump will not be produced for a process which has
+changed credentials (by calling
+.BR seteuid (2),
+.BR setgid (2),
+or similar, or by executing a set-user-ID or set-group-ID program)
+or whose binary does not have read permission enabled.
+.TP
+\fI1\ ("debug")\fP
+.\" In kernel source: SUID_DUMP_USER
+All processes dump core when possible.
+(Reasons why a process might nevertheless not dump core are described in
+.BR core (5).)
+The core dump is owned by the filesystem user ID of the dumping process
+and no security is applied.
+This is intended for system debugging situations only:
+this mode is insecure because it allows unprivileged users to
+examine the memory contents of privileged processes.
+.TP
+\fI2\ ("suidsafe")\fP
+.\" In kernel source: SUID_DUMP_ROOT
+Any binary which normally would not be dumped (see "0" above)
+is dumped readable by root only.
+This allows the user to remove the core dump file but not to read it.
+For security reasons core dumps in this mode will not overwrite one
+another or other files.
+This mode is appropriate when administrators are
+attempting to debug problems in a normal environment.
+.IP
+Additionally, since Linux 3.6,
+.\" 9520628e8ceb69fa9a4aee6b57f22675d9e1b709
+.I /proc/sys/kernel/core_pattern
+must either be an absolute pathname
+or a pipe command, as detailed in
+.BR core (5).
+Warnings will be written to the kernel log if
+.I core_pattern
+does not follow these rules, and no core dump will be produced.
+.\" 54b501992dd2a839e94e76aa392c392b55080ce8
+.RE
+.IP
+For details of the effect of a process's "dumpable" setting
+on ptrace access mode checking, see
+.BR ptrace (2).
+.TP
+.I /proc/sys/fs/super\-max
+This file
+controls the maximum number of superblocks, and
+thus the maximum number of mounted filesystems the kernel
+can have.
+You need increase only
+.I super\-max
+if you need to mount more filesystems than the current value in
+.I super\-max
+allows you to.
+.TP
+.I /proc/sys/fs/super\-nr
+This file
+contains the number of filesystems currently mounted.
+.TP
+.I /proc/sys/kernel/
+This directory contains files controlling a range of kernel parameters,
+as described below.
+.TP
+.I /proc/sys/kernel/acct
+This file
+contains three numbers:
+.IR highwater ,
+.IR lowwater ,
+and
+.IR frequency .
+If BSD-style process accounting is enabled, these values control
+its behavior.
+If free space on filesystem where the log lives goes below
+.I lowwater
+percent, accounting suspends.
+If free space gets above
+.I highwater
+percent, accounting resumes.
+.I frequency
+determines
+how often the kernel checks the amount of free space (value is in
+seconds).
+Default values are 4, 2, and 30.
+That is, suspend accounting if 2% or less space is free; resume it
+if 4% or more space is free; consider information about amount of free space
+valid for 30 seconds.
+.TP
+.IR /proc/sys/kernel/auto_msgmni " (Linux 2.6.27 to Linux 3.18)"
+.\" commit 9eefe520c814f6f62c5d36a2ddcd3fb99dfdb30e (introduces feature)
+.\" commit 0050ee059f7fc86b1df2527aaa14ed5dc72f9973 (rendered redundant)
+From Linux 2.6.27 to Linux 3.18,
+this file was used to control recomputing of the value in
+.I /proc/sys/kernel/msgmni
+upon the addition or removal of memory or upon IPC namespace creation/removal.
+Echoing "1" into this file enabled
+.I msgmni
+automatic recomputing (and triggered a recomputation of
+.I msgmni
+based on the current amount of available memory and number of IPC namespaces).
+Echoing "0" disabled automatic recomputing.
+(Automatic recomputing was also disabled if a value was explicitly assigned to
+.IR /proc/sys/kernel/msgmni .)
+The default value in
+.I auto_msgmni
+was 1.
+.IP
+Since Linux 3.19, the content of this file has no effect (because
+.I msgmni
+.\" FIXME Must document the 3.19 'msgmni' changes.
+defaults to near the maximum value possible),
+and reads from this file always return the value "0".
+.TP
+.IR /proc/sys/kernel/cap_last_cap " (since Linux 3.2)"
+See
+.BR capabilities (7).
+.TP
+.IR /proc/sys/kernel/cap\-bound " (from Linux 2.2 to Linux 2.6.24)"
+This file holds the value of the kernel
+.I "capability bounding set"
+(expressed as a signed decimal number).
+This set is ANDed against the capabilities permitted to a process
+during
+.BR execve (2).
+Starting with Linux 2.6.25,
+the system-wide capability bounding set disappeared,
+and was replaced by a per-thread bounding set; see
+.BR capabilities (7).
+.TP
+.I /proc/sys/kernel/core_pattern
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/core_pipe_limit
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/core_uses_pid
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/ctrl\-alt\-del
+This file
+controls the handling of Ctrl-Alt-Del from the keyboard.
+When the value in this file is 0, Ctrl-Alt-Del is trapped and
+sent to the
+.BR init (1)
+program to handle a graceful restart.
+When the value is greater than zero, Linux's reaction to a Vulcan
+Nerve Pinch (tm) will be an immediate reboot, without even
+syncing its dirty buffers.
+Note: when a program (like dosemu) has the keyboard in "raw"
+mode, the Ctrl-Alt-Del is intercepted by the program before it
+ever reaches the kernel tty layer, and it's up to the program
+to decide what to do with it.
+.TP
+.IR /proc/sys/kernel/dmesg_restrict " (since Linux 2.6.37)"
+The value in this file determines who can see kernel syslog contents.
+A value of 0 in this file imposes no restrictions.
+If the value is 1, only privileged users can read the kernel syslog.
+(See
+.BR syslog (2)
+for more details.)
+Since Linux 3.4,
+.\" commit 620f6e8e855d6d447688a5f67a4e176944a084e8
+only users with the
+.B CAP_SYS_ADMIN
+capability may change the value in this file.
+.TP
+.IR /proc/sys/kernel/domainname " and " /proc/sys/kernel/hostname
+can be used to set the NIS/YP domainname and the
+hostname of your box in exactly the same way as the commands
+.BR domainname (1)
+and
+.BR hostname (1),
+that is:
+.IP
+.in +4n
+.EX
+.RB "#" " echo \[aq]darkstar\[aq] > /proc/sys/kernel/hostname"
+.RB "#" " echo \[aq]mydomain\[aq] > /proc/sys/kernel/domainname"
+.EE
+.in
+.IP
+has the same effect as
+.IP
+.in +4n
+.EX
+.RB "#" " hostname \[aq]darkstar\[aq]"
+.RB "#" " domainname \[aq]mydomain\[aq]"
+.EE
+.in
+.IP
+Note, however, that the classic darkstar.frop.org has the
+hostname "darkstar" and DNS (Internet Domain Name Server)
+domainname "frop.org", not to be confused with the NIS (Network
+Information Service) or YP (Yellow Pages) domainname.
+These two
+domain names are in general different.
+For a detailed discussion
+see the
+.BR hostname (1)
+man page.
+.TP
+.I /proc/sys/kernel/hotplug
+This file
+contains the pathname for the hotplug policy agent.
+The default value in this file is
+.IR /sbin/hotplug .
+.TP
+.\" Removed in commit 87f504e5c78b910b0c1d6ffb89bc95e492322c84 (tglx/history.git)
+.IR /proc/sys/kernel/htab\-reclaim " (before Linux 2.4.9.2)"
+(PowerPC only) If this file is set to a nonzero value,
+the PowerPC htab
+.\" removed in commit 1b483a6a7b2998e9c98ad985d7494b9b725bd228, before Linux 2.6.28
+(see kernel file
+.IR Documentation/powerpc/ppc_htab.txt )
+is pruned
+each time the system hits the idle loop.
+.TP
+.I /proc/sys/kernel/keys/
+This directory contains various files that define parameters and limits
+for the key-management facility.
+These files are described in
+.BR keyrings (7).
+.TP
+.IR /proc/sys/kernel/kptr_restrict " (since Linux 2.6.38)"
+.\" 455cd5ab305c90ffc422dd2e0fb634730942b257
+The value in this file determines whether kernel addresses are exposed via
+.I /proc
+files and other interfaces.
+A value of 0 in this file imposes no restrictions.
+If the value is 1, kernel pointers printed using the
+.I %pK
+format specifier will be replaced with zeros unless the user has the
+.B CAP_SYSLOG
+capability.
+If the value is 2, kernel pointers printed using the
+.I %pK
+format specifier will be replaced with zeros regardless
+of the user's capabilities.
+The initial default value for this file was 1,
+but the default was changed
+.\" commit 411f05f123cbd7f8aa1edcae86970755a6e2a9d9
+to 0 in Linux 2.6.39.
+Since Linux 3.4,
+.\" commit 620f6e8e855d6d447688a5f67a4e176944a084e8
+only users with the
+.B CAP_SYS_ADMIN
+capability can change the value in this file.
+.TP
+.I /proc/sys/kernel/l2cr
+(PowerPC only) This file
+contains a flag that controls the L2 cache of G3 processor
+boards.
+If 0, the cache is disabled.
+Enabled if nonzero.
+.TP
+.I /proc/sys/kernel/modprobe
+This file contains the pathname for the kernel module loader.
+The default value is
+.IR /sbin/modprobe .
+The file is present only if the kernel is built with the
+.B CONFIG_MODULES
+.RB ( CONFIG_KMOD
+in Linux 2.6.26 and earlier)
+option enabled.
+It is described by the Linux kernel source file
+.I Documentation/kmod.txt
+(present only in Linux 2.4 and earlier).
+.TP
+.IR /proc/sys/kernel/modules_disabled " (since Linux 2.6.31)"
+.\" 3d43321b7015387cfebbe26436d0e9d299162ea1
+.\" From Documentation/sysctl/kernel.txt
+A toggle value indicating if modules are allowed to be loaded
+in an otherwise modular kernel.
+This toggle defaults to off (0), but can be set true (1).
+Once true, modules can be neither loaded nor unloaded,
+and the toggle cannot be set back to false.
+The file is present only if the kernel is built with the
+.B CONFIG_MODULES
+option enabled.
+.TP
+.IR /proc/sys/kernel/msgmax " (since Linux 2.2)"
+This file defines
+a system-wide limit specifying the maximum number of bytes in
+a single message written on a System V message queue.
+.TP
+.IR /proc/sys/kernel/msgmni " (since Linux 2.4)"
+This file defines the system-wide limit on the number of
+message queue identifiers.
+See also
+.IR /proc/sys/kernel/auto_msgmni .
+.TP
+.IR /proc/sys/kernel/msgmnb " (since Linux 2.2)"
+This file defines a system-wide parameter used to initialize the
+.I msg_qbytes
+setting for subsequently created message queues.
+The
+.I msg_qbytes
+setting specifies the maximum number of bytes that may be written to the
+message queue.
+.TP
+.IR /proc/sys/kernel/ngroups_max " (since Linux 2.6.4)"
+This is a read-only file that displays the upper limit on the
+number of a process's group memberships.
+.TP
+.IR /proc/sys/kernel/ns_last_pid " (since Linux 3.3)"
+See
+.BR pid_namespaces (7).
+.TP
+.IR /proc/sys/kernel/ostype " and " /proc/sys/kernel/osrelease
+These files
+give substrings of
+.IR /proc/version .
+.TP
+.IR /proc/sys/kernel/overflowgid " and " /proc/sys/kernel/overflowuid
+These files duplicate the files
+.I /proc/sys/fs/overflowgid
+and
+.IR /proc/sys/fs/overflowuid .
+.TP
+.I /proc/sys/kernel/panic
+This file gives read/write access to the kernel variable
+.IR panic_timeout .
+If this is zero, the kernel will loop on a panic; if nonzero,
+it indicates that the kernel should autoreboot after this number
+of seconds.
+When you use the
+software watchdog device driver, the recommended setting is 60.
+.TP
+.IR /proc/sys/kernel/panic_on_oops " (since Linux 2.5.68)"
+This file controls the kernel's behavior when an oops
+or BUG is encountered.
+If this file contains 0, then the system
+tries to continue operation.
+If it contains 1, then the system
+delays a few seconds (to give klogd time to record the oops output)
+and then panics.
+If the
+.I /proc/sys/kernel/panic
+file is also nonzero, then the machine will be rebooted.
+.TP
+.IR /proc/sys/kernel/pid_max " (since Linux 2.5.34)"
+This file specifies the value at which PIDs wrap around
+(i.e., the value in this file is one greater than the maximum PID).
+PIDs greater than this value are not allocated;
+thus, the value in this file also acts as a system-wide limit
+on the total number of processes and threads.
+The default value for this file, 32768,
+results in the same range of PIDs as on earlier kernels.
+On 32-bit platforms, 32768 is the maximum value for
+.IR pid_max .
+On 64-bit systems,
+.I pid_max
+can be set to any value up to 2\[ha]22
+.RB ( PID_MAX_LIMIT ,
+approximately 4 million).
+.\" Prior to Linux 2.6.10, pid_max could also be raised above 32768 on 32-bit
+.\" platforms, but this broke /proc/[pid]
+.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=109513010926152&w=2
+.TP
+.IR /proc/sys/kernel/powersave\-nap " (PowerPC only)"
+This file contains a flag.
+If set, Linux-PPC will use the "nap" mode of
+powersaving,
+otherwise the "doze" mode will be used.
+.TP
+.I /proc/sys/kernel/printk
+See
+.BR syslog (2).
+.TP
+.IR /proc/sys/kernel/pty " (since Linux 2.6.4)"
+This directory contains two files relating to the number of UNIX 98
+pseudoterminals (see
+.BR pts (4))
+on the system.
+.TP
+.I /proc/sys/kernel/pty/max
+This file defines the maximum number of pseudoterminals.
+.\" FIXME Document /proc/sys/kernel/pty/reserve
+.\" New in Linux 3.3
+.\" commit e9aba5158a80098447ff207a452a3418ae7ee386
+.TP
+.I /proc/sys/kernel/pty/nr
+This read-only file
+indicates how many pseudoterminals are currently in use.
+.TP
+.I /proc/sys/kernel/random/
+This directory
+contains various parameters controlling the operation of the file
+.IR /dev/random .
+See
+.BR random (4)
+for further information.
+.TP
+.IR /proc/sys/kernel/random/uuid " (since Linux 2.4)"
+Each read from this read-only file returns a randomly generated 128-bit UUID,
+as a string in the standard UUID format.
+.TP
+.IR /proc/sys/kernel/randomize_va_space " (since Linux 2.6.12)"
+.\" Some further details can be found in Documentation/sysctl/kernel.txt
+Select the address space layout randomization (ASLR) policy for the system
+(on architectures that support ASLR).
+Three values are supported for this file:
+.RS
+.TP
+.B 0
+Turn ASLR off.
+This is the default for architectures that don't support ASLR,
+and when the kernel is booted with the
+.I norandmaps
+parameter.
+.TP
+.B 1
+Make the addresses of
+.BR mmap (2)
+allocations, the stack, and the VDSO page randomized.
+Among other things, this means that shared libraries will be
+loaded at randomized addresses.
+The text segment of PIE-linked binaries will also be loaded
+at a randomized address.
+This value is the default if the kernel was configured with
+.BR CONFIG_COMPAT_BRK .
+.TP
+.B 2
+(Since Linux 2.6.25)
+.\" commit c1d171a002942ea2d93b4fbd0c9583c56fce0772
+Also support heap randomization.
+This value is the default if the kernel was not configured with
+.BR CONFIG_COMPAT_BRK .
+.RE
+.TP
+.I /proc/sys/kernel/real\-root\-dev
+This file is documented in the Linux kernel source file
+.I Documentation/admin\-guide/initrd.rst
+.\" commit 9d85025b0418163fae079c9ba8f8445212de8568
+(or
+.I Documentation/initrd.txt
+before Linux 4.10).
+.TP
+.IR /proc/sys/kernel/reboot\-cmd " (Sparc only)"
+This file seems to be a way to give an argument to the SPARC
+ROM/Flash boot loader.
+Maybe to tell it what to do after
+rebooting?
+.TP
+.I /proc/sys/kernel/rtsig\-max
+(Up to and including Linux 2.6.7; see
+.BR setrlimit (2))
+This file can be used to tune the maximum number
+of POSIX real-time (queued) signals that can be outstanding
+in the system.
+.TP
+.I /proc/sys/kernel/rtsig\-nr
+(Up to and including Linux 2.6.7.)
+This file shows the number of POSIX real-time signals currently queued.
+.TP
+.IR /proc/ pid /sched_autogroup_enabled " (since Linux 2.6.38)"
+.\" commit 5091faa449ee0b7d73bc296a93bca9540fc51d0a
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/sched_child_runs_first " (since Linux 2.6.23)"
+If this file contains the value zero, then, after a
+.BR fork (2),
+the parent is first scheduled on the CPU.
+If the file contains a nonzero value,
+then the child is scheduled first on the CPU.
+(Of course, on a multiprocessor system,
+the parent and the child might both immediately be scheduled on a CPU.)
+.TP
+.IR /proc/sys/kernel/sched_rr_timeslice_ms " (since Linux 3.9)"
+See
+.BR sched_rr_get_interval (2).
+.TP
+.IR /proc/sys/kernel/sched_rt_period_us " (since Linux 2.6.25)"
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/sched_rt_runtime_us " (since Linux 2.6.25)"
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/seccomp/ " (since Linux 4.14)"
+.\" commit 8e5f1ad116df6b0de65eac458d5e7c318d1c05af
+This directory provides additional seccomp information and
+configuration.
+See
+.BR seccomp (2)
+for further details.
+.TP
+.IR /proc/sys/kernel/sem " (since Linux 2.4)"
+This file contains 4 numbers defining limits for System V IPC semaphores.
+These fields are, in order:
+.RS
+.TP
+SEMMSL
+The maximum semaphores per semaphore set.
+.TP
+SEMMNS
+A system-wide limit on the number of semaphores in all semaphore sets.
+.TP
+SEMOPM
+The maximum number of operations that may be specified in a
+.BR semop (2)
+call.
+.TP
+SEMMNI
+A system-wide limit on the maximum number of semaphore identifiers.
+.RE
+.TP
+.I /proc/sys/kernel/sg\-big\-buff
+This file
+shows the size of the generic SCSI device (sg) buffer.
+You can't tune it just yet, but you could change it at
+compile time by editing
+.I include/scsi/sg.h
+and changing
+the value of
+.BR SG_BIG_BUFF .
+However, there shouldn't be any reason to change this value.
+.TP
+.IR /proc/sys/kernel/shm_rmid_forced " (since Linux 3.1)"
+.\" commit b34a6b1da371ed8af1221459a18c67970f7e3d53
+.\" See also Documentation/sysctl/kernel.txt
+If this file is set to 1, all System V shared memory segments will
+be marked for destruction as soon as the number of attached processes
+falls to zero;
+in other words, it is no longer possible to create shared memory segments
+that exist independently of any attached process.
+.IP
+The effect is as though a
+.BR shmctl (2)
+.B IPC_RMID
+is performed on all existing segments as well as all segments
+created in the future (until this file is reset to 0).
+Note that existing segments that are attached to no process will be
+immediately destroyed when this file is set to 1.
+Setting this option will also destroy segments that were created,
+but never attached,
+upon termination of the process that created the segment with
+.BR shmget (2).
+.IP
+Setting this file to 1 provides a way of ensuring that
+all System V shared memory segments are counted against the
+resource usage and resource limits (see the description of
+.B RLIMIT_AS
+in
+.BR getrlimit (2))
+of at least one process.
+.IP
+Because setting this file to 1 produces behavior that is nonstandard
+and could also break existing applications,
+the default value in this file is 0.
+Set this file to 1 only if you have a good understanding
+of the semantics of the applications using
+System V shared memory on your system.
+.TP
+.IR /proc/sys/kernel/shmall " (since Linux 2.2)"
+This file
+contains the system-wide limit on the total number of pages of
+System V shared memory.
+.TP
+.IR /proc/sys/kernel/shmmax " (since Linux 2.2)"
+This file
+can be used to query and set the run-time limit
+on the maximum (System V IPC) shared memory segment size that can be
+created.
+Shared memory segments up to 1 GB are now supported in the
+kernel.
+This value defaults to
+.BR SHMMAX .
+.TP
+.IR /proc/sys/kernel/shmmni " (since Linux 2.4)"
+This file
+specifies the system-wide maximum number of System V shared memory
+segments that can be created.
+.TP
+.IR /proc/sys/kernel/sysctl_writes_strict " (since Linux 3.16)"
+.\" commit f88083005ab319abba5d0b2e4e997558245493c8
+.\" commit 2ca9bb456ada8bcbdc8f77f8fc78207653bbaa92
+.\" commit f4aacea2f5d1a5f7e3154e967d70cf3f711bcd61
+.\" commit 24fe831c17ab8149413874f2fd4e5c8a41fcd294
+The value in this file determines how the file offset affects
+the behavior of updating entries in files under
+.IR /proc/sys .
+The file has three possible values:
+.RS
+.TP 4
+\-1
+This provides legacy handling, with no printk warnings.
+Each
+.BR write (2)
+must fully contain the value to be written,
+and multiple writes on the same file descriptor
+will overwrite the entire value, regardless of the file position.
+.TP
+0
+(default) This provides the same behavior as for \-1,
+but printk warnings are written for processes that
+perform writes when the file offset is not 0.
+.TP
+1
+Respect the file offset when writing strings into
+.I /proc/sys
+files.
+Multiple writes will
+.I append
+to the value buffer.
+Anything written beyond the maximum length
+of the value buffer will be ignored.
+Writes to numeric
+.I /proc/sys
+entries must always be at file offset 0 and the value must be
+fully contained in the buffer provided to
+.BR write (2).
+.\" FIXME .
+.\" With /proc/sys/kernel/sysctl_writes_strict==1, writes at an
+.\" offset other than 0 do not generate an error. Instead, the
+.\" write() succeeds, but the file is left unmodified.
+.\" This is surprising. The behavior may change in the future.
+.\" See thread.gmane.org/gmane.linux.man/9197
+.\" From: Michael Kerrisk (man-pages <mtk.manpages@...>
+.\" Subject: sysctl_writes_strict documentation + an oddity?
+.\" Newsgroups: gmane.linux.man, gmane.linux.kernel
+.\" Date: 2015-05-09 08:54:11 GMT
+.RE
+.TP
+.I /proc/sys/kernel/sysrq
+This file controls the functions allowed to be invoked by the SysRq key.
+By default,
+the file contains 1 meaning that every possible SysRq request is allowed
+(in older kernel versions, SysRq was disabled by default,
+and you were required to specifically enable it at run-time,
+but this is not the case any more).
+Possible values in this file are:
+.RS
+.TP 5
+0
+Disable sysrq completely
+.TP
+1
+Enable all functions of sysrq
+.TP
+> 1
+Bit mask of allowed sysrq functions, as follows:
+.PD 0
+.RS
+.TP 5
+\ \ 2
+Enable control of console logging level
+.TP
+\ \ 4
+Enable control of keyboard (SAK, unraw)
+.TP
+\ \ 8
+Enable debugging dumps of processes etc.
+.TP
+\ 16
+Enable sync command
+.TP
+\ 32
+Enable remount read-only
+.TP
+\ 64
+Enable signaling of processes (term, kill, oom-kill)
+.TP
+128
+Allow reboot/poweroff
+.TP
+256
+Allow nicing of all real-time tasks
+.RE
+.PD
+.RE
+.IP
+This file is present only if the
+.B CONFIG_MAGIC_SYSRQ
+kernel configuration option is enabled.
+For further details see the Linux kernel source file
+.I Documentation/admin\-guide/sysrq.rst
+.\" commit 9d85025b0418163fae079c9ba8f8445212de8568
+(or
+.I Documentation/sysrq.txt
+before Linux 4.10).
+.TP
+.I /proc/sys/kernel/version
+This file contains a string such as:
+.IP
+.in +4n
+.EX
+#5 Wed Feb 25 21:49:24 MET 1998
+.EE
+.in
+.IP
+The "#5" means that
+this is the fifth kernel built from this source base and the
+date following it indicates the time the kernel was built.
+.TP
+.IR /proc/sys/kernel/threads\-max " (since Linux 2.3.11)"
+.\" The following is based on Documentation/sysctl/kernel.txt
+This file specifies the system-wide limit on the number of
+threads (tasks) that can be created on the system.
+.IP
+Since Linux 4.1,
+.\" commit 230633d109e35b0a24277498e773edeb79b4a331
+the value that can be written to
+.I threads\-max
+is bounded.
+The minimum value that can be written is 20.
+The maximum value that can be written is given by the
+constant
+.B FUTEX_TID_MASK
+(0x3fffffff).
+If a value outside of this range is written to
+.IR threads\-max ,
+the error
+.B EINVAL
+occurs.
+.IP
+The value written is checked against the available RAM pages.
+If the thread structures would occupy too much (more than 1/8th)
+of the available RAM pages,
+.I threads\-max
+is reduced accordingly.
+.TP
+.IR /proc/sys/kernel/yama/ptrace_scope " (since Linux 3.5)"
+See
+.BR ptrace (2).
+.TP
+.IR /proc/sys/kernel/zero\-paged " (PowerPC only)"
+This file
+contains a flag.
+When enabled (nonzero), Linux-PPC will pre-zero pages in
+the idle loop, possibly speeding up get_free_pages.
+.TP
+.I /proc/sys/net
+This directory contains networking stuff.
+Explanations for some of the files under this directory can be found in
+.BR tcp (7)
+and
+.BR ip (7).
+.TP
+.I /proc/sys/net/core/bpf_jit_enable
+See
+.BR bpf (2).
+.TP
+.I /proc/sys/net/core/somaxconn
+This file defines a ceiling value for the
+.I backlog
+argument of
+.BR listen (2);
+see the
+.BR listen (2)
+manual page for details.
+.TP
+.I /proc/sys/proc
+This directory may be empty.
+.TP
+.I /proc/sys/sunrpc
+This directory supports Sun remote procedure call for network filesystem
+(NFS).
+On some systems, it is not present.
+.TP
+.IR /proc/sys/user " (since Linux 4.9)"
+See
+.BR namespaces (7).
+.TP
+.I /proc/sys/vm/
+This directory contains files for memory management tuning, buffer, and
+cache management.
+.TP
+.IR /proc/sys/vm/admin_reserve_kbytes " (since Linux 3.10)"
+.\" commit 4eeab4f5580d11bffedc697684b91b0bca0d5009
+This file defines the amount of free memory (in KiB) on the system that
+should be reserved for users with the capability
+.BR CAP_SYS_ADMIN .
+.IP
+The default value in this file is the minimum of [3% of free pages, 8MiB]
+expressed as KiB.
+The default is intended to provide enough for the superuser
+to log in and kill a process, if necessary,
+under the default overcommit 'guess' mode (i.e., 0 in
+.IR /proc/sys/vm/overcommit_memory ).
+.IP
+Systems running in "overcommit never" mode (i.e., 2 in
+.IR /proc/sys/vm/overcommit_memory )
+should increase the value in this file to account
+for the full virtual memory size of the programs used to recover (e.g.,
+.BR login (1)
+.BR ssh (1),
+and
+.BR top (1))
+Otherwise, the superuser may not be able to log in to recover the system.
+For example, on x86-64 a suitable value is 131072 (128MiB reserved).
+.IP
+Changing the value in this file takes effect whenever
+an application requests memory.
+.TP
+.IR /proc/sys/vm/compact_memory " (since Linux 2.6.35)"
+When 1 is written to this file, all zones are compacted such that free
+memory is available in contiguous blocks where possible.
+The effect of this action can be seen by examining
+.IR /proc/buddyinfo .
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_COMPACTION .
+.TP
+.IR /proc/sys/vm/drop_caches " (since Linux 2.6.16)"
+Writing to this file causes the kernel to drop clean caches, dentries, and
+inodes from memory, causing that memory to become free.
+This can be useful for memory management testing and
+performing reproducible filesystem benchmarks.
+Because writing to this file causes the benefits of caching to be lost,
+it can degrade overall system performance.
+.IP
+To free pagecache, use:
+.IP
+.in +4n
+.EX
+echo 1 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+To free dentries and inodes, use:
+.IP
+.in +4n
+.EX
+echo 2 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+To free pagecache, dentries, and inodes, use:
+.IP
+.in +4n
+.EX
+echo 3 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+Because writing to this file is a nondestructive operation and dirty objects
+are not freeable, the
+user should run
+.BR sync (1)
+first.
+.TP
+.IR /proc/sys/vm/sysctl_hugetlb_shm_group " (since Linux 2.6.7)"
+This writable file contains a group ID that is allowed
+to allocate memory using huge pages.
+If a process has a filesystem group ID or any supplementary group ID that
+matches this group ID,
+then it can make huge-page allocations without holding the
+.B CAP_IPC_LOCK
+capability; see
+.BR memfd_create (2),
+.BR mmap (2),
+and
+.BR shmget (2).
+.TP
+.IR /proc/sys/vm/legacy_va_layout " (since Linux 2.6.9)"
+.\" The following is from Documentation/filesystems/proc.txt
+If nonzero, this disables the new 32-bit memory-mapping layout;
+the kernel will use the legacy (2.4) layout for all processes.
+.TP
+.IR /proc/sys/vm/memory_failure_early_kill " (since Linux 2.6.32)"
+.\" The following is based on the text in Documentation/sysctl/vm.txt
+Control how to kill processes when an uncorrected memory error
+(typically a 2-bit error in a memory module)
+that cannot be handled by the kernel
+is detected in the background by hardware.
+In some cases (like the page still having a valid copy on disk),
+the kernel will handle the failure
+transparently without affecting any applications.
+But if there is no other up-to-date copy of the data,
+it will kill processes to prevent any data corruptions from propagating.
+.IP
+The file has one of the following values:
+.RS
+.TP
+.B 1
+Kill all processes that have the corrupted-and-not-reloadable page mapped
+as soon as the corruption is detected.
+Note that this is not supported for a few types of pages,
+such as kernel internally
+allocated data or the swap cache, but works for the majority of user pages.
+.TP
+.B 0
+Unmap the corrupted page from all processes and kill a process
+only if it tries to access the page.
+.RE
+.IP
+The kill is performed using a
+.B SIGBUS
+signal with
+.I si_code
+set to
+.BR BUS_MCEERR_AO .
+Processes can handle this if they want to; see
+.BR sigaction (2)
+for more details.
+.IP
+This feature is active only on architectures/platforms with advanced machine
+check handling and depends on the hardware capabilities.
+.IP
+Applications can override the
+.I memory_failure_early_kill
+setting individually with the
+.BR prctl (2)
+.B PR_MCE_KILL
+operation.
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_MEMORY_FAILURE .
+.TP
+.IR /proc/sys/vm/memory_failure_recovery " (since Linux 2.6.32)"
+.\" The following is based on the text in Documentation/sysctl/vm.txt
+Enable memory failure recovery (when supported by the platform).
+.RS
+.TP
+.B 1
+Attempt recovery.
+.TP
+.B 0
+Always panic on a memory failure.
+.RE
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_MEMORY_FAILURE .
+.TP
+.IR /proc/sys/vm/oom_dump_tasks " (since Linux 2.6.25)"
+.\" The following is from Documentation/sysctl/vm.txt
+Enables a system-wide task dump (excluding kernel threads) to be
+produced when the kernel performs an OOM-killing.
+The dump includes the following information
+for each task (thread, process):
+thread ID, real user ID, thread group ID (process ID),
+virtual memory size, resident set size,
+the CPU that the task is scheduled on,
+oom_adj score (see the description of
+.IR /proc/ pid /oom_adj ),
+and command name.
+This is helpful to determine why the OOM-killer was invoked
+and to identify the rogue task that caused it.
+.IP
+If this contains the value zero, this information is suppressed.
+On very large systems with thousands of tasks,
+it may not be feasible to dump the memory state information for each one.
+Such systems should not be forced to incur a performance penalty in
+OOM situations when the information may not be desired.
+.IP
+If this is set to nonzero, this information is shown whenever the
+OOM-killer actually kills a memory-hogging task.
+.IP
+The default value is 0.
+.TP
+.IR /proc/sys/vm/oom_kill_allocating_task " (since Linux 2.6.24)"
+.\" The following is from Documentation/sysctl/vm.txt
+This enables or disables killing the OOM-triggering task in
+out-of-memory situations.
+.IP
+If this is set to zero, the OOM-killer will scan through the entire
+tasklist and select a task based on heuristics to kill.
+This normally selects a rogue memory-hogging task that
+frees up a large amount of memory when killed.
+.IP
+If this is set to nonzero, the OOM-killer simply kills the task that
+triggered the out-of-memory condition.
+This avoids a possibly expensive tasklist scan.
+.IP
+If
+.I /proc/sys/vm/panic_on_oom
+is nonzero, it takes precedence over whatever value is used in
+.IR /proc/sys/vm/oom_kill_allocating_task .
+.IP
+The default value is 0.
+.TP
+.IR /proc/sys/vm/overcommit_kbytes " (since Linux 3.14)"
+.\" commit 49f0ce5f92321cdcf741e35f385669a421013cb7
+This writable file provides an alternative to
+.I /proc/sys/vm/overcommit_ratio
+for controlling the
+.I CommitLimit
+when
+.I /proc/sys/vm/overcommit_memory
+has the value 2.
+It allows the amount of memory overcommitting to be specified as
+an absolute value (in kB),
+rather than as a percentage, as is done with
+.IR overcommit_ratio .
+This allows for finer-grained control of
+.I CommitLimit
+on systems with extremely large memory sizes.
+.IP
+Only one of
+.I overcommit_kbytes
+or
+.I overcommit_ratio
+can have an effect:
+if
+.I overcommit_kbytes
+has a nonzero value, then it is used to calculate
+.IR CommitLimit ,
+otherwise
+.I overcommit_ratio
+is used.
+Writing a value to either of these files causes the
+value in the other file to be set to zero.
+.TP
+.I /proc/sys/vm/overcommit_memory
+This file contains the kernel virtual memory accounting mode.
+Values are:
+.RS
+.IP
+0: heuristic overcommit (this is the default)
+.br
+1: always overcommit, never check
+.br
+2: always check, never overcommit
+.RE
+.IP
+In mode 0, calls of
+.BR mmap (2)
+with
+.B MAP_NORESERVE
+are not checked, and the default check is very weak,
+leading to the risk of getting a process "OOM-killed".
+.IP
+In mode 1, the kernel pretends there is always enough memory,
+until memory actually runs out.
+One use case for this mode is scientific computing applications
+that employ large sparse arrays.
+Before Linux 2.6.0, any nonzero value implies mode 1.
+.IP
+In mode 2 (available since Linux 2.6), the total virtual address space
+that can be allocated
+.RI ( CommitLimit
+in
+.IR /proc/meminfo )
+is calculated as
+.IP
+.in +4n
+.EX
+CommitLimit = (total_RAM \- total_huge_TLB) *
+ overcommit_ratio / 100 + total_swap
+.EE
+.in
+.IP
+where:
+.RS
+.IP \[bu] 3
+.I total_RAM
+is the total amount of RAM on the system;
+.IP \[bu]
+.I total_huge_TLB
+is the amount of memory set aside for huge pages;
+.IP \[bu]
+.I overcommit_ratio
+is the value in
+.IR /proc/sys/vm/overcommit_ratio ;
+and
+.IP \[bu]
+.I total_swap
+is the amount of swap space.
+.RE
+.IP
+For example, on a system with 16 GB of physical RAM, 16 GB
+of swap, no space dedicated to huge pages, and an
+.I overcommit_ratio
+of 50, this formula yields a
+.I CommitLimit
+of 24 GB.
+.IP
+Since Linux 3.14, if the value in
+.I /proc/sys/vm/overcommit_kbytes
+is nonzero, then
+.I CommitLimit
+is instead calculated as:
+.IP
+.in +4n
+.EX
+CommitLimit = overcommit_kbytes + total_swap
+.EE
+.in
+.IP
+See also the description of
+.I /proc/sys/vm/admin_reserve_kbytes
+and
+.IR /proc/sys/vm/user_reserve_kbytes .
+.TP
+.IR /proc/sys/vm/overcommit_ratio " (since Linux 2.6.0)"
+This writable file defines a percentage by which memory
+can be overcommitted.
+The default value in the file is 50.
+See the description of
+.IR /proc/sys/vm/overcommit_memory .
+.TP
+.IR /proc/sys/vm/panic_on_oom " (since Linux 2.6.18)"
+.\" The following is adapted from Documentation/sysctl/vm.txt
+This enables or disables a kernel panic in
+an out-of-memory situation.
+.IP
+If this file is set to the value 0,
+the kernel's OOM-killer will kill some rogue process.
+Usually, the OOM-killer is able to kill a rogue process and the
+system will survive.
+.IP
+If this file is set to the value 1,
+then the kernel normally panics when out-of-memory happens.
+However, if a process limits allocations to certain nodes
+using memory policies
+.RB ( mbind (2)
+.BR MPOL_BIND )
+or cpusets
+.RB ( cpuset (7))
+and those nodes reach memory exhaustion status,
+one process may be killed by the OOM-killer.
+No panic occurs in this case:
+because other nodes' memory may be free,
+this means the system as a whole may not have reached
+an out-of-memory situation yet.
+.IP
+If this file is set to the value 2,
+the kernel always panics when an out-of-memory condition occurs.
+.IP
+The default value is 0.
+1 and 2 are for failover of clustering.
+Select either according to your policy of failover.
+.TP
+.I /proc/sys/vm/swappiness
+.\" The following is from Documentation/sysctl/vm.txt
+The value in this file controls how aggressively the kernel will swap
+memory pages.
+Higher values increase aggressiveness, lower values
+decrease aggressiveness.
+The default value is 60.
+.TP
+.IR /proc/sys/vm/user_reserve_kbytes " (since Linux 3.10)"
+.\" commit c9b1d0981fcce3d9976d7b7a56e4e0503bc610dd
+Specifies an amount of memory (in KiB) to reserve for user processes.
+This is intended to prevent a user from starting a single memory hogging
+process, such that they cannot recover (kill the hog).
+The value in this file has an effect only when
+.I /proc/sys/vm/overcommit_memory
+is set to 2 ("overcommit never" mode).
+In this case, the system reserves an amount of memory that is the minimum
+of [3% of current process size,
+.IR user_reserve_kbytes ].
+.IP
+The default value in this file is the minimum of [3% of free pages, 128MiB]
+expressed as KiB.
+.IP
+If the value in this file is set to zero,
+then a user will be allowed to allocate all free memory with a single process
+(minus the amount reserved by
+.IR /proc/sys/vm/admin_reserve_kbytes ).
+Any subsequent attempts to execute a command will result in
+"fork: Cannot allocate memory".
+.IP
+Changing the value in this file takes effect whenever
+an application requests memory.
+.TP
+.IR /proc/sys/vm/unprivileged_userfaultfd " (since Linux 5.2)"
+.\" cefdca0a86be517bc390fc4541e3674b8e7803b0
+This (writable) file exposes a flag that controls whether
+unprivileged processes are allowed to employ
+.BR userfaultfd (2).
+If this file has the value 1, then unprivileged processes may use
+.BR userfaultfd (2).
+If this file has the value 0, then only processes that have the
+.B CAP_SYS_PTRACE
+capability may employ
+.BR userfaultfd (2).
+The default value in this file is 1.
+.SH SEE ALSO
+.BR proc (5)