1 files changed, 1623 insertions, 0 deletions
diff --git a/man5/proc_sys.5 b/man5/proc_sys.5
new file mode 100644
index 0000000000..78f0c192c2
--- /dev/null
+++ b/man5/proc_sys.5
@@ -0,0 +1,1623 @@
+'\" t
+.\" Copyright (C) 1994, 1995, Daniel Quinlan <quinlan@yggdrasil.com>
+.\" Copyright (C) 2002-2008, 2017, Michael Kerrisk <mtk.manpages@gmail.com>
+.\" Copyright (C) , Andries Brouwer <aeb@cwi.nl>
+.\" Copyright (C) 2023, Alejandro Colomar <alx@kernel.org>
+.\"
+.\" SPDX-License-Identifier: GPL-3.0-or-later
+.\"
+.TH proc_sys 5 (date) "Linux man-pages (unreleased)"
+.SH NAME
+/proc/sys/ \- system information, and sysctl pseudo-filesystem
+.SH DESCRIPTION
+.TP
+.I /proc/sys/
+This directory (present since Linux 1.3.57) contains a number of files
+and subdirectories corresponding to kernel variables.
+These variables can be read and in some cases modified using
+the \fI/proc\fP filesystem, and the (deprecated)
+.BR sysctl (2)
+system call.
+.IP
+String values may be terminated by either \[aq]\e0\[aq] or \[aq]\en\[aq].
+.IP
+Integer and long values may be written either in decimal or in
+hexadecimal notation (e.g., 0x3FFF).
+When writing multiple integer or long values, these may be separated
+by any of the following whitespace characters:
+\[aq]\ \[aq], \[aq]\et\[aq], or \[aq]\en\[aq].
+Using other separators leads to the error
+.BR EINVAL .
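+.IP
+As a minimal sketch of the rules above,
+a value can be written in hexadecimal notation,
+and several integer values can be written at once, separated by spaces.
+The values shown are illustrative assumptions, not recommendations
+(0x10000 is 65536 in decimal):
+.IP
+.in +4n
+.EX
+echo 0x10000 > /proc/sys/fs/aio\-max\-nr
+echo "250 32000 32 128" > /proc/sys/kernel/sem
+.EE
+.in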
+.TP
+.IR /proc/sys/abi/ " (since Linux 2.4.10)"
+This directory may contain files with application binary information.
+.\" On some systems, it is not present.
+See the Linux kernel source file
+.I Documentation/sysctl/abi.rst
+(or
+.I Documentation/sysctl/abi.txt
+before Linux 5.3)
+for more information.
+.TP
+.I /proc/sys/debug/
+This directory may be empty.
+.TP
+.I /proc/sys/dev/
+This directory contains device-specific information (e.g.,
+.IR dev/cdrom/info ).
+On some systems, it may be empty.
+.TP
+.I /proc/sys/fs/
+This directory contains the files and subdirectories for kernel variables
+related to filesystems.
+.TP
+.IR /proc/sys/fs/aio\-max\-nr " and " /proc/sys/fs/aio\-nr " (since Linux 2.6.4)"
+.I aio\-nr
+is the running total of the number of events specified by
+.BR io_setup (2)
+calls for all currently active AIO contexts.
+If
+.I aio\-nr
+reaches
+.IR aio\-max\-nr ,
+then
+.BR io_setup (2)
+will fail with the error
+.BR EAGAIN .
+Raising
+.I aio\-max\-nr
+does not result in the preallocation or resizing
+of any kernel data structures.
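+.IP
+As a brief illustration, the current usage can be compared with the limit
+by reading both files
+(the output depends on the running system and is not shown):
+.IP
+.in +4n
+.EX
+cat /proc/sys/fs/aio\-nr /proc/sys/fs/aio\-max\-nr
+.EE
+.in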
+.TP
+.I /proc/sys/fs/binfmt_misc
+Documentation for files in this directory can be found
+in the Linux kernel source in the file
+.I Documentation/admin\-guide/binfmt\-misc.rst
+(or in
+.I Documentation/binfmt_misc.txt
+on older kernels).
+.TP
+.IR /proc/sys/fs/dentry\-state " (since Linux 2.2)"
+This file contains information about the status of the
+directory cache (dcache).
+The file contains six numbers,
+.IR nr_dentry ,
+.IR nr_unused ,
+.I age_limit
+(age in seconds),
+.I want_pages
+(pages requested by system) and two dummy values.
+.RS
+.IP \[bu] 3
+.I nr_dentry
+is the number of allocated dentries (dcache entries).
+This field is unused in Linux 2.2.
+.IP \[bu]
+.I nr_unused
+is the number of unused dentries.
+.IP \[bu]
+.I age_limit
+.\" looks like this is unused in Linux 2.2 to Linux 2.6
+is the age in seconds after which dcache entries
+can be reclaimed when memory is short.
+.IP \[bu]
+.I want_pages
+.\" looks like this is unused in Linux 2.2 to Linux 2.6
+is nonzero when the kernel has called shrink_dcache_pages() and the
+dcache isn't pruned yet.
+.RE
+.TP
+.I /proc/sys/fs/dir\-notify\-enable
+This file can be used to disable or enable the
+.I dnotify
+interface described in
+.BR fcntl (2)
+on a system-wide basis.
+A value of 0 in this file disables the interface,
+and a value of 1 enables it.
+.TP
+.I /proc/sys/fs/dquot\-max
+This file shows the maximum number of cached disk quota entries.
+On some (Linux 2.4) systems, it is not present.
+If the number of free cached disk quota entries is very low and
+you have a very large number of simultaneous system users,
+you might want to raise the limit.
+.TP
+.I /proc/sys/fs/dquot\-nr
+This file shows the number of allocated disk quota
+entries and the number of free disk quota entries.
+.TP
+.IR /proc/sys/fs/epoll/ " (since Linux 2.6.28)"
+This directory contains the file
+.IR max_user_watches ,
+which can be used to limit the amount of kernel memory consumed by the
+.I epoll
+interface.
+For further details, see
+.BR epoll (7).
+.TP
+.I /proc/sys/fs/file\-max
+This file defines
+a system-wide limit on the number of open files for all processes.
+System calls that fail when encountering this limit fail with the error
+.BR ENFILE .
+(See also
+.BR setrlimit (2),
+which can be used by a process to set the per-process limit,
+.BR RLIMIT_NOFILE ,
+on the number of files it may open.)
+If the kernel log contains many error messages
+about running out of file handles (open file descriptions),
+such as "VFS: file\-max limit <number> reached",
+consider increasing this value:
+.IP
+.in +4n
+.EX
+echo 100000 > /proc/sys/fs/file\-max
+.EE
+.in
+.IP
+Privileged processes
+.RB ( CAP_SYS_ADMIN )
+can override the
+.I file\-max
+limit.
+.TP
+.I /proc/sys/fs/file\-nr
+This (read-only) file contains three numbers:
+the number of allocated file handles
+(i.e., the number of open file descriptions; see
+.BR open (2));
+the number of free file handles;
+and the maximum number of file handles (i.e., the same value as
+.IR /proc/sys/fs/file\-max ).
+If the number of allocated file handles is close to the
+maximum, you should consider increasing the maximum.
+Before Linux 2.6,
+the kernel allocated file handles dynamically,
+but it didn't free them again.
+Instead the free file handles were kept in a list for reallocation;
+the "free file handles" value indicates the size of that list.
+A large number of free file handles indicates that there was
+a past peak in the usage of open file handles.
+Since Linux 2.6, the kernel does deallocate freed file handles,
+and the "free file handles" value is always zero.
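+.IP
+A minimal shell sketch for reading the three fields described above;
+the variable names
+.IR allocated ,
+.IR free ,
+and
+.I max
+are chosen here for illustration only:
+.IP
+.in +4n
+.EX
+read allocated free max < /proc/sys/fs/file\-nr
+echo "allocated=$allocated free=$free max=$max"
+.EE
+.in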
+.TP
+.IR /proc/sys/fs/inode\-max " (only present until Linux 2.2)"
+This file contains the maximum number of in-memory inodes.
+This value should be 3\[en]4 times larger
+than the value in
+.IR file\-max ,
+since \fIstdin\fP, \fIstdout\fP
+and network sockets also need an inode to handle them.
+When you regularly run out of inodes, you need to increase this value.
+.IP
+Starting with Linux 2.4,
+there is no longer a static limit on the number of inodes,
+and this file was removed.
+.TP
+.I /proc/sys/fs/inode\-nr
+This file contains the first two values from
+.IR inode\-state .
+.TP
+.I /proc/sys/fs/inode\-state
+This file
+contains seven numbers:
+.IR nr_inodes ,
+.IR nr_free_inodes ,
+.IR preshrink ,
+and four dummy values (always zero).
+.IP
+.I nr_inodes
+is the number of inodes the system has allocated.
+.\" This can be slightly more than
+.\" .I inode\-max
+.\" because Linux allocates them one page full at a time.
+.I nr_free_inodes
+represents the number of free inodes.
+.IP
+.I preshrink
+is nonzero when
+.I nr_inodes
+>
+.I inode\-max
+and the system needs to prune the inode list instead of allocating more;
+since Linux 2.4, this field is a dummy value (always zero).
+.TP
+.IR /proc/sys/fs/inotify/ " (since Linux 2.6.13)"
+This directory contains files
+.IR max_queued_events ", " max_user_instances ", and " max_user_watches ,
+that can be used to limit the amount of kernel memory consumed by the
+.I inotify
+interface.
+For further details, see
+.BR inotify (7).
+.TP
+.I /proc/sys/fs/lease\-break\-time
+This file specifies the grace period that the kernel grants to a process
+holding a file lease
+.RB ( fcntl (2))
+after it has sent a signal to that process notifying it
+that another process is waiting to open the file.
+If the lease holder does not remove or downgrade the lease within
+this grace period, the kernel forcibly breaks the lease.
+.TP
+.I /proc/sys/fs/leases\-enable
+This file can be used to enable or disable file leases
+.RB ( fcntl (2))
+on a system-wide basis.
+If this file contains the value 0, leases are disabled.
+A nonzero value enables leases.
+.TP
+.IR /proc/sys/fs/mount\-max " (since Linux 4.9)"
+.\" commit d29216842a85c7970c536108e093963f02714498
+The value in this file specifies the maximum number of mounts that may exist
+in a mount namespace.
+The default value in this file is 100,000.
+.TP
+.IR /proc/sys/fs/mqueue/ " (since Linux 2.6.6)"
+This directory contains files
+.IR msg_max ", " msgsize_max ", and " queues_max ,
+controlling the resources used by POSIX message queues.
+See
+.BR mq_overview (7)
+for details.
+.TP
+.IR /proc/sys/fs/nr_open " (since Linux 2.6.25)"
+.\" commit 9cfe015aa424b3c003baba3841a60dd9b5ad319b
+This file imposes a ceiling on the value to which the
+.B RLIMIT_NOFILE
+resource limit can be raised (see
+.BR getrlimit (2)).
+This ceiling is enforced for both unprivileged and privileged processes.
+The default value in this file is 1048576.
+(Before Linux 2.6.25, the ceiling for
+.B RLIMIT_NOFILE
+was hard-coded to the same value.)
+.TP
+.IR /proc/sys/fs/overflowgid " and " /proc/sys/fs/overflowuid
+These files
+allow you to change the value of the fixed UID and GID.
+The default is 65534.
+Some filesystems support only 16-bit UIDs and GIDs, although in Linux
+UIDs and GIDs are 32 bits.
+When one of these filesystems is mounted
+with writes enabled, any UID or GID that would exceed 65535 is translated
+to the overflow value before being written to disk.
+.TP
+.IR /proc/sys/fs/pipe\-max\-size " (since Linux 2.6.35)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/pipe\-user\-pages\-hard " (since Linux 4.5)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/pipe\-user\-pages\-soft " (since Linux 4.5)"
+See
+.BR pipe (7).
+.TP
+.IR /proc/sys/fs/protected_fifos " (since Linux 4.19)"
+The value in this file is, or can be set to, one of the following:
+.RS
+.TP 4
+0
+Writing to FIFOs is unrestricted.
+.TP
+1
+Don't allow
+.B O_CREAT
+.BR open (2)
+on FIFOs that the caller doesn't own in world-writable sticky directories,
+unless the FIFO is owned by the owner of the directory.
+.TP
+2
+As for the value 1,
+but the restriction also applies to group-writable sticky directories.
+.RE
+.IP
+The intent of the above protections is to avoid unintentional writes to an
+attacker-controlled FIFO when a program expected to create a regular file.
+.TP
+.IR /proc/sys/fs/protected_hardlinks " (since Linux 3.6)"
+.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
+When the value in this file is 0,
+no restrictions are placed on the creation of hard links
+(i.e., this is the historical behavior before Linux 3.6).
+When the value in this file is 1,
+a hard link can be created to a target file
+only if one of the following conditions is true:
+.RS
+.IP \[bu] 3
+The calling process has the
+.B CAP_FOWNER
+capability in its user namespace
+and the file UID has a mapping in the namespace.
+.IP \[bu]
+The filesystem UID of the process creating the link matches
+the owner (UID) of the target file
+(as described in
+.BR credentials (7),
+a process's filesystem UID is normally the same as its effective UID).
+.IP \[bu]
+All of the following conditions are true:
+.RS 4
+.IP \[bu] 3
+the target is a regular file;
+.IP \[bu]
+the target file does not have its set-user-ID mode bit enabled;
+.IP \[bu]
+the target file does not have both its set-group-ID and
+group-executable mode bits enabled; and
+.IP \[bu]
+the caller has permission to read and write the target file
+(either via the file's permissions mask or because it has
+suitable capabilities).
+.RE
+.RE
+.IP
+The default value in this file is 0.
+Setting the value to 1
+prevents a longstanding class of security issues caused by
+hard-link-based time-of-check, time-of-use races,
+most commonly seen in world-writable directories such as
+.IR /tmp .
+The common method of exploiting this flaw
+is to cross privilege boundaries when following a given hard link
+(i.e., a root process follows a hard link created by another user).
+Additionally, on systems without separated partitions,
+this stops unauthorized users from "pinning" vulnerable set-user-ID and
+set-group-ID files against being upgraded by
+the administrator, or linking to special files.
+.TP
+.IR /proc/sys/fs/protected_regular " (since Linux 4.19)"
+The value in this file is, or can be set to, one of the following:
+.RS
+.TP 4
+0
+Writing to regular files is unrestricted.
+.TP
+1
+Don't allow
+.B O_CREAT
+.BR open (2)
+on regular files that the caller doesn't own in
+world-writable sticky directories,
+unless the regular file is owned by the owner of the directory.
+.TP
+2
+As for the value 1,
+but the restriction also applies to group-writable sticky directories.
+.RE
+.IP
+The intent of the above protections is similar to
+.IR protected_fifos ,
+but allows an application to
+avoid writes to an attacker-controlled regular file,
+where the application expected to create one.
+.TP
+.IR /proc/sys/fs/protected_symlinks " (since Linux 3.6)"
+.\" commit 800179c9b8a1e796e441674776d11cd4c05d61d7
+When the value in this file is 0,
+no restrictions are placed on following symbolic links
+(i.e., this is the historical behavior before Linux 3.6).
+When the value in this file is 1, symbolic links are followed only
+in the following circumstances:
+.RS
+.IP \[bu] 3
+the filesystem UID of the process following the link matches
+the owner (UID) of the symbolic link
+(as described in
+.BR credentials (7),
+a process's filesystem UID is normally the same as its effective UID);
+.IP \[bu]
+the link is not in a sticky world-writable directory; or
+.IP \[bu]
+the symbolic link and its parent directory have the same owner (UID).
+.RE
+.IP
+A system call that fails to follow a symbolic link
+because of the above restrictions returns the error
+.B EACCES
+in
+.IR errno .
+.IP
+The default value in this file is 0.
+Setting the value to 1 avoids a longstanding class of security issues
+based on time-of-check, time-of-use races when accessing symbolic links.
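+.IP
+As a sketch, a privileged user could enable this restriction,
+together with the analogous hard-link restriction in
+.IR protected_hardlinks ,
+as follows:
+.IP
+.in +4n
+.EX
+echo 1 > /proc/sys/fs/protected_symlinks
+echo 1 > /proc/sys/fs/protected_hardlinks
+.EE
+.in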
+.TP
+.IR /proc/sys/fs/suid_dumpable " (since Linux 2.6.13)"
+.\" The following is based on text from Documentation/sysctl/kernel.txt
+The value in this file is assigned to a process's "dumpable" flag
+in the circumstances described in
+.BR prctl (2).
+In effect,
+the value in this file determines whether core dump files are
+produced for set-user-ID or otherwise protected/tainted binaries.
+The "dumpable" setting also affects the ownership of files in a process's
+.IR /proc/ pid
+directory, as described in
+.BR proc (5).
+.IP
+Three different integer values can be specified:
+.RS
+.TP
+\fI0\ (default)\fP
+.\" In kernel source: SUID_DUMP_DISABLE
+This provides the traditional (pre-Linux 2.6.13) behavior.
+A core dump will not be produced for a process which has
+changed credentials (by calling
+.BR seteuid (2),
+.BR setgid (2),
+or similar, or by executing a set-user-ID or set-group-ID program)
+or whose binary does not have read permission enabled.
+.TP
+\fI1\ ("debug")\fP
+.\" In kernel source: SUID_DUMP_USER
+All processes dump core when possible.
+(Reasons why a process might nevertheless not dump core are described in
+.BR core (5).)
+The core dump is owned by the filesystem user ID of the dumping process
+and no security is applied.
+This is intended for system debugging situations only:
+this mode is insecure because it allows unprivileged users to
+examine the memory contents of privileged processes.
+.TP
+\fI2\ ("suidsafe")\fP
+.\" In kernel source: SUID_DUMP_ROOT
+Any binary which normally would not be dumped (see "0" above)
+is dumped readable by root only.
+This allows the user to remove the core dump file but not to read it.
+For security reasons core dumps in this mode will not overwrite one
+another or other files.
+This mode is appropriate when administrators are
+attempting to debug problems in a normal environment.
+.IP
+Additionally, since Linux 3.6,
+.\" 9520628e8ceb69fa9a4aee6b57f22675d9e1b709
+.I /proc/sys/kernel/core_pattern
+must either be an absolute pathname
+or a pipe command, as detailed in
+.BR core (5).
+Warnings will be written to the kernel log if
+.I core_pattern
+does not follow these rules, and no core dump will be produced.
+.\" 54b501992dd2a839e94e76aa392c392b55080ce8
+.RE
+.IP
+For details of the effect of a process's "dumpable" setting
+on ptrace access mode checking, see
+.BR ptrace (2).
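+.IP
+As a minimal sketch, the "suidsafe" mode could be selected as follows.
+The directory
+.I /var/crash
+is a hypothetical location chosen for illustration;
+.I core_pattern
+is set to an absolute pathname first, as required since Linux 3.6:
+.IP
+.in +4n
+.EX
+echo \[aq]/var/crash/core.%e.%p\[aq] > /proc/sys/kernel/core_pattern
+echo 2 > /proc/sys/fs/suid_dumpable
+.EE
+.in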
+.TP
+.I /proc/sys/fs/super\-max
+This file
+controls the maximum number of superblocks, and
+thus the maximum number of mounted filesystems the kernel
+can have.
+You need to increase
+.I super\-max
+only if you need to mount more filesystems than its current value allows.
+.TP
+.I /proc/sys/fs/super\-nr
+This file
+contains the number of filesystems currently mounted.
+.TP
+.I /proc/sys/kernel/
+This directory contains files controlling a range of kernel parameters,
+as described below.
+.TP
+.I /proc/sys/kernel/acct
+This file
+contains three numbers:
+.IR highwater ,
+.IR lowwater ,
+and
+.IR frequency .
+If BSD-style process accounting is enabled, these values control
+its behavior.
+If free space on the filesystem where the log lives falls below
+.I lowwater
+percent, accounting suspends.
+If free space gets above
+.I highwater
+percent, accounting resumes.
+.I frequency
+determines
+how often the kernel checks the amount of free space (value is in
+seconds).
+Default values are 4, 2, and 30.
+That is, suspend accounting if 2% or less space is free; resume it
+if 4% or more space is free; consider information about the amount of free space
+valid for 30 seconds.
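+.IP
+For illustration, the default values described above
+could be restored by writing all three fields at once
+(a sketch; writing to this file requires privilege):
+.IP
+.in +4n
+.EX
+echo "4 2 30" > /proc/sys/kernel/acct
+.EE
+.in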
+.TP
+.IR /proc/sys/kernel/auto_msgmni " (Linux 2.6.27 to Linux 3.18)"
+.\" commit 9eefe520c814f6f62c5d36a2ddcd3fb99dfdb30e (introduces feature)
+.\" commit 0050ee059f7fc86b1df2527aaa14ed5dc72f9973 (rendered redundant)
+From Linux 2.6.27 to Linux 3.18,
+this file was used to control recomputing of the value in
+.I /proc/sys/kernel/msgmni
+upon the addition or removal of memory or upon IPC namespace creation/removal.
+Echoing "1" into this file enabled
+.I msgmni
+automatic recomputing (and triggered a recomputation of
+.I msgmni
+based on the current amount of available memory and number of IPC namespaces).
+Echoing "0" disabled automatic recomputing.
+(Automatic recomputing was also disabled if a value was explicitly assigned to
+.IR /proc/sys/kernel/msgmni .)
+The default value in
+.I auto_msgmni
+was 1.
+.IP
+Since Linux 3.19, the content of this file has no effect (because
+.I msgmni
+.\" FIXME Must document the 3.19 'msgmni' changes.
+defaults to near the maximum value possible),
+and reads from this file always return the value "0".
+.TP
+.IR /proc/sys/kernel/cap_last_cap " (since Linux 3.2)"
+See
+.BR capabilities (7).
+.TP
+.IR /proc/sys/kernel/cap\-bound " (from Linux 2.2 to Linux 2.6.24)"
+This file holds the value of the kernel
+.I "capability bounding set"
+(expressed as a signed decimal number).
+This set is ANDed against the capabilities permitted to a process
+during
+.BR execve (2).
+Starting with Linux 2.6.25,
+the system-wide capability bounding set disappeared,
+and was replaced by a per-thread bounding set; see
+.BR capabilities (7).
+.TP
+.I /proc/sys/kernel/core_pattern
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/core_pipe_limit
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/core_uses_pid
+See
+.BR core (5).
+.TP
+.I /proc/sys/kernel/ctrl\-alt\-del
+This file
+controls the handling of Ctrl-Alt-Del from the keyboard.
+When the value in this file is 0, Ctrl-Alt-Del is trapped and
+sent to the
+.BR init (1)
+program to handle a graceful restart.
+When the value is greater than zero, Linux's reaction to a Vulcan
+Nerve Pinch (tm) will be an immediate reboot, without even
+syncing its dirty buffers.
+Note: when a program (like dosemu) has the keyboard in "raw"
+mode, the Ctrl-Alt-Del is intercepted by the program before it
+ever reaches the kernel tty layer, and it's up to the program
+to decide what to do with it.
+.TP
+.IR /proc/sys/kernel/dmesg_restrict " (since Linux 2.6.37)"
+The value in this file determines who can see kernel syslog contents.
+A value of 0 in this file imposes no restrictions.
+If the value is 1, only privileged users can read the kernel syslog.
+(See
+.BR syslog (2)
+for more details.)
+Since Linux 3.4,
+.\" commit 620f6e8e855d6d447688a5f67a4e176944a084e8
+only users with the
+.B CAP_SYS_ADMIN
+capability may change the value in this file.
+.TP
+.IR /proc/sys/kernel/domainname " and " /proc/sys/kernel/hostname
+These files can be used to set the NIS/YP domainname and the
+hostname of your box in exactly the same way as the commands
+.BR domainname (1)
+and
+.BR hostname (1),
+that is:
+.IP
+.in +4n
+.EX
+.RB "#" " echo \[aq]darkstar\[aq] > /proc/sys/kernel/hostname"
+.RB "#" " echo \[aq]mydomain\[aq] > /proc/sys/kernel/domainname"
+.EE
+.in
+.IP
+has the same effect as
+.IP
+.in +4n
+.EX
+.RB "#" " hostname \[aq]darkstar\[aq]"
+.RB "#" " domainname \[aq]mydomain\[aq]"
+.EE
+.in
+.IP
+Note, however, that the classic darkstar.frop.org has the
+hostname "darkstar" and DNS (Domain Name System)
+domainname "frop.org", not to be confused with the NIS (Network
+Information Service) or YP (Yellow Pages) domainname.
+These two
+domain names are in general different.
+For a detailed discussion
+see the
+.BR hostname (1)
+man page.
+.TP
+.I /proc/sys/kernel/hotplug
+This file
+contains the pathname for the hotplug policy agent.
+The default value in this file is
+.IR /sbin/hotplug .
+.TP
+.\" Removed in commit 87f504e5c78b910b0c1d6ffb89bc95e492322c84 (tglx/history.git)
+.IR /proc/sys/kernel/htab\-reclaim " (before Linux 2.4.9.2)"
+(PowerPC only) If this file is set to a nonzero value,
+the PowerPC htab
+.\" removed in commit 1b483a6a7b2998e9c98ad985d7494b9b725bd228, before Linux 2.6.28
+(see kernel file
+.IR Documentation/powerpc/ppc_htab.txt )
+is pruned
+each time the system hits the idle loop.
+.TP
+.I /proc/sys/kernel/keys/
+This directory contains various files that define parameters and limits
+for the key-management facility.
+These files are described in
+.BR keyrings (7).
+.TP
+.IR /proc/sys/kernel/kptr_restrict " (since Linux 2.6.38)"
+.\" 455cd5ab305c90ffc422dd2e0fb634730942b257
+The value in this file determines whether kernel addresses are exposed via
+.I /proc
+files and other interfaces.
+A value of 0 in this file imposes no restrictions.
+If the value is 1, kernel pointers printed using the
+.I %pK
+format specifier will be replaced with zeros unless the user has the
+.B CAP_SYSLOG
+capability.
+If the value is 2, kernel pointers printed using the
+.I %pK
+format specifier will be replaced with zeros regardless
+of the user's capabilities.
+The initial default value for this file was 1,
+but the default was changed
+.\" commit 411f05f123cbd7f8aa1edcae86970755a6e2a9d9
+to 0 in Linux 2.6.39.
+Since Linux 3.4,
+.\" commit 620f6e8e855d6d447688a5f67a4e176944a084e8
+only users with the
+.B CAP_SYS_ADMIN
+capability can change the value in this file.
+.TP
+.I /proc/sys/kernel/l2cr
+(PowerPC only) This file
+contains a flag that controls the L2 cache of G3 processor
+boards.
+If the value is 0, the cache is disabled;
+if it is nonzero, the cache is enabled.
+.TP
+.I /proc/sys/kernel/modprobe
+This file contains the pathname for the kernel module loader.
+The default value is
+.IR /sbin/modprobe .
+The file is present only if the kernel is built with the
+.B CONFIG_MODULES
+.RB ( CONFIG_KMOD
+in Linux 2.6.26 and earlier)
+option enabled.
+It is described by the Linux kernel source file
+.I Documentation/kmod.txt
+(present only in Linux 2.4 and earlier).
+.TP
+.IR /proc/sys/kernel/modules_disabled " (since Linux 2.6.31)"
+.\" 3d43321b7015387cfebbe26436d0e9d299162ea1
+.\" From Documentation/sysctl/kernel.txt
+A toggle value indicating if modules are allowed to be loaded
+in an otherwise modular kernel.
+This toggle defaults to off (0), but can be set true (1).
+Once true, modules can be neither loaded nor unloaded,
+and the toggle cannot be set back to false.
+The file is present only if the kernel is built with the
+.B CONFIG_MODULES
+option enabled.
+.TP
+.IR /proc/sys/kernel/msgmax " (since Linux 2.2)"
+This file defines
+a system-wide limit specifying the maximum number of bytes in
+a single message written on a System V message queue.
+.TP
+.IR /proc/sys/kernel/msgmni " (since Linux 2.4)"
+This file defines the system-wide limit on the number of
+message queue identifiers.
+See also
+.IR /proc/sys/kernel/auto_msgmni .
+.TP
+.IR /proc/sys/kernel/msgmnb " (since Linux 2.2)"
+This file defines a system-wide parameter used to initialize the
+.I msg_qbytes
+setting for subsequently created message queues.
+The
+.I msg_qbytes
+setting specifies the maximum number of bytes that may be written to the
+message queue.
+.TP
+.IR /proc/sys/kernel/ngroups_max " (since Linux 2.6.4)"
+This is a read-only file that displays the upper limit on the
+number of a process's group memberships.
+.TP
+.IR /proc/sys/kernel/ns_last_pid " (since Linux 3.3)"
+See
+.BR pid_namespaces (7).
+.TP
+.IR /proc/sys/kernel/ostype " and " /proc/sys/kernel/osrelease
+These files
+give substrings of
+.IR /proc/version .
+.TP
+.IR /proc/sys/kernel/overflowgid " and " /proc/sys/kernel/overflowuid
+These files duplicate the files
+.I /proc/sys/fs/overflowgid
+and
+.IR /proc/sys/fs/overflowuid .
+.TP
+.I /proc/sys/kernel/panic
+This file gives read/write access to the kernel variable
+.IR panic_timeout .
+If this is zero, the kernel will loop on a panic; if nonzero,
+it indicates that the kernel should autoreboot after this number
+of seconds.
+When you use the
+software watchdog device driver, the recommended setting is 60.
+.TP
+.IR /proc/sys/kernel/panic_on_oops " (since Linux 2.5.68)"
+This file controls the kernel's behavior when an oops
+or BUG is encountered.
+If this file contains 0, then the system
+tries to continue operation.
+If it contains 1, then the system
+delays a few seconds (to give klogd time to record the oops output)
+and then panics.
+If the
+.I /proc/sys/kernel/panic
+file is also nonzero, then the machine will be rebooted.
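+.IP
+As a sketch of the combination described above,
+the following makes the kernel panic on an oops
+and reboot 60 seconds after the panic
+(the timeout of 60 is an illustrative choice):
+.IP
+.in +4n
+.EX
+echo 1 > /proc/sys/kernel/panic_on_oops
+echo 60 > /proc/sys/kernel/panic
+.EE
+.in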
+.TP
+.IR /proc/sys/kernel/pid_max " (since Linux 2.5.34)"
+This file specifies the value at which PIDs wrap around
+(i.e., the value in this file is one greater than the maximum PID).
+PIDs greater than this value are not allocated;
+thus, the value in this file also acts as a system-wide limit
+on the total number of processes and threads.
+The default value for this file, 32768,
+results in the same range of PIDs as on earlier kernels.
+On 32-bit platforms, 32768 is the maximum value for
+.IR pid_max .
+On 64-bit systems,
+.I pid_max
+can be set to any value up to 2\[ha]22
+.RB ( PID_MAX_LIMIT ,
+approximately 4 million).
+.\" Prior to Linux 2.6.10, pid_max could also be raised above 32768 on 32-bit
+.\" platforms, but this broke /proc/[pid]
+.\" See http://marc.theaimsgroup.com/?l=linux-kernel&m=109513010926152&w=2
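+.IP
+For example, on a 64-bit system, the limit could be raised to
+the maximum mentioned above (2\[ha]22 = 4194304); this is a sketch,
+and whether such a large value is appropriate depends on the system:
+.IP
+.in +4n
+.EX
+echo 4194304 > /proc/sys/kernel/pid_max
+.EE
+.in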
+.TP
+.IR /proc/sys/kernel/powersave\-nap " (PowerPC only)"
+This file contains a flag.
+If set, Linux-PPC will use the "nap" mode of powersaving;
+otherwise, the "doze" mode will be used.
+.TP
+.I /proc/sys/kernel/printk
+See
+.BR syslog (2).
+.TP
+.IR /proc/sys/kernel/pty " (since Linux 2.6.4)"
+This directory contains two files relating to the number of UNIX 98
+pseudoterminals (see
+.BR pts (4))
+on the system.
+.TP
+.I /proc/sys/kernel/pty/max
+This file defines the maximum number of pseudoterminals.
+.\" FIXME Document /proc/sys/kernel/pty/reserve
+.\"     New in Linux 3.3
+.\"     commit e9aba5158a80098447ff207a452a3418ae7ee386
+.TP
+.I /proc/sys/kernel/pty/nr
+This read-only file
+indicates how many pseudoterminals are currently in use.
+.TP
+.I /proc/sys/kernel/random/
+This directory
+contains various parameters controlling the operation of the file
+.IR /dev/random .
+See
+.BR random (4)
+for further information.
+.TP
+.IR /proc/sys/kernel/random/uuid " (since Linux 2.4)"
+Each read from this read-only file returns a randomly generated 128-bit UUID,
+as a string in the standard UUID format.
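+.IP
+A brief illustration; each read is expected to return a different value,
+so no sample output is shown:
+.IP
+.in +4n
+.EX
+cat /proc/sys/kernel/random/uuid
+.EE
+.in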
+.TP
+.IR /proc/sys/kernel/randomize_va_space " (since Linux 2.6.12)"
+.\" Some further details can be found in Documentation/sysctl/kernel.txt
+Select the address space layout randomization (ASLR) policy for the system
+(on architectures that support ASLR).
+Three values are supported for this file:
+.RS
+.TP
+.B 0
+Turn ASLR off.
+This is the default for architectures that don't support ASLR,
+and when the kernel is booted with the
+.I norandmaps
+parameter.
+.TP
+.B 1
+Make the addresses of
+.BR mmap (2)
+allocations, the stack, and the VDSO page randomized.
+Among other things, this means that shared libraries will be
+loaded at randomized addresses.
+The text segment of PIE-linked binaries will also be loaded
+at a randomized address.
+This value is the default if the kernel was configured with
+.BR CONFIG_COMPAT_BRK .
+.TP
+.B 2
+(Since Linux 2.6.25)
+.\" commit c1d171a002942ea2d93b4fbd0c9583c56fce0772
+Also support heap randomization.
+This value is the default if the kernel was not configured with
+.BR CONFIG_COMPAT_BRK .
+.RE
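+.IP
+For illustration, the policy currently in effect can be inspected with
+the following command, which prints one of the values listed above:
+.IP
+.in +4n
+.EX
+cat /proc/sys/kernel/randomize_va_space
+.EE
+.in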
+.TP
+.I /proc/sys/kernel/real\-root\-dev
+This file is documented in the Linux kernel source file
+.I Documentation/admin\-guide/initrd.rst
+.\" commit 9d85025b0418163fae079c9ba8f8445212de8568
+(or
+.I Documentation/initrd.txt
+before Linux 4.10).
+.TP
+.IR /proc/sys/kernel/reboot\-cmd " (SPARC only)"
+This file appears to provide a way to pass an argument to the SPARC
+ROM/Flash boot loader,
+perhaps to tell it what to do after rebooting.
+.TP
+.I /proc/sys/kernel/rtsig\-max
+(Up to and including Linux 2.6.7; see
+.BR setrlimit (2))
+This file can be used to tune the maximum number
+of POSIX real-time (queued) signals that can be outstanding
+in the system.
+.TP
+.I /proc/sys/kernel/rtsig\-nr
+(Up to and including Linux 2.6.7.)
+This file shows the number of POSIX real-time signals currently queued.
+.TP
+.IR /proc/sys/kernel/sched_autogroup_enabled " (since Linux 2.6.38)"
+.\" commit 5091faa449ee0b7d73bc296a93bca9540fc51d0a
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/sched_child_runs_first " (since Linux 2.6.23)"
+If this file contains the value zero, then, after a
+.BR fork (2),
+the parent is first scheduled on the CPU.
+If the file contains a nonzero value,
+then the child is scheduled first on the CPU.
+(Of course, on a multiprocessor system,
+the parent and the child might both immediately be scheduled on a CPU.)
+.TP
+.IR /proc/sys/kernel/sched_rr_timeslice_ms " (since Linux 3.9)"
+See
+.BR sched_rr_get_interval (2).
+.TP
+.IR /proc/sys/kernel/sched_rt_period_us " (since Linux 2.6.25)"
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/sched_rt_runtime_us " (since Linux 2.6.25)"
+See
+.BR sched (7).
+.TP
+.IR /proc/sys/kernel/seccomp/ " (since Linux 4.14)"
+.\" commit 8e5f1ad116df6b0de65eac458d5e7c318d1c05af
+This directory provides additional seccomp information and
+configuration.
+See
+.BR seccomp (2)
+for further details.
+.TP
+.IR /proc/sys/kernel/sem " (since Linux 2.4)"
+This file contains 4 numbers defining limits for System V IPC semaphores.
+These fields are, in order:
+.RS
+.TP
+SEMMSL
+The maximum number of semaphores per semaphore set.
+.TP
+SEMMNS
+A system-wide limit on the number of semaphores in all semaphore sets.
+.TP
+SEMOPM
+The maximum number of operations that may be specified in a
+.BR semop (2)
+call.
+.TP
+SEMMNI
+A system-wide limit on the maximum number of semaphore identifiers.
+.RE
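+.IP
+A minimal shell sketch that reads the four fields in the order given above
+(the lowercase variable names are chosen here for illustration):
+.IP
+.in +4n
+.EX
+read semmsl semmns semopm semmni < /proc/sys/kernel/sem
+echo "SEMMSL=$semmsl SEMMNS=$semmns SEMOPM=$semopm SEMMNI=$semmni"
+.EE
+.in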
+.TP
+.I /proc/sys/kernel/sg\-big\-buff
+This file
+shows the size of the generic SCSI device (sg) buffer.
+You can't tune it just yet, but you could change it at
+compile time by editing
+.I include/scsi/sg.h
+and changing
+the value of
+.BR SG_BIG_BUFF .
+However, there shouldn't be any reason to change this value.
+.TP
+.IR /proc/sys/kernel/shm_rmid_forced " (since Linux 3.1)"
+.\" commit b34a6b1da371ed8af1221459a18c67970f7e3d53
+.\" See also Documentation/sysctl/kernel.txt
+If this file is set to 1, all System V shared memory segments will
+be marked for destruction as soon as the number of attached processes
+falls to zero;
+in other words, it is no longer possible to create shared memory segments
+that exist independently of any attached process.
+.IP
+The effect is as though a
+.BR shmctl (2)
+.B IPC_RMID
+is performed on all existing segments as well as all segments
+created in the future (until this file is reset to 0).
+Note that existing segments that are attached to no process will be
+immediately destroyed when this file is set to 1.
+Setting this option will also destroy segments that were created,
+but never attached,
+upon termination of the process that created the segment with
+.BR shmget (2).
+.IP
+Setting this file to 1 provides a way of ensuring that
+all System V shared memory segments are counted against the
+resource usage and resource limits (see the description of
+.B RLIMIT_AS
+in
+.BR getrlimit (2))
+of at least one process.
+.IP
+Because setting this file to 1 produces behavior that is nonstandard
+and could also break existing applications,
+the default value in this file is 0.
+Set this file to 1 only if you have a good understanding
+of the semantics of the applications using
+System V shared memory on your system.
+.TP
+.IR /proc/sys/kernel/shmall " (since Linux 2.2)"
+This file
+contains the system-wide limit on the total number of pages of
+System V shared memory.
+.TP
+.IR /proc/sys/kernel/shmmax " (since Linux 2.2)"
+This file
+can be used to query and set the run-time limit
+on the maximum (System V IPC) shared memory segment size that can be
+created.
+Shared memory segments up to 1 GB are now supported in the
+kernel.
+This value defaults to
+.BR SHMMAX .
+.TP
+.IR /proc/sys/kernel/shmmni " (since Linux 2.4)"
+This file
+specifies the system-wide maximum number of System V shared memory
+segments that can be created.
+.TP
+.IR /proc/sys/kernel/sysctl_writes_strict " (since Linux 3.16)"
+.\" commit f88083005ab319abba5d0b2e4e997558245493c8
+.\" commit 2ca9bb456ada8bcbdc8f77f8fc78207653bbaa92
+.\" commit f4aacea2f5d1a5f7e3154e967d70cf3f711bcd61
+.\" commit 24fe831c17ab8149413874f2fd4e5c8a41fcd294
+The value in this file determines how the file offset affects
+the behavior of updating entries in files under
+.IR /proc/sys .
+The file has three possible values:
+.RS
+.TP 4
+\-1
+This provides legacy handling, with no printk warnings.
+Each
+.BR write (2)
+must fully contain the value to be written,
+and multiple writes on the same file descriptor
+will overwrite the entire value, regardless of the file position.
+.TP
+0
+(default) This provides the same behavior as for \-1,
+but printk warnings are written for processes that
+perform writes when the file offset is not 0.
+.TP
+1
+Respect the file offset when writing strings into
+.I /proc/sys
+files.
+Multiple writes will
+.I append
+to the value buffer.
+Anything written beyond the maximum length
+of the value buffer will be ignored.
+Writes to numeric
+.I /proc/sys
+entries must always be at file offset 0 and the value must be
+fully contained in the buffer provided to
+.BR write (2).
+.\" FIXME .
+.\"     With /proc/sys/kernel/sysctl_writes_strict==1, writes at an
+.\"     offset other than 0 do not generate an error. Instead, the
+.\"     write() succeeds, but the file is left unmodified.
+.\"     This is surprising. The behavior may change in the future.
+.\"     See thread.gmane.org/gmane.linux.man/9197
+.\"		From: Michael Kerrisk (man-pages <mtk.manpages@...>
+.\"		Subject: sysctl_writes_strict documentation + an oddity?
+.\"		Newsgroups: gmane.linux.man, gmane.linux.kernel
+.\"		Date: 2015-05-09 08:54:11 GMT
+.RE
+.TP
+.I /proc/sys/kernel/sysrq
+This file controls the functions allowed to be invoked by the SysRq key.
+By default,
+the file contains 1, meaning that every possible SysRq request is allowed
+(in older kernel versions, SysRq was disabled by default,
+and you were required to specifically enable it at run-time,
+but this is not the case any more).
+Possible values in this file are:
+.RS
+.TP 5
+0
+Disable sysrq completely
+.TP
+1
+Enable all functions of sysrq
+.TP
+> 1
+Bit mask of allowed sysrq functions, as follows:
+.PD 0
+.RS
+.TP 5
+\ \ 2
+Enable control of console logging level
+.TP
+\ \ 4
+Enable control of keyboard (SAK, unraw)
+.TP
+\ \ 8
+Enable debugging dumps of processes etc.
+.TP
+\ 16
+Enable sync command
+.TP
+\ 32
+Enable remount read-only
+.TP
+\ 64
+Enable signaling of processes (term, kill, oom-kill)
+.TP
+128
+Allow reboot/poweroff
+.TP
+256
+Allow nicing of all real-time tasks
+.RE
+.PD
+.RE
+.IP
+This file is present only if the
+.B CONFIG_MAGIC_SYSRQ
+kernel configuration option is enabled.
+For further details see the Linux kernel source file
+.I Documentation/admin\-guide/sysrq.rst
+.\" commit 9d85025b0418163fae079c9ba8f8445212de8568
+(or
+.I Documentation/sysrq.txt
+before Linux 4.10).
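+.IP
+The following sketch decodes a value from this file
+using only the bit values listed above.
+The function name
+.I sysrq_decode
+and the short descriptions it prints are illustrative assumptions,
+not kernel output.

```shell
# Hypothetical helper: list the SysRq function groups enabled by a
# value read from /proc/sys/kernel/sysrq, using the bit values above.
sysrq_decode() {
    mask=$1
    case $mask in
        0) echo "disabled"; return 0 ;;
        1) echo "all functions enabled"; return 0 ;;
    esac
    # For any other value, test each documented bit in turn.
    for bit in 2 4 8 16 32 64 128 256; do
        [ $(( mask & bit )) -ne 0 ] || continue
        case $bit in
            2)   echo "console logging level" ;;
            4)   echo "keyboard (SAK, unraw)" ;;
            8)   echo "debugging dumps" ;;
            16)  echo "sync" ;;
            32)  echo "remount read-only" ;;
            64)  echo "signaling of processes" ;;
            128) echo "reboot/poweroff" ;;
            256) echo "nicing of real-time tasks" ;;
        esac
    done
}
```

+.IP
+For example, the value 176 (128 + 32 + 16) permits only sync,
+remount read-only, and reboot/poweroff.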
+.TP
+.I /proc/sys/kernel/version
+This file contains a string such as:
+.IP
+.in +4n
+.EX
+#5 Wed Feb 25 21:49:24 MET 1998
+.EE
+.in
+.IP
+The "#5" means that
+this is the fifth kernel built from this source base and the
+date following it indicates the time the kernel was built.
+.TP
+.IR /proc/sys/kernel/threads\-max " (since Linux 2.3.11)"
+.\" The following is based on Documentation/sysctl/kernel.txt
+This file specifies the system-wide limit on the number of
+threads (tasks) that can be created on the system.
+.IP
+Since Linux 4.1,
+.\" commit 230633d109e35b0a24277498e773edeb79b4a331
+the value that can be written to
+.I threads\-max
+is bounded.
+The minimum value that can be written is 20.
+The maximum value that can be written is given by the
+constant
+.B FUTEX_TID_MASK
+(0x3fffffff).
+If a value outside of this range is written to
+.IR threads\-max ,
+the error
+.B EINVAL
+occurs.
+.IP
+The value written is checked against the available RAM pages.
+If the thread structures would occupy too much (more than 1/8th)
+of the available RAM pages,
+.I threads\-max
+is reduced accordingly.
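+.IP
+As a rough sketch of the range check described above
+(the function name
+.I threads_max_writable
+is hypothetical),
+a script could validate a candidate value before writing it:

```shell
# Hypothetical helper: succeed if a value lies in the range that can
# be written to threads-max since Linux 4.1, that is, from 20 up to
# FUTEX_TID_MASK (0x3fffffff). Values outside this range give EINVAL.
# This does not model the later reduction based on available RAM.
threads_max_writable() {
    [ "$1" -ge 20 ] && [ "$1" -le $(( 0x3fffffff )) ]
}
```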
+.TP
+.IR /proc/sys/kernel/yama/ptrace_scope " (since Linux 3.5)"
+See
+.BR ptrace (2).
+.TP
+.IR /proc/sys/kernel/zero\-paged " (PowerPC only)"
+This file
+contains a flag.
+When enabled (nonzero), Linux-PPC will pre-zero pages in
+the idle loop, possibly speeding up get_free_pages.
+.TP
+.I /proc/sys/net
+This directory contains files and subdirectories for kernel variables
+related to networking.
+Explanations for some of the files under this directory can be found in
+.BR tcp (7)
+and
+.BR ip (7).
+.TP
+.I /proc/sys/net/core/bpf_jit_enable
+See
+.BR bpf (2).
+.TP
+.I /proc/sys/net/core/somaxconn
+This file defines a ceiling value for the
+.I backlog
+argument of
+.BR listen (2);
+see the
+.BR listen (2)
+manual page for details.
+.TP
+.I /proc/sys/proc
+This directory may be empty.
+.TP
+.I /proc/sys/sunrpc
+This directory contains settings for Sun remote procedure call (RPC),
+which is used by the Network File System (NFS).
+On some systems, it is not present.
+.TP
+.IR /proc/sys/user " (since Linux 4.9)"
+See
+.BR namespaces (7).
+.TP
+.I /proc/sys/vm/
+This directory contains files for memory management tuning, buffer, and
+cache management.
+.TP
+.IR /proc/sys/vm/admin_reserve_kbytes " (since Linux 3.10)"
+.\" commit 4eeab4f5580d11bffedc697684b91b0bca0d5009
+This file defines the amount of free memory (in KiB) on the system that
+should be reserved for users with the capability
+.BR CAP_SYS_ADMIN .
+.IP
+The default value in this file is the minimum of [3% of free pages, 8MiB]
+expressed as KiB.
+The default is intended to provide enough for the superuser
+to log in and kill a process, if necessary,
+under the default overcommit 'guess' mode (i.e., 0 in
+.IR /proc/sys/vm/overcommit_memory ).
+.IP
+Systems running in "overcommit never" mode (i.e., 2 in
+.IR /proc/sys/vm/overcommit_memory )
+should increase the value in this file to account
+for the full virtual memory size of the programs used to recover (e.g.,
+.BR login (1),
+.BR ssh (1),
+and
+.BR top (1)).
+Otherwise, the superuser may not be able to log in to recover the system.
+For example, on x86-64 a suitable value is 131072 (128MiB reserved).
+.IP
+Changing the value in this file takes effect whenever
+an application requests memory.
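+.IP
+The default described above can be approximated as follows.
+This is a sketch only:
+the function name
+.I default_admin_reserve_kb
+is hypothetical, the argument is assumed to be free memory in KiB,
+and the kernel's own rounding may differ.

```shell
# Hypothetical helper: approximate the default reserve in KiB as the
# minimum of 3% of free memory and 8 MiB (8192 KiB).
# Argument: free memory in KiB. Integer division rounds down.
default_admin_reserve_kb() {
    three_percent=$(( $1 * 3 / 100 ))
    cap=8192
    if [ "$three_percent" -lt "$cap" ]; then
        echo "$three_percent"
    else
        echo "$cap"
    fi
}
```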
+.TP
+.IR /proc/sys/vm/compact_memory " (since Linux 2.6.35)"
+When 1 is written to this file, all zones are compacted such that free
+memory is available in contiguous blocks where possible.
+The effect of this action can be seen by examining
+.IR /proc/buddyinfo .
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_COMPACTION .
+.TP
+.IR /proc/sys/vm/drop_caches " (since Linux 2.6.16)"
+Writing to this file causes the kernel to drop clean caches, dentries, and
+inodes from memory, causing that memory to become free.
+This can be useful for memory management testing and
+performing reproducible filesystem benchmarks.
+Because writing to this file causes the benefits of caching to be lost,
+it can degrade overall system performance.
+.IP
+To free pagecache, use:
+.IP
+.in +4n
+.EX
+echo 1 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+To free dentries and inodes, use:
+.IP
+.in +4n
+.EX
+echo 2 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+To free pagecache, dentries, and inodes, use:
+.IP
+.in +4n
+.EX
+echo 3 > /proc/sys/vm/drop_caches
+.EE
+.in
+.IP
+Because writing to this file is a nondestructive operation and dirty objects
+are not freeable, the
+user should run
+.BR sync (1)
+first.
+.TP
+.IR /proc/sys/vm/hugetlb_shm_group " (since Linux 2.6.7)"
+This writable file contains a group ID that is allowed
+to allocate memory using huge pages.
+If a process has a filesystem group ID or any supplementary group ID that
+matches this group ID,
+then it can make huge-page allocations without holding the
+.B CAP_IPC_LOCK
+capability; see
+.BR memfd_create (2),
+.BR mmap (2),
+and
+.BR shmget (2).
+.TP
+.IR /proc/sys/vm/legacy_va_layout " (since Linux 2.6.9)"
+.\" The following is from Documentation/filesystems/proc.txt
+If nonzero, this disables the new 32-bit memory-mapping layout;
+the kernel will use the legacy (2.4) layout for all processes.
+.TP
+.IR /proc/sys/vm/memory_failure_early_kill " (since Linux 2.6.32)"
+.\" The following is based on the text in Documentation/sysctl/vm.txt
+Control how to kill processes when an uncorrected memory error
+(typically a 2-bit error in a memory module)
+that cannot be handled by the kernel
+is detected in the background by hardware.
+In some cases (like the page still having a valid copy on disk),
+the kernel will handle the failure
+transparently without affecting any applications.
+But if there is no other up-to-date copy of the data,
+it will kill processes to prevent any data corruptions from propagating.
+.IP
+The file has one of the following values:
+.RS
+.TP
+.B 1
+Kill all processes that have the corrupted-and-not-reloadable page mapped
+as soon as the corruption is detected.
+Note that this is not supported for a few types of pages,
+such as kernel internally
+allocated data or the swap cache, but works for the majority of user pages.
+.TP
+.B 0
+Unmap the corrupted page from all processes and kill a process
+only if it tries to access the page.
+.RE
+.IP
+The kill is performed using a
+.B SIGBUS
+signal with
+.I si_code
+set to
+.BR BUS_MCEERR_AO .
+Processes can handle this if they want to; see
+.BR sigaction (2)
+for more details.
+.IP
+This feature is active only on architectures/platforms with advanced machine
+check handling and depends on the hardware capabilities.
+.IP
+Applications can override the
+.I memory_failure_early_kill
+setting individually with the
+.BR prctl (2)
+.B PR_MCE_KILL
+operation.
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_MEMORY_FAILURE .
+.TP
+.IR /proc/sys/vm/memory_failure_recovery " (since Linux 2.6.32)"
+.\" The following is based on the text in Documentation/sysctl/vm.txt
+Enable memory failure recovery (when supported by the platform).
+.RS
+.TP
+.B 1
+Attempt recovery.
+.TP
+.B 0
+Always panic on a memory failure.
+.RE
+.IP
+Present only if the kernel was configured with
+.BR CONFIG_MEMORY_FAILURE .
+.TP
+.IR /proc/sys/vm/oom_dump_tasks " (since Linux 2.6.25)"
+.\" The following is from Documentation/sysctl/vm.txt
+Enables a system-wide task dump (excluding kernel threads) to be
+produced when the kernel performs an OOM-killing.
+The dump includes the following information
+for each task (thread, process):
+thread ID, real user ID, thread group ID (process ID),
+virtual memory size, resident set size,
+the CPU that the task is scheduled on,
+oom_adj score (see the description of
+.IR /proc/ pid /oom_adj ),
+and command name.
+This is helpful to determine why the OOM-killer was invoked
+and to identify the rogue task that caused it.
+.IP
+If this contains the value zero, this information is suppressed.
+On very large systems with thousands of tasks,
+it may not be feasible to dump the memory state information for each one.
+Such systems should not be forced to incur a performance penalty in
+OOM situations when the information may not be desired.
+.IP
+If this is set to nonzero, this information is shown whenever the
+OOM-killer actually kills a memory-hogging task.
+.IP
+The default value is 0.
+.TP
+.IR /proc/sys/vm/oom_kill_allocating_task " (since Linux 2.6.24)"
+.\" The following is from Documentation/sysctl/vm.txt
+This enables or disables killing the OOM-triggering task in
+out-of-memory situations.
+.IP
+If this is set to zero, the OOM-killer will scan through the entire
+tasklist and select a task based on heuristics to kill.
+This normally selects a rogue memory-hogging task that
+frees up a large amount of memory when killed.
+.IP
+If this is set to nonzero, the OOM-killer simply kills the task that
+triggered the out-of-memory condition.
+This avoids a possibly expensive tasklist scan.
+.IP
+If
+.I /proc/sys/vm/panic_on_oom
+is nonzero, it takes precedence over whatever value is used in
+.IR /proc/sys/vm/oom_kill_allocating_task .
+.IP
+The default value is 0.
+.TP
+.IR /proc/sys/vm/overcommit_kbytes " (since Linux 3.14)"
+.\" commit 49f0ce5f92321cdcf741e35f385669a421013cb7
+This writable file provides an alternative to
+.I /proc/sys/vm/overcommit_ratio
+for controlling the
+.I CommitLimit
+when
+.I /proc/sys/vm/overcommit_memory
+has the value 2.
+It allows the amount of memory overcommitting to be specified as
+an absolute value (in kB),
+rather than as a percentage, as is done with
+.IR overcommit_ratio .
+This allows for finer-grained control of
+.I CommitLimit
+on systems with extremely large memory sizes.
+.IP
+Only one of
+.I overcommit_kbytes
+or
+.I overcommit_ratio
+can have an effect:
+if
+.I overcommit_kbytes
+has a nonzero value, then it is used to calculate
+.IR CommitLimit ,
+otherwise
+.I overcommit_ratio
+is used.
+Writing a value to either of these files causes the
+value in the other file to be set to zero.
+.TP
+.I /proc/sys/vm/overcommit_memory
+This file contains the kernel virtual memory accounting mode.
+Values are:
+.RS
+.IP
+0: heuristic overcommit (this is the default)
+.br
+1: always overcommit, never check
+.br
+2: always check, never overcommit
+.RE
+.IP
+In mode 0, calls of
+.BR mmap (2)
+with
+.B MAP_NORESERVE
+are not checked, and the default check is very weak,
+leading to the risk of getting a process "OOM-killed".
+.IP
+In mode 1, the kernel pretends there is always enough memory,
+until memory actually runs out.
+One use case for this mode is scientific computing applications
+that employ large sparse arrays.
+Before Linux 2.6.0, any nonzero value implied mode 1.
+.IP
+In mode 2 (available since Linux 2.6), the total virtual address space
+that can be allocated
+.RI ( CommitLimit
+in
+.IR /proc/meminfo )
+is calculated as
+.IP
+.in +4n
+.EX
+CommitLimit = (total_RAM \- total_huge_TLB) *
+	      overcommit_ratio / 100 + total_swap
+.EE
+.in
+.IP
+where:
+.RS
+.IP \[bu] 3
+.I total_RAM
+is the total amount of RAM on the system;
+.IP \[bu]
+.I total_huge_TLB
+is the amount of memory set aside for huge pages;
+.IP \[bu]
+.I overcommit_ratio
+is the value in
+.IR /proc/sys/vm/overcommit_ratio ;
+and
+.IP \[bu]
+.I total_swap
+is the amount of swap space.
+.RE
+.IP
+For example, on a system with 16 GB of physical RAM, 16 GB
+of swap, no space dedicated to huge pages, and an
+.I overcommit_ratio
+of 50, this formula yields a
+.I CommitLimit
+of 24 GB.
+.IP
+Since Linux 3.14, if the value in
+.I /proc/sys/vm/overcommit_kbytes
+is nonzero, then
+.I CommitLimit
+is instead calculated as:
+.IP
+.in +4n
+.EX
+CommitLimit = overcommit_kbytes + total_swap
+.EE
+.in
+.IP
+See also the description of
+.I /proc/sys/vm/admin_reserve_kbytes
+and
+.IR /proc/sys/vm/user_reserve_kbytes .
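+.IP
+The two formulas above can be sketched as a shell function.
+The name
+.I commit_limit_kb
+and its argument order are assumptions made for illustration;
+all sizes are taken to be in kB.

```shell
# Hypothetical helper: compute CommitLimit (in kB) for mode 2.
# Arguments (all in kB except the ratio, which is a percentage):
#   $1 total_RAM   $2 total_huge_TLB   $3 overcommit_ratio
#   $4 total_swap  $5 overcommit_kbytes (optional; default 0)
commit_limit_kb() {
    kbytes=${5:-0}
    if [ "$kbytes" -ne 0 ]; then
        # A nonzero overcommit_kbytes replaces the ratio-based term.
        echo $(( kbytes + $4 ))
    else
        echo $(( ($1 - $2) * $3 / 100 + $4 ))
    fi
}
```

+.IP
+With 16 GB of RAM and 16 GB of swap (16777216 kB each),
+no huge pages, and a ratio of 50,
+.I "commit_limit_kb 16777216 0 50 16777216"
+prints 25165824, that is, 24 GB, matching the example above.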
+.TP
+.IR /proc/sys/vm/overcommit_ratio " (since Linux 2.6.0)"
+This writable file defines a percentage by which memory
+can be overcommitted.
+The default value in the file is 50.
+See the description of
+.IR /proc/sys/vm/overcommit_memory .
+.TP
+.IR /proc/sys/vm/panic_on_oom " (since Linux 2.6.18)"
+.\" The following is adapted from Documentation/sysctl/vm.txt
+This enables or disables a kernel panic in
+an out-of-memory situation.
+.IP
+If this file is set to the value 0,
+the kernel's OOM-killer will kill some rogue process.
+Usually, the OOM-killer is able to kill a rogue process and the
+system will survive.
+.IP
+If this file is set to the value 1,
+then the kernel normally panics when out-of-memory happens.
+However, if a process limits allocations to certain nodes
+using memory policies
+.RB ( mbind (2)
+.BR MPOL_BIND )
+or cpusets
+.RB ( cpuset (7))
+and those nodes reach memory exhaustion status,
+one process may be killed by the OOM-killer.
+No panic occurs in this case:
+because other nodes' memory may be free,
+this means the system as a whole may not have reached
+an out-of-memory situation yet.
+.IP
+If this file is set to the value 2,
+the kernel always panics when an out-of-memory condition occurs.
+.IP
+The default value is 0.
+The values 1 and 2 are intended for failover in clustered systems;
+select either according to your failover policy.
+.TP
+.I /proc/sys/vm/swappiness
+.\" The following is from Documentation/sysctl/vm.txt
+The value in this file controls how aggressively the kernel will swap
+memory pages.
+Higher values increase aggressiveness, lower values
+decrease aggressiveness.
+The default value is 60.
+.TP
+.IR /proc/sys/vm/user_reserve_kbytes " (since Linux 3.10)"
+.\" commit c9b1d0981fcce3d9976d7b7a56e4e0503bc610dd
+Specifies an amount of memory (in KiB) to reserve for user processes.
+This is intended to prevent a user from starting a single memory-hogging
+process that leaves them unable to recover (i.e., to kill the hog).
+The value in this file has an effect only when
+.I /proc/sys/vm/overcommit_memory
+is set to 2 ("overcommit never" mode).
+In this case, the system reserves an amount of memory that is the minimum
+of [3% of current process size,
+.IR user_reserve_kbytes ].
+.IP
+The default value in this file is the minimum of [3% of free pages, 128MiB]
+expressed as KiB.
+.IP
+If the value in this file is set to zero,
+then a user will be allowed to allocate all free memory with a single process
+(minus the amount reserved by
+.IR /proc/sys/vm/admin_reserve_kbytes ).
+Any subsequent attempts to execute a command will result in
+"fork: Cannot allocate memory".
+.IP
+Changing the value in this file takes effect whenever
+an application requests memory.
+.TP
+.IR /proc/sys/vm/unprivileged_userfaultfd " (since Linux 5.2)"
+.\" cefdca0a86be517bc390fc4541e3674b8e7803b0
+This (writable) file exposes a flag that controls whether
+unprivileged processes are allowed to employ
+.BR userfaultfd (2).
+If this file has the value 1, then unprivileged processes may use
+.BR userfaultfd (2).
+If this file has the value 0, then only processes that have the
+.B CAP_SYS_PTRACE
+capability may employ
+.BR userfaultfd (2).
+The default value in this file is 1.
+.SH SEE ALSO
+.BR proc (5)