Diffstat (limited to 'man7/socket.7')
-rw-r--r-- man7/socket.7 64
1 files changed, 32 insertions, 32 deletions
diff --git a/man7/socket.7 b/man7/socket.7
index f12adf2d9e..c455c1cc2c 100644
--- a/man7/socket.7
+++ b/man7/socket.7
@@ -91,7 +91,7 @@ for more information on families and types.
These functions are used by the user process to send or receive packets
and to do other socket operations.
For more information see their respective manual pages.
-
+.PP
.BR socket (2)
creates a socket,
.BR connect (2)
@@ -252,7 +252,7 @@ the various system calls (e.g.,
.BR getpeername (2)),
which are generic to all socket domains,
to determine the domain of a particular socket address.
-
+.PP
To allow any type of socket address to be passed to
interfaces in the sockets API,
the type
@@ -262,7 +262,7 @@ The purpose of this type is purely to allow casting of
domain-specific socket address types to a "generic" type,
so as to avoid compiler warnings about type mismatches in
calls to the sockets API.
-
+.PP
In addition, the sockets API provides the data type
.IR "struct sockaddr_storage".
This type
@@ -272,13 +272,13 @@ address structures; it is large enough and is aligned properly.
IPv6 socket addresses.)
The structure includes the following field, which can be used to identify
the type of socket address actually stored in the structure:
-
+.PP
.in +4n
.nf
sa_family_t ss_family;
.fi
.in
-
+.PP
The
.I sockaddr_storage
structure is useful in programs that must handle socket addresses
@@ -323,7 +323,7 @@ non-zero value which is less than the packet's data length,
the packet will be truncated to the length returned.
If the value returned by the filter is greater than or equal to the
packet's data length, the packet is allowed to proceed unmodified.
-
+.IP
The argument for
.BR SO_ATTACH_FILTER
is a
@@ -346,13 +346,13 @@ is a file descriptor returned by the
.BR bpf (2)
system call and must refer to a program of type
.BR BPF_PROG_TYPE_SOCKET_FILTER.
-
+.IP
These options may be set multiple times for a given socket,
each time replacing the previous filter program.
The classic and extended versions may be called on the same socket,
but the previous filter will always be replaced such that a socket
never has more than one filter defined.
-
+.IP
Both classic and extended BPF are explained in the kernel source file
.I Documentation/networking/filter.txt
.TP
@@ -367,7 +367,7 @@ program which defines how packets are assigned to
the sockets in the reuseport group (that is, all sockets which have
.BR SO_REUSEPORT
set and are using the same local address to receive packets).
-
+.IP
The BPF program must return an index between 0 and N\-1 representing
the socket which should receive the packet
(where N is the number of sockets in the group).
@@ -375,7 +375,7 @@ If the BPF program returns an invalid index,
socket selection will fall back to the plain
.BR SO_REUSEPORT
mechanism.
-
+.IP
Sockets are numbered in the order in which they are added to the group
(that is, the order of
.BR bind (2)
@@ -387,10 +387,10 @@ When a socket is removed from a reuseport group (via
.BR close (2)),
the last socket in the group will be moved into the closed socket's
position.
-
+.IP
These options may be set repeatedly at any time on any socket in the group
to replace the current BPF program used by all sockets in the group.
-
+.IP
.BR SO_ATTACH_REUSEPORT_CBPF
takes the same argument type as
.BR SO_ATTACH_FILTER
@@ -398,7 +398,7 @@ and
.BR SO_ATTACH_REUSEPORT_EBPF
takes the same argument type as
.BR SO_ATTACH_BPF.
-
+.IP
UDP support for this feature is available since Linux 4.5;
TCP support is available since Linux 4.6.
.TP
@@ -420,7 +420,7 @@ sockets.
It is not supported for packet sockets (use normal
.BR bind (2)
there).
-
+.IP
Before Linux 3.8,
this socket option could be set, but could not be retrieved with
.BR getsockopt (2).
@@ -495,7 +495,7 @@ Expects an integer boolean flag.
.\" setsockopt 70da268b569d32a9fddeea85dc18043de9d89f89
Sets or gets the CPU affinity of a socket.
Expects an integer flag.
-
+.IP
.in +4n
.nf
int cpu = 1;
@@ -503,7 +503,7 @@ socklen_t len = sizeof(cpu);
setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, len);
.fi
.in
-
+.IP
Because all of the packets for a single stream
(i.e., all packets for the same 4-tuple)
arrive on the single RX queue that is associated with a particular CPU,
@@ -573,7 +573,7 @@ These filters include any set using the socket options
.BR SO_ATTACH_REUSEPORT_CBPF
and
.BR SO_ATTACH_REUSEPORT_EBPF .
-
+.IP
The typical use case is for a privileged process to set up a raw socket
(an operation that requires the
.BR CAP_NET_RAW
@@ -582,7 +582,7 @@ capability), apply a restrictive filter, set the
option,
and then either drop its privileges or pass the socket file descriptor
to an unprivileged process via a UNIX domain socket.
-
+.IP
Once the
.BR SO_LOCK_FILTER
option has been enabled, attempts to change or remove the filter
@@ -636,7 +636,7 @@ sockets, sets the value of the "peek offset" for the
system call when used with
.BR MSG_PEEK
flag.
-
+.IP
When this option is set to a negative value
(it is set to \-1 for all new sockets),
traditional behavior is provided:
@@ -644,14 +644,14 @@ traditional behavior is provided:
with the
.BR MSG_PEEK
flag will peek data from the front of the queue.
-
+.IP
When the option is set to a value greater than or equal to zero,
then the next peek at data queued in the socket will occur at
the byte offset specified by the option value.
At the same time, the "peek offset" will be
incremented by the number of bytes that were peeked from the queue,
so that a subsequent peek will return the next data in the queue.
-
+.IP
If data is removed from the front of the queue via a call to
.BR recv (2)
(or similar) without the
@@ -663,22 +663,22 @@ flag will cause the "peek offset" to be adjusted to maintain
the correct relative position in the queued data,
so that a subsequent peek will retrieve the data that would have been
retrieved had the data not been removed.
-
+.IP
For datagram sockets, if the "peek offset" points to the middle of a packet,
the data returned will be marked with the
.BR MSG_TRUNC
flag.
-
+.IP
The following example serves to illustrate the use of
.BR SO_PEEK_OFF .
Suppose a stream socket has the following queued input data:
-
+.IP
aabbccddeeff
.IP
The following sequence of
.BR recv (2)
calls would have the effect noted in the comments:
-
+.IP
.in +4n
.nf
int ov = 4; // Set peek offset to 4
@@ -858,7 +858,7 @@ To prevent port hijacking,
all of the processes binding to the same address must have the same
effective UID.
This option can be employed with both TCP and UDP sockets.
-
+.IP
For TCP sockets, this option allows
.BR accept (2)
load distribution in a multi-threaded server to be improved by
@@ -870,7 +870,7 @@ thread that distributes connections,
or having multiple threads that compete to
.BR accept (2)
from the same socket.
-
+.IP
For UDP sockets,
the use of this option can provide better distribution
of incoming datagrams to multiple processes (or threads) as compared
@@ -938,7 +938,7 @@ Increasing this value requires
The default for this option is controlled by the
.I /proc/sys/net/core/busy_read
file.
-
+.IP
The value in the
.I /proc/sys/net/core/busy_poll
file determines how long
@@ -948,11 +948,11 @@ and
will busy poll when they operate on sockets with
.BR SO_BUSY_POLL
set and no events to report are found.
-
+.IP
In both cases,
busy polling will only be done when the socket last received data
from a network device that supports this option.
-
+.IP
While busy polling may improve latency of some applications,
care must be taken when using it since this will increase
both CPU utilization and power usage.
@@ -1037,7 +1037,7 @@ per socket.
.SS Ioctls
These operations can be accessed using
.BR ioctl (2):
-
+.PP
.in +4n
.nf
.IB error " = ioctl(" ip_socket ", " ioctl_type ", " &value_result ");"
@@ -1140,7 +1140,7 @@ Linux assumes that half of the send/receive buffer is used for internal
kernel structures; thus the values in the corresponding
.I /proc
files are twice what can be observed on the wire.
-
+.PP
Linux will allow port reuse only with the
.B SO_REUSEADDR
option