author Andy Pan <i@andypan.me> 2024-08-01 11:38:38 +0000
committer Alejandro Colomar <alx@kernel.org> 2024-08-21 23:51:03 +0200
commit 71988df59d2585cac1147a6f785df65693b7d77f
tree c16793357bd8fd49674abbb5a2d3e4ff7539ef93
parent 893db5f60c73259b29534525009cfb98f03dbf43
epoll.7: Clarify the event distribution under edge-triggered mode
For the moment, edge-triggered epoll generates an event for each receipt of a chunk of data. That is to say, epoll_wait() will return and tell us a monitored file descriptor is ready whenever there has been new activity on that FD since we were last informed about it. This is not a real _edge_ implementation for epoll, but it has been working this way for years, and plenty of projects rely on it to eliminate the overhead of one read(2) system call per wakeup event.

Several renowned open-source projects rely on this feature for their notification mechanism (with eventfd): they register the eventfd with EPOLLET and avoid calling read(2) on the eventfd when there is a wakeup event (the eventfd being written). Examples: nginx [1], netty [2], tokio [3], libevent [4], etc. [5] These projects are widely used in today's Internet infrastructure. Thus, changing this behavior of epoll ET would fundamentally break them and have a significant negative impact. Linux has changed this behavior for pipes before [6], breaking some Android libraries, and that change was later "reverted" somehow. [7] [8]

Nevertheless, the paragraph in the manual page describing this characteristic of epoll ET seems ambiguous. I think a more explicit sentence should be used to clarify it. We have recently been improving the notification mechanism of libuv by exploiting this feature with eventfd, which brings us a significant performance boost. [9]

Therefore, we (as well as the maintainers of nginx, netty, tokio, etc.) would feel secure building an enhanced notification function on this feature if the man pages guaranteed that this implementation of epoll ET will be retained for backward compatibility.
[1]: https://github.com/nginx/nginx/blob/efc6a217b92985a1ee211b6bb7337cd2f62deb90/src/event/modules/ngx_epoll_module.c#L386-L457
[2]: https://github.com/netty/netty/pull/9192
[3]: https://github.com/tokio-rs/mio/blob/309daae21ecb1d46203a7dbc0cf4c80310240cba/src/sys/unix/waker.rs#L111-L143
[4]: https://github.com/libevent/libevent/blob/525f5d0a14c9c103be750f2ca175328c25505ea4/event.c#L2597-L2614
[5]: https://github.com/libuv/libuv/pull/4400#issuecomment-2123798748
[6]: https://lkml.iu.edu/hypermail/linux/kernel/2010.1/04363.html
[7]: https://github.com/torvalds/linux/commit/3a34b13a88caeb2800ab44a4918f230041b37dd9
[8]: https://github.com/torvalds/linux/commit/3b844826b6c6affa80755254da322b017358a2f4
[9]: https://github.com/libuv/libuv/pull/4400#issuecomment-2103232402

Signed-off-by: Andy Pan <i@andypan.me>
Cc: <linux-api@vger.kernel.org>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Message-ID: <20240801-epoll-et-desc-v5-1-7fcb9260a3b2@andypan.me>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
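
The eventfd pattern described in the commit message can be sketched in C roughly as follows. This is a minimal illustration, not code taken from any of the cited projects: the helper name count_et_wakeups() is hypothetical, and the sketch assumes a Linux kernel that behaves as the commit message describes, where each write(2) to an eventfd registered with EPOLLET produces a fresh event even though the counter is never drained with read(2).

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the man page or the cited projects):
 * write to an edge-triggered eventfd `writes` times without ever
 * reading it, and count how many events epoll_wait() reports.
 * Returns the number of wakeups observed, or -1 on error.
 */
static int count_et_wakeups(int writes)
{
    int efd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
    if (efd == -1)
        return -1;

    int epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd == -1) {
        close(efd);
        return -1;
    }

    /* Register the eventfd in edge-triggered mode. */
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET, .data.fd = efd };
    int wakeups = -1;

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) == 0) {
        wakeups = 0;
        for (int i = 0; i < writes; i++) {
            /* Signal the eventfd; note that we never read(2) it back. */
            uint64_t one = 1;
            if (write(efd, &one, sizeof(one)) != (ssize_t) sizeof(one)) {
                wakeups = -1;
                break;
            }

            /* Poll without blocking; a new write should yield a new event. */
            struct epoll_event out;
            int n = epoll_wait(epfd, &out, 1, 0);
            if (n == -1) {
                wakeups = -1;
                break;
            }
            wakeups += n;
        }
    }

    close(epfd);
    close(efd);
    return wakeups;
}
```

If the behavior described above holds, count_et_wakeups(3) should report three wakeups, one per write, which is what lets these projects skip the read(2) call on every notification.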
-rw-r--r-- man/man7/epoll.7 3
1 files changed, 2 insertions, 1 deletions
diff --git a/man/man7/epoll.7 b/man/man7/epoll.7
index 9515001312..86e5f83631 100644
--- a/man/man7/epoll.7
+++ b/man/man7/epoll.7
@@ -121,7 +121,8 @@ input buffer;
meanwhile the remote peer might be expecting a response based on the
data it already sent.
The reason for this is that edge-triggered mode
-delivers events only when changes occur on the monitored file descriptor.
+delivers events only when changes occur on the monitored file descriptor,
+that is, an event will be generated upon each receipt of a chunk of data.
So, in step
.B 5
the caller might end up waiting for some data that is already present inside