Re: sys_paccept: disable paccept() until API design is resolved

Previous thread: Re: [PATCH] VFS: make file->f_pos access atomic on 32bit arch by Michael Trimarchi on Tuesday, September 16, 2008 - 4:11 am. (2 messages)

Next thread: [PATCH] linux/inotify.h: do not include <linux/fcntl.h> in userspace by Kirill A. Shutemov on Tuesday, September 16, 2008 - 5:22 am. (10 messages)
From: Michael Kerrisk
Date: Tuesday, September 16, 2008 - 5:05 am

Andrew,

The patch below disables the new sys_paccept() for now.  Please
apply for 2.6.27-rc, so that we do not release this API into
the wild before a conclusion has been reached about its design.

The reasons for disabling paccept() are as follows:

* The API is more complex than needed.  There is AFAICS no demonstrated
   use case that the sigset argument of this syscall serves that
   couldn't equally be served by the use of pselect/ppoll/epoll_pwait +
   traditional accept().  Roland seems to concur with this opinion
   (http://thread.gmane.org/gmane.linux.kernel/723953/focus=732255).
   I have (more than once) asked Ulrich to explain otherwise
   (http://thread.gmane.org/gmane.linux.kernel/723952/focus=731018),
   but he does not respond, so one is left to assume that he doesn't
   know of such a case.

* The use of a sigset argument is not consistent with other I/O APIs
   that can block on a single file descriptor (e.g., read(), recv(),
   connect()).

* The behavior of paccept() when interrupted by a signal is IMO
   strange: the kernel restarts the system call if SA_RESTART was set
   for the handler.  I think that it should not do this -- that it
   should behave consistently with pselect()/ppoll()/epoll_pwait(),
   which never restart, regardless of SA_RESTART.  The reasoning here
   is that the very purpose of paccept() is to wait for a connection
   or a signal, and that restarting in the latter case is probably
   never useful.  (Note: Roland disagrees on this point, believing
   that rather paccept() should be consistent with accept() in its
   behavior wrt EINTR
   (http://thread.gmane.org/gmane.linux.kernel/723953/focus=732255).)

I believe that instead, a simpler API, consistent with Ulrich's
other recent additions, is preferable:

accept4(int fd, struct sockaddr *sa, socklen_t *salen, int flags);

(This simpler API was originally proposed by Ulrich:
http://thread.gmane.org/gmane.linux.network/92072)
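
A rough sketch of how such a call might be used, next to the accept() + fcntl() sequence it would replace. The flag names SOCK_CLOEXEC and SOCK_NONBLOCK are an assumption here: they are not spelled out in this message, and are taken from the accept4() that glibc exposes on Linux under _GNU_SOURCE. Both helper names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: one call returns a descriptor that is already
 * close-on-exec and non-blocking. */
int accept_with_flags(int listen_fd)
{
    assert(listen_fd >= 0);
    return accept4(listen_fd, NULL, NULL, SOCK_CLOEXEC | SOCK_NONBLOCK);
}

/* The same result with plain accept(): three extra fcntl() calls, and a
 * window in which another thread could fork()+exec() and leak the fd. */
int accept_then_fcntl(int listen_fd)
{
    int fd = accept(listen_fd, NULL, NULL);
    int fl;

    if (fd < 0)
        return -1;
    fl = fcntl(fd, F_GETFL);
    if (fl < 0 ||
        fcntl(fd, F_SETFD, FD_CLOEXEC) < 0 ||
        fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Neither helper takes a signal set; a caller that needs to wait for "connection or signal" would pair either of them with pselect()/ppoll()/epoll_pwait().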

If this simpler API is added, then if we later ...
From: Oleg Nesterov
Date: Tuesday, September 16, 2008 - 6:04 am

Also, the implementation of sys_paccept() is not "perfect", imho.

	sys_paccept:

		ret = do_accept(...);

		if (ret < 0 && signal_pending()) {
			set_restore_sigmask();
			return ret;
		}

It doesn't check that ret == ERESTARTSYS/EINTR. I can't say this
is a bug, but let's suppose that do_accept() returns (say) -EINVAL,
and then the task is interrupted by the signal.

Now, if the signal comes after sys_paccept() checks signal_pending(),
we return -EINVAL, and the signal handler runs with the original
current->blocked mask, as expected.

However, if the signal happens in the window before signal_pending(),
we still return -EINVAL, but the signal handler runs with
->blocked == sigmask. A bit odd, but probably harmless.


Note also that unless I misread the code, do_paccept() returns
ERESTARTSYS or EINTR depending on ->sk_rcvtimeo. Yes, it is very
clear why sock_intr_errno() does this, but this doesn't make the
behaviour of paccept() more understandable.

Oleg.

--

From: Ulrich Drepper
Date: Tuesday, September 16, 2008 - 4:17 pm

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



It would unnecessarily require programs to be changed.  I've explained
that programs cannot efficiently use accept() and poll() when multiple
threads are involved.  This means in these situations you'll find a

This is because none of the other interfaces have (so far) been revised.

You use your own opinion as the deciding factor?  The behavior differs

The signal set wasn't actually my idea.  See:


I have explained the need already. You just chose to ignore it.

- --
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iEYEARECAAYFAkjQPoQACgkQ2ijCOnn/RHTNZwCfaXdw5Yhy/chAUMqR2kZE8Rsm
wzUAnA7PtvODGyAMeahl44+mqasqGS1U
=Gh2E
-----END PGP SIGNATURE-----
--

From: Michael Kerrisk
Date: Tuesday, September 16, 2008 - 5:24 pm

I'm assuming that you mean this text from another thread:


I find this very difficult to understand -- what makes it difficult is
the wording of the explanation.  Could you please take some time to
(much) more clearly explain the above, and especially to very clearly
explain why paccept() serves needs that can't be dealt with by

I already clearly stated it was my opinion, and pointed out that

[CC=+ Evgeniy Polyakov <johnpol@2ka.mipt.ru>]


No.  You still have given no clear explanation.  Please give one.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
man-pages online: http://www.kernel.org/doc/man-pages/online_pages.html
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html
--

From: Evgeniy Polyakov
Date: Wednesday, September 17, 2008 - 9:46 am

Hi Michael.


I asked to put the signal mask there to simplify life for those who
use accept() and signals. It is not 100% _required_, just as it is
not required in ppoll(). It is an optimization that should allow
signals to be blocked during code execution without lots of additional
steps (like disabling/enabling them around the call via additional syscalls).

-- 
	Evgeniy Polyakov
--

From: Nick Piggin
Date: Tuesday, September 16, 2008 - 6:22 pm

There is a good reason, and that is that if there is any questioning
of a patch that adds a userspace API, then it is much better to be
safe than sorry.

We don't get enough people reviewing these things as is, so
ignoring the review we do get is not the right thing to do. There
is much more harm in releasing the kernel with a poor API than
just holding off for another release until issues are sorted out.
--

From: Rémi
Date: Tuesday, September 16, 2008 - 11:50 pm

Hmm. In a multithreaded program, it makes a lot of sense to use blocking I/O 
in general - not just blocking accept(). Of course, this assumes that there 
is only one file descriptor to wait on at a time. This is not necessarily 
true for I/O, nor is it for accept(). For instance, modern TCP servers should 
have an IPv4 and an IPv6 socket to accept() from...

If the code path is such that only one file descriptor is being waited on at a 
time, then using blocking I/O halves the number of syscalls, and cuts to a third the 
number of context switches between hardware packet reception and userland data 

Right.

Then again... Why not recommend threaded programs to use sigwait() in a 
dedicated task and give up on the asynchronous signal handlers completely?
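
A minimal sketch of that arrangement, assuming SIGTERM and SIGUSR1 are the signals of interest; the names start_signal_thread, signal_thread, handled and last_signal are hypothetical. The thread here waits for a single signal and exits so that the example stays short; a real program would presumably loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>

static sigset_t handled;                   /* signals routed to the thread */
static volatile sig_atomic_t last_signal;  /* last signal number received */

/* Dedicated thread: takes the signal synchronously with sigwait(), so the
 * code that reacts to it runs in ordinary thread context rather than in
 * an asynchronous handler. */
static void *signal_thread(void *arg)
{
    int sig;

    (void)arg;
    if (sigwait(&handled, &sig) == 0)
        last_signal = sig;
    return NULL;
}

/* Hypothetical helper: block the signals, then start the thread. Call it
 * before creating any other threads so that they inherit the blocked mask. */
int start_signal_thread(pthread_t *tid)
{
    assert(tid != NULL);
    sigemptyset(&handled);
    sigaddset(&handled, SIGTERM);
    sigaddset(&handled, SIGUSR1);

    if (pthread_sigmask(SIG_BLOCK, &handled, NULL) != 0)
        return -1;
    return pthread_create(tid, NULL, signal_thread, NULL) == 0 ? 0 : -1;
}
```

With every signal of interest blocked in the worker threads, a blocking accept() in a worker is never interrupted by them, and no signal-mask argument is involved.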

-- 
Rémi Denis-Courmont
Maemo Software, Nokia Devices R&D
--

From: Oleg Nesterov
Date: Wednesday, September 17, 2008 - 7:30 am

And Michael asks why this behaviour (and paccept() itself) is useful.
I must admit I don't understand this either.

It is very possible that we both just need help from an expert (you).
(Ulrich, there is no irony, seriously).

Oleg.

--
