(Please Cc: Frans and myself on replies, we're not subscribed to netdev@.) (Resent from mutt, after Opera trashed my headers...) (DaveM, this was Cc:d to you earlier today under a different subject. The failing wget is new though.) Dear Kernel Hackers, I am getting rather frequent "TCP(...): Application bug, race in MSG_PEEK." complaints for fetchmail these days. The strange part is that fetchmail is a single-threaded application that does not deal with URG data. I cannot see what fetchmail would race against in this situation. Frans Pop has found interesting patterns probably related to segment or packet/payload sizes (1460 bytes!) - see his message quoted below. Application use: Fetchmail does use MSG_PEEK in several places to look ahead. The same process that peeks will later read the data. It can happen that we peek at 1 byte, then more bytes, or that we peek at 512 bytes and read only smaller amounts (say, 30 - 70) of them before we peek again. As to the application source code: Look for instance for fm_peek in the function SockRead() in <http://mknod.org/svn/fetchmail/branches/BRANCH_6-3/socket.c> and its callers. Can we be sure that (a) the kernel correctly handles peek_seq and tp->copied_seq and their comparison (see Frans's message below - is the != in the if statement actually the right thing or should this be a > or <?), and (b) that the message is printed only if there is a real app bug (that I fail to see), and (c) that the kernel handles package boundaries, wraparounds, and buffers correctly? Might the reception of further data in the socket's receive buffer trigger the message? As in: between peek and read, or between read and peek, further data arrives, and triggers the message? Might the kernel's TCP code be confused about absolute vs. relative sequence/position indexes/pointers/counters inside the TCP code? Thanks a lot in advance. (Frans's earlier message included below.) Best ...
I cannot resist myself from noting that this certainly wasn't the patch
1460 is size of the maximum wire segment when not using timestamps (1500
MTU - (20+20 headers), would be 1448 with timestamps). So basically the
skbs in the receiver queue will be of that length (holds until the driver
What would you think about the following, untested patch... I suppose it
is enough to capture the racy situations except with that crazy urg hole,
grr (I suppose that will need just another variable to do the offset
of one).
--
i.
[RFC PATCH] tcp: fix MSG_PEEK race check
Commit 518a09ef11 (tcp: Fix recvmsg MSG_PEEK influence of
blocking behavior) lets the loop run longer than this check
did previously expect, so we need to be more careful with
this check and consider the work we have been doing.
I'm a bit unsure if this improved check can still fail as
if (!sock_flag(sk, SOCK_URGINLINE)) {
++*seq;
...
does not increment copied.
Compile tested.
Signed-off-by: Ilpo JCan you please elaborate why you think that? It may be horribly broken (I've never claimed to be a C coder, and probably never will), but it I'll give your patch a try and report back. Thanks, FJP --
On Thu, 7 May 2009, Frans Pop wrote: > On Thursday 07 May 2009, Ilpo J
Ah, yes. You're absolutely right. Duh. gcc does warn about it, but I must have missed that. --
I've been running with the patch for 2 days now and have not seen any more MSG_PEEK errors, so as far as I'm concerned the patch does fix the issue (needed the time in order to be confident of that). So: Tested-by: Frans Pop <elendil@planet.nl> Suggest to also add a CC for stable. Cheers, FJP --
On Sat, 9 May 2009, Frans Pop wrote: > On Thursday 07 May 2009, Ilpo J
Hmm. I wonder if it wouldn't be better to keep the two issues separate. The initial patch is a clear regression fix (4 people have reported it against fetchmail for Debian). The URG part is IMO a separate issue which I at least have never seen in practice. And my Tested-by doesn't cover the additional change either. That said, I have added the URG change (as an incremental patch) in my local git repo and will give it a go when I next build a kernel (may take a week). I don't expect to be able to confirm it fixes the URG race, but I can at least check that it doesn't cause any false messages with my --
On Mon, 11 May 2009, Frans Pop wrote: > On Monday 11 May 2009, Ilpo J
OK. I understood that there's always been a corner case with URG that could cause incorrect messages [1] and I thought the additional change was to fix that, but if this is related to the same regression then of course it's fine by me. I never claimed that. In fact, I was the one who also saw the issue with I'll still give the new patch a try on my next build :-) The rest I happily leave up to you and David. Thanks for clarifying. Cheers, FJP --
On Mon, 11 May 2009, Frans Pop wrote: > On Monday 11 May 2009, Ilpo J
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> The issue being discussed there is exactly the case where a thread is triggering the copied_seq advance in tcp_check_urg() and using MSG_PEEK at the same time. I'm looking more closely into this patch right now, but I might ask you to split the two fixes up if I can't convince myself %100 of the URG part. We've already broken URG enough lately :-) --
As long as the non-URG part goes into 2.6.30 (it's a regression fix in my perception), I'll be a happy camper - at least that would stop the kernel's finger pointing at innocent applications. 8-) OK, I presume an application is innocent if it doesn't use MSG_OOB flags - is that sufficient? -- Matthias Andree --
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> Ok, now that I've looked at this, the urg_hole part of this change has to be removed. That case being accounted for with urg_hole is exactly what the debugging message is trying to catch, where we are doing MSG_PEEK and tcp_check_urg() advances ->copied_seq on us during one of those "release_sock();/lock_sock();" sequences (which thus invoke TCP input processing). Could you please respin this patch with the URG bits removed? Thanks! --
On Sun, 17 May 2009, David Miller wrote: > From: "Ilpo J
Am 18.05.2009, 09:24 Uhr, schrieb Ilpo Järvinen WRT the earlier patch, we have one more success report from one of the users who reported the problem, namely Ian Zimmermann: <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=513695#155> -- Matthias Andree --
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> I see now what the situation is, thanks for explaining. You're account for the "*seq" advance for URG that happens in A long long time ago we had an assertion here checking whether ->copied_seq moved without our knowledge. Alexey and I found this could trigger and investigation of that is what helped us find the tcp_check_urg()+MSG_PEEK case. That's when we added this test and log message. When the MSG_PEEK test existing in the !copied if() branch of tcp_recvmsg(), so many of these code paths we are dealing with in this fix could simply not be reached when MSG_PEEK. That ++*seq could never happen, because if "copied" was non-zero and MSG_PEEK was true we would leave the loop and return from tcp_recvmsg() immediately. Now, we have to accomodate those adjustments. Since I now understand your urg_hole code I'm going to apply your v2 patch which takes care of the URG stuff. I also buy the argument that perhaps we should get rid of the log message, but look at how it helped us find this kernel regression :-) --
On Mon, 18 May 2009, David Miller wrote: > From: "Ilpo J
From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi> recvmsg can deal with it fine, because we reset the peek_seq when we print out that message. There is no way that an application writer has any clue about this interaction, where peeked bytes disappear and then reappear in the out-of-band URG byte. That's why the message is there. --
Hi, ...and this discovery is a reason to leave it in, and perhaps make sure that the check code is properly linked (through comments for lack of better means) to the actual data transfer functions. Cheers MA --
