Re: Strange Application bug, race in MSG_PEEK complaints (was: Bug#513695: fetchmail: race in MSG_PEEK)

Previous thread: [PATCH]: Fix build regression from address list changes. by David Miller on Wednesday, May 6, 2009 - 3:44 pm. (1 message)

Next thread: RE: [PATCH 4/4] bnx2i: Add bnx2i iSCSI driver. by Karen Xie on Wednesday, May 6, 2009 - 4:02 pm. (1 message)
From: Matthias Andree
Date: Wednesday, May 6, 2009 - 4:02 pm

(Please Cc: Frans and myself on replies, we're not subscribed to netdev@.)
(Resent from mutt, after Opera trashed my headers...)
(DaveM, this was Cc:d to you earlier today under a different subject. The  
failing wget is new though.)

Dear Kernel Hackers,

I am getting rather frequent "TCP(...): Application bug, race in  
MSG_PEEK." complaints for fetchmail these days.
The strange part is that fetchmail is a single-threaded application that  
does not deal with URG data.
I cannot see what fetchmail would race against in this situation.

Frans Pop has found interesting patterns probably related to segment or  
packet/payload sizes (1460 bytes!) - see his message quoted below.


Application use: Fetchmail does use MSG_PEEK in several places to look  
ahead. The same process that peeks will later read the data. It can happen  
that we peek at 1 byte, then more bytes, or that we peek at 512 bytes and  
read only smaller amounts (say, 30 - 70) of them before we peek again.

As to the application source code: Look for instance for fm_peek in the  
function SockRead() in  
<http://mknod.org/svn/fetchmail/branches/BRANCH_6-3/socket.c> and its  
callers.

Can we be sure that

(a) the kernel correctly handles peek_seq and tp->copied_seq and their  
comparison (see Frans's message below - is the != in the if statement  
actually the right thing or should this be a > or <?), and

(b) that the message is printed only if there is a real app bug (that I  
fail to see), and

(c) that the kernel handles package boundaries, wraparounds, and buffers  
correctly?

Might the reception of further data in the socket's receive buffer trigger  
the message? As in: between peek and read, or between read and peek,  
further data arrives, and triggers the message?

Might the kernel's TCP code be confused about absolute vs. relative  
sequence/position indexes/pointers/counters inside the TCP code?

Thanks a lot in advance. (Frans's earlier message included below.)

Best ...
From: Ilpo Järvinen
Date: Wednesday, May 6, 2009 - 11:48 pm

I cannot resist myself from noting that this certainly wasn't the patch 

1460 is size of the maximum wire segment when not using timestamps (1500 
MTU - (20+20 headers), would be 1448 with timestamps). So basically the 
skbs in the receiver queue will be of that length (holds until the driver 

What would you think about the following, untested patch... I suppose it 
is enough to capture the racy situations except with that crazy urg hole, 
grr (I suppose that will need just another variable to do the offset
of one).

-- 
 i.

[RFC PATCH] tcp: fix MSG_PEEK race check

Commit 518a09ef11 (tcp: Fix recvmsg MSG_PEEK influence of
blocking behavior) lets the loop run longer than this check
did previously expect, so we need to be more careful with
this check and consider the work we have been doing.

I'm a bit unsure if this improved check can still fail as
	if (!sock_flag(sk, SOCK_URGINLINE)) {
		++*seq;
		...
does not increment copied.

Compile tested.

Signed-off-by: Ilpo J

Can you please elaborate why you think that? It may be horribly broken
(I've never claimed to be a C coder, and probably never will), but it


I'll give your patch a try and report back.

Thanks,
FJP
--

From: Ilpo Järvinen
Date: Thursday, May 7, 2009 - 11:48 am

On Thu, 7 May 2009, Frans Pop wrote:

> On Thursday 07 May 2009, Ilpo J

Ah, yes. You're absolutely right. Duh.
gcc does warn about it, but I must have missed that.
--


I've been running with the patch for 2 days now and have not seen any more 
MSG_PEEK errors, so as far as I'm concerned the patch does fix the issue 
(needed the time in order to be confident of that).

So:
Tested-by: Frans Pop <elendil@planet.nl>

Suggest to also add a CC for stable.

Cheers,
FJP
--

From: Ilpo Järvinen
Date: Sunday, May 10, 2009 - 11:32 pm

On Sat, 9 May 2009, Frans Pop wrote:

> On Thursday 07 May 2009, Ilpo J
From: Frans Pop
Date: Monday, May 11, 2009 - 5:50 am

Hmm. I wonder if it wouldn't be better to keep the two issues separate. 
The initial patch is a clear regression fix (4 people have reported it 
against fetchmail for Debian). The URG part is IMO a separate issue which 
I at least have never seen in practice.
And my Tested-by doesn't cover the additional change either.

That said, I have added the URG change (as an incremental patch) in my 
local git repo and will give it a go when I next build a kernel (may take 
a week). I don't expect to be able to confirm it fixes the URG race, but 
I can at least check that it doesn't cause any false messages with my 

--

From: Ilpo Järvinen
Date: Monday, May 11, 2009 - 6:32 am

On Mon, 11 May 2009, Frans Pop wrote:

> On Monday 11 May 2009, Ilpo J
From: Frans Pop
Date: Monday, May 11, 2009 - 6:54 am

OK. I understood that there's always been a corner case with URG that 
could cause incorrect messages [1] and I thought the additional change 
was to fix that, but if this is related to the same regression then of 
course it's fine by me.


I never claimed that. In fact, I was the one who also saw the issue with 


I'll still give the new patch a try on my next build :-)
The rest I happily leave up to you and David.

Thanks for clarifying.

Cheers,
FJP
--

From: Ilpo Järvinen
Date: Monday, May 11, 2009 - 7:57 am

On Mon, 11 May 2009, Frans Pop wrote:

> On Monday 11 May 2009, Ilpo J
From: David Miller
Date: Sunday, May 17, 2009 - 3:31 pm

From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>

The issue being discussed there is exactly the case where a thread
is triggering the copied_seq advance in tcp_check_urg() and using
MSG_PEEK at the same time.

I'm looking more closely into this patch right now, but I might ask
you to split the two fixes up if I can't convince myself %100 of the
URG part.  We've already broken URG enough lately :-)
--

From: Matthias Andree
Date: Monday, May 18, 2009 - 1:02 am

As long as the non-URG part goes into 2.6.30 (it's a regression fix in my  
perception), I'll be a happy camper - at least that would stop the  
kernel's finger pointing at innocent applications. 8-)

OK, I presume an application is innocent if it doesn't use MSG_OOB flags -  
is that sufficient?

-- 
Matthias Andree
--

From: David Miller
Date: Sunday, May 17, 2009 - 3:41 pm

From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>

Ok, now that I've looked at this, the urg_hole part of this change has
to be removed.

That case being accounted for with urg_hole is exactly what the
debugging message is trying to catch, where we are doing MSG_PEEK and
tcp_check_urg() advances ->copied_seq on us during one of those
"release_sock();/lock_sock();" sequences (which thus invoke TCP input
processing).

Could you please respin this patch with the URG bits removed?

Thanks!
--

From: Ilpo Järvinen
Date: Monday, May 18, 2009 - 12:24 am

On Sun, 17 May 2009, David Miller wrote:

> From: "Ilpo J
From: Matthias Andree
Date: Monday, May 18, 2009 - 8:34 am

Am 18.05.2009, 09:24 Uhr, schrieb Ilpo Järvinen  

WRT the earlier patch, we have one more success report from one of the  
users who reported the problem, namely Ian Zimmermann:

<http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=513695#155>

-- 
Matthias Andree
--

From: David Miller
Date: Monday, May 18, 2009 - 3:04 pm

From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>

I see now what the situation is, thanks for explaining.

You're account for the "*seq" advance for URG that happens in


A long long time ago we had an assertion here checking whether
->copied_seq moved without our knowledge.  Alexey and I found this
could trigger and investigation of that is what helped us
find the tcp_check_urg()+MSG_PEEK case.  That's when we added this
test and log message.

When the MSG_PEEK test existing in the !copied if() branch of
tcp_recvmsg(), so many of these code paths we are dealing with in this
fix could simply not be reached when MSG_PEEK.  That ++*seq could
never happen, because if "copied" was non-zero and MSG_PEEK was true
we would leave the loop and return from tcp_recvmsg() immediately.

Now, we have to accomodate those adjustments.

Since I now understand your urg_hole code I'm going to apply your v2
patch which takes care of the URG stuff.  I also buy the argument that
perhaps we should get rid of the log message, but look at how it
helped us find this kernel regression :-)
--

From: Ilpo Järvinen
Date: Monday, May 18, 2009 - 9:33 pm

On Mon, 18 May 2009, David Miller wrote:

> From: "Ilpo J
From: David Miller
Date: Monday, May 18, 2009 - 9:40 pm

From: "Ilpo Järvinen" <ilpo.jarvinen@helsinki.fi>

recvmsg can deal with it fine, because we reset the peek_seq when we
print out that message.

There is no way that an application writer has any clue about this
interaction, where peeked bytes disappear and then reappear in
the out-of-band URG byte.  That's why the message is there.
--

From: Matthias Andree
Date: Tuesday, May 19, 2009 - 2:05 am

Hi,

...and this discovery is a reason to leave it in, and perhaps make sure
that the check code is properly linked (through comments for lack of better
means) to the actual data transfer functions.

Cheers
MA
--

Previous thread: [PATCH]: Fix build regression from address list changes. by David Miller on Wednesday, May 6, 2009 - 3:44 pm. (1 message)

Next thread: RE: [PATCH 4/4] bnx2i: Add bnx2i iSCSI driver. by Karen Xie on Wednesday, May 6, 2009 - 4:02 pm. (1 message)