kerneloops.org: 2.6.26-rc possible regression in ext3

From: Arjan van de Ven
Date: Wednesday, June 18, 2008 - 10:36 pm

In the kerneloops.org stats, a new oops is rapidly climbing the charts.
The oops is a page fault in the ext3 "do_split" function, and the first
report of it was with 2.6.26-rc6-git3.

It happens with various applications; the backtraces are at:

http://www.kerneloops.org/search.php?search=do_split

but are generally of this pattern:

*do_split
ext3_add_entry
ext3_rename
vfs_rename
... <various paths into vfs_rename> ...

or

*do_split
? add_dirent_to_buf
ext3_add_entry
ext3_new_inode
ext3_add_nondir
ext3_create
vfs_create
....


did we change anything in ext

--

From: Dave Airlie
Date: Wednesday, June 18, 2008 - 10:42 pm

This is a bug in rawhide in gcc miscompiling something...

https://bugzilla.redhat.com/show_bug.cgi?id=451068

Dave.
--

From: Arjan van de Ven
Date: Wednesday, June 18, 2008 - 10:48 pm

thanks for letting us know so fast!
I've marked this one in the database as a fedora gcc bug
--

From: Linus Torvalds
Date: Wednesday, June 18, 2008 - 11:42 pm

Gaah. I should read all my email instead of wasting my time trying to 
match up the code with what I can reproduce..

		Linus
--

From: Arjan van de Ven
Date: Thursday, June 19, 2008 - 12:09 am

unfortunately, kerneloops.org didn't pick up the link to this bug (due to the fact
that the oops in the bug was a jpeg....)... maybe one day if I'm really bored
I'll implement OCR into it ;)

sorry about wasting your time

--

From: Adrian Bunk
Date: Thursday, June 19, 2008 - 1:11 am

If I understand it correctly that's a bug in upstream gcc 4.3.1
(but not in gcc 4.3.0)?

Expect a lot more of this to pop up in the future.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

--

From: Mikael Pettersson
Date: Thursday, June 19, 2008 - 1:32 am

Adrian Bunk writes:
 > On Thu, Jun 19, 2008 at 03:42:34PM +1000, Dave Airlie wrote:
 > > On Thu, Jun 19, 2008 at 3:36 PM, Arjan van de Ven <arjan@linux.intel.com> wrote:
 > > > In the kerneloops.org stats, a new oops is rapidly climbing the charts.
 > > > The oops is a page fault in the ext3 "do_split" function, and the first
 > > > report of it was with 2.6.26-rc6-git3.
 > > >
 > > > It happens with various applications; the backtraces are at:
 > > >
 > > > http://www.kerneloops.org/search.php?search=do_split
 > > 
 > > This is a bug in rawhide in gcc miscompiling something...
 > > 
 > > https://bugzilla.redhat.com/show_bug.cgi?id=451068
 > 
 > If I understand it correctly that's a bug in upstream gcc 4.3.1
 > (but not in gcc 4.3.0)?
 > 
 > Expect a lot more of this to pop up in the future.
 > Should we #error for gcc 4.3.1?

There are other nasty bugs in gcc-4.3.0. I actually
had to completely ban 4.3.0 in a user-space project
I'm involved with (Erlang) due to gcc PR36339 (fixed
in 4.3.1).

What's the gcc bugzilla number for this new 4.3.1 bug?
--
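
The "#error for gcc 4.3.1" idea quoted above amounts to a version gate. A minimal sketch of such a gate as a build-time shell check follows. The function name and the one-entry list are hypothetical; only the 4.3.1 version comes from this thread. It assumes the version string is obtained from `gcc -dumpversion`.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 if the given gcc version string is on a list of
# releases treated as known-bad, exit 1 otherwise. The list is illustrative;
# 4.3.1 is the only release named as affected in this thread.
gcc_version_is_blacklisted() {
    case "$1" in
        4.3.1) return 0 ;;
        *)     return 1 ;;
    esac
}

# Possible use in a build script (assumes $CC is gcc; newer gcc releases may
# print only the major version for -dumpversion):
#   if gcc_version_is_blacklisted "$("${CC:-gcc}" -dumpversion)"; then
#       echo "refusing to build with this gcc" >&2
#       exit 1
#   fi
```

A check like this looks only at the version string, so it cannot tell whether a particular compiler build actually miscompiles the code.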

From: Adrian Bunk
Date: Thursday, June 19, 2008 - 3:49 am

#36533

cu
Adrian

--

From: Arjan van de Ven
Date: Thursday, June 19, 2008 - 6:40 am

it's better to find if the gcc guys made a testcase for this bug (they normally do) and
test based on that.
--

From: Adrian Bunk
Date: Thursday, June 19, 2008 - 8:10 am

The gcc Bugzilla contains a testcase.

But how do you plan to integrate it into a kernel build?

cu
Adrian

--

From: Arjan van de Ven
Date: Thursday, June 19, 2008 - 8:18 am

we already have several of these.
Just look at scripts/gcc-x86_64-has-stack-protector.sh for an example of such a beast.
--

From: Adrian Bunk
Date: Thursday, June 19, 2008 - 8:25 am

Checking whether gcc supports some flags is easy.

But miscompilations are a different issue.

Especially since we also want to reject broken gcc versions for cross 
compilations.

cu
Adrian

--

From: Arjan van de Ven
Date: Thursday, June 19, 2008 - 8:27 am

have you actually looked at this script?
You didn't, since the script doesn't check if gcc supports some flag.
It checks very specifically for a code generation pattern...

Please go look at the script first before responding.

--
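
The kind of check Arjan describes, one that inspects generated code rather than probing for flag support, can be sketched as follows. This illustrates the technique and is not the contents of scripts/gcc-x86_64-has-stack-protector.sh; the function name asm_has_pattern and its argument order are hypothetical. It compiles C source from stdin to assembly and searches the result for a pattern.

```shell
#!/bin/sh
# Hypothetical helper: compile C source from stdin to assembly and succeed
# (exit 0) only if the generated assembly contains the given pattern.
# usage: asm_has_pattern COMPILER PATTERN [extra compiler flags...] < source.c
asm_has_pattern() {
    cc=$1
    pattern=$2
    shift 2
    # -S stops after generating assembly, -x c treats stdin as C source,
    # "-" reads the source from stdin and "-o -" writes assembly to stdout.
    # The pipeline's exit status is that of grep.
    "$cc" -S -x c -O0 "$@" - -o - 2>/dev/null | grep -q -- "$pattern"
}

# Possible use (the C snippet and the %gs pattern are only an example of
# looking for a stack-canary access in the output):
#   echo 'int foo(void) { char X[200]; return 3; }' |
#       asm_has_pattern "${CC:-gcc}" '%gs' -fstack-protector && echo y
```

Because the check compiles but never runs the result, the same approach should also work when the first argument is a cross compiler.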

From: Adrian Bunk
Date: Thursday, June 19, 2008 - 8:43 am

I did look, but I missed the last pipe...

Do we know for sure this bug can only trigger on 32bit x86?

Or is there anything else I'm missing in gcc-x86_64-has-stack-protector.sh 
that would allow using this approach to check for wrong code generation 
caused by platform-independent gcc bugs?

cu
Adrian

--

From: Eric Sandeen
Date: Thursday, June 19, 2008 - 7:00 am

Arjan, I was just looking at kerneloops last night, seeing the count for
this oops climb, and was wishing there were some way to annotate an oops
signature with more info.  If I could have tagged this with the RH
bugzilla nr. it might have saved a lot of time for folks.  Is this
feasible?  Or is finding the oops text in bugzilla the only way?

Thanks,

-Eric

--

From: Arjan van de Ven
Date: Thursday, June 19, 2008 - 7:07 am

there's a way to add a description to oopses (you might have seen some of these
descriptions already); however I've not implemented an account system yet so for
now it's only me who can add these.
--

From: Eric Sandeen
Date: Thursday, June 19, 2008 - 7:17 am

Ok, that was my guess.  I'll shoot you an email next time.  :)

Thanks,
-Eric
--
