login
Header Space

 
 

Re: 2.6.24.2: 4KSTACKS + pcdrw + dm + mount -> stack overflow: ide-cd related? dm-related?

Previous thread: Tabs, spaces, indent and 80 character lines by Richard Knutsson on Saturday, February 23, 2008 - 8:48 pm. (8 messages)

Next thread: [patch 2/2] x86,fpu: lazy allocation of FPU area by Suresh Siddha on Saturday, February 23, 2008 - 10:34 pm. (4 messages)
To: <linux-kernel@...>
Date: Saturday, February 23, 2008 - 9:56 pm

A loop mount/umounting a pcdrw or iso9660 (through the pktcdvd device)
sees a stack overflow in four or five tries. Doing the same thing with
the same CD in a normal non-pktcdvd-mounted drive doesn't cause a crash.

Here's a couple of oopses. config follows.

(There are a wide variety. Some I couldn't collect because they appeared
to recurse, blurring past much too fast to read. Some simply didn't go
out of the netconsole logger at all for no obvious reason.)

(This may or may not be PREEMPT+PREEMPT_BKL-specific: I'll try turning
them off tomorrow and repeating.)

(The presence of dm in the first oops below must surely be attributed to
preempt: certainly my CD isn't managed by dm :) )

pktcdvd: Fixed packets, 32 blocks, Mode-2 disc
pktcdvd: Max. media speed: 4
pktcdvd: write speed 2x
pktcdvd: 551232kB available on disc
UDF-fs INFO UDF 0.9.8.1 (2004/29/09) Mounting volume 'LinuxUDF', timestamp 2004/02/09 07:10 (1000)
do_IRQ: stack overflow: 480
Pid: 4645, comm: mount Not tainted 2.6.24.2-dirty #4
 [&lt;c0104171&gt;] do_IRQ+0x66/0xc5
 [&lt;c0102f8b&gt;] common_interrupt+0x23/0x28
 [&lt;c027b5da&gt;] ide_outsl+0x5/0x9
 [&lt;c027c540&gt;] ata_output_data+0x4d/0x64
 [&lt;c027b8a6&gt;] atapi_output_bytes+0x19/0x3f
 [&lt;c0285377&gt;] cdrom_transfer_packet_command+0xb5/0xde
 [&lt;c0282607&gt;] cdrom_timer_expiry+0x0/0x51
 [&lt;c02840c3&gt;] cdrom_start_packet_command+0x14f/0x157
 [&lt;c02853e9&gt;] cdrom_do_pc_continuation+0x0/0x2c
 [&lt;c027aa33&gt;] ide_do_request+0x70a/0x943
 [&lt;c0363ef3&gt;] schedule_timeout+0x13/0x8b
 [&lt;c021faeb&gt;] elv_drain_elevator+0x15/0x58
 [&lt;c0220277&gt;] elv_insert+0xf6/0x1d9
 [&lt;c0285377&gt;] cdrom_transfer_packet_command+0xb5/0xde
 [&lt;c0282607&gt;] cdrom_timer_expiry+0x0/0x51
 [&lt;c027b038&gt;] ide_do_drive_cmd+0x99/0xe9
 [&lt;c0282abe&gt;] cdrom_queue_packet_command+0x35/0xa9
 [&lt;c0363b2b&gt;] schedule+0x321/0x33e
 [&lt;c0363ef3&gt;] schedule_timeout+0x13/0x8b
 [&lt;c0282d11&gt;] cdrom_read_tocentry+0x96/0xa1
 [&lt;c02220d3&gt;] b...
To: <linux-kernel@...>
Cc: <petero2@...>, <dm-devel@...>
Date: Sunday, February 24, 2008 - 11:59 am

It is not preempt-specific, nor dm-specific. Nor it is very easy to
capture tracebacks of: even netconsole generally gives up when faced
with a string of recursive tracebacks blurring past forever at blinding
speed.

But while I'd normally blame pktcdvd there's only one pktcdvd function
in these tracebacks (pkt_open) and it's not got a significant stack
footprint.

More notable is a great stack of mutual recursion between
dm_bio_destructor() and the CDROM code: it seems to burn most of the
stack on this sort of thrashing. Here's one of those tracebacks again:

do_IRQ: stack overflow: 480
id: 4645, comm: mount Not tainted 2.6.24.2-dirty #4
 [&lt;c0104171&gt;] do_IRQ+0x66/0xc5
 [&lt;c0102f8b&gt;] common_interrupt+0x23/0x28
 [&lt;c027b5da&gt;] ide_outsl+0x5/0x9
 [&lt;c027c540&gt;] ata_output_data+0x4d/0x64
 [&lt;c027b8a6&gt;] atapi_output_bytes+0x19/0x3f
 [&lt;c0285377&gt;] cdrom_transfer_packet_command+0xb5/0xde
 [&lt;c0282607&gt;] cdrom_timer_expiry+0x0/0x51
 [&lt;c02840c3&gt;] cdrom_start_packet_command+0x14f/0x157
 [&lt;c02853e9&gt;] cdrom_do_pc_continuation+0x0/0x2c
 [&lt;c027aa33&gt;] ide_do_request+0x70a/0x943
 [&lt;c0363ef3&gt;] schedule_timeout+0x13/0x8b
 [&lt;c021faeb&gt;] elv_drain_elevator+0x15/0x58
 [&lt;c0220277&gt;] elv_insert+0xf6/0x1d9
 [&lt;c0285377&gt;] cdrom_transfer_packet_command+0xb5/0xde
 [&lt;c0282607&gt;] cdrom_timer_expiry+0x0/0x51
 [&lt;c027b038&gt;] ide_do_drive_cmd+0x99/0xe9
 [&lt;c0282abe&gt;] cdrom_queue_packet_command+0x35/0xa9
 [&lt;c0363b2b&gt;] schedule+0x321/0x33e
 [&lt;c0363ef3&gt;] schedule_timeout+0x13/0x8b
 [&lt;c0282d11&gt;] cdrom_read_tocentry+0x96/0xa1
 [&lt;c02220d3&gt;] blk_end_sync_rq+0x0/0x23
 [&lt;c028315b&gt;] cdrom_read_toc+0x14b/0x42e
 [&lt;c02220d3&gt;] blk_end_sync_rq+0x0/0x23
 [&lt;c027b07e&gt;] ide_do_drive_cmd+0xdf/0xe9
 [&lt;c0283ed2&gt;] ide_cdrom_audio_ioctl+0x13c/0x1de
 [&lt;c02af8fe&gt;] dm_bio_destructor+0x0/0x8
 [&lt;c017162c&gt;] end_bio_bh_io_sync+0x0/0x27
 [&lt;c0282f42&gt;] cdrom_check_status+0x55/0x60
 [...
To: Nix <nix@...>
Cc: <linux-kernel@...>, <dm-devel@...>
Date: Sunday, February 24, 2008 - 12:56 pm

Did you verify that with "make checkstack" or just by looking at the
source code? On my system, pkt_open() consumes 584 bytes because the
compiler decides to inline lots of functions that would not normally
be part of long call chains. The following patch fixes that problem on

Maybe dm_bio_destructor() is just old cruft left on the stack from
previous function calls?


From: Peter Osterlund &lt;petero2@telia.com&gt;

Signed-off-by: Peter Osterlund &lt;petero2@telia.com&gt;
---

 drivers/block/pktcdvd.c |   16 ++++++++--------
 1 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 674cd66..f2510e7 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -849,7 +849,7 @@ static int pkt_flush_cache(struct pktcdvd_device *pd)
 /*
  * speed is given as the normal factor, e.g. 4 for 4x
  */
-static int pkt_set_speed(struct pktcdvd_device *pd, unsigned write_speed, unsigned read_speed)
+static noinline int pkt_set_speed(struct pktcdvd_device *pd, unsigned write_speed, unsigned read_speed)
 {
 	struct packet_command cgc;
 	struct request_sense sense;
@@ -1776,7 +1776,7 @@ static int pkt_get_track_info(struct pktcdvd_device *pd, __u16 track, __u8 type,
 	return pkt_generic_packet(pd, &amp;cgc);
 }
 
-static int pkt_get_last_written(struct pktcdvd_device *pd, long *last_written)
+static noinline int pkt_get_last_written(struct pktcdvd_device *pd, long *last_written)
 {
 	disc_information di;
 	track_information ti;
@@ -1813,7 +1813,7 @@ static int pkt_get_last_written(struct pktcdvd_device *pd, long *last_written)
 /*
  * write mode select package based on pd-&gt;settings
  */
-static int pkt_set_write_settings(struct pktcdvd_device *pd)
+static noinline int pkt_set_write_settings(struct pktcdvd_device *pd)
 {
 	struct packet_command cgc;
 	struct request_sense sense;
@@ -1972,7 +1972,7 @@ static int pkt_writable_disc(struct pktcdvd_device *pd, disc_information *di)
 	return 1;
 }
 
-static i...
To: Peter Osterlund <petero2@...>
Cc: Nix <nix@...>, <linux-kernel@...>, <dm-devel@...>
Date: Monday, February 25, 2008 - 4:25 pm

yup, I'll grab that.  I'll even write your changelog for you (grr).

But first, let's do this:


From: Andrew Morton &lt;akpm@linux-foundation.org&gt;

People are adding `noinline' in various places to prevent excess stack
consumption due to gcc inlining.  But once this is done, it is quite unobvious
why the `noinline' is present in the code.  We can comment each and every
site, or we can use noinline_for_stack.


Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
---

 include/linux/compiler.h |    6 ++++++
 1 file changed, 6 insertions(+)

diff -puN include/linux/compiler.h~add-noinline_for_stack include/linux/compiler.h
--- a/include/linux/compiler.h~add-noinline_for_stack
+++ a/include/linux/compiler.h
@@ -138,6 +138,12 @@ extern void __chk_io_ptr(const volatile 
 #define noinline
 #endif
 
+/*
+ * Rather then using noinline to prevent stack consumption, use
+ * noinline_for_stack instead.  For documentaiton reasons.
+ */
+#define noinline_for_stack noinline
+
 #ifndef __always_inline
 #define __always_inline inline
 #endif
_

(Note that these changes don't let DM off the hook!)

--
To: Peter Osterlund <petero2@...>
Cc: <linux-kernel@...>, <dm-devel@...>
Date: Sunday, February 24, 2008 - 1:02 pm

I just looked at the source; I forgot `make checkstack' existed.

On this system:

0xc0263e0f pkt_open [vmlinux]:                          556

which is nearly as bad.

(As an aside, I'm surprised I didn't oops when packet-writing as well:

0xc021270d udf_process_sequence [vmlinux]:              692
0xc020f43d udf_add_entry [vmlinux]:                     628

owch. I guess that's called via a shorter call chain...)


I'll try the patch after this series of backups is done :)

-- 
`The rest is a tale of post and counter-post.' --- Ian Rawlings
                                                   describes USENET
--
To: Nix <nix@...>
Cc: Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>
Date: Thursday, March 6, 2008 - 12:14 pm

If you are interested, linux-next or -mm tree should contain a patch now
that fixes the problem...

								Honza
-- 
Jan Kara &lt;jack@suse.cz&gt;
SuSE CR Labs
--
To: Jan Kara <jack@...>
Cc: Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>
Date: Monday, March 10, 2008 - 4:09 pm

As soon as I've figured out why dhclient is failing to establish DHCP
connections from 2.6.14.3 (the DHCPOFFERs seem to get ignored, it's
weird and the kernel shouldn't have anything to do with it, but it does)
so I can actually compile something that works and doesn't knock me off
the net every two minutes, I'll try it :)

(yes, I could try -mm directly, but it's been some weeks since I backed
up, thanks to an abruptly chocolate-filled CD-RW drive, so I'm being a
bit cautious.)

-- 
`The rest is a tale of post and counter-post.' --- Ian Rawlings
                                                   describes USENET
--
To: Nix <nix@...>
Cc: Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>, Jan Kara <jack@...>
Date: Monday, February 25, 2008 - 4:30 pm

udf_process_sequence() seems to be another victim of gcc inlining.

udf_add_entry() defines a couple of 256-byte local arrays.
--
To: Andrew Morton <akpm@...>
Cc: Nix <nix@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>, Jan Kara <jack@...>
Date: Monday, February 25, 2008 - 6:48 pm

Yes, exactly two of them. One is  non-trivial to get rid of - it's
used for encoding of filename before we write it, but one is used during
scanning of the directory whether the entry doesn't already exists (oh,
my!) and we can just rip that off..

								Honza
-- 
Jan Kara &lt;jack@suse.cz&gt;
SuSE CR Labs
--
To: Jan Kara <jack@...>
Cc: Andrew Morton <akpm@...>, Nix <nix@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>, Jan Kara <jack@...>
Date: Tuesday, February 26, 2008 - 7:10 am

Why can't we do just



UDF: Optimize stack usage

Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;

diff --git a/fs/udf/namei.c b/fs/udf/namei.c
index 112a5fb..706a2b5 100644
--- a/fs/udf/namei.c
+++ b/fs/udf/namei.c
@@ -336,7 +336,7 @@ static struct fileIdentDesc *udf_add_entry(struct inode *dir,
 {
 	struct super_block *sb = dir-&gt;i_sb;
 	struct fileIdentDesc *fi = NULL;
-	char name[UDF_NAME_LEN], fname[UDF_NAME_LEN];
+	char *name, *fname;
 	int namelen;
 	loff_t f_pos;
 	int flen;
@@ -352,6 +352,14 @@ static struct fileIdentDesc *udf_add_entry(struct inode *dir,
 	struct extent_position epos = {};
 	struct udf_inode_info *dinfo;
 
+	name = kmalloc(sizeof(char) * UDF_NAME_LEN, GFP_KERNEL);
+	fname = kmalloc(sizeof(char) * UDF_NAME_LEN, GFP_KERNEL);
+
+	if (!name || !fname) {
+		*err = -ENOMEM;
+		return NULL;
+	}
+
 	if (dentry) {
 		if (!dentry-&gt;d_name.len) {
 			*err = -EINVAL;
diff --git a/fs/udf/super.c b/fs/udf/super.c
index f3ac4ab..42e3ba8 100644
--- a/fs/udf/super.c
+++ b/fs/udf/super.c
@@ -1345,7 +1345,7 @@ static void udf_load_logicalvolint(struct super_block *sb, kernel_extent_ad loc)
  *	July 1, 1997 - Andrew E. Mileski
  *	Written, tested, and released.
  */
-static int udf_process_sequence(struct super_block *sb, long block,
+static int noinline udf_process_sequence(struct super_block *sb, long block,
 				long lastblock, kernel_lb_addr *fileset)
 {
 	struct buffer_head *bh = NULL;
--
To: Jiri Kosina <jkosina@...>
Cc: Jan Kara <jack@...>, Andrew Morton <akpm@...>, Nix <nix@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>, Jan Kara <jack@...>
Date: Tuesday, February 26, 2008 - 1:25 pm

Wouldn't it be better to check each individually, so you do wind up leaking a 

DRH

-- 
Dialup is like pissing through a pipette. Slow and excruciatingly painful.
--
To: Jiri Kosina <jkosina@...>
Cc: Jan Kara <jack@...>, Andrew Morton <akpm@...>, Nix <nix@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>, Jan Kara <jack@...>
Date: Tuesday, February 26, 2008 - 7:29 am

this bit is missing i think:

	if (name)
		kfree(name);
	if (fname)
		kfree(fname);

	Ingo
--
To: Ingo Molnar <mingo@...>
Cc: Jan Kara <jack@...>, Andrew Morton <akpm@...>, Nix <nix@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>, Jan Kara <jack@...>
Date: Tuesday, February 26, 2008 - 7:37 am

Ergh, of course, stupid me, sorry, it should be freed on all exit paths. I 
am not sending updated patch, as Jan is probably working on complete 
removal of one of those fields ... ?

Thanks,

-- 
Jiri Kosina
--
To: Jiri Kosina <jkosina@...>
Cc: Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, Nix <nix@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>, Jan Kara <jack@...>
Date: Tuesday, February 26, 2008 - 12:41 pm

Yes, I'll convert one variable to kmalloc and the other one remove
completely. Stay tuned ;).

								Honza
-- 
Jan Kara &lt;jack@suse.cz&gt;
SUSE Labs, CR
--
To: Jan Kara <jack@...>
Cc: Nix <nix@...>, Peter Osterlund <petero2@...>, <linux-kernel@...>, <dm-devel@...>, Jan Kara <jack@...>
Date: Monday, February 25, 2008 - 6:51 pm

kmalloc is quite fast ;)
--
Previous thread: Tabs, spaces, indent and 80 character lines by Richard Knutsson on Saturday, February 23, 2008 - 8:48 pm. (8 messages)

Next thread: [patch 2/2] x86,fpu: lazy allocation of FPU area by Suresh Siddha on Saturday, February 23, 2008 - 10:34 pm. (4 messages)
speck-geostationary