[patch 20/21] Keys: Fix key serial number collision handling

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:36 pm

This is the start of the stable review cycle for the 2.6.19.5 release.

This will probably be the last release of the 2.6.19-stable series, so
if there are patches that you feel should be applied to that tree,
please let me know.

There are 21 patches in this series; all will be posted as responses to
this one.  If anyone has any issues with these being applied, please let
us know.  If anyone is a maintainer of the proper subsystem, and wants
to add a Signed-off-by: line to the patch, please respond with it.

These patches are sent out with a number of different people on the Cc:
line.  If you wish to be a reviewer, please email stable@kernel.org to
add your name to the list.  If you want to be off the reviewer list,
also email us.

Responses should be made by Friday February 23 00:00 UTC.  Anything
received after that time might be too late.

The whole patch set can be downloaded at:
	kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.19.5-rc1.gz

thanks,

the -stable release team
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:36 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
Host endianness does not affect the order in which pixel rgb data comes
in from the quickcam (the values are bytes, not words or longs).  The
driver is erroneously swapping the order of rgb values on big endian
machines.  This patch is needed to get the Quickcam communicator working
on big endian machines (tested on powerpc).

(cherry picked from commit c6d704c8c4453f05717ba88792f70f8babf95268)

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/media/video/usbvideo/quickcam_messenger.h |   14 --------------
 1 file changed, 14 deletions(-)

--- linux-2.6.19.4.orig/drivers/media/video/usbvideo/quickcam_messenger.h
+++ linux-2.6.19.4/drivers/media/video/usbvideo/quickcam_messenger.h
@@ -35,27 +35,13 @@ struct rgb {
 };
 
 struct bayL0 {
-#ifdef __BIG_ENDIAN
-	u8 r;
-	u8 g;
-#elif __LITTLE_ENDIAN
 	u8 g;
 	u8 r;
-#else
-#error not byte order defined
-#endif
 };
 
 struct bayL1 {
-#ifdef __BIG_ENDIAN
-	u8 g;
-	u8 b;
-#elif __LITTLE_ENDIAN
 	u8 b;
 	u8 g;
-#else
-#error not byte order defined
-#endif
 };
 
 struct cam_size {

--
-
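
A minimal sketch of why the #ifdef was unnecessary, assuming the camera
delivers each pixel pair as two raw bytes in a fixed wire order.  The names
struct pair, decode_pair and host_is_little_endian are hypothetical and do
not come from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's struct bayL0: two single-byte
 * members declared in the order the camera is assumed to send them. */
struct pair {
	uint8_t g;
	uint8_t r;
};

/* Copy two wire bytes into the struct.  Host endianness only affects how
 * the bytes of one multi-byte integer are ordered; it never reorders
 * separate one-byte members. */
static struct pair decode_pair(const uint8_t wire[2])
{
	struct pair p;

	assert(sizeof(p) == 2);	/* two u8 members, no padding expected */
	memcpy(&p, wire, sizeof(p));
	return p;
}

/* By contrast, the first byte of a 16-bit integer does depend on the
 * host: this returns 1 on little-endian hosts and 0 on big-endian ones. */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

Whatever host_is_little_endian() reports, decode_pair() assigns the first
wire byte to g and the second to r.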

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:37 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
We are doing ->buf_prepare(buf) before adding buf to q->stream list. This
means that videobuf_qbuf() should not try to re-add a STATE_PREPARED buffer.

(cherry picked from commit 419dd8378dfa32985672ab7927b4bc827f33b332)

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/media/video/video-buf.c |    1 +
 1 file changed, 1 insertion(+)

--- linux-2.6.19.4.orig/drivers/media/video/video-buf.c
+++ linux-2.6.19.4/drivers/media/video/video-buf.c
@@ -700,6 +700,7 @@ videobuf_qbuf(struct videobuf_queue *q,
 		goto done;
 	}
 	if (buf->state == STATE_QUEUED ||
+	    buf->state == STATE_PREPARED ||
 	    buf->state == STATE_ACTIVE) {
 		dprintk(1,"qbuf: buffer is already queued or active.\n");
 		goto done;

--
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:36 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
Or status flags together in DECODER_GET_STATUS instead of and-zapping them.

(cherry picked from commit 55d5440d4587454628a850ce26703639885af678)

Signed-off-by: Martin Samuelsson <sam@home.se>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/media/video/ks0127.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

--- linux-2.6.19.4.orig/drivers/media/video/ks0127.c
+++ linux-2.6.19.4/drivers/media/video/ks0127.c
@@ -712,13 +712,13 @@ static int ks0127_command(struct i2c_cli
 		*iarg = 0;
 		status = ks0127_read(ks, KS_STAT);
 		if (!(status & 0x20))		 /* NOVID not set */
-			*iarg = (*iarg & DECODER_STATUS_GOOD);
+			*iarg = (*iarg | DECODER_STATUS_GOOD);
 		if ((status & 0x01))		      /* CLOCK set */
-			*iarg = (*iarg & DECODER_STATUS_COLOR);
+			*iarg = (*iarg | DECODER_STATUS_COLOR);
 		if ((status & 0x08))		   /* PALDET set */
-			*iarg = (*iarg & DECODER_STATUS_PAL);
+			*iarg = (*iarg | DECODER_STATUS_PAL);
 		else
-			*iarg = (*iarg & DECODER_STATUS_NTSC);
+			*iarg = (*iarg | DECODER_STATUS_NTSC);
 		break;
 
 	//Catch any unknown command

--
-
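
A small sketch of the difference between OR-ing and AND-ing flags into a
result that starts at zero.  The ST_* values are placeholders chosen for
illustration, not the kernel's DECODER_STATUS_* constants, and the status
register is modelled as 8 bits wide.

```c
#include <assert.h>

/* Placeholder flag values for illustration only. */
#define ST_GOOD  0x01
#define ST_NTSC  0x02
#define ST_PAL   0x04
#define ST_COLOR 0x08

/* Post-patch behaviour: each detected condition sets its own bit. */
static int decode_status(unsigned int reg)
{
	int out = 0;

	assert(reg <= 0xff);	/* modelled as an 8-bit register */
	if (!(reg & 0x20))	/* NOVID not set */
		out |= ST_GOOD;
	if (reg & 0x01)		/* CLOCK set */
		out |= ST_COLOR;
	if (reg & 0x08)		/* PALDET set */
		out |= ST_PAL;
	else
		out |= ST_NTSC;
	return out;
}

/* Pre-patch behaviour: AND-ing any mask into zero always yields zero,
 * so no flag can ever be reported. */
static int decode_status_broken(unsigned int reg)
{
	int out = 0;

	if (!(reg & 0x20))
		out &= ST_GOOD;
	if (reg & 0x01)
		out &= ST_COLOR;
	if (reg & 0x08)
		out &= ST_PAL;
	else
		out &= ST_NTSC;
	return out;
}
```

With a register value of 0x09 the fixed version reports good, color and
PAL, while the broken version reports nothing.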

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:36 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
Autodetect LG TAPC G701D as tuner type 37, fixing
mis-detected tuners in some Hauppauge tv tuner cards.

Thanks to Adonis Papas, for pointing this out.

(cherry picked from commit 1323fbda1343f50f198bc8bd6d1d59c8b7fc45bf)

Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/media/video/tveeprom.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.4.orig/drivers/media/video/tveeprom.c
+++ linux-2.6.19.4/drivers/media/video/tveeprom.c
@@ -184,7 +184,7 @@ hauppauge_tuner[] =
 	{ TUNER_ABSENT,        "Thompson DTT757"},
 	/* 80-89 */
 	{ TUNER_ABSENT,        "Philips FQ1216LME MK3"},
-	{ TUNER_ABSENT,        "LG TAPC G701D"},
+	{ TUNER_LG_PAL_NEW_TAPC, "LG TAPC G701D"},
 	{ TUNER_LG_NTSC_NEW_TAPC, "LG TAPC H791F"},
 	{ TUNER_LG_PAL_NEW_TAPC, "TCL 2002MB 3"},
 	{ TUNER_LG_PAL_NEW_TAPC, "TCL 2002MI 3"},

--
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:38 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

[PATCH] usb-audio: work around wrong frequency in CM6501 descriptors

The C-Media CM6501 chip's descriptors say that altsetting 5 supports
48 kHz, but it actually plays at 96 kHz.

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 sound/usb/usbaudio.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- linux-2.6.19.4.orig/sound/usb/usbaudio.c
+++ linux-2.6.19.4/sound/usb/usbaudio.c
@@ -2471,7 +2471,13 @@ static int parse_audio_format_rates(stru
 		fp->nr_rates = nr_rates;
 		fp->rate_min = fp->rate_max = combine_triple(&fmt[8]);
 		for (r = 0, idx = offset + 1; r < nr_rates; r++, idx += 3) {
-			unsigned int rate = fp->rate_table[r] = combine_triple(&fmt[idx]);
+			unsigned int rate = combine_triple(&fmt[idx]);
+			/* C-Media CM6501 mislabels its 96 kHz altsetting */
+			if (rate == 48000 && nr_rates == 1 &&
+			    chip->usb_id == USB_ID(0x0d8c, 0x0201) &&
+			    fp->altsetting == 5 && fp->maxpacksize == 392)
+				rate = 96000;
+			fp->rate_table[r] = rate;
 			if (rate < fp->rate_min)
 				fp->rate_min = rate;
 			else if (rate > fp->rate_max)

--
-
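
A hedged sketch of the quirk logic in this patch.  rate_from_triple is a
hypothetical helper that reads a 3-byte little-endian sample rate, which is
what combine_triple() is assumed to do here; fixup_rate takes simplified
parameters instead of the driver's structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: decode a 3-byte little-endian sample rate. */
static unsigned int rate_from_triple(const unsigned char *b)
{
	assert(b != NULL);
	return (unsigned int)b[0] |
	       ((unsigned int)b[1] << 8) |
	       ((unsigned int)b[2] << 16);
}

/* Trust the descriptor unless every condition that identifies the
 * mislabelled altsetting holds; only then substitute 96 kHz. */
static unsigned int fixup_rate(unsigned int rate, int nr_rates,
			       int is_cm6501, int altsetting,
			       int maxpacksize)
{
	if (rate == 48000 && nr_rates == 1 && is_cm6501 &&
	    altsetting == 5 && maxpacksize == 392)
		return 96000;
	return rate;
}
```

Keeping the match this narrow means other devices, and other altsettings
of the same device, keep the rate their descriptors advertise.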

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:37 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Peter Korsgaard <jacmet@sunsite.dk>

smc911x_phy_configure's error handling unconditionally unlocks the
spinlock even if it wasn't locked.  This patch fixes that.

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Cc: Jeff Garzik <jeff@garzik.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/net/smc911x.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

--- linux-2.6.19.4.orig/drivers/net/smc911x.c
+++ linux-2.6.19.4/drivers/net/smc911x.c
@@ -965,11 +965,11 @@ static void smc911x_phy_configure(void *
 	 * We should not be called if phy_type is zero.
 	 */
 	if (lp->phy_type == 0)
-		 goto smc911x_phy_configure_exit;
+		 goto smc911x_phy_configure_exit_nolock;
 
 	if (smc911x_phy_reset(dev, phyaddr)) {
 		printk("%s: PHY reset timed out\n", dev->name);
-		goto smc911x_phy_configure_exit;
+		goto smc911x_phy_configure_exit_nolock;
 	}
 	spin_lock_irqsave(&lp->lock, flags);
 
@@ -1038,6 +1038,7 @@ static void smc911x_phy_configure(void *
 
 smc911x_phy_configure_exit:
 	spin_unlock_irqrestore(&lp->lock, flags);
+smc911x_phy_configure_exit_nolock:
 	lp->work_pending = 0;
 }
 

--
-
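
The two-label exit pattern used by this patch can be sketched in user space
as follows.  The toy lock, struct toy_dev and configure() are hypothetical;
the lock only counts acquisitions so that an unbalanced release trips an
assertion.

```c
#include <assert.h>

/* Hypothetical toy lock: depth counts outstanding acquisitions. */
struct toy_lock {
	int depth;
};

static void toy_acquire(struct toy_lock *l)
{
	l->depth++;
}

static void toy_release(struct toy_lock *l)
{
	assert(l->depth > 0);	/* releasing an unheld lock is a bug */
	l->depth--;
}

struct toy_dev {
	struct toy_lock lock;
	int phy_type;
	int work_pending;
};

/* Returns 1 if the device was configured, 0 on any early exit. */
static int configure(struct toy_dev *d)
{
	int configured = 0;

	if (d->phy_type == 0)
		goto exit_nolock;	/* lock not taken yet: skip unlock */

	toy_acquire(&d->lock);
	if (d->phy_type < 0)
		goto exit_unlock;	/* failure while holding the lock */
	configured = 1;

exit_unlock:
	toy_release(&d->lock);
exit_nolock:
	d->work_pending = 0;
	return configured;
}
```

Every path clears work_pending, but only paths that took the lock pass
through the release.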

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:37 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

From: "Ken Chen" <kenchen@google.com>

An AIO bug was reported in which a sleeping function is called in softirq
context:

BUG: warning at kernel/mutex.c:132/__mutex_lock_common()
Call Trace:
     [<a000000100577b00>] __mutex_lock_slowpath+0x640/0x6c0
     [<a000000100577ba0>] mutex_lock+0x20/0x40
     [<a0000001000a25b0>] flush_workqueue+0xb0/0x1a0
     [<a00000010018c0c0>] __put_ioctx+0xc0/0x240
     [<a00000010018d470>] aio_complete+0x2f0/0x420
     [<a00000010019cc80>] finished_one_bio+0x200/0x2a0
     [<a00000010019d1c0>] dio_bio_complete+0x1c0/0x200
     [<a00000010019d260>] dio_bio_end_aio+0x60/0x80
     [<a00000010014acd0>] bio_endio+0x110/0x1c0
     [<a0000001002770e0>] __end_that_request_first+0x180/0xba0
     [<a000000100277b90>] end_that_request_chunk+0x30/0x60
     [<a0000002073c0c70>] scsi_end_request+0x50/0x300 [scsi_mod]
     [<a0000002073c1240>] scsi_io_completion+0x200/0x8a0 [scsi_mod]
     [<a0000002074729b0>] sd_rw_intr+0x330/0x860 [sd_mod]
     [<a0000002073b3ac0>] scsi_finish_command+0x100/0x1c0 [scsi_mod]
     [<a0000002073c2910>] scsi_softirq_done+0x230/0x300 [scsi_mod]
     [<a000000100277d20>] blk_done_softirq+0x160/0x1c0
     [<a000000100083e00>] __do_softirq+0x200/0x240
     [<a000000100083eb0>] do_softirq+0x70/0xc0

See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2

flush_workqueue() must not be called in softirq context.  However,
aio_complete(), called from an I/O interrupt, can potentially call
put_ioctx with the last ref count on the ioctx and trigger the bug.  It is
simply incorrect to perform ioctx freeing from aio_complete.

The bug can be triggered by a race between io_destroy() and aio_complete().
A possible scenario:

cpu0                               cpu1
io_destroy                         aio_complete
  wait_for_all_aios {                __aio_put_req
     ...                                 ...
From: Greg KH
Date: Tuesday, February 20, 2007 - 6:38 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
From: David Howells <dhowells@redhat.com>

[PATCH] Keys: Fix key serial number collision handling

Fix the key serial number collision avoidance code in key_alloc_serial().

This didn't use to be so much of a problem as the key serial numbers were
allocated from a simple incremental counter, and it would have to go through
two billion keys before it could possibly encounter a collision.  However, now
that random numbers are used instead, collisions are much more likely.

This is fixed by finding a hole in the rbtree where the next unused serial
number ought to be and using that by going almost back to the top of the
insertion routine and redoing the insertion with the new serial number rather
than trying to be clever and attempting to work out the insertion point
pointer directly.

This fixes kernel BZ #7727.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Chuck Ebbert <cebbert@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 security/keys/key.c |   33 ++++++++++++++-------------------
 1 file changed, 14 insertions(+), 19 deletions(-)

--- linux-2.6.19.4.orig/security/keys/key.c
+++ linux-2.6.19.4/security/keys/key.c
@@ -188,6 +188,7 @@ static inline void key_alloc_serial(stru
 
 	spin_lock(&key_serial_lock);
 
+attempt_insertion:
 	parent = NULL;
 	p = &key_serial_tree.rb_node;
 
@@ -202,39 +203,33 @@ static inline void key_alloc_serial(stru
 		else
 			goto serial_exists;
 	}
-	goto insert_here;
+
+	/* we've found a suitable hole - arrange for this key to occupy it */
+	rb_link_node(&key->serial_node, parent, p);
+	rb_insert_color(&key->serial_node, &key_serial_tree);
+
+	spin_unlock(&key_serial_lock);
+	return;
 
 	/* we found a key with the proposed serial number - walk the tree from
 	 * that point looking for the next unused serial number */
 serial_exists:
 	for ...
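
The diff above is cut off in this archive, so the following is only a
sketch of the idea described in the message: on a collision, walk forward
to the next unused serial number and then redo the insertion with it.  It
uses a sorted array instead of the kernel's rbtree, ignores wrap-around,
and next_free_serial is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Given the serial numbers already in use (sorted ascending, no
 * duplicates), return the first unused serial that is >= wanted.
 * Wrap-around at the top of the range is deliberately not handled. */
static int next_free_serial(const int *used, size_t n, int wanted)
{
	size_t i = 0;

	assert(used != NULL || n == 0);

	/* skip entries below the proposed serial */
	while (i < n && used[i] < wanted)
		i++;

	/* walk successors for as long as they collide with the candidate */
	while (i < n && used[i] == wanted) {
		wanted++;
		i++;
	}
	return wanted;
}
```

The caller would then restart its normal insertion with the returned
value rather than trying to compute the insertion point directly.
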
From: Greg KH
Date: Tuesday, February 20, 2007 - 6:39 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
If you lose this race, it can iput a socket inode twice and you
get a BUG in fs/inode.c

When I added the option for user-space to close a socket,
I added some cruft to svc_delete_socket so that I could call
that function when closing a socket per user-space request.

This was the wrong thing to do.  I should have just set SK_CLOSE
and let normal mechanisms do the work.

Not only wrong, but buggy.  The locking is all wrong and it opened
up a race whereby a socket could be closed twice.

So this patch:
  Introduces svc_close_socket, which sets SK_CLOSE and then either
  leaves the close up to a thread or calls svc_delete_socket if it
  can get SK_BUSY.

  Adds a bias to sk_busy which is removed when SK_DEAD is set.
  This avoids races around shutting down the socket.

  Changes several 'spin_lock' to 'spin_lock_bh' where the _bh
  was missing.

Bugzilla-url: http://bugzilla.kernel.org/show_bug.cgi?id=7916

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>


---
 include/linux/sunrpc/svcsock.h |    2 -
 net/sunrpc/svc.c               |    4 +--
 net/sunrpc/svcsock.c           |   52 +++++++++++++++++++++++++++++------------
 3 files changed, 41 insertions(+), 17 deletions(-)

--- linux-2.6.19.4.orig/include/linux/sunrpc/svcsock.h
+++ linux-2.6.19.4/include/linux/sunrpc/svcsock.h
@@ -63,7 +63,7 @@ struct svc_sock {
  * Function prototypes.
  */
 int		svc_makesock(struct svc_serv *, int, unsigned short);
-void		svc_delete_socket(struct svc_sock *);
+void		svc_close_socket(struct svc_sock *);
 int		svc_recv(struct svc_rqst *, long);
 int		svc_send(struct svc_rqst *);
 void		svc_drop(struct svc_rqst *);
--- linux-2.6.19.4.orig/net/sunrpc/svc.c
+++ linux-2.6.19.4/net/sunrpc/svc.c
@@ -387,7 +387,7 @@ svc_destroy(struct svc_serv *serv)
 		svsk = list_entry(serv->sv_tempsocks.next,
 				  struct svc_sock,
 				  ...
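
A simplified user-space model of the "set the close flag, then tear down
only if you can become the owner" scheme described in the message.  It uses
C11 atomics; F_BUSY, F_CLOSE, struct toy_sock and toy_close are hypothetical
stand-ins and do not reproduce the sunrpc code.

```c
#include <assert.h>
#include <stdatomic.h>

#define F_BUSY  0x1u	/* someone currently owns the socket */
#define F_CLOSE 0x2u	/* a close has been requested */

struct toy_sock {
	atomic_uint flags;
	int deleted;
};

/* Request a close.  Returns 1 if this caller performed the teardown,
 * 0 if it was left to whoever already owns the socket. */
static int toy_close(struct toy_sock *s)
{
	unsigned int old;

	atomic_fetch_or(&s->flags, F_CLOSE);
	old = atomic_fetch_or(&s->flags, F_BUSY);
	if (old & F_BUSY)
		return 0;	/* the owner will notice F_CLOSE later */

	assert(!s->deleted);	/* ownership rules out a second teardown */
	s->deleted = 1;		/* F_BUSY stays set: nobody can re-own it */
	return 1;
}
```

Because ownership is claimed with a single atomic read-modify-write, two
concurrent callers cannot both reach the teardown.
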
From: Greg KH
Date: Tuesday, February 20, 2007 - 6:38 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
[PATCH] usbaudio - Fix Oops with broken usb descriptors

This is a patch for ALSA Bug #2724. Some webcams provide bogus
settings with no valid rates. With this patch those are skipped.

Signed-off-by: Gregor Jasny <gjasny@web.de>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Jaroslav Kysela <perex@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 sound/usb/usbaudio.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- linux-2.6.19.4.orig/sound/usb/usbaudio.c
+++ linux-2.6.19.4/sound/usb/usbaudio.c
@@ -2456,6 +2456,7 @@ static int parse_audio_format_rates(stru
 		 * build the rate table and bitmap flags
 		 */
 		int r, idx, c;
+		unsigned int nonzero_rates = 0;
 		/* this table corresponds to the SNDRV_PCM_RATE_XXX bit */
 		static unsigned int conv_rates[] = {
 			5512, 8000, 11025, 16000, 22050, 32000, 44100, 48000,
@@ -2478,6 +2479,7 @@ static int parse_audio_format_rates(stru
 			    fp->altsetting == 5 && fp->maxpacksize == 392)
 				rate = 96000;
 			fp->rate_table[r] = rate;
+			nonzero_rates |= rate;
 			if (rate < fp->rate_min)
 				fp->rate_min = rate;
 			else if (rate > fp->rate_max)
@@ -2493,6 +2495,10 @@ static int parse_audio_format_rates(stru
 			if (!found)
 				fp->needs_knot = 1;
 		}
+		if (!nonzero_rates) {
+			hwc_debug("All rates were zero. Skipping format!\n");
+			return -1;
+		}
 		if (fp->needs_knot)
 			fp->rates |= SNDRV_PCM_RATE_KNOT;
 	} else {

--
-
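
The zero-rate check in this patch relies on a simple property: OR-ing all
rates together gives zero only if every rate is zero.  A standalone sketch,
with all_rates_zero as a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every entry is zero (or the table is empty), else 0. */
static int all_rates_zero(const unsigned int *rates, size_t n)
{
	unsigned int acc = 0;
	size_t i;

	assert(rates != NULL || n == 0);
	for (i = 0; i < n; i++)
		acc |= rates[i];	/* any nonzero rate leaves a bit set */
	return acc == 0;
}
```

A caller would skip a format for which this returns 1, as the patch does
by returning -1 from the parser.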

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:38 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------


[PATCH] usbaudio - Fix Oops with unconventional sample rates

This patch fixes the memory corruption caused by the support for
unconventional sample rates.  It also avoids overly restrictive constraints
if any of the usb descriptors contain continuous rates.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 sound/usb/usbaudio.c |   43 +++++++++++++++++++++++++------------------
 1 file changed, 25 insertions(+), 18 deletions(-)

--- linux-2.6.19.4.orig/sound/usb/usbaudio.c
+++ linux-2.6.19.4/sound/usb/usbaudio.c
@@ -186,6 +186,7 @@ struct snd_usb_substream {
 	u64 formats;			/* format bitmasks (all or'ed) */
 	unsigned int num_formats;		/* number of supported audio formats (list) */
 	struct list_head fmt_list;	/* format list */
+	struct snd_pcm_hw_constraint_list rate_list;	/* limited rates */
 	spinlock_t lock;
 
 	struct snd_urb_ops ops;		/* callbacks (must be filled at init) */
@@ -1810,28 +1811,33 @@ static int check_hw_params_convention(st
 static int snd_usb_pcm_check_knot(struct snd_pcm_runtime *runtime,
 				  struct snd_usb_substream *subs)
 {
-	struct list_head *p;
-	struct snd_pcm_hw_constraint_list constraints_rates;
+	struct audioformat *fp;
+	int count = 0, needs_knot = 0;
 	int err;
 
-	list_for_each(p, &subs->fmt_list) {
-		struct audioformat *fp;
-		fp = list_entry(p, struct audioformat, list);
-
-		if (!fp->needs_knot)
-			continue;
-
-		constraints_rates.count = fp->nr_rates;
-		constraints_rates.list = fp->rate_table;
-		constraints_rates.mask = 0;
-
-		err = snd_pcm_hw_constraint_list(runtime, 0,
-			SNDRV_PCM_HW_PARAM_RATE,
-			&constraints_rates);
+	list_for_each_entry(fp, &subs->fmt_list, list) {
+		if (fp->rates & SNDRV_PCM_RATE_CONTINUOUS)
+			return 0;
+		count += fp->nr_rates;
+		if (fp->needs_knot)
+			needs_knot = 1;
+	}
+	if (!needs_knot)
+		return 0;
 
-		if (err < 0)
-			return ...
From: Greg KH
Date: Tuesday, February 20, 2007 - 6:38 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
From: Ingo Molnar <mingo@elte.hu>

[PATCH] net, 8139too.c: fix netpoll deadlock

fix deadlock in the 8139too driver: poll handlers should never forcibly
enable local interrupts, because they might be used by netpoll/printk
from IRQ context.

  =================================
  [ INFO: inconsistent lock state ]
  2.6.19 #11
  ---------------------------------
  inconsistent {softirq-on-W} -> {in-softirq-W} usage.
  swapper/1 [HC0[0]:SC1[1]:HE1:SE0] takes:
   (&npinfo->poll_lock){-+..}, at: [<c0350a41>] net_rx_action+0x64/0x1de
  {softirq-on-W} state was registered at:
    [<c0134c86>] mark_lock+0x5b/0x39c
    [<c0135012>] mark_held_locks+0x4b/0x68
    [<c01351e9>] trace_hardirqs_on+0x115/0x139
    [<c02879e6>] rtl8139_poll+0x3d7/0x3f4
    [<c035c85d>] netpoll_poll+0x82/0x32f
    [<c035c775>] netpoll_send_skb+0xc9/0x12f
    [<c035cdcc>] netpoll_send_udp+0x253/0x25b
    [<c0288463>] write_msg+0x40/0x65
    [<c011cead>] __call_console_drivers+0x45/0x51
    [<c011cf16>] _call_console_drivers+0x5d/0x61
    [<c011d4fb>] release_console_sem+0x11f/0x1d8
    [<c011d7d7>] register_console+0x1ac/0x1b3
    [<c02883f8>] init_netconsole+0x55/0x67
    [<c010040c>] init+0x9a/0x24e
    [<c01049cf>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff
  irq event stamp: 819992
  hardirqs last  enabled at (819992): [<c0350a16>] net_rx_action+0x39/0x1de
  hardirqs last disabled at (819991): [<c0350b1e>] net_rx_action+0x141/0x1de
  softirqs last  enabled at (817552): [<c01214e4>] __do_softirq+0xa3/0xa8
  softirqs last disabled at (819987): [<c0106051>] do_softirq+0x5b/0xc9

  other info that might help us debug this:
  no locks held by swapper/1.

  stack backtrace:
   [<c0104d88>] dump_trace+0x63/0x1e8
   [<c0104f26>] show_trace_log_lvl+0x19/0x2e
   [<c010532d>] show_trace+0x12/0x14
   [<c0105343>] dump_stack+0x14/0x16
   [<c0134980>] print_usage_bug+0x23c/0x246
   [<c0134d33>] ...
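
The diff for this patch is cut off in this archive, so the following only
models the rule stated in the message: code that may run with interrupts
already disabled must restore the previous state instead of enabling
interrupts unconditionally.  The irqs_enabled variable and the toy_*
helpers are hypothetical stand-ins for the CPU state and kernel primitives.

```c
#include <assert.h>

/* Hypothetical model of the CPU's interrupt-enable state. */
static int irqs_enabled = 1;

static void toy_irq_disable(void)
{
	irqs_enabled = 0;
}

static void toy_irq_enable(void)
{
	irqs_enabled = 1;
}

/* Disable interrupts and report what the state was beforehand. */
static int toy_irq_save(void)
{
	int was = irqs_enabled;

	irqs_enabled = 0;
	return was;
}

static void toy_irq_restore(int was)
{
	assert(was == 0 || was == 1);
	irqs_enabled = was;
}

/* Unconditional variant: leaves interrupts on even if the caller
 * entered with them off. */
static void poll_unconditional(void)
{
	toy_irq_disable();
	/* ... critical work ... */
	toy_irq_enable();
}

/* Save/restore variant: leaves the caller's state untouched. */
static void poll_save_restore(void)
{
	int was = toy_irq_save();

	/* ... critical work ... */
	toy_irq_restore(was);
}
```

Calling poll_unconditional() with interrupts off turns them back on behind
the caller's back, which is the kind of state change lockdep reports above.
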
From: Greg KH
Date: Tuesday, February 20, 2007 - 6:37 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

eighty_ninty_three() had a word 93 validity check but not the 80c bit
test itself (bit 12).  This increases the chance of incorrect wire
detection, especially because host side cable detection is often
unreliable and we sometimes depend solely on drive side cable
detection.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Alan <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/ide/ide-iops.c |    2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.19.4.orig/drivers/ide/ide-iops.c
+++ linux-2.6.19.4/drivers/ide/ide-iops.c
@@ -607,6 +607,8 @@ u8 eighty_ninty_three (ide_drive_t *driv
 	if(!(drive->id->hw_config & 0x4000))
 		return 0;
 #endif /* CONFIG_IDEDMA_IVB */
+	if (!(drive->id->hw_config & 0x2000))
+		return 0;
 	return 1;
 }
 

--
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:38 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

Correct assignment of DOT1XENABLE in WE-19 codepaths.
RX_UNENCRYPTED_EAPOL = 1 really means setting DOT1XENABLE _off_, and
vice versa.  The original WE-19 patch erroneously reversed that.  This
patch fixes association with unencrypted and WEP networks when using
wpa_supplicant.

It also adds two missing break statements that, left out, could result
in incorrect card configuration.

Applies to (I think) 2.6.19 and later.

Signed-off-by: Dan Williams <dcbw@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/net/wireless/prism54/isl_ioctl.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- linux-2.6.19.4.orig/drivers/net/wireless/prism54/isl_ioctl.c
+++ linux-2.6.19.4/drivers/net/wireless/prism54/isl_ioctl.c
@@ -1395,11 +1395,16 @@ static int prism54_set_auth(struct net_d
 		break;
 
 	case IW_AUTH_RX_UNENCRYPTED_EAPOL:
-		dot1x = param->value ? 1 : 0;
+		/* dot1x should be the opposite of RX_UNENCRYPTED_EAPOL;
+		 * turn off dot1x when  allowing recepit of unencrypted eapol
+		 * frames, turn on dot1x when we disallow receipt
+		 */
+		dot1x = param->value ? 0x00 : 0x01;
 		break;
 
 	case IW_AUTH_PRIVACY_INVOKED:
 		privinvoked = param->value ? 1 : 0;
+		break;
 
 	case IW_AUTH_DROP_UNENCRYPTED:
 		exunencrypt = param->value ? 1 : 0;
@@ -1589,6 +1594,7 @@ static int prism54_set_encodeext(struct 
 			}
 			key.type = DOT11_PRIV_TKIP;
 			key.length = KEY_SIZE_TKIP;
+			break;
 		default:
 			return -EINVAL;
 		}

--
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:38 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

Use different constraint for gcc < 4.1 in bitops.h

+m is really correct for a RMW instruction, but some older gccs
error out. I finally gave in and ifdefed it.

This fixes compilation errors with some compiler versions.

Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 include/asm-x86_64/bitops.h |   34 ++++++++++++++++++++--------------
 1 file changed, 20 insertions(+), 14 deletions(-)

--- linux-2.6.19.4.orig/include/asm-x86_64/bitops.h
+++ linux-2.6.19.4/include/asm-x86_64/bitops.h
@@ -7,7 +7,13 @@
 
 #include <asm/alternative.h>
 
-#define ADDR (*(volatile long *) addr)
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 1)
+/* Technically wrong, but this avoids compilation errors on some gcc
+   versions. */
+#define ADDR "=m" (*(volatile long *) addr)
+#else
+#define ADDR "+m" (*(volatile long *) addr)
+#endif
 
 /**
  * set_bit - Atomically set a bit in memory
@@ -23,7 +29,7 @@ static __inline__ void set_bit(int nr, v
 {
 	__asm__ __volatile__( LOCK_PREFIX
 		"btsl %1,%0"
-		:"+m" (ADDR)
+		:ADDR
 		:"dIr" (nr) : "memory");
 }
 
@@ -40,7 +46,7 @@ static __inline__ void __set_bit(int nr,
 {
 	__asm__ volatile(
 		"btsl %1,%0"
-		:"+m" (ADDR)
+		:ADDR
 		:"dIr" (nr) : "memory");
 }
 
@@ -58,7 +64,7 @@ static __inline__ void clear_bit(int nr,
 {
 	__asm__ __volatile__( LOCK_PREFIX
 		"btrl %1,%0"
-		:"+m" (ADDR)
+		:ADDR
 		:"dIr" (nr));
 }
 
@@ -66,7 +72,7 @@ static __inline__ void __clear_bit(int n
 {
 	__asm__ __volatile__(
 		"btrl %1,%0"
-		:"+m" (ADDR)
+		:ADDR
 		:"dIr" (nr));
 }
 
@@ -86,7 +92,7 @@ static __inline__ void __change_bit(int 
 {
 	__asm__ __volatile__(
 		"btcl %1,%0"
-		:"+m" (ADDR)
+		:ADDR
 		:"dIr" (nr));
 }
 
@@ -103,7 +109,7 @@ static __inline__ void change_bit(int nr
 {
 	__asm__ __volatile__( LOCK_PREFIX
 		"btcl %1,%0"
-		:"+m" (ADDR)
+		:ADDR
 ...
From: Greg KH
Date: Tuesday, February 20, 2007 - 6:37 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

The 80c test mask is at bits 18 and 19 of EIDE Controller Configuration,
not 22 and 23.  Fix it.

Signed-off-by: Tejun Heo <htejun@gmail.com>
Acked-by: Alan Cox <alan@lxorguk.ukuu.org.uk>

---
 drivers/ata/pata_amd.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- linux-2.6.19.4.orig/drivers/ata/pata_amd.c
+++ linux-2.6.19.4/drivers/ata/pata_amd.c
@@ -128,7 +128,7 @@ static void timing_setup(struct ata_port
 
 static int amd_pre_reset(struct ata_port *ap)
 {
-	static const u32 bitmask[2] = {0x03, 0xC0};
+	static const u32 bitmask[2] = {0x03, 0x0C};
 	static const struct pci_bits amd_enable_bits[] = {
 		{ 0x40, 1, 0x02, 0x02 },
 		{ 0x40, 1, 0x01, 0x01 }
@@ -247,7 +247,7 @@ static void amd133_set_dmamode(struct at
  */
 
 static int nv_pre_reset(struct ata_port *ap) {
-	static const u8 bitmask[2] = {0x03, 0xC0};
+	static const u8 bitmask[2] = {0x03, 0x0C};
 	static const struct pci_bits nv_enable_bits[] = {
 		{ 0x50, 1, 0x02, 0x02 },
 		{ 0x50, 1, 0x01, 0x01 }

--
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:37 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

PTRACE_OLDSETOPTIONS should also be accepted, as is done by kernel/ptrace.c
and as required for binary compatibility.  UML/32bit breaks because of this,
since it is wise enough to use PTRACE_OLDSETOPTIONS to stay binary compatible
with 2.4 host kernels.

Until 2.6.17 (commit f0f2d6536e3515b5b1b7ae97dc8f176860c8c2ce) we had:

        default:
                return sys_ptrace(request, pid, addr, data);

Instead here we have:
        case PTRACE_GET_THREAD_AREA:
        case ...:
                return sys_ptrace(request, pid, addr, data);

        default:
                return -EINVAL;

This change was a style change - when a case is added, it must be explicitly
tested this way. In this case, not enough testing was done.

Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 arch/x86_64/ia32/ptrace32.c |    1 +
 1 file changed, 1 insertion(+)

--- linux-2.6.19.4.orig/arch/x86_64/ia32/ptrace32.c
+++ linux-2.6.19.4/arch/x86_64/ia32/ptrace32.c
@@ -243,6 +243,7 @@ asmlinkage long sys32_ptrace(long reques
 	case PTRACE_SINGLESTEP:
 	case PTRACE_DETACH:
 	case PTRACE_SYSCALL:
+	case PTRACE_OLDSETOPTIONS:
 	case PTRACE_SETOPTIONS:
 	case PTRACE_SET_THREAD_AREA:
 	case PTRACE_GET_THREAD_AREA:

--
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:36 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------
Suspending with the cx88xx module loaded causes the system to lock up
because the cx88_audio_thread kthread was missing a try_to_freeze()
call, which caused it to go into a tight loop and result in softlockup
when suspending. Fix that.

(cherry picked from commit a96afb3e9428f2e7463344f12dbc85faf08e2e09)

Signed-off-by: Robert Hancock <hancockr@shaw.ca>
Signed-off-by: Mauro Carvalho Chehab <mchehab@infradead.org>
Signed-off-by: Michael Krufky <mkrufky@linuxtv.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/media/video/cx88/cx88-tvaudio.c |    2 ++
 1 file changed, 2 insertions(+)

--- linux-2.6.19.4.orig/drivers/media/video/cx88/cx88-tvaudio.c
+++ linux-2.6.19.4/drivers/media/video/cx88/cx88-tvaudio.c
@@ -38,6 +38,7 @@
 #include <linux/module.h>
 #include <linux/moduleparam.h>
 #include <linux/errno.h>
+#include <linux/freezer.h>
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/mm.h>
@@ -974,6 +975,7 @@ int cx88_audio_thread(void *data)
 		msleep_interruptible(1000);
 		if (kthread_should_stop())
 			break;
+		try_to_freeze();
 
 		/* just monitor the audio status for now ... */
 		memset(&t, 0, sizeof(t));

--
-

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 6:00 pm

drivers/media/video/cx88/cx88-tvaudio.c:41:27: error: linux/freezer.h: No such file or directory
make[4]: *** [drivers/media/video/cx88/cx88-tvaudio.o] Error 1
make[3]: *** [drivers/media/video/cx88] Error 2
make[3]: *** Waiting for unfinished jobs....

-

From: Michael Krufky
Date: Wednesday, February 21, 2007 - 6:14 pm

Yikes...  This one shouldn't have been sent to 2.6.18.y nor 2.6.19.y ... tree-mixup :-/

Please drop this one.  Thanks, Chuck.

Sorry about that...

-Mike Krufky
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:37 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

There is a kernel oops on bcm43xx when resuming due to an overly tight timeout loop.

Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/net/wireless/bcm43xx/bcm43xx.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-2.6.19.4.orig/drivers/net/wireless/bcm43xx/bcm43xx.h
+++ linux-2.6.19.4/drivers/net/wireless/bcm43xx/bcm43xx.h
@@ -21,7 +21,7 @@
 #define PFX				KBUILD_MODNAME ": "
 
 #define BCM43xx_SWITCH_CORE_MAX_RETRIES	50
-#define BCM43xx_IRQWAIT_MAX_RETRIES	50
+#define BCM43xx_IRQWAIT_MAX_RETRIES	100
 
 #define BCM43xx_IO_SIZE			8192
 

--
-

From: Greg KH
Date: Tuesday, February 20, 2007 - 6:38 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Michael Buesch <mb@bu3sch.de>

If bcm43xx were to process an afterburner (ampdu) status response, Linux would oops.
This patch skips such responses and gives the ampdu and intermediate status bits
proper names.

Signed-off-by: Michael Buesch <mb@bu3sch.de>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/net/wireless/bcm43xx/bcm43xx_main.c |    8 +++-----
 drivers/net/wireless/bcm43xx/bcm43xx_xmit.h |   10 ++--------
 2 files changed, 5 insertions(+), 13 deletions(-)

--- linux-2.6.19.4.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ linux-2.6.19.4/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -1449,12 +1449,10 @@ static void handle_irq_transmit_status(s
 
 		bcm43xx_debugfs_log_txstat(bcm, &stat);
 
-		if (stat.flags & BCM43xx_TXSTAT_FLAG_IGNORE)
+		if (stat.flags & BCM43xx_TXSTAT_FLAG_AMPDU)
+			continue;
+		if (stat.flags & BCM43xx_TXSTAT_FLAG_INTER)
 			continue;
-		if (!(stat.flags & BCM43xx_TXSTAT_FLAG_ACK)) {
-			//TODO: packet was not acked (was lost)
-		}
-		//TODO: There are more (unknown) flags to test. see bcm43xx_main.h
 
 		if (bcm43xx_using_pio(bcm))
 			bcm43xx_pio_handle_xmitstatus(bcm, &stat);
--- linux-2.6.19.4.orig/drivers/net/wireless/bcm43xx/bcm43xx_xmit.h
+++ linux-2.6.19.4/drivers/net/wireless/bcm43xx/bcm43xx_xmit.h
@@ -137,14 +137,8 @@ struct bcm43xx_xmitstatus {
 	u16 unknown; //FIXME
 };
 
-#define BCM43xx_TXSTAT_FLAG_ACK		0x01
-//TODO #define BCM43xx_TXSTAT_FLAG_???	0x02
-//TODO #define BCM43xx_TXSTAT_FLAG_???	0x04
-//TODO #define BCM43xx_TXSTAT_FLAG_???	0x08
-//TODO #define BCM43xx_TXSTAT_FLAG_???	0x10
-#define BCM43xx_TXSTAT_FLAG_IGNORE	0x20
-//TODO #define BCM43xx_TXSTAT_FLAG_???	0x40
-//TODO #define BCM43xx_TXSTAT_FLAG_???	0x80
+#define BCM43xx_TXSTAT_FLAG_AMPDU	0x10
+#define BCM43xx_TXSTAT_FLAG_INTER	0x20
 
 u8 bcm43xx_plcp_get_ratecode_cck(const u8 bitrate);
 u8 ...
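
The filtering logic introduced by the patch above can be sketched in isolation as follows. The two flag values are taken from the patch; `txstatus_is_final` and `count_final` are hypothetical helpers written only for this illustration and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as defined by the patch. */
#define BCM43xx_TXSTAT_FLAG_AMPDU	0x10
#define BCM43xx_TXSTAT_FLAG_INTER	0x20

/*
 * Hypothetical helper: returns true when a status report should be
 * passed on to the PIO/DMA handlers, false when it must be skipped.
 */
static bool txstatus_is_final(uint8_t flags)
{
	if (flags & BCM43xx_TXSTAT_FLAG_AMPDU)
		return false;	/* afterburner (ampdu) status: skip */
	if (flags & BCM43xx_TXSTAT_FLAG_INTER)
		return false;	/* intermediate status: skip */
	return true;
}

/* Hypothetical helper: counts the reports in a batch that would be processed. */
static size_t count_final(const uint8_t *flags, size_t n)
{
	size_t count = 0;

	assert(flags != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (txstatus_is_final(flags[i]))
			count++;
	return count;
}
```

Only reports with neither bit set reach the handlers; a report with bit 0x01 set (previously named ACK) is no longer treated specially.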
From: Greg KH
Date: Tuesday, February 20, 2007 - 6:37 pm

-stable review patch.  If anyone has any objections, please let us know.

------------------

From: Atsushi Nemoto <anemo@mba.ocn.ne.jp>

The usage of the century bit was inverted in 2.6.19 to follow the PCF8563's
description, but that did not match the usage suggested by the RTC8564's
datasheet.  Anyway, what MO_C=1 means can vary by platform.  This patch
detects its polarity in the get_datetime routine.  The default value of
c_polarity is 0 (MO_C=1 means 19xx), so this patch does not change
current behavior even if get_datetime was not called before set_datetime.

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Jean-Baptiste Maneyrol <jean-baptiste.maneyrol@teamlog.com>
Cc: David Brownell <dbrownell@users.sourceforge.net>
Cc: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>

---
 drivers/rtc/rtc-pcf8563.c |   40 ++++++++++++++++++++++++++++++++++------
 1 file changed, 34 insertions(+), 6 deletions(-)

--- linux-2.6.19.4.orig/drivers/rtc/rtc-pcf8563.c
+++ linux-2.6.19.4/drivers/rtc/rtc-pcf8563.c
@@ -53,6 +53,25 @@ I2C_CLIENT_INSMOD;
 #define PCF8563_SC_LV		0x80 /* low voltage */
 #define PCF8563_MO_C		0x80 /* century */
 
+struct pcf8563 {
+	struct i2c_client client;
+	/*
+	 * The meaning of MO_C bit varies by the chip type.
+	 * From PCF8563 datasheet: this bit is toggled when the years
+	 * register overflows from 99 to 00
+	 *   0 indicates the century is 20xx
+	 *   1 indicates the century is 19xx
+	 * From RTC8564 datasheet: this bit indicates change of
+	 * century. When the year digit data overflows from 99 to 00,
+	 * this bit is set. By presetting it to 0 while still in the
+	 * 20th century, it will be set in year 2000, ...
+	 * There seems no reliable way to know how the system use this
+	 * bit.  So let's do it heuristically, assuming we are live in
+	 * 1970...2069.
+	 */
+	int c_polarity;	/* 0: MO_C=1 means 19xx, otherwise MO_C=1 means 20xx ...
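
The heuristic described in the comment above can be sketched as standalone C. This is an illustration under the stated 1970...2069 assumption, not the driver code itself: `year_since_1900`, `detect_c_polarity` and `century_bit_for_write` are hypothetical helper names, and the two-digit year is assumed to have been converted from BCD already.

```c
#include <assert.h>
#include <stdbool.h>

#define PCF8563_MO_C 0x80 /* century bit in the month register */

/* Map a two-digit year to years since 1900, assuming 1970...2069. */
static int year_since_1900(int yy)
{
	assert(yy >= 0 && yy <= 99);
	return yy < 70 ? yy + 100 : yy;
}

/*
 * Detect the polarity when reading the clock.
 * Returns 0 if MO_C=1 means 19xx, 1 if MO_C=1 means 20xx.
 */
static int detect_c_polarity(unsigned char month_reg, int yy)
{
	bool is_20xx = year_since_1900(yy) >= 100;
	bool c_set = (month_reg & PCF8563_MO_C) != 0;

	return c_set ? is_20xx : !is_20xx;
}

/* Decide whether MO_C must be set when writing a year (tm_year = years since 1900). */
static bool century_bit_for_write(int c_polarity, int tm_year)
{
	bool is_20xx = tm_year >= 100;

	return c_polarity ? is_20xx : !is_20xx;
}
```

With the default polarity of 0, writing the year 2007 leaves MO_C clear, which matches the behavior described in the changelog for the case where get_datetime was never called.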
From: Stefan Richter
Date: Wednesday, February 21, 2007 - 6:36 am

There is one here: "Missing critical phys_to_virt in lib/swiotlb.c".
http://lkml.org/lkml/2007/2/4/116
It fixes a DMA related bug which was seen with a variety of drivers
especially on EM64T machines with more than 3GB RAM. I hope you can
extract the patch from this MIME attachment.

Adrian, AFAICS it applies as-is to 2.6.16.y too. I don't have a machine
to test personally, but it is quite obvious.

The mentioned bigger patch has been merged by Linus between 2.6.20 and
2.6.21-rc1.
-- 
Stefan Richter
-=====-=-=== --=- =-=-=
http://arcgraph.de/sr/
-

From: Stefan Richter
Date: Wednesday, February 21, 2007 - 6:37 am

Probably not unless I attach it for real.
-- 
Stefan Richter
-=====-=-=== --=- =-=-=
http://arcgraph.de/sr/
From: Adrian Bunk
Date: Thursday, March 8, 2007 - 10:35 pm

Thanks.



cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 9:38 am

The attached patch is in 2.6.20 and fixes problems with
no sound from certain Intel HDA adapters.
From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 9:50 am

The attached fixes an oops in the usbnet driver. The same patch is
in 2.6.21-rc1, but that one has many whitespace changes. This is much
smaller.


From: David Brownell <david-b@pacbell.net>
Signed-off-by: David Brownell <david-b@pacbell.net>

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 12:31 pm

What is the status of:

http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/broke...
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.20/2.6.20-mm2/broke...

They fix a serious bug that causes machines to freeze up
or just run very slowly.

I'd like to see these in -stable if possible.
-

From: Andrew Morton
Date: Wednesday, February 21, 2007 - 12:47 pm

On Wed, 21 Feb 2007 14:31:41 -0500


They're not even in mainline yet.
-

From: Linus Torvalds
Date: Wednesday, February 21, 2007 - 1:09 pm

Would be good to have Eric also ack them as safe and obvious.

Btw, that latter one has corrupted sign-offs from Andi (it's in the middle 
of the text, very confusing).

		Linus
-

From: Eric W. Biederman
Date: Wednesday, February 21, 2007 - 3:45 pm

Bug reports:
 I have seen a couple of user reports and heard about a few more.  Given
 that Chuck Ebbert appears to have tested them I'm guessing redhat has
 seen a couple of reports as well.  One of the issues is that in
 some cases you can be susceptible and go for weeks without hitting
 this, so the bug reports aren't likely to come back fast.  So this is
 a long term stability issue.

 Even if we just put in my tiny fix that allows us to generally
 survive this condition in stable, it prints a nasty warning message
 so I expect people will want a more complete fix.

Vulnerability:
 I believe it is possible to trigger this bug on any SMP machine.

Obviousness:
 The first patch is obvious, but of course that isn't the interesting
 bit.

 The second patch is still fairly simple, and it appears to have
 undergone testing from people besides myself.  

 So in the interest of a timely if not perfect fix I think it is a
 good patch.  In particular I do not see any area where it would make
 things worse.

Bugs:
 There is one small issue that is probably worth fixing.
 apic_in_service_vector only works correctly because we never have
 more than one local apic irq in service at the same time, (we keep
 irqs disabled during all of the interrupt routines).  The appended
 incremental patch addresses that.

Outstanding Issues:
 The big outstanding issue I am currently working on is that in my
 testing I have found evidence to suggest that ioapics do not strictly
 follow the pci ordering rules, so exactly when the last interrupt
 sent before we masked the interrupt at the interrupt controller will
 arrive is in question.  So to really be safe we cannot tear down the
 data structures for handling the interrupt in the old location until
 after we have seen the next interrupt showing up in the new location.

 I don't know if it is possible for the issue I have just described
 to cause problems in practice. I intend to fix this for 2.6.21 if the
 patch will be ...
From: Eric W. Biederman
Date: Tuesday, February 27, 2007 - 11:37 pm

Hmm..  I seem to have failed to send out this reply a few days ago :(


There are two questions.
1) What can we do to make the situation better.
2) Is the hole completely plugged.

When I wrote the patch I had the local apic priorities backwards in my
head.  So apic_in_service_vector can return the wrong value if two
irqs are in service.  Now I don't think we allow ourselves to enable
interrupts in an interrupt service routine until after we have acked
the local apic so this should be harmless.  The fix is also trivial:
just have apic_in_service_vector return "~get_irq_regs()->orig_rax".
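
To make the "~orig_rax" expression concrete, here is a minimal sketch. It assumes that the x86-64 interrupt entry stubs store the bitwise complement of the vector number in orig_rax, so that the value is negative and the vector is recovered by complementing it again. `encode_orig_rax` and `decode_vector` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed entry-stub behavior: save the complemented vector number. */
static uint64_t encode_orig_rax(unsigned int vector)
{
	assert(vector < 256);
	return ~(uint64_t)vector;
}

/* Recover the vector the same way the quoted expression does. */
static unsigned int decode_vector(uint64_t orig_rax)
{
	return (unsigned int)~orig_rax;
}
```

Because the vector comes from the saved registers of the interrupt currently being handled, it does not depend on scanning the local apic in-service bits in priority order.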

Except for that one possible problem everything I can think of are
just theoretical cracks at this point, and they don't make the
situation any worse.

Given that this patch appears to have undergone a noticeable
amount of testing, by people other than myself, and clears up the
symptoms.  I have no problem 


Eric
-

From: Zwane Mwaikambo
Date: Wednesday, February 28, 2007 - 1:51 am

Hi Eric,
	Thanks for getting this cruft cleaned up. I have a few comments 
regarding;

handle-irqs-pending-in-irr-during-irq-migration.patch

1) It relies on checking the IRR, this could race with the corresponding 
vector bit being set by hardware.

2) apic_handle_pending_vector is oddly named since it doesn't actually 
handle a pending vector but drops it if it has been freed.

3) It looks complex

So how about the following scheme. Add a check in do_IRQ whether the irq's 
domain contains the current cpu, if not we free the vector upon handler 
completion.

Cheers,
	Zwane
-

From: Eric W. Biederman
Date: Wednesday, February 28, 2007 - 5:28 am

The mostly correct assumption is that because that vector is masked and

  Because that check will leak vector entries.  And after a while the
  box will be unable to migrate irqs, and possibly something more
  severe.

Yes.  It is moderately complex.  After receiving a little feedback
like this I have something much simpler and more robust merged into the
current git for 2.6.21.  Which, except for my stupid "it doesn't compile
on uniprocessor" bug, should be good.

However it took me 13 patches to come up with something clean and
simple.

Basically I wait until an irq has arrived at the new location before I
free it, and even then I send a lowest priority IPI to land on the cpu in
question before I free it so that if that other cpu has it stuck in the
pending bit that gets processed before the freeing happens.
Even with that I'm still only 99% certain that the last in flight irq before
we reprogrammed it actually made it to a different cpu's local apic.   But
there appears to be nothing more that I can do.  I have exhausted every
property I can find.  Added to that is the fact that simply handling the
irq in IRR empirically is sufficient.  So I truly believe in practice
the code in my first patch is sufficient, and what I am doing for 2.6.21
is better simply because it is simpler and much more paranoid and thus
affords us with a bit of margin.  If one irq is delivered to a local
apic you would expect the previous incarnation of that irq to be
delivered to a local apic first...

Honestly I would be completely happy if all that gets back ported is
my stupid patch, that adds:

		if (!disable_apic)
			ack_APIC_irq();

Before
		if (printk_ratelimit())
			printk(KERN_EMERG "%s: %d.%d No irq handler for vector\n",

In do_IRQ.  That is sufficient in most cases to keep the box from
falling over.  Is obviously correct.  And only emits a scary message.
If that isn't sufficient to give everyone warm fuzzies, I think the
patch under discussion makes sense for a backport.  At least it ...
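
The small fix described above (acknowledge the interrupt even when no irq is mapped to the vector) can be modelled in user space as follows. Everything here is a stand-in under stated assumptions: `fake_do_irq`, `fake_ack_apic_irq`, `vector_irq`, `init_vector_table` and the counters are hypothetical, and the real code additionally checks disable_apic and rate-limits the message.

```c
#include <assert.h>
#include <stdio.h>

#define NR_VECTORS 256
#define NO_IRQ (-1)

static int vector_irq[NR_VECTORS];	/* hypothetical vector -> irq table */
static int acked;			/* times the stand-in ack was called */
static int handled;			/* times a mapped irq was dispatched */

static void init_vector_table(void)
{
	for (int i = 0; i < NR_VECTORS; i++)
		vector_irq[i] = NO_IRQ;
}

/* Stand-in for ack_APIC_irq(): only counts calls. */
static void fake_ack_apic_irq(void)
{
	acked++;
}

/* Model of the do_IRQ decision: dispatch if mapped, otherwise ack and warn. */
static void fake_do_irq(unsigned int vector)
{
	assert(vector < NR_VECTORS);

	if (vector_irq[vector] != NO_IRQ) {
		handled++;	/* the normal flow handler would ack here */
		return;
	}

	/* No mapping: ack anyway so later interrupts are not blocked. */
	fake_ack_apic_irq();
	fprintf(stderr, "No irq handler for vector %u\n", vector);
}
```

The interrupt message is still dropped in the unmapped case; the ack only keeps the local apic from being left with an interrupt that is never completed.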
From: Greg KH
Date: Wednesday, February 28, 2007 - 12:52 pm

Documentation/stable_kernel_rules.txt

thanks,

greg k-h
-

From: Eric W. Biederman
Date: Wednesday, February 28, 2007 - 4:25 pm

Ok, if that is really what we are going with, then this silly patch isn't
simple enough for a backport.  There used to be other rules to the effect
that the patch must be merged in mainline, and we only backport to one kernel
revision.

I think it fails the 100 lines with context test.

The meaning of obviously correct is a little bit nebulous.  But if
something is obvious multiple people can easily understand what
is going on.  I haven't gotten any feedback that has said yes I
see what you are doing on the mentioned patch.

I'm really not certain how this patch got seriously proposed then.
I guess it was the seriousness of the issue of people's boxes falling
over.

I guess somewhere I got the rules for weird vendor trees confused with
our stable branches.  The relaxed stable branch rules probably did it
to me.

So the best we can do is the commit below for a backport.  It doesn't
fix the issue but it generally keeps the machines from falling over.

p.s. The copy below is whitespace damaged because I just cut and
pasted it into this email.

commit 2fb12a9bca5ad9aa6dcd2c639b4a7656a8843ef8
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Tue Feb 13 13:26:25 2007 +0100

    [PATCH] x86-64: survive having no irq mapping for a vector
    
    Occasionally the kernel has bugs that result in no irq being found for a
    given cpu vector.  If we acknowledge the irq the system has a good chance
    of continuing even though we dropped an irq message.  If we continue to
    simply print a message and not acknowledge the irq the system is likely to
    become non-responsive shortly there after.
    
    AK: Fixed compilation for UP kernels
    
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Andi Kleen <ak@suse.de>
    Cc: "Luigi Genoni" <luigi.genoni@pirelli.com>
    Cc: Andi Kleen <ak@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

diff --git a/arch/x86_64/kernel/irq.c b/arch/x86_64/kernel/irq.c
index 0c06af6..3bc30d2 ...
From: Eric W. Biederman
Date: Wednesday, February 21, 2007 - 1:13 pm

If you don't have it you at least want the patch below.  It generally
makes the bug non-fatal.

I'm still working my way through possible fixes...  Although the
patch in question is close, and normally fixes it, in my utmost
paranoia I can still find problems with it.

Eric


commit 2fb12a9bca5ad9aa6dcd2c639b4a7656a8843ef8
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Tue Feb 13 13:26:25 2007 +0100

    [PATCH] x86-64: survive having no irq mapping for a vector
    
    Occasionally the kernel has bugs that result in no irq being found for a
    given cpu vector.  If we acknowledge the irq the system has a good chance
    of continuing even though we dropped an irq message.  If we continue to
    simply print a message and not acknowledge the irq the system is likely to
    become non-responsive shortly there after.
    
    AK: Fixed compilation for UP kernels
    
    Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
    Signed-off-by: Andi Kleen <ak@suse.de>
    Cc: "Luigi Genoni" <luigi.genoni@pirelli.com>
    Cc: Andi Kleen <ak@suse.de>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

-

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 1:21 pm

We've tested it and found no problems so far.  It's definitely
better than what's there now. :)
-

From: Andi Kleen
Date: Wednesday, February 21, 2007 - 3:19 pm

Putting that patch into stable would be a good idea, agreed.

-Andi
-

From: Andi Kleen
Date: Wednesday, February 21, 2007 - 3:20 pm

I didn't think the problem was serious enough for a backport. Do we have
user reports? 

These are certainly not trivial, obvious patches. 

Eric, what is your opinion? 

-Andi
-

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 3:39 pm

Yes, lots:

-

From: Andi Kleen
Date: Wednesday, February 21, 2007 - 6:19 pm

Ok for me then.

-Andi
-

From: Greg KH
Date: Wednesday, February 21, 2007 - 1:39 pm

-stable for 2.6.19 and/or .18?

I haven't pushed out the next round of patches for the 2.6.20-stable
tree, I have a _lot_ of them there to catch up on still...

thanks,

greg k-h
-

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 1:44 pm

The bug is new in .19 and is still in .20 and .21-rc.

-

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 3:33 pm

This patch should go in 2.6.19 and 2.6.20 -stable as well.
(It's in 2.6.21-rc.)

From: Chuck Ebbert
Date: Wednesday, February 21, 2007 - 3:43 pm

This is the rest of the NAPI fixes for 2.6.19-stable.




From: Chuck Ebbert
Date: Thursday, February 22, 2007 - 9:09 am

This seems appropriate for 2.6.19-stable.  It fixes CVE-2006-5753,
listed as a "high" security risk. Taken from hg kernel repository,
and already in 2.6.20.

