Re: 2.6.27-rc5-mmotm0829 - lvm issues at boot, possible disk_devt() related?

Previous thread: [PATCH] kernel/resource.c: fix new kernel-doc warning by Randy Dunlap on Sunday, August 31, 2008 - 8:31 pm. (1 message)

Next thread: [PATCH] x86: move mtrr cpu cap setting early in early_init_xxxx by Yinghai Lu on Sunday, August 31, 2008 - 9:06 pm. (1 message)
From: Valdis.Kletnieks
Date: Sunday, August 31, 2008 - 8:36 pm

I bisected this down to the 08/28 version of linux-next.patch that was in
mmotm-0829.  27-rc3-mmotm0814 works fine.

System boots, loads the initrd, starts running.  The initrd tries to run lvm
to find my root file system (the one drive has sda1 as a small /boot, and sda2
is the rest of the disk, used as LVM space).  Every single partition complains:

device-mapper: core: bdget failed in dm_suspend
device-mapper: resume ioctl failed: invalid argument
Unable to resume Volgroup00-root (254:0)

and repeats the 3 lines with different Volgroup-foo and 254:N values for
each LVM logical volume.  After this, it rolls over and dies because it
didn't find the root filesystem.  

This is with an initrd built with Fedora Rawhide current as of last night.
Using an older initrd that's been working just fine for several releases
just silently hangs around the lvm startup. (My initrd config does not include
any .ko files, just nash and lvm and similar early-userspace stuff).

Any ideas?  Am willing to try patches/debugging code, if anybody can think
what I should be instrumenting to get more info.

My gut feeling is an issue in Tejun Heo's patches to implement extended dev
numbers - one of these two or a related patch:

commit 805d265623f217cbeac02b6f704342af2c320b5b
Author: Tejun Heo <tj@kernel.org>
Date:   Mon Aug 25 19:47:22 2008 +0900

    block: implement extended dev numbers

commit 346862bf911e5f5194b13ba1a5608f8d7f1e758a
Author: Tejun Heo <tj@kernel.org>
Date:   Mon Aug 25 19:47:19 2008 +0900

    block: don't depend on consecutive minor space

(The second adds disk_devt() to drivers/md/dm-ioctl.c, which is (a) where
we are dying, (b) code that would cause us to die if it was broken, and
(c) most of the code change to dm-ioctl.c - so I'm suspicious...)

This looks suspicious as well:

commit d15722bcd6dfd88e9ce108405f2313266a5ae1d2
Author: Tejun Heo <tj@kernel.org>
Date:   Mon Aug 25 19:56:17 2008 +0900

    block: allow disk to have extended device number
...
    * If ...

From: Tejun Heo
Date: Monday, September 1, 2008 - 12:58 am

Yeah, I made a mistake converting two of them and devt lookup fails when
the disk is zero sized.  Bartlomiej debugged the problem and posted a
patch and I followed up with an updated patch.  It should be fine in the
next round.

  http://article.gmane.org/gmane.linux.kernel.next/2663
  http://article.gmane.org/gmane.linux.kernel.next/2676

If you're seeing other problems, please let me know.

Thanks.

-- 
tejun
--

From: Valdis.Kletnieks
Date: Monday, September 1, 2008 - 2:15 am

Confirming - 2.6.27-rc5-mmotm0829 plus the merge of the 2 above patches
does find the LVM volumes and come up.  Thanks for the clue.. :)

From: Jens Axboe
Date: Monday, September 1, 2008 - 2:50 am

Thanks for confirming, both patches are in the updated block branch that
-mm and -next pull down.

-- 
Jens Axboe

--

From: Alasdair G Kergon
Date: Monday, September 1, 2008 - 2:56 am

I expect we'll need some patches to userspace lvm2 to support these extended
device numbers properly too...

Alasdair (back from holiday)
-- 
agk@redhat.com
--

From: Jens Axboe
Date: Monday, September 1, 2008 - 3:15 am

They'll be defaulting to off from now on, so it should not be a big
worry. But Alan Brunelle did find that the "10-character limit
in dm/lib/libdm-deptree is too small".

-- 
Jens Axboe

--
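
To illustrate why a 10-character buffer can be too small for a "major:minor"
string, here is a minimal sketch in C.  The thread does not show the
libdm-deptree code, so format_devt() is a hypothetical helper and not the
library's function; it only relies on snprintf() returning the length the
full string would have needed.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from libdm-deptree): format "major:minor"
 * into buf.  Returns 0 on success, -1 if the string did not fit.
 */
static int format_devt(char *buf, size_t size,
                       unsigned int major, unsigned int minor)
{
    int n;

    assert(buf != NULL && size > 0);

    /* snprintf() returns the length the complete string requires. */
    n = snprintf(buf, size, "%u:%u", major, minor);

    /* n >= size means the output was truncated to fit the buffer. */
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With a 10-byte buffer, format_devt() succeeds for a small minor such as
254:0 but reports truncation for 254:1048575, which needs 11 characters
plus the terminating NUL.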

From: Alan D. Brunelle
Date: Tuesday, September 2, 2008 - 5:16 am

Tejun pointed out:

"dev_t is 32bits and MINORBITS is 20.  So, major 12 bits, minor 20
bits, so 4 characters for major, 7 characters for minor."

That would mean: 4 + ':' + 7 + '\0' = 13 characters at a minimum, so the
attached patch seems to work...
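
The arithmetic above can be checked with a short C sketch.  It assumes only
the split quoted from Tejun (32-bit dev_t, 12 bits of major, 20 bits of
minor); the names MAJOR_BITS, MINOR_BITS, decimal_digits() and
devt_str_size() are made up for this illustration and do not come from the
kernel or from lvm2.

```c
#include <assert.h>
#include <stddef.h>

/* Bit split quoted in the thread: 12-bit major, 20-bit minor. */
#define MAJOR_BITS 12
#define MINOR_BITS 20

/* Number of decimal digits needed to print v. */
static size_t decimal_digits(unsigned long v)
{
    size_t n = 1;

    while (v >= 10) {
        v /= 10;
        n++;
    }
    return n;
}

/* Worst-case size of a "major:minor" string, including the NUL. */
static size_t devt_str_size(void)
{
    unsigned long max_major = (1UL << MAJOR_BITS) - 1;  /* 4095 */
    unsigned long max_minor = (1UL << MINOR_BITS) - 1;  /* 1048575 */

    /* The thread states that dev_t is 32 bits wide. */
    assert(MAJOR_BITS + MINOR_BITS == 32);

    /* digits of major + ':' + digits of minor + '\0' */
    return decimal_digits(max_major) + 1 + decimal_digits(max_minor) + 1;
}
```

Calling devt_str_size() gives 13, matching the 4 + 1 + 7 + 1 count in the
message: the largest major (4095) prints as 4 digits and the largest minor
(1048575) as 7.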