Re: [PATCH][v4] Modify loop device to be able to manage partitions of the disk image

Previous thread: BUG: drop_pagecache_sb vs kjournald lockup by David Chinner on Tuesday, March 18, 2008 - 4:28 am. (5 messages)

Next thread: Re: [ANNOUNCE] Ramback: faster than a speeding bullet by David Newall on Tuesday, March 18, 2008 - 6:03 am. (7 messages)
From: Laurent Vivier
Date: Tuesday, March 18, 2008 - 5:19 am

v3 is an updated version of v2, replacing a "%d" with a "%lu".

This patch allows the loop device to be used with partitioned disk images.

Original behavior of loop is not modified.

A new parameter, "loop_max_part", defines how many partitions can be
managed per loop device.

For instance, to manage 63 partitions per loop device:
# modprobe loop loop_max_part=63
# ls -l /dev/loop?
brw-rw---- 1 root disk 7,   0 2008-03-05 14:55 /dev/loop0
brw-rw---- 1 root disk 7,  64 2008-03-05 14:55 /dev/loop1
brw-rw---- 1 root disk 7, 128 2008-03-05 14:55 /dev/loop2
brw-rw---- 1 root disk 7, 192 2008-03-05 14:55 /dev/loop3
brw-rw---- 1 root disk 7, 256 2008-03-05 14:55 /dev/loop4
brw-rw---- 1 root disk 7, 320 2008-03-05 14:55 /dev/loop5
brw-rw---- 1 root disk 7, 384 2008-03-05 14:55 /dev/loop6
brw-rw---- 1 root disk 7, 448 2008-03-05 14:55 /dev/loop7

And to attach a raw partitioned disk image, the original losetup is used:

# losetup -f etch.img
EXT3 FS on loop0p1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
loop: module loaded
 loop0: p1 p2 < p5 >
# ls -l /dev/loop?*
brw-rw---- 1 root disk 7,   0 2008-03-05 14:55 /dev/loop0
brw-rw---- 1 root disk 7,   1 2008-03-05 14:57 /dev/loop0p1
brw-rw---- 1 root disk 7,   2 2008-03-05 14:57 /dev/loop0p2
brw-rw---- 1 root disk 7,   5 2008-03-05 14:57 /dev/loop0p5
brw-rw---- 1 root disk 7,  64 2008-03-05 14:55 /dev/loop1
brw-rw---- 1 root disk 7, 128 2008-03-05 14:55 /dev/loop2
brw-rw---- 1 root disk 7, 192 2008-03-05 14:55 /dev/loop3
brw-rw---- 1 root disk 7, 256 2008-03-05 14:55 /dev/loop4
brw-rw---- 1 root disk 7, 320 2008-03-05 14:55 /dev/loop5
brw-rw---- 1 root disk 7, 384 2008-03-05 14:55 /dev/loop6
brw-rw---- 1 root disk 7, 448 2008-03-05 14:55 /dev/loop7
# mount /dev/loop0p1 /mnt
kjournald starting.  Commit interval 5 seconds
EXT3 FS on loop0p1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
# ls /mnt
bench  cdrom  home        lib         ...
From: Laurent Vivier
Date: Wednesday, March 19, 2008 - 5:36 am

This patch allows the loop device to be used with partitioned disk images.

Original behavior of loop is not modified.

A new parameter, "max_part", defines how many partitions can be managed
per loop device.

For instance, to manage 63 partitions per loop device:
# modprobe loop max_part=63
# ls -l /dev/loop?*
brw-rw---- 1 root disk 7,   0 2008-03-05 14:55 /dev/loop0
brw-rw---- 1 root disk 7,  64 2008-03-05 14:55 /dev/loop1
brw-rw---- 1 root disk 7, 128 2008-03-05 14:55 /dev/loop2
brw-rw---- 1 root disk 7, 192 2008-03-05 14:55 /dev/loop3
brw-rw---- 1 root disk 7, 256 2008-03-05 14:55 /dev/loop4
brw-rw---- 1 root disk 7, 320 2008-03-05 14:55 /dev/loop5
brw-rw---- 1 root disk 7, 384 2008-03-05 14:55 /dev/loop6
brw-rw---- 1 root disk 7, 448 2008-03-05 14:55 /dev/loop7
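
The spacing of 64 between minors in the listing above is consistent with each
loop device reserving a power-of-two block of minors. A minimal sketch of that
arithmetic, assuming the shift is the bit length of max_part (what the
kernel's fls() would return); the function names bit_length and
loop_first_minor are hypothetical and not part of the patch:

```shell
# Bit length of a non-negative integer (same value as the kernel's fls()).
bit_length() {
    n=$1
    bits=0
    while [ "$n" -gt 0 ]; do
        n=$((n >> 1))
        bits=$((bits + 1))
    done
    echo "$bits"
}

# First minor of loopN, ASSUMING each device reserves
# 2^bit_length(max_part) minors (the whole disk plus its partitions).
loop_first_minor() {
    index=$1
    max_part=$2
    shift_bits=$(bit_length "$max_part")
    echo $((index << shift_bits))
}

loop_first_minor 1 63   # prints 64, matching /dev/loop1 above
loop_first_minor 7 63   # prints 448, matching /dev/loop7 above
```

Under the same assumption, partition Y of loopX would sit at the first minor
plus Y, which agrees with loop0p5 showing minor 5 in the listing further down.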

And to attach a raw partitioned disk image, the original losetup is used:

# losetup -f etch.img
# ls -l /dev/loop?*
brw-rw---- 1 root disk 7,   0 2008-03-05 14:55 /dev/loop0
brw-rw---- 1 root disk 7,   1 2008-03-05 14:57 /dev/loop0p1
brw-rw---- 1 root disk 7,   2 2008-03-05 14:57 /dev/loop0p2
brw-rw---- 1 root disk 7,   5 2008-03-05 14:57 /dev/loop0p5
brw-rw---- 1 root disk 7,  64 2008-03-05 14:55 /dev/loop1
brw-rw---- 1 root disk 7, 128 2008-03-05 14:55 /dev/loop2
brw-rw---- 1 root disk 7, 192 2008-03-05 14:55 /dev/loop3
brw-rw---- 1 root disk 7, 256 2008-03-05 14:55 /dev/loop4
brw-rw---- 1 root disk 7, 320 2008-03-05 14:55 /dev/loop5
brw-rw---- 1 root disk 7, 384 2008-03-05 14:55 /dev/loop6
brw-rw---- 1 root disk 7, 448 2008-03-05 14:55 /dev/loop7
# mount /dev/loop0p1 /mnt
# ls /mnt
bench  cdrom  home        lib         mnt   root     srv  usr
bin    dev    initrd      lost+found  opt   sbin     sys  var
boot   etc    initrd.img  media       proc  selinux  tmp  vmlinuz
# umount /mnt
# losetup -d /dev/loop0

Of course, the same result can be achieved using kpartx on a loop device,
but modifying loop avoids stacking several layers of block devices (loop ...
From: Randy Dunlap
Date: Wednesday, March 19, 2008 - 1:11 pm

What happened to the update to Documentation/kernel-parameters.txt?


---
~Randy
--

From: Laurent Vivier
Date: Wednesday, March 19, 2008 - 1:24 pm

Well, perhaps I misunderstood Andrew's comment:

"This shouldn't be needed."

I thought it meant I should remove it. So, Andrew?

To summarize the changes between v3 and v4:

- remove the modification to kernel-parameters.txt (as you saw)
- rename the parameter to max_part (according to Andrew's comments)
- add an "ioctl_by_bdev(bdev, BLKRRPART, 0);" in loop_clr_fd()
  (to remove loopXpY from /dev/ on "losetup -d")
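
For reference, the same BLKRRPART request can be issued by hand from userspace
with "blockdev --rereadpt" from util-linux. This is not the in-kernel call the
patch adds, only a way to trigger the same partition table re-read manually.
A small sketch; the wrapper name reread_partitions is hypothetical:

```shell
# Ask the kernel to re-read the partition table (BLKRRPART) of a block device.
# Refuses anything that is not a block device node.
reread_partitions() {
    dev=$1
    if [ ! -b "$dev" ]; then
        echo "not a block device: $dev" >&2
        return 1
    fi
    # blockdev --rereadpt issues the BLKRRPART ioctl; it normally needs root.
    blockdev --rereadpt "$dev"
}

# Example (as root, with an image attached to loop0):
#   reread_partitions /dev/loop0
```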

-- 
------- Laurent.Vivier@bull.net  -------
  "The best way to predict the future 
      is to invent it." - Alan Kay

--

From: Andrew Morton
Date: Wednesday, March 19, 2008 - 2:28 pm

On Wed, 19 Mar 2008 21:24:41 +0100

No, given that all module_param() options are available via the boot
command line when the module is linked into vmlinux, we don't document them
separately.

There should be a way of auto-generating all the documentation for all the
module parameters from their MODULE_PARM_DESC's.  And there probably is,
but I'm not sure how this is done (?)

(does `make help', fails to spot it).

You can do `modinfo loop' but that probably doesn't work if
CONFIG_BLK_DEV_LOOP=y?
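
One way to look for a module's parameters on a running system is to try
"modinfo -p" first and fall back to /sys/module/<name>/parameters. This is
only a sketch: the function name list_module_params is hypothetical, and the
sysfs fallback shows only parameters declared with non-zero permissions in
module_param(), so a parameter declared with permission 0 will not appear
there.

```shell
# Print the parameters of a module.
#   $1: module name
#   $2: sysfs module root (optional, defaults to /sys/module)
list_module_params() {
    module=$1
    sysfs_root=${2:-/sys/module}

    # Works when the module file is installed and modinfo can find it.
    if modinfo -p "$module" 2>/dev/null; then
        return 0
    fi

    # Fallback: parameters exported through sysfs, printed as name=value.
    param_dir=$sysfs_root/$module/parameters
    if [ -d "$param_dir" ]; then
        for f in "$param_dir"/*; do
            if [ -r "$f" ]; then
                printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
            fi
        done
        return 0
    fi

    echo "no parameter information found for $module" >&2
    return 1
}

# Example:
#   list_module_params loop
```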



I assume you tested the "loop.max_part=N" option?

--

From: Laurent Vivier
Date: Wednesday, March 19, 2008 - 2:39 pm

"No" is "To document max_part is not needed"

or


Yes, I did (with N=63)

Regards,
Laurent
-- 
------------- Laurent.Vivier@bull.net ---------------
"The best way to predict the future is to invent it."
- Alan Kay

--

From: Andrew Morton
Date: Wednesday, March 19, 2008 - 2:43 pm

On Wed, 19 Mar 2008 22:39:10 +0100

The former ;)
--

From: Randy Dunlap
Date: Wednesday, March 19, 2008 - 4:03 pm

First of all, I didn't see Andrew's message until a while after yours,


No, nothing in tree like that.

Would such an auto-generator use source files or compiled modules?
Using the latter means that (a) something like allmodconfig must be done
and (b) it only works for the compiled $ARCH(es), whereas using source code
has neither of those "problems."
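
As a rough illustration of the source-based approach, the sketch below pulls
parameter names and descriptions out of MODULE_PARM_DESC() lines with sed. It
is not the module-params script mentioned in this thread; the function name
list_parm_desc is hypothetical, and it only handles descriptions written as a
single string literal on one line.

```shell
# Print "name: description" for every single-line MODULE_PARM_DESC() found
# in the given source files. Multi-line or concatenated strings are skipped.
list_parm_desc() {
    sed -n 's/^[[:space:]]*MODULE_PARM_DESC([[:space:]]*\([A-Za-z_][A-Za-z0-9_]*\)[[:space:]]*,[[:space:]]*"\(.*\)"[[:space:]]*);.*$/\1: \2/p' "$@"
}

# Example, run from the top of a kernel tree:
#   list_parm_desc drivers/block/loop.c
```

Because it reads source files directly, this needs neither a build nor a
particular $ARCH, which is the advantage described above.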

I can work on updating
http://www.xenotime.net/linux/scripts/module-params (from Oct-2006).


-- 
~Randy
--

From: Bill Davidsen
Date: Thursday, March 20, 2008 - 2:36 pm

I totally don't understand this comment. Where is the file with the list
of canonical module parameters now? I must have missed the discussion of
why it was changed. And what has the boot command line or linking modules


-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
--

From: Andrew Morton
Date: Tuesday, March 18, 2008 - 1:54 pm

On Tue, 18 Mar 2008 13:19:20 +0100


Because this module_param() gives us the loop.loop_max_part=N kernel boot
parameter.

Given which, I think we could rename it to just "max_part":

	modprobe loop max_part=4

	kernel vmlinuz-... loop.max_part=4



--

From: Bill Davidsen
Date: Wednesday, March 19, 2008 - 1:24 pm

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
--

From: Laurent Vivier
Date: Wednesday, March 19, 2008 - 1:32 pm

What do you mean?

NBD doesn't manage partitions... but I also have a patch to do that.
NBD implies an NBD server and an NBD client in userspace, so loop is
better when the disk image is in raw format.

-- 
------- Laurent.Vivier@bull.net  -------
  "The best way to predict the future 
      is to invent it." - Alan Kay

--

From: Bill Davidsen
Date: Sunday, March 23, 2008 - 4:33 pm

Actually the usual partitioning tools will create partitions on nbd 
volumes, but without inodes they are not useful. I thought we used to 
have that working, to work on virtual machine "disk" files which were 
partitioned, but that was several years ago and I could be 
misremembering. I did use fdisk on an nbd device before I asked about 
overhead, but I didn't try to use the partitions.

In any case, if you have code to make nbd partitions work in a currently 
useful way, that might be useful for keeping disk images handy for 
mount. The kvm copy on write might let the fresh install image be shared 
and VMs customize as needed.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

--

From: Laurent Vivier
Date: Tuesday, March 25, 2008 - 3:34 am

This patch allows partitions to be used with network block devices (NBD).

A new parameter, "max_part", defines how many partitions can be managed
per network block device.

For instance, to manage 63 partitions per network block device:

   [on the server side]
# nbd-server 1234 /dev/sdb
   [on the client side]
# modprobe nbd max_part=63
# ls -l /dev/nbd*
brw-rw---- 1 root disk 43,   0 2008-03-25 11:14 /dev/nbd0
brw-rw---- 1 root disk 43,  64 2008-03-25 11:11 /dev/nbd1
brw-rw---- 1 root disk 43, 640 2008-03-25 11:11 /dev/nbd10
brw-rw---- 1 root disk 43, 704 2008-03-25 11:11 /dev/nbd11
brw-rw---- 1 root disk 43, 768 2008-03-25 11:11 /dev/nbd12
brw-rw---- 1 root disk 43, 832 2008-03-25 11:11 /dev/nbd13
brw-rw---- 1 root disk 43, 896 2008-03-25 11:11 /dev/nbd14
brw-rw---- 1 root disk 43, 960 2008-03-25 11:11 /dev/nbd15
brw-rw---- 1 root disk 43, 128 2008-03-25 11:11 /dev/nbd2
brw-rw---- 1 root disk 43, 192 2008-03-25 11:11 /dev/nbd3
brw-rw---- 1 root disk 43, 256 2008-03-25 11:11 /dev/nbd4
brw-rw---- 1 root disk 43, 320 2008-03-25 11:11 /dev/nbd5
brw-rw---- 1 root disk 43, 384 2008-03-25 11:11 /dev/nbd6
brw-rw---- 1 root disk 43, 448 2008-03-25 11:11 /dev/nbd7
brw-rw---- 1 root disk 43, 512 2008-03-25 11:11 /dev/nbd8
brw-rw---- 1 root disk 43, 576 2008-03-25 11:11 /dev/nbd9
# nbd-client localhost 1234 /dev/nbd0
Negotiation: ..size = 80418240KB
bs=1024, sz=80418240

-------NOTE, RFC: partition table is not automatically read.
The driver sets bdev->bd_invalidated to 1 to force the read of the partition
table of the device, but this is done only on an open of the device.
So we have to do a "touch /dev/nbdX" or something like that.
It can't be done from the nbd-client or nbd driver because at this
level we can't ask to read the partition table and to serve the request
at the same time (-> deadlock)

If someone has a better idea, I'm open to any suggestion.
-------NOTE, RFC
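
The workaround in the note above amounts to opening the device node once
after nbd-client has connected. A minimal sketch, assuming (as the note says)
that any open of the device is enough to trigger the partition table read;
the function name open_device_once is hypothetical:

```shell
# Open a device node read-only and close it again without reading any data.
open_device_once() {
    dev=$1
    if [ ! -e "$dev" ]; then
        echo "no such device: $dev" >&2
        return 1
    fi
    # The redirection performs the open(); "true" reads nothing from it.
    true < "$dev"
}

# Example, after "nbd-client localhost 1234 /dev/nbd0":
#   open_device_once /dev/nbd0
```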

# fdisk -l /dev/nbd0

Disk /dev/nbd0: 82.3 GB, ...