Re: [code] Unlimited partitions, a try

To: Linux Kernel Mailing List <linux-kernel@...>
Date: Friday, October 5, 2007 - 7:50 am

15 partitions (at least for sd_mod devices) are too few.

So I tried the following: after scanning the disk (sda), when we know
the number of partitions P on a disk, create a new block device
/dev/gd0 that is a copy of sda (in terms of disk->queue, etc.). This
is done using alloc_disk(P).

However, read() on gd0 will just return 0. It takes a `blockdev --rereadpt
/dev/gd0` before the disk is accessible. And if I add a call to
rescan inside gpdisk_new() it oopses (probably rightfully so). I do not
know all the block layer magic, so expect some horrible code.

Ideas, hints, anything is welcome.

---
block/Makefile | 1
block/genhd.c | 5 +
block/gpdisk.c | 139 ++++++++++++++++++++++++++++++++++++++++++++++++++
drivers/scsi/sd.c | 1
fs/partitions/check.c | 19 ++++++
include/linux/genhd.h | 16 +++++
include/scsi/sd.h | 3 +
7 files changed, 182 insertions(+), 2 deletions(-)

Index: linux-2.6.23/block/Makefile
===================================================================
--- linux-2.6.23.orig/block/Makefile
+++ linux-2.6.23/block/Makefile
@@ -3,6 +3,7 @@
#

obj-$(CONFIG_BLOCK) := elevator.o ll_rw_blk.o ioctl.o genhd.o scsi_ioctl.o
+obj-$(CONFIG_BLOCK) += gpdisk.o

obj-$(CONFIG_BLK_DEV_BSG) += bsg.o
obj-$(CONFIG_IOSCHED_NOOP) += noop-iosched.o
Index: linux-2.6.23/block/genhd.c
===================================================================
--- linux-2.6.23.orig/block/genhd.c
+++ linux-2.6.23/block/genhd.c
@@ -744,6 +744,9 @@ struct gendisk *alloc_disk_node(int mino
rand_initialize_disk(disk);
INIT_WORK(&disk->async_notify,
media_change_notify_thread);
+ atomic_set(&disk->gpdisk_enabled, false);
+ disk->gpdisk = NULL;
+ disk->gpdisk_parent = NULL;
}
return disk;
}
@@ -793,6 +796,8 @@ EXPORT_SYMBOL(set_device_ro);
void set_disk_ro(struct gendisk *disk, int flag)
{
int i;
+ if (gpdisk_online(disk))
+ set_disk_ro(disk->gpdisk, flag);
dis...

To: Jan Engelhardt <jengelh@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>
Date: Friday, October 5, 2007 - 6:11 pm

Now that we have 20-bit minors, can't we simply recycle some of the
higher bits for additional partitions, across the board? 63 partitions
seem to have been sufficient; at least I haven't heard anyone complain
about that for 15 years.
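
To make the bit-recycling idea concrete, here is a minimal sketch of one
possible encoding. The bit layout, the helper names (encode_minor,
minor_to_part, minor_to_disk) and the MAX_* constants are assumptions made
up for illustration; they are not taken from any kernel tree. The sketch
keeps the legacy sd layout (16 minors per disk) in the low 8 bits and
spends two of the higher bits on the partition number, giving 63
partitions per disk.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical 20-bit minor layout (illustration only):
 *   bits  0..3   partition number, low 4 bits   (legacy sd layout)
 *   bits  4..7   disk index, low 4 bits         (legacy sd layout)
 *   bits  8..9   partition number, high 2 bits  (recycled)
 *   bits 10..19  disk index, high 10 bits       (recycled)
 * Partition 0 is the whole disk, so 6 partition bits allow 63 partitions.
 */
#define MAX_PARTS 63u          /* 2^6 - 1; slot 0 is the whole disk */
#define MAX_DISKS (1u << 14)   /* 4 legacy + 10 recycled disk bits */

/* Pack a disk index and partition number into a 20-bit minor. */
static uint32_t encode_minor(uint32_t disk, uint32_t part)
{
    assert(disk < MAX_DISKS && part <= MAX_PARTS);
    return (part & 0xFu)
         | ((disk & 0xFu) << 4)
         | (((part >> 4) & 0x3u) << 8)
         | ((disk >> 4) << 10);
}

/* Recover the partition number from a minor. */
static uint32_t minor_to_part(uint32_t minor)
{
    return (minor & 0xFu) | (((minor >> 8) & 0x3u) << 4);
}

/* Recover the disk index from a minor. */
static uint32_t minor_to_disk(uint32_t minor)
{
    return ((minor >> 4) & 0xFu) | ((minor >> 10) << 4);
}
```

With this layout, encode_minor(1, 5) yields 21, the same minor the legacy
sd numbering assigns to the fifth partition of the second disk, so minors
already in use would keep their meaning.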

-hpa

To: H. Peter Anvin <hpa@...>
Cc: Jan Engelhardt <jengelh@...>, Linux Kernel Mailing List <linux-kernel@...>
Date: Friday, October 5, 2007 - 7:29 pm

On Fri, 05 Oct 2007 15:11:52 -0700

This was proposed ages ago. Al Viro vetoed sparse minors and it has been
stuck this way ever since. If you have > 15 partitions, use device mapper
for it. I'd prefer it fixed, but it's arguable that device mapper is the
right way to punt all our partitioning to userspace.

Alan

To: Alan Cox <alan@...>
Cc: H. Peter Anvin <hpa@...>, Jan Engelhardt <jengelh@...>, Linux Kernel Mailing List <linux-kernel@...>, Christophe Varoqui <christophe.varoqui@...>
Date: Saturday, October 6, 2007 - 4:36 am

Then please fix support for extended partitions in kpartx (part of
multipath-tools). Debian has an incomplete patch that does the right
thing on activation, but not on deactivation of partitions, and has an
obvious off-by-one in the "kpartx -l /dev/sda" output.

Signed-off-by: Hannes Reinecke <hare@suse.de>
Edited by Alexander E. Patrakov to fix incorrect output of "kpartx -l"
Signed-off-by: Alexander E. Patrakov <patrakov@ums.usu.ru>

--- a/kpartx/kpartx.c
+++ b/kpartx/kpartx.c
@@ -387,10 +387,10 @@ main(int argc, char **argv){
slices[j].minor = m++;

start = slices[j].start - slices[k].start;
- printf("%s%s%d : 0 %lu /dev/dm-%d %lu\n",
+ printf("%s%s%d : 0 %lu %s%s%d %lu\n",
mapname, delim, j+1,
(unsigned long) slices[j].size,
- slices[k].minor, start);
+ mapname, delim, k+1, start);
c--;
}
/* Terminate loop if nothing more to resolve */
@@ -431,7 +431,7 @@ main(int argc, char **argv){
break;

case ADD:
- for (j=0, c = 0; j<n; j++) {
+ for (j = 0, c = 0; j < n; j++) {
if (slices[j].size == 0)
continue;

@@ -477,6 +477,7 @@ main(int argc, char **argv){
d = c;
while (c) {
for (j = 0; j < n; j++) {
+ unsigned long start;
int k = slices[j].container - 1;

if (slices[j].size == 0)
@@ -487,7 +488,7 @@ main(int argc, char **argv){
continue;

/* Skip all simple slices */
- if (k < 0)
+ if (slices[j].container == 0)
continue;

/* Check container slice */
@@ -502,10 +503,11 @@ main(int argc, char **argv){
}
strip_slash(partname);

+ start = slices[j].start - slices[k].start;
if (safe_sprintf(params, "%d:%d %lu",
slices[k].major,
slices[k].minor,
- (unsigned long)slices[j].start)) {
+ start)) {
fprintf(stderr, "params too small\n");
exit(1);
}
@@ -524,9 +526,12 @@ main(int argc, char **...
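
The relative-offset computation the patch introduces can be tried in
isolation. The sketch below is not kpartx code: struct slice is a
simplified stand-in for the real structure, and relative_start and
format_listing are hypothetical helper names. It only shows why a slice
inside a container must be addressed by its offset from the container's
start rather than by its absolute start sector.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-in for kpartx's slice bookkeeping (assumption). */
struct slice {
    unsigned long start;   /* absolute start sector on the disk */
    unsigned long size;    /* length in sectors, 0 = unused slot */
    int container;         /* 1-based index of enclosing slice, 0 = none */
};

/* Offset of slice j relative to its container, mirroring the
 * "slices[j].start - slices[k].start" computation in the patch. */
static unsigned long relative_start(const struct slice *s, int j)
{
    int k = s[j].container - 1;

    assert(k >= 0 && s[j].start >= s[k].start);
    return s[j].start - s[k].start;
}

/* Format one listing line in the shape of the patched printf():
 * "<name> : 0 <size> <container name> <relative start>". */
static int format_listing(char *buf, size_t len, const char *mapname,
                          const char *delim, const struct slice *s, int j)
{
    return snprintf(buf, len, "%s%s%d : 0 %lu %s%s%d %lu",
                    mapname, delim, j + 1, s[j].size,
                    mapname, delim, s[j].container,
                    relative_start(s, j));
}
```

For an extended partition in slot 2 starting at sector 2000 and a logical
partition in slot 5 starting at sector 2063, relative_start returns 63,
which is the offset a mapping stacked on the container device needs.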

To: Alan Cox <alan@...>
Cc: Jan Engelhardt <jengelh@...>, Linux Kernel Mailing List <linux-kernel@...>
Date: Friday, October 5, 2007 - 7:29 pm

Sure. However, that takes having that bit of userspace in even the most
trivial configurations, and not just on bootup, but continuously.

-hpa

To: H. Peter Anvin <hpa@...>
Cc: Alan Cox <alan@...>, Jan Engelhardt <jengelh@...>, Linux Kernel Mailing List <linux-kernel@...>
Date: Saturday, October 6, 2007 - 3:33 pm

I'm not sure that configurations requiring more than 15 partitions are
properly described as "trivial." Which is not to disagree with your
point about required user tools, but most systems needing such tools
will be large and complex enough that a userspace solution will be
acceptable.

--
Bill Davidsen <davidsen@tmr.com>
"We have more to fear from the bungling of the incompetent than from
the machinations of the wicked." - from Slashdot

To: Bill Davidsen <davidsen@...>
Cc: H. Peter Anvin <hpa@...>, Alan Cox <alan@...>, Jan Engelhardt <jengelh@...>, Linux Kernel Mailing List <linux-kernel@...>
Date: Tuesday, October 9, 2007 - 2:15 pm

And there is LVM too, which seems better than partitions in many cases.

--
Len Sorensen

To: H. Peter Anvin <hpa@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>
Date: Friday, October 5, 2007 - 6:27 pm

GPT allows up to 128 partitions, and the Linux partition code currently
allows for up to MAX_PART (256). Assuming 1048576/128, that would give
8192 disks. With dynamic minor allocation and reuse, all that goes away
and the limit becomes a bit less than 1048576 _partitions_.
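
The arithmetic above can be checked with a few lines of C. The function
names are hypothetical, and the dynamic case assumes that each disk still
consumes one minor for its whole-disk node, which is one reading of "a bit
less than 1048576".

```c
#include <assert.h>
#include <stdint.h>

#define MINOR_SPACE (UINT32_C(1) << 20)   /* 20-bit minors: 1048576 values */

/* Disks addressable when every disk reserves a fixed number of minors
 * (static allocation, as with the per-disk ranges used today). */
static uint32_t disks_with_fixed_slots(uint32_t minors_per_disk)
{
    assert(minors_per_disk != 0 && minors_per_disk <= MINOR_SPACE);
    return MINOR_SPACE / minors_per_disk;
}

/* Partitions addressable with dynamic allocation, assuming each disk
 * takes exactly one minor for its whole-disk node (assumption). */
static uint32_t max_partitions_dynamic(uint32_t disks)
{
    assert(disks <= MINOR_SPACE);
    return MINOR_SPACE - disks;
}
```

Reserving 128 minors per disk gives 8192 disks, reserving MAX_PART (256)
gives 4096, and the legacy 16 minors per disk would give 65536.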

To: Jan Engelhardt <jengelh@...>
Cc: Linux Kernel Mailing List <linux-kernel@...>
Date: Friday, October 5, 2007 - 6:57 pm

Yes, but you're proposing something with substantially higher switch
threshold...

-hpa
