Re: regarding major number of block extended devt

From: Tejun Heo
Date: Tuesday, September 2, 2008 - 5:26 am

Hello,

Extended devt is scheduled for the 2.6.28 merge and is currently using
major 259.  Can extended devt keep this major or should it use something
else?

Thanks.

-- 
tejun
--

From: Tejun Heo
Date: Tuesday, September 2, 2008 - 5:35 am

Oops, I forgot to write information about ext devt.

 259 block	Block extended device numbers

		This is a pool of dynamically allocated block device
		numbers.  Currently, ide and sd overflow into this
		region if there are more partitions than the existing
		device number scheme can accommodate, but any block
		device can use it and it's not restricted to
		overflows.

Thanks.

-- 
tejun
--

From: H. Peter Anvin
Date: Tuesday, September 2, 2008 - 1:16 pm

It would seem better to simply use the high minors on the 
already-existing ide and scsi majors?

	-hpa
--

From: Tejun Heo
Date: Tuesday, September 2, 2008 - 9:13 pm

I thought it would be better to break from those majors to make it clear
that the traditional minor allocation scheme isn't followed anymore.
Programs which expect certain majors are likely to need updating in how
they deal with minors anyway, so....

Thanks.

-- 
tejun
--

From: H. Peter Anvin
Date: Wednesday, September 3, 2008 - 9:12 am

But just allocating a big bucket of device numbers and throwing them all
into a pot semirandomly is likely to cause more damage, not less.

	-hpa
--

From: Tejun Heo
Date: Wednesday, September 3, 2008 - 9:21 am

To use ext devt, the system has to use udev for device numbers.  As long
as udev is used, the major number doesn't matter.  In addition, breaking
drastically (e.g. can't find the device) seems better than subtle
failure (e.g. weird partition number calculation based on the
traditional minor number scheme) and CONFIG_DEBUG_BLOCK_EXT_DEVT is
exactly aimed at making breakages obvious.

I don't really see that there's much to gain by sharing the original
major numbers.

Thanks.

-- 
tejun
--

From: H. Peter Anvin
Date: Wednesday, September 3, 2008 - 9:27 am

I'm sorry, but that's simply false.  There is a *lot* of code out there 
that assumes you can determine what the device is by correlating the 
major number with /proc/devices.
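
The kind of lookup described above can be sketched as follows. This is a
minimal illustration, not code from any particular program: the sample
text is made up, and the name "blkext" for major 259 is an assumption
used only for the example. It shows that a partition living under a
shared extended major maps back to a generic name rather than to the
driver that owns the disk.

```python
def parse_block_majors(text):
    """Map major number -> name from the 'Block devices:' section
    of /proc/devices-style text."""
    majors = {}
    in_block = False
    for line in text.splitlines():
        line = line.strip()
        if line.endswith("devices:"):
            # Section header: only collect entries under "Block devices:".
            in_block = line.startswith("Block")
            continue
        if in_block and line:
            number, name = line.split(None, 1)
            majors[int(number)] = name
    return majors


# Illustrative sample only; "blkext" is an assumed name for major 259.
SAMPLE = """\
Character devices:
  1 mem
  4 tty

Block devices:
  3 ide0
  8 sd
259 blkext
"""

majors = parse_block_majors(SAMPLE)
print(majors.get(8))    # driver name for a traditional sd device
print(majors.get(259))  # generic name for an extended device number
```

A program that decides "this is a SCSI disk" by checking for the sd
entry would not recognize an overflowed sd partition numbered under 259.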

	-hpa
--

From: Tejun Heo
Date: Wednesday, September 3, 2008 - 9:45 am

Then we're between a rock and a hard place, as there is also a
lot of code which assumes a certain layout of sd or hd minor numbers.
Keeping only the major numbers doesn't really resolve any problem.  It
may be able to mask a few but that can be more harmful than helpful.

So, if a program expects certain major numbers, it won't be able to
access the partitions which have overflowed to the extended area.  If
a program uses udev or sys hierarchy to walk through devices, it will
be able to use them all.  Isn't that much better than overflowing into
the same major and hoping that everything would work out okay?

Thanks.

-- 
tejun
--

From: H. Peter Anvin
Date: Wednesday, September 3, 2008 - 9:57 am

Oh dear...

I just realized that you're talking about *partitions*, not *devices*. 
There is a metric boatload of code out there that assumes you can take a 
device number, mask off some number of bits, and reach the parent 
device.  They will generally do that without checking if they are right 
or not.

As such, you're liable to suffer corruption of unrelated devices.

In that sense, yes, a separate major will help somewhat.
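
The masking habit described above can be sketched like this. It is an
illustration under stated assumptions: the 16-minors-per-disk layout is
the traditional sd scheme, and the two extended numbers (259:1 and
259:2) are hypothetical values chosen for the example.

```python
import os

SD_MINORS_PER_DISK = 16  # traditional sd layout: 16 minors per disk


def guess_parent(devt):
    """Guess the whole-disk device number by masking off the low minor
    bits, with no check that the result is actually the parent."""
    major, minor = os.major(devt), os.minor(devt)
    return os.makedev(major, minor & ~(SD_MINORS_PER_DISK - 1))


# Traditional numbering: the guess is right.
sdb5 = os.makedev(8, 21)
print(os.major(guess_parent(sdb5)), os.minor(guess_parent(sdb5)))

# Hypothetical extended numbers for partitions of two different disks.
ext_a = os.makedev(259, 1)
ext_b = os.makedev(259, 2)

# Both "parents" collapse to the same number, which is neither disk.
print(guess_parent(ext_a) == guess_parent(ext_b))
```

With a separate major the bogus parent at least falls outside the sd
range. Had the overflow gone into high minors of major 8, the masked
result could be the number of some unrelated sd device.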

	-hpa

--

From: H. Peter Anvin
Date: Wednesday, September 3, 2008 - 11:11 am

Thinking about it some more, one invariant this is *guaranteed* to 
violate is:

	partition_number = partition_device - master_device

Code that needs a partition number (which is common enough) is using 
this invariant, because (a) it has held for 17 years and (b) because 
there is still no alternative other than relying on fragile naming 
scheme hacks.

(a) we can't do anything about, but (b) we can, by introducing a 
partition number attribute in sysfs.

I would consider this a precondition for this.
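
As a small worked example of the invariant above and how it fails: the
device numbers for the extended case are hypothetical (partition 20 of
the disk at 8:0 is assumed to have been handed 259:7), and the function
name is made up for illustration.

```python
import os


def naive_partition_number(partition_devt, master_devt):
    """partition_number = partition_device - master_device"""
    return partition_devt - master_devt


sda = os.makedev(8, 0)

# Traditional numbering: sda3 is 8:3, so the subtraction gives 3.
sda3 = os.makedev(8, 3)
print(naive_partition_number(sda3, sda))

# Hypothetical: partition 20 of sda overflowed and was given 259:7.
sda20 = os.makedev(259, 7)
# The result is some number unrelated to 20.
print(naive_partition_number(sda20, sda) == 20)
```

Nothing in the subtraction signals that the answer is wrong, which is
why an explicit partition number attribute is the safer source.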

	-hpa
--

From: Tejun Heo
Date: Wednesday, September 3, 2008 - 5:25 pm

Hello,


Yeah, that would certainly be a nice addition.  Also, if partitions
are made proper classes, they'll be easily enumerable by
/sys/block/*/partitions/*.

Jens, what do you think?

-- 
tejun
--

From: H. Peter Anvin
Date: Wednesday, September 3, 2008 - 5:28 pm

Note that adding /partitions/ is somewhat unlikely to be useful, since 
existing code will have to search through random crap in sysfs to look 
for the partition directories anyway.

	-hpa
--

From: Tejun Heo
Date: Wednesday, September 3, 2008 - 5:35 pm

Currently it has to list /sys/block/DEV/DEV[-]N/.  With proper
classification, it can do /sys/block/*/partitions/*.  We'll need to keep
around symlinks at the root level.  It also plays well with how other
subsystems have been changing.

-- 
tejun
--

From: H. Peter Anvin
Date: Wednesday, September 3, 2008 - 5:43 pm

Yes, my point was mostly that in order to support older kernels, most 
code is going to want to just access /sys/block/DEV/DEV*N/ anyway.  What 
I have done in my code is I do a readdir() on /sys/block/DEV and look 
for subdirectories with a "dev" member.

Changing them to symlinks would actually break at least my code 
(arguably bad programming on my part), since I optimize by looking for 
DT_DIR.
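
The scan described above, and the way a symlink would defeat the DT_DIR
shortcut, can be sketched as follows. This is not the code referred to
in the message: it is a Python analogue run against a fake directory
tree, and every path and name in it is made up for the example.

```python
import os
import tempfile


def find_partitions(disk_dir, follow_symlinks=False):
    """Return names of subdirectories of disk_dir that contain a 'dev'
    file. With follow_symlinks=False only real directories count, which
    mirrors filtering readdir() results on d_type == DT_DIR."""
    found = []
    with os.scandir(disk_dir) as entries:
        for entry in entries:
            if not entry.is_dir(follow_symlinks=follow_symlinks):
                continue
            if os.path.exists(os.path.join(entry.path, "dev")):
                found.append(entry.name)
    return sorted(found)


def make_fake_disk(root):
    """Build a fake /sys/block/sda-like tree under root (illustrative)."""
    disk = os.path.join(root, "sda")
    # A real partition directory with a "dev" member.
    os.makedirs(os.path.join(disk, "sda1"))
    with open(os.path.join(disk, "sda1", "dev"), "w") as f:
        f.write("8:1\n")
    # A non-partition directory without "dev".
    os.makedirs(os.path.join(disk, "queue"))
    # A partition reachable only through a symlink.
    target = os.path.join(root, "elsewhere", "sda2")
    os.makedirs(target)
    with open(os.path.join(target, "dev"), "w") as f:
        f.write("8:2\n")
    os.symlink(target, os.path.join(disk, "sda2"))
    return disk


with tempfile.TemporaryDirectory() as root:
    disk = make_fake_disk(root)
    strict = find_partitions(disk)                         # DT_DIR-style
    relaxed = find_partitions(disk, follow_symlinks=True)  # stat()-style

print(strict)
print(relaxed)
```

The strict scan silently misses the symlinked partition, while the scan
that follows symlinks finds both at the cost of an extra stat per entry.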

	-hpa
--

From: Jens Axboe
Date: Thursday, September 4, 2008 - 4:50 am