Re: [PATCH/RFC] pci: dynids.use_driver_data considered harmful

Previous thread: [PATCH/RESEND] pci: dynids.use_driver_data considered harmful by Milton Miller on Saturday, July 12, 2008 - 1:02 pm. (5 messages)

Next thread: [PATCH] cxacru: Fix printk format flag in error message by Simon Arlott on Saturday, July 12, 2008 - 2:19 pm. (2 messages)
From: Milton Miller
Date: Saturday, July 12, 2008 - 2:08 pm

For those on lkml, I mistyped the address in the first mail.  I just
did a second pass edit on the long debug session prompting this patch
fixed the cc to linux-kernel.  So all those not on linux-pci should
be seeing that shortly, or one can look in mmotm.


Its not one driver, its more like 100 to 150.

Why is giving the driver the data it needs to support a card bad?
The debug session shows that not giving it the data it needs
is worse.

Worse, the kernel thinks it superior to the human and actively ignores
input with out even telling the user it did so.

The only argument I can come up with is if the data is a pointer, then
its likely that that root will not be able to find the correct value or
put it in a script that doesn't get updated on the next kernel install
and causes a crash.  Root isn't likely to add driver data unless it
discovers that it is needed, so the existing default of passing 0
should remain.

If you want to argue that being able to tell the driver that it should
treat this card like card X is wrong, you should be arguing that the
whole notion of adding match table entries to the driver is wrong.
When it comes down to it, its the same thing.  Yet, we support adding
ids today.  The feature was added, and is used successfully with many
drivers.   I'm trying to extend the population of drivers where it works
to include those that support more than one similar card but need to
know which one they are dealing with.


I did the work to establish the scope of the problem.

So 3 drivers set the flag.  98 + 14 should set it immediately.  The 27 
that
use the value as a pointer probably can be re-written to use it as an 
index.  And
the hundreds of other drivers (I don't have the tree here to count) 
apparently
don't care, as they most likely never look at that field.  (I am 
ingoring

Obviously I disagree.

Since we added passing the id table entry to get the driver data from 
the
table entry, we have close to 150 drivers that use it, instead of old ...
From: Milton Miller
Date: Saturday, July 12, 2008 - 3:48 pm

For the record or independent investigation, here is the patch and grep 
of
output from the kernel log.   Hopefully you can read them after this 
mailer
mangles them.

milton

[   30.412634] PCI Driver CyberPro(in cyber2000fb) uses data 3/3 times 
for indexes
[   30.418967] PCI Driver aty128fb(in aty128fb) uses data 37/47 times 
for indexes
[   30.419787] PCI Driver radeonfb(in radeonfb) uses data 98/98 times 
for flags
[   30.421104] PCI Driver sisfb(in sisfb) uses data 11/12 times for 
indexes
[   30.422332] PCI Driver savagefb(in savagefb) uses data 23/23 times 
for flags
[   30.423088] PCI Driver neofb(in neofb) uses data 9/9 times for flags
[   30.424431] PCI Driver imsttfb(in imsttfb) uses data 1/2 times for 
indexes
[   30.427089] PCI Driver s3fb(in s3fb) uses data 12/12 times for flags
[   30.428495] PCI Driver sstfb(in sstfb) uses data 1/2 times for 
indexes
[   30.429439] PCI Driver cirrusfb(in cirrusfb) uses data 11/11 times 
for indexes
[   30.431251] PCI Driver gxt4500(in gxt4500) uses data 1/2 times for 
indexes
[   33.082152] PCI Driver epca(in epca) uses data 3/4 times for indexes
[   37.857569] PCI Driver serial(in 8250_pci) uses data 177/180 times 
for indexes
[   37.868296] PCI Driver jsm(in jsm) uses data 4/5 times for indexes
[   37.901045] PCI Driver parport_pc(in parport_pc) uses data 52/53 
times for indexes
[   37.906019] PCI Driver parport_serial(in parport_serial) uses data 
30/31 times for indexes
[   38.253736] PCI Driver DAC960(in DAC960) uses data 7/7 times for 
pointers
[   38.983951] PCI Driver e1000e(in e1000e) uses data 29/38 times for 
indexes
[   39.026014] PCI Driver Sundance Technology IPG Triple-Speed 
Ethernet(in ipg) uses data 5/6 times for indexes
[   39.033218] PCI Driver cxgb(in cxgb) uses data 5/7 times for indexes
[   39.039465] PCI Driver cxgb3(in cxgb3) uses data 9/10 times for 
indexes
[   39.092305] PCI Driver 3c59x(in 3c59x) uses data 37/38 times for 
indexes
[   39.098029] PCI Driver typhoon(in typhoon) ...
From: Milton Miller
Date: Wednesday, July 16, 2008 - 3:18 am

Greg,

Please respond to this email and explain why the patch

pci: dynids.use_driver_data considered harmful

http://www.ussg.iu.edu/hypermail/linux/kernel/0807.1/index.html#2188

should not be applied.   I am not arguing the correctness of
the removed code, rather its utility and benefit to the linux
community.

As far as I can tell, the code only succeeds in limiting the
usefulness of the pci dynamic id addition mechanism.  We chose
not to limit which drivers can have a table entry added, now
let us not limit telling the driver which previous entry is
most similar to our new entry.

If a driver doesn't set this bit, and only 3 out of 419 do,
then the facility can only be used if the driver can function
correctly with the data zero.  In some drivers (radeonfb) a
nonzero flag is always set, in some a list of boards or
chipsets is listed in an arbitrary order (e1000e), and in yet
others the field is used as a pointer without checking for NULL
(DAC960, iwl3945).



You sent your request for others to withdraw the patch
from consideration when I resent the patch without seeing your
comments that were less than 12 hours old, but have been silent
for the last 60 hours since I responded to them over the next
several hours.   If I do not hear from you with technical
arguments for keeping the code, I will resend the patch for
consideration.

milton


--

From: Greg KH
Date: Thursday, July 17, 2008 - 12:07 am

Sorry, I'm on the road right now and will not get back until Friday.
Then I have the big merges with Linus to get through.  I'll try to get
to this by Monday, but my original point still stands, this was
implemented for a reason, saying that not enough drivers use it properly
does not make the need for it to go away.  It is required for them, so
perhaps the other 419 drivers also need to have the flag set.  That's
pretty trivial to do, right?

thanks,

greg k-h
--

From: Milton Miller
Date: Thursday, July 17, 2008 - 7:36 am

Fine, I'll give you a little time.  But I want to hear specifics how it
helps drivers.   I have shown how it hurts many drivers.   My arguement
is that once we set the flag on the drivers, we will end up with it set
on all drivers or the remaining drivers will not care.   There were 413+
drivers in linux-next that were compiled by allyesconfig, about 150 used
driver data.

If the purpose is to enforce range checking, then I'll start working on
patches for those 100 drivers to do that.   But I don't see why a binary
flag helps any driver.   The flag is not "I will accept a additional id
table", its "I will accept the additional driver data that I need to
operate" on my dynamically added id table.

milton

--

From: Jean Delvare
Date: Wednesday, August 6, 2008 - 12:31 am

Hi Greg,


Not a good enough argument, sorry. There have been many cases in the
past where code has been withdrawn after some times because we
realized that we got it wrong in the first place.

So, please explain what the current code is good for. Honestly, my
initial reaction to Milton's proposal was "what an idiot, this flag is
there for an obvious safety reason and we don't want to remove it" but
after reading both his arguments and the code, I found that I have
nothing to backup my claim. If you do, please let us know your

If you are suggesting to blindly set the flag to all PCI drivers (or
even just all the ones which make use of the driver_data field -
doesn't make a difference), this simply shows how useless this flag is.
If you don't, then one would have to check the code of all drivers and
add validation code for the driver_data value; but then this no longer
falls into the "trivial" category.

Thanks,
-- 
Jean Delvare
--

From: Greg KH
Date: Thursday, August 14, 2008 - 3:12 pm

The technical reason was that this flag was needed to let some drivers
work properly with the new_id file, right?


It's pretty "trivial" to look to see if the field is set in the pci_id
structure, that should be all that is needed, right?

thanks,

greg k-h
--

From: Milton Miller
Date: Friday, August 15, 2008 - 7:50 am

So please identify at least one such driver where the only correct 
answer is 0.  I even made it easy for you, i identified which drivers 
set dynamic_id and how.  I identified specific drivers where its the 
wrong answer.

So:  If you are arguing the code is still needed, please identify  at 
least one case that it is correct.  (Oh, I don't buy that if 0 is safe 
but 1 is equally safe, that the existing code is correct).

milton

--

From: Jean Delvare
Date: Friday, August 15, 2008 - 8:50 am

Hi Greg,


Hmm, so in fact you don't have a clue? ;)

I wasn't involved in the addition of this flag, but my impression is
that it was added in the hope that it would prevent users from passing
additional data to drivers using the driver_data field but not prepared
(read: not robust enough) to handle possibly wrong data from user-space.
Probably, the idea was that the person adding the
dynids.use_driver_data flag would be responsible for verifying that the
driver had the required checks in place to guarantee that "nothing"
would go wrong even if the data provided by the user wasn't correct.

(This doesn't make much sense IMHO, as there is zero guarantee that the
developer actually did that - but I believe this is the idea behind the

If the flag goes away, nothing will break per se. In particular, the
drivers which were using the flag won't stop working - you don't prevent
things from happening when removing a safety, on the contrary, you
allow more things to happen. What will change is that all the drivers
which _do_ use the driver_data field but didn't set the
dynids.use_driver_data flag, will now possibly receive their driver_data
from the user.

Which you may say is unsafe, because these drivers may not expect that
(i.e. they probably don't have any check to protect themselves against
bogus driver_data values.) But OTOH most of these drivers can't really
take dynamic IDs at the moment (because the code prevents the user from
passing the required driver_data, silently forcing it to 0) so this
would be a big win in terms of usability. On top of that, considering
that (silently) defaulting to 0 for driver_data is safe, is just plain

Well... If you consider that all drivers using the driver_data field
should have the dynids.use_driver_data flag set unconditionally and
without looking at the code, then in fact you agree with Milton and
myself that this flag should be dropped. The whole point of the flag,
if my guess is correct, was to hint the developer that he/she ...
From: Jesse Barnes
Date: Friday, August 15, 2008 - 10:46 am

Yeah it doesn't seem to be very heavily used at this point. :)

I see around 400 uses of id->driver_data right now, and only 2 of 
use_driver_data.  If we remove the use_driver_data field, drivers (except for 
the 2 using the field) will start to see whatever values userspace passes to 
them rather than 0, and at least in some of the few cases I looked at that 
could cause breakage due to out of bounds references.

So I think your point about dynamic IDs in general is a good one; this flag 
really does look like an "audit was done" bit, but doesn't go as far is it 
should.

The patch you posted to forbid dynamic binding unless use_driver_data is iset 
is probably the safest thing to do, given that drivers that *don't* set 
use_driver_data look like they might misbehave even with a driver_data value 
of 0...

What do you think, Greg?  Convinced yet? :)

Thanks,
Jesse
--

From: Jean Delvare
Date: Friday, August 15, 2008 - 11:55 am

Hi Jesse,


In fact we can do even better than that. We could accept from
user-space only driver_data values which at least one device ID entry in
the driver already uses. That should be fairly easy to implement, and
would offer a level of safety an order of magnitude above what we have
at the moment... And it works both ways: if 0 is not a valid data for
some driver, that would force the user to provide a non-zero (and
valid) data value. And it guarantees that the user can't ask for
something the driver doesn't expect, so drivers don't even need extra
checks. And no need for a use_driver_data flag either.

The only drawback is that it prevents the user from passing a "new"
data value even if it would be valid. But honestly, I don't expect that
case to happen frequently... if ever at all. So I'd say the benefits
totally outweight the drawback.

If the interested people agree with the idea, I'll look into
implementing it.

-- 
Jean Delvare
--

From: Jesse Barnes
Date: Friday, August 15, 2008 - 12:15 pm

Well the audit would show if user supplied "new" values are needed; otherwise 
the approach sounds good to me.

Thanks
-- 
Jesse Barnes, Intel Open Source Technology Center
--

From: Greg KH
Date: Friday, August 15, 2008 - 11:22 pm

That sounds reasonable, and should work properly.

No objection from me.

thanks,

greg k-h
--

From: Jean Delvare
Date: Sunday, August 17, 2008 - 12:06 pm

Hi all,


Ok, here's what it could look like:

* * * * *

From: Jean Delvare <khali@linux-fr.org>
Subject: PCI: Check dynids driver_data value for validity

Only accept dynids those driver_data value matches one of the driver's
pci_driver_id entry. This prevents the user from accidentally passing
values the drivers do not expect.

Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Milton Miller <miltonm@bga.com>
Cc: Greg KH <greg@kroah.com>
---
 Documentation/PCI/pci.txt       |    4 ++++
 drivers/i2c/busses/i2c-amd756.c |    4 ----
 drivers/i2c/busses/i2c-viapro.c |    4 ----
 drivers/pci/pci-driver.c        |   18 ++++++++++++++++--
 4 files changed, 20 insertions(+), 10 deletions(-)

--- linux-2.6.27-rc3.orig/Documentation/PCI/pci.txt	2008-08-17 18:24:33.000000000 +0200
+++ linux-2.6.27-rc3/Documentation/PCI/pci.txt	2008-08-17 18:24:38.000000000 +0200
@@ -163,6 +163,10 @@ need pass only as many optional fields a
 	o class and classmask fields default to 0
 	o driver_data defaults to 0UL.
 
+Note that driver_data must match the value used by any of the pci_device_id
+entries defined in the driver. This makes the driver_data field mandatory
+if all the pci_device_id entries have a non-zero driver_data value.
+
 Once added, the driver probe routine will be invoked for any unclaimed
 PCI devices listed in its (newly updated) pci_ids list.
 
--- linux-2.6.27-rc3.orig/drivers/i2c/busses/i2c-amd756.c	2008-08-17 17:15:57.000000000 +0200
+++ linux-2.6.27-rc3/drivers/i2c/busses/i2c-amd756.c	2008-08-17 19:42:14.000000000 +0200
@@ -332,10 +332,6 @@ static int __devinit amd756_probe(struct
 	int error;
 	u8 temp;
 	
-	/* driver_data might come from user-space, so check it */
-	if (id->driver_data >= ARRAY_SIZE(chipname))
-		return -EINVAL;
-
 	if (amd756_ioport) {
 		dev_err(&pdev->dev, "Only one device supported "
 		       "(you have a strange motherboard, btw)\n");
--- ...
From: Greg KH
Date: Sunday, August 17, 2008 - 8:50 pm

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>

Looks good, thanks for sticking with it and creating this.

greg k-h
--

From: Jesse Barnes
Date: Monday, August 18, 2008 - 10:13 am

Looks good; I think we'll want to put this into linux-next along with Milton's 
change.  I'll push them out after a quick smoke test.

Thanks,
Jesse
--

From: Jesse Barnes
Date: Monday, August 18, 2008 - 1:41 pm

Applied to linux-next, thanks Jean.

Jesse
--

From: Milton Miller
Date: Tuesday, August 19, 2008 - 11:01 am

Thanks Jean for doing this.  Sometimes things move quickly after a long 
stall.  I thought about proposing a similar patch and therefore have to 

There are a few drivers that could benefit, mainly ones that I 
identified as using flags.  For example, the radeon driver uses 
different fields of the data to specify crt controller, video output 
device, etc.   I'm fine with deferring a flag for such drivers until 

so, if anyone asks,

Concept-Acked-By: Milton Miller <miltonm@bga.com>

milton

--

From: Jean Delvare
Date: Wednesday, August 6, 2008 - 12:22 am

Hi Milton, hi Greg,


I _am_ arguing the correctness of the removed code. That code silently
defaults driver_data to 0 when the driver doesn't set
dynids.use_driver_data. This assumes that 0 is always a safe value,

I have to agree with Milton. Allowing all drivers to use dynamic IDs
first, and then limiting the use of the driver_data field to drivers
setting a specific flag, doesn't make sense. For one thing, just
because the driver sets the flag, doesn't mean that it does the proper
checks on the passed value. For another, as written above, assuming
that 0 is a safe value (that doesn't need to be checked by the driver)
is incorrect.

The correct approach here if you really want to play it safe, would be
to not allow dynamic IDs by default. Drivers would set a flag when they
want to use them (instead of when they want to use driver_data as we
currently do.) Setting the flag would suggest that you have checked
that nothing can go wrong when a dynamic ID is added. As I said above,
it's impossible to actually enforce other than with careful code
review, but if you insist on trying, maybe we can have these drivers
set a validation callback function rather than a flag. In the absence of
validation callback function, dynamic IDs would be forbidden. This has
a cost in terms of driver size though.

That would be a pretty big change, as someone would have to go through
all the PCI drivers and update them. I certainly do not have the time
to do that, but maybe Milton does if we agree on that?

The alternative is to just apply Milton's patch and get away with all
checks. That's perfectly fine with me, as I can't see how it is worse
than what we currently have, but apparently not with Greg (although I
still don't know why.) If Milton's patch isn't going to be applied,
then I would like to suggest applying the following patch of mine as a
measure to prevent what happened to Milton and started this discussion:

* * * * *

Subject: PCI: Don't silently ignore dynids driver data

If ...
Previous thread: [PATCH/RESEND] pci: dynids.use_driver_data considered harmful by Milton Miller on Saturday, July 12, 2008 - 1:02 pm. (5 messages)

Next thread: [PATCH] cxacru: Fix printk format flag in error message by Simon Arlott on Saturday, July 12, 2008 - 2:19 pm. (2 messages)