For those on lkml, I mistyped the address in the first mail. I just did a second pass edit on the long debug session prompting this patch fixed the cc to linux-kernel. So all those not on linux-pci should be seeing that shortly, or one can look in mmotm. Its not one driver, its more like 100 to 150. Why is giving the driver the data it needs to support a card bad? The debug session shows that not giving it the data it needs is worse. Worse, the kernel thinks it superior to the human and actively ignores input with out even telling the user it did so. The only argument I can come up with is if the data is a pointer, then its likely that that root will not be able to find the correct value or put it in a script that doesn't get updated on the next kernel install and causes a crash. Root isn't likely to add driver data unless it discovers that it is needed, so the existing default of passing 0 should remain. If you want to argue that being able to tell the driver that it should treat this card like card X is wrong, you should be arguing that the whole notion of adding match table entries to the driver is wrong. When it comes down to it, its the same thing. Yet, we support adding ids today. The feature was added, and is used successfully with many drivers. I'm trying to extend the population of drivers where it works to include those that support more than one similar card but need to know which one they are dealing with. I did the work to establish the scope of the problem. So 3 drivers set the flag. 98 + 14 should set it immediately. The 27 that use the value as a pointer probably can be re-written to use it as an index. And the hundreds of other drivers (I don't have the tree here to count) apparently don't care, as they most likely never look at that field. (I am ingoring Obviously I disagree. Since we added passing the id table entry to get the driver data from the table entry, we have close to 150 drivers that use it, instead of old ...
For the record or independent investigation, here is the patch and grep of output from the kernel log. Hopefully you can read them after this mailer mangles them. milton [ 30.412634] PCI Driver CyberPro(in cyber2000fb) uses data 3/3 times for indexes [ 30.418967] PCI Driver aty128fb(in aty128fb) uses data 37/47 times for indexes [ 30.419787] PCI Driver radeonfb(in radeonfb) uses data 98/98 times for flags [ 30.421104] PCI Driver sisfb(in sisfb) uses data 11/12 times for indexes [ 30.422332] PCI Driver savagefb(in savagefb) uses data 23/23 times for flags [ 30.423088] PCI Driver neofb(in neofb) uses data 9/9 times for flags [ 30.424431] PCI Driver imsttfb(in imsttfb) uses data 1/2 times for indexes [ 30.427089] PCI Driver s3fb(in s3fb) uses data 12/12 times for flags [ 30.428495] PCI Driver sstfb(in sstfb) uses data 1/2 times for indexes [ 30.429439] PCI Driver cirrusfb(in cirrusfb) uses data 11/11 times for indexes [ 30.431251] PCI Driver gxt4500(in gxt4500) uses data 1/2 times for indexes [ 33.082152] PCI Driver epca(in epca) uses data 3/4 times for indexes [ 37.857569] PCI Driver serial(in 8250_pci) uses data 177/180 times for indexes [ 37.868296] PCI Driver jsm(in jsm) uses data 4/5 times for indexes [ 37.901045] PCI Driver parport_pc(in parport_pc) uses data 52/53 times for indexes [ 37.906019] PCI Driver parport_serial(in parport_serial) uses data 30/31 times for indexes [ 38.253736] PCI Driver DAC960(in DAC960) uses data 7/7 times for pointers [ 38.983951] PCI Driver e1000e(in e1000e) uses data 29/38 times for indexes [ 39.026014] PCI Driver Sundance Technology IPG Triple-Speed Ethernet(in ipg) uses data 5/6 times for indexes [ 39.033218] PCI Driver cxgb(in cxgb) uses data 5/7 times for indexes [ 39.039465] PCI Driver cxgb3(in cxgb3) uses data 9/10 times for indexes [ 39.092305] PCI Driver 3c59x(in 3c59x) uses data 37/38 times for indexes [ 39.098029] PCI Driver typhoon(in typhoon) ...
Greg, Please respond to this email and explain why the patch pci: dynids.use_driver_data considered harmful http://www.ussg.iu.edu/hypermail/linux/kernel/0807.1/index.html#2188 should not be applied. I am not arguing the correctness of the removed code, rather its utility and benefit to the linux community. As far as I can tell, the code only succeeds in limiting the usefulness of the pci dynamic id addition mechanism. We chose not to limit which drivers can have a table entry added, now let us not limit telling the driver which previous entry is most similar to our new entry. If a driver doesn't set this bit, and only 3 out of 419 do, then the facility can only be used if the driver can function correctly with the data zero. In some drivers (radeonfb) a nonzero flag is always set, in some a list of boards or chipsets is listed in an arbitrary order (e1000e), and in yet others the field is used as a pointer without checking for NULL (DAC960, iwl3945). You sent your request for others to withdraw the patch from consideration when I resent the patch without seeing your comments that were less than 12 hours old, but have been silent for the last 60 hours since I responded to them over the next several hours. If I do not hear from you with technical arguments for keeping the code, I will resend the patch for consideration. milton --
Sorry, I'm on the road right now and will not get back until Friday. Then I have the big merges with Linus to get through. I'll try to get to this by Monday, but my original point still stands, this was implemented for a reason, saying that not enough drivers use it properly does not make the need for it to go away. It is required for them, so perhaps the other 419 drivers also need to have the flag set. That's pretty trivial to do, right? thanks, greg k-h --
Fine, I'll give you a little time. But I want to hear specifics how it helps drivers. I have shown how it hurts many drivers. My arguement is that once we set the flag on the drivers, we will end up with it set on all drivers or the remaining drivers will not care. There were 413+ drivers in linux-next that were compiled by allyesconfig, about 150 used driver data. If the purpose is to enforce range checking, then I'll start working on patches for those 100 drivers to do that. But I don't see why a binary flag helps any driver. The flag is not "I will accept a additional id table", its "I will accept the additional driver data that I need to operate" on my dynamically added id table. milton --
Hi Greg, Not a good enough argument, sorry. There have been many cases in the past where code has been withdrawn after some times because we realized that we got it wrong in the first place. So, please explain what the current code is good for. Honestly, my initial reaction to Milton's proposal was "what an idiot, this flag is there for an obvious safety reason and we don't want to remove it" but after reading both his arguments and the code, I found that I have nothing to backup my claim. If you do, please let us know your If you are suggesting to blindly set the flag to all PCI drivers (or even just all the ones which make use of the driver_data field - doesn't make a difference), this simply shows how useless this flag is. If you don't, then one would have to check the code of all drivers and add validation code for the driver_data value; but then this no longer falls into the "trivial" category. Thanks, -- Jean Delvare --
The technical reason was that this flag was needed to let some drivers work properly with the new_id file, right? It's pretty "trivial" to look to see if the field is set in the pci_id structure, that should be all that is needed, right? thanks, greg k-h --
So please identify at least one such driver where the only correct answer is 0. I even made it easy for you, i identified which drivers set dynamic_id and how. I identified specific drivers where its the wrong answer. So: If you are arguing the code is still needed, please identify at least one case that it is correct. (Oh, I don't buy that if 0 is safe but 1 is equally safe, that the existing code is correct). milton --
Hi Greg, Hmm, so in fact you don't have a clue? ;) I wasn't involved in the addition of this flag, but my impression is that it was added in the hope that it would prevent users from passing additional data to drivers using the driver_data field but not prepared (read: not robust enough) to handle possibly wrong data from user-space. Probably, the idea was that the person adding the dynids.use_driver_data flag would be responsible for verifying that the driver had the required checks in place to guarantee that "nothing" would go wrong even if the data provided by the user wasn't correct. (This doesn't make much sense IMHO, as there is zero guarantee that the developer actually did that - but I believe this is the idea behind the If the flag goes away, nothing will break per se. In particular, the drivers which were using the flag won't stop working - you don't prevent things from happening when removing a safety, on the contrary, you allow more things to happen. What will change is that all the drivers which _do_ use the driver_data field but didn't set the dynids.use_driver_data flag, will now possibly receive their driver_data from the user. Which you may say is unsafe, because these drivers may not expect that (i.e. they probably don't have any check to protect themselves against bogus driver_data values.) But OTOH most of these drivers can't really take dynamic IDs at the moment (because the code prevents the user from passing the required driver_data, silently forcing it to 0) so this would be a big win in terms of usability. On top of that, considering that (silently) defaulting to 0 for driver_data is safe, is just plain Well... If you consider that all drivers using the driver_data field should have the dynids.use_driver_data flag set unconditionally and without looking at the code, then in fact you agree with Milton and myself that this flag should be dropped. The whole point of the flag, if my guess is correct, was to hint the developer that he/she ...
Yeah it doesn't seem to be very heavily used at this point. :) I see around 400 uses of id->driver_data right now, and only 2 of use_driver_data. If we remove the use_driver_data field, drivers (except for the 2 using the field) will start to see whatever values userspace passes to them rather than 0, and at least in some of the few cases I looked at that could cause breakage due to out of bounds references. So I think your point about dynamic IDs in general is a good one; this flag really does look like an "audit was done" bit, but doesn't go as far is it should. The patch you posted to forbid dynamic binding unless use_driver_data is iset is probably the safest thing to do, given that drivers that *don't* set use_driver_data look like they might misbehave even with a driver_data value of 0... What do you think, Greg? Convinced yet? :) Thanks, Jesse --
Hi Jesse, In fact we can do even better than that. We could accept from user-space only driver_data values which at least one device ID entry in the driver already uses. That should be fairly easy to implement, and would offer a level of safety an order of magnitude above what we have at the moment... And it works both ways: if 0 is not a valid data for some driver, that would force the user to provide a non-zero (and valid) data value. And it guarantees that the user can't ask for something the driver doesn't expect, so drivers don't even need extra checks. And no need for a use_driver_data flag either. The only drawback is that it prevents the user from passing a "new" data value even if it would be valid. But honestly, I don't expect that case to happen frequently... if ever at all. So I'd say the benefits totally outweight the drawback. If the interested people agree with the idea, I'll look into implementing it. -- Jean Delvare --
Well the audit would show if user supplied "new" values are needed; otherwise the approach sounds good to me. Thanks -- Jesse Barnes, Intel Open Source Technology Center --
That sounds reasonable, and should work properly. No objection from me. thanks, greg k-h --
Hi all,
Ok, here's what it could look like:
* * * * *
From: Jean Delvare <khali@linux-fr.org>
Subject: PCI: Check dynids driver_data value for validity
Only accept dynids those driver_data value matches one of the driver's
pci_driver_id entry. This prevents the user from accidentally passing
values the drivers do not expect.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: Milton Miller <miltonm@bga.com>
Cc: Greg KH <greg@kroah.com>
---
Documentation/PCI/pci.txt | 4 ++++
drivers/i2c/busses/i2c-amd756.c | 4 ----
drivers/i2c/busses/i2c-viapro.c | 4 ----
drivers/pci/pci-driver.c | 18 ++++++++++++++++--
4 files changed, 20 insertions(+), 10 deletions(-)
--- linux-2.6.27-rc3.orig/Documentation/PCI/pci.txt 2008-08-17 18:24:33.000000000 +0200
+++ linux-2.6.27-rc3/Documentation/PCI/pci.txt 2008-08-17 18:24:38.000000000 +0200
@@ -163,6 +163,10 @@ need pass only as many optional fields a
o class and classmask fields default to 0
o driver_data defaults to 0UL.
+Note that driver_data must match the value used by any of the pci_device_id
+entries defined in the driver. This makes the driver_data field mandatory
+if all the pci_device_id entries have a non-zero driver_data value.
+
Once added, the driver probe routine will be invoked for any unclaimed
PCI devices listed in its (newly updated) pci_ids list.
--- linux-2.6.27-rc3.orig/drivers/i2c/busses/i2c-amd756.c 2008-08-17 17:15:57.000000000 +0200
+++ linux-2.6.27-rc3/drivers/i2c/busses/i2c-amd756.c 2008-08-17 19:42:14.000000000 +0200
@@ -332,10 +332,6 @@ static int __devinit amd756_probe(struct
int error;
u8 temp;
- /* driver_data might come from user-space, so check it */
- if (id->driver_data >= ARRAY_SIZE(chipname))
- return -EINVAL;
-
if (amd756_ioport) {
dev_err(&pdev->dev, "Only one device supported "
"(you have a strange motherboard, btw)\n");
--- ...Acked-by: Greg Kroah-Hartman <gregkh@suse.de> Looks good, thanks for sticking with it and creating this. greg k-h --
Looks good; I think we'll want to put this into linux-next along with Milton's change. I'll push them out after a quick smoke test. Thanks, Jesse --
Applied to linux-next, thanks Jean. Jesse --
Thanks Jean for doing this. Sometimes things move quickly after a long stall. I thought about proposing a similar patch and therefore have to There are a few drivers that could benefit, mainly ones that I identified as using flags. For example, the radeon driver uses different fields of the data to specify crt controller, video output device, etc. I'm fine with deferring a flag for such drivers until so, if anyone asks, Concept-Acked-By: Milton Miller <miltonm@bga.com> milton --
Hi Milton, hi Greg, I _am_ arguing the correctness of the removed code. That code silently defaults driver_data to 0 when the driver doesn't set dynids.use_driver_data. This assumes that 0 is always a safe value, I have to agree with Milton. Allowing all drivers to use dynamic IDs first, and then limiting the use of the driver_data field to drivers setting a specific flag, doesn't make sense. For one thing, just because the driver sets the flag, doesn't mean that it does the proper checks on the passed value. For another, as written above, assuming that 0 is a safe value (that doesn't need to be checked by the driver) is incorrect. The correct approach here if you really want to play it safe, would be to not allow dynamic IDs by default. Drivers would set a flag when they want to use them (instead of when they want to use driver_data as we currently do.) Setting the flag would suggest that you have checked that nothing can go wrong when a dynamic ID is added. As I said above, it's impossible to actually enforce other than with careful code review, but if you insist on trying, maybe we can have these drivers set a validation callback function rather than a flag. In the absence of validation callback function, dynamic IDs would be forbidden. This has a cost in terms of driver size though. That would be a pretty big change, as someone would have to go through all the PCI drivers and update them. I certainly do not have the time to do that, but maybe Milton does if we agree on that? The alternative is to just apply Milton's patch and get away with all checks. That's perfectly fine with me, as I can't see how it is worse than what we currently have, but apparently not with Greg (although I still don't know why.) If Milton's patch isn't going to be applied, then I would like to suggest applying the following patch of mine as a measure to prevent what happened to Milton and started this discussion: * * * * * Subject: PCI: Don't silently ignore dynids driver data If ...
