Linus,
In order to stop the bkl contagion in ioctl and push the
bkl from Vfs to the drivers that don't implement a proper
unlock_ioctl, this patch proposes to add a new "locked_ioctl"
field in struct file_operation.
This field is never called from vfs and requires to assign a
new "deprecated_ioctl" helper to the unlocked_ioctl field.
This deprecated_ioctl() helper then calls the locked_ioctl()
callback with the bkl held.
The point of doing this is to change every users of fops::ioctl
to the new scheme. Once there is no more users of the ioctl field,
we'll then be able to remove it.
Also this change brings a new config BKL which is always
enabled for now. Every drivers that use the bkl through
explicit lock_kernel() calls or by using deprecated_ioctl()
will have to select CONFIG_BKL.
Once we get no more uses of the bkl from the core features but
only on individual drivers, config BKL can be turned off
by default and selected by those drivers if needed, relegating
the bkl as an obsolete library for poor old drivers.
And given the work happening currently (tty mutex, vfs pushdowns,
procfs bkl removal, llseek janitorials, etc...), this may happen
sooner than was expected.
Now the point of having this patch in 2.6.34 is to turn
the old ioctl implementations to the new scheme in a set
of patches that relevant maintainers can apply to their
tree (or apply by ourself for unmaintained areas) for .35
Otherwise we would need to queue everything in a single
tree that may bring conflicts in the next merge window.
This single patch has very few chances to bring any regressions.
Please pull the bkl/ioctl branch that can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
bkl/ioctl
Thanks,
Frederic
---
Arnd Bergmann (1):
vfs: Introduce CONFIG_BKL and deprecated_ioctl
fs/ioctl.c | 22 ++++++++++++++++++++++
include/linux/fs.h | 3 +++
include/linux/smp_lock.h | 4 ++++
...Linus,
In this v2, I've removed the declaration of default_llseek
from smp_lock.h, as this export can be made later (we
want to make any use of default_llseek() depend on
CONFIG_BKL as well, but that can wait).
Please pull the bkl/ioctl-v2 branch that can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
bkl/ioctl-v2
Thanks,
Frederic
---
Arnd Bergmann (1):
vfs: Introduce CONFIG_BKL and deprecated_ioctl
fs/ioctl.c | 22 ++++++++++++++++++++++
include/linux/fs.h | 3 +++
include/linux/smp_lock.h | 3 +++
kernel/Kconfig.locks | 10 ++++++++++
4 files changed, 38 insertions(+), 0 deletions(-)
---
commit 333f5fb46d15d057f0d69da0dd8b0a3db89bf34f
Author: Arnd Bergmann <arnd@arndb.de>
Date: Thu Apr 1 14:42:38 2010 +0200
vfs: Introduce CONFIG_BKL and deprecated_ioctl
This is a preparation for the removal of the big kernel lock that
introduces new interfaces for device drivers still using it.
We can start marking those device drivers as 'depends on CONFIG_BKL'
now, and make that symbol optional later, when the point has come
at which we are able to build a kernel without the BKL.
Similarly, device drivers that currently make use of the implicit
BKL locking around the ioctl function can now get annotated by
changing
.ioctl = foo_ioctl,
to
.locked_ioctl = foo_ioctl,
.unlocked_ioctl = deprecated_ioctl,
As soon as no driver remains using the old ioctl callback, it can
get removed.
[fweisbec: move config bkl from Kconfig.debug to Kconfig.locks,
rename default_ioctl to deprecated_ioctl]
v2: Remove default_llseek() declaration from smp_lock.h, we
don't need to prepare it to be modularized right now.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
diff --git a/fs/ioctl.c ...Hi Linus, This is the second version of this pull request and you don't seem to be pulling it either. Could you please tell us what kind of hesitation or problems you may have with this? If you think it's too late to merge this, we can still cook it for 2.6.35 and apply every dependent ioctl conversion in the same tree, this will have the drawback that we can't push dependent patches to the relevant maintainers trees (that said I guess that a good part of the drivers that still implement ioctl are about unmaintained areas). Or may be you don't like the core idea of this patch. Either way please tell us so that we can go ahead with this and choose a direction that looks more appropriate for you. Thanks, Frederic. --
I don't see the point, frankly. Especially not outside the merge window, but quite frankly, I don't see it in general. The whole thing seems to be designed to be inconvenient, and to have a config option that I fundamentally don't believe in (CONFIG_BKL). Linus --
More detail: it still leaves that old "ioctl" function pointer that needs the BKL and is ungreppable. So the whole and only point of the patch is to make our current mess even _more_ complex, with three different cases. No, thank you. Quite frankly, if you want to get rid of the BKL in ioctl's and make them easily greppable, then I would suggest a simple renaming: rename the current '->ioctl()' thing to '->bkl_ioctl()', and mark it deprecated. No new config options, no new games. Just rename it. No need to mark things with CONFIG_BKL, when you can just see it by virtue of them using 'bkl_ioctl'. Linus --
We want to do the rename in the next merge window and remove the old ->ioctl(), this patch is just a preparation for this so we can start queuing the patches for the rename in maintainer trees. The addition of the deprecated_ioctl() helper is not essential but let's us move all BKL users into loadable modules next. The CONFIG_BKL stuff is not a requirement for doing this, but something we /also/ want to do in the next merge window, i.e. mark all BKL users as CONFIG_BKL, not just the ones that use the locked_ioctl (or bkl_ioctl). I agree that it's now a bit late for this, but when I initially suggested this (before -rc3, IIRC), that would have given us the chance to queue up all the patches with a dependency on this for the next merge window. Now the current outlook is probably that we do a lot of the preparation work like this patch in 2.6.35-rc1, which moves the merge of the interesting parts out to the 2.6.36 timeframe. Arnd --
.. and I think that's simply fundamentally wrong. What does it buy us, except for another really annoying config option? It sure as hell doesn't buy us any code-size (what, a couple of bytes). Quite frankly, if you want to just prepare to rename things one by one, then you might as well just have a single line #define bkl_ioctl ioctl and then you can do .bkl_ioctl = driver_ioctl but the thing is - what does that _buy_ us without the ability to grep for and cause compile errors for drivers that haven't done this? Nothing. So seriously - I'd _much_ rather just get one single large patch that just renames everything. None of this crap that makes no sense. Linus --
With CONFIG_BKL disabled, we gain a few cyles in the scheduler, since we no longer need to check if the BKL was held and needs to be released. Not much, but better than just a few bytes of object size. With CONFIG_BKL=m, what we gain is visibility: Doing an lsmod will show you if bkl.ko is loaded and what other modules are using it. My hope is that this will motivate people to remove it from the remaining modules. One of the crazier ideas was to automatically taint the kernel when bkl.ko gets loaded. This does not gain us anything besides making a Being able to grep is not the reason for this patch. The reason is that we're trying to eliminate the BKL from all non-obscure code in the kernel and one of the nasty ones is fs/ioctl.c, which always gets compiled in. The trick with bkl.ko requires that any code calling lock_kernel() needs to be a module, so the deprecated_ioctl() helper will have to be part Ok. We'll go back to my intial approach for ioctl then, doing it all at once. The reason for the staged approach was to avoid bisect breakage and conflicts with maintainer updates, but since most of the remaining users at this point are really unmaintained (or staging drivers), there's probably not much to worry about here. There will be one patch that does three things: - add a deprecated_ioctl() function - rename fops->ioctl to fops->bkl_ioctl - set fops->unlocked_ioctl=deprecated_ioctl for any file_operations instance that uses bkl_ioctl Is this ok for you? Regarding the CONFIG_BKL option, should that also be done as another patch changing all the Kconfig files? My preference would be to have one patch adding the Kconfig option first, giving the maintainers the choice to either mark their code 'depends on BKL' or to remove the dependency by changing their code. Arnd --
That has _nothing_ to do with the ioctl's though. Stop mixing things up. There are two totally independent issues: - making the BKL ioctl's be explicit and findable - eventually getting rid of the BKL entirely and I think you guys are totally mixing things up, and making things WORSE in the process. The notion of having _three_ different "ioctl()" function pointers just makes me want to gag. And there is absolutely _zero_ reason for it. Tjhere is no way in hell that we want to have every subsystem maintainer try to independently do their own ioctl's. Most of the drivers that have those things are basically unmaintained or on the back burner anyway. So don't make the current ugly ioctl situation worse. Not even as a stop-gap, because there is absolutely _zero_ upside to making yet another new crazy temporary ioctl interface. And don't try to conflate the issue of ioctl and BKL. There are still code-paths that do lock_kernel() without the ioctl's, so the whole ioctl renaming has _zero_ to do with CONFIG_BKL. Linus --
Our final goal was not to have three different ioctl interfaces. This state was only deemed to be temporary. This was the only way to make the change smoother and don't conflict with other trees with a single monolithic patch. But if you are ok with a single one, then we are going this way and we'll send it for It's true, but once it gets pushed down/dropped from every core parts (which is what we are working on currently in parallel), lock_kernel() and .bkl_ioctl is only going to be used by unmaintained drivers. This is the time where having a CONFIG_BKL is going to make sense. And it won't be a question of saving some bytes but improve efficiency of schedule() for those who don't need such old or unmaintained drivers. May be we should only start to focus on this new config once this state is reached. And to prepare for that, are you ok with this scheme of: - .ioctl = foo, + .unlocked_ioctl = bkl_ioctl, + .bkl_ioctl = foo, ...done at the same time as the big rename patch. This will prepare to remove the bkl from vfs and build it conditionally from the bkl lib, once the bkl is out the core? --
Seriously, why not just
- .ioctl = foo,
+ .bkl_ioctl = foo
because that line of
+ .unlocked_ioctl = bkl_ioctl,
is just total and utter _garbage_. There is zero reason for it.
In the long run (this is a year from now, when we rename "unlocked_ioctl"
back to just "ioctl"), the vfs_ioctl code will just do
struct file_operations *fops = filp->f_op;
if (!fops)
return -ENOTTY;
if (fops->ioctl) {
int error = fops->ioctl(...)
if (error == -ENOIOCTLCMD)
error = -EINVAL;
return error;
}
#ifdef CONFIG_BKL
if (fops->bkl_ioctl) {
int error;
lock_kernel();
error = fops->bkl_ioctl(...)
unlock_kernel();
return error;
}
#endif
return -ENOTTY;
and we're all done.
At NO point is there any advantage to that "bkl_ioctl" crap. It doesn't
help the legacy drivers (which won't even _compile_ unless CONFIG_BKL is
set anyway), it doesn't help the core code, it doesn't help _anybody_.
Not today, not tomorrow, not with CONFIG_BKL, and not without.
Linus
--
Well, we won't be able to get this bkl.ko but the desired effect of having it was rather psychological than practical, as Arnd explained: to make the dependency more visible and pull concerned people into dropping the bkl from the drivers they are using. But other than that, the final effect remains pretty the same so it's not a big deal. I'm ok with that. Thanks for clarifying the situation! --
We could also stop playing games with with this and just kill the locked variant of ioctl right away. No rename to bkl_ioctl, no helper functions. It's served it's purpose and we now have the list of 157 files that still use fops->ioctl, so if we just push the BKL into those files and make them use unlocked_ioctl, we will be able to remove ->ioctl for good. We can do the rename of ->unlocked_ioctl to ->ioctl right after that if you like, or in a year from now, I don't care. Arnd --
I wouldn't mind that. But we've not found people to do it so far, so...
Anyway, I spent some time automating the ioctl -> bkl_ioctl conversion.
The following is the end result. Comments?
Regardless of _what_ we do, renaming 'ioctl' to 'bkl_ioctl' is a good
thing. It means that we can reliably grep for these usages, whether it is
then to later introduce a CONFIG_BKL or if it is to get rid of the
bkl_ioctl entirely.
NOTE! This has gone through a "allmodconfig" (with staging drivers), but
only on x86-64. There are quite probably missing architecture conversions
and/or drivers. But this should be the bulk of them.
It doesn't look that bad. Just about 100 files affected.
Linus
---
commit 5312f12ed68dd43acca09ce646fdce3896a292bb
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon Apr 26 10:55:23 2010 -0700
Rename 'struct file_operations' 'ioctl' fn pointer to 'bkl_ioctl'
This makes it much easier to grep for the use of implicit BKL usage in
the traditional 'ioctl' function pointer.
The patch was mostly auto-generated with the following shell scripting:
# Create an approximate file list of ioctl initializers
git grep -w -20 file_operations |
grep '\.ioctl[ ]*=' |
sed 's/-[ ]*\.ioctl.*$//' |
sort -u > f
# Run a sed script to replace 'ioctl' with 'bkl_ioctl'
cat f | while read i; do
j="$i.sed"
sed 's/\.ioctl([ ]*)=/.bkl_ioctl\1=/' < $i > $j
mv $j $i
done
followed by some similarly trivial scripting to then mostly fix up
places that tried to align all the initializers with tabs or spaces.
Finally, some manual fixup, and testing of the end result with a
allmodconfig make (with 'staging' drivers explicitly enabled).
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
arch/cris/arch-v10/drivers/ds1302.c | 2 +-
arch/cris/arch-v10/drivers/gpio.c | 2 +-
...I'll have another go at this tonight, see how far I get doing it
Yes, looks good. I've compared it to a previous series of mine that
changed all the files in a different way. A number of those changes
have found their way into your tree by now, and there were only
two left that you didn't catch (see below), and one false positive
The v4l2_file_operations is actually another can of worms, I count
77 more files with a locked ioctl function in there. I'll put that on
the list of things to do before the BKL is gone.
Arnd
---
Subject: Missing bits from ->ioctl to ->bkl_ioctl conversion
Feel free to fold this into your patch.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 3de2f32..4890683 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -722,7 +722,7 @@ struct file_operations {
ssize_t (*aio_write) (struct kiocb *, const struct iovec *, unsigned long, loff_t);
int (*readdir) (struct file *, void *, filldir_t);
unsigned int (*poll) (struct file *, struct poll_table_struct *);
- int (*ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
+ int (*bkl_ioctl) (struct inode *, struct file *, unsigned int, unsigned long);
long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
int (*mmap) (struct file *, struct vm_area_struct *);
diff --git a/sound/oss/au1550_ac97.c b/sound/oss/au1550_ac97.c
index c1070e3..7295781 100644
--- a/sound/oss/au1550_ac97.c
+++ b/sound/oss/au1550_ac97.c
@@ -835,11 +835,11 @@ au1550_ioctl_mixdev(struct inode *inode, struct file *file,
}
static /*const */ struct file_operations au1550_mixer_fops = {
- owner:THIS_MODULE,
- llseek:au1550_llseek,
- ioctl:au1550_ioctl_mixdev,
- open:au1550_open_mixdev,
- release:au1550_release_mixdev,
+ .owner = THIS_MODULE,
+ .llseek = au1550_llseek,
+ .bkl_ioctl = ...Yeah, those v4l2_file_operations were annoying. I didn't notice at first,
so my first automated script renamed all of them too (because I had just
Yes, but that - along with the sound drivers, that also have their own
ioctl interface that gets driven under the BKL - are independent patches.
Your patch did one of the au1550_ac97 'struct file_operations', but missed
another one. I had missed both, because they had that crazy old-style
initializer, and didn't trigger a compile on x86-64. But I fixed the other
one too.
So now the thing looks like the appended.
Linus
---
commit 6a0ac48e0ac12430a72189c06af02a16f4c4b3f2
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date: Mon Apr 26 10:55:23 2010 -0700
Rename 'struct file_operations' 'ioctl' fn pointer to 'bkl_ioctl'
This makes it much easier to grep for the use of implicit BKL usage in
the traditional 'ioctl' function pointer.
The patch was mostly auto-generated with the following shell scripting:
# Create an approximate file list of ioctl initializers
git grep -w -20 file_operations |
grep '\.ioctl[ ]*=' |
sed 's/-[ ]*\.ioctl.*$//' |
sort -u > f
# Run a sed script to replace 'ioctl' with 'bkl_ioctl'
cat f | while read i; do
j="$i.sed"
sed 's/\.ioctl([ ]*)=/.bkl_ioctl\1=/' < $i > $j
mv $j $i
done
followed by some similarly trivial scripting to then mostly fix up
places that tried to align all the initializers with tabs or spaces.
Finally, some manual fixup, and testing of the end result with a
allmodconfig make (with 'staging' drivers explicitly enabled).
[ Two missing conversions added by Arnd Bergmann ]
Signed-off-by: Arnd Bergmann <arnd@arndb.de> [ vfs.txt and au1550_ac97.c ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
Documentation/filesystems/vfs.txt | 2 +-
...Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
drivers/staging/crystalhd/crystalhd_lnx.c | 13 +++++++++----
drivers/staging/dt3155/dt3155_drv.c | 24 ++++++++++++++++++------
drivers/staging/poch/poch.c | 17 +++++++++++++++--
drivers/staging/vme/devices/vme_user.c | 18 +++++++++++++++---
4 files changed, 57 insertions(+), 15 deletions(-)
diff --git a/drivers/staging/crystalhd/crystalhd_lnx.c b/drivers/staging/crystalhd/crystalhd_lnx.c
index 54bad65..3291e14 100644
--- a/drivers/staging/crystalhd/crystalhd_lnx.c
+++ b/drivers/staging/crystalhd/crystalhd_lnx.c
@@ -16,6 +16,7 @@
***************************************************************************/
#include <linux/version.h>
+#include <linux/smp_lock.h>
#include <linux/slab.h>
#include "crystalhd_lnx.h"
@@ -261,12 +262,12 @@ static int chd_dec_api_cmd(struct crystalhd_adp *adp, unsigned long ua,
}
/* API interfaces */
-static int chd_dec_ioctl(struct inode *in, struct file *fd,
- unsigned int cmd, unsigned long ua)
+static long chd_dec_ioctl(struct file *fd, unsigned int cmd, unsigned long ua)
{
struct crystalhd_adp *adp = chd_get_adp();
crystalhd_cmd_proc cproc;
struct crystalhd_user *uc;
+ int ret;
if (!adp || !fd) {
BCMLOG_ERR("Invalid adp\n");
@@ -279,13 +280,17 @@ static int chd_dec_ioctl(struct inode *in, struct file *fd,
return -ENODATA;
}
+ lock_kernel();
cproc = crystalhd_get_cmd_proc(&adp->cmds, cmd, uc);
if (!cproc) {
BCMLOG_ERR("Unhandled command: %d\n", cmd);
+ unlock_kernel();
return -EINVAL;
}
- return chd_dec_api_cmd(adp, ua, uc->uid, cmd, cproc);
+ ret = chd_dec_api_cmd(adp, ua, uc->uid, cmd, cproc);
+ unlock_kernel();
+ return ret;
}
static int chd_dec_open(struct inode *in, struct file *fd)
@@ -345,7 +350,7 @@ static int chd_dec_close(struct inode *in, struct file *fd)
static const struct file_operations chd_dec_fops = {
.owner = THIS_MODULE,
- .ioctl = ...Hi Greg, If you are fine with this patch, I think it would be better if you get it in your tree. --
Ok, I'll queue it up (note, the poch driver is now gone in the staging tree.) thanks, greg k-h --
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
drivers/scsi/3w-9xxx.c | 10 +++++++---
drivers/scsi/3w-sas.c | 7 +++++--
drivers/scsi/3w-xxxx.c | 10 +++++++---
drivers/scsi/aacraid/linit.c | 11 ++++++++---
drivers/scsi/dpt_i2o.c | 20 +++++++++++++++++---
drivers/scsi/gdth.c | 20 +++++++++++++++-----
drivers/scsi/megaraid.c | 20 +++++++++++++++++---
drivers/scsi/megaraid/megaraid_mm.c | 22 +++++++++++++++++-----
drivers/scsi/osst.c | 14 ++++++++++----
drivers/scsi/sg.c | 17 ++++++++++++++---
10 files changed, 117 insertions(+), 34 deletions(-)
diff --git a/drivers/scsi/3w-9xxx.c b/drivers/scsi/3w-9xxx.c
index e9788f5..4f74850 100644
--- a/drivers/scsi/3w-9xxx.c
+++ b/drivers/scsi/3w-9xxx.c
@@ -123,7 +123,7 @@ static void twa_aen_queue_event(TW_Device_Extension *tw_dev, TW_Command_Apache_H
static int twa_aen_read_queue(TW_Device_Extension *tw_dev, int request_id);
static char *twa_aen_severity_lookup(unsigned char severity_code);
static void twa_aen_sync_time(TW_Device_Extension *tw_dev, int request_id);
-static int twa_chrdev_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned long arg);
+static long twa_chrdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
static int twa_chrdev_open(struct inode *inode, struct file *file);
static int twa_fill_sense(TW_Device_Extension *tw_dev, int request_id, int copy_sense, int print_host);
static void twa_free_request_id(TW_Device_Extension *tw_dev,int request_id);
@@ -218,7 +218,7 @@ static struct device_attribute *twa_host_attrs[] = {
/* File operations struct for character device */
static const struct file_operations twa_fops = {
.owner = THIS_MODULE,
- .ioctl = twa_chrdev_ioctl,
+ .unlocked_ioctl = twa_chrdev_ioctl,
.open = twa_chrdev_open,
.release = NULL
};
@@ -635,8 +635,9 @@ out:
} /* End twa_check_srl() */
/* This ...Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
drivers/isdn/capi/capi.c | 17 ++++++++++++++---
drivers/isdn/divert/divert_procfs.c | 19 ++++++++++++++++---
drivers/isdn/i4l/isdn_common.c | 18 +++++++++++++++---
drivers/isdn/mISDN/timerdev.c | 10 ++++++----
4 files changed, 51 insertions(+), 13 deletions(-)
diff --git a/drivers/isdn/capi/capi.c b/drivers/isdn/capi/capi.c
index ee58375..0cabe31 100644
--- a/drivers/isdn/capi/capi.c
+++ b/drivers/isdn/capi/capi.c
@@ -787,8 +787,7 @@ capi_poll(struct file *file, poll_table * wait)
}
static int
-capi_ioctl(struct inode *inode, struct file *file,
- unsigned int cmd, unsigned long arg)
+capi_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
struct capidev *cdev = file->private_data;
capi_ioctl_struct data;
@@ -981,6 +980,18 @@ register_out:
}
}
+static long
+capi_unlocked_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+ int ret;
+
+ lock_kernel();
+ ret = capi_ioctl(file, cmd, arg);
+ unlock_kernel();
+
+ return ret;
+}
+
static int capi_open(struct inode *inode, struct file *file)
{
struct capidev *cdev;
@@ -1026,7 +1037,7 @@ static const struct file_operations capi_fops =
.read = capi_read,
.write = capi_write,
.poll = capi_poll,
- .ioctl = capi_ioctl,
+ .unlocked_ioctl = capi_unlocked_ioctl,
.open = capi_open,
.release = capi_release,
};
diff --git a/drivers/isdn/divert/divert_procfs.c b/drivers/isdn/divert/divert_procfs.c
index 9f49d90..8084557 100644
--- a/drivers/isdn/divert/divert_procfs.c
+++ b/drivers/isdn/divert/divert_procfs.c
@@ -11,6 +11,7 @@
#include <linux/module.h>
#include <linux/poll.h>
+#include <linux/smp_lock.h>
#include <linux/slab.h>
#ifdef CONFIG_PROC_FS
#include <linux/proc_fs.h>
@@ -178,8 +179,7 @@ isdn_divert_close(struct inode *ino, struct file *filep)
/* IOCTL */
/*********/
static int
-isdn_divert_ioctl(struct inode *inode, struct file *file,
- uint ...This is half the work of getting rid of the BKL in the ioctl file operation, the rest in arch/ and fs/ still needs to be done, maybe two more hours of work (for someone else than me ;-)). Pushdown is straightforward. In many cases, it's rather obvious that the BKL is not needed at all, but let's not mix the removal with the pushdown. Arnd Bergmann (6): dvb: push down BKL into ioctl functions scsi: push down BKL into ioctl functions isdn: push down BKL into ioctl functions staging: push down BKL into ioctl functions v4l: always use unlocked_ioctl drivers: push down BKL into various drivers drivers/block/pktcdvd.c | 13 ++++++-- drivers/char/apm-emulation.c | 8 +++-- drivers/char/applicom.c | 13 +++++--- drivers/char/ds1620.c | 16 ++++++++- drivers/char/dtlk.c | 15 +++++---- drivers/char/generic_nvram.c | 17 ++++++++-- drivers/char/genrtc.c | 16 ++++++++- drivers/char/hpet.c | 14 +++++--- drivers/char/i8k.c | 21 ++++++++++-- drivers/char/ipmi/ipmi_devintf.c | 26 +++++++++++++--- drivers/char/ipmi/ipmi_watchdog.c | 17 +++++++++- drivers/char/nvram.c | 10 ++++-- drivers/char/nwflash.c | 7 +++- drivers/char/raw.c | 42 ++++++++++++++----------- drivers/hwmon/fschmd.c | 9 +++-- drivers/hwmon/w83793.c | 10 ++++-- drivers/input/misc/hp_sdc_rtc.c | 34 ++++++++++++++------ drivers/isdn/capi/capi.c | 17 ++++++++-- drivers/isdn/divert/divert_procfs.c | 19 ++++++++++-- drivers/isdn/i4l/isdn_common.c | 18 +++++++++-- drivers/isdn/mISDN/timerdev.c | 10 ++++-- drivers/macintosh/nvram.c | 2 +- ...
Hi Arnd. These pushdowns conflicted with Linus's renaming. In addition I found a few errors due to the change to unlocked_ioctl. I will resubmit fixes to these problems in a new mail. --
This requires changing all users of dvb_usercopy to
omit the inode argument.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
drivers/media/dvb/dvb-core/dmxdev.c | 31 +++++++++++++++++++-------
drivers/media/dvb/dvb-core/dvb_ca_en50221.c | 17 ++++++++++----
drivers/media/dvb/dvb-core/dvb_frontend.c | 30 +++++++++++++-------------
drivers/media/dvb/dvb-core/dvb_net.c | 15 +++++++++---
drivers/media/dvb/dvb-core/dvbdev.c | 17 +++++++++-----
drivers/media/dvb/dvb-core/dvbdev.h | 11 +++------
drivers/media/dvb/firewire/firedtv-ci.c | 5 +--
drivers/media/dvb/ttpci/av7110.c | 4 +-
drivers/media/dvb/ttpci/av7110_av.c | 8 +++---
drivers/media/dvb/ttpci/av7110_ca.c | 5 +--
10 files changed, 85 insertions(+), 58 deletions(-)
diff --git a/drivers/media/dvb/dvb-core/dmxdev.c b/drivers/media/dvb/dvb-core/dmxdev.c
index 9ddc579..425862f 100644
--- a/drivers/media/dvb/dvb-core/dmxdev.c
+++ b/drivers/media/dvb/dvb-core/dmxdev.c
@@ -25,6 +25,7 @@
#include <linux/slab.h>
#include <linux/vmalloc.h>
#include <linux/module.h>
+#include <linux/smp_lock.h>
#include <linux/poll.h>
#include <linux/ioctl.h>
#include <linux/wait.h>
@@ -963,7 +964,7 @@ dvb_demux_read(struct file *file, char __user *buf, size_t count,
return ret;
}
-static int dvb_demux_do_ioctl(struct inode *inode, struct file *file,
+static int dvb_demux_do_ioctl(struct file *file,
unsigned int cmd, void *parg)
{
struct dmxdev_filter *dmxdevfilter = file->private_data;
@@ -1084,10 +1085,16 @@ static int dvb_demux_do_ioctl(struct inode *inode, struct file *file,
return ret;
}
-static int dvb_demux_ioctl(struct inode *inode, struct file *file,
- unsigned int cmd, unsigned long arg)
+static long dvb_demux_ioctl(struct file *file, unsigned int cmd,
+ unsigned long arg)
{
- return dvb_usercopy(inode, file, cmd, arg, dvb_demux_do_ioctl);
+ int ret;
+
+ lock_kernel();
+ ret = ...v4l drivers may still use a locked ioctl method,
but we now always export an unlocked_ioctl and
lock ourselves, so that the ->ioctl file operation
can get removed.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
drivers/media/video/v4l2-dev.c | 17 ++++++++++++++---
1 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/drivers/media/video/v4l2-dev.c b/drivers/media/video/v4l2-dev.c
index 7090699..3606694 100644
--- a/drivers/media/video/v4l2-dev.c
+++ b/drivers/media/video/v4l2-dev.c
@@ -22,6 +22,7 @@
#include <linux/mm.h>
#include <linux/string.h>
#include <linux/errno.h>
+#include <linux/smp_lock.h>
#include <linux/init.h>
#include <linux/kmod.h>
#include <linux/slab.h>
@@ -215,16 +216,21 @@ static unsigned int v4l2_poll(struct file *filp, struct poll_table_struct *poll)
return vdev->fops->poll(filp, poll);
}
-static int v4l2_ioctl(struct inode *inode, struct file *filp,
+static long v4l2_ioctl(struct file *filp,
unsigned int cmd, unsigned long arg)
{
struct video_device *vdev = video_devdata(filp);
+ int ret;
if (!vdev->fops->ioctl)
return -ENOTTY;
/* Allow ioctl to continue even if the device was unregistered.
Things like dequeueing buffers might still be useful. */
- return vdev->fops->ioctl(filp, cmd, arg);
+ lock_kernel();
+ ret = vdev->fops->ioctl(filp->f_path.dentry->d_inode, filp, cmd, arg);
+ unlock_kernel();
+
+ return ret;
}
static long v4l2_unlocked_ioctl(struct file *filp,
@@ -323,6 +329,11 @@ static const struct file_operations v4l2_unlocked_fops = {
.llseek = no_llseek,
};
+/*
+ * Note: this should not be needed, just check
+ * both pointers in v4l2_ioctl, or kill
+ * fops->ioctl. -arnd
+ */
static const struct file_operations v4l2_fops = {
.owner = THIS_MODULE,
.read = v4l2_read,
@@ -330,7 +341,7 @@ static const struct file_operations v4l2_fops = {
.open = v4l2_open,
.get_unmapped_area = v4l2_get_unmapped_area,
.mmap = v4l2_mmap,
- .ioctl = ...These are the last remaining device drivers using the ->ioctl file operation in the drivers directory. Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- drivers/block/pktcdvd.c | 13 ++++++++-- drivers/char/apm-emulation.c | 8 ++++-- drivers/char/applicom.c | 13 ++++++---- drivers/char/ds1620.c | 16 +++++++++++- drivers/char/dtlk.c | 15 ++++++------ drivers/char/generic_nvram.c | 17 +++++++++++-- drivers/char/genrtc.c | 16 +++++++++++- drivers/char/hpet.c | 14 +++++++---- drivers/char/i8k.c | 21 ++++++++++++++--- drivers/char/ipmi/ipmi_devintf.c | 26 +++++++++++++++++---- drivers/char/ipmi/ipmi_watchdog.c | 17 ++++++++++++- drivers/char/nvram.c | 10 ++++++-- drivers/char/nwflash.c | 7 ++++- drivers/char/raw.c | 42 ++++++++++++++++++++--------------- drivers/hwmon/fschmd.c | 9 ++++--- drivers/hwmon/w83793.c | 10 +++++--- drivers/input/misc/hp_sdc_rtc.c | 34 ++++++++++++++++++++-------- drivers/macintosh/nvram.c | 2 +- drivers/macintosh/via-pmu.c | 17 +++++++++++-- drivers/mtd/mtdchar.c | 19 +++++++++++---- drivers/pcmcia/pcmcia_ioctl.c | 17 +++++++++++-- drivers/rtc/rtc-m41t80.c | 16 +++++++++++- drivers/sbus/char/openprom.c | 44 +++++++++++++++++++++---------------- drivers/usb/mon/mon_bin.c | 23 ++++++++++++++----- drivers/usb/mon/mon_stat.c | 3 +- 25 files changed, 307 insertions(+), 122 deletions(-) diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c index ddf1942..84a5658 100644 --- a/drivers/block/pktcdvd.c +++ b/drivers/block/pktcdvd.c @@ -55,6 +55,7 @@ #include <linux/proc_fs.h> #include <linux/seq_file.h> #include <linux/miscdevice.h> +#include <linux/smp_lock.h> #include <linux/freezer.h> #include <linux/mutex.h> #include <linux/slab.h> @@ ...
From: Linus Torvalds <torvalds@linux-foundation.org> "allmodconfig" on sparc64 passes with this patch too, just FYI... --
Thanks! I've queued it for the next merge window in git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git bkl/ioctl --
Btw, I hope you took the second version, that had the two additional fixes from Arnd (and my expansion of his fix to au1550_ac97.c). Looking at the arch code, I doubt there are any big architecture-specific things. But there could easily be some other drivers like au1550_ac97.c that only get enabled on certain architectures and missed the grepping for some reason. That said, because of Arnd's fix, I did end up grepping for 'file_operations' and old-style gcc initializers (ie "ioctl: xyz" rather than the proper ".ioctl = xyz"), and the grep came up empty. But it's possible that there's something hiding: with all of serial, bluetooth, block drivers, sound, socket proto's and v4l2 each having their own 'ioctl' pointers, it's not entirely trivial to grep for it all and be sure.. So there might be one or two cases still hiding, but it looks unlikely. And I'm almost certain that it definitely isn't more than just one or two. Linus --
The scheduler will be helped most by getting rid of the BKL altogether. We are
in reaching distance of that now ...
Once that state is achived we can just get rid of the BKL and mass-push
per-driver mutexes into those remaining drivers - in a possibly scripted way.
Something like:
foo-driver.c
DEFINE_MUTEX(foo_mutex);
foo_ioctl()
{
mutex_lock(&foo_mutex);
...
mutex_unlock(&foo_mutex);
}
foo_open()
{
mutex_lock(&foo_mutex);
...
mutex_unlock(&foo_mutex);
}
This could be done all automated for a hundred old drivers if need to be.
There would be no bkl_ioctl's left.
That, even if it looks somewhat coarse is still better than having _yet
another_ 'temporary transition'. The Big Kernel Lock was supposed to be
transitionary to begin with. It's been 10+ years and counting :-)
Ingo
--
I don't think it can be fully automated. For the majority of the modules, your approach would work fine, but there are still the well-known pitfalls in corner cases: - recursive uses in functions outside of ioctl (possibly none left after the TTY layer is done, but who knows) - lock-order problems with other mutexes (see DRM) - code that depends on autorelease to allow one ioctl while another is sleeping. (a small number of drivers) Semi-automated should be fine though. These rules are relatively easy to check, so we can mass-convert all the trivial cases. Examples for nontrivial modules are mostly file systems, see ncpfs, I think the immediate goal should be to get the BKL out of everthing that's either used by real people or that's reasonably easy to do. We have patches for almost all of these now [1], and I've been running a kernel with CONFIG_BKL=n for a few weeks now. As we progress through the remaining modules, an increasing number of systems can run without this. I see the next steps as: 1. make it possible to build a kernel without BKL, by removing the BKL from all core components for good, and making all users depend on CONFIG_BKL. Make it default y and put it under CONFIG_DEBUG. 2. Remove the BKL from all modules that are either easy to convert to a private mutex or that are relevant to real users (e.g. definitely DRM, but maybe not hpfs). 3. Change CONFIG_BKL to "default n, depends on DEPRECATED" 4. Remove the remaining modules that nobody knows how to fix or cares about. Possibly the number of modules here is close to zero. 5. Remove the implementation of the BKL, since no users are left. Getting stage 1 done for 2.6.36 or even 2.6.35 should be possible but still needs a lot of work. I think we should focus on that for now and then see how much is left to do for the other stages. This is still a temporary transition, but since we can't do all at once, I don't see better way. Arnd [1] ...
Corner cases are not a problem as long as the risk of them going unnoticed is Not a problem even if there's any such usage left: lockdep will sort those out This is a real issue, and in fact it's an unknown: there may be an unknown number of random sleep points within BKL codepaths that is being relied on in creative ways. Note that by introducing a mutex we (in most cases) make the locking _stricter_, so the biggest risk from that is a lockup - which will be debuggable via lockdep. Ingo --
And the hung task detector too which is the last resort to detect uncovered resource dependencies (was really useful for reiserfs). But the problem is among those people who may use such ancient drivers, I guess few of them will have those debug config enabled. And because there are almost no testers of these drivers, nobody/few will ever So, as explained above, lockdep won't even help here. I mean, for callsites that are obvious, say when it is clear that the bkl is leaf lock or doesn't introduce uncovered resource dependencies due to non-release on sleep, we should do such conversion. And I guess most drivers that use the bkl follow this scheme. But for the others (rares I think), the operation looks unsafe to me. If we don't have the hardware to test the driver, then lockdep and hung task detectors are going to be useless. That said, once we reach that point with 4 users of bkl remaining, may be that will be time to buy such hardware for a symbolic $1 in obscure places and do the tests. Or just git-rm if we are too lazy. --
Precisely. Note that the actual value of BKL-covered code shrinks every time we push the BKL out of commonly used code. So even if there's still usage sites around, there will be a point where it makes sense to just desupport it or do a final, crude conversion. If there are no active users who are willing to help us debug potential bugs then there's frankly no value in us being just as careful with those drivers as we are with other, more commonly used code. For the rest, lockdep & softlockup detector will be plenty enough to debug any bugs that may slip through - just like you did it with ReiserFS. Ingo --
I actually support your idea of automatic conversion of modules in the way you lined out, but would not want to apply it to the entire kernel at once, but only to modules that were manually inspected to be safe for doing this. This minimizes both the risk from corner cases and the risk from manual conversion. Arnd --
