OK, since the previous announcement, I've revisited all of the open()
functions which didn't get lock_kernel() calls the first time around.
Alan pointed out that even a completely empty open() might, in fact,
need to acquire the BKL, so now they all do. Hopefully, this completes
this work (at this level - there's plenty of down-pushing to do within
subsystems).There's a new tree with this stuff:
git://git.lwn.net/linux-2.6.git bkl-removal
Stephen, might it be about time to pull this into linux-next and see
what explodes?If others have BKL-removal patches which look like 2.6.27 material, I'll
happily collect them in this tree.Thanks,
jon
P.S. Here's what's in it now:
arch/cris/arch-v10/drivers/eeprom.c | 4 +-
arch/cris/arch-v10/drivers/gpio.c | 3 ++
arch/cris/arch-v10/drivers/i2c.c | 2 +
arch/cris/arch-v10/drivers/sync_serial.c | 34 ++++++++++++++----------
arch/cris/arch-v32/drivers/cryptocop.c | 3 +-
arch/cris/arch-v32/drivers/i2c.c | 2 +
arch/cris/arch-v32/drivers/mach-a3/gpio.c | 4 ++
arch/cris/arch-v32/drivers/mach-fs/gpio.c | 5 ++-
arch/cris/arch-v32/drivers/sync_serial.c | 33 ++++++++++++++---------
arch/mips/kernel/rtlx.c | 7 ++++
arch/mips/kernel/vpe.c | 12 ++++++--
arch/mips/sibyte/common/sb_tbprof.c | 25 ++++++++++++-----
arch/sh/boards/landisk/gio.c | 10 ++++---
arch/x86/kernel/cpuid.c | 25 ++++++++++++-----
arch/x86/kernel/msr.c | 16 ++++++++---
block/bsg.c | 7 ++++
drivers/block/aoe/aoechr.c | 7 ++++
drivers/block/paride/pg.c | 22 +++++++++++----
drivers/block/paride/pt.c | 8 ++++-
drivers/char/cs5535_gpio.c | 2 +
drivers/char/drm/drm_fops.c | 9 ++++--
drivers/char/dsp56k...
I have yet to look at these drivers in detail whether they need BKL or
drivers/firewire/fw-cdev.c::fw_device_op_open() does not need the BKL.
--
Stefan Richter
-=====-==--- -=-= =--==
http://arcgraph.de/sr/
--
video1394 needs it to serialize module init vs. open. The race
condition there can be prevented by splitting hpsb_register_highlevel()
into an _init and a _register function. I will follow up with a patch.dv1394 and raw1394 do not need the BKL in their open() methods AFAICS.
--
Stefan Richter
-=====-==--- -=-= =--==
http://arcgraph.de/sr/
--
From: Stefan Richter <stefanr@s5r6.in-berlin.de>
Subject: ieee1394: video1394: reorder module init, prepare BKL removalThis prepares video1394 for removal of the BKL (big kernel lock):
It allows video1394_open() to be called while video1394_init_module()
is still in progress.Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
---
drivers/ieee1394/highlevel.c | 4 +---
drivers/ieee1394/highlevel.h | 13 ++++++++++++-
drivers/ieee1394/video1394.c | 2 ++
3 files changed, 15 insertions(+), 4 deletions(-)Index: linux/drivers/ieee1394/highlevel.c
===================================================================
--- linux.orig/drivers/ieee1394/highlevel.c
+++ linux/drivers/ieee1394/highlevel.c
@@ -228,10 +228,8 @@ void hpsb_register_highlevel(struct hpsb
{
unsigned long flags;+ hpsb_init_highlevel(hl);
INIT_LIST_HEAD(&hl->addr_list);
- INIT_LIST_HEAD(&hl->host_info_list);
-
- rwlock_init(&hl->host_info_lock);down_write(&hl_drivers_sem);
list_add_tail(&hl->hl_list, &hl_drivers);
Index: linux/drivers/ieee1394/highlevel.h
===================================================================
--- linux.orig/drivers/ieee1394/highlevel.h
+++ linux/drivers/ieee1394/highlevel.h
@@ -2,7 +2,7 @@
#define IEEE1394_HIGHLEVEL_H#include <linux/list.h>
-#include <linux/spinlock_types.h>
+#include <linux/spinlock.h>
#include <linux/types.h>struct module;
@@ -103,6 +103,17 @@ int highlevel_lock64(struct hpsb_host *h
void highlevel_fcp_request(struct hpsb_host *host, int nodeid, int direction,
void *data, size_t length);+/**
+ * hpsb_init_highlevel - initialize a struct hpsb_highlevel
+ *
+ * This is only necessary if hpsb_get_hostinfo_bykey can be called
+ * before hpsb_register_highlevel.
+ */
+static inline void hpsb_init_highlevel(struct hpsb_highlevel *hl)
+{
+ rwlock_init(&hl->host_info_lock);
+ INIT_LIST_HEAD(&hl->host_info_list);
...
i'm wondering how this should best interact with the kill-the-BKL bits
that attack it all from the infrastructure side. Your bits are the safer
ones, so i guess i'll try to track your changes from my branch to
increase testing of it all?Ingo
--
Hi Jon,
I have added that to today's build right at the end so that if it
explodes too badly, it will be removed again.I haven't been following this too closely, so how much of this stuff has
been posted/review/tested?--=20
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
This is the second posting (some of the changesets are new this time
around). It all builds and runs for me, but a lot of this work affects
some of the oldest, most cobweb-filled code around, and I sure don't
have that hardware. Some others have looked at, but I'm not sure how
deeply.The basic concept of this work is straightforward, so I have good reason
to hope I've not done anything really stupid.jon
--
Great work, Jon! It's really cool to see some real momentum towards
getting rid of the BKL at last.> drivers/infiniband/core/ucm.c | 2 +
> drivers/infiniband/core/user_mad.c | 7 ++++
> drivers/infiniband/core/uverbs_main.c | 9 ++++--
> drivers/infiniband/hw/ipath/ipath_file_ops.c | 2 +All of these changes look fine from a pure "push the BKL down" point of
view. However I am 99% sure no BKL use is required in any of these (and
I will think deeper to get another .9% surer tomorrow).Is the plan that we have a pure "push the BKL down" changeset merged,
and then I can merge BKL removal patches for these places that never
needed the BKL? (I guess I can send you such a patch to base on top of
your tree for when Linus pulls it? Is 2.6.27 the plan?) The
alternative is to never add the BKL to these places as part of this
patch -- which seems to be a bad, risky plan, since if any mistakes are
made, then bisection just lands on some giant patch.Thanks,
Roland
--
If you're sure that this code doesn't need the BKL (and it kind of
looked that way to me), the preferred approach seems to be to put in a
comment to that effect so that it's clear that the code has been looked
at. So sending me a patch which does this would be great. Otherwise,
if you're willing to swear on top of a stack of Knuth output that the
BKL is not needed for specific open functions, I can revert my patch
back out and put in the comment - whichever you prefer.jon
--
> If you're sure that this code doesn't need the BKL (and it kind of
> looked that way to me), the preferred approach seems to be to put in a
> comment to that effect so that it's clear that the code has been looked
> at. So sending me a patch which does this would be great.OK, will send such a patch after auditing more carefully. Just to be
very clear, the issue is to make sure the locking is sufficient to
protect against multiple racing open calls to the same character device?- R.
--
That's part of it, but, as Alan pointed out, there's more. The BKL
currently protects open() calls against concurrency with other opens,
with ioctl(), and with driver initialization as well. So it's a matter
of having one's locking and ordering act together in general.jon
--
> That's part of it, but, as Alan pointed out, there's more. The BKL
> currently protects open() calls against concurrency with other opens,
> with ioctl(), and with driver initialization as well. So it's a matter
> of having one's locking and ordering act together in general.Thanks. Just to be super-explicit, ioctl() cannot be called on a given
file until the open() for that particular file has returned, right?And the point about driver initialization is that open() can be called
as soon as the file operations are registered, even if the module_init
function has not returned?Thanks,
Roland
--
Right. Or two ioctls against each other unless one sleepers, or against
random other parts of the kernel which still use the BKL - eg fasync,Exactly.
Alan
--
ioctl() will not be called on a given file descriptor before open() is
done, no. If there are other file descriptors open, though, somebody
can be calling ioctl() on them while the open() for the new one isThat is the point, yes. The key there is to avoid registering the
device before everything has been set up to actually manage the device.jon
--
There's the case where one thread is calling ioctl on the new
descriptor before open (in other thread) has returned (it's malicious
and trying to oops you, it's accidentally trying to operate on a
closed descriptor, etc). It might hit the window between the
descripor being installed and open returning.Jeff
--
Work email - jdike at linux dot intel dot com
--
