Re: [PATCH 6/6] procfs: Kill the bkl in ioctl

Previous thread: [PATCH]vmscan: handle underflow for get_scan_ratio by Shaohua Li on Monday, March 29, 2010 - 10:53 pm. (49 messages)

Next thread: linux-next: manual merge of the slabh tree with the percpu tree by Stephen Rothwell on Monday, March 29, 2010 - 11:22 pm. (2 messages)
From: Frederic Weisbecker
Date: Monday, March 29, 2010 - 11:20 pm

Hi,

This patchset drops any use of the big kernel lock from procfs.
May be there is still one or two NULL llseek somewhere but I
think we have fixed 99% of those that were in the procfs core.

Users of procfs implementing an ioctl have not been easy to
spot automatically (there are hundreds of procfs users)
as there are many indirect ways to register a procfs, depending
on the subsystem you are.

So for those who want to verify the reliability of this check,
you can look at the script there:

	http://tglx.de/~fweisbec/seek.py

Beware it's very dirty! The hardcoded path are those I had
to check manually (or that I added to the automatic check).
One day I should learn how to use Coccinelle instead.

In the worst case, the remaining ones this script or my eyes
forgot will trigger a warning.

Thanks.


Arnd Bergmann (1):
  procfs: Kill BKL in llseek on proc base

Frederic Weisbecker (5):
  procfs: Use generic_file_llseek in /proc/kcore
  procfs: Use generic_file_llseek in /proc/kmsg
  procfs: Use generic_file_llseek in /proc/vmcore
  procfs: Push down the bkl from ioctl
  procfs: Kill the bkl in ioctl

 drivers/char/i8k.c                  |   53 +++++++++++++++------
 drivers/isdn/divert/divert_procfs.c |   90 ++++++++++++++++++++++-------------
 fs/proc/base.c                      |   10 ++++-
 fs/proc/inode.c                     |    4 +-
 fs/proc/kcore.c                     |    1 +
 fs/proc/kmsg.c                      |    1 +
 fs/proc/vmcore.c                    |    1 +
 net/sunrpc/cache.c                  |   20 ++++++--
 8 files changed, 123 insertions(+), 57 deletions(-)

--

From: Frederic Weisbecker
Date: Monday, March 29, 2010 - 11:20 pm

From: Arnd Bergmann <arnd@arndb.de>

We don't use the BKL elsewhere, so use generic_file_llseek
so we can avoid default_llseek taking the BKL.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[restore proc_fdinfo_file_operations as non-seekable]
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
---
 fs/proc/base.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index a731084..95d91cf 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -728,6 +728,7 @@ out_no_task:
 
 static const struct file_operations proc_info_file_operations = {
 	.read		= proc_info_read,
+	.llseek		= generic_file_llseek,
 };
 
 static int proc_single_show(struct seq_file *m, void *v)
@@ -985,6 +986,7 @@ out_no_task:
 
 static const struct file_operations proc_environ_operations = {
 	.read		= environ_read,
+	.llseek		= generic_file_llseek,
 };
 
 static ssize_t oom_adjust_read(struct file *file, char __user *buf,
@@ -1058,6 +1060,7 @@ static ssize_t oom_adjust_write(struct file *file, const char __user *buf,
 static const struct file_operations proc_oom_adjust_operations = {
 	.read		= oom_adjust_read,
 	.write		= oom_adjust_write,
+	.llseek		= generic_file_llseek,
 };
 
 #ifdef CONFIG_AUDITSYSCALL
@@ -1129,6 +1132,7 @@ out_free_page:
 static const struct file_operations proc_loginuid_operations = {
 	.read		= proc_loginuid_read,
 	.write		= proc_loginuid_write,
+	.llseek		= generic_file_llseek,
 };
 
 static ssize_t proc_sessionid_read(struct file * file, char __user * buf,
@@ -1149,6 +1153,7 @@ static ssize_t proc_sessionid_read(struct file * file, char __user * buf,
 
 static const struct file_operations proc_sessionid_operations = {
 	.read		= ...
From: Alexey Dobriyan
Date: Monday, March 29, 2010 - 11:40 pm

There is no warning for default default_llseek case.
This should be done same way as proc ioctls.
--

From: Frederic Weisbecker
Date: Monday, March 29, 2010 - 11:50 pm

I don't think we should. We have overriden the llseek for
the procfs users located in the proc core (just fs/proc)
but we haven't touched all of the external users, and since
there are hundreds of them, I guess a lot don't implement
llseek.

We would need to first override those that are visible upstream
and warn for the further ones after this step only.

--

From: Frederic Weisbecker
Date: Monday, March 29, 2010 - 11:20 pm

/proc/vmcore has no llseek and then falls down to use default_llseek.
This is racy against read_vmcore() that directly manipulates fpos
but it doesn't hold the bkl there so using it in llseek doesn't
protect anything.

Let's use generic_file_llseek() instead.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
---
 fs/proc/vmcore.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/proc/vmcore.c b/fs/proc/vmcore.c
index 0872afa..f942ecb 100644
--- a/fs/proc/vmcore.c
+++ b/fs/proc/vmcore.c
@@ -162,6 +162,7 @@ static ssize_t read_vmcore(struct file *file, char __user *buffer,
 
 static const struct file_operations proc_vmcore_operations = {
 	.read		= read_vmcore,
+	.lseek		= generic_file_llseek,
 };
 
 static struct vmcore* __init get_new_element(void)
-- 
1.6.2.3

--

From: Arnd Bergmann
Date: Tuesday, March 30, 2010 - 3:38 am

Acked-by: Arnd Bergmann <arnd@arndb.de>
--

From: Frederic Weisbecker
Date: Monday, March 29, 2010 - 11:20 pm

There are no more users of procfs that implement the ioctl
callback. Drop the bkl from this path and warn on any use
of this callback.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
---
 fs/proc/inode.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 445a02b..afcda85 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -231,9 +231,9 @@ static long proc_reg_unlocked_ioctl(struct file *file, unsigned int cmd, unsigne
 		if (rv == -ENOIOCTLCMD)
 			rv = -EINVAL;
 	} else if (ioctl) {
-		lock_kernel();
+		WARN_ONCE(1, "Procfs ioctl handlers must use unlocked_ioctl, "
+			  "%pf will be called without the Bkl held\n", ioctl);
 		rv = ioctl(file->f_path.dentry->d_inode, file, cmd, arg);
-		unlock_kernel();
 	}
 
 	pde_users_dec(pde);
-- 
1.6.2.3

--

From: Alexey Dobriyan
Date: Monday, March 29, 2010 - 11:38 pm

Then delete the branch.
Or go through formal feature-removal procedure.
--

From: Frederic Weisbecker
Date: Tuesday, March 30, 2010 - 12:07 am

I thought about it. I even started to write something in this
feature-removal file but realized that I can't remove the
.ioctl() callback from file operations. We still need to check
the user hasn't made the mistake of implementing it.

What I can plan as a feature removal, though, is to keep the warning
but don't actually call the ioctl.

--

From: Arnd Bergmann
Date: Tuesday, March 30, 2010 - 3:33 am

I believe we can actually remove ioctl from file_operations. The patch I did
to convert all users to ".unlocked_ioctl = default_ioctl," should really catch
all cases, and I think we can enforce this by renaming fops->ioctl to locked_ioctl
or old_ioctl to make sure we didn't miss any, and then mandate that this one
is only used when unlocked_ioctl is set to default_ioctl.

I also remember going through procfs ioctl operations some time ago and finding
exactly three users, which I believe are the same ones that Frederic found.

	Arnd
--

From: Frederic Weisbecker
Date: Wednesday, March 31, 2010 - 10:22 am

I just looked at the patch in question and noted that the changelog
is pretty high, but how could it be else.
Actually it's not that large, but highly spread:

 Documentation/DocBook/kernel-hacking.tmpl   |    2 +-
 Documentation/filesystems/vfs.txt           |    3 +-
 arch/arm/kernel/etm.c                       |    1 +
 arch/cris/arch-v10/drivers/ds1302.c         |    3 ++
 arch/cris/arch-v10/drivers/gpio.c           |    2 +
 arch/cris/arch-v10/drivers/i2c.c            |    2 +
 arch/cris/arch-v10/drivers/pcf8563.c        |    3 ++
 arch/cris/arch-v10/drivers/sync_serial.c    |    4 ++-
 arch/cris/arch-v32/drivers/cryptocop.c      |    4 ++-
 arch/cris/arch-v32/drivers/i2c.c            |    2 +
 arch/cris/arch-v32/drivers/mach-a3/gpio.c   |    2 +
 arch/cris/arch-v32/drivers/mach-fs/gpio.c   |    2 +
 arch/cris/arch-v32/drivers/pcf8563.c        |    5 +++-
 arch/cris/arch-v32/drivers/sync_serial.c    |    4 ++-
 arch/ia64/kernel/perfmon.c                  |    2 +
 arch/ia64/sn/kernel/sn2/sn_hwperf.c         |    2 +
 arch/m68k/bvme6000/rtc.c                    |    2 +
 arch/m68k/mvme16x/rtc.c                     |    2 +
 arch/um/drivers/harddog_kern.c              |    2 +
 arch/um/drivers/hostaudio_kern.c            |    3 ++
 arch/um/drivers/mmapper_kern.c              |    3 ++
 drivers/block/DAC960.c                      |    3 +-
 drivers/block/paride/pg.c                   |    2 +
 drivers/block/paride/pt.c                   |    2 +
 drivers/block/pktcdvd.c                     |    3 ++
 drivers/char/apm-emulation.c                |    2 +
 drivers/char/applicom.c                     |    2 +
 drivers/char/ds1302.c                       |    1 +
 drivers/char/ds1620.c                       |    2 +
 drivers/char/dtlk.c                         |    2 +
 drivers/char/generic_nvram.c                |    2 +
 drivers/char/genrtc.c                       |    2 +
 drivers/char/i8k.c                          |    2 +
 drivers/char/ip2/ip2main.c                  ...
From: Arnd Bergmann
Date: Wednesday, March 31, 2010 - 1:21 pm

I don't think the warning helps all that much, at least not across an
entire release. We could leave it in for the merge window and fix all
users for -rc1, then submit a patch that kills everything that came
in during the merge window and remove it completely in -rc2.

Getting rid of ioctl completely is a lot of work though, covering the
entire lot of ~150 device drivers. I think the patch as is (or the
variant renaming .ioctl to .locked_ioctl) is far less work and has

I don't think we really need to get rid of it this soon in the obsolete
drivers, pushing down the BKL into an unlocked_ioctl function only slightly
shifts the problem around, since the driver still depends on the BKL then
and gets disabled if you build with CONFIG_BKL=n.

IMHO, a better use of your time would be to completely remove the BKL
along with the ioctl function from any of the drivers in this lists that
looks like it could be relevant to real users.

In the meantime, we can move the declaration of the .locked_ioctl callback
into an #ifdef CONFIG_BKL, to make sure nobody builds a driver with an
ioctl function that does not get called.

Another crazy idea I had was to simply turn the BKL into a regular mutex
as soon as we can show that all remaining users are of the non-recursive
kind and don't rely on the autorelease-on-sleep. Doing that would be
much easier without the pushdown into .unlocked_ioctl than it would be
with it.

	Arnd
--

From: Arnd Bergmann
Date: Wednesday, March 31, 2010 - 2:04 pm

I just looked at all the users of lock_kernel remaining with my patch
series. For 90% of them, it is completely obvious that they don't rely
on nested locking, and they very much look like they don't need the
autorelease either, because the BKL was simply pushed down into the
open, ioctl and llseek functions.

There are a few file systems (udf, ncpfs, autofs, coda, ...) and some
network protocols (appletalk, ipx, irnet and x25) for which it is not
obviously, though still quite likely, the case.

So we could actually remove the BKL recursion code soon, or even turn
all of it into a regular mutex, at least as an experimental option.

The recursive users that I've removed in my series are the block, tty,
input and sound subsystems, as well as the init code.

	Arnd
--

From: Alan Cox
Date: Wednesday, March 31, 2010 - 2:55 pm

There are some very subtle recursive cases in the tty code - hangup
triggered close and consoles being an absolute gem I've just had to debug
in my lock removal bits so far...

Alan
--

From: Arnd Bergmann
Date: Thursday, April 1, 2010 - 2:07 am

Yes, I've seen some of them. What I meant above is that with
CONFIG_TTY_MUTEX=y, the TTY code no longer uses the BKL in a
nested way, and quite likely no either code does either.

The TTY code with my patch now has tty_lock() for all cases that
I concluded are never nested in another tty_lock, and tty_lock_nested()
for those I did not understand or that I know they are nested (the
latter type usually comes with a comment). The only difference
between the two is a WARN_ON(tty_locked()) in tty_lock, so we can
see where the analysis was wrong.

	Arnd
--

From: Frederic Weisbecker
Date: Wednesday, March 31, 2010 - 2:56 pm

This is a solution that has been tried more than once already. But Linus
has told he wouldn't pull something that turns the bkl into a mutex or a
semaphore.

Plus it's quite hard to tell that it does or not auto-release somewhere
This is often something you can really spot on runtime or on small path
only.

The simple fact the bkl is not always a leaf lock makes it need the
auto-release, otherwise you experience very bad unexpected lock
dependencies.

--

From: Arnd Bergmann
Date: Thursday, April 1, 2010 - 4:37 am

Ok. Starting from the same observation of simplicity in the remaining code,
we should also be able to find a semi-automatic way of turning the BKL usage

Well, the autorelease by itself is not needed anywhere. What is needed

I'm arguing that we can probably show the BKL to be the outermost
lock for the majority of the remaining drivers, which only get it
from their open(), ioctl() or llseek() functions, which are all
called without any locks held. If the BKL is a regular mutex, lockdep
should warn of the other cases.

	Arnd
--

From: Frederic Weisbecker
Date: Wednesday, March 31, 2010 - 2:41 pm

Hmm, yeah you're right actually. Since we have this CONFIG_BKL thing
plus a future check to prevent from people implementing new ioctl
(checking ioctl without default_ioctl), it's actually better than


Ok, now how to get this all merged? A single monolithic patch is probably
not appropriate.

The simplest is to have a single branch with the default_ioctl implemented,
and then attributed to drivers in a set cut by subsystems/drivers. And
push the whole for the next -rc1.

The other solution is to push default_ioctl for this release and get
the driver changes to each concerned tree. That said, I suspect a good
part of them are unmaintained, hence the other solution looks better
to me.


Hmm?

--

From: Arnd Bergmann
Date: Thursday, April 1, 2010 - 5:42 am

I don't care much about the unmaintained parts, we can always have a
tree collecting all the patches for those drivers and merge it in -rc1.

I'd say the nicest way would be to get Linus to merge the patch
below now, so we can start queuing stuff in maintainer trees on top
of it, and check against linux-next what is still missing and push
all of them from our branch.

	Arnd

---
Subject: [PATCH] introduce CONFIG_BKL and default_ioctl

This is a preparation for the removal of the big kernel lock that
introduces new interfaces for device drivers still using it.

We can start marking those device drivers as 'depends on CONFIG_BKL'
now, and make that symbol optional later, when the point has come
at which we are able to build a kernel without the BKL.

Similarly, device drivers that currently make use of the implicit
BKL locking around the ioctl function can now get annotated by
changing

	.ioctl = foo_ioctl,

to

	.locked_ioctl = foo_ioctl,
	.unlocked_ioctl = default_ioctl,

As soon as no driver remains using the old ioctl callback, it can
get removed.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/ioctl.c               |   22 ++++++++++++++++++++++
 include/linux/fs.h       |    3 +++
 include/linux/smp_lock.h |    4 ++++
 lib/Kconfig.debug        |   10 ++++++++++
 4 files changed, 39 insertions(+), 0 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 6c75110..52c2698 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -58,6 +58,28 @@ static long vfs_ioctl(struct file *filp, unsigned int cmd,
 	return error;
 }
 
+#ifdef CONFIG_BKL
+/*
+ * default_ioctl - call unlocked_ioctl with BKL held
+ *
+ * Setting only the the ioctl operation but not unlocked_ioctl will become
+ * invalid in the future, all drivers that are not converted to unlocked_ioctl
+ * should set .unlocked_ioctl = default_ioctl now.
+ */
+long default_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+	int error = -ENOTTY;
+	if (filp->f_op->locked_ioctl) ...
From: Stefan Richter
Date: Saturday, April 3, 2010 - 10:53 am

Typo.  default_ioctl - call locked_ioctl with BKL held
-- 
Stefan Richter
-=====-==-=- -=-- ---==
http://arcgraph.de/sr/
--

From: Frederic Weisbecker
Date: Saturday, April 10, 2010 - 9:09 am

Do you mind if I rename this to deprecated_ioctl()?
This "default" naming suggests a fallback everyone that don't
need tricky things should use.

--

From: Arnd Bergmann
Date: Monday, April 12, 2010 - 8:05 am

Sounds good. The idea was to keep the naming consistent with
default_llseek, but after the discussion on llseek, we will
probably do something else with it anyway.

	Arnd
--

From: Frederic Weisbecker
Date: Saturday, April 10, 2010 - 9:14 am

Al, we would like to make this for 2.6.34, so that we can start
converting the drivers that use bkl'ed ioctl to this new scheme
in the relevant maintainers trees.

Can we get your ack?

Thanks.

--

From: Frederic Weisbecker
Date: Saturday, April 10, 2010 - 9:24 am

Al, we would like to make this for 2.6.34, so that we can start
converting the drivers that use bkl'ed ioctl to this new scheme
in the relevant maintainers trees.

Can we get your ack?

Thanks.

--

From: Stefan Richter
Date: Thursday, April 1, 2010 - 4:39 am

[Documentation/ arch/, drivers/, drivers/, and more drivers/, fs/,

A side note:  A considerable portion of this particular commit in Arnd's
git actually does not deal with .ioctl->.unlocked_ioctl at all, but
purely with .llseek.  Many(?) of these changes deal with .ioctl and
.llseek together.  (Arnd also says so in the last paragraph of his
changelog.)

IOW there are less .ioctl implementations left than one could think from
a look at the diffstat.
-- 
Stefan Richter
-=====-==-=- -=-- ----=
http://arcgraph.de/sr/
--

From: Arnd Bergmann
Date: Thursday, April 1, 2010 - 5:45 am

Given our recent discussions on the llseek topic, it's probably better to
revert most of the changes that purely deal with llseek. My current idea
is to use an explicit default_llseek only if one of the following is given:

- we convert ioctl to unlocked_ioctl in the same file_operations, or
- the module uses the big kernel lock explicitly elsewhere.

Even then, there may be a number of cases where we can show it not
to be necessary, e.g. when the driver does not care about f_pos.
Concurrent llseek is racy by nature, so in most drivers, using the
BKL in llseek does not gain anything over using i_mutex.

	Arnd
--

From: Frederic Weisbecker
Date: Saturday, April 10, 2010 - 8:28 am

So you mean we should attribute explicit default_llseek to the evil
places instead of explicit generic_file_llseek in the safe ones?
That's not a bad idea as it would result in much less changes.

The problem happens the day you switch to generic_file_llseek() as the
new default llseek(), how do you prove that all remaining fops
that don't implement .llseek don't use the bkl? There will be
hundreds of them and saying "we've looked all of them and they don't
need it" will be a scary justification.

On the opposite, attributing explicit generic_file_llseek or
non_seekable_open on the safe places and default_llseek on
the dozens of others doubtful places is easier to get a
safe conclusion.

But yeah we should try, at least attributing explicit
default_llseek won't harm, quite the opposite.

--

From: Christoph Hellwig
Date: Sunday, April 11, 2010 - 6:03 am

Note that an lssek that actually does something is the wrong default,
even if we have it that way currently.  If the default is changed it
should be changed to give the semantics that nonseekable_open()
gives us.  Given that you guys are so motivated to do something in
this area it might be a good idea to do this in a few simple steps:

 - make sure every file operation either has a ->llseek instead or
   calls nonseekable_open from ->open
 - remove nonseekable_open and all calls to it
 - switch all users of no_llseek to not set a ->llsek after auditing
   that there's no corner case where we want to allow pread/pwrite
   but not lseek, which is rather unlikely
 - walk through the instances now using default_llseek and chose
   a better implementation for this particular instance.  Often
   this will be just removing the the lssek method as not allowing
   seeks is the right thing to do for character drivers, even if it
   is a behaviour change from the current version which usually
   is the result of sloppy coding.

--

From: Arnd Bergmann
Date: Monday, April 12, 2010 - 10:34 am

I still think it would be better to always set llseek if we do that,
even if nonseekable_open is already there. I can come up with scripts
that check that case, but checking that the open function always
calls nonseekable_open when it returns success is beyond my grep


This part is really hard. While in many cases, the driver maintainer
might know what user space is potentially opening some character
device, it's really hard to tell for outsiders whether the behaviour
should be no_llseek (then the default) or noop_llseek to work around
broken user space.

I think the rule set for the conversion needs to be one that can
be done purely based on the code. How about this:

For each file operation {
	if (uses f_pos) {
		if (same module uses BKL)
			-> default_llseek
		else
			-> generic_file_llseek
	} else {
		if (driver maintained)
			-> no_llseek (with maintainer ACK)
		else
			-> noop_llseek
	}
}

Once that is done, we can turn the default into nonseekable
behavior and start removing instances of explicit no_llseek
and nonseekable_open.

Should we also rename default_llseek to deprecated_llseek in the
process, to go along with the approach for ioctl?

	Arnd
--

From: Frederic Weisbecker
Date: Monday, April 12, 2010 - 2:53 pm

Also even if llseek is useless for a module, turning it into
unseekable somehow changes the userspace ABI. I guess this
is harmless 99% of the time, but still. And maintainers tend



It is also hard to determine a given driver really doesn't use
the bkl. A sole lock_kernel() grep in its files is not sufficient.




Yeah, preferably.

Thanks.

--

From: Arnd Bergmann
Date: Tuesday, April 13, 2010 - 2:26 am

Why not? In my 2.6.33 based series, I have removed all implicit
uses of the BKL, so we can be sure that it doesn't use the BKL
unless the module is part of that series. The only two cases
I can think of are:

- ioctl callback, which we should do in the same change, like I
originally did. If a driver defines ->ioctl(), make it use
deprecated_ioctl() and default_llseek()/deprecated_llseek.

- Any of the file systems from Jan's series.

	Arnd
--

From: Frederic Weisbecker
Date: Tuesday, April 13, 2010 - 1:10 pm

Ok looks like a good plan then.

Thanks.

--

From: Christoph Hellwig
Date: Tuesday, April 13, 2010 - 11:03 am

Yes, it's not quite easily greppable.  Making no seek allowed the



I wouldn't bother.  If you can actually work on your plan default_llseek
should be gone soon enough.

--

From: Frederic Weisbecker
Date: Monday, March 29, 2010 - 11:20 pm

Push down the bkl from procfs's ioctl main handler to its users.
Only three procfs users implement an ioctl (non unlocked) handler.
Turn them into unlocked_ioctl and push down the Devil inside.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
---
 drivers/char/i8k.c                  |   53 +++++++++++++++------
 drivers/isdn/divert/divert_procfs.c |   90 ++++++++++++++++++++++-------------
 net/sunrpc/cache.c                  |   20 ++++++--
 3 files changed, 109 insertions(+), 54 deletions(-)

diff --git a/drivers/char/i8k.c b/drivers/char/i8k.c
index fc8cf7a..85f8893 100644
--- a/drivers/char/i8k.c
+++ b/drivers/char/i8k.c
@@ -23,6 +23,7 @@
 #include <linux/seq_file.h>
 #include <linux/dmi.h>
 #include <linux/capability.h>
+#include <linux/smp_lock.h>
 #include <asm/uaccess.h>
 #include <asm/io.h>
 
@@ -82,8 +83,7 @@ module_param(fan_mult, int, 0);
 MODULE_PARM_DESC(fan_mult, "Factor to multiply fan speed with");
 
 static int i8k_open_fs(struct inode *inode, struct file *file);
-static int i8k_ioctl(struct inode *, struct file *, unsigned int,
-		     unsigned long);
+static long i8k_ioctl(struct file *, unsigned int, unsigned long);
 
 static const struct file_operations i8k_fops = {
 	.owner		= THIS_MODULE,
@@ -91,7 +91,7 @@ static const struct file_operations i8k_fops = {
 	.read		= seq_read,
 	.llseek		= seq_lseek,
 	.release	= single_release,
-	.ioctl		= i8k_ioctl,
+	.unlocked_ioctl	= i8k_ioctl,
 };
 
 struct smm_regs {
@@ -307,8 +307,7 @@ static int i8k_get_dell_signature(int req_fn)
 	return regs.eax == 1145651527 && regs.edx == 1145392204 ? 0 : -1;
 }
 
-static int i8k_ioctl(struct inode *ip, struct file *fp, unsigned int cmd,
-		     unsigned long arg)
+static ...
From: Alexey Dobriyan
Date: Monday, March 29, 2010 - 11:31 pm

From: Frederic Weisbecker
Date: Tuesday, March 30, 2010 - 12:02 am

Yeah. It's a very rough pushdown, I haven't looked at any bit to figure
out what could need to be protected or not. I even did not know what does
PDE. So I kept a plain bkl path. I just thought any further thinking should
be done in a further patch.

But I can move the bkl after in this same patch if you prefer.

Thanks.

--

From: Frederic Weisbecker
Date: Friday, April 9, 2010 - 7:45 am

Push down the bkl from procfs's ioctl main handler to its users.
Only three procfs users implement an ioctl (non unlocked) handler.
Turn them into unlocked_ioctl and push down the Devil inside.

v2: PDE(inode)->data doesn't need to be under bkl

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
---
 drivers/char/i8k.c                  |   53 +++++++++++++++------
 drivers/isdn/divert/divert_procfs.c |   90 ++++++++++++++++++++++-------------
 net/sunrpc/cache.c                  |   20 ++++++--
 3 files changed, 109 insertions(+), 54 deletions(-)

diff --git a/drivers/char/i8k.c b/drivers/char/i8k.c
index fc8cf7a..85f8893 100644
--- a/drivers/char/i8k.c
+++ b/drivers/char/i8k.c
@@ -23,6 +23,7 @@
 #include <linux/seq_file.h>
 #include <linux/dmi.h>
 #include <linux/capability.h>
+#include <linux/smp_lock.h>
 #include <asm/uaccess.h>
 #include <asm/io.h>
 
@@ -82,8 +83,7 @@ module_param(fan_mult, int, 0);
 MODULE_PARM_DESC(fan_mult, "Factor to multiply fan speed with");
 
 static int i8k_open_fs(struct inode *inode, struct file *file);
-static int i8k_ioctl(struct inode *, struct file *, unsigned int,
-		     unsigned long);
+static long i8k_ioctl(struct file *, unsigned int, unsigned long);
 
 static const struct file_operations i8k_fops = {
 	.owner		= THIS_MODULE,
@@ -91,7 +91,7 @@ static const struct file_operations i8k_fops = {
 	.read		= seq_read,
 	.llseek		= seq_lseek,
 	.release	= single_release,
-	.ioctl		= i8k_ioctl,
+	.unlocked_ioctl	= i8k_ioctl,
 };
 
 struct smm_regs {
@@ -307,8 +307,7 @@ static int i8k_get_dell_signature(int req_fn)
 	return regs.eax == 1145651527 && regs.edx == 1145392204 ? 0 : -1;
 }
 
-static ...
From: Frederic Weisbecker
Date: Saturday, April 10, 2010 - 6:25 am

Push down the bkl from procfs's ioctl main handler to its users.
Only three procfs users implement an ioctl (non unlocked) handler.
Turn them into unlocked_ioctl and push down the Devil inside.

v2: PDE(inode)->data doesn't need to be under bkl
v3: And don't forget to git-add the result

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
---
 drivers/char/i8k.c                  |   53 +++++++++++++++------
 drivers/isdn/divert/divert_procfs.c |   90 ++++++++++++++++++++++-------------
 net/sunrpc/cache.c                  |   17 +++++--
 3 files changed, 107 insertions(+), 53 deletions(-)

diff --git a/drivers/char/i8k.c b/drivers/char/i8k.c
index fc8cf7a..85f8893 100644
--- a/drivers/char/i8k.c
+++ b/drivers/char/i8k.c
@@ -23,6 +23,7 @@
 #include <linux/seq_file.h>
 #include <linux/dmi.h>
 #include <linux/capability.h>
+#include <linux/smp_lock.h>
 #include <asm/uaccess.h>
 #include <asm/io.h>
 
@@ -82,8 +83,7 @@ module_param(fan_mult, int, 0);
 MODULE_PARM_DESC(fan_mult, "Factor to multiply fan speed with");
 
 static int i8k_open_fs(struct inode *inode, struct file *file);
-static int i8k_ioctl(struct inode *, struct file *, unsigned int,
-		     unsigned long);
+static long i8k_ioctl(struct file *, unsigned int, unsigned long);
 
 static const struct file_operations i8k_fops = {
 	.owner		= THIS_MODULE,
@@ -91,7 +91,7 @@ static const struct file_operations i8k_fops = {
 	.read		= seq_read,
 	.llseek		= seq_lseek,
 	.release	= single_release,
-	.ioctl		= i8k_ioctl,
+	.unlocked_ioctl	= i8k_ioctl,
 };
 
 struct smm_regs {
@@ -307,8 +307,7 @@ static int i8k_get_dell_signature(int req_fn)
 	return regs.eax == 1145651527 && ...
From: Frederic Weisbecker
Date: Sunday, May 16, 2010 - 6:23 pm

I have updated this patch because the direct pushdown into
the handler was rather invasive and error prone.

Instead I've used wrappers.

Arnd I've kept your ack, hope it's still fine for you.

Thanks.

---
From d79b6f4de5db0103ceb4734e42ad101d836d61d9 Mon Sep 17 00:00:00 2001
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Tue, 30 Mar 2010 07:27:50 +0200
Subject: [PATCH v4] procfs: Push down the bkl from ioctl

Push down the bkl from procfs's ioctl main handler to its users.
Only three procfs users implement an ioctl (non unlocked) handler.
Turn them into unlocked_ioctl and push down the Devil inside.

v2: PDE(inode)->data doesn't need to be under bkl
v3: And don't forget to git-add the result
v4: Use wrappers to pushdown instead of an invasive and error prone
    handlers surgery.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
---
 drivers/char/i8k.c                  |   21 ++++++++++++++++-----
 drivers/isdn/divert/divert_procfs.c |   18 ++++++++++++++----
 net/sunrpc/cache.c                  |   14 ++++++++++----
 3 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/drivers/char/i8k.c b/drivers/char/i8k.c
index fc8cf7a..4cd8b22 100644
--- a/drivers/char/i8k.c
+++ b/drivers/char/i8k.c
@@ -23,6 +23,7 @@
 #include <linux/seq_file.h>
 #include <linux/dmi.h>
 #include <linux/capability.h>
+#include <linux/smp_lock.h>
 #include <asm/uaccess.h>
 #include <asm/io.h>
 
@@ -82,8 +83,7 @@ module_param(fan_mult, int, 0);
 MODULE_PARM_DESC(fan_mult, "Factor to multiply fan speed with");
 
 static int i8k_open_fs(struct inode *inode, struct file *file);
-static int i8k_ioctl(struct inode *, struct file *, unsigned ...
From: Arnd Bergmann
Date: Tuesday, March 30, 2010 - 3:37 am

Looks good to me. I would have used a single unlock and return statement
in i8k_ioctl and isdn_divert_ioctl, with goto instead of adding an
unlock to each return, but it doesn't matter much.

Acked-by: Arnd Bergmann <arnd@arndb.de>
--

From: Frederic Weisbecker
Date: Tuesday, March 30, 2010 - 11:27 am

I did that first, but actually that didn't make much difference:

ret = foo;            unlock_kernel()


Thanks.

--

From: Arnd Bergmann
Date: Tuesday, March 30, 2010 - 11:54 am

Yes, the amount of code needed is comparable, but it is much easier
to validate that you did not miss an unlock when you know that there
is a single return statement in the function. It also helps the next
person that may want to replace the BKL with a different lock.

	Arnd
--

From: Frederic Weisbecker
Date: Tuesday, March 30, 2010 - 12:21 pm

Ah you're right!

--

From: Frederic Weisbecker
Date: Monday, March 29, 2010 - 11:20 pm

No need to hold the bkl to seek here, none of the other
fops callbacks use it.

Use generic_file_llseek explicitly.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
---
 fs/proc/kmsg.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/proc/kmsg.c b/fs/proc/kmsg.c
index cfe90a4..bd4b5a7 100644
--- a/fs/proc/kmsg.c
+++ b/fs/proc/kmsg.c
@@ -53,6 +53,7 @@ static const struct file_operations proc_kmsg_operations = {
 	.poll		= kmsg_poll,
 	.open		= kmsg_open,
 	.release	= kmsg_release,
+	.llseek		= generic_file_llseek,
 };
 
 static int __init proc_kmsg_init(void)
-- 
1.6.2.3

--

From: Arnd Bergmann
Date: Tuesday, March 30, 2010 - 3:38 am

Acked-by: Arnd Bergmann <arnd@arndb.de>
--

From: Frederic Weisbecker
Date: Monday, March 29, 2010 - 11:20 pm

/proc/kcore has no llseek and then falls down to use default_llseek.
This is racy against read_kcore() that directly manipulates fpos
but it doesn't hold the bkl there so using it in llseek doesn't
protect anything.

Let's use generic_file_llseek() instead.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: John Kacur <jkacur@redhat.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
---
 fs/proc/kcore.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index a44a789..da21060 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -557,6 +557,7 @@ static int open_kcore(struct inode *inode, struct file *filp)
 static const struct file_operations proc_kcore_operations = {
 	.read		= read_kcore,
 	.open		= open_kcore,
+	.llseek		= generic_file_llseek,
 };
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-- 
1.6.2.3

--

From: Arnd Bergmann
Date: Tuesday, March 30, 2010 - 3:28 am

Acked-by: Arnd Bergmann <arnd@arndb.de>
--

From: Frederic Weisbecker
Date: Saturday, April 10, 2010 - 6:27 am

I've applied the patchset to

git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing.git
	bkl/procfs


--

Previous thread: [PATCH]vmscan: handle underflow for get_scan_ratio by Shaohua Li on Monday, March 29, 2010 - 10:53 pm. (49 messages)

Next thread: linux-next: manual merge of the slabh tree with the percpu tree by Stephen Rothwell on Monday, March 29, 2010 - 11:22 pm. (2 messages)