Hi,
The layout of struct blk_user_trace_setup is a bit unfortunate, it gets
padded differently on 32-bit and 64-bit archs. So right now it's not
possible to trace 64-bit kernels with a 32-bit app. This patch fixes
that up by adding a compat ioctl handler for BLKTRACESETUP.Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index 5a5b711..b18b9cc 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -2052,6 +2052,51 @@ static int raw_ioctl(unsigned fd, unsigned cmd, unsigned long arg)
}
return ret;
}
+
+struct blk_user_trace_setup32 {
+ char name[32];
+ u16 act_mask;
+ u16 pad;
+ u32 buf_size;
+ u32 buf_nr;
+ u64 start_lba;
+ u64 end_lba;
+ u32 pid;
+} __attribute__((packed));
+
+#define BLKTRACESETUP32 _IOWR(0x12,115,struct blk_user_trace_setup32)
+
+static int blktrace32_setup(int fd, unsigned cmd, unsigned long arg)
+{
+ struct blk_user_trace_setup __user *buts = compat_alloc_user_space(sizeof(*buts));
+ struct blk_user_trace_setup32 __user *buts32 = compat_ptr(arg);
+ int err;
+
+ if (copy_in_user(&buts->name, &buts32->name, BDEVNAME_SIZE) ||
+ get_user(buts->act_mask, &buts32->act_mask) ||
+ get_user(buts->buf_size, &buts32->buf_size) ||
+ get_user(buts->buf_nr, &buts32->buf_nr) ||
+ get_user(buts->start_lba, &buts32->start_lba) ||
+ get_user(buts->end_lba, &buts32->end_lba) ||
+ get_user(buts->pid, &buts32->pid))
+ return -EFAULT;
+
+ err = sys_ioctl(fd, BLKTRACESETUP, (unsigned long) buts);
+ if (err)
+ return err;
+
+ if (copy_to_user(&buts32->name, &buts->name, BDEVNAME_SIZE) ||
+ put_user(buts32->act_mask, &buts->act_mask) ||
+ put_user(buts32->buf_size, &buts->buf_size) ||
+ put_user(buts32->buf_nr, &buts->buf_nr) ||
+ put_user(buts32->start_lba, &buts->start_lba) ||
+ put_user(buts32->end_lba, ...
actually, I would guess that it is currently working on s390, sparc64,
I'd prefer to not add anything to fs/compat_ioctl.c at all, but always
handle these in the places where the native version is handled.In your case, I'd either mark BLKTRACESETUP32 as COMPATIBLE_IOCTL() and
handle it from inside of blk_trace_ioctl(), or handle it inErrm, no. Everyone makes that mistake once, so you're in good company,
but the packed attribute makes this incorrect on every architecture
except x86_64 and ia64, because only i386 has no padding before the u64
and after the last member.We now have the compat_u64 type that behaves like the 32 bit user space
version of an unsigned long long. If you use that to defineThe naming convention these days is to use a 'compat_' prefix, not a '32'
Most of these fields are read-only for the kernel, so you should only need
the first copy_to_user. I think you should split the blk_trace_setup function
to have the common code take a struct blk_user_trace_setup kernel pointer,
and one or two versions that just do the copy_{to,from}_user.Arnd <><
-
Not so good. But how could it work on the above archs, surely the
And ditto for u16 and u32, I gather? If you notice, it's not padding in
I was pretty sure that we modified start_lba and end_lba and copied
those back as well, so it was a conscious decision to just copy
everything for safety. But checking we only do mangle and copy back
->name, so your suggestion is a good one.--
Jens Axboe-
Hi Arnd,
Updated patch below. I kept the code in compat_ioctl.c, to me it seems
like the cleanest approach. I need the BLKTRACESETUP32 define both in
compat_ioctl.c and blktrace.c if I move it, and I need to hard-core the
struct size or define it in both places. And guard the code in
blktrace.c with an ifdef for CONFIG_COMPAT. Not pretty, imho.I haven't tested this one yet, but at least it compiles and the sizing
seems right. The u16 padding was an artifact of the
__attribute__((packed)) so that could be removed.diff --git a/fs/compat_ioctl.c b/fs/compat_ioctl.c
index 5a5b711..54162e5 100644
--- a/fs/compat_ioctl.c
+++ b/fs/compat_ioctl.c
@@ -2052,6 +2052,44 @@ static int raw_ioctl(unsigned fd, unsigned cmd, unsigned long arg)
}
return ret;
}
+
+struct blk_user_trace_setup32 {
+ char name[32];
+ u16 act_mask;
+ u32 buf_size;
+ u32 buf_nr;
+ compat_u64 start_lba;
+ compat_u64 end_lba;
+ u32 pid;
+};
+
+#define BLKTRACESETUP32 _IOWR(0x12,115,struct blk_user_trace_setup32)
+
+static int blktrace32_setup(int fd, unsigned cmd, unsigned long arg)
+{
+ struct blk_user_trace_setup __user *buts = compat_alloc_user_space(sizeof(*buts));
+ struct blk_user_trace_setup32 __user *buts32 = compat_ptr(arg);
+ int err;
+
+ if (copy_in_user(&buts->name, &buts32->name, 32) ||
+ get_user(buts->act_mask, &buts32->act_mask) ||
+ get_user(buts->buf_size, &buts32->buf_size) ||
+ get_user(buts->buf_nr, &buts32->buf_nr) ||
+ get_user(buts->start_lba, &buts32->start_lba) ||
+ get_user(buts->end_lba, &buts32->end_lba) ||
+ get_user(buts->pid, &buts32->pid))
+ return -EFAULT;
+
+ err = sys_ioctl(fd, cmd, (unsigned long) buts);
+ if (err)
+ return err;
+
+ if (copy_to_user(&buts32->name, &buts->name, 32))
+ return -EFAULT;
+
+ return err;
+}
+
#endif /* CONFIG_BLOCK */struct serial_struct32 {
@@ -2555,7 +2593,7 @@ COMPATIBLE_IOCTL(BLKSECTSET...
Here's my counterproposal for the blktrace compat code. It doesn't have
any of the problems I found in your patch obviously, but I haven't tested
it either, so I'm sure you can spot a bug or two in here. Comments?Jens, I think the best overall solution would be to have a
block/compat_ioctl.c file with all the compat handling for block
devices moved over from fs/compat_ioctl.c, and done in a nicer way.
If you agree, with this approach, I'd volunteer to come up with a
patch.Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Index: linux-2.6/block/blktrace.c
===================================================================
--- linux-2.6.orig/block/blktrace.c
+++ linux-2.6/block/blktrace.c
@@ -312,33 +312,26 @@ static struct rchan_callbacks blk_relay_
/*
* Setup everything required to start tracing
*/
-static int blk_trace_setup(struct request_queue *q, struct block_device *bdev,
- char __user *arg)
+static int do_blk_trace_setup(struct request_queue *q, struct block_device *bdev,
+ struct blk_user_trace_setup *buts)
{
- struct blk_user_trace_setup buts;
struct blk_trace *old_bt, *bt = NULL;
struct dentry *dir = NULL;
char b[BDEVNAME_SIZE];
int ret, i;- if (copy_from_user(&buts, arg, sizeof(buts)))
- return -EFAULT;
-
- if (!buts.buf_size || !buts.buf_nr)
+ if (!buts->buf_size || !buts->buf_nr)
return -EINVAL;- strcpy(buts.name, bdevname(bdev, b));
+ strcpy(buts->name, bdevname(bdev, b));/*
* some device names have larger paths - convert the slashes
* to underscores for this to work as expected
*/
- for (i = 0; i < strlen(buts.name); i++)
- if (buts.name[i] == '/')
- buts.name[i] = '_';
-
- if (copy_to_user(arg, &buts, sizeof(buts)))
- return -EFAULT;
+ for (i = 0; i < strlen(buts->name); i++)
+ if (buts->name[i] == '/')
+ buts->name[i] = '_';ret = -ENOMEM;
bt = kzalloc(sizeof(*bt), GFP_KERNEL);
@@ -350,7 +343,7 @@ static int blk_trace_setup(struct reques
goto ...
Sometimes I find it hard to stop myself once I have the idea.
The patch below moves all block related ioctl conversion out
of fs/compat_ioctl.c into the compat_blkdev_ioctl() function.Is that a direction we should be heading towards? If so, I
can do some testing and split this big patch into more
logical units for better review.I also found a few interesting bugs in the process:
* BLKRASET is both ULONG_IOCTL and COMPATIBLE_IOCTL, but this should
be entirely harmless
* BLKSECTGET writes 2 bytes normally, our compat_ version writes
4 bytes!
* FDSETPRM32, FDDEFPRM32 and FDGETPRM32 are doing potentially dangerous
stuff with kernel pointers. sparse actually warns about this, but
from what I could see from floppy.c, the kernel always ignores these
pointers. I did not check any other floppy drivers implementing the
same calls.Arnd <><
Index: linux-2.6/block/Makefile
===================================================================
--- linux-2.6.orig/block/Makefile
+++ linux-2.6/block/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_IOSCHED_DEADLINE) += deadli
obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.oobj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
+obj-$(CONFIG_COMPAT) += compat_ioctl.o
Index: linux-2.6/block/compat_ioctl.c
===================================================================
--- /dev/null
+++ linux-2.6/block/compat_ioctl.c
@@ -0,0 +1,772 @@
+#include <linux/blkdev.h>
+#include <linux/blkpg.h>
+#include <linux/blktrace_api.h>
+#include <linux/cdrom.h>
+#include <linux/compat.h>
+#include <linux/elevator.h>
+#include <linux/fd.h>
+#include <linux/hdreg.h>
+#include <linux/syscalls.h>
+#include <linux/smp_lock.h>
+#include <linux/types.h>
+#include <linux/uaccess.h>
+
+#include <scsi/sg.h>
+
+static int compat_put_ushort(unsigned long arg, unsigned short val)
+{
+ return put_user(val, (unsigned short __user *)compat_ptr(arg));
+}
+
+static int compa...
This looks a lot better! I don't mind seperating the block bits, when we
get the whole bunch in there. Just seemed overly silly and complicated
to do it for just one ioctl command. When you are happy with this patch,
I'll add it to the pending block stuff for 2.6.24. The blktrace fix
isn't a regression so need not go into 2.6.23.It seems some bits are missing though (like BLKTRACESETUP32), did you
Always nice to find bugs when doing cleanups ;-)
--
Jens Axboe-
I did the large patch on top of the blktrace patch I sent earlier,
so it doesn't apply cleanly. When doing the final version, I'll use
a more sensible patch order.Arnd <><
-
Ah I see, can you generate one against current git?
--
Jens Axboe-
fs/compat_ioctl.c is still a mess and I'd prefer to get rid of it
over time, by moving everything to the respective drivers.A significant portion of our ioctl numbers are common to a number
of block drivers, so should not handle them in each driver separately,
but rather in the block layer.In the process, we remove a few indirections from the call graph
for compat block ioctls. This patch is still an entirely untested
preview and should be split into smaller chunks.Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
block/Makefile | 1
block/blktrace.c | 54 +-
block/compat_ioctl.c | 820 ++++++++++++++++++++++++++++++++++++++
block/ioctl.c | 17
fs/compat_ioctl.c | 679 -------------------------------
include/linux/blktrace_api.h | 7
6 files changed, 861 insertions(+), 717 deletions(-)Index: linux-2.6/block/Makefile
===================================================================
--- linux-2.6.orig/block/Makefile
+++ linux-2.6/block/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_IOSCHED_DEADLINE) += deadli
obj-$(CONFIG_IOSCHED_CFQ) += cfq-iosched.oobj-$(CONFIG_BLK_DEV_IO_TRACE) += blktrace.o
+obj-$(CONFIG_COMPAT) += compat_ioctl.o
Index: linux-2.6/block/blktrace.c
===================================================================
--- linux-2.6.orig/block/blktrace.c
+++ linux-2.6/block/blktrace.c
@@ -312,33 +312,26 @@ static struct rchan_callbacks blk_relay_
/*
* Setup everything required to start tracing
*/
-static int blk_trace_setup(struct request_queue *q, struct block_device *bdev,
- char __user *arg)
+int do_blk_trace_setup(struct request_queue *q, struct block_device *bdev,
+ struct blk_user_trace_setup *buts)
{
- struct blk_user_trace_setup buts;
struct blk_trace *old_bt, *bt = NULL;
struct dentry *dir = NULL;
char b[BDEVNAME_SIZE];
int ret, i;- if (copy_from_user(&buts, arg, sizeof(buts)))
- return -EFAULT;
-
- if (!buts.buf_size ||...
As before, this looks pretty good. I'll toss it in the tester and see if
things still work.--
Jens Axboe-
Ok, thanks a lot.
Arnd <><
-
The sizes are ok now, but I still don't like the idea of adding more
stuff to fs/compat_ioctl.c. I also noticed another problem now, see below.The preferred way to define compat_ioctl handlers is to use a ->compat_ioctl
file operation so you don't need any code in compat_ioctl.c at all.
You still need the #ifdef in blktrace.c though if you want to building extraYou are dereferencing 'buts' here, which is a user space pointer. This is
broken and cannot work on architectures that have split kernel/user address
spaces, and a potential security hole on those that don't.
sparse would warn about this kind of bug, but of course one of the problemsSame here, this needs to be copy_in_user.
Arnd <><
-
From: Jens Axboe <jens.axboe@oracle.com>
Acked-by: David S. Miller <davem@davemloft.net>
-
| Alan Cox | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
| Greg Kroah-Hartman | [PATCH 004/196] Chinese: add translation of SubmittingPatches |
| Bart Van Assche | Re: Integration of SCST in the mainstream Linux kernel |
| Andrew Morton | Re: [RFC/PATCH] Documentation of kernel messages |
git: | |
| Winkler, Tomas | RE: iwlwifi: fix build bug in "iwlwifi: fix LED stall" |
| Gerrit Renker | [PATCH 27/37] dccp: Integration of dynamic feature activation - part 2 (server side) |
| Jarek Poplawski | [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Mark Lord | Re: [BUG] New Kernel Bugs |
