Hi, linux-arch! I'd like to tell you about CONFIG_HAVE_ARCH_TRACEHOOK
and what each arch's maintainers will want to do about it.
In the current tree (since 2.6.27-rc1), arch/Kconfig defines the
internal config item HAVE_ARCH_TRACEHOOK. The idea is to set a new
baseline of what an arch has to implement to support tracing/debugging
of user tasks. After you do these things, then new kinds of tracing
features, new replacements for ptrace, or drastic changes to the
implementation details of ptrace, can come along later and just work on
your arch without every arch maintainer having to worry about the
details of new features or new non-arch implementation details. When
your arch is ready, adding "select HAVE_ARCH_TRACEHOOK" somewhere in an
arch-specific Kconfig file makes known that your arch is compatible with
new code. New features and options will "require HAVE_ARCH_TRACEHOOK".
In 2.6.27 (as of rc6), powerpc{,64} and sparc{,64} have complete support
and do "select HAVE_ARCH_TRACEHOOK". In 2.6.28, x86 will have it (that
work is already done and waiting in the tip/x86/tracehook branch), and I
believe s390 is also already on track to have it done in 2.6.28. Since
2.6.27-rc1, everything mentioned here has been in place in the generic
code, so you can start the changes for your arch as soon as you like.
If there is interest, I can run an informal session at the Kernel Summit
and/or Linux Plumbers Conference next week to walk arch hackers through
each work item. (I don't have any presentation to give or anything, all
the info is in this posting and/or documented in the source code. But
we can work through it all in detail in person for as long as you'd like.)
Let me know ASAP if folks want this, so we can get it scheduled.
It has been suggested that we set a deadline to require every arch to be
compatible and set HAVE_ARCH_TRACEHOOK. If your arch hadn't been
updated after that deadline, then your kernel build might break, or it
might still build but suddenly fail to ...This should be gone for 2.6.28, too. (no ACK from the ia64 people yet, but mips and parisc are doen for sure) --
user_stack_pointer() is apparently a requirement, too. Although given that you already have a task_struct pointer the only place you currently use it (lib/syscall.c), it makes more sense to use KSTK_ESP/KSTK_EIP which is provided by almost everyone already. Signed-off-by: Paul Mundt <lethal@linux-sh.org> --- lib/syscall.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/syscall.c b/lib/syscall.c index a4f7067..888c36a 100644 --- a/lib/syscall.c +++ b/lib/syscall.c @@ -11,8 +11,8 @@ static int collect_syscall(struct task_struct *target, long *callno, if (unlikely(!regs)) return -EAGAIN; - *sp = user_stack_pointer(regs); - *pc = instruction_pointer(regs); + *sp = KSTK_ESP(target); + *pc = KSTK_EIP(target); *callno = syscall_get_nr(target, regs); if (*callno != -1L && maxargs > 0) --
> user_stack_pointer() is apparently a requirement, too. Ah, yes! I'd forgotten about user_stack_pointer() and instruction_pointer(). Almost? It must be everyone, right? It's used unconditionally in fs/proc/array.c (the only use). I hadn't noticed these before. They aren't commented anywhere. I've got to say, too, these are truly dismal names! Also, I've just noticed that x86-64's user_stack_pointer() is wrong for the case when a fast-path 64-bit syscall is in progress. To get it right, it needs a bit from the struct thread_info, so a call that takes task_struct instead of (or in addition to) pt_regs is certainly right. These are defined in asm/processor.h, as macros. It would be better if they could be inlines, but they really can't because asm/processor.h is before struct task_struct is defined, etc. I wonder if we could move these to another header where they can be clean inlines. I'd sure like to change those names while we're at it. Thoughts? Thanks, Roland --
Okay, let's comment on each bit separately.
Regsets
-------
These don't appear to be a problem for ARM, and turn out to be relatively
clean. The only thing I did do was invent some alternative simpler
helper functions rather than using the user_regset_copy* functions (to
avoid taking the address of function arguments, which needlessly forces
them onto the stack.)
However, in looking at other architectures, I notice that sparc does this
when initializing its regsets:
.n = 38 * sizeof(u32),
.size = sizeof(u32), .align = sizeof(u32),
and sparc64:
.n = 36 * sizeof(u64),
.size = sizeof(u64), .align = sizeof(u64),
which, given that fs/binfmt_elf.c does this:
size_t size = regset->n * regset->size;
void *data = kmalloc(size, GFP_KERNEL);
if (unlikely(!data))
return 0;
means sparc ends up allocating 38 * sizeof(u32) * sizeof(u32), and
sparc64 ends up with 36 * sizeof(u64) * sizeof(u64), which must surely
be wrong?
--
Russell King
Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/
maintainer of:
--
From: Russell King <rmk+lkml@arm.linux.org.uk> Yep, definitely a bug, good catch. I guess, better to allocate too much by accident rather than too little in this case :-) I'll fix this up, thanks! --
Ideally those would get inlined so that doesn't happen. Their only purpose is to make it easier to write get/set functions in arch code. Whatever you find preferable in your arch code is certainly fine with me. If you have something different, or changes to those regset.h functions, that would be of use for other arch code too, then bring them on (under a new subject Yup! Sure looks like Dave skipped the step where it says "test that core dumps work ... and produce correct results". :-) Thanks, Roland --
From: Roland McGrath <roland@redhat.com> It was allocating too much memory, which is harmless. So it did work, and I did test it. --
> It was allocating too much memory, which is harmless. Didn't it also write NT_PRFPREG notes of the wrong size? Thanks, Roland --
From: Roland McGrath <roland@redhat.com> Yep, but gdb was "generous in what it received" and happily read the contents. --
Ah. I've always done core sanity checks with: 1. generate core1 on old kernel 2. generate core2 on new kernel (identical userland scenario) 3. eu-readelf -nl core1 > a 4. eu-readelf -nl core2 > b 5. diff -u a b Then you can eyeball any expected drift like SP address randomization, and be suspicious of all other differences. (Of course, I also test that gdb still likes it.) Thanks, Roland --
