From: Arjan van de Ven <arjan@linux.intel.com>
Subject: [PATCH] debug: add notifier chain debugging
during some development we suspected a case where we left something
in a notifier chain that was from a module that was unloaded already...
and that sort of thing is rather hard to track down.
This patch adds a very simple sanity check (which isn't all that
expensive) to make sure the notifier we're about to call is
actually from either the kernel itself of from a still-loaded
module, avoiding a hard-to-chase-down crash.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
kernel/notifier.c | 16 ++++++++++++++++
lib/Kconfig.debug | 10 ++++++++++
2 files changed, 26 insertions(+), 0 deletions(-)
diff --git a/kernel/notifier.c b/kernel/notifier.c
index 823be11..143fdd7 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -21,6 +21,10 @@ BLOCKING_NOTIFIER_HEAD(reboot_notifier_list);
static int notifier_chain_register(struct notifier_block **nl,
struct notifier_block *n)
{
+ if (!kernel_text_address((unsigned long)n->notifier_call)) {
+ WARN(1, "Invalid notifier registered!");
+ return 0;
+ }
while ((*nl) != NULL) {
if (n->priority > (*nl)->priority)
break;
@@ -34,6 +38,10 @@ static int notifier_chain_register(struct notifier_block **nl,
static int notifier_chain_cond_register(struct notifier_block **nl,
struct notifier_block *n)
{
+ if (!kernel_text_address((unsigned long)n->notifier_call)) {
+ WARN(1, "Invalid notifier registered!");
+ return 0;
+ }
while ((*nl) != NULL) {
if ((*nl) == n)
return 0;
@@ -82,6 +90,14 @@ static int __kprobes notifier_call_chain(struct notifier_block **nl,
while (nb && nr_to_call) {
next_nb = rcu_dereference(nb->next);
+
+#ifdef CONFIG_DEBUG_NOTIFIERS
+ if (!kernel_text_address((unsigned long)nb->notifier_call)) {
+ WARN(1, "Invalid notifier called!");
+ nb = next_nb;
+ continue;
+ }
+#endif
ret = nb->notifier_call(nb, val, v);
if (nr_calls)
diff ...applied for testing, thanks Arjan. Ingo --
Seems strange to add checks to the registration functions. What could If we remove the first two checks then surely we can afford to add the remaining check unconditionally and lose the new config option and its ifdef. --
On Tue, 19 Aug 2008 21:09:32 -0700 Andi told me that this notifier is a hotpath for kprobes (systemtap) use models; at which point making this a config option makes sort of sense to get the last bit of performance out of that. Personally I would always want this option on for anything I run, and I suspect (enterprise) distros will just always turn it on as well since it's a good sanity check at low cost. I'd be happy to lose the config (I think we have way too many unneeded ones of those already); I'll make a patch. -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
What about checking yourself? At least the braindead use of notiiers in the pagefault path for kprobes (even unconditional on x86_64) is gone now and replaced with proper kprobes hooks. The die notifier that's also used by kprobes shouldn't be in a performance critical path. --
This breaks on ia64 (and pa-risc I think) where function pointers don't point
directly at the code, they point to a {code,data} structure which is itself
located in data space, not text space.
-Tony
--
On Mon, 25 Aug 2008 15:39:13 -0700 is there a way to go to the actual address? I'm sure this is a bit more common.... (like kallsyms!) -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
See dereference_function_descriptor() in lib/vsprintf.c (where it will be clear that my memory was wrong and that PPC64 is the other architecture that needs this). Perhaps this needs to be moved to some place in asm-generic so that it can be used by code like your sanity check? -Tony --
On Mon, 25 Aug 2008 22:08:00 -0700 or we make a func_is_kernel_text() thast maps underneath... -- If you want to reach me at my work email, use arjan@linux.intel.com For development, discussion and tips for power savings, visit http://www.lesswatts.org --
On Mon, 25 Aug 2008 22:08:00 -0700 The patch below fixes this; Ingo, please replace the patch with this one. From: Arjan van de Ven <arjan@linux.intel.com> Subject: [PATCH] debug: add notifier chain debugging during some development we suspected a case where we left something in a notifier chain that was from a module that was unloaded already... and that sort of thing is rather hard to track down. This patch adds a very simple sanity check (which isn't all that expensive) to make sure the notifier we're about to call is actually from either the kernel itself of from a still-loaded module, avoiding a hard-to-chase-down crash. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Acked-by: Tony Luck <tony.luck@intel.com> --- include/linux/kernel.h | 3 +++ kernel/extable.c | 16 ++++++++++++++++ kernel/notifier.c | 6 ++++++ lib/vsprintf.c | 2 +- 4 files changed, 26 insertions(+), 1 deletions(-) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 2651f80..4e1366b 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -187,6 +187,9 @@ extern unsigned long long memparse(char *ptr, char **retptr); extern int core_kernel_text(unsigned long addr); extern int __kernel_text_address(unsigned long addr); extern int kernel_text_address(unsigned long addr); +extern int func_ptr_is_kernel_text(void *ptr); +extern void *dereference_function_descriptor(void *ptr); + struct pid; extern struct pid *session_of_pgrp(struct pid *pgrp); diff --git a/kernel/extable.c b/kernel/extable.c index a26cb2e..adf0cc9 100644 --- a/kernel/extable.c +++ b/kernel/extable.c @@ -66,3 +66,19 @@ int kernel_text_address(unsigned long addr) return 1; return module_text_address(addr) != NULL; } + +/* + * On some architectures (PPC64, IA64) function pointers + * are actually only tokens to some data that then holds the + * real function address. As a result, to find if a function + * pointer is part of the kernel ...
as it's rather large. So i kept the config option, and it defaults to and that should be an unlikely() too i guess. i've applied v2 as a delta patch to -tip, with these changes - see the commit below. Ingo --------------------> From e0f789bde8bda8ee6cf084197b6f3fc951ad241a Mon Sep 17 00:00:00 2001 From: Arjan van de Ven <arjan@linux.intel.com> Date: Fri, 15 Aug 2008 15:29:38 -0700 Subject: [PATCH] debug: add notifier chain debugging, v2 - unbreak ia64 (and powerpc) where function pointers dont point at code but at data (reported by Tony Luck) [ mingo@elte.hu: various cleanups ] Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> --- include/linux/kernel.h | 3 +++ kernel/extable.c | 16 ++++++++++++++++ kernel/notifier.c | 10 +--------- lib/vsprintf.c | 2 +- 4 files changed, 21 insertions(+), 10 deletions(-) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 1ceafa4..892529d 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -187,6 +187,9 @@ extern unsigned long long memparse(char *ptr, char **retptr); extern int core_kernel_text(unsigned long addr); extern int __kernel_text_address(unsigned long addr); extern int kernel_text_address(unsigned long addr); +extern int func_ptr_is_kernel_text(void *ptr); +extern void *dereference_function_descriptor(void *ptr); + struct pid; extern struct pid *session_of_pgrp(struct pid *pgrp); diff --git a/kernel/extable.c b/kernel/extable.c index a26cb2e..adf0cc9 100644 --- a/kernel/extable.c +++ b/kernel/extable.c @@ -66,3 +66,19 @@ int kernel_text_address(unsigned long addr) return 1; return module_text_address(addr) != NULL; } + +/* + * On some architectures (PPC64, IA64) function pointers + * are actually only tokens to some data that then holds the + * real function address. As a result, to find if a function + * pointer is part of the kernel text, we need to do some + * ...
On Wed, 27 Aug 2008 08:39:15 +0200 really it isn't; dereference_function_descriptor is a fall through and huh? the consensus was that the config needed to go ... nothing changed on that front. I agree absolutely with not introducing config options please do not add new unlikely()'s to the kernel.. not on non-hotpaths at least --
