Re: [PATCH 3/3 V13] RO/NX protection for loadable kernel

Previous thread: [PATCH] msm: 8x60: Add TLMM to the iomap. by Gregory Bean on Tuesday, November 16, 2010 - 2:35 pm. (1 message)

Next thread: [PATCH v3 2/2] msm: gpio: Add irq support to v2 gpiolib. by Gregory Bean on Tuesday, November 16, 2010 - 2:38 pm. (3 messages)
From: matthieu castet
Date: Tuesday, November 16, 2010 - 2:35 pm

This patch is a logical extension of the protection provided by
CONFIG_DEBUG_RODATA to LKMs. The protection is provided by splitting
module_core and module_init into three logical parts each and setting
appropriate page access permissions for each individual section:

 1. Code: RO+X
 2. RO data: RO+NX
 3. RW data: RW+NX

In order to achieve proper protection, layout_sections() have been
modified to align each of the three parts mentioned above onto page
boundary. Next, the corresponding page access permissions are set
right before successful exit from load_module(). Further, free_module()
and sys_init_module have been modified to set module_core and
module_init as RW+NX right before calling module_free().

By default, the original section layout and access flags are preserved.
When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y, the patch
will page-align each group of sections to ensure that each page contains
only one type of content and will enforce RO/NX for each group of pages.

V1: Initial proof-of-concept patch.
V2: The patch have been re-written to reduce the number of #ifdefs and
to make it architecture-agnostic. Code formatting have been corrected also.
V3: Opportunistic RO/NX protectiuon is now unconditional. Section
page-alignment is enabled when CONFIG_DEBUG_RODATA=y.
V4: Removed most macros and improved coding style.
V5: Changed page-alignment and RO/NX section size calculation
V6: Fixed comments. Restricted RO/NX enforcement to x86 only
V7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added calls to
set_all_modules_text_rw() and set_all_modules_text_ro() in ftrace
V8: updated for compatibility with linux 2.6.33-rc5
V9: coding style fixes
V10: more coding style fixes
V11: minor adjutments for -tip
V12: minor adjutments for v2.6.35-rc2-tip
V13: minor adjutments for v2.6.37-rc1-tip

Signed-off-by: Siarhei Liakh <sliakh.lkml@gmail.com>
Signed-off-by: Xuxian Jiang <jiang@cs.ncsu.edu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
Reviewed-by: James Morris ...
From: tip-bot for matthieu castet
Date: Thursday, November 18, 2010 - 7:13 am

Commit-ID:  84e1c6bb38eb318e456558b610396d9f1afaabf0
Gitweb:     http://git.kernel.org/tip/84e1c6bb38eb318e456558b610396d9f1afaabf0
Author:     matthieu castet <castet.matthieu@free.fr>
AuthorDate: Tue, 16 Nov 2010 22:35:16 +0100
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Thu, 18 Nov 2010 13:32:56 +0100

x86: Add RO/NX protection for loadable kernel modules

This patch is a logical extension of the protection provided by
CONFIG_DEBUG_RODATA to LKMs. The protection is provided by
splitting module_core and module_init into three logical parts
each and setting appropriate page access permissions for each
individual section:

 1. Code: RO+X
 2. RO data: RO+NX
 3. RW data: RW+NX

In order to achieve proper protection, layout_sections() have
been modified to align each of the three parts mentioned above
onto page boundary. Next, the corresponding page access
permissions are set right before successful exit from
load_module(). Further, free_module() and sys_init_module have
been modified to set module_core and module_init as RW+NX right
before calling module_free().

By default, the original section layout and access flags are
preserved. When compiled with CONFIG_DEBUG_SET_MODULE_RONX=y,
the patch will page-align each group of sections to ensure that
each page contains only one type of content and will enforce
RO/NX for each group of pages.

  -v1: Initial proof-of-concept patch.
  -v2: The patch have been re-written to reduce the number of #ifdefs
       and to make it architecture-agnostic. Code formatting has also
       been corrected.
  -v3: Opportunistic RO/NX protection is now unconditional. Section
       page-alignment is enabled when CONFIG_DEBUG_RODATA=y.
  -v4: Removed most macros and improved coding style.
  -v5: Changed page-alignment and RO/NX section size calculation
  -v6: Fixed comments. Restricted RO/NX enforcement to x86 only
  -v7: Introduced CONFIG_DEBUG_SET_MODULE_RONX, added
       calls to set_all_modules_text_rw() and ...
From: Valdis.Kletnieks
Date: Wednesday, November 24, 2010 - 8:41 pm

This is incompatible with CONFIG_JUMP_LABEL:

[  252.093624] BUG: unable to handle kernel paging request at ffffffffa0680764
[  252.094008] IP: [<ffffffff81225ee0>] generic_swap+0xa/0x1a
[  252.094008] PGD 1a1e067 PUD 1a22063 PMD 1093ac067 PTE 8000000109786161
[  252.094008] Oops: 0003 [#1] PREEMPT SMP 

[  252.094008] Pid: 3740, comm: modprobe Not tainted 2.6.37-rc3-mmotm1123 #1 0X564R/Latitude E6500                  
[  252.094008] RIP: 0010:[<ffffffff81225ee0>]  [<ffffffff81225ee0>] generic_swap+0xa/0x1a
[  252.094008] RSP: 0018:ffff88011a217d98  EFLAGS: 00010206
[  252.094008] RAX: 00000000000000d9 RBX: 0000000000000030 RCX: 000000000000007c
[  252.094008] RDX: 0000000000000017 RSI: ffffffffa0680794 RDI: ffffffffa0680764
[  252.094008] RBP: ffff88011a217d98 R08: 0000000000000000 R09: ffff88011a217d38
[  252.094008] R10: 0000000000000000 R11: 000000000000013d R12: ffffffffa0680764
[  252.094008] R13: 0000000000000018 R14: 0000000000000018 R15: 0000000000000018
[  252.094008] FS:  00007f6ecb897720(0000) GS:ffff8800df100000(0000) knlGS:0000000000000000
[  252.094008] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  252.094008] CR2: ffffffffa0680764 CR3: 000000011911e000 CR4: 00000000000406e0
[  252.094008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  252.094008] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  252.094008] Process modprobe (pid: 3740, threadinfo ffff88011a216000, task ffff88011861e300)
[  252.094008] Stack:
[  252.094008]  ffff88011a217e28 ffffffff81226007 ffffffff81a31158 0000000000000000
[  252.094008]  0000000000000030 0000000000000048 ffffffff8105fa34 ffffffff81225ed6
[  252.094008]  ffffffff00000018 ffffffff00000018 0000000000000018 00000030ffffffff
[  252.094008] Call Trace:
[  252.094008]  [<ffffffff81226007>] sort+0x117/0x1b0
[  252.094008]  [<ffffffff8105fa34>] ? jump_label_cmp+0x0/0x1b
[  252.094008]  [<ffffffff81225ed6>] ? generic_swap+0x0/0x1a
[  252.094008]  [<ffffffff8105faa6>] ...
From: mat
Date: Friday, November 26, 2010 - 10:23 am

Le Wed, 24 Nov 2010 22:41:07 -0500,
could you try the attached patch ?

on module load, we sort the __jump_table section. So we should make it
writable.


Matthieu
From: Valdis.Kletnieks
Date: Monday, November 29, 2010 - 9:59 am

Confirming that fixes the issue I was seeing, thanks...
From: Kees Cook
Date: Wednesday, December 8, 2010 - 3:19 pm

Acked-by: Kees Cook <kees.cook@canonical.com>

Can this please get committed to tip?

Thanks,

-Kees

-- 
Kees Cook
Ubuntu Security Team
--

From: mat
Date: Friday, December 10, 2010 - 4:18 pm

Le Wed, 8 Dec 2010 14:19:51 -0800,
I think it is not need anymore with  Steven Rostedt patch [1]

Matthieu

--

From: Kees Cook
Date: Friday, December 10, 2010 - 5:27 pm

Hi,


Which patch was that? (Or, rather, what's a good url to see it?)

-Kees

-- 
Kees Cook
Ubuntu Security Team
--

From: Kees Cook
Date: Saturday, December 11, 2010 - 4:15 pm

Ah-ha! Thanks for the pointer. So, instead of the other explicit jump_table
fix, can this get pushed into tip?

Thanks!

-Kees

-- 
Kees Cook
Ubuntu Security Team
--

From: Ingo Molnar
Date: Wednesday, December 22, 2010 - 5:40 am

What's the status of this bug?

If we still need the patch then please submit it standalone with a proper subject 
line, with acks/signoffs added, etc.

Thanks,

	Ingo
--

From: Valdis.Kletnieks
Date: Wednesday, December 22, 2010 - 2:35 pm

Steve Rostedt's patch that moves the setting of the page permissions seems to
make this patch no longer necessary.  I tripped over this same issue, but the
version in the latest -mmotm does not need it, as it includes Steve's fix.
From: Ingo Molnar
Date: Wednesday, December 22, 2010 - 2:57 pm

It would be nice to see that fix submitted so that it gets into the tree that 
introduced the bug.

Steve, Andrew?

Thanks,

	Ingo
--

From: Steven Rostedt
Date: Wednesday, December 22, 2010 - 3:02 pm

Hi Ingo,

Which tree/branch should I base this on? I could do write up formal
patch and push it out.

-- Steve


--

From: Ingo Molnar
Date: Thursday, December 23, 2010 - 1:49 am

Please use tip:x86/security, it contains the RO/NX patches:

 691513f70d39: x86: Resume trampoline must be executable
 84e1c6bb38eb: x86: Add RO/NX protection for loadable kernel modules
 5bd5a452662b: x86: Add NX protection for kernel data

Thanks!

	Ingo
--

From: Steven Rostedt
Date: Thursday, December 23, 2010 - 8:01 am

Can I add your "Tested-by: ..." to the commit?

Thanks,

-- Steve


--

From: Valdis.Kletnieks
Date: Thursday, December 23, 2010 - 6:43 pm

Sure.  I can verify that the patch series (a) doesn't break well-behaved
modules and (b) it *does* trigger for misbehaving modules (I sent the
VirtualBox crew a bug report for a module that tried to write to an area
marked executable).

From: Steven Rostedt
Date: Monday, November 29, 2010 - 11:15 am

This patch breaks function tracer:

------------[ cut here ]------------
WARNING: at kernel/trace/ftrace.c:1014 ftrace_bug+0x114/0x171()
Hardware name: Precision WorkStation 470    
Modules linked in: i2c_core(+)
Pid: 86, comm: modprobe Not tainted 2.6.37-rc2+ #68
Call Trace:
 [<ffffffff8104e957>] warn_slowpath_common+0x85/0x9d
 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core]
 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core]
 [<ffffffff8104e989>] warn_slowpath_null+0x1a/0x1c
 [<ffffffff810a9dfe>] ftrace_bug+0x114/0x171
 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core]
 [<ffffffff810aa0db>] ftrace_process_locs+0x1ae/0x274
 [<ffffffffa00026db>] ? __process_new_adapter+0x7/0x34 [i2c_core]
 [<ffffffff810aa29e>] ftrace_module_notify+0x39/0x44
 [<ffffffff814405cf>] notifier_call_chain+0x37/0x63
 [<ffffffff8106e054>] __blocking_notifier_call_chain+0x46/0x5b
 [<ffffffff8106e07d>] blocking_notifier_call_chain+0x14/0x16
 [<ffffffff8107ffde>] sys_init_module+0x73/0x1f3
 [<ffffffff8100acf2>] system_call_fastpath+0x16/0x1b
---[ end trace 2aff4f4ca53ec746 ]---
ftrace faulted on writing [<ffffffffa00026db>]
__process_new_adapter+0x7/0x34 [i2c_core]



Here we set the text read only before we call the notifiers. The
function tracer changes the calls to mcount into nops via a notifier

Below is the patch that fixes this bug.

-- Steve

Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>

diff --git a/kernel/module.c b/kernel/module.c
index ba421e6..5ccf52c 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2804,18 +2804,6 @@ static struct module *load_module(void __user *umod,
 	kfree(info.strmap);
 	free_copy(&info);
 
-	/* Set RO and NX regions for core */
-	set_section_ro_nx(mod->module_core,
-				mod->core_text_size,
-				mod->core_ro_size,
-				mod->core_size);
-
-	/* Set RO and NX regions for init ...
From: Rusty Russell
Date: Monday, November 29, 2010 - 4:35 pm

That seems fine.

I note that both before and after this patch we potentially execute code
in the module (via parse_args) before we set it ro & nx.  But fixing this
last bit of coverage is probably not worth the pain...

Cheers,
Rusty.
--

From: Steven Rostedt
Date: Tuesday, November 30, 2010 - 7:46 am

Rusty, can I take this as an "Acked-by"?

Thanks,

-- Steve


--

From: Rusty Russell
Date: Wednesday, December 1, 2010 - 6:36 am

Yep... Acked-by: Rusty Russell <rusty@rustcorp.com.au>

Thanks,
Rusty.
--

From: mat
Date: Tuesday, November 30, 2010 - 2:20 pm

Hi,

Le Mon, 29 Nov 2010 13:15:42 -0500,

That's look fine for me.

But I wonder why ftrace_arch_code_modify_prepare isn't called ?

It is only called when we start/stop tracing ?

Matthieu
--

From: Steven Rostedt
Date: Tuesday, November 30, 2010 - 5:38 pm

Correct. There's no reason to use it for the changing of mcount callers
to nops. For core kernel code, it happens before SMP is enabled. For
modules, it happens before the module code is executed. Except for what
Rusty stated with the module parameters, but if you are executing
complex code with that, you deserve what you get ;-)

The initial changes are made outside of stop machine. The code is not
being executed by anyone else.

-- Steve


--

Previous thread: [PATCH] msm: 8x60: Add TLMM to the iomap. by Gregory Bean on Tuesday, November 16, 2010 - 2:35 pm. (1 message)

Next thread: [PATCH v3 2/2] msm: gpio: Add irq support to v2 gpiolib. by Gregory Bean on Tuesday, November 16, 2010 - 2:38 pm. (3 messages)