sorry for delay, vacation.
sorry again, I'll try to comment this later...
Thanks Mathias.
I think this should be fixed anyway. Could you try the patch below?
In any case swapper should be immune to signals, and its ->thread_group
Looks like, you kernel is old. Any chance you can also test the recent
setpgid(0,0) just moves the caller's pgrp from PGID 0, that is why it
Thanks.
I didn't try it, but it looks overcomplicated to trigger this bug, or
I missed something? Afaics, init could be just
int main(void)
{
kill(0, SIGGKILL);
}
No?
Oleg.
We should also change INIT_SIGHAND, but _hopefully_ this is enough
to fix the crash.
--- x/include/linux/init_task.h
+++ x/include/linux/init_task.h
@@ -172,6 +172,7 @@ extern struct cred init_cred;
[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID), \
[PIDTYPE_SID] = INIT_PID_LINK(PIDTYPE_SID), \
}, \
+ .thread_group = LIST_HEAD_INIT(tsk.thread_group), \
.dirties = INIT_PROP_LOCAL_SINGLE(dirties), \
INIT_IDS \
INIT_PERF_EVENTS(tsk) \
--
forgot to mention... Also. I strongly believe we should change INIT_STRUCT_PID.tasks, it should be hlist_empty(). But this needs another discussion. Oleg. --
Hello Oleg, It's old because it's the result of bisecting the cause of the problem. It's actually some 2.6.24 kernel but I could reproduce the bug with Yes, sure. Killing the process group, while having a PGID of 0 are the only prerequisites to trigger this bug. In my example I forked a child and let it do the call to kill to not have init (PID 1) beeing killed, too. The kernel doesn't like that. :) This works for me. Thanks. Tested-by: Mathias Krause <mathias.krause@secunet.com> --
Hello, Mathias Krause reports that a buggy (or special) /sbin/init can crash the kernel if it sends a signal to its pgrp/sid before it changes its initial (0,0) pids. See the changelog for 1/4. git-bisect blames "start the global /sbin/init with 0,0 special pids" commit 430c623121ea88ca80595c99fdc63b7f8a803ae5, but in fact the problem was caused by another change, see 2/4. The patches do not depend on each other, 3/4 fixes another problem, 4/4 is purely cosmetic. Oleg. --
The trivial /sbin/init doing
int main(void)
{
kill(0, SIGKILL)
}
crashes the kernel.
This happens because __kill_pgrp_info(init_struct_pid) also sends SIGKILL
to the swapper process which runs with the uninitialized ->thread_group.
Change INIT_TASK() to initialize ->thread_group properly.
Note: the real problem is that the swapper process must not be visible to
signals, see the next patch. But this change is right anyway and fixes
the crash.
Reported-and-tested-by: Mathias Krause <mathias.krause@secunet.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
include/linux/init_task.h | 1 +
1 file changed, 1 insertion(+)
--- 34-rc1/include/linux/init_task.h~1_INIT_TASK_THREAD_GROUP 2010-05-10 19:44:19.000000000 +0200
+++ 34-rc1/include/linux/init_task.h 2010-05-10 19:45:27.000000000 +0200
@@ -172,6 +172,7 @@ extern struct cred init_cred;
[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID), \
[PIDTYPE_SID] = INIT_PID_LINK(PIDTYPE_SID), \
}, \
+ .thread_group = LIST_HEAD_INIT(tsk.thread_group), \
.dirties = INIT_PROP_LOCAL_SINGLE(dirties), \
INIT_IDS \
INIT_PERF_EVENTS(tsk) \
--
Oleg Nesterov [oleg@redhat.com] wrote:
| The trivial /sbin/init doing
|
| int main(void)
| {
| kill(0, SIGKILL)
| }
|
| crashes the kernel.
Really subtle. Good catch.
So, now init is not part of any process group until it calls setsid().
So the above SIGKILL is lost right ? - i.e it does not kill even init
itself.
In my quick test, the following init process lives on inspite of the
SIGKILL.
main()
{
kill(0, SIGKILL);
while(1)
sleep(1);
}
I don't have a better solution. Maybe a hung init is better than a
crashed kernel. the patches look good.
Acked-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
--
No, no. swapper != init. With or without these patches (more precisely, Yes, /sbin/init is not killable, that is why it survies. Yes. if /sbin/init exits the kernel panics. The real test-case shouldn't Agreed!!! I sent the patch a long ago. But security people do not like it, they use exit() from init to provoke the crash intentionally. Thanks! Oleg. --
"statically initialize struct pid for swapper" commit 820e45db says:
Statically initialize a struct pid for the swapper process (pid_t == 0)
and attach it to init_task. This is needed so task_pid(), task_pgrp()
and task_session() interfaces work on the swapper process also.
OK, but:
- it doesn't make sense to add init_task.pids[].node into
init_struct_pid.tasks[], and in fact this just wrong.
idle threads are special, they shouldn't be visible on any
global list. In particular do_each_pid_task(init_struct_pid)
shouldn't see swapper.
This is the actual reason why kill(0, SIGKILL) from /sbin/init
(which starts with 0,0 special pids) crashes the kernel. The
signal sent to pgid/sid == 0 must never see idle threads, even
if the previous patch fixed the crash itself.
- we have other idle threads running on the non-boot CPUs, see
the next patch.
Change INIT_STRUCT_PID/INIT_PID_LINK to create the empty/unhashed
hlist_head/hlist_node. Like any other idle thread swapper can never exit,
so detach_pid()->__hlist_del() is not possible, but we could change
INIT_PID_LINK() to set pprev = &next if needed.
All we need is the valid swapper->pids[].pid == &init_struct_pid.
Reported-by: Mathias Krause <mathias.krause@secunet.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
include/linux/init_task.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
--- 34-rc1/include/linux/init_task.h~2_INIT_PID_DONT_LINK_SWAPPER 2010-05-10 19:45:27.000000000 +0200
+++ 34-rc1/include/linux/init_task.h 2010-05-10 20:26:08.000000000 +0200
@@ -53,9 +53,9 @@ extern struct group_info init_groups;
#define INIT_STRUCT_PID { \
.count = ATOMIC_INIT(1), \
.tasks = { \
- { .first = &init_task.pids[PIDTYPE_PID].node }, \
- { .first = &init_task.pids[PIDTYPE_PGID].node }, \
- { .first = &init_task.pids[PIDTYPE_SID].node }, \
+ { .first = NULL }, \
+ { .first = NULL }, \
+ { .first = NULL }, \
...Crimey, trying to find some way this could get dereferenced, finding myself impressed with the likes of set_ftrace_swapper(). Acked-by: Serge E. Hallyn <serue@us.ibm.com> thanks, -serge --
Yes, I was worried too. But afaics we should never use this hlist_node. Thanks for review! Oleg. --
copy_process(pid => &init_struct_pid) doesn't do attach_pid/etc.
It shouldn't, but this means that the idle threads run with the wrong
pids copied from the caller's task_struct. In x86 case the caller is
either kernel_init() thread or keventd.
In particular, this means that after the series of cpu_up/cpu_down an
idle thread (which never exits) can run with .pid pointing to nowhere.
Change fork_idle() to initialize idle->pids[] correctly. We only set
.pid = &init_struct_pid but do not add .node to list, INIT_TASK() does
the same for the boot-cpu idle thread (swapper).
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
kernel/fork.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
--- 34-rc1/kernel/fork.c~3_FORK_IDLE_SET_PIDS 2010-03-24 18:07:03.000000000 +0100
+++ 34-rc1/kernel/fork.c 2010-05-10 20:45:33.000000000 +0200
@@ -1339,6 +1339,16 @@ noinline struct pt_regs * __cpuinit __at
return regs;
}
+static inline void init_idle_pids(struct pid_link *links)
+{
+ enum pid_type type;
+
+ for (type = PIDTYPE_PID; type < PIDTYPE_MAX; ++type) {
+ INIT_HLIST_NODE(&links[type].node); /* not really needed */
+ links[type].pid = &init_struct_pid;
+ }
+}
+
struct task_struct * __cpuinit fork_idle(int cpu)
{
struct task_struct *task;
@@ -1346,8 +1356,10 @@ struct task_struct * __cpuinit fork_idle
task = copy_process(CLONE_VM, 0, idle_regs(&regs), 0, NULL,
&init_struct_pid, 0);
- if (!IS_ERR(task))
+ if (!IS_ERR(task)) {
+ init_idle_pids(task->pids);
init_idle(task, cpu);
+ }
return task;
}
--
Cosmetic, no changes in the compiled code. Just s/NULL/SIG_DFL/ to make
it more readable and grep-friendly.
Note: probably SIG_IGN makes more sense, we could kill ignore_signals().
But then kernel_init() should do flush_signal_handlers() before exec().
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---
include/linux/init_task.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- 34-rc1/include/linux/init_task.h~4_INIT_SIGHAND_USE_SIG_DFL 2010-05-10 20:26:08.000000000 +0200
+++ 34-rc1/include/linux/init_task.h 2010-05-10 21:06:03.000000000 +0200
@@ -43,7 +43,7 @@ extern struct nsproxy init_nsproxy;
#define INIT_SIGHAND(sighand) { \
.count = ATOMIC_INIT(1), \
- .action = { { { .sa_handler = NULL, } }, }, \
+ .action = { { { .sa_handler = SIG_DFL, } }, }, \
.siglock = __SPIN_LOCK_UNLOCKED(sighand.siglock), \
.signalfd_wqh = __WAIT_QUEUE_HEAD_INITIALIZER(sighand.signalfd_wqh), \
}
--
On Mon, 10 May 2010 21:49:17 +0200 Do you see a need to merge these into 2.6.34? (I don't) --
No, the problem is minor, it is not possible to exploit it unless /sbin/init does "bad things". And the long CC asks for review. Although 1/4 is "obviously good" in any case and I strongly believe 2/4 is right at least in general. Oleg. --
All 4 Acked-by: Roland McGrath <roland@redhat.com> --
