Re: [PATCH 4/5] don't panic if /sbin/init exits or killed

Previous thread: [PATCH 3/5] signals: allow the kernel to actually kill /sbin/init by Oleg Nesterov on Sunday, March 16, 2008 - 8:54 am. (1 message)

Next thread: [PATCH 5/5] ptrace: it is fun to strace /sbin/init by Oleg Nesterov on Sunday, March 16, 2008 - 8:54 am. (7 messages)
From: Oleg Nesterov
Date: Sunday, March 16, 2008 - 8:54 am

If the buggy init exits, the kernel panics. I see no point for this. It is very
possible that the system is still usable enough, at least to read the logs and
prepare the bug report.

Change exit_child_reaper() to do BUG() instead of panic().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- 25/kernel/exit.c~4_EXIT_DONT_PANIC	2008-03-16 16:49:30.000000000 +0300
+++ 25/kernel/exit.c	2008-03-16 16:49:35.000000000 +0300
@@ -861,8 +861,10 @@ static inline void exit_child_reaper(str
 	if (likely(tsk->group_leader != task_child_reaper(tsk)))
 		return;
 
-	if (tsk->nsproxy->pid_ns == &init_pid_ns)
-		panic("Attempted to kill init!");
+	if (unlikely(tsk->nsproxy->pid_ns == &init_pid_ns)) {
+		printk(KERN_EMERG "Kernel panic - init is killed!\n");
+		BUG();
+	}
 
 	/*
 	 * @tsk is the last thread in the 'cgroup-init' and is exiting.

--

From: Roland McGrath
Date: Sunday, March 16, 2008 - 3:19 pm

BUG() does not seem right to me.  This does not diagnose any kernel bug.
The kernel source location and backtrace are not useful.  In fact, they
are likely to mislead the user into reporting the bug to the wrong place
(because it will look like a kernel bug).

I gather your motivation is to get something "recoverable" rather than
always rebooting.  This might be useful for developers like you and me.
I suspect that conservative administrators of production systems prefer
the current behavior.  If the boot init dies, that is reasonably likely
to be a "catastrophic" failure of the system as a whole as far as the
proprietor of a production system is concerned.  That is, the system may
no longer behave as expected in ways essential for its normal operation.
If it sticks around in that condition, appearing to be available but not
doing everything it should, that is usually worse than a quick and
orderly crash (which the installation's procedures and monitoring
infrastructure are often prepared to handle).

panic is a bit extreme for the situation, where we have no reason yet to
think kernel data structures are inconsistent.  A sync+reboot or sync+crash
without bust_spinlocks et al might be better.

For letting init die and calling it recoverable for hacking purposes, a
sysctl to disable the panic/crash makes sense.  But I don't think we
should change the default setting.

Have you tested how recoverable it really is?  I wonder what happens
with init having exited when things get reparented to it.  Don't the
zombies just pile up?


Thanks,
Roland
--

From: Krzysztof Halasa
Date: Sunday, March 16, 2008 - 3:54 pm

That's not very different from any other "important" program crash,
we don't sync+reboot when httpd or sshd is killed.
-- 
Krzysztof Halasa
--

From: Oleg Nesterov
Date: Sunday, March 16, 2008 - 4:03 pm

Yes sure, we leak the re-parented zombies, and nobody can take care of
/etc/inittab. As expected.



Well, I think the generic "if we have a chance to survive, we should try
to survive" rule is good.

If the boot init dies, at least the admin has a chance to figure out what
has happened, and -o remount,ro /.

Every BUG/BUG_ON in fact means the system is not useable, but still it does
not panic(), but tries to proceed.

In short, I can't see why panic() is better. Except we have panic_timeout,

OK, I won't argue (not that I agree ;).

Oleg.

--

From: Alan Cox
Date: Sunday, March 16, 2008 - 3:55 pm

A long long time ago someone posted a patch which arranged that if the
init process died it got respawned via the same list of process names as
happens on boot.

Or for that matter we could catch the dying init and replace it with a
kernel side while(1) wait(NULL); so that we at least continue the cleanup.

Alan
--

From: Roland McGrath
Date: Sunday, March 16, 2008 - 4:32 pm

> But panic() isn't better? It doesn't provide any useful info.

It is not misleading in the same way.  It's clear that going to look at the

For me and you, I agree.  I think the common case is that there is no admin
prepared to do any such thing, but just someone expecting a reboot to fix
things and preferring that a failing system reboot itself in the middle of

Many production systems probably set panic_on_oops.  Having the init panic
behavior keyed on that seems fine to me.  I just don't like the "kernel bug
at this source line" output when it's not true.


Thanks,
Roland
--

From: Oleg Nesterov
Date: Sunday, March 16, 2008 - 4:49 pm

Ah, OK. We can change this to dump_stack() without BUG().

(but again! panic() isn't better, it also looks like a kernel bug).

Oleg.

--

From: Roland McGrath
Date: Sunday, March 16, 2008 - 4:59 pm

> Ah, OK. We can change this to dump_stack() without BUG().

I'm not sure that's a whole lot better as far as confusion.  
And what use is it?  You know init exited.  Just print its exit_code 
and/or group_exit_code and you know what kind of kernel entry the exit was.


Thanks,
Roland
--

From: Oleg Nesterov
Date: Sunday, March 16, 2008 - 5:05 pm

Agreed. It was either killed, or exited. The exit code provides enough info.

Thanks, I'll resend this patch.

Oleg.

--

From: H. Peter Anvin
Date: Friday, March 28, 2008 - 10:47 pm

This would be highly undesirable in a production system, since it would 
leave the machine an unusable zombie.  In a production system, the panic 
can be made to reboot the system, bringing it back online.

	-hpa

--

From: Oleg Nesterov
Date: Saturday, March 29, 2008 - 3:51 am

I can't agree. Following this logic, we should always use panic() instead
of BUG().

I think the system should try to use any chance to survive, and we have
panic_on_oops.


That said, Stephen has a good reason to nack this patch. But still I hope
it is possible to find a simple solution. Not that I think this is really
important, but this panic() is silly, imho.

Oleg.

--

Previous thread: [PATCH 3/5] signals: allow the kernel to actually kill /sbin/init by Oleg Nesterov on Sunday, March 16, 2008 - 8:54 am. (1 message)

Next thread: [PATCH 5/5] ptrace: it is fun to strace /sbin/init by Oleg Nesterov on Sunday, March 16, 2008 - 8:54 am. (7 messages)