[PATCH 1/2] limit print_fatal_signal() rate (was: [RFC] log out-of-virtual-memory events)

Previous thread: [PATCH] Make common helpers for seq_files that work with list_head-s by Pavel Emelianov on Thursday, May 17, 2007 - 8:19 am. (3 messages)

Next thread: Re: usb-scanner-cameras kernel-2.6.22 and udev-095 problem by art on Thursday, May 17, 2007 - 10:05 am. (2 messages)
From: Andrea Righi
Date: Thursday, May 17, 2007 - 9:24 am

I'm looking for a way to keep track of the processes that fail to allocate new
virtual memory. What do you think about the following approach (untested)?

--

Print informations about the processes that fail to allocate virtual memory.

Signed-off-by: Andrea Righi <a.righi@cineca.it>

diff -urpN linux-2.6.21/mm/mmap.c linux-2.6.21-vm-log-enomem/mm/mmap.c
--- linux-2.6.21/mm/mmap.c	2007-04-26 05:08:32.000000000 +0200
+++ linux-2.6.21-vm-log-enomem/mm/mmap.c	2007-05-17 18:05:39.000000000 +0200
@@ -77,6 +77,26 @@ int sysctl_max_map_count __read_mostly =
 atomic_t vm_committed_space = ATOMIC_INIT(0);
 
 /*
+ * Print current process informations when it fails to allocate new virtual
+ * memory.
+ */
+static inline void log_vm_enomem(void)
+{
+	unsigned long total_vm = 0;
+	struct mm_struct *mm;
+
+	task_lock(current);
+	mm = current->mm;
+	if (mm)
+		total_vm = mm->total_vm;
+	task_unlock(current);
+
+	printk(KERN_INFO
+	       "out of virtual memory for process %d (%s): total_vm=%lu, uid=%d\n",
+	       current->pid, current->comm, total_vm, current->uid);
+}
+
+/*
  * Check that a process has enough memory to allocate a new virtual
  * mapping. 0 means there is enough memory for the allocation to
  * succeed and -ENOMEM implies there is not.
@@ -175,6 +195,7 @@ int __vm_enough_memory(long pages, int c
 		return 0;
 error:
 	vm_unacct_memory(pages);
+	log_vm_enomem();
 
 	return -ENOMEM;
 }
-

From: Rik van Riel
Date: Thursday, May 17, 2007 - 11:22 am

Looks like an easy way for users to spam syslogd over and
over and over again.

At the very least, shouldn't this be dependant on print_fatal_signals?

-- 
Politics is the struggle between those who want to make their country
the best in the world, and those who believe it already is.  Each group
calls the other unpatriotic.
-

From: Jan Engelhardt
Date: Thursday, May 17, 2007 - 11:28 pm

Speaking of signals, everytime I get a segfault (or force one with a test
program) on x86_64, the kernel prints to dmesg:

fail[22278]: segfault at 0000000000000000 rip 00000000004004b8 rsp
00007ffff7ecda50 error 6

I do not see such on i386, so why for x86_64?


	Jan
-- 
-

From: Andi Kleen
Date: Friday, May 18, 2007 - 4:47 am

So that you know that one of your programs crashed. That's a feature.

-Andi
-

From: Jan Engelhardt
Date: Saturday, May 19, 2007 - 12:46 am

This feature could be handy for i386 too.


	Jan
-- 
-

From: Andrea Righi
Date: Saturday, May 19, 2007 - 2:35 am

What about your /proc/sys/kernel/print-fatal-signals? it must be set to 1 to
enable that feature.

-Andrea
-

From: Jan Engelhardt
Date: Saturday, May 19, 2007 - 3:06 am

That file does not exist on versions
  2.6.18 <= version <= 2.6.20


	Jan
-- 
-

From: Andrea Righi
Date: Saturday, May 19, 2007 - 3:16 am

This means that you must apply the print_fatal_signals patch...

-Andrea
-

From: Folkert van Heusden
Date: Saturday, May 19, 2007 - 5:14 pm

Since 2.6.18.2 I use this patch. With 2.6.21.1 it still applies altough
with a small offsets. Works like a charm.


Signed-off by: Folkert van Heusden <folkert@vanheusden.com>

--- linux-2.6.18.2/kernel/signal.c      2006-11-04 02:33:58.000000000 +0100
+++ linux-2.6.18.2.new/kernel/signal.c  2006-11-17 15:59:13.000000000 +0100
@@ -706,6 +706,15 @@
        struct sigqueue * q = NULL;
        int ret = 0;

+       if (sig == SIGQUIT || sig == SIGILL  || sig == SIGTRAP ||
+           sig == SIGABRT || sig == SIGBUS  || sig == SIGFPE  ||
+           sig == SIGSEGV || sig == SIGXCPU || sig == SIGXFSZ ||
+           sig == SIGSYS  || sig == SIGSTKFLT)
+       {
+               printk(KERN_WARNING "Sig %d send to %d owned by %d.%d (%s)\n",
+                       sig, t -> pid, t -> uid, t -> gid, t -> comm);
+       }
+
        /*
         * fast-pathed signals for kernel-internal things like SIGSTOP
         * or SIGKILL.



Folkert van Heusden

-- 
www.biglumber.com <- site where one can exchange PGP key signatures 
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Eric Dumazet
Date: Saturday, May 19, 2007 - 8:55 pm

Please check line 219 of Documentation/CodingStyle, Section 3.1: Spaces

	and no space around the '.' and "->" structure member operators.

Thank you

-

From: Folkert van Heusden
Date: Sunday, May 20, 2007 - 4:21 am

New version without the spaces around '->' and a nice 'unlikely' added. 

Signed-off by: Folkert van Heusden <folkert@vanheusden.com>

--- linux-2.6.18.2/kernel/signal.c	2006-11-04 02:33:58.000000000 +0100
+++ linux-2.6.18.2.new/kernel/signal.c	2006-11-17 15:59:13.000000000 +0100
@@ -706,6 +706,15 @@
 	struct sigqueue * q = NULL;
 	int ret = 0;
 
+	if (unlikely(sig == SIGQUIT || sig == SIGILL  || sig == SIGTRAP ||
+	    sig == SIGABRT || sig == SIGBUS  || sig == SIGFPE  ||
+	    sig == SIGSEGV || sig == SIGXCPU || sig == SIGXFSZ ||
+	    sig == SIGSYS  || sig == SIGSTKFLT))
+	{
+		printk(KERN_WARNING "Sig %d send to %d owned by %d.%d (%s)\n",
+			sig, t->pid, t->uid, t->gid, t->comm);
+	}
+
 	/*
 	 * fast-pathed signals for kernel-internal things like SIGSTOP
 	 * or SIGKILL.


Folkert van Heusden

-- 
MultiTail cok yonlu kullanimli bir program, loglari okumak, verilen
kommandolari yerine getirebilen. Filter, renk verme, merge, 'diff-
view', vs.  http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Stephen Hemminger
Date: Sunday, May 20, 2007 - 9:08 am

On Sun, 20 May 2007 13:21:11 +0200

Would turning that into a switch() generate better code.

-- 
Stephen Hemminger <shemminger@linux-foundation.org>
-

From: Folkert van Heusden
Date: Sunday, May 20, 2007 - 9:12 am

Doubt it: in the worst case you still nee to check for each possibility.
Furthermore a.f.a.i.k. with switch you cannot do 'unlinkely()'.


Folkert van Heusden

-- 
MultiTail er et flexible tool for å kontrolere Logfiles og commandoer.
Med filtrer, farger, sammenføringer, forskeliger ansikter etc.
http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Jan Engelhardt
Date: Sunday, May 20, 2007 - 1:38 pm

With if(), it generates a ton of "CMP, JE" instructions.
With switch(), I would assume gcc transforms it into using
a jump table (aka "JMP [table+sig]")

I tried it: with switch(), gcc transforms this into a
bitmap comparison ("MOV eax, 1; SHL eax, sig; TEST eax, 0x830109f8"),
which seems even cheaper than a jump table.


	Jan
-- 
-

From: Folkert van Heusden
Date: Sunday, May 20, 2007 - 1:55 pm

Ok, here's the new patch against 2.6.21.1:

Signed-off by Folkert van Heusden <folkert@vanheusden.com>

--- kernel/signal.c.org	2007-05-20 22:47:13.000000000 +0200
+++ kernel/signal.c	2007-05-20 22:54:17.000000000 +0200
@@ -739,6 +739,25 @@
 	struct sigqueue * q = NULL;
 	int ret = 0;
 
+	/* emit some logging for nasty signals
+	 * especially SIGSEGV and friends aught to be looked at when happening
+	 */
+	switch(sig) {
+	case SIGQUIT: 
+	case SIGILL: 
+	case SIGTRAP:
+	case SIGABRT: 
+	case SIGBUS: 
+	case SIGFPE:
+	case SIGSEGV: 
+	case SIGXCPU: 
+	case SIGXFSZ:
+	case SIGSYS: 
+	case SIGSTKFLT:
+		printk(KERN_WARNING "Sig %d send to %d owned by %d.%d (%s)\n",
+			sig, t -> pid, t -> uid, t -> gid, t -> comm);
+	}
+
 	/*
 	 * fast-pathed signals for kernel-internal things like SIGSTOP
 	 * or SIGKILL.

Folkert van Heusden

-- 
MultiTail è uno flexible tool per seguire di logfiles e effettuazione
di commissioni. Feltrare, provedere da colore, merge, 'diff-view',
etc. http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Andi Kleen
Date: Sunday, May 20, 2007 - 2:14 pm

Unconditional? That's definitely a very bad idea. If anything only unhandled
signals should be printed this way because some programs use them internally. 
But I think your list is far too long anyways.

-Andi
-

From: Folkert van Heusden
Date: Sunday, May 20, 2007 - 2:20 pm

Use these signals internally? Afaik these are fatal, stopping the

So, which ones would you like to have removed then?


Folkert van Heusden

-- 
To MultiTail einai ena polymorfiko ergaleio gia ta logfiles kai tin
eksodo twn entolwn. Prosferei: filtrarisma, xrwmatismo, sygxwneysi,
diaforetikes provoles. http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Folkert van Heusden
Date: Sunday, May 20, 2007 - 2:23 pm

(and why, of course, to enlighten me: some are educated guesses)


Folkert van Heusden

-- 
MultiTail ist eine flexible Applikation um Logfiles und Kommando
Eingaben zu überprüfen. Inkl. Filter, Farben, Zusammenführen,
Ansichten etc. http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Andi Kleen
Date: Sunday, May 20, 2007 - 3:24 pm

SIGFPE at least and the accounting signals are dubious too. SIGQUIT can
be also relatively common.

-Andi
-

From: Jeff Dike
Date: Sunday, May 20, 2007 - 3:22 pm

And SIGSEGV and SIGBUS - UML catches these internally and handles them.

				Jeff
-

From: Gábor Lénárt
Date: Monday, May 21, 2007 - 4:26 am

For example mplayer uses SIGSEGV: it does not check ALL of the possible
error situations, in case of SIGSEGV (a bad stream etc) it quickly
reinit itself from the last position. It's much better in performance (well
sure for a stream without error, or only few of them) than to check all of
the possible problems. Nowdays it may not a problem to check everything but
it WAS when MPlayer born (much slower CPUs). But some signals are also
useful for emulation etc (eg in my own DOS emulator some years ago ....).

-- 
- Gábor
-

From: Andrea Righi
Date: Monday, May 21, 2007 - 3:45 am

Maybe you could use somthing similar to unhandled_signal() in
arch/x86_64/mm/fault.c, but I agree that the list seems a bit too long...

-Andrea
-

From: Folkert van Heusden
Date: Monday, May 21, 2007 - 4:04 am

What about the following enhancement: I check with sig_fatal if it would
kill the process and only then emit a message. So when an application
takes care itself of handling it nothing is printed.

Signed-off by: Folkert van Heusden <folkert@vanheusden.com>

--- kernel/signal.c.org	2007-05-20 22:47:13.000000000 +0200
+++ kernel/signal.c	2007-05-21 12:59:52.000000000 +0200
@@ -739,6 +739,8 @@
 	struct sigqueue * q = NULL;
 	int ret = 0;
 
+	/* emit some logging for unhandled signals
+	 */
+	if (sig_fatal(t, sig))
+	{
+		printk(KERN_WARNING "Sig %d send to %d owned by %d.%d (%s)\n",
+		sig, t -> pid, t -> uid, t -> gid, t -> comm);
+	}
+
 	/*
 	 * fast-pathed signals for kernel-internal things like SIGSTOP
 	 * or SIGKILL.

of course, this can also be limited to only the interesting signals:

Signed-off by: Folkert van Heusden <folkert@vanheusden.com>

--- kernel/signal.c.org	2007-05-20 22:47:13.000000000 +0200
+++ kernel/signal.c	2007-05-21 12:59:52.000000000 +0200
@@ -739,6 +739,28 @@
 	struct sigqueue * q = NULL;
 	int ret = 0;
 
+	/* emit some logging for nasty signals
+	 * especially SIGSEGV and friends aught to be looked at when happening
+	 */
+	switch(sig) {
+	case SIGQUIT: 
+	case SIGILL: 
+	case SIGTRAP:
+	case SIGABRT: 
+	case SIGBUS: 
+	case SIGFPE:
+	case SIGSEGV: 
+	case SIGXCPU: 
+	case SIGXFSZ:
+	case SIGSYS: 
+	case SIGSTKFLT:
+		if (sig_fatal(t, sig))
+		{
+			printk(KERN_WARNING "Sig %d send to %d owned by %d.%d (%s)\n",
+			sig, t -> pid, t -> uid, t -> gid, t -> comm);
+		}
+	}
+
 	/*
 	 * fast-pathed signals for kernel-internal things like SIGSTOP
 	 * or SIGKILL.


Folkert van Heusden

-- 

Multitail - gibkaja utilita po sledovaniju log-fajlov i vyvoda
kommand. Fil'trovanie, raskrašivanie, slijanie, vizual'noe sravnenie,
i t.d.  http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Jan Engelhardt
Date: Monday, May 21, 2007 - 5:30 am

Jan
-- 
-

From: Folkert van Heusden
Date: Monday, May 21, 2007 - 5:47 am

Can we already use that one in send_signal? As the signal needs to be
send first I think before we know if it was handled or not? sig_fatal
checks if the handler is set to default - which is it is not taken care


Description:
This patch adds code to the signal-sender making it log a message when
an unhandled fatal signal will be delivered.

Signed-off by: Folkert van Heusden <folkert@vanheusden.com>

--- kernel/signal.c.org	2007-05-20 22:47:13.000000000 +0200
+++ kernel/signal.c	2007-05-21 14:46:05.000000000 +0200
@@ -739,6 +739,12 @@
 	struct sigqueue * q = NULL;
 	int ret = 0;
 
+	/* unhandled fatal signals are logged */
+	if (sig_fatal(t, sig)) {
+		printk(KERN_WARNING "Sig %d sent to %d owned by %d.%d (%s)\n",
+		sig, t->pid, t->uid, t->gid, t->comm);
+	}
+
 	/*
 	 * fast-pathed signals for kernel-internal things like SIGSTOP
 	 * or SIGKILL.


Folkert van Heusden

-- 
MultiTail è uno flexible tool per seguire di logfiles e effettuazione
di commissioni. Feltrare, provedere da colore, merge, 'diff-view',
etc. http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Andrea Righi
Date: Monday, May 21, 2007 - 6:58 am

What about ptrace()'d processes? I don't think we should log signals for them...

-Andrea
-

From: Folkert van Heusden
Date: Monday, May 21, 2007 - 11:59 am

Why not?


Folkert van Heusden

-- 
MultiTail es una herramienta flexibele para consiguir archivos de log,
y para ejecutar ordenes. Filtrar, añadir colores, merger y vista de
las differencias.  http://www.vanheusden.com/multitail/
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Andrea Righi
Date: Monday, May 21, 2007 - 3:15 pm

Maybe sometimes it's useful, maybe not, but I suppose that usually only the
controlling process should care about the critical signals received by the
controlled process. I simply don't think it should be a system issue. For
example I wouldn't like to have a lot of messages in the kernel logs just
because I'm debugging some segfaulting programs with gdb.

-Andrea
-

From: Satyam Sharma
Date: Wednesday, May 23, 2007 - 11:00 am

Gargh ... why does this want to be in the *kernel*'s logs? In any case, can
you please make this KERN_INFO (or lower) instead of KERN_WARNING.
-

From: Folkert van Heusden
Date: Wednesday, May 23, 2007 - 11:45 am

Description:
This patch adds code to the signal-sender making it log a message when
an unhandled fatal signal will be delivered.

Signed-of by: Folkert van Heusden <folkert@vanheusden.com

--- kernel/signal.c.org	2007-05-20 22:47:13.000000000 +0200
+++ kernel/signal.c	2007-05-21 14:46:05.000000000 +0200
@@ -739,6 +739,12 @@
 	struct sigqueue * q = NULL;
 	int ret = 0;
 
+	/* unhandled fatal signals are logged */
+	if (sig_fatal(t, sig)) {
+		printk(KERN_INFO "Sig %d sent to %d owned by %d.%d (%s)\n",
+		sig, t->pid, t->uid, t->gid, t->comm);
+	}
+
 	/*
 	 * fast-pathed signals for kernel-internal things like SIGSTOP
 	 * or SIGKILL.


Folkert van Heusden

-- 
Temperature outside:    21.437500, temperature livingroom: 
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Folkert van Heusden
Date: Sunday, June 10, 2007 - 12:53 pm

New version. This one also informs the user about the sende pid/uid of
the signal (when applicable).

Signed-of by: Folkert van Heusden <folkert@vanheusden.com

--- linux/kernel/signal.c.org	2007-05-20 22:47:13.000000000 +0200
+++ linux/kernel/signal.c	2007-06-10 00:21:31.000000000 +0200
@@ -739,6 +739,18 @@
 	struct sigqueue * q = NULL;
 	int ret = 0;
 
+	/* unhandled fatal signals are logged */
+	if (sig_fatal(t, sig)) {
+		if (is_si_special(info))
+			printk(KERN_INFO "Sig %d sent to %d owned by %d.%d (%s)\n",
+				sig, t->pid, t->uid, t->gid, t->comm);
+		else
+			printk(KERN_INFO "Sig %d sent to %d owned by %d.%d (%s), sent by pid %d, uid %d\n",
+				sig, t->pid, t->uid, t->gid, t->comm,
+				info -> _sifields._kill._pid,
+				info -> _sifields._kill._uid);
+	}
+
 	/*
 	 * fast-pathed signals for kernel-internal things like SIGSTOP
 	 * or SIGKILL.


Folkert van Heusden

-- 
Feeling generous? -> http://www.vanheusden.com/wishlist.php
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

From: Jiri Kosina
Date: Sunday, June 10, 2007 - 1:06 pm

Am I the only one whose eyes are hurt by these spaces?

-- 
Jiri Kosina
-

From: Jan Engelhardt
Date: Sunday, June 10, 2007 - 1:37 pm

They were discussed before already. And they were fixed up (t->uid...).
And now new ones got added. Ergh.



	Jan
-- 
-

From: Andrea Righi
Date: Friday, May 18, 2007 - 12:50 am

Anyway, with print-fatal-signals enabled a user could spam syslogd too, simply
with a (char *)0 = 0 program, but we could always identify the spam attempts
logging the process uid...

In any case, I agree, it should depend on that patch...

What about adding a simple msleep_interruptible(SOME_MSECS) at the end of
log_vm_enomem() or, at least, a might_sleep() to limit the potential spam/second
rate?

-Andrea
-

From: Robin Holt
Date: Friday, May 18, 2007 - 2:16 am

An msleep will slow down this process, but do nothing about slowing
down the amount of logging.  Simply fork a few more processes and all
you are doing with msleep is polluting the pid space.

What about a throttling similar to what ia64 does for floating point
assist faults (handle_fpu_swa()).  There is a thread flag to not log
the events at all.  It is rate throttled globally, but uses per cpu
variables for early exits.  This algorithm scaled well to a thousand
cpus.

I think this may be a good fit.

Good Luck,
Robin
-

From: Andrea Righi
Date: Friday, May 18, 2007 - 8:55 am

Actually using printk_ratelimit() should be enough... BTW print_fatal_signals()
should use it too.

-Andrea

---

Signed-off-by: Andrea Righi <a.righi@cineca.it>

diff -urpN linux-2.6.21/mm/mmap.c linux-2.6.21-vm-log-enomem/mm/mmap.c
--- linux-2.6.21/mm/mmap.c	2007-04-26 05:08:32.000000000 +0200
+++ linux-2.6.21-vm-log-enomem/mm/mmap.c	2007-05-18 17:17:32.000000000 +0200
@@ -77,6 +77,29 @@ int sysctl_max_map_count __read_mostly =
 atomic_t vm_committed_space = ATOMIC_INIT(0);
 
 /*
+ * Print current process informations when it fails to allocate new virtual
+ * memory.
+ */
+static inline void log_vm_enomem(void)
+{
+	unsigned long total_vm = 0;
+	struct mm_struct *mm;
+
+	if (unlikely(!printk_ratelimit()))
+		return;
+
+	task_lock(current);
+	mm = current->mm;
+	if (mm)
+		total_vm = mm->total_vm;
+	task_unlock(current);
+
+	printk(KERN_INFO
+	       "out of virtual memory for process %d (%s): total_vm=%lu, uid=%d\n",
+		current->pid, current->comm, total_vm, current->uid);
+}
+
+/*
  * Check that a process has enough memory to allocate a new virtual
  * mapping. 0 means there is enough memory for the allocation to
  * succeed and -ENOMEM implies there is not.
@@ -175,6 +198,7 @@ int __vm_enough_memory(long pages, int c
 		return 0;
 error:
 	vm_unacct_memory(pages);
+	log_vm_enomem();
 
 	return -ENOMEM;
 }
-

From: Andrea Righi
Date: Friday, May 18, 2007 - 9:05 am

I mean, something like this...

---

Limit the rate of the printk()s in print_fatal_signal() to avoid potential DoS
problems.

Signed-off-by: Andrea Righi <a.righi@cineca.it>

diff -urpN linux-2.6.22-rc1-mm1/kernel/signal.c linux-2.6.22-rc1-mm1-limit-print_fatal_signals-rate/kernel/signal.c
--- linux-2.6.22-rc1-mm1/kernel/signal.c	2007-05-18 17:48:55.000000000 +0200
+++ linux-2.6.22-rc1-mm1-limit-print_fatal_signals-rate/kernel/signal.c	2007-05-18 17:58:13.000000000 +0200
@@ -790,6 +790,9 @@ static void print_vmas(void)
 
 static void print_fatal_signal(struct pt_regs *regs, int signr)
 {
+	if (unlikely(!printk_ratelimit()))
+		return;
+
 	printk("%s/%d: potentially unexpected fatal signal %d.\n",
 		current->comm, current->pid, signr);
 
-

From: Bernd Eckenfels
Date: Friday, May 18, 2007 - 9:34 am

can we have both KERN_WARNING please?

Gruss
Bernd
-

From: Andrea Righi
Date: Saturday, May 19, 2007 - 3:33 am

Depends on print_fatal_signals patch.

---

Limit the rate of print_fatal_signal() to avoid potential denial-of-service
attacks.

Signed-off-by: Andrea Righi <a.righi@cineca.it>

diff -urpN linux-2.6.22-rc1-mm1/kernel/signal.c linux-2.6.22-rc1-mm1-vm-log-enomem/kernel/signal.c
--- linux-2.6.22-rc1-mm1/kernel/signal.c	2007-05-19 11:25:24.000000000 +0200
+++ linux-2.6.22-rc1-mm1-vm-log-enomem/kernel/signal.c	2007-05-19 11:30:00.000000000 +0200
@@ -790,7 +790,10 @@ static void print_vmas(void)
 
 static void print_fatal_signal(struct pt_regs *regs, int signr)
 {
-	printk("%s/%d: potentially unexpected fatal signal %d.\n",
+	if (unlikely(!printk_ratelimit()))
+		return;
+
+	printk(KERN_WARNING "%s/%d: potentially unexpected fatal signal %d.\n",
 		current->comm, current->pid, signr);
 
 #ifdef __i386__
-

From: Andrew Morton
Date: Sunday, May 20, 2007 - 8:31 pm

Well OK.  But vdso-print-fatal-signals.patch is designated not-for-mainline
anyway.

I think the DoS which you identify has been available for a very long time
on ia64, x86_64 and perhaps others.


-

From: Andrea Righi
Date: Monday, May 21, 2007 - 3:44 am

For the mainline a fix could be the following...

---

Limit the rate of the kernel logging for the segfaults of user applications, to
avoid potential message floods or denial-of-service attacks.

Signed-off-by: Andrea Righi <a.righi@cineca.it>

diff -urpN linux-2.6.22-rc2/arch/avr32/mm/fault.c linux-2.6.22-rc2-limit-segfaults-printk-rate/arch/avr32/mm/fault.c
--- linux-2.6.22-rc2/arch/avr32/mm/fault.c	2007-05-19 13:11:30.000000000 +0200
+++ linux-2.6.22-rc2-limit-segfaults-printk-rate/arch/avr32/mm/fault.c	2007-05-21 11:48:37.000000000 +0200
@@ -158,7 +158,7 @@ bad_area:
 	up_read(&mm->mmap_sem);
 
 	if (user_mode(regs)) {
-		if (exception_trace)
+		if (exception_trace && printk_ratelimit())
 			printk("%s%s[%d]: segfault at %08lx pc %08lx "
 			       "sp %08lx ecr %lu\n",
 			       is_init(tsk) ? KERN_EMERG : KERN_INFO,
diff -urpN linux-2.6.22-rc2/arch/x86_64/mm/fault.c linux-2.6.22-rc2-limit-segfaults-printk-rate/arch/x86_64/mm/fault.c
--- linux-2.6.22-rc2/arch/x86_64/mm/fault.c	2007-05-21 11:42:07.000000000 +0200
+++ linux-2.6.22-rc2-limit-segfaults-printk-rate/arch/x86_64/mm/fault.c	2007-05-21 11:45:55.000000000 +0200
@@ -489,7 +489,8 @@ bad_area_nosemaphore:
 		    (address >> 32))
 			return;
 
-		if (exception_trace && unhandled_signal(tsk, SIGSEGV)) {
+		if (exception_trace && unhandled_signal(tsk, SIGSEGV) &&
+		    printk_ratelimit()) {
 			printk(
 		       "%s%s[%d]: segfault at %016lx rip %016lx rsp %016lx error %lx\n",
 					tsk->pid > 1 ? KERN_INFO : KERN_EMERG,
-

From: Ingo Molnar
Date: Thursday, May 24, 2007 - 12:58 am

btw., why? It's very, very useful to distro, early-boot-userspace and 
glibc development. The only add-on change should be to not print SIGKILL 
events. Otherwise it's very much a keeper. Hm?

	Ingo
-

From: Andrew Morton
Date: Thursday, May 24, 2007 - 1:15 am

err, because that's what I decided a year ago.  I wonder why ;)

Perhaps because of the DoS thing, but it has a /proc knob and defaults to

<promotes it>
-

From: Ingo Molnar
Date: Thursday, May 24, 2007 - 2:55 am

yeah. There's also a boot option. To address the DoS angle, should i 
make it optionally printk_ratelimit() perhaps? (although often the 

thanks :-)

	Ingo
-

From: Andrew Morton
Date: Thursday, May 24, 2007 - 9:23 am

I don't think so, really.  It takes a deliberate act to turn the thing
on, after all.

I we _were_ concerned about the logspam then it might be better to make the
feature turn itself off after 100 messages, rather than ratelimiting it.
-

From: Andrea Righi
Date: Thursday, May 24, 2007 - 1:50 am

Actually it seems that SIGKILLs are not printed. In get_signal_to_deliver() we have:

[snip]
@@ -1843,6 +1879,8 @@ relock:
 		 * Anything else is fatal, maybe with a core dump.
 		 */
 		current->flags |= PF_SIGNALED;
+		if ((signr != SIGKILL) && print_fatal_signals)
+			print_fatal_signal(regs, signr);
 		if (sig_kernel_coredump(signr)) {
 			/*
 			 * If it was able to dump core, this kills all
[snip]

-Andrea
-

From: Ingo Molnar
Date: Thursday, May 24, 2007 - 2:58 am

yeah. Either i implemented that and forgot, or someone else implemented 
it. :)

	Ingo
-

From: Ingo Molnar
Date: Thursday, May 24, 2007 - 2:57 am

ah, that's already included in the version in -mm.

admittedly, the #ifdef __i386__ is quite lame, but there's no generic 
safely-try-to-show-code-at-addr function available at the moment.

	Ingo
-

From: Bernd Eckenfels
Date: Friday, May 18, 2007 - 9:36 am

And align this one with the print_fatal layout:

       printk(KERN_WARNING
              "%s/%d process cannot request more virtual memory: total_vm=%lu, uid=%d\n",
               current->comm, current->pid, total_vm, current->uid);

Greetings
Bernd
-

From: Andrea Righi
Date: Saturday, May 19, 2007 - 3:34 am

Depends on print_fatal_signals patch.

---

Print informations about userspace processes that fail to allocate new virtual
memory.

Signed-off-by: Andrea Righi <a.righi@cineca.it>

diff -urpN linux-2.6.22-rc1-mm1/mm/mmap.c linux-2.6.22-rc1-mm1-vm-log-enomem/mm/mmap.c
--- linux-2.6.22-rc1-mm1/mm/mmap.c	2007-05-19 11:25:24.000000000 +0200
+++ linux-2.6.22-rc1-mm1-vm-log-enomem/mm/mmap.c	2007-05-19 11:55:05.000000000 +0200
@@ -77,6 +77,31 @@ int sysctl_overcommit_ratio = 50;	/* def
 int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
 atomic_t vm_committed_space = ATOMIC_INIT(0);
 
+extern int print_fatal_signals;
+
+/*
+ * Print current process informations when it fails to allocate new virtual
+ * memory.
+ */
+static inline void log_vm_enomem(void)
+{
+	unsigned long total_vm = 0;
+	struct mm_struct *mm;
+
+	if (unlikely(!printk_ratelimit()))
+		return;
+
+	task_lock(current);
+	mm = current->mm;
+	if (mm)
+		total_vm = mm->total_vm;
+	task_unlock(current);
+
+	printk(KERN_WARNING
+	       "%s/%d process cannot request more virtual memory: total_vm=%lu, uid=%d\n",
+	       current->comm, current->pid, total_vm, current->uid);
+}
+
 /*
  * Check that a process has enough memory to allocate a new virtual
  * mapping. 0 means there is enough memory for the allocation to
@@ -177,6 +202,9 @@ int __vm_enough_memory(long pages, int c
 error:
 	vm_unacct_memory(pages);
 
+	if (print_fatal_signals)
+		log_vm_enomem();
+
 	return -ENOMEM;
 } 
-

From: Andrew Morton
Date: Sunday, May 20, 2007 - 8:32 pm

Why is this useful?
-

From: Andrea Righi
Date: Monday, May 21, 2007 - 3:48 am

Well... in strict overcommit mode (overcommit_memory=2) this is the only way to
track down problems of the (bad-designed) user applications that exit when they
receive a -ENOMEM without logging anything... and, anyway, it could be an
additional aid in figuring out what is going wrong on inside a system. BTW, I
don't think it should be enabled by default, so this is the reason why it should
depend on print_fatal_signals patch.

-Andrea
-

From: Folkert van Heusden
Date: Saturday, May 19, 2007 - 5:15 pm

Yeah well it's all captured by syslogd/klogd and written to a file and
diskspace is cheap.


Folkert van Heusden

-- 
Feeling generous? -> http://www.vanheusden.com/wishlist.php
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-

Previous thread: [PATCH] Make common helpers for seq_files that work with list_head-s by Pavel Emelianov on Thursday, May 17, 2007 - 8:19 am. (3 messages)

Next thread: Re: usb-scanner-cameras kernel-2.6.22 and udev-095 problem by art on Thursday, May 17, 2007 - 10:05 am. (2 messages)