Subject: move WARN_ON() out of line
From: Arjan van de Ven <arjan@linux.intel.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Andrew Morton <akpm@linux-foundation.org>A quick grep shows that there are currently 1145 instances of WARN_ON
in the kernel. Currently, WARN_ON is pretty much entirely inlined,
which makes it hard to enhance it without growing the size of the kernel
(and getting Andrew unhappy).This patch moves WARN_ON() out of line entirely. I've considered keeping
the test inline and moving only the slowpath out of line, but I decided
against that: an out of line test reduces the pressure on the CPUs
branch predictor logic and gives smaller code, while a function call
to a fixed location is quite fast. Likewise I've considered doing something
similar to BUG() (eg use a trapping instruction) but that's not really
better (it needs the test inline again and recovering from an invalid
instruction isn't quite fun).The code size reduction of this patch was about 6.5Kb (on a distro style
.config):text data bss dec hex filename
3096493 293455 2760704 6150652 5dd9fc vmlinux.before
3090006 293455 2760704 6144165 5dc0a5 vmlinux.afterSigned-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
include/asm-generic/bug.h | 13 ++++---------
kernel/panic.c | 13 +++++++++++++
2 files changed, 17 insertions(+), 9 deletions(-)Index: linux-2.6.24-rc6/include/asm-generic/bug.h
===================================================================
--- linux-2.6.24-rc6.orig/include/asm-generic/bug.h
+++ linux-2.6.24-rc6/include/asm-generic/bug.h
@@ -32,15 +32,10 @@ struct bug_entry {
#endif#ifndef HAVE_ARCH_WARN_ON
-#define WARN_ON(condition) ({ \
- int __ret_warn_on = !!(condition); \
- if (unlikely(__ret_warn_on)) { \
- printk("WARNING: at %s:%d %s()\n", __FILE__, \
- __LINE__, __FUNCTION__); \
- dump_stack(); \
- } \
- unlikely(__ret_warn_on); \
-})
+extern int do_...
Power implements WARN_ON this way, and all the machinery is in place to
generically implement WARN_ON that way if you want. It does generate
denser code than the call (since its just a single trapping instruction
with no need for argument setup), and the performance cost of the trap
shouldn't matter if warnings are rare (which one would hope).J
--
I just did an experiment with this to see how much is on the table. I made
a file with 1024 WARN_ON()'s (new style, eg the out of line call) and 1024 BUG_ON()'s,
which on i386 already use the trap.
This shows that the BUG_ON() case is 2Kb shorter in generated code. From this 2Kb you
need to subtract all the code size that is needed to deal with the trap and the module
merging/unmerging of trap points etc etc, so lets say that a total of 1Kb is left on the table.
HOWEVER, if you have a module with, say, only 4 WARN_ON()/BUG_ON()'s, you actually LOOSE
48 bytes, because of the extra overhead of how the trap data is stored.So... call me unconvinced for now. There's 30 Kb on the table with the easy, obviously safe
transform, and maybe another 1Kb with the much more tricky trapping scenario, but only
for the vmlinux case; the module case seems to be a loss instead.--
Yeah, that seems reasonable if you're optimising for overall size. Did
you count the difference of including the function name? We decided not
to include it for BUG because its usefulness/size tradeoff didn't seem
terribly important.But my goal was actually to reduce icache pollution, so by my reckoning
code bytes were much more expensive than data ones, so putting all BUG
information in a separate section makes those bytes much less
significant than putting anything inline in code. Also, the trap for
WARN_ON would be smaller than BUG, because it wouldn't need the spurious
infinite loop needed to make gcc understand the control flow of a BUG.On the other hand, you could put the call to out of line warning
function in a separate section to achieve the same effect.J
--
in the WARN_ON case it's not there either, based on Ingo's idea we do a kallsyms lookup
yeah and gcc even has a compiler option for that. Doubt it's really worth it,
--
Eh I have to retract my math here; I used a slightly older version of the WARN_ON patch series.
(before Ingo's suggestion)
In the new model, even at 1024 the out of line WARN_ON function call is smaller than the BUG_ON method.So I think that at least for x86, it's a loss to do what you suggest....
--
if people wonder where this comes from:
the BUG_ON code sequence is 13 bytes, the WARN_ON sequence
is 24 bytes, so 11 bytes longer. HOWEVER, the BUG_ON approach
also needs 12 bytes of data (20 on 64 bit) per bug, a nett loss
of 1 byte on 32 bit x86. (plus some general overhead for storing
sections as such, but that scales per ELF file, not per BUG_ON instance)--
Isn't this going to print "panic.c" for __FILE__, "do_warn_on" for __FUNCTION__
and whatever line number you have there for __LINE__? I put up a userspace equivalent
of what you're doing here, which confirms my suspision:dmvo@cipher:/tmp$ cat c.c
#include <stdio.h>#define WARN_ON(condition) do_warn_on((unsigned long)(condition), __FILE__, \
__LINE__, __FUNCTION__)int do_warn_on(const unsigned long condition, const char *file,
const int line, const char *function)
{
if (condition) {
printf("WARNING: at %s:%d %s()\n",
__FILE__, __LINE__, __FUNCTION__);
}
return !!condition;
}int main()
{
WARN_ON(1);
return 0;
}
dmvo@cipher:/tmp$ cc c.c
dmvo@cipher:/tmp$ ./a.out
WARNING: at c.c:11 do_warn_on()
dmvo@cipher:/tmp$I think that your intention was to propagate the parameters to the do_warn_on() routine
--
Hi Arjan,
I don't think you want to use __FILE__ and friends here...
--
hm. This passes in 4 arguments to do_warn_on().
i think we could get away with no arguments (!), by using section
tricks. Firstly, we can get rid of __FUNCTION__ and replace it with a
ksyms lookup - that is fine enough. Secondly, we could put __FILE__ and
__LINE__ into a text section and key it up to the return address from
do_warn_on().the condition code should not be passed in at all i think - it creates
extra function calls to do_warn_on() all the time.Ingo
--
BUG_ON already does this, and WARN_ON can reuse all the same machinery.
J
--
the asm generated for this is 2 movl instructions for immediate to register.
function calls are *CHEAP*.
passing the condition is actually near free (remember we have regparm!), it's likely to be
in a register already anyway.Doing the test inline makes stuff bigger, and also spreads the branch prediction pain around
rather than having one nicely predictable place...--
Hi Arjan,
I've got a couple of patches in -mm at the moment that introduces __WARN()
and uses that (and lets architectures override __WARN, since for example
powerpc does use trapping instructions similarly to BUG()).The two patches in question are:
bugh-remove-have_arch_bug--have_arch_warn.patch
powerpc-switch-to-generic-warn_on-bug_on.patchCare to do this incrementally on top of that instead? I.e. call
do_warn_on() from the asm-generic/bug.h __WARN() instead.-Olof
--
ok just did that; will post shortly
--
I hate the do_foo naming scheme (how about __warn_on?), but otherwise:
While we're here, I'll mention that dump_stack probably ought to take a
severity level argument.--
Mathematics is the supreme nostalgia of our time.--
You don't have to change all of them at once. Just create a new function
that does take a level and make the old dump_stack and WARN_ON call that
then slowly convert everything else across if so desired.Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
Having 2 is just silly though, and it'll take a long time, if ever,
to do this sort of thing slowly. Better to do it all at once.
The amount of change doesn't get smaller if you spread it out;
it's still a question if the total amount of change is worth
the added functionality.--
| Roland Dreier | Re: Integration of SCST in the mainstream Linux kernel |
| Jan Engelhardt | intel iommu (Re: -mm merge plans for 2.6.23) |
| Greg Kroah-Hartman | [PATCH 005/196] Chinese: add translation of SubmittingDrivers |
| Linus Torvalds | Re: Dual-Licensing Linux Kernel with GPL V2 and GPL V3 |
git: | |
| Linus Torvalds | Re: [GIT]: Networking |
| Gerrit Renker | [PATCH 0/37] dccp: Feature negotiation - last call for comments |
| Jarek Poplawski | [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Frans Pop | svc: failed to register lockdv1 RPC service (errno 97). |
