Re: [RFC][PATCH] bitfields API

Previous thread: Re: Fwd: Re: [PATCH] USB: add USB test and measurement class driver by Marcel Janssen on Thursday, August 28, 2008 - 2:22 pm. (3 messages)

Next thread: [patch 2.6.27-rc4] rtc-cmos: wakes again from S5 by David Brownell on Thursday, August 28, 2008 - 2:29 pm. (4 messages)
To: David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>
Cc: Pekka Enberg <penberg@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 2:32 pm

Hi,

kmemcheck is the kernel patch that detects use of uninitialized memory.

kmemcheck reports eagerly, that is, any load of uninitialized memory
will be reported, even though the bits in question will be thrown away
later (before they are used in making a decision).

One problem that we have encountered (which generates a false positive
report) is that of bitfield accesses. In particular, whenever a bit of
a bitfield is set (or cleared), the whole byte may be loaded before the
specific bit is toggled. If the bitfield has not been accessed prior to
this, it will of course load the uninitialized byte and report this to
the user.

One solution to the problem is to tell the compiler that the memory
location in question is in fact uninitialized. This means that the
compiler can optimize out the initial load (since it will be unused in
either case), and no warning will be given by kmemcheck.

One way to tell this to the compiler is to set all the members of the
bitfield at once. We could do this unconditionally, although it adds
some overhead on certain architectures, or we can do it for certain
architectures only. My proposal is the attached patch -- a bitfields
"API". Please see the patch description for a thorough explanation.

(Please note that the patches do nothing unless the architecture is
explicit about wanting the functionality. On i386, the second patch in
the series would actually save 7 bytes, while on sparc it would take
about 48 bytes _more_.)

(The alternative to the patch is to explicitly initialize all the
fields [including any leftover "padding" fields that must also be
present] to zero at the same time, before copying in the new values.
This seems a little excessive to me, and I believe that a single
bitfield_begin_init() is neater and harder to break than an explicit
[and seemingly meaningless] list of bitfield member initializations).

So:

How do you feel about this patch? It's all about making kmemcheck more
useful... and not much else. Does it have any...

To: Vegard Nossum <vegard.nossum@...>
Cc: David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, Pekka Enberg <penberg@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 2:40 pm

--

To: Alexey Dobriyan <adobriyan@...>
Cc: David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, Pekka Enberg <penberg@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 2:46 pm

Ok, that's constructive :-P

Can we skip the type and always assume that it should be __u8/uint8_t?
I read somewhere that bitfields should anyway always be 1 byte wide if
the bitfield should be "portable". Would it help (to make this less
horrible) to omit the type declaration and have just the bitfield
members as arguments to the macro?

Thanks,

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
--

To: Vegard Nossum <vegard.nossum@...>
Cc: David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, Pekka Enberg <penberg@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 3:05 pm

Or you can parse instruction stream a bit more.

--

To: Alexey Dobriyan <adobriyan@...>
Cc: David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, Pekka Enberg <penberg@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 3:18 pm

I think we have different opinions of what exactly constitutes "horrible" :-P

Your suggestion would have made sense if I was a company of 10
developers who could import all of valgrind source code (including
opcode decoder and instruction emulator) into the kernel. But I only
writing kmemcheck in my free time, so this will never happen. Or.. at
least I will not do it. Of course, kmemcheck, valgrind/memcheck, and
indeed the kernel itself are all open source, so anybody could do it.

In the mean-time, I am looking for an acceptable solution. Other
debugging features use helper annotations too. But I am pretty sure
that the immediate solution will not include parsing the instruction
stream a bit more.

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
--

To: Alexey Dobriyan <adobriyan@...>
Cc: Vegard Nossum <vegard.nossum@...>, David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 3:07 pm

Hi Alexey,

Uhm, I'm not sure how that would work out. I mean, we'd need to see if
parts of the loaded byte are used or not... Doesn't sound practical or
too pretty.

Pekka
--

To: Vegard Nossum <vegard.nossum@...>
Cc: Alexey Dobriyan <adobriyan@...>, David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 3:02 pm

Why not do something like this (as suggested by Ingo, I think)? Yeah, the
macro should go into kmemcheck.h but I don't have a tree handy...

Pekka

Index: linux-2.6/include/linux/bitfield.h
===================================================================
--- /dev/null
+++ linux-2.6/include/linux/bitfield.h
@@ -0,0 +1,10 @@
+#ifndef __LINUX_BITFIELD_H
+#define __LINUX_BITFIELD_H
+
+#ifdef CONFIG_KMEMCHECK
+#define KMEMCHECK_BIT_FIELD(field) do { field = 0; } while (0)
+#else
+#define KMEMCHECK_BIT_FIELD(field) do { } while (0)
+#endif /* CONFIG_KMEMCHECK */
+
+#endif /* __LINUX_BITFIELD_H */

Index: linux-2.6/net/core/skbuff.c
===================================================================
--- linux-2.6.orig/net/core/skbuff.c
+++ linux-2.6/net/core/skbuff.c
@@ -55,6 +55,7 @@
#include <linux/rtnetlink.h>
#include <linux/init.h>
#include <linux/scatterlist.h>
+#include <linux/bitfield.h>

#include <net/protocol.h>
#include <net/dst.h>
@@ -209,6 +210,11 @@ struct sk_buff *__alloc_skb(unsigned int
skb->data = data;
skb_reset_tail_pointer(skb);
skb->end = skb->tail + size;
+ KMEMCHECK_BIT_FIELD(skb->local_df);
+ KMEMCHECK_BIT_FIELD(skb->cloned);
+ KMEMCHECK_BIT_FIELD(skb->ip_summed);
+ KMEMCHECK_BIT_FIELD(skb->nohdr);
+ KMEMCHECK_BIT_FIELD(skb->nfctinfo);
/* make sure we initialize shinfo sequentially */
shinfo = skb_shinfo(skb);
atomic_set(&shinfo->dataref, 1);
--

To: Pekka J Enberg <penberg@...>
Cc: Alexey Dobriyan <adobriyan@...>, David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 3:38 pm

That looks good to me. If the extra lines are okay with net people, we
can put this in the fixlets branch and make it the norm for dealing
with bitfields in kmemcheck.

One thing to keep in mind that if the members of the bitfield does not
span the entire width of the bitfield, the remaining bits must also be
assigned (as an extra "filler" member), otherwise GCC will not
optimize it to a single store. But that is not an issue in this
particular case since all the bits are used.

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
--

To: Pekka J Enberg <penberg@...>, Alexey Dobriyan <adobriyan@...>
Cc: David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Saturday, August 30, 2008 - 4:28 am

Hm, and this is exactly the case for the "do_not_encrypt" field of skbuff:

#ifdef CONFIG_IPV6_NDISC_NODETYPE
__u8 ndisc_nodetype:2;
#endif
#if defined(CONFIG_MAC80211) || defined(CONFIG_MAC80211_MODULE)
__u8 do_not_encrypt:1;
#endif

So we have no way to know where or how big the filler should be. This is
why the simple patch above is not sufficient.

Alexey: I have a modified proposal with slightly different syntax for
DEFINE_BITFIELD. Can you say whether this is acceptable or not? Please
see this short mockup example:

<--- cut --->

#include <stdint.h>
#include <string.h>

#define DEFINE_BITFIELD(name, fields...) \
union { \
struct fields name; \
struct fields; \
};

#define KMEMCHECK_ANNOTATE_BITFIELD(bitfield) \
do { \
memset(&(bitfield), 0, sizeof(bitfield)); \
} while(0)

struct skbuff {
DEFINE_BITFIELD(flags1, {
uint8_t pkt_type:3,
fclone:2,
ipvs_property:1,
peeked:1,
nf_trace:1;
uint16_t protocol;
});

DEFINE_BITFIELD(flags2, {
#ifdef CONFIG_IPV6_NDISC_NODETYPE
uint8_t ndisc_nodetype:2;
#endif
#if defined(CONFIG_MAC80211) || defined(CONFIG_MAC80211_MODULE)
uint8_t do_not_encrypt:1;
#endif
});
};

void __alloc_skb(struct skbuff *skb)
{
KMEMCHECK_ANNOTATE_BITFIELD(skb->flags1);
KMEMCHECK_ANNOTATE_BITFIELD(skb->flags2);
}

<--- cut --->

Thanks,

Vegard
--

To: Alexey Dobriyan <adobriyan@...>
Cc: Vegard Nossum <vegard.nossum@...>, David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 2:40 pm

Hi Alexey,

Heh, heh, one alternative is to have a kmemcheck_memset() thingy that
unconditionally zeroes bit fields and maybe is a no-op when kmemcheck is
disabled.
--

To: Pekka Enberg <penberg@...>
Cc: Alexey Dobriyan <adobriyan@...>, Vegard Nossum <vegard.nossum@...>, David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 4:27 pm

This sounds as if this might cause bugs to disappear when debugging gets
turned on?

Or do I miss anything?

cu
Adrian

--

"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed

--

To: Adrian Bunk <bunk@...>
Cc: Pekka Enberg <penberg@...>, Alexey Dobriyan <adobriyan@...>, David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 4:59 pm

You are correct :-)

Almost all the possible solutions (at least the feasible ones) are
trade-offs between false-positives and false-negatives.

So here we are trading a bunch of false-positive errors (a couple of
thousand for transferring a 1M file over ssh :-)) for detecting any
code that uses an uninitialized flag in struct skbuff. So in this case
it is more useful to hide reports about this single bit-field.

Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036
--

To: Adrian Bunk <bunk@...>
Cc: Alexey Dobriyan <adobriyan@...>, Vegard Nossum <vegard.nossum@...>, David Miller <davem@...>, Ingo Molnar <mingo@...>, Andrew Morton <akpm@...>, <linux-kernel@...>
Date: Thursday, August 28, 2008 - 4:54 pm

Yeah, I suppose. The problem doing that unconditionally is that it
increases kernel text slightly on some architectures (e.g. sparc).
However, as long as you use the KMEMCHECK_BIT_FIELD annotation only in
places that give you false positives, it's we should be safe.

Pekka
--

Previous thread: Re: Fwd: Re: [PATCH] USB: add USB test and measurement class driver by Marcel Janssen on Thursday, August 28, 2008 - 2:22 pm. (3 messages)

Next thread: [patch 2.6.27-rc4] rtc-cmos: wakes again from S5 by David Brownell on Thursday, August 28, 2008 - 2:29 pm. (4 messages)