Heh... The first catches are lovely:
struct fxsrAlignAssert {
int _:!(offsetof(struct task_struct,
thread.i387.fxsave) & 15);
};
as an idiotic way to do BUILD_BUG() and
#define _IOC_TYPECHECK(t) \
((sizeof(t) == sizeof(t[1]) && \
sizeof(t) < (1 << _IOC_SIZEBITS)) ? \
sizeof(t) : __invalid_size_argument_for_IOC)
poisoning _IOW() et.al., so those who do something like
static const char *v4l1_ioctls[] = {
[_IOC_NR(VIDIOCGCAP)] = "VIDIOCGCAP",
run into trouble. The former is "tell jbeulich to cut down on crack",
but the latter... Probably ought to be
#define _IOC_TYPECHECK(t) \
(sizeof(t) + BUILD_BUG_ON_ZERO(sizeof(t) == sizeof(t[1]) && \
sizeof(t) < (1 << _IOC_SIZEBITS)))
Objections? The only reason that doesn't break gcc to hell and back is
that gcc has unfixed bugs in that area. It certainly is not a valid C
or even a remotely sane one.
-
Yes, looks good. I originally came up with _IOC_TYPECHECK before we had
the generic BUILD_BUG_ON().
One minor issue though:
While BUILD_BUG_ON and a few other macros in linux/kernel.h are currently
exported to user space, I would think that they should really be hidden
in #ifdef __KERNEL__, which means that we also need something like
#ifdef __KERNEL__
#define _IOC_TYPECHECK(t) \
(sizeof(t) + BUILD_BUG_ON_ZERO(sizeof(t) == sizeof(t[1]) && \
sizeof(t) < (1 << _IOC_SIZEBITS)))
#else
#define _IOC_TYPECHECK(t) sizeof(t)
#endif
Arnd <><
-
Point, but then we might want to export the damn thing in properly named version (i.e. put it into __.... reserved namespace). It doesn't depend on anything kernel-specific... FWIW, the current version suffers from similar problem - it's outside of __KERNEL__ in userland-exported header and it can lead to breakage when included in build by something other than gcc... -
On the other hand, this one really does seem to be "nice". I don't think it's a misfeature to be able to do "obvious compile-time constant optimizations" on initializer indexes. The bitfield size thing in some ways does do the same thing - it's clearly _odd_, but if I had my I agree that it's obviously not "valid C", but I don't agree that it's not remotely sane. Why not allow that extension? Linus -
Why? I'd say it's not better than BUILD_BUG_ON_ZERO() use instead of that ?:. This code would remain unchanged; the entire reason why we run into problems is that a way to force an error at build time had produced something that was not a constant expression. It's not a matter of having to change something in the users of that sucker (or _IOC_NR(), or _IOR(), or _IO()). The code in question is there for one thing - to give (constant) sizeof(t) or an error if t doesn't look good for Er... That's nice, assuming you don't suddenly run into "code with convoluted bunch of macros breaks on a different version of compiler and/or different CFLAGS". IOW, having the optimizer strength define the boundary between the programs that compile and ones that give an error is not a good idea, IMO. Where do you place that boundary? Is 0 && n good? How about 0 & n? Or 0 * n? Or n - n? Or (n+1)-(n+1) from macro expansion? Note that gcc does _not_ take the last one as integer constant expression, but happens to deal with the earlier ones. OTOH, it does deal with n * m - n * m. And I would not bet a dime on gcc versions being consistent with each other in that area. Now, there is one possible answer that makes some sense - allow removal of unevaluated parts in &&, || and ?:. I can do that, but I'm not sure if it's actually worth doing. The only case in the kernel tree become more readable with use of BUILD_BUG_ON_ZERO() anyway and I would be very surprised to find any real code where something of that kind Because it's not well-defined and because having "let me check if this version of compiler will eat that" as the only way to tell if construct would be OK is not a good thing... -
Oh, _that_ part I have no problem with. It's more that it seems that the gcc optimization is ok at least as an extension. Linus -
Sure, but it's not an extension (yet), but an implementation side-effect; it would have to be (semi-formally) defined in the manual to be an extension. Until that happens, anyone using this "feature" risks haven his code broken at any time (or, rather, his code already was broken but he didn't know it). See gcc.gnu.org/PR456 for more discussion. Yes it's an old bug... Segher -
Humm... Right, so __builtin_offsetof() needs special treatment too. Oh, bugger. Is offsetof(struct foo, a.x[n]) a documented extension? I _know_ that it's not promised by 7.17, but gcc eats it (and obviously that sucker requires extra treatment in that case). Parsing __builtin_offsetof() arguments is going to be fun ;-/ Right now sparse has it as a predefined macro, but if we want to do that kind of analysis, we need to really parse it. OTOH, that's not such a big deal... Parser would need to accept ident ( \[ expr \] | . ident )* there, typecheck would walk down, check types and find offsets of struct members on the way down and build expression on the way back (e.g. in form of constant + sum of stuff from [...]), doing evaluate_expression() on each index and slapping Int_const_expr on created nodes accordingly. In the end, slap the Int_const_expr on resulting expression in obvious way. expand would work as usual, no interesting nodes surviving by that point... OK, that's doable and probably should be split into implementation of __builtin_offsetof() (sans Int_const_expr logics) and Int_const_expr parts merged into the patch that does all handling of integer constant expressions. Joy. OK, folks, disregard 16/16 in the current form; everything prior to it stands on its own. -
I asked on comp.std.c about this; the feeling was that only identifiers are intended to be permitted as the 2nd argument to offsetof. If true the standard has a very obscure way of stating something that could don't forget -> if you're going to accept extra stuff. GCC forgot -> with the parser rewrite, yes I filed a PR. Neil. -
In offsetof() second argument??? Ah, hell... You mean the situations
like offsetof(struct foo, x->a) in
struct foo {
struct {
int a;
} x[10];
};
OK...
-
-> is not allowed within the second arg to __builtin_offsetof(). Or do you mean something else? What's the PR #, and did it ever get fixed? Segher -
It is. See info gcc -> C Extensions -> Offsetof Segher -
Acknowledged. The rest of the patches look good to me, so I'll merge 1-1= 5 soon, and ignore 16. Do you have those patches in public git somewhere, or would you like me t= o just apply the patches from mail? - Josh Triplett
git://git.kernel.org/pub/scm/linux/kernel/git/viro/sparse.git/ misc (or directly from hera) -
No changes to validation output at any point in the series. Merged. Tha= nks! - Josh Triplett
OK, here's the replacement. First the handling of __builtin_offsetof()
(below in this mail), then integer constant expressions (in followup).
They can be pulled from
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Tue, 26 Jun 2007 16:06:32 -0400
Subject: [PATCH] implement __builtin_offsetof()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
evaluate.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
expand.c | 1 +
expression.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++
expression.h | 10 +++++++
ident-list.h | 1 +
inline.c | 18 +++++++++++-
lib.c | 1 -
show-parse.c | 1 +
8 files changed, 181 insertions(+), 2 deletions(-)
diff --git a/evaluate.c b/evaluate.c
index 07ebf0c..c702ac4 100644
--- a/evaluate.c
+++ b/evaluate.c
@@ -2608,6 +2608,87 @@ static struct symbol *evaluate_call(struct expression *expr)
return expr->ctype;
}
+static struct symbol *evaluate_offsetof(struct expression *expr)
+{
+ struct expression *e = expr->down;
+ struct symbol *ctype = expr->in;
+ int class;
+
+ if (expr->op == '.') {
+ struct symbol *field;
+ int offset = 0;
+ if (!ctype) {
+ expression_error(expr, "expected structure or union");
+ return NULL;
+ }
+ examine_symbol_type(ctype);
+ class = classify_type(ctype, &ctype);
+ if (class != TYPE_COMPOUND) {
+ expression_error(expr, "expected structure or union");
+ return NULL;
+ }
+
+ field = find_identifier(expr->ident, ctype->symbol_list, &offset);
+ if (!field) {
+ expression_error(expr, "unknown member");
+ return NULL;
+ }
+ ctype = field;
+ expr->type = EXPR_VALUE;
+ expr->value = offset;
+ expr->ctype = size_t_ctype;
+ } else {
+ if (!ctype) {
+ expression_error(expr, "expected structure or union");
+ return NULL;
+ }
+ examine_symbol_type(ctype);
+ class = classify_type(ctype, &ctype);
+ if (class != (TYPE_COMPOUND | TYPE_PTR)) {
+ expression_error(expr, "expected array");
+ return ...Both patches looked good to me in review, and neither changes the validation output. Merged. Thanks! - Josh Triplett -
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Sun, 24 Jun 2007 03:11:14 -0400
Subject: [PATCH] fix handling of integer constant expressions
Hopefully correct handling of integer constant expressions. Please, review.
Rules:
* two new flags for expression: int_const_expr and float_literal.
* parser sets them by the following rules:
* EXPR_FVALUE gets float_literal
* EXPR_VALUE gets int_const_expr
* EXPR_PREOP[(] inherits from argument
* EXPR_SIZEOF, EXPR_PTRSIZEOF, EXPR_ALIGNOF get int_const_expr
* EXPR_BINOP, EXPR_COMPARE, EXPR_LOGICAL, EXPR_CONDITIONAL,
EXPR_PREOP[+,-,!,~]: get marked int_const_expr if all their
arguments are marked that way
* EXPR_CAST gets marked int_const_expr if argument is marked
that way; if argument is marked float_literal but not
int_const_expr, we get both flags set.
* EXPR_TYPE also gets marked int_const_expr (to make it DTRT
on the builtin_same_type_p() et.al.)
* EXPR_OFFSETOF gets marked int_const_expr
When we get an expression from parser, we know that having int_const_expr on
it is almost equivalent to "it's an integer constant expression". Indeed,
the only checks we still have not done are that all casts present in there
are to integer types, that expression is correctly typed and that all indices
in offsetof are integer constant expressions. That belongs to evaluate_expression()
and is easily done there.
* evaluate_expression() removes int_const_expr from some nodes:
* EXPR_BINOP, EXPR_COMPARE, EXPR_LOGICAL, EXPR_CONDITIONAL,
EXPR_PREOP: if the node is marked int_const_expr and some
of its arguments are not marked that way once we have
done evaluate_expression() on them, unmark our node.
* EXPR_IMLICIT_CAST: inherit flags from ...Am I invoking sparse wrongly? ./sparse -W -Wall doesn't diagnose the following TU, for example. extern int a; extern int as1[(a = 2)]; Neil. -
sparse simply doesn't check that. We don't have anything resembling
support of VLA. Note that check for integer constant expression
has nothing to do with that;
int x[(int)(0.6 + 0.6)];
is valid (if stupid). And yes, footnote in 6.6 contradicts 6.7.5.2(1);
too bad...
We certainly need to do checks on array sizes; however, that part
("if it has static storage duration, it should not be a VLA") is minor.
And then there are gccisms:
size_t foo(int n)
{
struct {
int a[n];
char b;
} x;
return offsetof(typeof(x), b);
}
Yes, it's eaten up just fine. And yes, such structures are silently
accepted even with -pedantic -std=c99, which is a bug. Sigh...
We'll need to tackle VLAs at some point, but it certainly won't be fun ;-/
-
If it did support VLAs it would point out that this is a constraint violation. VLAs must have block or function prototype scope. -- Derek M. Jones tel: +44 (0) 1252 520 667 Knowledge Software Ltd mailto:derek@knosof.co.uk Applications Standards Conformance Testing http://www.knosof.co.uk -
I know. It's just that "let's do something about array size checks" triggers "yeah, but the poor sod who does that will have to sort VLAs *and* gcc extensions around VLAs out", which is not a nice thought and so far nobody had touched that area at all. -
Well, the above has two bugs that sparse could notice _independently_ of variable-sized arrays: - assignment outside of a function - variable size array that isn't an automatic variable (strictly speaking, that's not even a variable size - it's a constant 2, just with a non-constant expression - maybe you misread the "=" as an "==") Linus -
Right; what I'm saying is that we don't do any checks on array sizes at
all, mostly since nobody is brave enough to deal with VLAs (which we'll
With == it would be a different bug ;-)
BTW, VLA can be not just auto variable - it can be used in derivation of
such (i.e. you can say int (*p)[n], just not for anything not in block
or prototype scope). And $DEITY help us[1] when ({...}) comes into the
game, since it allows leaking types out of the scope they'd been
declared in...
[1] or gcc - it gets an ICE galore in that class of testcases
-
It isn't valid; it fails the test twice. Both 0.6 are not "immediate operands of casts". Their sum is, but that's irrelevant. Therefore the dimension is not an ICE and a diagnostic is required. Neil. -
Egads... After rereading that... What a mess.
int foo(void)
{
static int a[1][0,2];
}
is, AFAICS, allowed. Reason:
int a[0,2]
is a VLA due to 6.7.5.2[4] (0,2 is not an ICE). However, due to the language
in the same section,
int a[1][0,2]
is *not* a VLA, since (a) 2 is an ICE and (b) its element type "has a known
constant size" (it does - the value of 0,2 is certainly guaranteed to be 2).
I.e., it's VM type, but not a VLA. I.e. only the first part of 6.7.5.2[2]
applies and we are actually fine.
So we can have a static single-element array of int [0,2], but
not a plain static int [0,2]. Lovely, that...
-
DR 312 clarified the meaning of "known constant size" http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_312.htm in the sensible way, thankfully, so your example is actually invalid. Neil. -
<looks> Aha. OK... I'd say that 6.7.5.2[4] should simply have "element type is not a variable length array" instead, regardless of clarification of "known constant size", since element type is not allowed to be incomplete in any case. Would be more readable... But yes, clarification makes sense anyway. -
Here are three independently invalid non-ICEs that sparse doesn't
diagnose.
extern int f(void);
enum { cast_to_ptr = (int) (void *) 0 };
enum { cast_to_float = (int) (double) 1 };
enum { fncall = 0 ? f(): 3 };
Hey, I did warn you it was tricky :)
Neil.
-
Nope. They are recognized as non-ICE (try with struct { int n :<expr>;};
for tester). Again, no check for ICE in enums, and that one is for
a good reason - we might very well want non-integer enums as an extension
at some point. We certainly need to check that they are constant, though.
If you want to test ICE recognition, right now only the following places
are checking for it:
* bitfield width
* __attribute__((aligned(<number>)))
* __attribute__((address_space(<number>)))
* [<index>] in designators within initializer list
* [<index> ... <index>] - ditto, gccism
That's it. We can use it elsewhere too, but that's a separate bunch of
patches (trivial to do now).
-
Bah. You don't escape that easily :)
extern int a;
struct b { unsigned int b1:(1,2); };
struct c { unsigned int c1: 1 ? 2: a++; };
are both invalid yet undiagnosed.
Neil.
-
Son of a... expand_comma() cannibalizes the node, should restore ->flags
Ditto for expand_conditional, but there we should preserve the original
->flags instead - might be non-zero and we ought to do that after
expanding the taken branch...
From: Al Viro <viro@zeniv.linux.org.uk>
Date: Wed, 27 Jun 2007 09:10:54 -0400
Subject: [PATCH] fix the missed cannibalizing simplifications
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
expand.c | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/expand.c b/expand.c
index f945518..9a536d5 100644
--- a/expand.c
+++ b/expand.c
@@ -429,8 +429,10 @@ static int expand_comma(struct expression *expr)
cost = expand_expression(expr->left);
cost += expand_expression(expr->right);
- if (expr->left->type == EXPR_VALUE || expr->left->type == EXPR_FVALUE)
+ if (expr->left->type == EXPR_VALUE || expr->left->type == EXPR_FVALUE) {
*expr = *expr->right;
+ expr->flags = 0;
+ }
return cost;
}
@@ -488,12 +490,15 @@ static int expand_conditional(struct expression *expr)
cond_cost = expand_expression(cond);
if (cond->type == EXPR_VALUE) {
+ unsigned flags = expr->flags;
if (!cond->value)
true = false;
if (!true)
true = cond;
+ cost = expand_expression(*true);
*expr = *true;
- return expand_expression(expr);
+ expr->flags = flags;
+ return cost;
}
cost = expand_expression(true);
--
1.5.0-rc2.GIT
-
Now I think I only see one class of issues; the following is valid
C99 (I believe that's what you intend to follow) but being rejected:
struct a { int comma: 1 ? 2: (2, 3); };
It's invalid C90.
Neil.
-
*unprintable* Yes, I see... OK, null pointer constants handling (next patch in the queue) introduces is_zero_constant() (silent evaluation of integer constant expression, with division by 0/too large shift/- on lowest value of signed integer type leaving the branch as-is, so that later expand would generate a proper error on it; then checking if we'd reduced the sucker to EXPR_VALUE[0]). I'll pull it into a separate patch, along with is_nonzero_constant(), and change rules for potential ICE on parser stage to maybe-ICE && y => maybe-ICE maybe-ICE || y => maybe-ICE maybe-ICE ? x : y => maybe-ICE if at least one of x and y is maybe-ICE maybe-ICE ? : y => maybe-ICE letting evaluate_expression() on such suckers use them if the first argument turns out to be ICE after its evaluate_expression()... It really stinks, especially since we can't say "oh, parent it known to be non-ICE, no need to bother" - subexpression might be shared. -
... or it could be done simpler, if we keep the current logics for Int_const_expr flag at parse time and add a 'const expression' one with rules as above. Anyway, I'm going to get some sleep before dealing with that crap. -
This passes an incorrect type to expand_expression; it wants a struct expression *, but *true has type struct expression; did you want to pass true rather than *true? - Josh Triplett -
Those two *really* shouldn't fail. I don't care if the C standard says so,
that is *fine*.
In particular, "offsetof()" should be portably able to basically be the
standard #define, which involves an integer cast from a constant pointer.
That had *better* be a valid constant integer expression, because it's
very useful.
And I think standards can go screw themselves, and you can make it an
error with some "--standard-pedantic" switch or similar.
Standards are just random pieces of paper, for crying out loud! They have
Again, I think that's a deficiency of a standard that tries to be
acceptable to everybody rather than about a "really good language".
So I personally think we should allow it too if at all possible, and
again, use some "--standard-pedantic" to flag it as an error.
Why? Because things like that may not look sensible when written out, but
they are often _very_ sensible when they are the result of a macro that
does some error checking or other thing.
The classic example of this is "__builtin_constant_p()". It is a *great*
way to make a macro that does different things depending on whether
something is a compile-time constant or not, and no, it's not standard,
but dang, it's so useful that a standard that doesn't allow sane use of it
is basically bogus.
So look at the "ntohl()" kind of thing, and realize that it's just "Good
Practice(tm)" to be able to make a ntohl() macro that can be used for
initializers, including very much enum initializers. Ie
enum { defaultport = htons(9418) };
is actually nice code for something like the kernel, but it turns out that
in order to make this work, you have to do it as
#define htons(x) (__builtin_constant_p(x) ? constant_htons(x) : __htons(x))
and that in turn generates *exactly* the kind of thing you talk of above.
And when you give _your_ example, it looks insane. When I give _my_
example, it generates exactly the same thing, but suddenly it has a great
reason for doing so, and ...Agreed. Issuing a warning on these kinds of constructs does not seem useful, but if people really want it, it should go in some warning option that does not get turned on by default, and probably should not Also agreed. Same goes for other short-circuiting operations like &&, ||, and ?: without the center argument; if you can determine at compilation time that it does not need to evaluate part of the expression at all, go ahead and ignore that part of the expression even if it does not constitute an integer constant expression. If you want to optionally check for this case and issue a diagnostic, put it under -Wstrict-constant-expressions or similar. - Josh Triplett -
That's not quite right. In principle, __builtin_choose_expr() could be That actually means extra work for evaluate_expression(). Unfortunately. The thing is, we want to typecheck all branches, even ones not taken. _However_, we don't want to expand all of them. Having extra places where we have to do expansion means extra work. -
Eh... I'd say that my variant for offsetof() is simply better - it usually directly turns into EXPR_VALUE, right in place, without rather convoluted work. Aside of "should such cast be a constant integer expression"... -
Umm. But sparse is meant to parse C code. Which very much includes *other* projects. The kernel, for example, has its own offsetof. And yes, these days we use "__compiler_offsetof()", but we used to do #define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER) and I seriously doubt that the kernel is the only one doing things like that. Linus -
You can't have it both way, really. If we are talking about annotating a codebase we _can_ annotate, that one is not a problem at all. If we are talking about vanilla C project that never heard about sparse... We can define whatever extensions we like, but such project has to cope with whatever C compilers they had been using. So "sparse believes that this defintion of offsetof can be used as array size" will mean fsck-all outside of #ifdef __CHECKER__ and under such ifdef we can always define it to builtin; if anything, that will be faster and easier on sparse. -
Yes it's useful. That's why GCC gives you __builtin_offsetof() Sure, as long as you don't care about compatibility across compilers, what matters is what the compilers you _do_ use actually implement. And note that GCC doesn't guarantee you much over what the C standard does. Almost everything it allows extra is just an implementation side effect. Segher -
gcc logs: * expressions of form <....> can be reduced to cheaper form by <....>. Tested and merged. gcc logs a year later: * revert commit <...>, it causes subtle problems (see PR<....>, <....> and <....>). Proposed replacement is too intrusive for stable branch. gcc logs a month later: * tested and merged the real fix for PR<....>; will go into the next release. foobar logs a year later: * gcc versions between <...> and <...> refuse to compile baz.c, complain about non-constant index in initializer. Waded through the sewers of macros we have in barf.h and blah.h, found what had been causing that. Fixed. Ain't fun... -
If I understand correctly what bugs you are talking about, most (all?) of those were solved in the dark ages already Why not? Nothing in the C standard says all your externs have to be defined in some other translation unit you link It's unusual at least :-) Segher -
Alas, no. gcc is amazingly (and inconsistently) sloppy about the
It's not about externs. It's about things like
unsigned n;
int a[] = {[n - n + n - n] = 1};
And yes, gcc does eat that. With -pedantic -std=c99, at that.
However,
unsigned n;
int a[] = {[n + n - n - n] = 1};
gets you error: nonconstant array index in initializer
And that's 4.1, not 3.x...
-
Ah yes, now I see what you were talking about. Most of this Why are you using such an ancient compiler? :-) (Not that it is fixed in the current release though). Segher -
