* Alexander van Heukelum <heukelum@mailshack.com> wrote:
ok, that's rather convincing.
the generic version in lib/find_next_bit.c is open-coded C which gcc can
optimize pretty nicely.
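for illustration, here is a minimal sketch of what such an open-coded C
search can look like. it is not the actual code in lib/find_next_bit.c:
the names sketch_find_next_bit() and lowest_set_bit() are hypothetical,
and the bit-by-bit loop in the helper is deliberately naive.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Index of the lowest set bit in a nonzero word, found with plain C. */
static unsigned long lowest_set_bit(unsigned long word)
{
	unsigned long bit = 0;

	while (!(word & 1UL)) {
		word >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Hypothetical sketch, not the kernel implementation: return the index
 * of the first set bit at or after 'offset', or 'size' if there is none.
 */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long offset)
{
	unsigned long idx, word;

	if (offset >= size)
		return size;

	idx = offset / BITS_PER_LONG;
	/* Mask off the bits below 'offset' in the first word. */
	word = addr[idx] & (~0UL << (offset % BITS_PER_LONG));

	/* Skip over words that have no bits set. */
	while (!word) {
		idx++;
		if (idx * BITS_PER_LONG >= size)
			return size;
		word = addr[idx];
	}

	idx = idx * BITS_PER_LONG + lowest_set_bit(word);
	/* A set bit beyond 'size' in the last word does not count. */
	return idx < size ? idx : size;
}
```

everything here is ordinary shifts, masks and compares, so the compiler
is free to inline and schedule it as it sees fit.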
the hand-coded assembly versions in arch/x86/lib/bitops_32.c mostly use
the special x86 'bit scan forward' (BSF) instruction - which, as i know
from the days when the scheduler relied on it, has some non-trivial
setup costs. So especially when _small_ bitmasks are involved, it's more
expensive.
i'm not surprised that the hand-coded assembly versions had a bug ...
[ this means we have to test it quite carefully though, as lots of code
only ever gets tested on x86, so some of it could have built up a
dependency on the buggy behavior. ]
i'd not worry about that too much. Have you tried to build with:
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_OPTIMIZE_INLINING=y
(the latter only available in x86.git)
i've picked it up into x86.git, let's see how it goes in practice.
Ingo