Re: Bug in arch/i386/lib/delay.c file, delay_loop function

Previous thread: kernel: sysctl table check failed: /dev/parport/parport0/devices/ppdev0/timeslice Sysctl already exists by Dave Jones on Friday, May 30, 2008 - 8:05 am. (2 messages)

Next thread: linux-next: strange CPU calibration by Gabriel C on Friday, May 30, 2008 - 8:42 am. (1 message)
From: Jiri Hladky
Date: Friday, May 30, 2008 - 8:15 am

Hi all,

when trying to understand how Bogomips are implemented I have found
bug in arch/i386/lib/delay.c file, delay_loop function

/* simple loop based delay: */
static void delay_loop(unsigned long loops)
{
  int d0;

  __asm__ __volatile__(
    "\tjmp 1f\n"
    ".align 16\n"
    "1:\tjmp 2f\n"
    ".align 16\n"
    "2:\tdecl %0\n\tjns 2b"
    :"=&a" (d0)
    :"0" (loops));
}


The function fails for loops > 2^31+1. It because SF is set when dec
returns numbers > 2^31

The fix is to use jnz instruction instead of jns (and add one decl
instruction to the end to have exactly the same number of loops as in
original version):

__asm__ __volatile__(
	"\tjmp 1f\n"
	".align 16\n"
	"1:\tjmp 2f\n"
	".align 16\n"
	"2:\tdecl %0\n\tjnz 2b\n"
	"decl %0"
	:"=&a" (d0)
	:"0" (loops));

IMHO, d0 is not needed at all so that we can further simplify the code:
static void delay_loop(unsigned long loops)
__asm__ __volatile__(
	"\tjmp 1f\n"
	".align 16\n"
	"1:\tjmp 2f\n"
	".align 16\n"
	"2:\tdecl %0\n\tjnz 2b\n"
	"decl %0"
	:/*we don't need output */
	:"a" (loops));
}

I will attach three small C-program to test it
delay-orig.c - original loop from kernel source code
delay-fixed.c - fixed loop
delay-fixed1.c - fixed loop without d0 variable

Outputs:
============== delay-orig.c ==================
time delay-orig 2147483649
loops 2147483649
loops 2147483649
do -2147483648

real    0m0.002s
user    0m0.000s
sys     0m0.000s

================== delay-fixed.c =============
time delay-fixed 2147483649
loops 2147483649
loops 2147483649
do -1

real    0m1.025s
user    0m1.024s
sys     0m0.000s

========== delay-fixed1.c =====================
time delay-fixed1 2147483649
loops 2147483649
loops 2147483649

real    0m1.073s
user    0m1.060s
sys     0m0.004s


and update kernel source file arch/i386/lib/delay.c. Please let me
know if these modifications make sense.

Thanks a lot
Jiri
From: Ingo Molnar
Date: Saturday, May 31, 2008 - 9:39 am

very interesting! Could you send a kernel patch for it please:

  diff -up arch/x86/lib/delay_32.c.orig arch/x86/lib/delay_32.c

	Ingo
--

From: Martin Mares
Date: Sunday, June 1, 2008 - 10:05 am

It is a long time since I have hacked that file, but you should definitely
make sure that the function is never called with a zero argument. In such
case, the original version made just a single pass, but your version
makes 2^32 of them.

				Have a nice fortnight
-- 
Martin `MJ' Mares                          <mj@ucw.cz>   http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Homo homini lupus, frater fratri lupior, bohemus bohemo lupissimus.
--

From: Jiri Hladky
Date: Monday, June 2, 2008 - 3:00 am

Hi Ingo, hi Martin!

As Martin correctly pointed out, my code will fail when loops=0. So I
have added a fix for this.

The patch file is attached:

diff -up arch/x86/lib/delay_32.c-orig arch/x86/lib/delay_32.c > delay_32.c_patch

I'm attaching also both kernel source codes (delay_32.c-orig and
delay_32.c) for reference.

Thanks
Jiri

From: Ingo Molnar
Date: Tuesday, June 17, 2008 - 2:03 am

applied to tip/x86/delay, thanks Jiri! If it passes testing it will show 
up in the v2.6.27 kernel.

[ A few small patch format nits: the first hunk of the patch didnt 
  apply, it was whitespace-damaged - i hand-merged that. The assembly 
  had non-standard formatting - nowadays we try to format the assembly
  so that it looks nice in the .c file, not i the .s file. I fixed that 
  too. Also, the best way is to send fixes in the format below, with a 
  nice commit description, followed by a signed-off-by line. I fixed 
  that as well. ]

please double-check the end result. You can pick up tip/master via:

  http://people.redhat.com/mingo/tip.git/README

do "git-log -1 e01b70ef" or "git-log arch/x86/lib/delay_32.c" to see 
your commit.

	Ingo

------------->
commit e01b70ef3eb3080fecc35e15f68cd274c0a48163
Author: Jiri Hladky <hladky.jiri@googlemail.com>
Date:   Mon Jun 2 12:00:19 2008 +0200

    x86: fix bug in arch/i386/lib/delay.c file, delay_loop function
    
    when trying to understand how Bogomips are implemented I have found a
    bug in arch/i386/lib/delay.c file, delay_loop function.
    
    The function fails for loops > 2^31+1. It because SF is set when dec
    returns numbers > 2^31.
    
    The fix is to use jnz instruction instead of jns (and add one decl
    instruction to the end to have exactly the same number of loops as in
    original version).
    
    Martin Mares observed:
    
    > It is a long time since I have hacked that file, but you should definitely
    > make sure that the function is never called with a zero argument. In such
    > case, the original version made just a single pass, but your version
    > makes 2^32 of them.
    
    fixed that.
    
    Signed-off-by: Ingo Molnar <mingo@elte.hu>

diff --git a/arch/x86/lib/delay_32.c b/arch/x86/lib/delay_32.c
index d710f2d..ef69131 100644
--- a/arch/x86/lib/delay_32.c
+++ b/arch/x86/lib/delay_32.c
@@ -3,6 +3,7 @@
  *
  *	Copyright (C) 1993 Linus Torvalds
  ...
Previous thread: kernel: sysctl table check failed: /dev/parport/parport0/devices/ppdev0/timeslice Sysctl already exists by Dave Jones on Friday, May 30, 2008 - 8:05 am. (2 messages)

Next thread: linux-next: strange CPU calibration by Gabriel C on Friday, May 30, 2008 - 8:42 am. (1 message)