Re: Floating point exception by perf record

Previous thread: Re: linux-next: build warning after merge of the tree by Stephen Rothwell on Sunday, April 4, 2010 - 12:26 am. (1 message)

Next thread: [PATCH] Swap including order of util.h and string.h of util/string.c by Hitoshi Mitake on Sunday, April 4, 2010 - 1:13 am. (3 messages)
From: Hitoshi Mitake
Date: Sunday, April 4, 2010 - 1:10 am

Hi Arnaldo,

I suffered a mysterious floating point exception when executing perf
on current tip/master.

For example,

% sudo ./perf lock record ./perf bench sched messaging
# Running sched/messaging benchmark...
# 20 sender and receiver processes per group
# 10 groups == 400 processes run

      Total time: 1.375 [sec]
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 160.002 MB perf.data (~6990599 samples) ]
[1]    26348 floating point exception  sudo ./perf lock record ./perf 
bench sched messaging

dmesg says that

[10212.752490] perf[26348] trap divide error ip:7fade8564310 
sp:7fff46d926e8 error:0 in libnewt.so.0.52.2[7fade8556000+15000]

Example is of perf lock, but this is not a special phenomenon of perf lock,
other users of perf record, like perf sched, caused the same problem.

I found that this might be caused by the line 283 of perf.c

	if (fclose(stdout)) # <- closing stdout!
		die("close failed on standard output: %s", strerror(errno));

I don't know well about libnewt, do you know something like cause?

The version of libnewt I'm using is 0.52.2 (default of Debian lenny)

Thanks,
	Hitoshi
--

From: Arnaldo Carvalho de Melo
Date: Monday, April 5, 2010 - 11:31 am

I just built a kernel where 'perf lock' can work, but I'm seeing a
different problem:

[root@doppio ~]# perf lock trace | head -3
  Warning: Error: expected 'data' but read 'overwrite'
         swapper-0     [000]   619.086424: lock_acquire: 0xffff8800029d4b28 &q->lock
         swapper-0     [000]   619.086429: lock_acquire: 0xffff8800029d08f8 &ctx->lock
         swapper-0     [000]   619.086439: lock_acquire: 0xffffffff81a38e58 clockevents_lock
[root@doppio ~]# perf lock report | head -3
  Warning: Error: expected 'data' but read 'overwrite'
                ID                 Name   acquired  contended total wait (ns)   max wait (ns)   min wait (ns) 

0xffff8800029d4b28             &q->lock          0          0               0               0               0 
[root@doppio ~]# 

My system details:

[root@doppio ~]# uname -a
Linux doppio.ghostprotocols.net 2.6.34-rc3-tip+ #1 SMP Mon Apr 5 13:49:42 BRT 2010 x86_64 x86_64 x86_64 GNU/Linux
[root@doppio ~]# rpm -q newt
newt-0.52.10-4.fc11.x86_64
[root@doppio ~]# 

No segfaults here :-\

- Arnaldo
--

From: Hitoshi Mitake
Date: Wednesday, April 7, 2010 - 1:32 am

On 04/06/10 03:31, Arnaldo Carvalho de Melo wrote:
 > Em Sun, Apr 04, 2010 at 05:10:57PM +0900, Hitoshi Mitake escreveu:
 >> I suffered a mysterious floating point exception when executing perf
 >> on current tip/master.
 >>
 >> For example,
 >>
 >> % sudo ./perf lock record ./perf bench sched messaging
 >> # Running sched/messaging benchmark...
 >> # 20 sender and receiver processes per group
 >> # 10 groups == 400 processes run
 >>
 >>       Total time: 1.375 [sec]
 >> [ perf record: Woken up 0 times to write data ]
 >> [ perf record: Captured and wrote 160.002 MB perf.data (~6990599 
samples) ]
 >> [1]    26348 floating point exception  sudo ./perf lock record ./perf
 >> bench sched messaging
 >>
 >> dmesg says that
 >>
 >> [10212.752490] perf[26348] trap divide error ip:7fade8564310
 >> sp:7fff46d926e8 error:0 in libnewt.so.0.52.2[7fade8556000+15000]
 >>
 >> Example is of perf lock, but this is not a special phenomenon of 
perf lock,
 >> other users of perf record, like perf sched, caused the same problem.
 >>
 >> I found that this might be caused by the line 283 of perf.c
 >>
 >> 	if (fclose(stdout)) #<- closing stdout!
 >> 		die("close failed on standard output: %s", strerror(errno));
 >>
 >> I don't know well about libnewt, do you know something like cause?
 >>
 >> The version of libnewt I'm using is 0.52.2 (default of Debian lenny)
 >
 > I just built a kernel where 'perf lock' can work, but I'm seeing a
 > different problem:
 >
 > [root@doppio ~]# perf lock trace | head -3
 >    Warning: Error: expected 'data' but read 'overwrite'
 >           swapper-0     [000]   619.086424: lock_acquire: 
0xffff8800029d4b28&q->lock
 >           swapper-0     [000]   619.086429: lock_acquire: 
0xffff8800029d08f8&ctx->lock
 >           swapper-0     [000]   619.086439: lock_acquire: 
0xffffffff81a38e58 clockevents_lock
 > [root@doppio ~]# perf lock report | head -3
 >    Warning: Error: expected 'data' but read 'overwrite'
 >                  ID                 Name   ...
From: Arnaldo Carvalho de Melo
Date: Wednesday, April 7, 2010 - 3:38 am

Take a look at:

http://git.kernel.org/?p=linux/kernel/git/acme/linux-2.6.git;a=summary

"perf" branch, there is at least one fix there, the "overwrite" one.

- Arnaldo
--

From: Hitoshi Mitake
Date: Wednesday, April 7, 2010 - 11:46 pm

On 04/07/10 19:38, Arnaldo Carvalho de Melo wrote:
 > Em Wed, Apr 07, 2010 at 05:32:24PM +0900, Hitoshi Mitake escreveu:
 >> On 04/06/10 03:31, Arnaldo Carvalho de Melo wrote:
 >>> Em Sun, Apr 04, 2010 at 05:10:57PM +0900, Hitoshi Mitake escreveu:
 >>> [root@doppio ~]# perf lock report | head -3
 >>>     Warning: Error: expected 'data' but read 'overwrite'
 >>>                   ID                 Name   acquired  contended
 >> total wait (ns)   max wait (ns)   min wait (ns)
 >>>
 >>> 0xffff8800029d4b28&q->lock          0          0               0
 >> 0               0
 >>> [root@doppio ~]#
 >>>
 >>> My system details:
 >>>
 >>> [root@doppio ~]# uname -a
 >>> Linux doppio.ghostprotocols.net 2.6.34-rc3-tip+ #1 SMP Mon Apr 5
 >> 13:49:42 BRT 2010 x86_64 x86_64 x86_64 GNU/Linux
 >>> [root@doppio ~]# rpm -q newt
 >>> newt-0.52.10-4.fc11.x86_64
 >>> [root@doppio ~]#
 >>>
 >>> No segfaults here :-\
 >>>
 >>> - Arnaldo
 >>>
 >>
 >> I tested perf lock trace and got a similar error (but different).
 >> % sudo ./perf lock trace
 >>    Fatal: reading input file (size expected=3 received=0)
 >> It seems that perf.data is broken :( I'll search the cause.
 >
 > Take a look at:
 >
 > http://git.kernel.org/?p=linux/kernel/git/acme/linux-2.6.git;a=summary
 >
 > "perf" branch, there is at least one fix there, the "overwrite" one.
 >

Thanks for your notify!
I applied changes on your branch, but my problem was not solved...

I found that the member "offset" of struct perf_file_header is broken.
I made this easy change,

diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 6c9aa16..dbea087 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -571,6 +571,7 @@ int perf_header__process_sections(struct perf_header 
*self, int fd,
         if (perf_header__getbuffer64(self, fd, feat_sec, sec_size))
                 goto out_free;

+       printf("feat_sec->offset:%llu\n", feat_sec->offset);
         err = 0;
         while (idx < nr_sections ...
From: Hitoshi Mitake
Date: Thursday, April 8, 2010 - 12:29 am

On 04/08/10 15:46, Hitoshi Mitake wrote:
 > On 04/07/10 19:38, Arnaldo Carvalho de Melo wrote:
 >  > Em Wed, Apr 07, 2010 at 05:32:24PM +0900, Hitoshi Mitake escreveu:
 >  >> On 04/06/10 03:31, Arnaldo Carvalho de Melo wrote:
 >  >>> Em Sun, Apr 04, 2010 at 05:10:57PM +0900, Hitoshi Mitake escreveu:
 >  >>> [root@doppio ~]# perf lock report | head -3
 >  >>> Warning: Error: expected 'data' but read 'overwrite'
 >  >>> ID Name acquired contended
 >  >> total wait (ns) max wait (ns) min wait (ns)
 >  >>>
 >  >>> 0xffff8800029d4b28&q->lock 0 0 0
 >  >> 0 0
 >  >>> [root@doppio ~]#
 >  >>>
 >  >>> My system details:
 >  >>>
 >  >>> [root@doppio ~]# uname -a
 >  >>> Linux doppio.ghostprotocols.net 2.6.34-rc3-tip+ #1 SMP Mon Apr 5
 >  >> 13:49:42 BRT 2010 x86_64 x86_64 x86_64 GNU/Linux
 >  >>> [root@doppio ~]# rpm -q newt
 >  >>> newt-0.52.10-4.fc11.x86_64
 >  >>> [root@doppio ~]#
 >  >>>
 >  >>> No segfaults here :-\
 >  >>>
 >  >>> - Arnaldo
 >  >>>
 >  >>
 >  >> I tested perf lock trace and got a similar error (but different).
 >  >> % sudo ./perf lock trace
 >  >> Fatal: reading input file (size expected=3 received=0)
 >  >> It seems that perf.data is broken :( I'll search the cause.
 >  >
 >  > Take a look at:
 >  >
 >  > http://git.kernel.org/?p=linux/kernel/git/acme/linux-2.6.git;a=summary
 >  >
 >  > "perf" branch, there is at least one fix there, the "overwrite" one.
 >  >
 >
 > Thanks for your notify!
 > I applied changes on your branch, but my problem was not solved...
 >
 > I found that the member "offset" of struct perf_file_header is broken.
 > I made this easy change,
 >
 > diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
 > index 6c9aa16..dbea087 100644
 > --- a/tools/perf/util/header.c
 > +++ b/tools/perf/util/header.c
 > @@ -571,6 +571,7 @@ int perf_header__process_sections(struct perf_header
 > *self, int fd,
 > if (perf_header__getbuffer64(self, fd, feat_sec, sec_size))
 > goto out_free;
 >
 > + ...
Previous thread: Re: linux-next: build warning after merge of the tree by Stephen Rothwell on Sunday, April 4, 2010 - 12:26 am. (1 message)

Next thread: [PATCH] Swap including order of util.h and string.h of util/string.c by Hitoshi Mitake on Sunday, April 4, 2010 - 1:13 am. (3 messages)