Finding what is stuck...

From: J.A.
Date: Monday, September 1, 2008 - 5:04 pm

Hi all...

I'm running 2.6.27-rc5-git2 on an Aspire One.
The system is in general pretty responsive, but sometimes it just gets
totally stuck. Even the mouse stops. 

It looks related to disk (SSD) access, but I'm not totally sure.
Is there any way to find what is getting stuck? I know that SSDs can
be slow on writes, and I don't mind whether the system is faster or
slower (it's small :))), as long as the speed is constant. Those
occasional pauses are strange, as if SSD flushing gets stuck on the BKL
(I know, I have no idea what I'm talking about...).

I'm using ext3 fs, noop iosched. But as I say, I'm not sure that the
disk writes are the culprit.

Any idea how to track this down?

TIA

-- 
J.A. Magallon                       \                     Software is like sex:
                                     \               It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux kernel 2.6.23-jam01,  gcc version 4.2.2 20070909
--

From: Arjan van de Ven
Date: Monday, September 1, 2008 - 5:08 pm

On Tue, 2 Sep 2008 02:04:47 +0200

Have you tried to run "latencytop"?
(you need to enable this in the kernel config as well)

it tends to (for me at least) point out very well where stalls happen,
or at least, what the system is doing when they happen.

(hint: make sure you do "make install" before running it)

-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: J.A.
Date: Monday, September 1, 2008 - 5:12 pm

Thanks, I'm more or less limited to distro kernels because I don't want
to build my own kernel on the One, but I just checked and the Mandriva
kernel is built with LATENCYTOP=y. I will try that and report back...

-- 
J.A. Magallon                       \                     Software is like sex:
                                     \               It's better when it's free
Mandriva Linux release 2008.1 (Cooker) for i586
Linux kernel 2.6.23-jam01,  gcc version 4.2.2 20070909
--

From: J.A.
Date: Wednesday, September 3, 2008 - 4:00 pm

These are some shots of latencytop while working. I copied the screen
whenever I saw a very high timing...

Cause                                                 Maximum     Percentage
Walking directory tree                             790.2 msec         10.6 %
Scheduler: waiting for cpu                          40.5 msec         38.5 %

fsync() on a file                                 3191.6 msec         36.7 %
Reading EXT3 directory                            1553.3 msec          3.9 %
EXT3: Waiting for journal access                  1518.7 msec          3.8 %
Deleting an inode                                  657.5 msec          1.7 %
EXT3: Looking for file                             576.3 msec          1.4 %
Writing a page to disk                             363.0 msec          6.1 %
call_usermodehelper_exec request_module __sock_cre  69.4 msec          0.2 %
Scheduler: waiting for cpu                          28.0 msec         18.9 %
Page fault                                           7.9 msec          0.3 %

fsync() on a file                                 8843.7 msec         81.8 %
Writing a page to disk                            1548.6 msec          2.9 %
EXT3: Waiting for journal access                   356.5 msec          1.1 %
Scheduler: waiting for cpu                          40.6 msec          6.0 %

EXT3: Waiting for journal access                  6153.6 msec         17.2 %
fsync() on a file                                 4902.7 msec         53.5 %
Writing a page to disk                            3487.2 msec          9.8 %
Writing buffer to disk (synchronous)              2288.7 msec          4.6 %
synchronous write                                 1320.6 msec          2.6 %
Truncating file                                    302.6 msec          0.6 %
Page fault                                         290.5 msec          0.6 %
Scheduler: waiting for cpu                          13.4 msec          4.5 %
Waiting for event (select)                           5.0 msec          ...

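With several screen copies like these it can help to collapse them into one ranking. The following is only a rough sketch, assuming rows of the form "cause, maximum in msec, percentage" as pasted above; the names `parse_rows` and `worst_by_cause` are made up for illustration and are not part of latencytop. Rows that are cut off (no percentage) are skipped rather than guessed.

```python
import re

# One screen row: "<cause>   <max> msec   <percentage> %"
ROW = re.compile(
    r"^(?P<cause>.+?)\s+(?P<max>\d+(?:\.\d+)?) msec\s+(?P<pct>\d+(?:\.\d+)?) %\s*$"
)


def parse_rows(text):
    """Return (cause, max_msec, percent) for every complete row in text.

    Header lines, blank lines and truncated rows do not match and are skipped.
    """
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append((m.group("cause"), float(m.group("max")), float(m.group("pct"))))
    return rows


def worst_by_cause(rows):
    """Keep the largest maximum latency seen per cause, sorted worst first."""
    worst = {}
    for cause, max_msec, _pct in rows:
        worst[cause] = max(max_msec, worst.get(cause, 0.0))
    return sorted(worst.items(), key=lambda kv: kv[1], reverse=True)


# A few rows copied from the screens above.
SAMPLE = """\
fsync() on a file                                 8843.7 msec         81.8 %
Writing a page to disk                            1548.6 msec          2.9 %
EXT3: Waiting for journal access                  6153.6 msec         17.2 %
fsync() on a file                                 4902.7 msec         53.5 %
"""

if __name__ == "__main__":
    for cause, max_msec in worst_by_cause(parse_rows(SAMPLE)):
        print(f"{max_msec:9.1f} msec  {cause}")
```

Ranking by the maximum rather than the percentage is a deliberate choice here: the complaint is about occasional long pauses, not about average time spent.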
From: Arjan van de Ven
Date: Wednesday, September 3, 2008 - 4:50 pm

On Thu, 4 Sep 2008 01:00:43 +0200

wow bad ones..

one thing to note.. you have *something* doing fsync() a lot it seems
(latencytop is likely to tell you which one it is); fsync on ext3 is
really expensive, especially on an ssd that is slow to write to.
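
To get a feel for how expensive a single fsync() is on a given filesystem, something like the following could be run in a directory on the SSD. It is only a rough sketch: the helper name `time_fsyncs` and the default sizes are assumptions made for illustration, and it measures fsync() cost in isolation; it does not find out which program is calling fsync().

```python
import os
import tempfile
import time


def time_fsyncs(directory, count=20, size=4096):
    """Write `size` bytes and fsync() the file, `count` times.

    Returns the latency of each fsync() call in milliseconds.
    The temporary file is created in `directory` and removed afterwards.
    """
    latencies = []
    payload = b"x" * size
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        for _ in range(count):
            f.write(payload)
            f.flush()  # hand Python's buffered data to the kernel first
            start = time.perf_counter()
            os.fsync(f.fileno())  # blocks until the kernel reports the data written
            latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


if __name__ == "__main__":
    results = time_fsyncs(".")
    print(f"max {max(results):.1f} msec, mean {sum(results) / len(results):.1f} msec")
```

Running it once while the system is idle and once while something else is writing heavily should show whether the fsync() cost itself varies as wildly as the numbers above.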


-- 
If you want to reach me at my work email, use arjan@linux.intel.com
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
--

From: Jan Knutar
Date: Monday, September 1, 2008 - 6:53 pm

Are you hitting swap at all? Does it have swap?

On my N810 tablet there can be a huge slowdown if it starts swapping. 
The first time it dips into swap is relatively painless when stuff gets 
written out sequentially, but after that when writes are mostly random 
the throughput drops to a few kilobytes per second, causing massive 
slowdown... The MMC/SD cards hate random write, and I expect SSDs are 
no better.

iostat -x -k 10
in a terminal is useful. Check the %iowait and %util numbers during and
after a slowdown...
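
If iostat is not installed, the iowait share can be approximated from /proc/stat directly. This is a minimal sketch, assuming the usual field order of the aggregate "cpu" line (user, nice, system, idle, iowait, irq, softirq, steal); the function names are made up for illustration, and it only gives the iowait figure, not the per-device utilisation.

```python
import time


def parse_cpu_line(stat_text):
    """Return the counters of the aggregate "cpu" line of /proc/stat as ints."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            return [int(v) for v in fields[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before, after):
    """Share of elapsed CPU time spent in iowait between two samples.

    Only the first eight counters are summed; later fields (guest time)
    are already included in the user counter.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    # Assumed field order: user nice system idle iowait ... so iowait is index 4.
    return 100.0 * deltas[4] / total if total else 0.0


def sample(interval=10.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return the iowait share."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return iowait_percent(before, after)


if __name__ == "__main__":
    while True:
        print(f"iowait {sample():.1f} %")
```

The 10 second default mirrors the interval in the iostat command above; a shorter interval makes brief stalls easier to spot.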
--
