On Fri 14.Mar'08 at 12:02:14 +0100, Helge Hafting wrote:As I noted in another thread (unfortunately ignored so far), sometimes xjed opens the file and my desktop is ok, as responsive as I expect it to be. So this isn't a xjed bug, even tough it could be better and load parts of the file on demand (like vi appears to do). So it is a kernel problem! I did this test a few times today, using today's git kernel and it takes 3 seconds for 'ls' to appear in the screen after typing, and 10 seconds (I checked with the clock) to be executed. Today I think I've found a reliable way to reproduce this bad behaviour (I've got it 5 times from 5 attempts so far). It is independent of elevator={anticipatory, cfq, deadline}, I could get the bad results with all of them. I just boot the machine, mount the swap partition manually (it looks like a Mandriva Cooker bug) and start 'xjed bigfile.txt &' in the terminal (it also happens outside of X). It takes more than 15 minutes to load the 380MB file when I hit the problem. It takes like 10 minutes to finish Ingo's cfs-debug script (I can see it stop _after_ the file is loaded). I could verify that swap usage goes to 400 MB out of 2 GB, and when I start the test this is what I have: total used free shared buffers cached Mem: 515496 148628 366868 0 9916 80612 -/+ buffers/cache: 58100 457396 Swap: 2144636 0 2144636 In one of the tests I started 'vmstat 1 >> vmstat_log3.txt' a few seconds before starting xjed. And this is what I got (notice that it has 400 lines, showing that it took 400 seconds to load the file, but in fact it was even more. Time seems to be distorted during this test) www.ift.unesp.br/users/crmafra/vmstat_log3.txt I could also get some info from latencytop during the test, but it was very difficult because the terminal screen took ages to refresh. This is worst log I could get manually: Cause Maximum Percentage get_request_wait __make_request generic_make_reque2201.3 msec 8.4 % get_request_wait __make_request generic_make_reque1877.3 msec 20.5 % sync_page __lock_page handle_mm_fault do_page_faul945.5 msec 0.4 % sync_page __lock_page find_lock_page filemap_fault860.3 msec 17.2 % sync_page __lock_page handle_mm_fault do_page_faul828.3 msec 0.9 % sync_page __lock_page handle_mm_fault do_page_faul817.9 msec 21.7 % sync_buffer __wait_on_buffer __bread ext3_get_bran752.6 msec 2.8 % get_request_wait __make_request generic_make_reque726.6 msec 0.4 % sync_page __lock_page find_lock_page filemap_fault617.0 msec 0.3 % inary search_binary_handler do_execve Process dhclient-script (3934) get_request_wait __make_request generic_make_reque1562.5 msec 36.8 %ge shrink_page_list shrink_inactive_list shri sync_page __lock_page find_lock_page filemap_fault473.4 msec 28.2 %lt do_page_fault error_code sync_page __lock_page handle_mm_fault do_page_faul405.9 msec 13.3 % Scheduler: waiting for cpu 120.1 msec 19.0 % congestion_wait try_to_free_pages __alloc_pages __ 99.7 msec 2.5 %o_page_cache_readahead filemap_fault __do_faul congestion_wait __alloc_pages __do_page_cache_read 15.9 msec 0.2 %head filemap_fault __do_fault handle_mm_fault do_page_fault error_code So I think I am definitely hitting some kernel bug. If someone in this list becomes interested in this problem, I will be happy to help with any testing necessary. I can live with this problem (I don't even need to read that file anymore). But the bug exists. Another thing I want to mention is about the CFS debug script results. For example: [mafra@localhost:~]$ grep se.wait_max cfs-debug-info-2008.03.15-17.22.51 se.wait_max : 320.220900 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 308.730029 se.wait_max : 0.000000 se.wait_max : 216.652542 se.wait_max : 412.076086 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 100.015586 se.wait_max : 0.000000 se.wait_max : 149.990226 se.wait_max : 15.718068 se.wait_max : 49.988316 se.wait_max : 0.000000 se.wait_max : 29.287277 se.wait_max : 62.785571 se.wait_max : 602.883709 se.wait_max : 10.646829 se.wait_max : 0.008188 se.wait_max : 19.871847 se.wait_max : 19.740026 se.wait_max : 0.027021 se.wait_max : 19.817248 se.wait_max : 0.026897 se.wait_max : 0.640153 se.wait_max : 0.808060 se.wait_max : 0.389118 se.wait_max : 0.027009 se.wait_max : 0.083961 se.wait_max : 3.264754 se.wait_max : 0.564208 se.wait_max : 0.092138 se.wait_max : 0.991528 se.wait_max : 3.058615 se.wait_max : 0.026908 se.wait_max : 11.891953 se.wait_max : 10.954652 se.wait_max : 1.115334 se.wait_max : 1.226744 se.wait_max : 10.947289 se.wait_max : 11.980765 se.wait_max : 1.349930 se.wait_max : 12.950094 se.wait_max : 1.438758 se.wait_max : 12.995926 se.wait_max : 1.523568 se.wait_max : 13.067781 se.wait_max : 15.039010 se.wait_max : 1.646566 se.wait_max : 15.134557 se.wait_max : 1.757942 se.wait_max : 1.830261 se.wait_max : 15.265369 se.wait_max : 1.921997 se.wait_max : 15.192441 se.wait_max : 2.097341 se.wait_max : 0.243086 se.wait_max : 2.028549 se.wait_max : 2.169289 se.wait_max : 2.280411 se.wait_max : 2.356268 se.wait_max : 1.719666 se.wait_max : 2.451732 se.wait_max : 2.535262 se.wait_max : 1.846795 se.wait_max : 2.489097 se.wait_max : 2.550678 se.wait_max : 2.816960 se.wait_max : 2.728959 se.wait_max : 2.068324 se.wait_max : 2.912895 se.wait_max : 3.072540 se.wait_max : 0.952594 se.wait_max : 3.334903 se.wait_max : 3.160131 se.wait_max : 3.399756 se.wait_max : 11.581309 se.wait_max : 3.684554 se.wait_max : 278.279914 se.wait_max : 23.601522 se.wait_max : 98.543577 se.wait_max : 259.453483 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 127.155292 se.wait_max : 0.000000 se.wait_max : 494.934718 se.wait_max : 0.000000 se.wait_max : 20.022997 se.wait_max : 40.072908 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 27.785589 se.wait_max : 45.076186 se.wait_max : 33.335700 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 121.580585 se.wait_max : 6.825214 se.wait_max : 18.178228 se.wait_max : 0.000000 se.wait_max : 364.938378 se.wait_max : 186.608743 se.wait_max : 0.000000 se.wait_max : 191.874571 se.wait_max : 12.138841 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 365.831692 se.wait_max : 196.069929 se.wait_max : 186.608743 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 235.761630 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 78.088419 se.wait_max : 33.871607 se.wait_max : 0.000000 se.wait_max : 26.201039 se.wait_max : 476.629993 se.wait_max : 587.639944 se.wait_max : 184.918777 se.wait_max : 334.997228 se.wait_max : 216.701937 se.wait_max : 0.909482 se.wait_max : 134.804841 se.wait_max : 114.442959 se.wait_max : 100.189898 se.wait_max : 100.155489 se.wait_max : 134.945855 se.wait_max : 114.634475 se.wait_max : 0.000000 se.wait_max : 542.107819 se.wait_max : 40.222426 se.wait_max : 0.000000 se.wait_max : 16.558527 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 99.997595 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 56.632829 se.wait_max : 0.000000 se.wait_max : 0.000000 se.wait_max : 80.577203 This is the worst register I have, it was obtained after clearing the CFS stats. But most of the logs I have from this script are fine (only a few se.wait_max over 40 msecs, with the worst being around 120 or so). --
| Andrew Morton | Re: Linux 2.6.21-rc4 |
| Andrew Morton | echo mem > /sys/power/state |
| Adrian Bunk | Re: 2.6.23-rc9 boot failure (megaraid?) |
| Florian Schmidt | blacklist kernel boot option |
git: | |
| Nicolas Pitre | Re: [PATCH] gc: call "prune --expire 2.weeks.ago" |
| Linus Torvalds | Re: git versus CVS (versus bk) |
| Ping Yin | [RFC] git reset --recover |
| Russ Dill | git-svn+cygwin failed fetch |
| GVG GVG | ssh_exchange_identification: Connection closed by remote host |
| Karel Kulhavy | lookup option in /etc/resolv.conf ignored |
| Alexey Suslikov | OT: OpenBSD on Asus eeePC |
| ACP | maybe OT 3 year anniversay of Chuck Yerkes death |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Karsten Keil | Linux IPv6 DAD not full conform to RFC 4862 ? |
| Arkadiusz Miskiewicz | htb and UDP packages bigger than 1500 |
| Ilpo Järvinen | Re: [PATCH] INET : removes per bucket rwlock in tcp/dccp ehash table |
