Re: Swap on loop device on tmpfs locks up machine

Previous thread: Re: Whats going on? (somebody can say something?) by Janos Haar on Friday, September 26, 2008 - 11:07 am. (1 message)

Next thread: [PATCH] fs/block_dev.c: __read_mostly improvement and sb_is_blkdev_sb utilization by Denis ChengRq on Friday, September 26, 2008 - 11:52 am. (1 message)
From: Vegard Nossum
Date: Friday, September 26, 2008 - 11:46 am

Hi,

It turns out that swap over a loop device on tmpfs will lock up the
machine. That is, all programs become blocked, but I can still do
things like switch VT consoles.

To reproduce (as root):

    mount -t tmpfs tmpfs /mnt
    dd if=/dev/zero of=/mnt/disk # no count: runs until tmpfs is full
    swapoff -a
    losetup -f /mnt/disk
    mkswap /dev/loop0 # replace loop0 with actual loop device
    swapon /dev/loop0
    <memory hog, like continuous malloc()>
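
The memory hog in the last step can be anything that keeps allocating
and touching pages. A minimal sketch in C follows; the helper name
hog_memory() and its chunk limit are hypothetical and not from the
original report. The limit exists only so the function can be tried
safely; a limit of 0 means "keep going".

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_BYTES (1024 * 1024)   /* allocate 1 MiB at a time */

/* Head of a list of allocated chunks. Keeping every chunk reachable
 * through a global stops the compiler from optimizing the allocations
 * away. (Hypothetical name, for illustration only.) */
void *hog_head = NULL;

/*
 * Allocate and touch memory in 1 MiB chunks until malloc() fails or
 * max_chunks chunks have been allocated (0 means no limit). Nothing is
 * freed, on purpose. Returns the number of chunks allocated.
 */
size_t hog_memory(size_t max_chunks)
{
    size_t chunks = 0;

    while (max_chunks == 0 || chunks < max_chunks) {
        void *p = malloc(CHUNK_BYTES);
        if (p == NULL)
            break;
        /* Write to every page so it is really backed by memory;
         * an untouched allocation may never consume RAM or swap. */
        memset(p, 0xff, CHUNK_BYTES);
        *(void **)p = hog_head;     /* link the chunk into the list */
        hog_head = p;
        chunks++;
    }
    return chunks;
}
```

Calling hog_memory(0) from main() gives the unbounded hog used in the
recipe. With memory overcommit enabled, the process is more likely to
stall or be killed than to ever see malloc() return NULL.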

I'm not sure it's really a very good idea to do this in the first
place, but should something give a warning or prevent a user from
doing it?

There is no output on the serial console, even though I have most
debugging options turned on. SysRq-l (backtrace for active CPUs)
shows nothing of much interest:

Pid: 0, comm: swapper Not tainted (2.6.27-rc7-00106-g6ef190c #1)
EIP: 0060:[<c011f295>] EFLAGS: 00000202 CPU: 0
EIP is at native_safe_halt+0x5/0x10

And SysRq-w (task dump for blocked tasks) shows that almost all
processes are blocking in schedule_timeout, for example:

bash          D f6eb5cf0  6184  3688   3687
       f6eb5d38 00200046 00000000 f6eb5cf0 c0159b4f 00000000 f6a7f2c0 c0159d7b
       c2036d80 f6933fc0 67525eae 000000f4 f6a7f2c0 f6a7f534 c2036d80 f6eb4000
       c0595da3 c0955b80 000210af 00000000 c013f9eb 000124b0 00000000 00200296
Call Trace:
 [<c0159b4f>] ? mark_held_locks+0x6f/0x90
 [<c0159d7b>] ? trace_hardirqs_on+0xb/0x10
 [<c0595da3>] ? _spin_unlock_irqrestore+0x43/0x70
 [<c013f9eb>] ? __mod_timer+0x9b/0xe0
 [<c05933c8>] schedule_timeout+0x48/0xc0
 [<c0159d7b>] ? trace_hardirqs_on+0xb/0x10
 [<c013f480>] ? process_timeout+0x0/0x10
 [<c05933c3>] ? schedule_timeout+0x43/0xc0
 [<c05932ee>] io_schedule_timeout+0x1e/0x30
 [<c0187385>] congestion_wait+0x55/0x70
 [<c0149900>] ? autoremove_wake_function+0x0/0x50
 [<c0181969>] throttle_vm_writeout+0x69/0x80
 [<c0184fa5>] shrink_zone+0x75/0x130
 [<c010a325>] ? native_sched_clock+0xb5/0x110
 [<c01853c7>] do_try_to_free_pages+0x107/0x3c0
 [<c018576d>] try_to_free_pages+0x6d/0x80
 [<c0183ea0>] ? ...
--

From: David Newall
Date: Saturday, September 27, 2008 - 5:49 am

Doesn't tmpfs use otherwise-free virtual memory?  I expect the machine
would lock up if you put swap (i.e. additional virtual memory) on such a
device.
--

From: Bill Davidsen
Date: Saturday, September 27, 2008 - 10:38 am

To reinstate the paragraph from the O.P. you snipped:
 >> I'm not sure it's really a very good idea to do this in the first
 >> place, but should something give a warning or prevent a user from
 >> doing it?

I think you are both right: it is a bad thing to do, it does seem to lock up,
and something should prevent a user from doing that. But it may be easier to fix
the lockup than to get the "prevent" right; there appears to be a loop there.

Just a simple question to the O.P.: what were you thinking?!! Or was this a
test just to see what would happen?

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot
--

From: Vegard Nossum
Date: Saturday, September 27, 2008 - 11:30 am

Just playing with the kernel :-)

Sometimes the "insane" things to do will turn up real errors in the
code. This one is in the borderlands, but I thought it wouldn't hurt
to post the results in either case.


Vegard

-- 
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
	-- E. W. Dijkstra, EWD1036
--

From: Hugh Dickins
Date: Sunday, September 28, 2008 - 8:18 am

It's just a "don't do that" in my opinion, and it doesn't seem
to have caused much trouble for sysadmins down the years.

It's good to have a loop driver that can make regular files look
like block devices, and it's good to have that working on tmpfs;
and I'm glad that trying to swapon a tmpfs file directly just
happens to fail because tmpfs doesn't support bmap().
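
One way to look at the bmap() point from userspace is the FIBMAP
ioctl, which asks the filesystem to map a logical file block to a
device block. The sketch below is only an illustration under that
assumption: probe_fibmap() is a hypothetical helper, and the exact
error returned for a tmpfs file may depend on kernel version and
privileges.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* FIBMAP */

/*
 * Ask the filesystem to map logical block 0 of `path` to a device
 * block. Returns 0 if the mapping request succeeded, or a negative
 * errno value if open() or the ioctl failed.
 * (Hypothetical helper, for illustration only.)
 */
int probe_fibmap(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -errno;

    int block = 0;      /* in: logical block number, out: device block */
    int ret = 0;
    if (ioctl(fd, FIBMAP, &block) < 0)
        ret = -errno;   /* saved before close() can change errno */

    close(fd);
    return ret;
}
```

FIBMAP normally needs root (CAP_SYS_RAWIO), so -EPERM just means the
probe lacked privileges. A result of -EINVAL on a file under a tmpfs
mount would be consistent with the filesystem not providing bmap().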

But I don't think it's worth adding in some "valid for swap" call
to block devices, and saying no when loop or when loop over tmpfs.
Trying to swap to loop over tmpfs is a particularly clear example
of something that will end badly - unless the tmpfs file is locked
in memory? haven't tried that - but I wouldn't recommend swapping
to loop over anything (it interposes levels between swap and device
which just increase the likelihood of hang).

Just don't do that.

Hugh
--
