"I've long hated the non-killability of tasks accessing a dead NFS server," Matthew Wilcox said, posting a prototype patch that fixes the issue based on an idea Linus Torvalds posted in 2002. Matthew added, "I've only added one real user of the killable concept to this patch -- try_lock_page(). However, this is enough for 'cat */*/*' to be killable with a ^C when I unplug the ethernet cord between it and the nfs server."
Linus responded favorably to the patch: "hey, I obviously approve. And the patch looks simple." He went on to suggest resubmitting it for the next merge window: "feel free to re-submit after 2.6.23 is out the door, I don't think anybody will really complain. Any NFS user will know why something like this can be really nice."
From: Matthew Wilcox [email blocked]
Subject: [RFC] TASK_KILLED
Date: Wed, 29 Aug 2007 14:40:48 -0600

I've long hated the non-killability of tasks accessing a dead NFS server. Linus had an idea for fixing this way back in 2002:
http://www.ussg.iu.edu/hypermail/linux/kernel/0208.0/0167.html
which I've prototyped in this patch.

Splitting up TASK_* into separate bits is going to need a lot more auditing, I think. It was easier back in 2002, but since then we've added TASK_STOPPED and TASK_TRACED which also need to be gingerly checked for. There's some debug code left in here to discourage Linus from just applying it.

I've only added one real user of the killable concept to this patch -- try_lock_page(). However, this is enough for 'cat */*/*' to be killable with a ^C when I unplug the ethernet cord between it and the nfs server.

I have another version of this patch which makes TASK_KILLABLE a separate state on par with TASK_INTERRUPTIBLE and TASK_UNINTERRUPTIBLE, but I don't like it as much as this one. I'll post it if there's demand.
From: Linus Torvalds [email blocked]
Subject: Re: [RFC] TASK_KILLED
Date: Wed, 29 Aug 2007 21:38:03 -0700 (PDT)

On Wed, 29 Aug 2007, Matthew Wilcox wrote:
>
> I've long hated the non-killability of tasks accessing a dead
> NFS server. Linus had an idea for fixing this way back in 2002:
> http://www.ussg.iu.edu/hypermail/linux/kernel/0208.0/0167.html which
> I've prototyped in this patch.

Hey, I obviously approve. And the patch looks simple.

Feel free to re-submit after 2.6.23 is out the door, I don't think anybody will really complain. Any NFS user will know why something like this can be really nice.

Linus
Hmm
This should have been fixed a long... long time ago...
Sounds simple, shame that it hasn't been fixed long ago.
Amazing how long it took to get it fixed.
Wouldn't it be possible and better to use a time-out check that automatically closes the connection on time-out?
time-out
NFS is actually stateless, so there is no connection to close.
Mounting with intr?
What about mounting NFS mounts with "intr"? Does that not cover all needed cases?
intr
Unfortunately, it doesn't. The uninterruptible sleep that I addressed in this patch happens in the VFS where NFS can do nothing about it. The 'intr' option also allows all signals, not just fatal ones. While this is POSIX-compliant behaviour, it's also something that applications just aren't expecting, so it can cause them to break.
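The distinction drawn above, between a sleep that any signal can interrupt and one that only a fatal signal can end, can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions: the `SLEEP_*` names, the bit values, and both helper functions are hypothetical and are not taken from the patch. The rule used for "fatal" (SIGKILL, or a terminating signal the process has not installed a handler for) is a simplification chosen for illustration.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

/* Hypothetical sleep-state bits for illustration only; these are not the
 * names or values used by the patch. */
enum sleep_state {
    SLEEP_INTERRUPTIBLE   = 1 << 0, /* any signal wakes the sleeper */
    SLEEP_UNINTERRUPTIBLE = 1 << 1, /* no signal wakes the sleeper */
    SLEEP_WAKEKILL        = 1 << 2, /* modifier: fatal signals still wake it */
    SLEEP_KILLABLE        = SLEEP_UNINTERRUPTIBLE | SLEEP_WAKEKILL
};

/* Assumption: a signal is fatal if it is SIGKILL, or if the process has no
 * handler for it and its default action is to terminate. Only a handful of
 * terminating signals are listed here. */
bool signal_is_fatal(int sig, bool handler_installed)
{
    if (sig == SIGKILL)
        return true;
    if (handler_installed)
        return false;
    switch (sig) {
    case SIGINT:
    case SIGTERM:
    case SIGHUP:
    case SIGQUIT:
        return true;
    default:
        return false;
    }
}

/* Decide whether a pending signal should end a sleep in the given state. */
bool signal_wakes_sleeper(unsigned state, int sig, bool handler_installed)
{
    assert(sig > 0);
    if (state & SLEEP_INTERRUPTIBLE)
        return true;                      /* the 'intr' behaviour: any signal */
    if (state & SLEEP_WAKEKILL)
        return signal_is_fatal(sig, handler_installed);
    return false;                         /* plain uninterruptible sleep */
}
```

In this model a ^C (SIGINT with no handler installed) ends a `SLEEP_KILLABLE` wait, while the same signal with a handler installed does not, so an application never sees an unexpected interrupted system call.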
When you unplug Ethernet on the client side, the loss of link carrier should be promptly communicated to every part of the system interested in networking, including of course the NFS client. In that case, cat */*/*, like any other process doing read/readdir on the affected files, should ideally get -1 in return with errno set to ENETDOWN. Likewise, when the NFS server drops off the network (it stops answering ARP requests, or a corresponding ICMP message arrives), the error should be EHOSTDOWN.
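From an application's point of view, the behaviour proposed here might look like the sketch below. It only illustrates the commenter's suggestion; it does not describe what the Linux NFS client actually returns. The `io_verdict` enum and `classify_read_result()` are hypothetical names, and EHOSTDOWN is assumed to be available, as it is on Linux and the BSDs, although POSIX does not define it.

```c
#include <errno.h>

/* Hypothetical outcome categories for a read()-style call. */
enum io_verdict {
    IO_OK,            /* the call succeeded */
    IO_RETRY,         /* interrupted by a signal; the caller may retry */
    IO_NETWORK_GONE,  /* local link or remote host is down; give up */
    IO_OTHER_ERROR    /* anything else */
};

/* Classify the return value and errno of a read()-style call, assuming the
 * file system reported network failures the way the comment proposes. */
enum io_verdict classify_read_result(long nread, int err)
{
    if (nread >= 0)
        return IO_OK;
    switch (err) {
    case EINTR:
        return IO_RETRY;
    case ENETDOWN:   /* proposed: local link carrier lost */
    case EHOSTDOWN:  /* proposed: server unreachable */
        return IO_NETWORK_GONE;
    default:
        return IO_OTHER_ERROR;
    }
}
```

A caller would pass the result of read() together with the value of errno saved immediately after the call, and stop instead of blocking when the verdict is `IO_NETWORK_GONE`.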
The original Unix filesystem was designed for disks attached by a bus and used exclusively by one local host. The bus was assumed to be much more reliable than a network. Unfortunately, that reliability assumption was carried over to NFS without any adjustment. So these non-interruptible processes trying to work with an inaccessible NFS server are really annoying.
Compare this with Plan 9's protocol, 9P. It has no such problems, because the system was designed from the start to be networked. With today's technology, the difference between a bus and a network is becoming less and less clear. Ethernet, SATA, SAS, USB, FireWire, InfiniBand, Fibre Channel, iSCSI: which is which? Moreover, current research and optimization are directed toward shared storage access that also involves multiple storage hosts. See NFSv4, GFS2, Lustre and other similar projects.