On 9/17/07, Daniel Phillips <phillips@phunq.net> wrote:
Hope you enjoyed yourself. First off, as always thanks for the
extremely insightful reply.
To give you context for where I'm coming from: I'm looking to get NBD
to survive the mke2fs hell I described here:
http://marc.info/?l=linux-mm&m=118981112030719&w=2
Once the memory requirements of a userspace daemon (e.g. nbd-server)
are known, should one mlockall() the memory, similar to how it is done
in the heartbeat daemon's realtime library?
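A minimal sketch of the mlockall() approach asked about above, assuming a
Linux userspace daemon. The helper name lock_daemon_memory() is
hypothetical; it is not taken from nbd-server or from heartbeat's realtime
library. The call can fail with ENOMEM or EPERM when RLIMIT_MEMLOCK is too
low or the process lacks CAP_IPC_LOCK, so the result has to be checked.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: pin all pages currently mapped by this process,
 * and all pages mapped in the future, into RAM so they cannot be
 * swapped out. Returns 0 on success, -1 on failure (errno is left as
 * set by mlockall()).
 */
static int lock_daemon_memory(void)
{
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        fprintf(stderr, "mlockall failed: %s\n", strerror(errno));
        return -1;
    }
    return 0;
}
```

This would be called once, early in daemon startup. Note that locking
memory only prevents the daemon's own pages from being paged out; it does
not by itself bound how much memory the daemon allocates later.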
Bigger question for me is what kind of hell am I (or others) in for to
try to cap nbd-server's memory usage? All those glib-gone-wild
changes over the recent past feel problematic but I'll look to work
with Wouter to see if we can get things bounded.
Would Peter's per-bdi dirty page accounting patchset provide this? If
not, what steps are you taking to disable this mechanism? I've found
that nbd-server is frequently locked with 'blk_congestion_wait' in its
call trace when I hit the deadlock.
I've embraced Evgeniy's bio throttle patch on a 2.6.22.6 kernel
http://thread.gmane.org/gmane.linux.network/68021/focus=68552
But are you referring to that (as you did below) or is this more a
reference to peterz's bdi dirty accounting patchset?
I've been using Avi Kivity's patch from some time ago:
http://lkml.org/lkml/2004/7/26/68
to get nbd-server to run in PF_MEMALLOC mode (could've just used
the _POSIX_PRIORITY_SCHEDULING hack instead right?)... it didn't help
on its own; I likely didn't have enough of the stars aligned to see my
MD+NBD mke2fs test not deadlock.
I assume peterz's network deadlock avoidance patchset (or some subset
of it) has you covered here?
OK, yes I've included Christoph's recursive reclaim patch and didn't
have any luck either. Good to know that patch isn't _really_ going to
help me.
I've been working off-list (with Evgeniy's help!) to give the bio
throttling patch a try. I hacked MD (md.c and raid1.c) to limit NBD
members to only 10 in-flight IOs. Without this throttle I'd see up to
170 IOs on the raid1's nbd0 member; with it the IO count holds fairly
constant at ~16. But this didn't help my deadlock test either. Also,
throttling in-flight IOs like this feels inherently sub-optimal. Have
you taken any steps to make the 'bio-limit' dynamic in some way?
Anyway, I'm thinking I need to be stacking more/all of these things
together rather than trying them piece-wise.
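To make the idea of an in-flight IO cap concrete, here is a rough
userspace sketch of a counter-based throttle with an adjustable limit.
This is an illustration only: it is not Evgeniy's bio throttle patch and
not the MD hack described above, and every name in it (struct io_throttle,
io_throttle_try_submit(), and so on) is hypothetical. It is
single-threaded and has no locking; a kernel version would need atomic
counters and a way to sleep until a slot frees up.

```c
#include <stdbool.h>

/* Hypothetical throttle state for one device. */
struct io_throttle {
    unsigned int limit;     /* maximum requests allowed in flight */
    unsigned int in_flight; /* requests submitted but not yet completed */
};

static void io_throttle_init(struct io_throttle *t, unsigned int limit)
{
    t->limit = limit;
    t->in_flight = 0;
}

/* Try to admit one request. Returns false if the cap is reached. */
static bool io_throttle_try_submit(struct io_throttle *t)
{
    if (t->in_flight >= t->limit)
        return false;
    t->in_flight++;
    return true;
}

/* Called on request completion to release one slot. */
static void io_throttle_complete(struct io_throttle *t)
{
    if (t->in_flight > 0)
        t->in_flight--;
}

/*
 * Change the cap at runtime. Requests already in flight are not
 * cancelled; a lowered limit only blocks new submissions until enough
 * completions have drained.
 */
static void io_throttle_set_limit(struct io_throttle *t, unsigned int limit)
{
    t->limit = limit;
}
```

A "dynamic" limit in this sketch is just a call to
io_throttle_set_limit() from whatever policy decides the cap; what that
policy should be is exactly the open question.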
I'm going to try adding all the things I've learned into the mix all
at once; including both of peterz's patchsets. Peter, do you have a
git repo or website/ftp site for your latest per-bdi and network
deadlock patchsets? Pulling them out of LKML archives isn't "fun".
Also, I've noticed that the more recent network deadlock avoidance
patchsets haven't included NBD changes; any reason why these have been
dropped? Should I just look to shoe-horn in previous NBD-oriented
patches from an earlier version of that patchset?
That would be quite helpful; all that I've learned has largely been
from your various posts (or others' responses to your posts).
Requires a hell of a lot of digging and ultimately I'm still missing
something.
In closing, if you (or others) are aware of a minimalist recipe that
would help me defeat this mke2fs MD+NBD deadlock test (as detailed in
my linux-mm post that I referenced above) I'd be hugely grateful.
thanks,
Mike