This patch allows a Network Block Device to be mounted locally.
It creates a kthread to avoid the deadlock described in the NBD tools documentation.
So, if nbd-client hangs waiting for pages, the kblockd thread can continue its
work and free pages.

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
---
drivers/block/nbd.c | 146 ++++++++++++++++++++++++++++++++++-----------------
include/linux/nbd.h | 4 +-
2 files changed, 100 insertions(+), 50 deletions(-)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index b4c0888..de6685e 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -29,6 +29,7 @@
#include <linux/kernel.h>
#include <net/sock.h>
#include <linux/net.h>
+#include <linux/kthread.h>
#include <asm/uaccess.h>
#include <asm/system.h>
@@ -434,6 +435,87 @@ static void nbd_clear_que(struct nbd_device *lo)
}
+static void nbd_handle_req(struct nbd_device *lo, struct request *req)
+{
+	if (!blk_fs_request(req))
+		goto error_out;
+
+	nbd_cmd(req) = NBD_CMD_READ;
+	if (rq_data_dir(req) == WRITE) {
+		nbd_cmd(req) = NBD_CMD_WRITE;
+		if (lo->flags & NBD_READ_ONLY) {
+			printk(KERN_ERR "%s: Write on read-only\n",
+					lo->disk->disk_name);
+			goto error_out;
+		}
+	}
+
+	req->errors = 0;
+
+	mutex_lock(&lo->tx_lock);
+	if (unlikely(!lo->sock)) {
+		mutex_unlock(&lo->tx_lock);
+		printk(KERN_ERR "%s: Attempted send on closed socket\n",
+				lo->disk->disk_name);
+		req->errors++;
+		nbd_end_request(req);
+		return;
+	}
+
+	lo->active_req = req;
+
+	if (nbd_send_req(lo, req) != 0) {
+		printk(KERN_ERR "%s: Request send failed\n",
+				lo->disk->disk_name);
+		req->errors++;
+		nbd_end_request(req);
+	} else {
+		spin_lock(&lo->queue_lock);
+		list_add(&req->queuelist, &lo->queue_head);
+		spin_unlock(&lo->queue_lock);
+	}
+
+	lo->active_req = NULL;
+	mutex_unlock(&lo->tx_lock);
+	wake_up_all(&lo->active_wq);
+
+	return;
+
+error_out:
+	req->errors++;
+	nbd_end_request(req);
+}
+
+static int ...

Hmm, and if there are no other pages that can be freed? Unlikely, but can happen AFAICT.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
--
Local NBD is good for when the content you want to make available through the block device is dynamic (generated on-the-fly), non-linear or supersparse. Take for example VMware virtual disks. Just a guess, but they roughly can look like this:

  kilobytes  0.. 1: header
  kilobytes  1..10: correspond to LBA  0..20
  kilobytes 11..20: correspond to LBA 40..60
  kilobytes 21..22: correspond to LBA 22..23

So what we have is non-linearity -- LBA 22 comes after LBA 40 -- loop does not deal with that. And there is supersparsity -- the VMDK file itself is complete, but unallocated regions like LBA 24..40 are sparse/zero when projected onto a file/block device, respectively; loop cannot deal with that either.

In fact, VMware uses local nbd today for its vmware-loop helper utility, most likely because of the above-mentioned reasons. (Though it quite often hung last time I tried.)
--
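[Editorial sketch, not part of the original thread.] To make the non-linear and sparse layout above concrete, here is a minimal C sketch of the lookup a userspace NBD server might do for such an image. Everything in it is an assumption made for illustration: the names `struct extent`, `extent_map` and `lba_to_file_offset` are hypothetical, the table just transcribes the guessed layout with 512-byte sectors and half-open LBA ranges, and it says nothing about the real VMDK format.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512	/* assumed sector size */

/* Hypothetical extent: a run of logical sectors stored contiguously
 * at some byte offset inside the image file. */
struct extent {
	uint64_t lba_start;	/* first logical sector of the run */
	uint64_t nsect;		/* number of sectors in the run */
	uint64_t file_off;	/* byte offset of the run in the image */
};

/* The guessed layout from the message above, in file order.
 * Note that the runs are not sorted by LBA (non-linearity) and
 * that LBA 20..21 and 24..39 have no run at all (sparsity). */
static const struct extent extent_map[] = {
	{  0, 20,  1 * 1024 },	/* LBA  0..19 at kilobyte  1 */
	{ 40, 20, 11 * 1024 },	/* LBA 40..59 at kilobyte 11 */
	{ 22,  2, 21 * 1024 },	/* LBA 22..23 at kilobyte 21 */
};

/* Return the byte offset in the image file that backs the given LBA,
 * or -1 if the sector is unallocated and should read back as zeros. */
static int64_t lba_to_file_offset(uint64_t lba)
{
	size_t i;

	for (i = 0; i < sizeof(extent_map) / sizeof(extent_map[0]); i++) {
		const struct extent *e = &extent_map[i];

		if (lba >= e->lba_start && lba - e->lba_start < e->nsect)
			return (int64_t)(e->file_off +
					 (lba - e->lba_start) * SECTOR_SIZE);
	}
	return -1;
}
```

With this table, `lba_to_file_offset(22)` lands after the data for LBA 40 in the file, and `lba_to_file_offset(24)` returns -1. A plain loop device has no such translation step, since it maps block N of the device to offset N of the file.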
It allows writing a user-level block device. In my case, I can mount disk ...

Correct. The patch improves the NBD behavior even if it is not perfect. And I think if no other page can be freed, your system is in a very bad way ;-)

Laurent
-- 
----------------- Laurent.Vivier@bull.net ------------------
"Perfection is achieved not when there is nothing left to add,
but when there is nothing left to take away." Saint Exupéry
--
Not necessarily. Problems start when the system wants to free memory by writing out pages through NBD, and the userspace process servicing it tries to allocate some memory in order to accomplish this.

Recent kernels have gotten much better at coping with this, so it might not be easy to make local NBD deadlock under normal circumstances. But if you try hard enough, it's not impossible: throttle_vm_writeout() can stall an allocation until pending writes have completed, all with plenty of memory available in the system.

BTW, you can basically substitute local NBD with fuse-over-loop, and get a similar kind of service, with similar problems.

Miklos
--
So the description should be "This patch lowers the probability of deadlock if you mount a Network Block Device locally". Hmm.

Pavel
--
