Re: [1/1] Block device throttling [Re: Distributed storage.]

To: Evgeniy Polyakov <johnpol@...>
Cc: Jens Axboe <jens.axboe@...>, <netdev@...>, <linux-kernel@...>, <linux-fsdevel@...>, Peter Zijlstra <peterz@...>
Date: Tuesday, August 28, 2007 - 5:08 pm

On Tuesday 28 August 2007 10:54, Evgeniy Polyakov wrote:

No.  Please go and read the description of the race again.  A count
gets irretrievably lost because the write operation of the first
decrement is overwritten by the second. Data gets lost.  Atomic 
operations exist to prevent that sort of thing.  You either need to use 
them or have a deep understanding of SMP read and write ordering in 
order to preserve data integrity by some equivalent algorithm.
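
To make the lost count concrete, here is a minimal user-space sketch of
the interleaving described above.  It is not kernel code: it uses C11
<stdatomic.h> rather than the kernel's atomic_t, and the function names
are made up for illustration.  The racy version replays by hand the case
where both CPUs load the counter before either one stores.

```c
#include <assert.h>
#include <stdatomic.h>

/*
 * Replay the racy interleaving by hand: both "CPUs" load the counter
 * before either stores, so the second store overwrites the first.
 */
int racy_two_decrements(int start)
{
	int counter, cpu0, cpu1;

	assert(start >= 2);	/* keep the example counter non-negative */
	counter = start;
	cpu0 = counter;		/* CPU 0 loads */
	cpu1 = counter;		/* CPU 1 loads the same value */
	counter = cpu0 - 1;	/* CPU 0 stores start - 1 */
	counter = cpu1 - 1;	/* CPU 1 stores start - 1 again: one count lost */
	return counter;
}

/* Each decrement is a single indivisible read-modify-write. */
int atomic_two_decrements(int start)
{
	atomic_int counter;

	assert(start >= 2);
	atomic_init(&counter, start);
	atomic_fetch_sub(&counter, 1);
	atomic_fetch_sub(&counter, 1);
	return atomic_load(&counter);
}
```

Starting from 10, the racy version ends at 9 and the atomic version at 8.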


How about like the patch below.  This throttles any block driver by
implementing a throttle metric method so that each block driver can
keep track of its own resource consumption in units of its choosing.
As an (important) example, it implements a simple metric for device
mapper devices.  Other block devices will work as before, because
they do not define any metric.  Short, sweet and untested, which is
why I have not posted it until now.

This patch originally kept its accounting info in backing_dev_info,
however that structure seems to be in some flux and it is just a part
of struct request_queue anyway, so I lifted the throttle accounting up
into struct request_queue.  We should be able to report on the efficacy
of this patch in terms of deadlock prevention pretty soon.
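
The reserve/release accounting can also be sketched in user space.  This
is only an illustration under stated assumptions, not the patch itself:
struct throttle and the throttle_* names are hypothetical, it uses C11
atomics instead of atomic_t, and it returns false where the patch would
sleep on the wait queue.  It also folds the "enough available?" check
and the subtraction into one compare-and-swap, which is a different
technique from the separate read-then-subtract in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the per-queue throttle fields. */
struct throttle {
	atomic_int available;	/* units not currently reserved */
	int capacity;		/* total units, fixed at init */
};

void throttle_init(struct throttle *t, int capacity)
{
	t->capacity = capacity;
	atomic_init(&t->available, capacity);
}

/* Reserve `need` units only if all of them are available right now. */
bool throttle_try_reserve(struct throttle *t, int need)
{
	int avail = atomic_load(&t->available);

	while (avail >= need) {
		/* On failure, avail is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&t->available, &avail,
						 avail - need))
			return true;
	}
	return false;	/* a real submitter would wait here instead */
}

/* Give back units reserved earlier, as the completion path would. */
void throttle_release(struct throttle *t, int need)
{
	int prev = atomic_fetch_add(&t->available, need);

	assert(prev + need <= t->capacity);	/* catch a double release */
}
```

With a capacity of 1000, a reservation of 600 succeeds, a second one of
600 is refused until the first is released, and after re-reserving 600
there are 400 units left.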

--- 2.6.22.clean/block/ll_rw_blk.c	2007-07-08 16:32:17.000000000 -0700
+++ 2.6.22/block/ll_rw_blk.c	2007-08-24 12:07:16.000000000 -0700
@@ -3237,6 +3237,15 @@ end_io:
  */
 void generic_make_request(struct bio *bio)
 {
+	struct request_queue *q = bdev_get_queue(bio->bi_bdev);
+
+	if (q && q->metric) {
+		int need = bio->bi_reserved = q->metric(bio);
+		bio->queue = q;
+		wait_event_interruptible(q->throttle_wait, atomic_read(&q->available) >= need);
+		atomic_sub(need, &q->available);
+	}
+
 	if (current->bio_tail) {
 		/* make_request is active */
 		*(current->bio_tail) = bio;
--- 2.6.22.clean/drivers/md/dm.c	2007-07-08 16:32:17.000000000 -0700
+++ 2.6.22/drivers/md/dm.c	2007-08-24 12:14:23.000000000 -0700
@@ -880,6 +880,11 @@ static int dm_any_congested(void *conges
 	return r;
 }
 
+static unsigned dm_metric(struct bio *bio)
+{
+	return bio->bi_vcnt;
+}
+
 /*-----------------------------------------------------------------
  * An IDR is used to keep track of allocated minor numbers.
  *---------------------------------------------------------------*/
@@ -997,6 +1002,10 @@ static struct mapped_device *alloc_dev(i
 		goto bad1_free_minor;
 
 	md->queue->queuedata = md;
+	md->queue->metric = dm_metric;
+	atomic_set(&md->queue->available, md->queue->capacity = 1000);
+	init_waitqueue_head(&md->queue->throttle_wait);
+
 	md->queue->backing_dev_info.congested_fn = dm_any_congested;
 	md->queue->backing_dev_info.congested_data = md;
 	blk_queue_make_request(md->queue, dm_request);
--- 2.6.22.clean/fs/bio.c	2007-07-08 16:32:17.000000000 -0700
+++ 2.6.22/fs/bio.c	2007-08-24 12:10:41.000000000 -0700
@@ -1025,7 +1025,12 @@ void bio_endio(struct bio *bio, unsigned
 		bytes_done = bio->bi_size;
 	}
 
-	bio->bi_size -= bytes_done;
+	if (!(bio->bi_size -= bytes_done) && bio->bi_reserved) {
+		struct request_queue *q = bio->queue;
+		atomic_add(bio->bi_reserved, &q->available);
+		bio->bi_reserved = 0; /* just in case */
+		wake_up(&q->throttle_wait);
+	}
 	bio->bi_sector += (bytes_done >> 9);
 
 	if (bio->bi_end_io)
--- 2.6.22.clean/include/linux/bio.h	2007-07-08 16:32:17.000000000 -0700
+++ 2.6.22/include/linux/bio.h	2007-08-24 11:53:51.000000000 -0700
@@ -109,6 +109,9 @@ struct bio {
 	bio_end_io_t		*bi_end_io;
 	atomic_t		bi_cnt;		/* pin count */
 
+	struct request_queue	*queue;		/* for throttling */
+	unsigned		bi_reserved;	/* throttle metric */
+
 	void			*bi_private;
 
 	bio_destructor_t	*bi_destructor;	/* destructor */
--- 2.6.22.clean/include/linux/blkdev.h	2007-07-08 16:32:17.000000000 -0700
+++ 2.6.22/include/linux/blkdev.h	2007-08-24 12:04:14.000000000 -0700
@@ -395,6 +395,10 @@ struct request_queue
 	struct work_struct	unplug_work;
 
 	struct backing_dev_info	backing_dev_info;
+	unsigned (*metric)(struct bio *bio);	/* bio throttle metric */
+	wait_queue_head_t	throttle_wait;
+	atomic_t		available;
+	unsigned		capacity;
 
 	/*
 	 * The queue owner gets to use this for whatever they like.