Hi Jens,
The following patches are changes of the block layer for
request-based dm-multipath.
The patches are created on the following commit of for-2.6.28
in linux-2.6-block.
---------------------------------------------------------------
commit 1fd75f9b2e1c273ab145c9bf3e7a8afcbd8be87a
Author: Jens Axboe <jens.axboe@oracle.com>
Date: Tue Sep 16 15:09:28 2008 -0700
block: cleanup some of the integrity stuff in blkdev.h
---------------------------------------------------------------
No major changes since the last post (http://lkml.org/lkml/2008/9/12/100),
except for following up the differences from 2.6.27-rc6.
(As I said, I don't include patches for refactoring of similar
interfaces at this time. The refactoring patches will be posted
after the request-based dm-multipath work is done.)
Summary of the patches:
1/3: block: add a request data completion interface
2/3: block: add a request submission interface
3/3: block: add a queue flag for request stacking support
Please review and apply.
Thanks,
Kiyoshi Ueda
--
This patch adds blk_update_request(), which updates struct request
with completing its data part, but doesn't complete the struct
request itself.
Though it looks like end_that_request_first() of older kernels,
blk_update_request() should be used only by request stacking drivers.
Request-based dm will use it in bio->bi_end_io callback to update
the original request when a data part of a cloned request completes.
Followings are additional background information of why request-based
dm needs this interface.
- Request stacking drivers can't use blk_end_request() directly from
the lower driver's completion context (bio->bi_end_io or rq->end_io),
because some device drivers (e.g. ide) may try to complete
their request with queue lock held, and it may cause deadlock.
See below for detailed description of possible deadlock:
<http://marc.info/?l=linux-kernel&m=120311479108569&w=2>
- To solve that, request-based dm offloads the completion of
cloned struct request to softirq context (i.e. using
blk_complete_request() from rq->end_io).
- Though it is possible to use the same solution from bio->bi_end_io,
it will delay the notification of bio completion to the original
submitter. Also, it will cause inefficient partial completion,
because the lower driver can't perform the cloned request anymore
and request-based dm needs to requeue and redispatch it to
the lower driver again later. That's not good.
- So request-based dm needs blk_update_request() to perform the bio
completion in the lower driver's completion context, which is more
efficient.
Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com>
Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
block/blk-core.c | 57 +++++++++++++++++++++++++++++++++++++++++--------
include/linux/blkdev.h | 2 +
2 files changed, 50 insertions(+), 9 deletions(-)
Index: ...This patch adds blk_insert_cloned_request(), a generic request submission interface for request stacking drivers. Request-based dm will use it to submit their clones to underlying devices. blk_rq_check_limits() is also added because it is possible that the lower queue has stronger limitations than the upper queue if multiple drivers are stacking at request-level. Not only for blk_insert_cloned_request()'s internal use, the function will be used by request-based dm when the queue limitation is modified (e.g. by replacing dm's table). Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Jens Axboe <jens.axboe@oracle.com> --- block/blk-core.c | 81 +++++++++++++++++++++++++++++++++++++++++++++++++ include/linux/blkdev.h | 3 + 2 files changed, 84 insertions(+) Index: linux-2.6-block/block/blk-core.c =================================================================== --- linux-2.6-block.orig/block/blk-core.c +++ linux-2.6-block/block/blk-core.c @@ -1522,6 +1522,87 @@ void submit_bio(int rw, struct bio *bio) EXPORT_SYMBOL(submit_bio); /** + * blk_rq_check_limits - Helper function to check a request for the queue limit + * @q: the queue + * @rq: the request being checked + * + * Description: + * @rq may have been made based on weaker limitations of upper-level queues + * in request stacking drivers, and it may violate the limitation of @q. + * Since the block layer and the underlying device driver trust @rq + * after it is inserted to @q, it should be checked against @q before + * the insertion using this generic function. + * + * This function should also be useful for request stacking drivers + * in some cases below, so export this fuction. + * Request stacking drivers like request-based dm may change the queue + * limits while requests are in the queue (e.g. dm's table swapping). + * Such request stacking drivers should check those requests agaist + * the ...
This patch adds a queue flag to indicate the block device can be used for request stacking. Request stacking drivers need to stack their devices on top of only devices of which q->request_fn is functional. Since bio stacking drivers (e.g. md, loop) basically initialize their queue using blk_alloc_queue() and don't set q->request_fn, the check of (q->request_fn == NULL) looks enough for that purpose. However, dm will become both types of stacking driver (bio-based and request-based). And dm will always set q->request_fn even if the dm device is bio-based of which q->request_fn is not functional actually. So we need something else to distinguish the type of the device. Adding a queue flag is a solution for that. The reason why dm always sets q->request_fn is to keep the compatibility of dm user-space tools. Currently, all dm user-space tools are using bio-based dm without specifying the type of the dm device they use. To use request-based dm without changing such tools, the kernel must decide the type of the dm device automatically. The automatic type decision can't be done at the device creation time and needs to be deferred until such tools load a mapping table, since the actual type is decided by dm target type included in the mapping table. So a dm device has to be initialized using blk_init_queue() so that we can load either type of table. Then, all queue stuffs are set (e.g. q->request_fn) and we have no element to distinguish that it is bio-based or request-based, even after a table is loaded and the type of the device is decided. By the way, some stuffs of the queue (e.g. request_list, elevator) are needless when the dm device is used as bio-based. But the memory size is not so large (about 20[KB] per queue on ia64), so I hope the memory loss can be acceptable for bio-based dm users. Signed-off-by: Kiyoshi Ueda <k-ueda@ct.jp.nec.com> Signed-off-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com> Cc: Jens Axboe <jens.axboe@oracle.com> --- block/blk-core.c ...
Thank you, I've merged these three patches. -- Jens Axboe --
