Another post of my block I/O data integrity patches. This kit goes on
top of the scsi_data_buffer and sd.h cleanups I posted earlier today.There has been no changes to the block layer code since my last
submission.Within SCSI, the changes are cleanups based on comments from Christoph
as well as working support for Type 3 and 4KB sectors.What's This All About?
----------------------These patches allow data integrity information (checksum and more) to
be attached to I/Os at the block/filesystem layers and transferred
through the entire I/O stack all the way to the physical storage
device.The integrity metadata can be generated in close proximity to the
original data. Capable host adapters, RAID arrays and physical disks
can verify the data integrity and abort I/Os in case of a mismatch.Right now this is SCSI disk only but similar efforts are in progress
for SATA and SCSI tape. With a few minor nits due to protocol
limitations the proposed SATA format is identical to the SCSI ditto
for easy interoperability.T10 DIF
-------SCSI drives can usually be reformatted to 520-byte sectors, yielding 8
extra bytes per sector. These 8 bytes have traditionally been used by
RAID controllers to store internal protection information.DIF (Data Integrity Field) is an extension to the SCSI Block Commands
that standardizes the format of the 8 extra bytes and defines ways to
interact with the contents at the protocol level.Each 8-byte DIF tuple is split into three chunks:
- a 16-bit guard tag containing a CRC of the data portion of
the sector.- a 16-bit application tag which is up for grabs.
- a 32-bit reference tag which contains an incrementing
counter for each sector. For DIF Type 1 it also needs to
match the physical LBA on the drive.There are three types of DIF defined: Type 1, Type 2, and Type 3.
These patches support Type 1 and Type 3. Type 2 depends on 32-byte
CDBs and is work in progress.Since the DIF tupl...
Pointers to archives would have been appreciated. I can't, for the life
Thanks for all of the great documentation. It would be good to include
some instructions on how one would test this, and what testing you
performed. Also, please use the '-p' switch to diff, as it makes
reviewing patches much easier.I set out to try your changes, but ran into some problems. First, this
patch set didn't apply cleanly to a git checkout. So, I grabbed your
mercurial repository, but got a build failure:block/blk-core.c: In function 'generic_make_request':
include/linux/bio.h:469: sorry, unimplemented: inlining failed in call to 'bio_i
ntegrity_enabled': function body not available
block/blk-core.c:1388: sorry, unimplemented: called from here
make[1]: *** [block/blk-core.o] Error 1
make: *** [block] Error 2Cheers,
Jeff
--
>>>>> "Jeff" == Jeff Moyer <jmoyer@redhat.com> writes:
Jeff> Pointers to archives would have been appreciated. I can't, for
Jeff> the life of me, find these.http://marc.info/?l=linux-scsi&m=121272302931588&w=2
http://marc.info/?l=linux-scsi&m=121278031605941&w=2
http://marc.info/?l=linux-scsi&m=121302438515260&w=2
http://marc.info/?l=linux-scsi&m=121278067906564&w=2Jeff> Thanks for all of the great documentation. It would be good to
Jeff> include some instructions on how one would test this, and what
Jeff> testing you performed.modprobe scsi_debug dix=199 dif=1 guard=1 dev_size_mb=1024 num_parts=1
I'm testing with XFS and btrfs. Generally doing kernel builds, etc.
ext2/3 are still problematic because they modify pages in flight.Jeff> I set out to try your changes, but ran into some problems.
Jeff> First, this patch set didn't apply cleanly to a git checkout.I generally track Linus closely so it must be because of the patches
you were missing.You can grab my patch stack here. It's always in sync with the hg
repo:http://oss.oracle.com/~mkp/patches/
Jeff> block/blk-core.c: In function 'generic_make_request':
Jeff> include/linux/bio.h:469: sorry, unimplemented: inlining failed
Jeff> in call to 'bio_i ntegrity_enabled': function body not available
Jeff> block/blk-core.c:1388: sorry, unimplemented: called from here
Jeff> make[1]: *** [block/blk-core.o] Error 1 make: *** [block] Error
Jeff> 2Odd. Which compiler are you using? Compiles just fine for me on both
EL5 and FC9.Judging from the error I'm guessing it's objecting to the inlining.
Tried to work around it. Please pull, update and let me know whether
that did the trick.--
Martin K. Petersen Oracle Linux Engineering--
On Tue, Jun 10, 2008 at 11:28 AM, Martin K. Petersen
Have you made the ext2/3/4 developers aware of this?
Could you elaborate on the interaction between the data integrity
support in the block layer and a given filesystem? Shouldn't _any_
filesystem "just work" given that the block layer is what is
generating the checksums and then verifying them on read?regards,
Mike
--
Mike> Have you made the ext2/3/4 developers aware of this?
Yep.
Mike> Shouldn't _any_ filesystem "just work" given that the block
Mike> layer is what is generating the checksums and then verifying
Mike> them on read?Yep.
There are a couple of issues. One problem is that pages are no longer
locked down during I/O. Instead the writeback bit is being set to
indicate that I/O is in progress. Not all corners of ext* have been
adapted to that properly. Especially ext2 suffers and often modifies
pages containing metadata while they are in flight. If I remember
correctly, ext2/dir.c hasn't been made aware of writeback at all and
assumes the page lock still works like it used to.That is normally not a huge problem because the page is being
scheduled for write again shortly thereafter. So the inconsistent
block on disk gets overwritten pretty much instantly. But that kind
of sloppy behavior is a no-go with integrity checking turned on.There also appears to be some quirks in the page cache in general.
There's something not quite right in clear_page_dirty() /
page_mkwrite() territory. If I sync excessively I can make any fs
keel over. peterz said that an mmapped page is supposed to be
read-only during writeback but that appears to be racy when a forced
sync is involved.That's my recollection, anyway. I've been busy with the innards of
the integrity code stuff for a couple of months and haven't poked at
the fs/vm issues for a while.--
Martin K. Petersen Oracle Linux Engineering--
So, is it safe to say that the library routines for integrity-aware file
systems have not been tested at all? Specifically, I'm talking about:
bio_integrity_tag_size
bio_integrity_set_tagI did a new clone (just to be sure I got your change) and I get the same
problem. I also can't see the changeset in the log, so are you sure you
pushed it?I got rid of the inline in the definition in bio.h. The .c file didn't
define the function as inline, so I didn't have to change it. It seems
to be building now.Cheers,
Jeff
--
>>>>> "Jeff" == Jeff Moyer <jmoyer@redhat.com> writes:
Jeff> So, is it safe to say that the library routines for
Jeff> integrity-aware file systems have not been tested at all?
Jeff> Specifically, I'm talking about: bio_integrity_tag_size
Jeff> bio_integrity_set_tag bio_integrity_get_tagI have not tried using them from within a filesystem, if that's what
you mean. But I have attached random strings to bios and read them
back later.Jeff> gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-41)
Ok, I'm trying to chase down a 5.2 box to figure out what the problem
is. Maybe I'll just move that function to the header file.Jeff> I did a new clone (just to be sure I got your change) and I get
Jeff> the same problem. I also can't see the changeset in the log, so
Jeff> are you sure you pushed it?Yup, it's there.
Jeff> I got rid of the inline in the definition in bio.h. The .c file
Jeff> didn't define the function as inline, so I didn't have to change
Jeff> it. It seems to be building now.The problem is that your gcc is unhappy about the fact that the
inlined function is defined elsewhere. The gcc info page said only
declare it inline in the header and not the declaration. The change I
pushed removed inline from the .c file. But that didn't help.--
Martin K. Petersen Oracle Linux Engineering--
I was checking to see if you had exercised the code at all, and you
have. Great!Cheers,
Jeff
--
12 files changed, 935 insertions(+), 7 deletions(-)
Documentation/scsi/data-integrity.txt | 57 ++
drivers/scsi/Kconfig | 1
drivers/scsi/Makefile | 2
drivers/scsi/scsi_error.c | 3
drivers/scsi/scsi_lib.c | 4
drivers/scsi/scsi_sysfs.c | 4
drivers/scsi/sd.c | 58 ++
drivers/scsi/sd.h | 22 +
drivers/scsi/sd_dif.c | 644 +++++++++++++++++++++++++++++++++
include/scsi/scsi_cmnd.h | 3
include/scsi/scsi_dif.h | 140 +++++++
include/scsi/scsi_host.h | 4Configure DMA of protection information and issue READ/WRITE commands
with RDPROTECT/WRPROTECT set accordingly.Force READ CAPACITY(16) if the target has the PROTECT bit set and grab
an extra byte of response (P_TYPE and PROT_EN are in byte 12).Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
diff -r ad65bfde4e05 -r 8bc1728dc75a Documentation/scsi/data-integrity.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Documentation/scsi/data-integrity.txt Sat Jun 07 00:45:15 2008 -0400
@@ -0,0 +1,57 @@
+----------------------------------------------------------------------
+1.0 INTRODUCTION
+
+For a general overview of the data integrity framework please consult
+Documentation/block/data-integrity.txt.
+
+----------------------------------------------------------------------
+2.0 SCSI LAYER IMPLEMENTATION DETAILS
+
+The scsi_command has been extended with a scatterlist for the
+integrity metadata. Note that all SCSI mid layer changes refer to
+this using the term "protection information" which is what it is
+called in the T10 spec.
+
+The term DIF (Data Integrity Field) is specific to SCSI disks (SBC).
+The SCSI midlayer doesn't know, or care, about the contents of the
+protection scatterlist, except it calls blk_rq_map_integrity_sg()
+during command initialization.
+
+
+2.1 SCSI DEVICE SCANNING
+
...
10 files changed, 849 insertions(+)
Documentation/block/data-integrity.txt | 327 +++++++++++++++++++++++++++
block/Kconfig | 12
block/Makefile | 1
block/blk-core.c | 7
block/blk-integrity.c | 385 ++++++++++++++++++++++++++++++++
block/blk-merge.c | 3
block/blk.h | 8
block/elevator.c | 6
include/linux/blkdev.h | 97 ++++++++
include/linux/genhd.h | 3Support for merging and mapping bio integrity metadata.
Block device integrity type registration.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
diff -r f2ae9d5bce4c -r 5be7c534c954 Documentation/block/data-integrity.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Documentation/block/data-integrity.txt Sat Jun 07 00:45:15 2008 -0400
@@ -0,0 +1,327 @@
+----------------------------------------------------------------------
+1. INTRODUCTION
+
+Modern filesystems feature checksumming of data and metadata to
+protect against data corruption. However, the detection of the
+corruption is done at read time which could potentially be months
+after the data was written. At that point the original data that the
+application tried to write is most likely lost.
+
+The solution is to ensure that the disk is actually storing what the
+application meant it to. Recent additions to both the SCSI family
+protocols (SBC Data Integrity Field, SCC protection proposal) as well
+as SATA/T13 (External Path Protection) try to remedy this by adding
+support for appending integrity metadata to an I/O. The integrity
+metadata (or protection information in SCSI terminology) includes a
+checksum for each sector as well as an incrementing counter that
+ensures the individual sectors are written in the right order. And
+for some protection schemes also that the I/O is written to the right
+place on...
All of the sysfs interfaces should be documented in Documentation/ABI/
not here, otherwise they will be tough to find and figure out how to
use :)thanks,
greg k-h
--
Greg> All of the sysfs interfaces should be documented in
Greg> Documentation/ABI/ not here, otherwise they will be tough to
Greg> find and figure out how to use :)Will do.
--
Martin K. Petersen Oracle Linux Engineering--
4 files changed, 825 insertions(+), 3 deletions(-)
fs/Makefile | 1
fs/bio-integrity.c | 715 +++++++++++++++++++++++++++++++++++++++++++++++++++
fs/bio.c | 27 +
include/linux/bio.h | 85 ++++++Allows integrity metadata to be attached to a bio.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
diff -r 318fa71e735d -r f2ae9d5bce4c fs/Makefile
--- a/fs/Makefile Sat Jun 07 00:45:14 2008 -0400
+++ b/fs/Makefile Sat Jun 07 00:45:15 2008 -0400
@@ -19,6 +19,7 @@
obj-y += no-block.o
endif+obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o
obj-$(CONFIG_INOTIFY) += inotify.o
obj-$(CONFIG_INOTIFY_USER) += inotify_user.o
obj-$(CONFIG_EPOLL) += eventpoll.o
diff -r 318fa71e735d -r f2ae9d5bce4c fs/bio-integrity.c
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/fs/bio-integrity.c Sat Jun 07 00:45:15 2008 -0400
@@ -0,0 +1,715 @@
+/*
+ * bio-integrity.c - bio data integrity extensions
+ *
+ * Copyright (C) 2007, 2008 Oracle Corporation
+ * Written by: Martin K. Petersen <martin.petersen@oracle.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; see the file COPYING. If not, write to
+ * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139,
+ * USA.
+ *
+ */
+
+#include <linux/blkdev.h>
+#include <linux/mempool.h>
+#include <linux/bio.h>
+#include <linux/workqueue.h>
+
+static struct kmem_cache *bio_integrity_slab __read_mostly;
+static struct workqueue_struct *kintegrityd_w...
"Martin K. Petersen" <martin.petersen@oracle.com> writes:
I see you've changed this to:
nr_sectors = (len + bi->tag_size - 1) / bi->tag_size;
set_tag and get_tag are almost identical. Any chance you want to factor
^^^^^^^^^^Hmm, up until this point you use bi to mean bio_integrity, but now
struct blk_integrity_exchg is not yet defined in your patch set, so this
The above two loops look pretty much the same to me. Can you factor
that out to a helper?Cheers,
Jeff
--
Jeff> That assignment isn't necessary.
Zap!
Jeff> nr_sectors = (len + bi->tag_size - 1) / bi->tag_size;
Jeff> why not simply use DIV_ROUND_UP?
Fixed.
Jeff> set_tag and get_tag are almost identical. Any chance you want
Jeff> to factor out that code?Done.
Jeff> Hmm, up until this point you use bi to mean bio_integrity, but
Jeff> now it means blk_integrity. Confusion will ensue. ;)Err, uhm. There is no bio_integrity. There's the bio integrity
payload which I always refer to as struct bip *bip. And struct
blk_integrity which is always bi. I'm also anal about using bv for
the data bio_vec and iv for the integrity bio_vec. I can't see any
place where I'm inconsistent.Jeff> struct blk_integrity_exchg is not yet defined in your patch set,
Jeff> so this will likely break git bisect.bio-integrity.patch and blk-integrity.patch are artificially split up
to ease the review process. They are not meant to be separateJeff> Does this actually need to be zeroed?
Jeff> The above two loops look pretty much the same to me. Can you
Jeff> factor that out to a helper?I've created helpers for marking head and tail of the ivec.
--
Martin K. Petersen Oracle Linux Engineering
--
OK, just wanted to make sure you were aware of it.
Cheers,
Jeff
--
Seems i've missing something. What is purpose of this black magic?
do you want just express following?
Why here and later sector_size == 4096 is so special, what about 1k and
--
DM> Seems i've missing something. What is purpose of this black magic?
DM> do you want just express following? nr_sectors = (len +
DM> bi->tag_size - 1) / bi->tag_size;Yep. In the original DIF spec, only 2-byte tags were supported so the
check for an odd length was a fast and elegant solution. But now that
I implemented Type 3 which has 6 bytes of tag space that's an invalidDM> Why here and later sector_size == 4096 is so special, what about
DM> 1k and 2k sect_sz? Do you want just transform value from 512 to
DM> bi->sectors_size?Well, so only 512-byte DIF storage devices are currently available.
The whole industry is in the process of transitioning to 4KB sectors.
There will be no DIF devices with 1KB or 2KB sectors.And even as it stands it's unclear that 4KB sectors are going to look
like they do in the current version of the spec. It's going to be an
interoperability nightmare as it is now as the tag is attached to the
hardware sector size. This means that it's still only 8 bytes of DIF
for a device with 4KB sectors (IOW, 4104 bytes and not 4160). That
means that *two* protection buffers would have to be generated for -
say - a mirror with heterogeneous sector sizes. And tagging won't
work as there's not the same space available for both drives of the
mirror.The tag space problem also causes issues with RAID arrays exporting
512 byte sectors to the host but using drives with 4KB sectors in the
back. Where is the array going to store the tags for the last 7 512
byte sectors?So 4KB vs. DIF is up in the air at this point. The current checks are
there because I've been messing with 4KB sector devices for other
reasons. And technically they are in accordance with the current
spec.--
Martin K. Petersen Oracle Linux Engineering--
5 files changed, 90 insertions(+)
drivers/scsi/Kconfig | 15 +++++++++++++++
drivers/scsi/scsi_lib.c | 42 ++++++++++++++++++++++++++++++++++++++++++
drivers/scsi/scsi_scan.c | 3 +++
include/scsi/scsi_cmnd.h | 29 +++++++++++++++++++++++++++++
include/scsi/scsi_device.h | 1 +- Add support for an extra scatter-gather list containing protection
information.- Remember devices with protection information turned on in INQUIRY.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
diff -r 5be7c534c954 -r ad65bfde4e05 drivers/scsi/Kconfig
--- a/drivers/scsi/Kconfig Sat Jun 07 00:45:15 2008 -0400
+++ b/drivers/scsi/Kconfig Sat Jun 07 00:45:15 2008 -0400
@@ -260,6 +260,21 @@
default m
depends on SCSI
depends on MODULES
+
+config SCSI_PROTECTION
+ bool "SCSI Data Integrity Protection"
+ depends on SCSI
+ depends on BLK_DEV_INTEGRITY
+ help
+ Some SCSI devices support data protection features above and
+ beyond those implemented in the transport. Select this
+ option to enable protection information to be transferred to
+ and from a device. Specifically, this option will enable DIF
+ (Data Integrity Field) for SCSI disks.
+
+ The SCSI protection features depend on the block layer data
+ integrity infrastructure so the latter must be enabled for
+ this option to work.menu "SCSI Transports"
depends on SCSI
diff -r 5be7c534c954 -r ad65bfde4e05 drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c Sat Jun 07 00:45:15 2008 -0400
+++ b/drivers/scsi/scsi_lib.c Sat Jun 07 00:45:15 2008 -0400
@@ -778,6 +778,13 @@
kmem_cache_free(scsi_sdb_cache, bidi_sdb);
cmd->request->next_rq->special = NULL;
}
+
+#if defined(CONFIG_SCSI_PROTECTION)
+ if (scsi_prot_sg_count(cmd)) {
+ scsi_free_sgtable(cmd->prot_sdb);
+ kmem_cache_free(scsi_sdb_cache, cmd->prot_sdb);
+ }
+#endif
}
EXPORT_SYMBOL(scsi_release_buffers);@@ -1031,6 +1038,32 @@
return BLKPREP_OK;
...
4 files changed, 83 insertions(+)
include/linux/crc-t10dif.h | 8 +++++
lib/Kconfig | 7 ++++
lib/Makefile | 1
lib/crc-t10dif.c | 67 ++++++++++++++++++++++++++++++++++++++++++++Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
diff -r b76a6f493f0c -r fb6f53e13ff3 include/linux/crc-t10dif.h
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/include/linux/crc-t10dif.h Sat Jun 07 00:45:14 2008 -0400
@@ -0,0 +1,8 @@
+#ifndef _LINUX_CRC_T10DIF_H
+#define _LINUX_CRC_T10DIF_H
+
+#include <linux/types.h>
+
+__u16 crc_t10dif(unsigned char const *, size_t);
+
+#endif
diff -r b76a6f493f0c -r fb6f53e13ff3 lib/Kconfig
--- a/lib/Kconfig Sat Jun 07 00:45:14 2008 -0400
+++ b/lib/Kconfig Sat Jun 07 00:45:14 2008 -0400
@@ -28,6 +28,13 @@
modules require CRC16 functions, but a module built outside
the kernel tree does. Such modules that use library CRC16
functions require M here.
+
+config CRC_T10DIF
+ tristate "CRC calculation for the T10 Data Integrity Field"
+ help
+ This option is only needed if a module that's not in the
+ kernel tree needs to calculate CRC checks for use with the
+ SCSI data integrity subsystem.config CRC_ITU_T
tristate "CRC ITU-T V.41 functions"
diff -r b76a6f493f0c -r fb6f53e13ff3 lib/Makefile
--- a/lib/Makefile Sat Jun 07 00:45:14 2008 -0400
+++ b/lib/Makefile Sat Jun 07 00:45:14 2008 -0400
@@ -45,6 +45,7 @@
obj-$(CONFIG_BITREVERSE) += bitrev.o
obj-$(CONFIG_CRC_CCITT) += crc-ccitt.o
obj-$(CONFIG_CRC16) += crc16.o
+obj-$(CONFIG_CRC_T10DIF)+= crc-t10dif.o
obj-$(CONFIG_CRC_ITU_T) += crc-itu-t.o
obj-$(CONFIG_CRC32) += crc32.o
obj-$(CONFIG_CRC7) += crc7.o
diff -r b76a6f493f0c -r fb6f53e13ff3 lib/crc-t10dif.c
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/lib/crc-t10dif.c Sat Jun 07 00:45:14 2008 -0400
@@ -0,0 +1,67 @@
+/*
+ * T10 Data Integrity Field CRC16 calculation
+ *
+ * Copyright (c) 2007 Oracle Corporation. All rights reserved.
+ ...
2 files changed, 38 insertions(+), 28 deletions(-)
fs/bio.c | 36 ++++++++----------------------------
include/linux/bio.h | 30 ++++++++++++++++++++++++++++++Move struct bio_set and biovec_slab definitions to bio.h so they can
be used outside of bio.c.Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
diff -r fb6f53e13ff3 -r 550d61001baa fs/bio.c
--- a/fs/bio.c Sat Jun 07 00:45:14 2008 -0400
+++ b/fs/bio.c Sat Jun 07 00:45:14 2008 -0400
@@ -28,24 +28,9 @@
#include <linux/blktrace_api.h>
#include <scsi/sg.h> /* for struct sg_iovec */-#define BIO_POOL_SIZE 2
-
static struct kmem_cache *bio_slab __read_mostly;-#define BIOVEC_NR_POOLS 6
-
-/*
- * a small number of entries is fine, not going to be performance critical.
- * basically we just need to survive
- */
-#define BIO_SPLIT_ENTRIES 2
mempool_t *bio_split_pool __read_mostly;
-
-struct biovec_slab {
- int nr_vecs;
- char *name;
- struct kmem_cache *slab;
-};/*
* if you change this list, also change bvec_alloc or things will
@@ -60,23 +45,18 @@
#undef BV/*
- * bio_set is used to allow other portions of the IO system to
- * allocate their own private memory pools for bio and iovec structures.
- * These memory pools in turn all allocate from the bio_slab
- * and the bvec_slabs[].
- */
-struct bio_set {
- mempool_t *bio_pool;
- mempool_t *bvec_pools[BIOVEC_NR_POOLS];
-};
-
-/*
* fs_bio_set is the bio_set containing bio and iovec memory pools used by
* IO code that does not need private memory pools.
*/
-static struct bio_set *fs_bio_set;
+struct bio_set *fs_bio_set;-static inline struct bio_vec *bvec_alloc_bs(gfp_t gfp_mask, int nr, unsigned long *idx, struct bio_set *bs)
+inline int bvec_nr_vecs(int idx)
+{
+ return bvec_slabs[idx].nr_vecs;
+}
+EXPORT_SYMBOL(bvec_nr_vecs);
+
+struct bio_vec *bvec_alloc_bs(gfp_t gfp_mask, int nr, unsigned long *idx, struct bio_set *bs)
{
struct bio_vec *bvl;diff -r fb6f53e...
Don't sell yourself short; you also implemented bvec_nr_vecs.
Looks okay to me.
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
--
2 files changed, 26 insertions(+)
fs/bio.c | 24 ++++++++++++++++++++++++
include/linux/bio.h | 2 ++Helper function to find the sector offset in a bio given bvec index
and page offset.Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
---
diff -r 550d61001baa -r 318fa71e735d fs/bio.c
--- a/fs/bio.c Sat Jun 07 00:45:14 2008 -0400
+++ b/fs/bio.c Sat Jun 07 00:45:14 2008 -0400
@@ -1232,6 +1232,30 @@
return bp;
}+sector_t bio_sector_offset(struct bio *bio, unsigned short index, unsigned int offset)
+{
+ struct bio_vec *bv;
+ unsigned int sector_sz = bio->bi_bdev->bd_disk->queue->hardsect_size;
+ sector_t sectors;
+ int i;
+
+ sectors = 0;
+
+ BUG_ON(index >= bio->bi_vcnt);
+
+ bio_for_each_segment(bv, bio, i) {
+ if (i == index) {
+ if (offset > bv->bv_offset)
+ sectors += (offset - bv->bv_offset) / sector_sz;
+ return sectors;
+ }
+
+ sectors += bv->bv_len / sector_sz;
+ }
+
+ BUG();
+}
+EXPORT_SYMBOL(bio_sector_offset);/*
* create memory pools for biovec's in a bio_set.
diff -r 550d61001baa -r 318fa71e735d include/linux/bio.h
--- a/include/linux/bio.h Sat Jun 07 00:45:14 2008 -0400
+++ b/include/linux/bio.h Sat Jun 07 00:45:14 2008 -0400
@@ -315,6 +315,8 @@
extern int bio_add_pc_page(struct request_queue *, struct bio *, struct page *,
unsigned int, unsigned int);
extern int bio_get_nr_vecs(struct block_device *);
+extern sector_t bio_sector_offset(struct bio *, unsigned short, unsigned int);
+
extern struct bio *bio_map_user(struct request_queue *, struct block_device *,
unsigned long, unsigned int, int);
struct sg_iovec;--
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
--
Unless I've missed something, this helper isn't used anywhere in your
patchset.Cheers,
Jeff
--
>>>>> "Jeff" == Jeff Moyer <jmoyer@redhat.com> writes:
Jeff> Unless I've missed something, this helper isn't used anywhere in
Jeff> your patchset.You are right.
bio_sector_offset() is used in the patch that adds integrity support
to DM. I didn't post the MD/DM patches this time around in an attempt
to limit the number of patches and focus on the core code.You can see the DM patch here:
http://oss.oracle.com/mercurial/mkp/linux-2.6-di/rev/6b79173c501a
--
Martin K. Petersen Oracle Linux Engineering--
| Greg Kroah-Hartman | [PATCH 012/196] nozomi driver |
| Ingo Molnar | Re: [patch 00/13] Syslets, "Threadlets", generic AIO support, v3 |
| Rafael J. Wysocki | [PATCH -mm 5/6] Freezer: Remove PF_NOFREEZE from bluetooth threads |
| Ingo Molnar | Re: [PATCH 00/23] per device dirty throttling -v8 |
git: | |
| David Miller | [GIT]: Networking |
| David Miller | Re: [PATCH] pkt_sched: Destroy gen estimators under rtnl_lock(). |
| Gerrit Renker | [PATCH 15/37] dccp: Set per-connection CCIDs via socket options |
| Natalie Protasevich | [BUG] New Kernel Bugs |
