Hello everyone,

Btrfs v0.13 is now available for download from:

http://oss.oracle.com/projects/btrfs/

We took another short break from the multi-device code to make the minor mods required to compile on 2.6.25, fix some problematic bugs and do more tuning.

The most important fix is for file data checksumming errors. These might show up on .o files from compiles or other files where seeky writes were done internally to fill it up. The end result was a bunch of zeros in the file where people expected their data to be. Thanks to Yan Zheng for tracking it down.

GregKH provided most of the 2.6.25 port with some sysfs updates. Since the sysfs files are not used much and Greg has offered additional cleanups, I've disabled the btrfs sysfs interface on kernels older than 2.6.25. This way he won't have to back port any of his changes.

Optimizations and other fixes:

* File data checksumming done in larger chunks, resulting in fewer btree searches and fewer kmap calls.
* CPU Optimizations for back reference removal
* CPU Optimizations for block allocation, and much more efficient searching through the free space cache.
* Allocation optimizations, the free space clustering code was not properly allocating from a cluster once it found it. For normal mounts the fix improves metadata writeback, for mount -o ssd it improves everything.
* Unaligned access fixes from Dave Miller
* Btree reads are done in larger bios when possible
* i_block accounting is fixed

-chris
--
Hi Chris, Following are two patches that allow btrfs-progs to build on ia64, and prevent a SEGV when trying to do a mkfs.btrfs. I've gotten as far as successfully creating a btrfs filesystem (at least that's what btrfsck tells me), but haven't been able to mount it yet, probably because of the sector size issue. /ac --
Oh, and for those curious, here's the output of btrfsck:

[root@canola btrfs-progs-unstable]# btrfsck /dev/cciss/c2d1
found device 1 on /dev/cciss/c2d1 lowest devid now 1
found Btrfs on /dev/cciss/c2d1 with 1 devices
opening /dev/cciss/c2d1 devid 1 fd 4
found 102400 bytes used err is 0
total csum bytes: 0
total tree bytes: 81920
btree space waste bytes: 79462
file data blocks allocated: 0 referenced 0

[yes, that's working over the cciss block device ;)]

/ac
--
From: Alex Chiang <achiang@hp.com> You should be able to make a filesystem with a sector size >= PAGE_SIZE and it should work just fine. Please give it a try. --
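The constraint stated in this message (sector size >= PAGE_SIZE) is easy to check before running mkfs. The following is a minimal sketch, not code from btrfs-progs: `sectorsize_ok()` and `sectorsize_ok_here()` are hypothetical helper names, and the only rule they enforce is the one given above.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper, not part of btrfs-progs: returns 1 when the
 * requested sector size is non-zero and at least as large as the
 * given page size, 0 otherwise. */
int sectorsize_ok(uint32_t sectorsize, long page_size)
{
	assert(page_size > 0);
	return sectorsize != 0 && (int64_t)sectorsize >= (int64_t)page_size;
}

/* Same check against the page size of the running system, which is
 * the value `getconf PAGESIZE` reports. */
int sectorsize_ok_here(uint32_t sectorsize)
{
	return sectorsize_ok(sectorsize, sysconf(_SC_PAGESIZE));
}
```

On a machine with 16K pages, `sectorsize_ok_here(4096)` would return 0 and `sectorsize_ok_here(16384)` would return 1.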
Hrm, I'm having issues still. First, here's a patch for
mkfs.btrfs to allow the user to pass in a different sector size.
Bug report follows...
/ac
From f955af8deb68a00d92326a283491ea088f996a53 Mon Sep 17 00:00:00 2001
From: Alex Chiang <achiang@hp.com>
Date: Mon, 31 Mar 2008 16:41:21 -0600
Subject: [PATCH] Teach mkfs.btrfs about configurable sectorsizes
Currently, btrfs assumes PAGE_SIZE <= sectorsize, and sectorsize
is hardcoded to 4K in mkfs.btrfs.
Give mkfs.btrfs a new command line option to specify a different
sector size. The syntax follows mke2fs's -E extended-options syntax,
and the code is taken from mke2fs.
---
mkfs.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 74 insertions(+), 2 deletions(-)
diff --git a/mkfs.c b/mkfs.c
index 49f7308..9073209 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -137,15 +137,76 @@ err:
return ret;
}
+struct btrfs_params {
+ u32 sectorsize;
+};
+
+/*
+ * Shameless ripped from mke2fs
+ */
+static void parse_extended_opts(struct btrfs_params *param, const char *opts)
+{
+ char *buf, *token, *next, *p, *arg;
+ int len;
+ int r_usage = 0;
+
+ len = strlen(opts);
+ buf = malloc(len+1);
+ if (!buf) {
+ fprintf(stderr, "Couldn't allocate memory to parse options!\n");
+ exit(1);
+ }
+ strcpy(buf, opts);
+ for (token = buf; token && *token; token = next) {
+ p = strchr(token, ',');
+ next = 0;
+ if (p) {
+ *p = 0;
+ next = p+1;
+ }
+ arg = strchr(token, '=');
+ if (arg) {
+ *arg = 0;
+ arg++;
+ }
+ if (strcmp(token, "sectorsize") == 0) {
+ if (!arg) {
+ r_usage++;
+ continue;
+ }
+ param->sectorsize = strtoul(arg, &p, 0);
+ if (*p || (param->sectorsize == 0)) {
+ fprintf(stderr,
+ "Invalid sectorsize parameter: %s\n",
+ arg);
+ r_usage++;
+ continue;
+ }
+ } else {
+ r_usage++;
+ }
+ }
+ if (r_usage) {
+ fprintf(stderr, "\nBad options specified.\n\n"
+ "Extended options are separated ...

Whoops, whitespace was screwed up on that patch. Here's try #2.
/ac
From: Alex Chiang <achiang@hp.com>
Subject: [PATCH] Teach mkfs.btrfs about configurable sectorsizes
Currently, btrfs assumes PAGE_SIZE <= sectorsize, and sectorsize
is hardcoded to 4K in mkfs.btrfs.
Give mkfs.btrfs a new command line option to specify a different
sector size. The syntax follows mke2fs's -E extended-options syntax,
and the code is taken from mke2fs.
---
mkfs.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
1 files changed, 74 insertions(+), 2 deletions(-)
diff --git a/mkfs.c b/mkfs.c
index 49f7308..874a41e 100644
--- a/mkfs.c
+++ b/mkfs.c
@@ -137,15 +137,76 @@ err:
return ret;
}
+struct btrfs_params {
+ u32 sectorsize;
+};
+
+/*
+ * Shameless ripped from mke2fs
+ */
+static void parse_extended_opts(struct btrfs_params *param, const char *opts)
+{
+ char *buf, *token, *next, *p, *arg;
+ int len;
+ int r_usage = 0;
+
+ len = strlen(opts);
+ buf = malloc(len+1);
+ if (!buf) {
+ fprintf(stderr, "Couldn't allocate memory to parse options!\n");
+ exit(1);
+ }
+ strcpy(buf, opts);
+ for (token = buf; token && *token; token = next) {
+ p = strchr(token, ',');
+ next = 0;
+ if (p) {
+ *p = 0;
+ next = p+1;
+ }
+ arg = strchr(token, '=');
+ if (arg) {
+ *arg = 0;
+ arg++;
+ }
+ if (strcmp(token, "sectorsize") == 0) {
+ if (!arg) {
+ r_usage++;
+ continue;
+ }
+ param->sectorsize = strtoul(arg, &p, 0);
+ if (*p || (param->sectorsize == 0)) {
+ fprintf(stderr,
+ "Invalid sectorsize parameter: %s\n",
+ arg);
+ r_usage++;
+ continue;
+ }
+ } else {
+ r_usage++;
+ }
+ }
+ if (r_usage) {
+ fprintf(stderr, "\nBad options specified.\n\n"
+ "Extended options are separated by commas, "
+ "and may take an argument which\n"
+ "\tis set off by an equals ('=') sign.\n\n"
+ "Valid extended options are:\n"
+ "\tsectorsize=<sector size in ...

I did this a little differently, switching to getopt_long in mkfs.btrfs and using [-s | --sectorsize ] for sectorsize. -s used to be stripesize, but that needs to be redone for the multi-device code anyway.

You can pull down integrated versions of your patches from:

http://www.kernel.org/hg/btrfs/kernel-unstable
http://www.kernel.org/hg/btrfs/progs-unstable

Make sure to update both, there are minor format changes hidden in the unstable tree since you last used it.

-chris
--
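For readers who have not used getopt_long, here is a rough sketch of what a [-s | --sectorsize] option could look like. It is not the code that went into progs-unstable: `struct mkfs_opts` and `parse_mkfs_args()` are hypothetical names, and the 4K default is taken from the patch description earlier in the thread.

```c
#include <assert.h>
#include <getopt.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical option container; the real mkfs.btrfs may differ. */
struct mkfs_opts {
	uint32_t sectorsize;
	const char *device;
};

/* Parse [-s | --sectorsize] <bytes> followed by exactly one device path.
 * Returns 0 on success, -1 on a usage error. */
int parse_mkfs_args(int argc, char **argv, struct mkfs_opts *out)
{
	static const struct option long_opts[] = {
		{ "sectorsize", required_argument, NULL, 's' },
		{ NULL, 0, NULL, 0 }
	};
	int c;

	assert(out != NULL);
	out->sectorsize = 4096;	/* the 4K default mentioned in the thread */
	out->device = NULL;
	optind = 1;		/* restart scanning from the first argument */

	while ((c = getopt_long(argc, argv, "s:", long_opts, NULL)) != -1) {
		char *end;
		unsigned long val;

		if (c != 's')
			return -1;	/* unknown option or missing argument */
		val = strtoul(optarg, &end, 0);
		if (*end || val == 0 || val > UINT32_MAX) {
			fprintf(stderr, "Invalid sectorsize: %s\n", optarg);
			return -1;
		}
		out->sectorsize = (uint32_t)val;
	}
	if (optind != argc - 1)
		return -1;	/* expect a single device argument */
	out->device = argv[optind];
	return 0;
}
```

With this shape, `mkfs.btrfs -s 16384 /dev/cciss/c2d1` and `mkfs.btrfs --sectorsize=16384 /dev/cciss/c2d1` would parse to the same result.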
So, using the patch from my last mail, I created a btrfs, but was unable to mount it...

[root@canola btrfs]# getconf PAGESIZE
16384
[root@canola btrfs]# mkfs.btrfs -E sectorsize=16384 /dev/cciss/c2d1
found device 1 on /dev/cciss/c2d1 lowest devid now 1
found Btrfs on /dev/cciss/c2d1 with 1 devices
opening /dev/cciss/c2d1 devid 1 fd 5
alloc chunk size 8388608 from dev 1
alloc chunk size 8388608 from dev 1
fs created on /dev/cciss/c2d1 nodesize 16384 leafsize 16384 sectorsize 16384 bytes 73372631040
[root@canola btrfs]# mount -t btrfs /dev/cciss/c2d1 /mnt/btrfs
mount: /dev/cciss/c2d1: can't read superblock

And from /var/log/messages:

Mar 31 16:50:23 canola kernel: btrfs: cciss/c2d1 checksum verify failed on 16384 wanted A76CDD59 found 4A0E371 from_this_trans 0
Mar 31 16:50:23 canola kernel: btrfs: valid FS not found on cciss/c2d1
Mar 31 16:50:23 canola kernel: btrfs: open_ctree failed

Any hints?

/ac
--
It turns out I am an idiot.

At some point, I got confused about which btrfs trees I was working in, and ended up with btrfs v0.13 insmod'ed while trying to mount a filesystem created with btrfs-progs-unstable. Of course, I got a version mismatch when it went to check for BTRFS_MAGIC and mount failed.

Moving to btrfs-unstable allowed me to mount the filesystem.

[root@canola btrfs-unstable]# mount -t btrfs /dev/cciss/c2d1 /mnt/btrfs
scan one opens /dev/cciss/c2d1
found device 1 on /dev/cciss/c2d1 lowest devid now 1
scan one closes bdev /dev/cciss/c2d1
opening /dev/cciss/c2d1 devid 1
lowest bdev /dev/cciss/c2d1

Sorry for the noise.

/ac
--
This is a very easy mistake to make; a long-standing TODO is to add sane format revision fields. I'll include this in the multi-device code. Thanks for all the patches so far. -chris --
Great, thanks I'll take these two. The kernel side needs the same hash.c fix, but I've already got that change made locally. -chris --
Here's a patch for the kernel side.
/ac
Subject: [PATCH] btrfs: Stop trashing 'name' arg of btrfs_name_hash
From: Alex Chiang <achiang@hp.com>
In btrfs_name_hash, local variable 'buf' is declared as
__u32 buf[2];
but we then try to do this:
buf[0] = 0x67452301;
buf[1] = 0xefcdab89;
buf[2] = 0x98badcfe;
buf[3] = 0x10325476;
Oops. Fix buf to be the proper size.
Signed-off-by: Alex Chiang <achiang@hp.com>
diff -r e4cd88595ed7 -r 03942eecb56d hash.c
--- a/hash.c Thu Feb 21 14:54:12 2008 -0500
+++ b/hash.c Mon Mar 31 14:58:00 2008 -0600
@@ -81,7 +81,7 @@ u64 btrfs_name_hash(const char *name, in
__u32 hash;
__u32 minor_hash = 0;
const char *p;
- __u32 in[8], buf[2];
+ __u32 in[8], buf[4];
u64 hash_result;
if (len == 1 && *name == '.') {
--
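The bug fixed by this patch is a mismatch between an array's declared length and the indexes written to it. As an illustration only (this is not how hash.c is written), the sketch below ties both to one constant so they cannot drift apart. `SEED_WORDS`, `default_seed` and `init_hash_seed()` are hypothetical names; the four constants are the ones quoted in the patch description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEED_WORDS 4
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* The four seed constants quoted in the patch description. */
static const uint32_t default_seed[] = {
	0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476
};

/* Hypothetical helper: copy the default seed into the caller's buffer,
 * which must hold SEED_WORDS elements. */
void init_hash_seed(uint32_t buf[SEED_WORDS])
{
	size_t i;

	/* Fails loudly if the seed table and SEED_WORDS ever diverge. */
	assert(ARRAY_SIZE(default_seed) == SEED_WORDS);
	for (i = 0; i < SEED_WORDS; i++)
		buf[i] = default_seed[i];
}
```

A caller would declare `uint32_t buf[SEED_WORDS];` rather than a literal length, so a change to the seed size has to be made in one place.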
From: Alex Chiang <achiang@hp.com>
We get lots of warnings of the flavor:
utils.c:441: warning: format '%Lu' expects type 'long long unsigned int' but argument 2 has type 'u64'
And thanks to -Werror, the build fails. Clean up these printfs
by properly casting the arg to the format specified.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
ctree.c | 5 +++--
disk-io.c | 7 ++++---
extent-tree.c | 24 +++++++++++++++---------
extent_io.c | 3 ++-
file-item.c | 8 +++++---
inode-map.c | 3 ++-
root-tree.c | 5 ++++-
utils.c | 2 +-
volumes.c | 14 +++++++++-----
9 files changed, 45 insertions(+), 26 deletions(-)
diff --git a/ctree.c b/ctree.c
index 88ebd9e..5311306 100644
--- a/ctree.c
+++ b/ctree.c
@@ -237,8 +237,9 @@ int btrfs_cow_block(struct btrfs_trans_handle *trans,
}
*/
if (trans->transid != root->fs_info->generation) {
- printk(KERN_CRIT "trans %Lu running %Lu\n", trans->transid,
- root->fs_info->generation);
+ printk(KERN_CRIT "trans %llu running %llu\n",
+ (unsigned long long)trans->transid,
+ (unsigned long long)root->fs_info->generation);
WARN_ON(1);
}
if (btrfs_header_generation(buf) == trans->transid) {
diff --git a/disk-io.c b/disk-io.c
index 1afe5a6..8ee7716 100644
--- a/disk-io.c
+++ b/disk-io.c
@@ -84,7 +84,8 @@ static int csum_tree_block(struct btrfs_root *root, struct extent_buffer *buf,
if (verify) {
if (memcmp_extent_buffer(buf, result, 0, BTRFS_CRC32_SIZE)) {
- printk("checksum verify failed on %llu\n", buf->start);
+ printk("checksum verify failed on %llu\n",
+ (unsigned long long)buf->start);
return 1;
}
} else {
@@ -429,8 +430,8 @@ struct btrfs_root *open_ctree_fd(int fp, const char *path, u64 sb_bytenr)
fprintf(stderr, "No valid Btrfs found on %s\n", path);
return NULL;
}
- fprintf(stderr, "found Btrfs on %s with %Lu ...

In btrfs_name_hash, Local variable 'buf' is declared as
__u32 buf[2];
but we then try to do this:
buf[0] = 0x67452301;
buf[1] = 0xefcdab89;
buf[2] = 0x98badcfe;
buf[3] = 0x10325476;
Oops. Fix buf to be the proper size.
Signed-off-by: Alex Chiang <achiang@hp.com>
---
hash.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/hash.c b/hash.c
index 58f0be6..6a0795d 100644
--- a/hash.c
+++ b/hash.c
@@ -80,7 +80,7 @@ u64 btrfs_name_hash(const char *name, int len)
__u32 hash;
__u32 minor_hash = 0;
const char *p;
- __u32 in[8], buf[2];
+ __u32 in[8], buf[4];
u64 hash_result;
/* Initialize the default seed for the hash checksum functions */
--
1.5.3.1.g1e61
--
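Returning to the format-string warnings quoted earlier (format '%Lu' expects type 'long long unsigned int' but argument 2 has type 'u64'): presumably u64 is a typedef for unsigned long on this platform, so it does not match %Lu or %llu without a cast. The sketch below shows the cast used in the patch next to the C99 `PRIu64` macro, which is a userspace alternative the patch does not use. The `u64` typedef and `u64_formats_agree()` are stand-ins made up for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

typedef uint64_t u64;	/* stand-in for the btrfs-progs typedef */

/* Format val into out with the cast used in the patch, then again with
 * PRIu64, and return 1 when both spellings produce the same text. */
int u64_formats_agree(u64 val, char *out, size_t len)
{
	char alt[32];

	assert(len >= 21);	/* 20 decimal digits plus the terminator */
	snprintf(out, len, "%llu", (unsigned long long)val);
	snprintf(alt, sizeof(alt), "%" PRIu64, val);
	return strcmp(out, alt) == 0;
}
```

Either spelling compiles cleanly under -Wall -Werror regardless of whether the platform defines a 64-bit integer as long or long long.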
Hi Chris,

I am trying btrfs for the first time. Sorry :( Not able to compile btrfs-progs. Where should I get this uuid.h from?

Thanks,
Badari

3b155:~/btrfs-progs-0.13 # make
ls mkfs.c
mkfs.c
gcc -Wp,-MMD,./.mkfs.o.d,-MT,mkfs.o -Wall -fno-strict-aliasing -D_FILE_OFFSET_BITS=64 -D_FORTIFY_SOURCE=2 -g -Werror -c mkfs.c
mkfs.c:30:23: uuid/uuid.h: No such file or directory
--
Hi Chris,
While compiling btrfs against 2.6.25-rc8-mm1, ran into this.
div_long_long_rem() is removed in -mm. Replace with div_u64_rem().
Thanks,
Badari
div_long_long_rem() API is being removed (patch in -mm).
Replace div_long_long_rem() with div_u64_rem().
Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
---
extent-tree.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
Index: btrfs-0.13/extent-tree.c
===================================================================
--- btrfs-0.13.orig/extent-tree.c 2008-02-21 11:54:12.000000000 -0800
+++ btrfs-0.13/extent-tree.c 2008-04-03 07:18:04.000000000 -0800
@@ -19,6 +19,7 @@
#include <linux/sched.h>
#include <linux/crc32c.h>
#include <linux/pagemap.h>
+#include <linux/math64.h>
#include "hash.h"
#include "ctree.h"
#include "disk-io.h"
@@ -2653,7 +2654,7 @@
u64 nr = 0;
u64 cur_byte;
u64 old_size;
- unsigned long rem;
+ u32 rem;
struct btrfs_block_group_cache *cache;
struct btrfs_block_group_item *item;
struct btrfs_fs_info *info = root->fs_info;
@@ -2691,7 +2692,7 @@
struct btrfs_block_group_item);
btrfs_set_disk_block_group_used(leaf, item, 0);
- div_long_long_rem(nr, 3, &rem);
+ div_u64_rem(nr, 3, &rem);
if (rem) {
btrfs_set_disk_block_group_flags(leaf, item,
BTRFS_BLOCK_GROUP_DATA);
--
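The type change for `rem` in this patch follows from the signature of div_u64_rem() in linux/math64.h, which takes a 64-bit dividend, a 32-bit divisor and a pointer to a 32-bit remainder. The userspace stand-in below has the same shape so the call can be tried outside the kernel; `div_u64_rem_sketch()` is a made-up name and this is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in with the same shape as the kernel's div_u64_rem():
 * returns the quotient and stores the remainder through the pointer. */
uint64_t div_u64_rem_sketch(uint64_t dividend, uint32_t divisor,
			    uint32_t *remainder)
{
	assert(divisor != 0 && remainder != NULL);
	/* The remainder is always smaller than the 32-bit divisor,
	 * so it fits in 32 bits. */
	*remainder = (uint32_t)(dividend % divisor);
	return dividend / divisor;
}
```

In the patched code only the remainder matters: `div_u64_rem(nr, 3, &rem)` leaves `nr % 3` in `rem`, and the `if (rem)` branch is taken whenever `nr` is not a multiple of 3.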
