[PATCH] OCFS2: Allow huge (> 16 TiB) volumes to mount

From: Patrick J. LoPresti
Date: Tuesday, June 29, 2010 - 5:16 pm

The OCFS2 developers have already done all of the hard work to allow
volumes larger than 16 TiB.  But there is still a "sanity check" in
fs/ocfs2/super.c that prevents the mounting of such volumes, even when
the cluster size and journal options would allow it.

This patch replaces that sanity check with a more sophisticated one
that allows a huge volume to mount provided that (a) it is addressable
by the raw word/address size of the system (borrowing a test from
ext4); (b) the volume is using JBD2; and (c) the
JBD2_FEATURE_INCOMPAT_64BIT flag is set on the journal.
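
To illustrate condition (a), here is a rough user-space model of the
addressability arithmetic.  It is only a sketch, not the kernel code:
the names type_max() and block_addressable() are hypothetical, and the
widths of sector_t and pgoff_t are passed in as plain parameters
(sector_bits, pgoff_bits) so that 32-bit and 64-bit configurations can
be compared side by side.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value of an unsigned type that is `bits` wide (e.g. 32 or 64). */
static uint64_t type_max(unsigned bits)
{
	return bits >= 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * Hypothetical model of the addressability test (not the kernel function).
 * Returns 1 if block number max_block is reachable, 0 otherwise.
 * sector_bits and pgoff_bits stand in for the widths of sector_t and
 * pgoff_t on the system being modelled.
 */
static int block_addressable(uint64_t max_block, unsigned blocksize_bits,
			     unsigned page_shift, unsigned sector_bits,
			     unsigned pgoff_bits)
{
	/* Blocks are at least 512 bytes and no larger than a page. */
	assert(blocksize_bits >= 9 && blocksize_bits <= page_shift);

	/* A block spans 2^(blocksize_bits - 9) sectors of 512 bytes. */
	uint64_t by_sector = type_max(sector_bits) >> (blocksize_bits - 9);

	/* Page-cache limit, shifted the same way as in the patch. */
	uint64_t by_page = type_max(pgoff_bits) >> (page_shift - blocksize_bits);

	return max_block <= by_sector && max_block <= by_page;
}
```

For example, with 4 KiB blocks and 4 KiB pages, a 32-bit pgoff_t caps
the highest block number at 2^32 - 1, which is where the 16 TiB figure
comes from; a 32-bit sector_t is stricter still, at 2^29 - 1 blocks.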

I factored out the sanity check into its own function.  I also moved it
from ocfs2_initialize_super() down to ocfs2_check_volume(); any earlier,
and the journal's flags have not been read from disk yet.

I have tested this patch on small volumes, huge volumes, and huge
volumes without 64-bit block support in the journal.  All of them appear
to work or to fail gracefully, as appropriate.

Signed-off-by: Patrick LoPresti <lopresti@gmail.com>


diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 0eaa929..3db233d 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1991,6 +1991,47 @@ static int ocfs2_setup_osb_uuid(struct ocfs2_super *osb, const unsigned char *uu
 	return 0;
 }
 
+/* Check to make sure entire volume is addressable on this system.
+   Requires osb_clusters_at_boot to be valid and for the journal to
+   have been read by jbd2_journal_load(). */
+static int ocfs2_check_addressable(struct ocfs2_super *osb)
+{
+	int status = 0;
+	u64 max_block =
+		ocfs2_clusters_to_blocks(osb->sb,
+					 osb->osb_clusters_at_boot) - 1;
+
+	/* Absolute addressability check (borrowed from ext4/super.c) */
+	if ((max_block >
+	     (sector_t)(~0LL) >> (osb->sb->s_blocksize_bits - 9)) ||
+	    (max_block > (pgoff_t)(~0LL) >> (PAGE_CACHE_SHIFT -
+					     osb->sb->s_blocksize_bits))) {
+		mlog(ML_ERROR, "Volume too large "
+		     "to mount safely on this system");
+		status = -EFBIG;
+		goto out;
+	}
+
+	/* ...
From: Patrick J. LoPresti
Date: Tuesday, July 6, 2010 - 5:13 pm

As an alternative, could jbd2_journal_init_inode() just call
journal_get_superblock() itself?  That would naturally make the
feature bits valid as soon as the journal is initialized.  It would
also preserve all error checking, instead of converting superblock
read errors into "this journal has no features".

JBD2 developers, please advise.  OCFS2 needs to examine JBD2 journal
feature bits *before* the journal is recovered, which is not possible
at present.  What is the best approach to fixing this?

Thanks.

 - Pat
--