[PATCH] UBIFS: allow for gaps when dirtying the LPT

Previous thread: [PATCH] md: fix compilation warning: left shift count >= width of type by Hannes Eder on Friday, November 21, 2008 - 8:21 am. (1 message)

Next thread: [PATCH] FRV: Fix error handling in is_user_addr_valid() by David Howells on Friday, November 21, 2008 - 9:07 am. (2 messages)
From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

Hi,

here are the UBIFS fixes we have scheduled for 2.6.28. We are
planning to send them to Linus a little bit later. I send them
for your review.

The "UBIFS: allow for gaps when dirtying the LPT" patch solves
the problem reported by Deepak Saxena.

The "UBIFS: remove printk" fixes a little bit annoying message
from UBIFS which comes every time we re-mount the FS.

The "UBIFS: do not print scary memory allocation warnings",
    "UBIFS: do not allocate too much", and
    "UBIFS: pre-allocate bulk-read buffer"
fix UBIFS memory allocation problems when bulk-read is enabled.


Adrian Hunter (1):
      UBIFS: allow for gaps when dirtying the LPT

Artem Bityutskiy (6):
      UBIFS: remove printk
      MAINTAINERS: change UBI/UBIFS git tree URLs
      UBIFS: fix compilation warnings
      UBIFS: do not print scary memory allocation warnings
      UBIFS: do not allocate too much
      UBIFS: pre-allocate bulk-read buffer

Harvey Harrison (1):
      UBIFS: endian handling fixes and annotations

 MAINTAINERS           |    4 +-
 fs/ubifs/commit.c     |    4 +-
 fs/ubifs/debug.c      |   66 +++++++++++++++++++++--------------
 fs/ubifs/dir.c        |    5 ++-
 fs/ubifs/file.c       |   91 ++++++++++++++++++++++++++++++++++---------------
 fs/ubifs/journal.c    |    8 +++--
 fs/ubifs/key.h        |    4 +-
 fs/ubifs/lpt_commit.c |    2 -
 fs/ubifs/orphan.c     |   28 +++++++++------
 fs/ubifs/recovery.c   |   17 +++++----
 fs/ubifs/replay.c     |    2 +-
 fs/ubifs/sb.c         |    9 +++--
 fs/ubifs/super.c      |   70 +++++++++++++++++++++++++++++++------
 fs/ubifs/tnc.c        |   12 +++++--
 fs/ubifs/ubifs.h      |   12 +++++--
 15 files changed, 223 insertions(+), 111 deletions(-)

--
Best regards,
Artem Bityutskiy (Битюцкий Артём)
--

From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

From: Harvey Harrison <harvey.harrison@gmail.com>

Noticed by sparse:
fs/ubifs/file.c:75:2: warning: restricted __le64 degrades to integer
fs/ubifs/file.c:629:4: warning: restricted __le64 degrades to integer
fs/ubifs/dir.c:431:3: warning: restricted __le64 degrades to integer

This should be checked to ensure the ubifs_assert is working as
intended, I've done the suggested annotation in this patch.

fs/ubifs/sb.c:298:6: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:298:6:    expected int [signed] [assigned] tmp
fs/ubifs/sb.c:298:6:    got restricted __le64 [usertype] <noident>
fs/ubifs/sb.c:299:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:299:19:    expected restricted __le64 [usertype] atime_sec
fs/ubifs/sb.c:299:19:    got int [signed] [assigned] tmp
fs/ubifs/sb.c:300:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:300:19:    expected restricted __le64 [usertype] ctime_sec
fs/ubifs/sb.c:300:19:    got int [signed] [assigned] tmp
fs/ubifs/sb.c:301:19: warning: incorrect type in assignment (different base types)
fs/ubifs/sb.c:301:19:    expected restricted __le64 [usertype] mtime_sec
fs/ubifs/sb.c:301:19:    got int [signed] [assigned] tmp

This looks like a bugfix as your tmp was a u32 so there was truncation in
the atime, mtime, ctime value, probably not intentional, add a tmp_le64
and use it here.

fs/ubifs/key.h:348:9: warning: cast to restricted __le32
fs/ubifs/key.h:348:9: warning: cast to restricted __le32
fs/ubifs/key.h:419:9: warning: cast to restricted __le32

Read from the annotated union member instead.

fs/ubifs/recovery.c:175:13: warning: incorrect type in assignment (different base types)
fs/ubifs/recovery.c:175:13:    expected unsigned int [unsigned] [usertype] save_flags
fs/ubifs/recovery.c:175:13:    got restricted __le32 [usertype] flags
fs/ubifs/recovery.c:186:13: warning: incorrect type in assignment (different base types)
fs/ubifs/recovery.c:186:13:    ...
From: Sebastian Andrzej Siewior
Date: Saturday, November 22, 2008 - 12:27 pm

If you would change such references to something like
|return le32_to_cpup(&key->j32[1]) >> UBIFS_S_KEY_BLOCK_BITS;
then on powerpc

  text    data     bss     dec     hex filename
155384    1284      24  156692   26414 ubifs-b4.ko
155372    1284      24  156680   26408 ubifs-after.ko

because now it is possible to load the value as LE from memory instead

This and the previous change look like a bugfix for something that
should trigger during recovery or something? Shouldn't I fail in

another micro optimisation would be to use __constant_cpu_to_le32()

Sebastian
--

From: Harvey Harrison
Date: Saturday, November 22, 2008 - 8:21 pm

Nope, no difference, cpu_to_le32 does constant-detection these days and
so there ends up being no difference.

Harvey

--

From: Sebastian Andrzej Siewior
Date: Sunday, November 23, 2008 - 2:28 am

Indeed. __builtin_constant_p() does the trick. So I guess we could get
rid of __constant_cpu_to_XX32() and its

| grep -RE __constant_cpu_to_[bl]e32 . | wc -l
| 413


Sebastian
--

From: Jamie Lokier
Date: Sunday, November 23, 2008 - 3:05 am

If le32_to_cpu used __builtin_bswap32 (a GCC builtin), wouldn't
GCC do this trivial optimisation itself?

-- Jamie
--

From: Adrian Hunter
Date: Monday, November 24, 2008 - 7:19 am

Wouldn't that be true for every le32_to_cpu of an lvalue?  Shame you can't
do:


This is just about casting.  key->u32[1] and key->j32[1] are the same object.

As per Harvey's reply.

--

From: Harvey Harrison
Date: Monday, November 24, 2008 - 9:46 am

No, you wouldn't want to do the above if the lvalue was on the stack as most
of the time the extra code to setup a pointer to a stack variable ends up
being more expensive than just using cpu_to_le32.

Cheers,

Harvey

--

From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

Remove the "UBIFS background thread ubifs_bgd0_0 started" message.
We kill the background thread when we switch to R/O mode, and
start it again whan we switch to R/W mode. OLPC is doing this
many times during boot, and we see this message many times as
well, which is irritating. So just kill the message.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 fs/ubifs/commit.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ubifs/commit.c b/fs/ubifs/commit.c
index 0a6aa2c..b49884c 100644
--- a/fs/ubifs/commit.c
+++ b/fs/ubifs/commit.c
@@ -234,8 +234,8 @@ int ubifs_bg_thread(void *info)
 	int err;
 	struct ubifs_info *c = info;
 
-	ubifs_msg("background thread \"%s\" started, PID %d",
-		  c->bgt_name, current->pid);
+	dbg_msg("background thread \"%s\" started, PID %d",
+		c->bgt_name, current->pid);
 	set_freezable();
 
 	while (1) {
-- 
1.5.4.3

--

From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 MAINTAINERS |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index d643e86..7e897ee 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4192,7 +4192,7 @@ M:	dedekind@infradead.org
 P:	Adrian Hunter
 M:	ext-adrian.hunter@nokia.com
 L:	linux-mtd@lists.infradead.org
-T:	git git://git.infradead.org/~dedekind/ubifs-2.6.git
+T:	git git://git.infradead.org/ubifs-2.6.git
 W:	http://www.linux-mtd.infradead.org/doc/ubifs.html
 S:	Maintained
 
@@ -4246,7 +4246,7 @@ P:	Artem Bityutskiy
 M:	dedekind@infradead.org
 W:	http://www.linux-mtd.infradead.org/
 L:	linux-mtd@lists.infradead.org
-T:	git git://git.infradead.org/~dedekind/ubi-2.6.git
+T:	git git://git.infradead.org/ubi-2.6.git
 S:	Maintained
 
 USB ACM DRIVER
-- 
1.5.4.3

--

From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

Bulk-read allocates a lot of memory with 'kmalloc()', and when it
is/gets fragmented 'kmalloc()' fails with a scarry warning. But
because bulk-read is just an optimization, UBIFS keeps working fine.
Supress the warning by passing __GFP_NOWARN option to 'kmalloc()'.

This patch also introduces a macro for the magic 128KiB constant.
This is just neater.

Note, this is not really fixes the problem we had, but just hides
the warnings. The further patches fix the problem.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 fs/ubifs/file.c  |    4 ++--
 fs/ubifs/super.c |   17 ++++++++++++-----
 fs/ubifs/ubifs.h |    2 +-
 3 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 9124eee..8be827c 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -705,12 +705,12 @@ static int ubifs_do_bulk_read(struct ubifs_info *c, struct page *page1)
 	int err, page_idx, page_cnt, ret = 0, n = 0;
 	loff_t isize;
 
-	bu = kmalloc(sizeof(struct bu_info), GFP_NOFS);
+	bu = kmalloc(sizeof(struct bu_info), GFP_NOFS | __GFP_NOWARN);
 	if (!bu)
 		return 0;
 
 	bu->buf_len = c->bulk_read_buf_size;
-	bu->buf = kmalloc(bu->buf_len, GFP_NOFS);
+	bu->buf = kmalloc(bu->buf_len, GFP_NOFS | __GFP_NOWARN);
 	if (!bu->buf)
 		goto out_free;
 
diff --git a/fs/ubifs/super.c b/fs/ubifs/super.c
index 8780efb..ea493e6 100644
--- a/fs/ubifs/super.c
+++ b/fs/ubifs/super.c
@@ -36,6 +36,12 @@
 #include <linux/mount.h>
 #include "ubifs.h"
 
+/*
+ * Maximum amount of memory we may 'kmalloc()' without worrying that we are
+ * allocating too much.
+ */
+#define UBIFS_KMALLOC_OK (128*1024)
+
 /* Slab cache for UBIFS inodes */
 struct kmem_cache *ubifs_inode_slab;
 
@@ -561,17 +567,18 @@ static int init_constants_early(struct ubifs_info *c)
 	 * calculations when reporting free space.
 	 */
 	c->leb_overhead = c->leb_size % UBIFS_MAX_DATA_NODE_SZ;
+
 	/* Buffer size for bulk-reads */
 ...
From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

Bulk-read allocates 128KiB or more using kmalloc. The allocation
starts failing often when the memory gets fragmented. UBIFS still
works fine in this case, because it falls-back to standard
(non-optimized) read method, though. This patch teaches bulk-read
to allocate exactly the amount of memory it needs, instead of
allocating 128KiB every time.

This patch is also a preparation to the further fix where we'll
have a pre-allocated bulk-read buffer as well. For example, now
the @bu object is prepared in 'ubifs_bulk_read()', so we could
path either pre-allocated or allocated information to
'ubifs_do_bulk_read()' later. Or teaching 'ubifs_do_bulk_read()'
not to allocate 'bu->buf' if it is already there.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 fs/ubifs/file.c  |   70 +++++++++++++++++++++++++++++++++++------------------
 fs/ubifs/super.c |   12 ++++----
 fs/ubifs/tnc.c   |    7 ++++-
 fs/ubifs/ubifs.h |    4 +-
 4 files changed, 60 insertions(+), 33 deletions(-)

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 8be827c..0c5c27d 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -691,32 +691,22 @@ out_err:
 /**
  * ubifs_do_bulk_read - do bulk-read.
  * @c: UBIFS file-system description object
- * @page1: first page
+ * @bu: bulk-read information
+ * @page1: first page to read
  *
  * This function returns %1 if the bulk-read is done, otherwise %0 is returned.
  */
-static int ubifs_do_bulk_read(struct ubifs_info *c, struct page *page1)
+static int ubifs_do_bulk_read(struct ubifs_info *c, struct bu_info *bu,
+			      struct page *page1)
 {
 	pgoff_t offset = page1->index, end_index;
 	struct address_space *mapping = page1->mapping;
 	struct inode *inode = mapping->host;
 	struct ubifs_inode *ui = ubifs_inode(inode);
-	struct bu_info *bu;
 	int err, page_idx, page_cnt, ret = 0, n = 0;
+	int allocate = bu->buf ? 0 : 1;
 	loff_t isize;
 
-	bu = kmalloc(sizeof(struct bu_info), ...
From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

To avoid memory allocation failure during bulk-read, pre-allocate
a bulk-read buffer, so that if there is only one bulk-reader at
a time, it would just use the pre-allocated buffer and would not
do any memory allocation. However, if there are more than 1 bulk-
reader, then only one reader would use the pre-allocated buffer,
while the other reader would allocate the buffer for itself.

Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
---
 fs/ubifs/file.c  |   31 ++++++++++++++++++++--------
 fs/ubifs/super.c |   57 +++++++++++++++++++++++++++++++++++++++++++++--------
 fs/ubifs/ubifs.h |    6 +++++
 3 files changed, 76 insertions(+), 18 deletions(-)

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 0c5c27d..2624411 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -811,15 +811,15 @@ static int ubifs_bulk_read(struct page *page)
 	struct ubifs_inode *ui = ubifs_inode(inode);
 	pgoff_t index = page->index, last_page_read = ui->last_page_read;
 	struct bu_info *bu;
-	int err = 0;
+	int err = 0, allocated = 0;
 
 	ui->last_page_read = index;
 	if (!c->bulk_read)
 		return 0;
 
 	/*
-	 * Bulk-read is protected by ui_mutex, but it is an optimization, so
-	 * don't bother if we cannot lock the mutex.
+	 * Bulk-read is protected by @ui->ui_mutex, but it is an optimization,
+	 * so don't bother if we cannot lock the mutex.
 	 */
 	if (!mutex_trylock(&ui->ui_mutex))
 		return 0;
@@ -840,17 +840,30 @@ static int ubifs_bulk_read(struct page *page)
 		ui->bulk_read = 1;
 	}
 
-	bu = kmalloc(sizeof(struct bu_info), GFP_NOFS | __GFP_NOWARN);
-	if (!bu)
-		return 0;
+	/*
+	 * If possible, try to use pre-allocated bulk-read information, which
+	 * is protected by @c->bu_mutex.
+	 */
+	if (mutex_trylock(&c->bu_mutex))
+		bu = &c->bu;
+	else {
+		bu = kmalloc(sizeof(struct bu_info), GFP_NOFS | __GFP_NOWARN);
+		if (!bu)
+			goto out_unlock;
+
+		bu->buf = NULL;
+		allocated = 1;
+	}
 
-	bu->buf = ...
From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

From: Adrian Hunter <ext-adrian.hunter@nokia.com>

The LPT may have gaps in it because initially empty LEBs
are not added by mkfs.ubifs - because it does not know how
many there are.  Then UBIFS allocates empty LEBs in the
reverse order that they are discovered i.e. they are
added to, and removed from, the front of a list.  That
creates a gap in the middle of the LPT.

The function dirtying the LPT tree (for the purpose of
small model garbage collection) assumed that a gap could
only occur at the very end of the LPT and stopped dirtying
prematurely, which in turn resulted in the LPT running
out of space - something that is designed to be impossible.

Signed-off-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
---
 fs/ubifs/lpt_commit.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/ubifs/lpt_commit.c b/fs/ubifs/lpt_commit.c
index eed5a00..a41434b 100644
--- a/fs/ubifs/lpt_commit.c
+++ b/fs/ubifs/lpt_commit.c
@@ -571,8 +571,6 @@ static struct ubifs_pnode *next_pnode(struct ubifs_info *c,
 		/* We assume here that LEB zero is never an LPT LEB */
 		if (nnode->nbranch[iip].lnum)
 			return ubifs_get_pnode(c, nnode, iip);
-		else
-			return NULL;
 	}
 
 	/* Go up while can't go right */
-- 
1.5.4.3

--

From: Artem Bityutskiy
Date: Friday, November 21, 2008 - 10:19 am

From: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>

We print 'ino_t' type using '%lu' printk() placeholder, but this
results in many warnings when compiling for Alpha platform. Fix
this by adding (unsingned long) casts.

Fixes these warnings:

fs/ubifs/journal.c:693: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/journal.c:1131: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/dir.c:163: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/tnc.c:2680: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/tnc.c:2700: warning: format '%lu' expects type 'long unsigned int', but argument 5 has type 'ino_t'
fs/ubifs/replay.c:1066: warning: format '%lu' expects type 'long unsigned int', but argument 7 has type 'ino_t'
fs/ubifs/orphan.c:108: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:135: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:142: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:154: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:159: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:451: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:539: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:612: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:843: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type 'ino_t'
fs/ubifs/orphan.c:856: warning: format '%lu' expects type 'long unsigned int', but argument 4 has type ...
From: Sebastian Andrzej Siewior
Date: Saturday, November 22, 2008 - 11:54 am

Wouldn't %z help? A cast sounds wrong.

Sebastian
--

From: Adrian Hunter
Date: Monday, November 24, 2008 - 3:03 am

Doesn't help:

fs/ubifs/journal.c:699: warning: format ‘%zu’ expects type ‘size_t’, but argument 4 has type ‘ino_t’
--

From: Sebastian Andrzej Siewior
Date: Sunday, November 30, 2008 - 11:58 am

Indeed. BUT:

Is there actually a reason why Alpha is the only arch having
__kernel_ino_t defined as unsigned int instead of unsigned long ?
(except s390 in 32bit mode).

I just checked and I haven't seen anything that would point out that
__kernel_ino_t / ino_t is part of user space API.
What I've found instead is for instance that ext2 relies that ino_t is a
long:

|struct inode *ext2_iget (struct super_block *sb, unsigned long ino)
|{
....
|        raw_inode = ext2_get_inode(inode->i_sb, ino, &bh);

and the prototype is:
|static struct ext2_inode *ext2_get_inode(struct super_block *sb, ino_t ino,
|                                         struct buffer_head **p)

So we lose the upper 32bit on Alpha. Unless the whole system is
self-contained and the ext2_iget() user never passes something > ino_t.

Any comments?

Sebastian
--

Previous thread: [PATCH] md: fix compilation warning: left shift count >= width of type by Hannes Eder on Friday, November 21, 2008 - 8:21 am. (1 message)

Next thread: [PATCH] FRV: Fix error handling in is_user_addr_valid() by David Howells on Friday, November 21, 2008 - 9:07 am. (2 messages)