Re: [Linux-NTFS-Dev] 2.6.23 regression: second access of empty ntfs file leads to D state hang

Previous thread: Re: [alsa-devel] sysfs: WARNING: at fs/sysfs/dir.c:424 sysfs_add_one() - with ALSA by Kamalesh Babulal on Monday, October 29, 2007 - 1:45 am. (1 message)

Next thread: Re: [REGRESSION] 2.6.24-rc1 fails to boot on a 486 by Mikael Pettersson on Monday, October 29, 2007 - 2:28 am. (6 messages)
To: LKML <linux-kernel@...>, <linux-ntfs-dev@...>
Date: Monday, October 29, 2007 - 2:08 am

Greetings,

I've stumbled across a 2.6.22->2.6.23 regression. The first md5sum access
of an empty NTFS file leads to a kernel I/O error gripe, and a second
access leaves md5sum hung in D state. 2.6.22.10 has no trouble accessing
this file.

Looking at the 22->23 diff, I don't see a quick and dirty stab
candidate, and since I'm preparing for a 5 week separation from my box
<twitch>, I doubt I'll have time to do a bisect. /me punts.

root@Homer: md5sum '/windows/C/Dokumente und Einstellungen/All Users/Anwendungsdaten/Microsoft/Network/Connections/Pbk/rasphone.pbk'
md5sum: /windows/C/Dokumente und Einstellungen/All Users/Anwendungsdaten/Microsoft/Network/Connections/Pbk/rasphone.pbk: Input/output error

[ 228.551859] NTFS-fs error (device hda1): ntfs_read_compressed_block(): ntfs_map_runlist() failed. Cannot read compression block.

[ 401.721890] md5sum D eee53d64 0 7249 7019
[ 401.727155] eee53d78 00200082 00000002 eee53d64 eee53d5c 00000000 e9b006b0 e9b00700
[ 401.735469] 5f6b3463 c0677060 c067a080 e9b00860 c180d080 00000000 c180d080 c180d108
[ 401.743991] 00000001 00200086 e9b006e0 00200086 f79ca640 eee53d94 0000b4d3 00200202
[ 401.752514] Call Trace:
[ 401.755155] [<c04b57bf>] io_schedule+0x1e/0x28
[ 401.759711] [<c015a269>] sync_page+0x34/0x3f
[ 401.764102] [<c04b5b78>] __wait_on_bit_lock+0x40/0x63
[ 401.769281] [<c015a221>] __lock_page+0x54/0x5c
[ 401.773845] [<c015a826>] do_generic_mapping_read+0x236/0x4e7
[ 401.779629] [<c015c165>] generic_file_aio_read+0xff/0x198
[ 401.785138] [<c017851c>] do_sync_read+0xd0/0x106
[ 401.789873] [<c0178cff>] vfs_read+0x89/0x11d
[ 401.794266] [<c0179164>] sys_read+0x3d/0x64
[ 401.798569] [<c01041ba>] syscall_call+0x7/0xb
[ 401.803039] =======================

-

To: Mike Galbraith <efault@...>
Cc: LKML <linux-kernel@...>, <linux-ntfs-dev@...>
Date: Monday, October 29, 2007 - 6:31 am

Hi,

Could you post the complete dmesg output, please?

Nothing related has changed in the NTFS driver between 2.6.22.10 and
2.6.23 so I expect something else to be at fault here.

Could you post your .config as well, please?

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/

-

To: Anton Altaparmakov <aia21@...>
Cc: LKML <linux-kernel@...>, <linux-ntfs-dev@...>
Date: Monday, October 29, 2007 - 7:45 am

Attached. This is after a reboot though, but a fresh attempt to sum the

Also attached.

-Mike

To: Mike Galbraith <efault@...>
Cc: LKML <linux-kernel@...>, <linux-ntfs-dev@...>
Date: Monday, October 29, 2007 - 9:39 am

Hi Mike,

Thanks for the files. That is really odd. And you are sure this just
works with 2.6.22.10 on the exact same file? Have you run
"chkdsk /f /v /x" on the NTFS volume from Windows?

Would you be able to apply the attached patch to your 2.6.23.1 kernel
and try again and then send me the NTFS error messages? The patch
should cause more verbose error reporting to happen... Thanks!

Best regards,

Anton

--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/

-

To: Anton Altaparmakov <aia21@...>
Cc: LKML <linux-kernel@...>, <linux-ntfs-dev@...>
Date: Monday, October 29, 2007 - 10:43 am

Yes, 2.6.22.10 can md5sum that file just fine, did it several times. I

[ 249.009250] NTFS-fs error (device hda1): ntfs_map_runlist_nolock(): vcn 0x0 >= end_vcn 0x0 for inode 0x490f, error 0.
[ 249.020142] NTFS-fs error (device hda1): ntfs_read_compressed_block(): ntfs_map_runlist() failed. Cannot read compression block.
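The first message above reports "vcn 0x0 >= end_vcn 0x0", i.e. the bounds check rejects the very first cluster of an attribute whose end is 0. The following is a minimal user-space sketch of that comparison, not the driver's code. The function names are hypothetical, and the strict form is an assumption taken from the wording of the error text; the relaxed form mirrors the condition used in the fix posted later in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t VCN;	/* virtual cluster number (signed 64-bit) */

/* Strict check, as the error text suggests: reject any vcn at or past end_vcn. */
static bool vcn_rejected_strict(VCN vcn, VCN end_vcn)
{
	return vcn >= end_vcn;
}

/* Relaxed check: vcn 0 is always let through, even when end_vcn is 0. */
static bool vcn_rejected_relaxed(VCN vcn, VCN end_vcn)
{
	return vcn && vcn >= end_vcn;
}
```

With vcn and end_vcn both 0, the strict form rejects the lookup and the relaxed form does not; for any non-zero vcn the two behave the same.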

-Mike

-

To: Anton Altaparmakov <aia21@...>
Cc: LKML <linux-kernel@...>, <linux-ntfs-dev@...>
Date: Monday, October 29, 2007 - 11:18 am

I have now run chkdsk; it didn't gripe, and the error is still present.

-Mike

-

To: Anton Altaparmakov <aia21@...>
Cc: LKML <linux-kernel@...>, <linux-ntfs-dev@...>, Neil Brown <neilb@...>
Date: Tuesday, October 30, 2007 - 4:00 am

Not being very good at walking away from unsolved mysteries, I chased it
down. The problem is that...
commit[a32ea1e1f925399e0d81ca3f7394a44a6dafa12c] Fix read/truncate race
...calls ntfs_readpage() for a zero i_size inode, which it isn't
accustomed to.

Below is the hammer which made my box a happy camper again.

diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index 6e5c253..ddab5a3 100644
--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -401,7 +401,7 @@ static int ntfs_readpage(struct file *file, struct page *page)
 	MFT_RECORD *mrec;
 	unsigned long flags;
 	u32 attr_len;
-	int err = 0;
+	int err = 0, once = 0;
 
 retry_readpage:
 	BUG_ON(!PageLocked(page));
@@ -414,6 +414,18 @@ retry_readpage:
 		return 0;
 	}
 	vi = page->mapping->host;
+	/*
+	 * If we've been called to read a zero sized inode, zero and bail.
+	 */
+	if (!once) {
+		loff_t i_size = i_size_read(vi);
+
+		once++;
+		if (!i_size) {
+			zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+			goto done;
+		}
+	}
 	ni = NTFS_I(vi);
 	/*
 	 * Only $DATA attributes can be encrypted and only unnamed $DATA
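The "zero and bail" idea in the patch above can be modelled in user space. This is only a sketch under stated assumptions: every name here (sketch_inode, sketch_readpage, sketch_read_backing) is hypothetical, the page size is assumed to be 4 KiB, and the backing read is a stand-in that merely counts calls. It does not reproduce the kernel's page locking or the retry label.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

struct sketch_inode {
	long long i_size;	/* file size in bytes */
	unsigned backing_reads;	/* how often the slow path ran */
};

/* Stand-in for the real read path: counts the call and fills the page. */
static int sketch_read_backing(struct sketch_inode *vi, unsigned char *page)
{
	vi->backing_reads++;
	memset(page, 'x', SKETCH_PAGE_SIZE);
	return 0;
}

/* For a zero sized inode, hand back a zeroed page and skip the slow path. */
static int sketch_readpage(struct sketch_inode *vi, unsigned char *page)
{
	if (vi->i_size == 0) {
		memset(page, 0, SKETCH_PAGE_SIZE);
		return 0;
	}
	return sketch_read_backing(vi, page);
}
```

The point of the early return is that the code below it never has to cope with an attribute that has nothing to map.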

-

To: Mike Galbraith <efault@...>
Cc: LKML <linux-kernel@...>, <linux-ntfs-dev@...>, Neil Brown <neilb@...>
Date: Tuesday, October 30, 2007 - 5:23 am

Hi,

Yes that will fix it but the complete solution is more involved as
there are three related bugs which explain why you were getting the
hangs after the error... I will make a patch for all of these in the
next few days...

Best regards,

--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/

-

To: Mike Galbraith <efault@...>
Cc: LKML <linux-kernel@...>, ntfs-dev <linux-ntfs-dev@...>, Neil Brown <neilb@...>
Date: Friday, November 2, 2007 - 3:22 pm

Hi Mike,

Attached is a patch that should fix this and the other related issues
I found.

Would you be able to test it in your setup? Thanks a lot in advance!

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/

To: Anton Altaparmakov <aia21@...>
Cc: LKML <linux-kernel@...>, ntfs-dev <linux-ntfs-dev@...>, Neil Brown <neilb@...>
Date: Saturday, November 3, 2007 - 12:56 am

Modulo uninitialized vi, works fine. I moved it up, and all is well.
Thanks.

Tested-by: Mike Galbraith <efault@gmx.de>

diff --git a/fs/ntfs/aops.c b/fs/ntfs/aops.c
index cfdc790..ad87cb0 100644
--- a/fs/ntfs/aops.c
+++ b/fs/ntfs/aops.c
@@ -405,6 +405,15 @@ static int ntfs_readpage(struct file *file, struct page *page)
 
 retry_readpage:
 	BUG_ON(!PageLocked(page));
+	vi = page->mapping->host;
+	i_size = i_size_read(vi);
+	/* Is the page fully outside i_size? (truncate in progress) */
+	if (unlikely(page->index >= (i_size + PAGE_CACHE_SIZE - 1) >>
+			PAGE_CACHE_SHIFT)) {
+		zero_user_page(page, 0, PAGE_CACHE_SIZE, KM_USER0);
+		ntfs_debug("Read outside i_size - truncated?");
+		goto done;
+	}
 	/*
 	 * This can potentially happen because we clear PageUptodate() during
 	 * ntfs_writepage() of MstProtected() attributes.
@@ -413,7 +422,6 @@ retry_readpage:
 		unlock_page(page);
 		return 0;
 	}
-	vi = page->mapping->host;
 	ni = NTFS_I(vi);
 	/*
 	 * Only $DATA attributes can be encrypted and only unnamed $DATA
diff --git a/fs/ntfs/attrib.c b/fs/ntfs/attrib.c
index 92dabdc..50d3b0c 100644
--- a/fs/ntfs/attrib.c
+++ b/fs/ntfs/attrib.c
@@ -179,10 +179,7 @@ int ntfs_map_runlist_nolock(ntfs_inode *ni, VCN vcn, ntfs_attr_search_ctx *ctx)
 	 * ntfs_mapping_pairs_decompress() fails.
 	 */
 	end_vcn = sle64_to_cpu(a->data.non_resident.highest_vcn) + 1;
-	if (!a->data.non_resident.lowest_vcn && end_vcn == 1)
-		end_vcn = sle64_to_cpu(a->data.non_resident.allocated_size) >>
-				ni->vol->cluster_size_bits;
-	if (unlikely(vcn >= end_vcn)) {
+	if (unlikely(vcn && vcn >= end_vcn)) {
 		err = -ENOENT;
 		goto err_out;
 	}
diff --git a/fs/ntfs/compress.c b/fs/ntfs/compress.c
index d98daf5..d1619d0 100644
--- a/fs/ntfs/compress.c
+++ b/fs/ntfs/compress.c
@@ -561,6 +561,16 @@ int ntfs_read_compressed_block(struct page *page)
 	read_unlock_irqrestore(&ni->size_lock, flags);
 	max_page = ((i_siz...
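The "page fully outside i_size" test in the aops.c hunk above is round-up arithmetic: (i_size + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT is the number of pages needed to cover i_size bytes, and any page index at or beyond that count lies wholly past the end of the file. Here is a small user-space sketch of the same arithmetic, assuming 4 KiB pages and a non-negative i_size. The SKETCH_ macros and function names are hypothetical, not kernel symbols.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  ((int64_t)1 << SKETCH_PAGE_SHIFT)

/* Number of pages needed to cover i_size bytes (rounds up). */
static int64_t sketch_pages_covering(int64_t i_size)
{
	return (i_size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/* True if the page at this index starts at or beyond the end of the file. */
static bool sketch_page_outside_i_size(uint64_t index, int64_t i_size)
{
	return index >= (uint64_t)sketch_pages_covering(i_size);
}
```

For i_size == 0 the page count is 0, so even index 0 is outside the file. That makes the zero sized case a special instance of the general check rather than a separate test.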

To: Mike Galbraith <efault@...>
Cc: LKML <linux-kernel@...>, ntfs-dev <linux-ntfs-dev@...>, Neil Brown <neilb@...>
Date: Saturday, November 3, 2007 - 3:27 am

Great, thanks! I will submit it to Linus.

Best regards,

Anton

--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/

-

To: Mike Galbraith <efault@...>
Cc: LKML <linux-kernel@...>, ntfs-dev <linux-ntfs-dev@...>
Date: Monday, October 29, 2007 - 9:41 am

oops. forgot to attach the patch... Here it is...

Best regards,

Anton

To: LKML <linux-kernel@...>
Date: Monday, October 29, 2007 - 2:18 am

hrmph, CC to members only list removed.

-

To: Mike Galbraith <efault@...>
Cc: LKML <linux-kernel@...>
Date: Monday, October 29, 2007 - 6:40 am

What makes you think it is members only?!? NTFS-dev is not members
only at all. It is moderated which is a very different thing. Most
list members have to be approved for posting, too... And the only
reason it is moderated is spam filtering. We delete 99.9% of emails
going to the list by hand as moderators because they are all spam.
That means the list and the list archives only contain NTFS related
messages and not thousands of posts trying to sell you pills and what
not...

Best regards,

--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer, http://www.linux-ntfs.org/

-
