Hi,

I've just got this BUG: message in dmesg, which I think is btrfs related. I have a btrfs filesystem on a memory card which I use to hold the cache and config of chromium (to avoid writing too much to the SSD):

/dev/mmcblk0p1 on /home/mafra/mmc type btrfs (rw,noexec,nosuid,nodev,noatime,noacl,compress)

and the last message before the bug is about the memory card.

[ 148.149179] mmcblk0: retrying using single block read
[ 148.152014] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 148.152021] IP: [<ffffffff811b1301>] extent_range_uptodate+0x51/0xa0
[ 148.152030] PGD 72c37067 PUD 7d64b067 PMD 0
[ 148.152035] Oops: 0000 [#1] SMP
[ 148.152038] last sysfs file: /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
[ 148.152041] CPU 1
[ 148.152043] Modules linked in: snd_seq snd_seq_device snd_hda_codec_idt i2c_i801 snd_hda_intel snd_hda_codec sky2 uvcvideo iwlagn snd_hwdep evdev
[ 148.152053]
[ 148.152057] Pid: 2731, comm: btrfs-endio-met Not tainted 2.6.35.3+ #12 VAIO/VGN-FZ240E
[ 148.152059] RIP: 0010:[<ffffffff811b1301>] [<ffffffff811b1301>] extent_range_uptodate+0x51/0xa0
[ 148.152064] RSP: 0018:ffff880079acddd0 EFLAGS: 00010246
[ 148.152066] RAX: 0000000000000000 RBX: 0000000043eba000 RCX: 0000000000000000
[ 148.152068] RDX: 0000000000000001 RSI: 0000000000043eba RDI: 0000000000000001
[ 148.152071] RBP: ffff880079acddf0 R08: 0000000000000000 R09: 0000000000000000
[ 148.152073] R10: 0000000000001000 R11: 0000000000000000 R12: ffff88007b40eb18
[ 148.152076] R13: 0000000043ebadff R14: ffff880079ab0000 R15: ffff880079acde80
[ 148.152079] FS: 0000000000000000(0000) GS:ffff880001b00000(0000) knlGS:0000000000000000
[ 148.152081] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 148.152084] CR2: 0000000000000000 CR3: 0000000070be9000 CR4: 00000000000006e0
[ 148.152086] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 148.152088] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 148.152091] ...
Hmmm, it is definitely btrfs related. Do you continue to get it after rebooting?

-chris
No, after rebooting, chromium refused to open (like it did during the BUG), but no BUG: message appeared. Then I deleted all chromium cache+config from the memory card and chromium worked again, but after suspending to RAM I noticed this:

[ 6035.423238] PM: Syncing filesystems ... done.
[ 6036.524889] Freezing user space processes ...
[ 6056.533079] Freezing of tasks failed after 20.00 seconds (1 tasks refusing to freeze):
[ 6056.533119] chrome D 00000000ffffffff 0 7028 7015 0x00800004
[ 6056.533124] ffff8800630a1c08 0000000000000086 0000000000000000 0000000000012d40
[ 6056.533129] ffff8800630a1fd8 ffff8800630a1fd8 ffff88006303e0c0 0000000000012d40
[ 6056.533133] 0000000000012d40 0000000000004000 0000000000004000 ffff8800630a1fd8
[ 6056.533137] Call Trace:
[ 6056.533145] [<ffffffff81064029>] ? ktime_get_ts+0xa9/0xe0
[ 6056.533150] [<ffffffff810a3610>] ? sync_page_killable+0x0/0x40
[ 6056.533155] [<ffffffff81506bde>] io_schedule+0x6e/0xb0
[ 6056.533158] [<ffffffff810a35f8>] sync_page+0x38/0x50
[ 6056.533161] [<ffffffff810a3619>] sync_page_killable+0x9/0x40
[ 6056.533164] [<ffffffff81507332>] __wait_on_bit_lock+0x52/0xb0
[ 6056.533168] [<ffffffff810a3532>] __lock_page_killable+0x62/0x70
[ 6056.533172] [<ffffffff8105bb40>] ? wake_bit_function+0x0/0x40
[ 6056.533175] [<ffffffff810a3459>] ? find_get_page+0x19/0x90
[ 6056.533178] [<ffffffff810a5144>] generic_file_aio_read+0x494/0x6d0
[ 6056.533183] [<ffffffff810d4b22>] do_sync_read+0xd2/0x110
[ 6056.533186] [<ffffffff81025406>] ? do_page_fault+0x186/0x390
[ 6056.533190] [<ffffffff810d5273>] vfs_read+0xb3/0x160
[ 6056.533193] [<ffffffff810d536c>] sys_read+0x4c/0x80
[ 6056.533197] [<ffffffff81002d2b>] system_call_fastpath+0x16/0x1b
[ 6056.533207]
[ 6056.533208] Restarting tasks ... done.

but everything is working well AFAICT.
Hi, Carlos,

Did you hit this bug under heavy memory pressure? Can you reproduce it, or share the steps to reproduce it?

After digging into extent_range_uptodate(), IMO this NULL pointer dereference on the page should be very hard to hit. Maybe, under heavy memory pressure, a page of the extent_buffer was freed earlier, so it is missing from the page cache and the lookup returns NULL.

thanks,
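
To make the hypothesis concrete, here is a minimal userspace sketch of the pattern, not the kernel code. All names (sketch_page, sketch_cache, sketch_find_page, sketch_range_uptodate) are hypothetical, the 4 KiB page size is an assumption, and treating a missing page as "not up to date" is just one possible policy. It only shows that a per-page loop over a byte range has to check the lookup result before touching the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT  12                          /* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE   (1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_CACHE_SLOTS 8

/* Hypothetical stand-in for a cached page; NOT the kernel's struct page. */
struct sketch_page {
    uint64_t index;     /* page index, i.e. byte offset >> SKETCH_PAGE_SHIFT */
    bool     uptodate;
};

/* Toy page cache: a NULL slot models a page that was never read or was freed. */
struct sketch_cache {
    struct sketch_page *slots[SKETCH_CACHE_SLOTS];
};

/* Lookup that may legitimately return NULL when the page is not cached. */
struct sketch_page *sketch_find_page(const struct sketch_cache *c, uint64_t index)
{
    for (size_t i = 0; i < SKETCH_CACHE_SLOTS; i++) {
        if (c->slots[i] != NULL && c->slots[i]->index == index)
            return c->slots[i];
    }
    return NULL;
}

/*
 * Returns 1 if every page covering the byte range [start, end] is cached
 * and up to date, 0 otherwise. The NULL check is the point of the sketch:
 * without it, a missing page would be dereferenced.
 */
int sketch_range_uptodate(const struct sketch_cache *c, uint64_t start, uint64_t end)
{
    assert(c != NULL);
    while (start <= end) {
        struct sketch_page *page = sketch_find_page(c, start >> SKETCH_PAGE_SHIFT);
        if (page == NULL)       /* page missing from the cache */
            return 0;
        if (!page->uptodate)
            return 0;
        start += SKETCH_PAGE_SIZE;
    }
    return 1;
}
```

Calling sketch_range_uptodate() on a cache whose slot for the requested index has been cleared returns 0 instead of crashing, which is the behaviour one would want if a page can really disappear between being set up and being checked.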
Not at all! My laptop had been recently booted (see the timings in the dmesg) and it was basically idle: just a couple of xterms and WindowMaker running. Could it be that the memory card had a bad block and btrfs could not recover from the failure?
