thr3ads.net - Btrfs devel - consistent oops after power fail during btrfs-vol -r [Mar 2010]

Troy Ablan

2010-Mar-14 18:09 UTC

consistent oops after power fail during btrfs-vol -r

Hey guys,

In the middle of a long btrfs-vol -r on a 15-device raid1 the machine
lost power.  I don't know if that's the cause, but I now have access to
the filesystem for only a minute or two, and am now consistently getting
this oops shortly after mount:

[ 1151.367849] btrfs memmove bogus dst_offset 536872944 move len 1110 len 4096
[ 1151.367856] ------------[ cut here ]------------
[ 1151.367908] kernel BUG at fs/btrfs/extent_io.c:3798!
[ 1151.367959] invalid opcode: 0000 [#1] SMP
[ 1151.368013] last sysfs file: /sys/devices/virtual/block/md1/md/metadata_version
[ 1151.368108] CPU 0
[ 1151.368157] Pid: 5876, comm: btrfs-cleaner Tainted: G        W  2.6.33-gentoo #1 P55M-GD45 (MS-7588) /MS-7588
[ 1151.368256] RIP: 0010:[<ffffffff812c7372>]  [<ffffffff812c7372>] memmove_extent_buffer+0x262/0x290
[ 1151.368360] RSP: 0018:ffff8800a8f599b0  EFLAGS: 00010282
[ 1151.368412] RAX: 0000000000000055 RBX: 0000000000000001 RCX: 000000000003ffff
[ 1151.368467] RDX: ffff880028200000 RSI: 0000000000000086 RDI: 0000000000000000
[ 1151.368521] RBP: ffff8800a8f59a20 R08: 0000000000000000 R09: ffffffff816b54ef
[ 1151.368575] R10: 0000000000000000 R11: 0000000000000003 R12: 0000000000000456
[ 1151.368629] R13: 0000000000000456 R14: 0000000020000033 R15: ffff88009a8df9a0
[ 1151.368684] FS:  0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
[ 1151.368780] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1151.368832] CR2: 00000000020c4cd0 CR3: 00000000018df000 CR4: 00000000000006f0
[ 1151.368886] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1151.368939] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1151.368994] Process btrfs-cleaner (pid: 5876, threadinfo ffff8800a8f58000, task ffff8800ad09d0c0)
[ 1151.369091] Stack:
[ 1151.369137]  ffff8800a8f59a20 ffffffff812bbba1 ffff8800a8f599d8 ffffffff00000004
[ 1151.369198] <0> ffff8800a8f59fd8 0000000000001000 0000000000000000 ffff88009ce58000
[ 1151.369305] <0> 0000000000000000 0000000000000001 0000000000000456 ffff88009a8df9a0
[ 1151.369455] Call Trace:
[ 1151.369685]  [<ffffffff812bbba1>] ? btrfs_item_offset+0xe1/0xf0
[ 1151.369742]  [<ffffffff81293311>] btrfs_del_items+0x141/0x580
[ 1151.369794]  [<ffffffff8129a20c>] ? btrfs_pin_extent+0xac/0xd0
[ 1151.369847]  [<ffffffff8129c5df>] ? pin_down_bytes+0x5f/0x190
[ 1151.369900]  [<ffffffff8129e4be>] __btrfs_free_extent+0x50e/0x7f0
[ 1151.369955]  [<ffffffff812e17d9>] ? tree_insert+0x99/0x190
[ 1151.370007]  [<ffffffff8129ec67>] run_one_delayed_ref+0x4c7/0x540
[ 1151.370060]  [<ffffffff812e230f>] ? btrfs_delayed_ref_lock+0x3f/0x120
[ 1151.370114]  [<ffffffff812a12cd>] run_clustered_refs+0xbd/0x330
[ 1151.370167]  [<ffffffff812e2878>] ? btrfs_find_ref_cluster+0xe8/0x190
[ 1151.370221]  [<ffffffff812a1606>] btrfs_run_delayed_refs+0xc6/0x1f0
[ 1151.370274]  [<ffffffff812a19bc>] btrfs_drop_snapshot+0x28c/0x600
[ 1151.370327]  [<ffffffff812ab2d2>] btrfs_clean_old_snapshots+0x122/0x150
[ 1151.370382]  [<ffffffff812a7ae0>] cleaner_kthread+0x160/0x180
[ 1151.370435]  [<ffffffff812a7980>] ? cleaner_kthread+0x0/0x180
[ 1151.370488]  [<ffffffff812a7980>] ? cleaner_kthread+0x0/0x180
[ 1151.370540]  [<ffffffff812a7980>] ? cleaner_kthread+0x0/0x180
[ 1151.370593]  [<ffffffff81096a16>] kthread+0x96/0xa0
[ 1151.370646]  [<ffffffff81034c14>] kernel_thread_helper+0x4/0x10
[ 1151.370700]  [<ffffffff816b58a9>] ? restore_args+0x0/0x30
[ 1151.370752]  [<ffffffff81096980>] ? kthread+0x0/0xa0
[ 1151.370803]  [<ffffffff81034c10>] ? kernel_thread_helper+0x0/0x10
[ 1151.370856] Code: c3 48 8b 45 b0 48 89 da 48 8d 34 07 48 03 7d b8 e8 34 ac 06 00 e9 73 ff ff ff 4c 89 ea 48 c7 c7 88 1e 82 81 31 c0 e8 4a b1 3e 00 <0f> 0b eb fe 66 2e 0f 1f 84 00 00 00 00 00 48 89 fe 4c 89 ea 48
[ 1151.371152] RIP  [<ffffffff812c7372>] memmove_extent_buffer+0x262/0x290
[ 1151.371208]  RSP <ffff8800a8f599b0>
[ 1151.371517] ---[ end trace e22acb8dc89df5cc ]---


Only once, before this, but still after the power failure, I had this
oops (older kernel):


[ 6962.309973] leaf free space ret -536870861, leaf data size 3995, used 536874856 nritems 50
[ 6962.310068] leaf free space ret -536870861, leaf data size 3995, used 536874856 nritems 50
[ 6962.310082] ------------[ cut here ]------------
[ 6962.310088] WARNING: at fs/btrfs/extent_io.c:3475 read_extent_buffer+0x178/0x1a0()
[ 6962.310089] Hardware name: MS-7588
[ 6962.310090] Modules linked in:
[ 6962.310093] Pid: 6085, comm: rsync Tainted: G        W  2.6.33-rc6 #2
[ 6962.310094] Call Trace:
[ 6962.310098]  [<ffffffff812c7408>] ? read_extent_buffer+0x178/0x1a0
[ 6962.310102]  [<ffffffff81078838>] warn_slowpath_common+0x78/0xd0
[ 6962.310104]  [<ffffffff8107889f>] warn_slowpath_null+0xf/0x20
[ 6962.310106]  [<ffffffff812c7408>] read_extent_buffer+0x178/0x1a0
[ 6962.310108]  [<ffffffff812c7512>] copy_extent_buffer+0xe2/0x190
[ 6962.310111]  [<ffffffff8128f724>] __push_leaf_right+0x404/0x8a0
[ 6962.310113]  [<ffffffff81292c09>] push_leaf_right+0x1a9/0x1b0
[ 6962.310115]  [<ffffffff812936a9>] split_leaf+0x519/0x760
[ 6962.310117]  [<ffffffff8128dca6>] ? leaf_space_used+0xd6/0x110
[ 6962.310119]  [<ffffffff8129567b>] btrfs_search_slot+0x83b/0x880
[ 6962.310121]  [<ffffffff81295d29>] btrfs_insert_empty_items+0x69/0xd0
[ 6962.310124]  [<ffffffff81120ae7>] ? kmem_cache_alloc+0xc7/0x1e0
[ 6962.310127]  [<ffffffff8129e3b8>] run_one_delayed_ref+0x1d8/0x540
[ 6962.310129]  [<ffffffff812a0d40>] ? run_clustered_refs+0xf0/0x330
[ 6962.310132]  [<ffffffff812a0d0d>] run_clustered_refs+0xbd/0x330
[ 6962.310135]  [<ffffffff812e2488>] ? btrfs_find_ref_cluster+0xe8/0x190
[ 6962.310138]  [<ffffffff812a1046>] btrfs_run_delayed_refs+0xc6/0x1f0
[ 6962.310140]  [<ffffffff812ab724>] __btrfs_end_transaction+0x64/0x170
[ 6962.310142]  [<ffffffff812ab84b>] btrfs_end_transaction+0xb/0x10
[ 6962.310145]  [<ffffffff812b3140>] btrfs_dirty_inode+0x50/0x60
[ 6962.310148]  [<ffffffff81146435>] __mark_inode_dirty+0x35/0x180
[ 6962.310151]  [<ffffffff8113b66e>] touch_atime+0x11e/0x160
[ 6962.310154]  [<ffffffff810ebdeb>] generic_file_aio_read+0x2cb/0x630
[ 6962.310157]  [<ffffffff811265b1>] do_sync_read+0xd1/0x120
[ 6962.310159]  [<ffffffff811272c8>] vfs_read+0xc8/0x1a0
[ 6962.310161]  [<ffffffff81127490>] sys_read+0x50/0x90
[ 6962.310164]  [<ffffffff81033e2b>] system_call_fastpath+0x16/0x1b
[ 6962.310166] ---[ end trace 4a71552e8b9479de ]---
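
A reading aid for the numbers in the two logs above: the implausible values ("bogus dst_offset" in the first oops, "used" in the second log) are each a small, leaf-sized number plus exactly 2^29. The Python sketch below only does that arithmetic. The helper name `strip_bit_29` is made up for illustration, and reading the pattern as one flipped bit in the metadata is an assumption, not something the logs confirm.

```python
LEAF_SIZE = 4096      # "len 4096" in the first oops
BIT_29 = 1 << 29      # 536870912

def strip_bit_29(value):
    """Hypothetical helper: return value with bit 29 cleared."""
    return value & ~BIT_29

bogus_dst_offset = 536872944   # "bogus dst_offset" in the first oops
bogus_used = 536874856         # "used" in the second log
leaf_data_size = 3995          # "leaf data size" in the second log

# The reported "ret" matches data size minus used.
print(leaf_data_size - bogus_used)               # -536870861, as logged

# With bit 29 cleared, both values fit inside a 4096-byte leaf.
print(strip_bit_29(bogus_dst_offset))            # 2032
print(strip_bit_29(bogus_used))                  # 3944
print(leaf_data_size - strip_bit_29(bogus_used)) # 51 bytes free
```

Both stripped values fall below the 4096-byte leaf size, which is why the pattern stands out. It does not say where or when the value was damaged.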

One other time, it panicked (still the older kernel), and it didn't log.

Let me know if you need more information or how I can help debug.

Thanks

--Troy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Mason

2010-Mar-15 20:56 UTC

Re: consistent oops after power fail during btrfs-vol -r

On Sun, Mar 14, 2010 at 11:09:46AM -0700, Troy Ablan wrote:
> Hey guys,
> 
> In the middle of a long btrfs-vol -r on a 15-device raid1 the machine
> lost power.  I don't know if that's the cause, but I now have access to
> the filesystem for only a minute or two, and am now consistently getting
> this oops shortly after mount:

Just to include comments from irc, this configuration has dm-crypt on
top of plain sata drives.  This configuration won't pass sata cache
flushing operations from the filesystem to the drive, and so the
writeback cache on these drives needs to be turned off to avoid
corruption during power failures.
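
For anyone following along, turning the writeback cache off on a SATA drive is usually done with `hdparm -W 0 <device>`, run as root. Below is a minimal Python sketch that builds that command for a list of drives. The device names are placeholders, not the drives from this report, and the function names are made up for illustration. By default it only prints the commands; pass `dry_run=False` to actually run them.

```python
import subprocess

def write_cache_off_commands(devices):
    """Build one 'hdparm -W 0 <dev>' command per underlying drive."""
    return [["hdparm", "-W", "0", dev] for dev in devices]

def disable_write_cache(devices, dry_run=True):
    """Print the commands (dry run) or execute them; needs root to execute."""
    commands = write_cache_off_commands(devices)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands

if __name__ == "__main__":
    # Placeholder device names; list the raw drives underneath dm-crypt.
    disable_write_cache(["/dev/sda", "/dev/sdb"])
```

The commands target the raw drives rather than the dm-crypt mappings, since the cache being turned off lives on the physical disks.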

This doesn't mean the oopsen are ok; I'm working on a series of EIO
patches to get us past these bugs and at least help people read the data
off the drives.

-chris