Joeri Vanthienen
2012-Dec-12 10:24 UTC
kernel BUG at fs/btrfs/extent_io.c:4052 (kernel 3.5.3)
Hi all,
Last week our server crashed twice with an "uncorrectable ecc memory
error", both times on the same memory module.
After removing the faulty module and restarting the server, everything
was working again.
However, yesterday we had a soft lockup and had to restart the server
again. There was no warning or ECC error this time. Everything is
working now, but of course we want to avoid this in the future.
Dec 11 17:49:04 SANOS1 kernel: kernel BUG at fs/btrfs/extent_io.c:4052!
Dec 11 17:49:04 SANOS1 kernel: invalid opcode: 0000 [#1] SMP
Dec 11 17:49:04 SANOS1 kernel: CPU 4
Dec 11 17:49:04 SANOS1 kernel: Modules linked in: iscsi_scst(O)
scst_vdisk(O) scst(O) btrfs zlib_deflate libcrc32c
cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq
mperf sg(O) ses(O) ixgbe mdio igb lpc_ich mfd_core enclosure mptctl(O)
coretemp kvm_intel kvm crc32c_intel serio_raw pcspkr i2c_i801
i7core_edac ioatdma edac_core dca button edd microcode autofs4
processor thermal_sys scsi_dh_emc(O) scsi_dh_rdac(O) scsi_dh_alua(O)
scsi_dh_hp_sw(O) scsi_dh(O) mptsas(O) mptscsih(O) mptbase(O)
scsi_transport_sas(O) ata_generic ata_piix [last unloaded: scst]
Dec 11 17:49:04 SANOS1 kernel:
Dec 11 17:49:04 SANOS1 kernel: Pid: 10716, comm: btrfs-endio-wri
Tainted: G O 3.5.3-2.10-desktop #3 Supermicro
X8DTN+-F/X8DTN+-F
Dec 11 17:49:04 SANOS1 kernel: RIP: 0010:[<ffffffffa025c3de>]
[<ffffffffa025c3de>]
btrfs_release_extent_buffer_page.constprop.47+0x11e/0x130 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: RSP: 0018:ffff8804d7cbf900 EFLAGS: 00010202
Dec 11 17:49:04 SANOS1 kernel: RAX: 0000000000000001 RBX:
ffff88080e3e80e0 RCX: ffff880497cb74b0
Dec 11 17:49:04 SANOS1 kernel: RDX: 0000000000000000 RSI:
0000000015644868 RDI: ffff88080e3e80e0
Dec 11 17:49:04 SANOS1 kernel: RBP: ffff8804d7cbf930 R08:
0000000000000028 R09: ffff8804d7cbf808
Dec 11 17:49:04 SANOS1 kernel: R10: 0000000000000000 R11:
0000000000000000 R12: ffff880497cb4c10
Dec 11 17:49:04 SANOS1 kernel: R13: ffff8804cca63eb0 R14:
ffff88080e3e80e0 R15: 0000000000000005
Dec 11 17:49:04 SANOS1 kernel: FS: 0000000000000000(0000)
GS:ffff88083fc00000(0000) knlGS:0000000000000000
Dec 11 17:49:04 SANOS1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Dec 11 17:49:04 SANOS1 kernel: CR2: 00007f5519e56600 CR3:
0000000001a0c000 CR4: 00000000000007e0
Dec 11 17:49:04 SANOS1 kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Dec 11 17:49:04 SANOS1 kernel: DR3: 0000000000000000 DR6:
00000000ffff0ff0 DR7: 0000000000000400
Dec 11 17:49:04 SANOS1 kernel: Process btrfs-endio-wri (pid: 10716,
threadinfo ffff8804d7cbe000, task ffff8807be802280)
Dec 11 17:49:04 SANOS1 kernel: Stack:
Dec 11 17:49:04 SANOS1 kernel: ffff8804d7cbf950 ffff88080e3e80e0
ffff880497cb4c10 ffff8804cca63eb0
Dec 11 17:49:04 SANOS1 kernel: 000002ce2aa53000 ffff8804a65ab000
ffff8804d7cbf950 ffffffffa025c60f
Dec 11 17:49:04 SANOS1 kernel: ffff88080e3e80e0 ffff8804cca63eb0
ffff8804d7cbf970 ffffffffa02616b2
Dec 11 17:49:04 SANOS1 kernel: Call Trace:
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa025c60f>]
release_extent_buffer.isra.38+0x3f/0xc0 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa02616b2>]
free_extent_buffer+0x32/0x90 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa02197aa>]
btrfs_release_path+0x2a/0xb0 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa0219aa6>]
btrfs_free_path+0x16/0x30 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa0232df0>]
btrfs_del_csums+0x2b0/0x300 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa0227149>]
__btrfs_free_extent+0x639/0x7b0 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa022b34e>]
run_clustered_refs+0x2be/0xa50 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa022bc76>]
btrfs_run_delayed_refs+0x196/0x4c0 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa025cac8>] ?
merge_state+0xd8/0x150 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa025c949>] ?
free_extent_state+0x19/0x20 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa025d476>] ?
clear_extent_bit+0x216/0x380 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa023d56a>]
__btrfs_end_transaction+0x9a/0x350 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa023d880>]
btrfs_end_transaction+0x10/0x20 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa02433c5>]
btrfs_finish_ordered_io+0x175/0x400 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffff8104ebd0>] ?
usleep_range+0x40/0x40
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa0243660>]
finish_ordered_fn+0x10/0x20 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa026cd97>]
worker_loop+0x157/0x550 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffffa026cc40>] ?
btrfs_queue_worker+0x310/0x310 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: [<ffffffff81061bde>] kthread+0x8e/0xa0
Dec 11 17:49:04 SANOS1 kernel: [<ffffffff81590594>]
kernel_thread_helper+0x4/0x10
Dec 11 17:49:04 SANOS1 kernel: [<ffffffff81061b50>] ?
flush_kthread_worker+0x70/0x70
Dec 11 17:49:04 SANOS1 kernel: [<ffffffff81590590>] ? gs_change+0x13/0x13
Dec 11 17:49:04 SANOS1 kernel: Code: 20 a8 04 75 2c 48 8b 03 a8 10 75
23 48 8b 03 f6 c4 20 75 19 f0 80 63 01 f7 48 c7 43 30 00 00 00 00 48
89 df e8 24 c6 ea e0 eb c0 <0f> 0b 0f 0b 0f 0b 0f 0b 66 2e 0f 1f 84
00 00 00 00 00 55 48 c1
Dec 11 17:49:04 SANOS1 kernel: RIP [<ffffffffa025c3de>]
btrfs_release_extent_buffer_page.constprop.47+0x11e/0x130 [btrfs]
Dec 11 17:49:04 SANOS1 kernel: RSP <ffff8804d7cbf900>
Dec 11 17:49:04 SANOS1 kernel: ---[ end trace ea1d29e10378231c ]---
Dec 11 17:49:29 SANOS1 kernel: BUG: soft lockup - CPU#1 stuck for 22s!
[btrfs-endio-wri:2846]
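For readers less familiar with the oops format: the "invalid opcode" line together with the <0f> 0b marker in the Code: dump is consistent with an x86 ud2 instruction, which is what the kernel's BUG() macro emits on x86. So this looks like an assertion in the btrfs code firing rather than a random bad instruction. A minimal Python sketch of that check follows; the helper names are hypothetical and it only looks at the byte the kernel marked with angle brackets.

```python
import re

UD2 = ("0f", "0b")  # x86 encoding of the ud2 instruction


def faulting_bytes(code_line, count=2):
    """Return `count` hex bytes starting at the one marked <xx> in an oops Code: line."""
    _, _, rest = code_line.partition("Code:")
    tokens = rest.split()
    for i, tok in enumerate(tokens):
        m = re.fullmatch(r"<([0-9a-f]{2})>", tok)
        if m:
            # The marked byte plus the bytes that follow it.
            return tuple([m.group(1)] + tokens[i + 1:i + count])
    return ()  # no Code: section or no marked byte


def looks_like_bug_trap(code_line):
    """True if the faulting instruction bytes match ud2."""
    return faulting_bytes(code_line) == UD2


# The Code: line from the oops above, with the mail line wrapping removed.
oops_code = (
    "Code: 20 a8 04 75 2c 48 8b 03 a8 10 75 "
    "23 48 8b 03 f6 c4 20 75 19 f0 80 63 01 f7 48 c7 43 30 00 00 00 00 48 "
    "89 df e8 24 c6 ea e0 eb c0 <0f> 0b 0f 0b 0f 0b 0f 0b 66 2e 0f 1f 84 "
    "00 00 00 00 00 55 48 c1"
)
print(looks_like_bug_trap(oops_code))
```

This only confirms that the trap was a deliberate BUG(); which condition failed still has to be read from fs/btrfs/extent_io.c:4052 in the matching kernel source.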
Today I finished a scrub of the btrfs filesystem. No errors.
SANOS1:~ # btrfs scrub status -d /dev/sde
scrub status for 517e8cfa-4275-4589-8da4-6a46ad613daa
scrub device /dev/sde (id 1) history
scrub started at Wed Dec 12 09:28:44 2012 and finished after
4149 seconds
total bytes scrubbed: 338.19GB with 0 errors
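A scrub result like this can also be checked from a script. A minimal Python sketch, assuming the summary line has the "total bytes scrubbed: ... with N errors" form shown above (other btrfs-progs versions may print a different format); the function name is hypothetical.

```python
import re


def scrub_summary(status_text):
    """Return (bytes_scrubbed, error_count) from scrub status text, or None if absent."""
    m = re.search(
        r"total bytes scrubbed:\s*(\S+)\s+with\s+(\d+)\s+errors?", status_text
    )
    if m is None:
        return None
    return m.group(1), int(m.group(2))


# Status text as posted above.
sample = """scrub status for 517e8cfa-4275-4589-8da4-6a46ad613daa
scrub device /dev/sde (id 1) history
scrub started at Wed Dec 12 09:28:44 2012 and finished after
4149 seconds
total bytes scrubbed: 338.19GB with 0 errors
"""

print(scrub_summary(sample))
```

A clean scrub only says that the data and metadata read back from disk matched their checksums; it does not by itself rule out an in-memory problem.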
What could be the cause of the soft lockup? Thanks in advance.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs"
in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Dec 12, 2012 at 11:24:18AM +0100, Joeri Vanthienen wrote:
> What could be the cause of the soft lockup? Thanks in advance.

Just FYI, I once hit a similar soft lockup on an extent buffer while
tracking bugs in the tree modify log code, but we've fixed them in the
latest btrfs.

thanks,
liubo