Bill Rugolsky Jr.
2005-Jun-07 16:04 UTC
transaction->t_forget == NULL assertion failure with data=journal
It appears that this bug in data=journal mode,

  https://listman.redhat.com/archives/ext3-users/2005-February/msg00045.html

isn't fixed in 2.6.11.11. Andrew, I've CC'd you since you have previously looked at this specific issue.

I'm seeing this problem on dual-Opteron x86-64 boxes serving NFS + Samba3 to a few dozen clients; it takes several hours at high load to reproduce.

We have not tested 2.6.12-rc6 yet, as I need to schedule time for the clients on the cluster. I will try to do that ASAP. I see several important fixes on the bk-commits-head list, but none of them jump out at me as being obviously more relevant to data=journal than data=ordered.

Meanwhile, I'll endeavor to reproduce this locally. It would be really useful to hunt this down and kill it, because NFS over Ext3 otherwise performs very well in data=journal mode.

Suggestions welcome.

-Bill

Assertion failure in __journal_drop_transaction() at fs/jbd/checkpoint.c:625: "transaction->t_forget == NULL"
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at checkpoint:625
invalid operand: 0000 [1] SMP
CPU 1
Modules linked in: e1000 qla2300 qla2xxx netconsole thermal processor fan button battery ac eeprom adm1026 i2c_sensor i2c_amd756 i2c_core
Pid: 17828, comm: kjournald Not tainted 2.6.11.11
RIP: 0010:[<ffffffff801f347f>] <ffffffff801f347f>{__journal_drop_transaction+319}
RSP: 0018:ffff810028841b58  EFLAGS: 00010296
RAX: 0000000000000071 RBX: ffff8100f840fe00 RCX: ffffffff80612d88
RDX: ffffffff80612d88 RSI: 0000000000000292 RDI: ffffffff80612d80
RBP: ffff8100faf21800 R08: ffff8100f7b45b40 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8100ccba49c0
R13: ffff8100faf21800 R14: 0000000000000000 R15: ffff8100faf2195c
FS:  00002aaaaade8b00(0000) GS:ffffffff80847e00(0000) knlGS:00000000557b26c0
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaac2000 CR3: 000000007f6fb000 CR4: 00000000000006e0
Process kjournald (pid: 17828, threadinfo ffff810028840000, task ffff8100fb3e0230)
Stack: ffff81007ee618c8 ffff8100faf21800 ffff81003adc3ce8 ffffffff801f36c3
       ffff81002fc7d1e8 ffffffff801f275e 0000000100000000 ffff8100faf21824
       00000e8c00000000 ffff8100b32d0174
Call Trace:
  <ffffffff801f36c3>{__journal_remove_checkpoint+99}
  <ffffffff801f275e>{journal_commit_transaction+3534}
  <ffffffff8014aec0>{autoremove_wake_function+0}
  <ffffffff8014aec0>{autoremove_wake_function+0}
  <ffffffff8012f047>{recalc_task_prio+327}
  <ffffffff801f6e2c>{kjournald+268}
  <ffffffff8014aec0>{autoremove_wake_function+0}
  <ffffffff80175b4e>{filp_close+126}
  <ffffffff8014aec0>{autoremove_wake_function+0}
  <ffffffff801f6fd0>{commit_timeout+0}
  <ffffffff8010f0f7>{child_rip+8}
  <ffffffff801f6d20>{kjournald+0}
  <ffffffff8010f0ef>{child_rip+0}

Code: 0f 0b b9 12 59 80 ff ff ff ff 71 02 66 66 90 66 90 48 83 7b
RIP <ffffffff801f347f>{__journal_drop_transaction+319} RSP <ffff810028841b58>