I've got another odd crash, hoping somebody will have a hint:

Testing various configurations of our lustre/ramdisk stuff, I've come up with a reproducible case where I'm tripping an assertion failure in the generic transaction code. The proximate cause seems to be running iozone with a stripe size of 1MB. The fs is striped 64K, so that ought to be turning into a bunch of parallel writes. I did some digging around and was all excited to find CFS bug 10891, because it looked like it was fixing something like this, but it doesn't seem to make a difference.

This is kernel 2.6.15, mips64, and still running lustre 1.6b7 (because the vanilla kernel support still isn't ready in 1.6.x). I'm hopeful that somebody will recognize this and be able to point me at a fix, perhaps in the real 1.6. TIA...

[4344894.240000] Assertion failure in journal_dirty_metadata() at fs/jbd/transaction.c:1117: "handle->h_buffer_credits > 0"
[4344894.246000] Break instruction in kernel code[#1]:
[4344894.248000] Cpu 4
[4344894.249000] $ 0 : 0000000000000000 000000001480c4e0 0000000000000082 ffffffff80528e88
[4344894.257000] $ 4 : ffffffff80528eb8 ffffffff80530000 ffffffff80530000 ffffffff805f0000
[4344894.261000] $ 8 : ffffffff80530000 ffffffff80530000 ffffffff80530000 ffffffff805f0000
[4344894.265000] $12 : ffffffff805f0000 ffffffff805f0000 ffffffff805f0000 ffffffff805f0000
[4344894.269000] $16 : a8000000e8050c80 a8000000e96b3e88 a8000000e96467e8 a8000000ef4616b0
[4344894.272000] $20 : a8000000fa013640 ffffffff805f0000 0000000000001800 a8000000ef4616b0
[4344894.278000] $24 : ffffffff80530000 ffffffff80530000
[4344894.282000] $28 : a8000000f9ec0000 a8000000f9ecf190 a8000000f9ecf2a0 ffffffff80241138
[4344894.290000] Hi : 0000000000000002
[4344894.292000] Lo : 0000000000000000
[4344894.293000] epc : ffffffff80241138 journal_dirty_metadata+0x170/0x598    Tainted: P
[4344894.297000] ra  : ffffffff80241138 journal_dirty_metadata+0x170/0x598
[4344894.301000] Status: 1480c4e3 KX SX UX KERNEL EXL IE
[4344894.304000] Cause : 00800024
[4344894.305000] PrId  : 040e1301
[4344894.306000] Modules linked in: osc mds fsfilt_ldiskfs mgs mgc lustre lov mdc ksocklnd ksclnd ptlrpc obdclass lvfs ldiskfs lnet libcfs sceth scio scfab
[4344894.315000] Process ll_mdt_19 (pid: 1703, threadinfo=a8000000f9ec0000, task=a8000000f02b4540)
[4344894.319000] Stack : a8000000e96b3e88 ffffffff805f0000 a8000000ef4616b0 a8000000e8e88168
[4344894.326000]         a8000000f9ecf2c0 0000000000000000 c0000000028ec5a8 c0000000028ec474
[4344894.330000]         0000000000000060 ffffffff8023e298 0000000000000040 0000000000000040
[4344894.334000]         00000000000072bf a800000011f3fae0 a800000011cc3f40 c000000000070000
[4344894.338000]         a8000000f9ecf220 c000000002a9bf48 0000000000000000 a8000000ef4e3180
[4344894.341000]         0000000000000000 a8000000f9ecf2c0 a8000000e96b3e30 0000000000000800
[4344894.347000]         a8000000e96c7800 0000000000001800 c0000000028eeef8 0000000000000000
[4344894.351000]         a8000000e96b3e30 0000000000000c00 0000000000000000 0000000000000000
[4344894.357000]         c000000008ecbefc a8000000e9316000 a800000011f3fae0 a8000000fa239700
[4344894.360000]         a8000000f9ecf2c0 c000000002aa08f8 ffffffff00000000 0000000000000005
[4344894.364000]         ...
[4344894.366000] Call Trace:
[4344894.367000]  [<c0000000028ec5a8>] ldiskfs_getblk+0x278/0x668 [ldiskfs]
[4344894.370000]  [<c0000000028ec474>] ldiskfs_getblk+0x144/0x668 [ldiskfs]
[4344894.372000]  [<ffffffff8023e298>] do_get_write_access+0xa40/0xdb8
[4344894.375000]  [<c000000002a9bf48>] llog_lvfs_write_blob+0x4f8/0xc08 [obdclass]
[4344894.378000]  [<c0000000028eeef8>] ldiskfs_bread+0x30/0x1e8 [ldiskfs]
[4344894.381000]  [<c000000008ecbefc>] fsfilt_ldiskfs_write_record+0x22c/0xa68 [fsfilt_ldiskfs]
[4344894.386000]  [<c000000002aa08f8>] llog_lvfs_write_rec+0x1458/0x2918 [obdclass]
[4344894.389000]  [<c000000002a9ba50>] llog_lvfs_write_blob+0x0/0xc08 [obdclass]
[4344894.392000]  [<c000000002a9bf48>] llog_lvfs_write_blob+0x4f8/0xc08 [obdclass]
[4344894.397000]  [<c000000002a9379c>] llog_cat_new_log+0xff4/0x2350 [obdclass]
[4344894.400000]  [<c000000002a9ba50>] llog_lvfs_write_blob+0x0/0xc08 [obdclass]
[4344894.403000]  [<c000000002aa0004>] llog_lvfs_write_rec+0xb64/0x2918 [obdclass]
[4344894.411000]  [<ffffffff802b0000>] kobject_register+0x80/0xa0
[4344894.413000]  [<ffffffff8013b650>] __might_sleep+0x0/0x158
[4344894.415000]  [<ffffffff804513bc>] _spin_unlock_irqrestore+0x14/0x48
[4344894.418000]  [<c000000002a9788c>] llog_cat_current_log+0x82c/0xb98 [obdclass]
[4344894.420000]  [<ffffffff804514c8>] _spin_lock_irqsave+0x30/0x48
[4344894.423000]  [<ffffffff802b3548>] __up_write+0x40/0x2e8
[4344894.425000]  [<c000000002a90000>] llog_process+0x1f30/0x2e80 [obdclass]
[4344894.430000]  [<c000000002a98350>] llog_cat_add_rec+0x1a0/0x1530 [obdclass]
[4344894.433000]  [<c0000000028f0000>] ldiskfs_prepare_write+0x188/0x290 [ldiskfs]
[4344894.436000]  [<c0000000028eae14>] ldiskfs_get_block_handle+0x104/0x12f8 [ldiskfs]
[4344894.441000]  [<c000000002aada54>] llog_obd_origin_add+0xec/0x5c0 [obdclass]
[4344894.444000]  [<ffffffff801ba884>] __find_get_block+0x1bc/0x418
[4344894.447000]  [<c0000000037d06b4>] _ldlm_lock_debug+0x1d4/0x838 [ptlrpc]
[4344894.456000]  [<c000000002aaa5cc>] llog_add+0xfc/0x960 [obdclass]
[4344894.459000]  [<c000000008dd0d88>] lov_llog_origin_add+0x168/0x5b8 [lov]
[4344894.461000]  [<ffffffff801bcaa0>] ll_rw_block+0x0/0x450
[4344894.464000]  [<c0000000028ec474>] ldiskfs_getblk+0x144/0x668 [ldiskfs]
[4344894.466000]  [<ffffffff801c0000>] end_buffer_read_sync+0x190/0x228
[4344894.469000]  [<c000000002aaa5cc>] llog_add+0xfc/0x960 [obdclass]
[4344894.471000]  [<c000000008e1cc68>] lov_alloc_memmd+0x178/0xf60 [lov]
[4344894.478000]  [<c000000008f70a4c>] mds_llog_origin_add+0x10c/0x430 [mds]
[4344894.481000]  [<c000000002aaa5cc>] llog_add+0xfc/0x960 [obdclass]
[4344894.483000]  [<ffffffff802b0000>] kobject_register+0x80/0xa0
[4344894.490000]  [<c000000008f77660>] mds_log_op_unlink+0x19f0/0x3cc0 [mds]
[4344894.493000]  [<c000000008f774bc>] mds_log_op_unlink+0x184c/0x3cc0 [mds]
[4344894.495000]  [<c0000000028ffcc8>] __ldiskfs_journal_stop+0x48/0xa0 [ldiskfs]
[4344894.498000]  [<ffffffff8020baf0>] dnotify_parent+0x60/0x198
[4344894.502000]  [<c0000000028f8b88>] ldiskfs_unlink+0xf8/0x340 [ldiskfs]
[4344894.505000]  [<c000000003874730>] lustre_msg_buf+0x0/0x158 [ptlrpc]
[4344894.508000]  [<c000000003883030>] lustre_msg_buflen+0x0/0x1f8 [ptlrpc]
[4344894.513000]  [<c000000008ffa91c>] mds_reint_unlink+0x3aec/0x65a0 [mds]
[4344894.516000]  [<c000000002a5cf70>] upcall_cache_get_entry+0x14b8/0x1fe8 [lvfs]
[4344894.518000]  [<ffffffff801e3c4c>] dput+0x9c/0x3d0
[4344894.525000]  [<c000000008fd921c>] mds_reint_rec+0x1fc/0x758 [mds]
[4344894.527000]  [<c0000000090498f8>] mds_unlink_unpack+0x1c8/0x838 [mds]
[4344894.530000]  [<c0000000090498cc>] mds_unlink_unpack+0x19c/0x838 [mds]
[4344894.533000]  [<c000000008facf80>] mds_reint+0x568/0xda8 [mds]
[4344894.539000]  [<c000000003870000>] lustre_msg_buf_v2+0x118/0x4d0 [ptlrpc]
[4344894.542000]  [<c000000008fc1830>] mds_handle+0x5ed8/0x12dc0 [mds]
[4344894.545000]  [<ffffffff8010c99c>] do_gettimeofday+0xf4/0x220
[4344894.551000]  [<c000000000065e58>] libcfs_debug_vmsg2+0x330/0xfd0 [libcfs]
[4344894.554000]  [<c000000000065c00>] libcfs_debug_vmsg2+0xd8/0xfd0 [libcfs]
[4344894.557000]  [<c000000003895c24>] ptlrpc_main+0x352c/0x48f8 [ptlrpc]
[4344894.564000]  [<c000000003888fc0>] ptlrpc_retry_rqbds+0x0/0x10 [ptlrpc]
[4344894.567000]  [<c000000003888fc0>] ptlrpc_retry_rqbds+0x0/0x10 [ptlrpc]
[4344894.569000]  [<ffffffff8013cde8>] default_wake_function+0x0/0x20
[4344894.572000]  [<ffffffff80108680>] kernel_thread_helper+0x10/0x18
[4344894.579000]  [<ffffffff80108670>] kernel_thread_helper+0x0/0x18
[4344894.582000]
[4344894.582000]
[4344894.583000] Code: 2407045d 0c051fac 0102402d <0200000d> 8e620008 2442ffff ae620008 de420028 120200b7
From: "John R. Dunning" <jrd@sicortex.com> Date: Wed, 25 Jul 2007 08:06:29 -0400 I''ve got another odd crash, hoping somebody will have a hint: Dug a little deeper, and am more confused: In lustre/ldiskfs/inode.c: struct buffer_head *ldiskfs_getblk(handle_t *handle, struct inode * inode, long block, int create, int * errp) around line 875: dummy.b_blocknr = -1000; around line 880: if (!*errp && buffer_mapped(&dummy)) { struct buffer_head *bh; bh = sb_getblk(inode->i_sb, dummy.b_blocknr); and around line 902 BUFFER_TRACE(bh, "call ldiskfs_journal_dirty_metadata"); err = ldiskfs_journal_dirty_metadata(handle, bh); if (!fatal) fatal = err; It''s in the call to ldiskfs_journal_dirty_metadata that I seem to be blowing the ASSERT. Chasing down the call to sb_getblk, in kernel/fs/buffer.c, around line 1451, I see /* * __getblk will locate (and, if necessary, create) the buffer_head * which corresponds to the passed block_device, block and size. The * returned buffer has its reference count incremented. * * __getblk() cannot fail - it just keeps trying. If you pass it an * illegal block number, __getblk() will happily return a buffer_head * which represents the non-existent block. Very weird. * * __getblk() will lock up the machine if grow_dev_page''s try_to_free_buffers() * attempt is failing. FIXME, perhaps? */ struct buffer_head * __getblk(struct block_device *bdev, sector_t block, int size) { struct buffer_head *bh = __find_get_block(bdev, block, size); might_sleep(); if (bh == NULL) bh = __getblk_slow(bdev, block, size); return bh; }>From the ldiskfs code, it looks to me like we''re intentionally calling__getblk with a bogus block number. But the comment in __getblk pretty clearly states that calling it that way will result in a buffer_head pointing to that illegal block. 
Since we then turn around and pass that resultant buffer_head (returned from sb_getblk) into ldiskfs_journal_dirty_metadata, that looks to me like a plausible way to end up with a busted bh, and hence the assertion failure. Anybody else find this suspect? Am I missing something? TIA...