Hey,

after my test Lustre filesystem got quite full, I ran rm -rf * on the Lustre filesystem and got this error message:

[ 135.094107] Lustre: MGC192.168.1.2@tcp: Reactivating import
[ 135.094706] Lustre: Server MGS on device /dev/hda5 has started
[ 137.827630] Lustre: spfs-MDT0000: temporarily refusing client connection from 192.168.1.3@tcp
[ 137.828076] LustreError: 2100:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error (-11) req@decada00 x1333718444276272/t0 o38-><?>@<?>:0/0 lens 368/0 e 0 to 0 dl 1272455883 ref 1 fl Interpret:/0/0 rc -11/0
[ 155.830871] Lustre: spfs-MDT0000: temporarily refusing client connection from 192.168.1.3@tcp
[ 155.831308] LustreError: 2099:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error (-11) req@de96b800 x1333718444276280/t0 o38-><?>@<?>:0/0 lens 368/0 e 0 to 0 dl 1272455901 ref 1 fl Interpret:/0/0 rc -11/0
[ 169.705870] Lustre: 2124:0:(mds_lov.c:1167:mds_notify()) MDS spfs-MDT0000: add target spfs-OST0000_UUID
[ 169.770049] Lustre: 2052:0:(mds_lov.c:1203:mds_notify()) MDS spfs-MDT0000: in recovery, not resetting orphans on spfs-OST0000_UUID
[ 169.802342] Lustre: spfs-mdtlov.lov: set parameter stripesize=1048576
[ 173.866361] LustreError: 2103:0:(mds_open.c:1666:mds_close()) @@@ no handle for file close ino 1738772: cookie 0x5a629f5fe2dc51f1 req@ded9f600 x1333718444266421/t0 o35->51740db3-b37e-ec7c-ab23-9e7365d70fab@:0/0 lens 408/528 e 0 to 0 dl 1272456350 ref 2 fl Interpret:/4/0 rc 0/0
[ 173.867518] LustreError: 2103:0:(ldlm_lib.c:1848:target_send_reply_msg()) @@@ processing error (-116) req@ded9f600 x1333718444266421/t0 o35->51740db3-b37e-ec7c-ab23-9e7365d70fab@:0/0 lens 408/432 e 0 to 0 dl 1272456350 ref 2 fl Interpret:/4/0 rc -116/0
[ 174.635986] LustreError: 2104:0:(mds_open.c:1666:mds_close()) @@@ no handle for file close ino 1671195: cookie 0x5a629f5fe2dc47a2 req@dfac4c00 x1333718444266751/t0 o35->51740db3-b37e-ec7c-ab23-9e7365d70fab@:0/0 lens 408/528 e 0 to 0 dl 1272455836 ref 2 fl Interpret:/4/0 rc 0/0
[ 174.637154] LustreError: 2104:0:(mds_open.c:1666:mds_close()) Skipped 2 previous similar messages
[ 176.182138] LustreError: 2103:0:(mds_open.c:1666:mds_close()) @@@ no handle for file close ino 893277: cookie 0x5a629f5fe2dc22e2 req@dfac9400 x1333718444267408/t0 o35->51740db3-b37e-ec7c-ab23-9e7365d70fab@:0/0 lens 408/528 e 0 to 0 dl 1272455838 ref 2 fl Interpret:/4/0 rc 0/0
[ 176.183281] LustreError: 2103:0:(mds_open.c:1666:mds_close()) Skipped 4 previous similar messages
[ 177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1
[ 177.490214] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) LBUG
[ 177.490559] Pid: 2099, comm: ll_mdt_00
[ 177.490759]
[ 177.490760] Call Trace:
[ 177.491067] [<00000000e0d0a7a8>] libcfs_debug_dumpstack+0x58/0x80 [libcfs]
[ 177.491423] [<00000000e0d0aedd>] lbug_with_loc+0x6d/0xc0 [libcfs]
[ 177.491779] [<00000000e13a30b5>] mds_orphan_add_link+0xd85/0xd90 [mds]
[ 177.492162] [<00000000e0e9c5c4>] __ldiskfs_journal_stop+0x24/0x50 [ldiskfs]
[ 177.492527] [<00000000e13b6bcf>] mds_reint_unlink+0x1e8f/0x3b80 [mds]
[ 177.492867] [<00000000e13a2093>] mds_reint_rec+0x133/0x3d0 [mds]
[ 177.493189] [<00000000e138e239>] mds_reint+0x229/0x740 [mds]
[ 177.493583] [<00000000e15a6d44>] lustre_msg_get_flags+0x104/0x200 [ptlrpc]
[ 177.493941] [<00000000e1399599>] mds_handle+0x17c9/0xa180 [mds]
[ 177.494243] [<00000000c02d32ae>] _spin_lock+0x5/0x7
[ 177.494505] [<00000000c02d3204>] _spin_lock_irqsave+0x23/0x29
[ 177.494801] [<00000000c012d2fb>] lock_timer_base+0x19/0x35
[ 177.495081] [<00000000c012d485>] __mod_timer+0xc0/0xc9
[ 177.495392] [<00000000e15a5cfc>] lustre_msg_get_transno+0x10c/0x210 [ptlrpc]
[ 177.495738] [<00000000c02d32ae>] _spin_lock+0x5/0x7
[ 177.496046] [<00000000e1546ef2>] target_queue_recovery_request+0xaf2/0x1750 [ptlrpc]
[ 177.496428] [<00000000c0129b11>] __do_softirq+0x143/0x16b
[ 177.496707] [<00000000c02d32ae>] _spin_lock+0x5/0x7
[ 177.496980] [<00000000e139a783>] mds_handle+0x29b3/0xa180 [mds]
[ 177.497282] [<00000000c026d23f>] net_rx_action+0xa4/0x1be
[ 177.497558] [<00000000c0129b11>] __do_softirq+0x143/0x16b
[ 177.497879] [<00000000e15a5104>] lustre_msg_get_conn_cnt+0x104/0x200 [ptlrpc]
[ 177.498238] [<00000000c0104363>] common_interrupt+0x23/0x28
[ 177.498566] [<00000000e15b4ce6>] ptlrpc_update_export_timer+0x56/0x670 [ptlrpc]
[ 177.498972] [<00000000e15b4a86>] ptlrpc_check_req+0x16/0x220 [ptlrpc]
[ 177.499305] [<00000000e0b1558b>] lprocfs_counter_add+0x5b/0x150 [lvfs]
[ 177.499673] [<00000000e15b8959>] ptlrpc_server_handle_request+0xb29/0x1d90 [ptlrpc]
[ 177.500065] [<00000000c011a5fb>] enqueue_task+0x52/0x5d
[ 177.500336] [<00000000c012048a>] try_to_wake_up+0x15c/0x165
[ 177.500637] [<00000000e0d1750b>] lc_watchdog_touch+0x9b/0x270 [libcfs]
[ 177.501675] [<00000000e0d169b1>] lc_watchdog_disable+0x81/0x260 [libcfs]
[ 177.502052] [<00000000e15bc277>] ptlrpc_main+0x867/0x22e0 [ptlrpc]
[ 177.502362] [<00000000c0120493>] default_wake_function+0x0/0x8
[ 177.502700] [<00000000e15bba10>] ptlrpc_main+0x0/0x22e0 [ptlrpc]
[ 177.503003] [<00000000c01045e3>] kernel_thread_helper+0x7/0x10
[ 177.503296] <IRQ>
[ 177.506256] LustreError: dumping log to /tmp/lustre-log.1272455822.2099

The lustre-log.1272455822.2099 file is attached to this mail.

Does anybody have a clue what is going wrong here?

Greetings
Patrick

-- 
Patrick Winnertz
Tel.: +49 (0)21 61 - 46 43-0
Fax: +49 (0)21 61 - 46 43-100

credativ GmbH, HRB Mönchengladbach 12080
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz

-------------- next part --------------
A non-text attachment was scrubbed...
Name: lustre-log.1272455822.2099
Type: application/octet-stream
Size: 11176 bytes
Desc: not available
Url : http://lists.lustre.org/pipermail/lustre-discuss/attachments/20100428/2810231f/attachment.obj
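For anyone triaging a similar console dump, the line that matters is the failed ASSERTION just before the LBUG. The following is a minimal sketch, not a Lustre tool: the regular expression and the find_assertions helper are hypothetical and only assume the "LustreError: PID:N:(file.c:LINE:func()) ASSERTION(expr) failed: msg" layout seen in the log above, with each log entry on a single line.

```python
import re

# Assumed layout, taken from the log above (not from any Lustre specification):
# "LustreError: PID:N:(file.c:LINE:func()) ASSERTION(expr) failed: msg"
ASSERT_RE = re.compile(
    r"LustreError: (?P<pid>\d+):\d+:"
    r"\((?P<file>[\w.]+):(?P<line>\d+):(?P<func>\w+)\(\)\) "
    r"ASSERTION\((?P<expr>.*?)\) failed(?:: (?P<msg>.*))?"
)


def find_assertions(log_text):
    """Return one dict per failed assertion found in a console log dump."""
    return [m.groupdict() for m in ASSERT_RE.finditer(log_text)]


# Sample line copied from the report above.
sample = (
    "[ 177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link()) "
    "ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1"
)
hits = find_assertions(sample)
for hit in hits:
    print(hit["file"], hit["line"], hit["func"], hit["expr"], hit["msg"])
```

The source file, line number and function name it extracts are the pieces worth quoting when searching the bug tracker for a matching report.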
On Wed, Apr 28, 2010 at 01:50:13PM +0200, Patrick Winnertz wrote:
> [ 177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link())
> ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1
> [ 177.490214] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link())
> LBUG

This is a known problem with open-unlinked directories in 1.8.2.
There is a fix attached to bug 22177.

Johann
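For readers unfamiliar with the term, an "open-unlinked directory" is a directory that has been removed while some process still holds it open. The sketch below only illustrates that state on an ordinary local POSIX filesystem. It is not a reproducer for bug 22177, the open_unlinked_dir helper and the "victim" name are made up for illustration, and the link count it prints depends on the filesystem.

```python
import os
import tempfile


def open_unlinked_dir():
    """Open a directory, remove it, and inspect it through the open descriptor."""
    parent = tempfile.mkdtemp()
    victim = os.path.join(parent, "victim")   # hypothetical name
    os.mkdir(victim)
    fd = os.open(victim, os.O_RDONLY)         # hold the directory open
    try:
        os.rmdir(victim)                      # unlink it while it is still open
        gone = not os.path.exists(victim)     # the name is gone from the namespace
        st = os.fstat(fd)                     # the inode stays alive until close
        return gone, st.st_nlink
    finally:
        os.close(fd)
        os.rmdir(parent)


gone, nlink = open_unlinked_dir()
print("path removed:", gone, "- link count seen through fd:", nlink)
```

A recursive delete such as rm -rf * can leave directories in this state whenever a client still has them open, which would be consistent with how the original report ran into the problem.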
Hi,

We seem to have hit this bug too now. Is the fix in 1.8.3? The Bugzilla entry isn't exactly clear. Additionally, the bug (https://bugzilla.lustre.org/show_bug.cgi?id=22177) is still marked as ASSIGNED. Does that mean the fix still has problems?

Thanks,
derek

On Apr 28, 2010, at 7:59 AM, Johann Lombardi wrote:

> On Wed, Apr 28, 2010 at 01:50:13PM +0200, Patrick Winnertz wrote:
>> [ 177.489667] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link())
>> ASSERTION(inode->i_nlink == 2) failed: dir nlink == 1
>> [ 177.490214] LustreError: 2099:0:(mds_reint.c:1772:mds_orphan_add_link())
>> LBUG
>
> This is a known problem with open-unlinked directory in 1.8.2.
> There is a fix attached to bug 22177.
>
> Johann

Derek Yarnell
UNIX Systems Administrator
University of Maryland
Institute for Advanced Computer Studies
On 2010-07-14, at 17:05, Derek Yarnell wrote:
> We seem to have hit this bug too now. Is the fix in 1.8.3? The Bugzilla entry isn't exactly clear. Additionally, the bug (https://bugzilla.lustre.org/show_bug.cgi?id=22177) is still marked as ASSIGNED. Does that mean the fix still has problems?

The patch shows in Bugzilla as landed-1.8.3, and the 1.8.3 ChangeLog also lists a fix for bug 22177, so I'd say it is fixed.

Cheers, Andreas
--
Andreas Dilger
Lustre Technical Lead
Oracle Corporation Canada Inc.
Yes, confirmed. Every site that reported this issue reported it resolved with this patch.

Andreas Dilger wrote:
> <snip>
> The patch shows in bugzilla as landed-1.8.3, and the 1.8.3 ChangeLog also lists a fix for bug 22177, so I'd say it is fixed.
Hi,

Thanks, yes, this fixed it for us too. We are just running an lfsck today to make sure everything looks good. Sorry for misreading the Bugzilla entry; I will just chalk that up to being tired.

Thanks,
derek

On Jul 15, 2010, at 8:51 AM, Peter Jones wrote:

> Yes, confirmed. Every site that reported this issue reported it resolved with this patch.
>
> Andreas Dilger wrote:
>> <snip>
>> The patch shows in bugzilla as landed-1.8.3, and the 1.8.3 ChangeLog also lists a fix for bug 22177, so I'd say it is fixed.

Derek Yarnell
UNIX Systems Administrator
University of Maryland
Institute for Advanced Computer Studies