Hello,

Since I upgraded Lustre from 2.2 to 2.5.1 (CentOS 6.5) and copied the MDT to a new SSD, I have been getting random kernel panics on the MDS (both HA pairs). The most recent kernel panic produced this log:

<4>Lustre: MGS: non-config logname received: params
<3>LustreError: 11-0: cetafs-MDT0000-lwp-MDT0000: Communicating with 0@lo, operation mds_connect failed with -11.
<4>Lustre: MGS: non-config logname received: params
<4>Lustre: cetafs-MDT0000: Will be in recovery for at least 5:00, or until 102 clients reconnect
<4>Lustre: MGS: non-config logname received: params
<4>Lustre: MGS: non-config logname received: params
<4>Lustre: Skipped 5 previous similar messages
<4>Lustre: MGS: non-config logname received: params
<4>Lustre: Skipped 9 previous similar messages
<4>Lustre: MGS: non-config logname received: params
<4>Lustre: Skipped 2 previous similar messages
<4>Lustre: MGS: non-config logname received: params
<4>Lustre: Skipped 23 previous similar messages
<4>Lustre: MGS: non-config logname received: params
<4>Lustre: Skipped 8 previous similar messages
<3>LustreError: 3461:0:(ldlm_lib.c:1751:check_for_next_transno()) cetafs-MDT0000: waking for gap in transno, VBR is OFF (skip: 17188113481, ql: 1, comp: 101, conn: 102, next: 17188113493, last_committed: 17188113480)
<6>Lustre: cetafs-MDT0000: Recovery over after 1:13, of 102 clients 102 recovered and 0 were evicted.
<1>BUG: unable to handle kernel NULL pointer dereference at (null)
<1>IP: [<ffffffffa0c3b6a0>] __iam_path_lookup+0x70/0x1f0 [osd_ldiskfs]
<4>PGD 106c0bf067 PUD 106c0be067 PMD 0
<4>Oops: 0002 [#1] SMP
<4>last sysfs file: /sys/devices/system/cpu/online
<4>CPU 0
<4>Modules linked in: osp(U) mdd(U) lfsck(U) lod(U) mdt(U) mgs(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) ldiskfs(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic sha256_generic crc32c_intel libcfs(U) ipmi_devintf cpufreq_ondemand acpi_cpufreq freq_table mperf ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 dm_multipath microcode iTCO_wdt iTCO_vendor_support sb_edac edac_core lpc_ich mfd_core i2c_i801 igb i2c_algo_bit i2c_core ptp pps_core ioatdma dca mlx4_ib ib_sa ib_mad ib_core mlx4_en mlx4_core sg ext4 jbd2 mbcache sd_mod crc_t10dif ahci isci libsas mpt2sas scsi_transport_sas raid_class megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>
<4>Pid: 3362, comm: mdt00_001 Not tainted 2.6.32-431.5.1.el6_lustre.x86_64 #1 Bull SAS bullx/X9DRH-7TF/7F/iTF/iF
<4>RIP: 0010:[<ffffffffa0c3b6a0>] [<ffffffffa0c3b6a0>] __iam_path_lookup+0x70/0x1f0 [osd_ldiskfs]
<4>RSP: 0018:ffff88085e2754b0 EFLAGS: 00010246
<4>RAX: 00000000fffffffb RBX: ffff88085e275600 RCX: 000000000009c93c
<4>RDX: 0000000000000000 RSI: 000000000009c93b RDI: ffff88106bcc32f0
<4>RBP: ffff88085e275500 R08: 0000000000000000 R09: 00000000ffffffff
<4>R10: 0000000000000000 R11: 0000000000000000 R12: ffff88085e2755c8
<4>R13: 0000000000005250 R14: ffff8810569bf308 R15: 0000000000000001
<4>FS: 0000000000000000(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
<4>CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
<4>CR2: 0000000000000000 CR3: 000000106dd9b000 CR4: 00000000000407f0
<4>DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<4>DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
<4>Process mdt00_001 (pid: 3362, threadinfo ffff88085e274000, task ffff88085f55c080)
<4>Stack:
<4> 0000000000000000 ffff88085e2755d8 ffff8810569bf288 ffffffffa00fd2c4
<4><d> ffff88085e275660 ffff88085e2755c8 ffff88085e2756c8 0000000000000000
<4><d> 0000000000000000 ffff88085db2a480 ffff88085e275530 ffffffffa0c3ba6c
<4>Call Trace:
<4> [<ffffffffa00fd2c4>] ? do_get_write_access+0x3b4/0x520 [jbd2]
<4> [<ffffffffa0c3ba6c>] iam_lookup_lock+0x7c/0xb0 [osd_ldiskfs]
<4> [<ffffffffa0c3bad4>] __iam_it_get+0x34/0x160 [osd_ldiskfs]
<4> [<ffffffffa0c3be1e>] iam_it_get+0x2e/0x150 [osd_ldiskfs]
<4> [<ffffffffa0c3bf4e>] iam_it_get_exact+0xe/0x30 [osd_ldiskfs]
<4> [<ffffffffa0c3d47f>] iam_insert+0x4f/0xb0 [osd_ldiskfs]
<4> [<ffffffffa0c366ea>] osd_oi_iam_refresh+0x18a/0x330 [osd_ldiskfs]
<4> [<ffffffffa0c3ea40>] ? iam_lfix_ipd_alloc+0x0/0x20 [osd_ldiskfs]
<4> [<ffffffffa0c386dd>] osd_oi_insert+0x11d/0x480 [osd_ldiskfs]
<4> [<ffffffff811ae522>] ? generic_setxattr+0xa2/0xb0
<4> [<ffffffffa0c25021>] ? osd_ea_fid_set+0xf1/0x410 [osd_ldiskfs]
<4> [<ffffffffa0c33595>] osd_object_ea_create+0x5b5/0x700 [osd_ldiskfs]
<4> [<ffffffffa0e173bf>] lod_object_create+0x13f/0x260 [lod]
<4> [<ffffffffa0e756c0>] mdd_object_create_internal+0xa0/0x1c0 [mdd]
<4> [<ffffffffa0e86428>] mdd_create+0xa38/0x1730 [mdd]
<4> [<ffffffffa0c2af37>] ? osd_xattr_get+0x97/0x2e0 [osd_ldiskfs]
<4> [<ffffffffa0e14770>] ? lod_index_lookup+0x0/0x30 [lod]
<4> [<ffffffffa0d50358>] mdo_create+0x18/0x50 [mdt]
<4> [<ffffffffa0d5a64c>] mdt_reint_open+0x13ac/0x21a0 [mdt]
<4> [<ffffffffa065983c>] ? lustre_msg_add_version+0x6c/0xc0 [ptlrpc]
<4> [<ffffffffa04f4600>] ? lu_ucred_key_init+0x160/0x1a0 [obdclass]
<4> [<ffffffffa0d431f1>] mdt_reint_rec+0x41/0xe0 [mdt]
<4> [<ffffffffa0d2add3>] mdt_reint_internal+0x4c3/0x780 [mdt]
<4> [<ffffffffa0d2b35d>] mdt_intent_reint+0x1ed/0x520 [mdt]
<4> [<ffffffffa0d26a0e>] mdt_intent_policy+0x3ae/0x770 [mdt]
<4> [<ffffffffa0610511>] ldlm_lock_enqueue+0x361/0x8c0 [ptlrpc]
<4> [<ffffffffa0639abf>] ldlm_handle_enqueue0+0x4ef/0x10a0 [ptlrpc]
<4> [<ffffffffa0d26ed6>] mdt_enqueue+0x46/0xe0 [mdt]
<4> [<ffffffffa0d2dbca>] mdt_handle_common+0x52a/0x1470 [mdt]
<4> [<ffffffffa0d68545>] mds_regular_handle+0x15/0x20 [mdt]
<4> [<ffffffffa0669a45>] ptlrpc_server_handle_request+0x385/0xc00 [ptlrpc]
<4> [<ffffffffa03824ce>] ? cfs_timer_arm+0xe/0x10 [libcfs]
<4> [<ffffffffa03933df>] ? lc_watchdog_touch+0x6f/0x170 [libcfs]
<4> [<ffffffffa06610e9>] ? ptlrpc_wait_event+0xa9/0x2d0 [ptlrpc]
<4> [<ffffffff81054839>] ? __wake_up_common+0x59/0x90
<4> [<ffffffffa066adad>] ptlrpc_main+0xaed/0x1740 [ptlrpc]
<4> [<ffffffffa066a2c0>] ? ptlrpc_main+0x0/0x1740 [ptlrpc]
<4> [<ffffffff8109aee6>] kthread+0x96/0xa0
<4> [<ffffffff8100c20a>] child_rip+0xa/0x20
<4> [<ffffffff8109ae50>] ? kthread+0x0/0xa0
<4> [<ffffffff8100c200>] ? child_rip+0x0/0x20
<4>Code: 00 48 8b 5d b8 45 31 ff 0f 1f 00 49 8b 46 30 31 d2 48 89 d9 44 89 ee 48 8b 7d c0 ff 50 20 48 8b 13 66 2e 0f 1f 84 00 00 00 00 00 <f0> 0f ba 2a 19 19 c9 85 c9 74 15 48 8b 0a f7 c1 00 00 00 02 74
<1>RIP [<ffffffffa0c3b6a0>] __iam_path_lookup+0x70/0x1f0 [osd_ldiskfs]
<4> RSP <ffff88085e2754b0>
<4>CR2: 0000000000000000

Any suggestions are welcome. Thanks!

Alfonso Pardo Diaz
System Administrator / Researcher
c/ Sola nº 1; 10200 Trujillo, ESPAÑA
Tel: +34 927 65 93 17
Fax: +34 927 32 32 37