Arya Mazaheri
2011-Feb-21 12:27 UTC
[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
Hi there,
I have configured and run Lustre 2.0 with TCP (OSS and MDS on the same server) without problems. Now I am trying to run Lustre with InfiniBand support, but whenever I mount the MDT storage on the server, the process ends with the following error:
kernel panic - not syncing: fatal exception

My /etc/modprobe.conf is:
options lnet networks="o2ib0(ib0)"

Last lines of dmesg:
----------------------------------------------------
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sda2, internal journal
LDISKFS-fs: recovery complete.
LDISKFS-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sda2, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
LustreError: 6041:0:(o2iblnd.c:2501:kiblnd_startup()) Can't query IPoIB interface ib0: it's down
LustreError: 6041:0:(o2iblnd.c:2501:kiblnd_startup()) Skipped 1 previous similar message
eth0: no IPv6 routers present
LustreError: 105-4: Error -100 starting up LNI o2ib
LustreError: Skipped 1 previous similar message
LustreError: 6041:0:(events.c:731:ptlrpc_init_portals()) network initialisation failed
LustreError: 158-c: Can't load module 'mgs'
LustreError: 6035:0:(genops.c:286:class_newdev()) OBD: unknown type: mgs
LustreError: 6035:0:(obd_config.c:300:class_attach()) Cannot create device MGS of type mgs : -19
LustreError: 6035:0:(obd_mount.c:502:lustre_start_simple()) MGS attach error -19
LustreError: 15e-a: Failed to start MGS 'MGS' (-19). Is the 'mgs' module loaded?
LustreError: 6035:0:(obd_mount.c:1492:server_put_super()) no obd lustre-MDTffff
LustreError: 6035:0:(obd_mount.c:137:server_deregister_mount()) lustre-MDTffff not registered
Lustre: server umount lustre-MDTffff complete
LustreError: 6035:0:(obd_mount.c:2136:lustre_fill_super()) Unable to mount (-19)
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sdb1, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sdb1, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LustreError: 6117:0:(events.c:731:ptlrpc_init_portals()) network initialisation failed
LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
LDISKFS-fs: mballoc: 0 generated and it took 0
LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sdb2, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sdb2, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LustreError: 6193:0:(events.c:731:ptlrpc_init_portals()) network initialisation failed
LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
LDISKFS-fs: mballoc: 0 generated and it took 0
LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sdb3, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sdb3, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LustreError: 6269:0:(o2iblnd.c:2501:kiblnd_startup()) Can't query IPoIB interface ib0: it's down
LustreError: 6269:0:(o2iblnd.c:2501:kiblnd_startup()) Skipped 2 previous similar messages
LustreError: 105-4: Error -100 starting up LNI o2ib
LustreError: Skipped 2 previous similar messages
LustreError: 6269:0:(events.c:731:ptlrpc_init_portals()) network initialisation failed
LustreError: 158-c: Can't load module 'mgc'
LustreError: Skipped 2 previous similar messages
LustreError: 6263:0:(genops.c:286:class_newdev()) OBD: unknown type: mgc
LustreError: 6263:0:(genops.c:286:class_newdev()) Skipped 2 previous similar messages
LustreError: 6263:0:(obd_config.c:300:class_attach()) Cannot create device MGC0 at lo of type mgc : -19
LustreError: 6263:0:(obd_config.c:300:class_attach()) Skipped 2 previous similar messages
LustreError: 6263:0:(obd_mount.c:502:lustre_start_simple()) MGC0 at lo attach error -19
LustreError: 6263:0:(obd_mount.c:502:lustre_start_simple()) Skipped 2 previous similar messages
LustreError: 6263:0:(obd_mount.c:1492:server_put_super()) no obd lustre-OST0002
LustreError: 6263:0:(obd_mount.c:1492:server_put_super()) Skipped 2 previous similar messages
LustreError: 6263:0:(obd_mount.c:137:server_deregister_mount()) lustre-OST0002 not registered
LustreError: 6263:0:(obd_mount.c:137:server_deregister_mount()) Skipped 2 previous similar messages
LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
LDISKFS-fs: mballoc: 0 generated and it took 0
LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
Lustre: server umount lustre-OST0002 complete
Lustre: Skipped 2 previous similar messages
LustreError: 6263:0:(obd_mount.c:2136:lustre_fill_super()) Unable to mount (-19)
LustreError: 6263:0:(obd_mount.c:2136:lustre_fill_super()) Skipped 2 previous similar messages
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sdb4, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
LDISKFS-fs warning: maximal mount count reached, running e2fsck is recommended
LDISKFS FS on sdb4, internal journal
LDISKFS-fs: mounted filesystem with ordered data mode.
LDISKFS-fs: file extents enabled
LDISKFS-fs: mballoc enabled
LustreError: 6345:0:(events.c:731:ptlrpc_init_portals()) network initialisation failed
LDISKFS-fs: mballoc: 0 blocks 0 reqs (0 success)
LDISKFS-fs: mballoc: 0 extents scanned, 0 goal hits, 0 2^N hits, 0 breaks, 0 lost
LDISKFS-fs: mballoc: 0 generated and it took 0
LDISKFS-fs: mballoc: 0 preallocated, 0 discarded
------------------------------------------------------------------------------------

Any ideas?
Sébastien Buisson
2011-Feb-21 12:45 UTC
[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
Hi,

The important bit is:
LustreError: 6041:0:(o2iblnd.c:2501:kiblnd_startup()) Can't query IPoIB interface ib0: it's down

Lustre requires IPoIB interfaces to be set up on all Lustre nodes. This does not mean Lustre will use the IP stack on top of InfiniBand to transfer data, but IPoIB addresses are used as identifiers to establish the initial InfiniBand connections (Queue Pairs and so on).

Cheers,
Sebastien.
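As a rough pre-flight check for the IPoIB requirement described in the reply above, the sketch below tests whether an interface is flagged UP and has an IPv4 address before LNet is started. The helper names `is_up` and `has_ipv4` are made up for illustration and are not part of Lustre; they only parse the one-line output of the iproute2 `ip -o` commands, which are assumed to be installed.

```shell
#!/bin/sh
# is_up: read `ip -o link show dev IFACE` output on stdin and succeed
# if the flag list (the <...> part) contains UP.
# (Hypothetical helper name, for illustration only.)
is_up() {
    grep -Eq '<([^>]*,)?UP(,[^>]*)?>'
}

# has_ipv4: read `ip -o -4 addr show dev IFACE` output on stdin and
# succeed if at least one "inet A.B.C.D/len" entry is present.
# (Hypothetical helper name, for illustration only.)
has_ipv4() {
    grep -Eq 'inet [0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/[0-9]+'
}

# Example use on a live node (assumes iproute2 and an ib0 interface):
#   if ip -o link show dev ib0 | is_up &&
#      ip -o -4 addr show dev ib0 | has_ipv4; then
#       echo "ib0 looks ready for the o2ib LND"
#   else
#       echo "ib0 is down or has no IPv4 address; configure IPoIB first" >&2
#   fi
```

The checks read text from stdin rather than calling `ip` themselves, so they can be tried against saved command output from a node that cannot be touched.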
Johann Lombardi
2011-Feb-21 12:49 UTC
[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
On Mon, Feb 21, 2011 at 03:57:10PM +0330, Arya Mazaheri wrote:
> kernel panic - not syncing: fatal exception
>
> my /etc/modprobe.conf is:
> options lnet networks="o2ib0(ib0)"
> [...]
> LustreError: 6041:0:(o2iblnd.c:2501:kiblnd_startup()) Can't query IPoIB interface ib0: it's down

Although IP is not used for any communication, the o2ib LND requires IP over IB addresses to identify nodes. That said, you should definitely not get a kernel panic for such a misconfiguration; this is really a bug.

Cheers,
Johann
--
Johann Lombardi
Whamcloud, Inc.
www.whamcloud.com
Albert Everett
2011-Feb-21 16:01 UTC
[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
What's the output of

# ifconfig ib0

Albert
Arya Mazaheri
2011-Feb-21 19:17 UTC
[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
Problem solved. I was trying to set the IP of ib0 with this command:

ifconfig ib0 192.168.1.1 netmask 255.255.255.0 up

but it led to the kernel panic. So I set the IP address by adding it to network-scripts instead, and it works now. I really don't know why setting the IP with ifconfig doesn't work. So weird...
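For readers wondering what the network-scripts approach mentioned in the resolution above might look like, here is a minimal sketch for a RHEL-style system. The file path and key names are the usual initscripts conventions and are an assumption, since the actual file was not posted in this thread; the address and netmask are the ones from the ifconfig command quoted in the resolution.

```shell
# /etc/sysconfig/network-scripts/ifcfg-ib0
# (assumed path and keys, RHEL-style initscripts; not quoted from the thread)
DEVICE=ib0
BOOTPROTO=static
IPADDR=192.168.1.1
NETMASK=255.255.255.0
ONBOOT=yes
```

With a file like this in place, the interface would normally be brought up by the network service at boot, so ib0 already has its address by the time the Lustre targets are mounted.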
CHU, STEPHEN H (ATTSI)
2011-Feb-21 19:23 UTC
[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
Try adding the following to modprobe.conf, with the alias ahead of the "options lnet..." line, so that it is applied at the next reboot:

    alias ib0 ib_ipoib
    options lnet networks="o2ib0(ib0)"

On the running node, do:

    modprobe ib_ipoib

I've run Lustre 2.0 with InfiniBand without any problems. For RHEL 5.4 and beyond, I had to modify the stock openibd script to unload Lustre before stopping the network, or else it would hang.

Steve

> -----Original Message-----
> From: Albert Everett [mailto:aeeverett at ualr.edu]
> Sent: Monday, February 21, 2011 11:01 AM
> To: Arya Mazaheri
> Cc: lustre-discuss at lists.lustre.org
> Subject: Re: [Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
>
> What's output of
>
> # ifconfig ib0
>
> Albert
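The "Can't query IPoIB interface ib0: it's down" message in the quoted log suggests checking the interface state before mounting any Lustre target. Below is a minimal sketch of such a check, assuming a Linux sysfs layout where /sys/class/net/<iface>/operstate reports the link state. The function name ipoib_ready and the SYSFS_NET override are hypothetical helpers for illustration, not part of Lustre or OFED.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not shipped with Lustre or OFED).
# SYSFS_NET can be overridden, which is only useful for testing.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"

# ipoib_ready IFACE
# Succeeds only if IFACE exists and its operstate file reads "up".
ipoib_ready() {
    state_file="$SYSFS_NET/$1/operstate"
    # A missing or unreadable file means the interface is absent.
    [ -r "$state_file" ] || return 1
    [ "$(cat "$state_file")" = "up" ]
}
```

A mount wrapper could then run something like `ipoib_ready ib0 && mount -t lustre <device> <mountpoint>`, so that the mount is not attempted while ib0 is still down.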
Albert Everett
2011-Feb-21 19:49 UTC
[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
Here's all we needed. Yours is probably similar.

# cat /etc/sysconfig/network-scripts/ifcfg-ib0
DEVICE=ib0
IPADDR=192.168.2.1
NETMASK=255.255.255.0
BOOTPROTO=static
ONBOOT=yes

On Feb 21, 2011, at 1:17 PM, Arya Mazaheri wrote:

> Problem solved. I was trying to set the IP of ib0 with this command:
> ifconfig ib0 192.168.1.1 netmask 255.255.255.0 up
>
> but it led to the kernel panic. So I set the IP address by adding it
> to network-scripts instead, and it works now...
> I really don't know why setting the IP with ifconfig doesn't work. So
> weird...
>
> On Mon, Feb 21, 2011 at 7:31 PM, Albert Everett <aeeverett at ualr.edu>
> wrote:
> What's output of
>
> # ifconfig ib0
>
> Albert
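The ifcfg-ib0 file shown in this message can also be generated rather than typed by hand, which helps when several servers need the same layout with different addresses. This is only a sketch: render_ifcfg is a hypothetical helper name, the five keys are exactly those from the file above, and writing the result into /etc/sysconfig/network-scripts/ is assumed to be done separately as root.

```shell
#!/bin/sh
# render_ifcfg DEVICE IPADDR NETMASK
# Hypothetical helper: prints a static ifcfg file body on stdout,
# using the same five keys as the ifcfg-ib0 example in this thread.
render_ifcfg() {
    printf 'DEVICE=%s\n' "$1"
    printf 'IPADDR=%s\n' "$2"
    printf 'NETMASK=%s\n' "$3"
    printf 'BOOTPROTO=static\n'
    printf 'ONBOOT=yes\n'
}
```

For example, `render_ifcfg ib0 192.168.2.1 255.255.255.0 > ifcfg-ib0` writes the file body to the current directory for review before it is copied into place.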
Arya Mazaheri
2011-Feb-21 20:49 UTC
[Lustre-discuss] Kernel Panic error while running lustre 2.0 with infiniband
Yep! You're right...

On Mon, Feb 21, 2011 at 11:19 PM, Albert Everett <aeeverett at ualr.edu> wrote:

> Here's all we needed. Yours is probably similar.
>
> # cat /etc/sysconfig/network-scripts/ifcfg-ib0
> DEVICE=ib0
> IPADDR=192.168.2.1
> NETMASK=255.255.255.0
> BOOTPROTO=static
> ONBOOT=yes