Chris Worley
2008-Mar-04 16:06 UTC
[Lustre-discuss] ko2iblnd panics in kiblnd_map_tx_descs
I'm trying to port Lustre 1.6.4.2 to OFED 1.2.5.5 w/ the RHEL kernel 2.6.9.67.0.4.

ksocklnd-based mounts work fine, but when I try to mount over IB, I get a panic in ko2iblnd in the transmit descriptor mapping routine:

general protection fault: 0000 [1] SMP
CPU 1
Modules linked in: ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) nfs(U) lockd(U) nfs_acl(U) sunrpc(U) rdma_ucm(U) ib_sdp(U) rdma_cm(U) iw_cm(U) ib_addr(U) mlx4_ib(U) mlx4_core(U) ib_mthca(U) dm_mod(U) ib_ipoib(U) md5(U) ipv6(U) ib_umad(U) ib_ucm(U) ib_uverbs(U) ib_cm(U) ib_sa(U) ib_mad(U) ib_core(U) aic79xx(U) e1000(U) ext3(U) jbd(U) raid0(U) mptscsih(U) mptsas(U) mptspi(U) mptscsi(U) mptbase(U) sd_mod(U) ata_piix(U) libata(U) scsi_mod(U)
Pid: 5141, comm: modprobe Not tainted 2.6.9-67.0.4.EL-Lustre-1.6.4.2
RIP: 0010:[<ffffffffa04659d1>] <ffffffffa04659d1>{:ko2iblnd:kiblnd_map_tx_descs+225}
RSP: 0000:00000102105d7cd8 EFLAGS: 00010286
RAX: ffffffffa01e6b4e RBX: ffffff0010028000 RCX: 0000000000000001
RDX: 0000000000001000 RSI: 000001020e705000 RDI: 00000102154e2000
RBP: 00000102102c4200 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 00000000ffffffff R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 00000102102c4228
FS: 0000002a958a0b00(0000) GS:ffffffff8046ac00(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000002a9598200f CR3: 000000009fa08000 CR4: 00000000000006e0
Process modprobe (pid: 5141, threadinfo 00000102105d6000, task 00000102175e0030)
Stack: 0000000000000000 00000102102c4080 00000102102c4100 00000102102c4200
       00000102179c2b86 00000102177df400 0000010215548ac0 ffffffffa0466fdf
       00000102179c2b85 0000000000000000
Call Trace:<ffffffffa0466fdf>{:ko2iblnd:kiblnd_startup+2239}
       <ffffffffa03043dc>{:lnet:lnet_startup_lndnis+332}
       <ffffffffa02d2f38>{:libcfs:cfs_alloc+40}
       <ffffffffa0305206>{:lnet:LNetNIInit+278}
       <ffffffffa03fcb0a>{:ptlrpc:ptlrpc_ni_init+106}
       <ffffffff8012f9cd>{default_wake_function+0}
       <ffffffffa03fcbfa>{:ptlrpc:ptlrpc_init_portals+10}
       <ffffffff8012f9cd>{default_wake_function+0}
       <ffffffffa045f22b>{:ptlrpc:init_module+267}
       <ffffffff8014bc0a>{sys_init_module+278}
       <ffffffff8010f23e>{system_call+126}

Code: ff 50 08 eb 12 48 8b 3f b9 01 00 00 00 ba 00 10 00 00 e8 30
RIP <ffffffffa04659d1>{:ko2iblnd:kiblnd_map_tx_descs+225} RSP <00000102105d7cd8>

Does this ring any bells? Otherwise, any debugging tips?

Shane said that they get an oops if they compile with the "version specific OFA tree". Is this the Oops?

Thanks,

Chris
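On the debugging side, one small aid when comparing oopses between builds is to reduce the call trace to a list of module, function and offset. The following is only a rough sketch, assuming frames are printed in the `<address>{:module:function+offset}` form seen in the trace above; the names `FRAME_RE` and `parse_frames` are made up for this example.

```python
import re

# Matches frames such as <ffffffffa0466fdf>{:ko2iblnd:kiblnd_startup+2239}
# and <ffffffff8012f9cd>{default_wake_function+0} (no module prefix).
FRAME_RE = re.compile(r"<([0-9a-f]{16})>\{(?::(\w+):)?(\w+)\+(\d+)\}")


def parse_frames(oops_text):
    """Return (address, module, function, offset) for each frame found.

    module is None when the frame has no ":module:" prefix, i.e. the
    symbol lives in the core kernel rather than in a loadable module.
    """
    frames = []
    for addr, module, func, offset in FRAME_RE.findall(oops_text):
        frames.append((int(addr, 16), module or None, func, int(offset)))
    return frames


# A few frames copied from the trace in this thread.
sample = (
    "Call Trace:<ffffffffa0466fdf>{:ko2iblnd:kiblnd_startup+2239} "
    "<ffffffffa03043dc>{:lnet:lnet_startup_lndnis+332} "
    "<ffffffff8014bc0a>{sys_init_module+278}"
)

for addr, module, func, offset in parse_frames(sample):
    print(f"{addr:#018x}  {module or 'kernel':<10} {func}+{offset}")
```

Feeding it the whole oops also picks up the RIP lines, which show that the fault is in ko2iblnd itself while it is being started from `kiblnd_startup`.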
Canon, Richard Shane
2008-Mar-04 16:47 UTC
[Lustre-discuss] ko2iblnd panics in kiblnd_map_tx_descs
Chris,

Which headers did you point Lustre to? Try using the non-versioned one (i.e. ofa_kernel, not ofa_kernel-1.2.5.5).

Shane
Hi Chris,

To resolve your problem, please:

1. Apply this patch to your lnet: https://bugzilla.lustre.org/attachment.cgi?id=15733
2. Make sure you pass this option to configure: --with-o2ib=/path/to/ofed
3. Copy /path/to/ofed/Module.symvers to your $LUSTRE before building.

Regards
Liang
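Steps 2 and 3 of Liang's list can be sketched as a small helper. This is an illustration only: the function name `stage_o2ib_build` is hypothetical, both paths are placeholders you supply, and the patch from step 1 is assumed to have been applied by hand already (the thread does not say which `-p` level it needs).

```python
import shutil
from pathlib import Path


def stage_o2ib_build(lustre_src, ofed_src):
    """Copy OFED's Module.symvers into the Lustre tree (step 3) and
    return the configure command line from step 2.

    Nothing is executed here; the caller decides when to run configure.
    """
    lustre_src = Path(lustre_src)
    ofed_src = Path(ofed_src)

    symvers = ofed_src / "Module.symvers"
    if not symvers.is_file():
        # Usually means the OFED tree was not built, or the path is wrong.
        raise FileNotFoundError(f"no Module.symvers under {ofed_src}")

    # Step 3: put the OFED symbol versions at the top of the Lustre tree.
    shutil.copy(symvers, lustre_src / "Module.symvers")

    # Step 2: point configure at the same OFED tree.
    return ["./configure", f"--with-o2ib={ofed_src}"]
```

The returned list can then be run from the Lustre source directory, for example with `subprocess.run(cmd, cwd=lustre_src, check=True)`, together with whatever other configure options the build already uses.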
Sébastien Buisson
2008-Mar-05 14:24 UTC
[Lustre-discuss] ko2iblnd panics in kiblnd_map_tx_descs
Hi there,

I have the same problem described by Chris, so I am interested in the solution. ;-)
The problem is that I do not understand where to copy the OFED Module.symvers. I tried putting it in the directory where I build Lustre (its subdirectories are build, ldiskfs, libsysio, lnet, lustre and snmp), but I still get the "undefined symbols" messages when building.

Any idea?

Cheers,
Sebastien.
Chris Worley
2008-Mar-05 14:31 UTC
[Lustre-discuss] ko2iblnd panics in kiblnd_map_tx_descs
On Wed, Mar 5, 2008 at 7:30 AM, Chris Worley <worleys at gmail.com> wrote:
> On Wed, Mar 5, 2008 at 7:24 AM, Sébastien Buisson <sebastien.buisson at bull.net> wrote:
> > The problem is that I do not understand where to copy the OFED
> > Module.symvers.
>
> It goes in your kernel source directory, i.e. /usr/src/linux/.

And don't copy it, append it... and make sure no IB symbols are in there already.

Chris
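Chris's "append, and make sure no IB symbols are in there already" advice can be sketched as a merge that refuses to duplicate symbols. This is a minimal sketch, assuming the usual Module.symvers layout of whitespace-separated fields with the symbol name in the second field; `merge_symvers` is a name invented for this example, and it works on text so that nothing on disk is touched until you have looked at the result.

```python
def merge_symvers(kernel_text, ofed_text):
    """Append OFED symbol lines to the kernel's Module.symvers text.

    Returns (merged_text, clashes). clashes lists symbols that the kernel
    file already defines; those OFED lines are NOT appended, so the caller
    can decide whether stale in-kernel IB entries must be removed first.
    """
    def symbol(line):
        fields = line.split()
        return fields[1] if len(fields) >= 2 else None

    present = {symbol(line) for line in kernel_text.splitlines()}
    present.discard(None)

    merged = kernel_text
    if merged and not merged.endswith("\n"):
        merged += "\n"

    clashes = []
    for line in ofed_text.splitlines():
        name = symbol(line)
        if name is None:
            continue  # blank or malformed line
        if name in present:
            clashes.append(name)
        else:
            merged += line + "\n"
            present.add(name)
    return merged, clashes
```

A non-empty `clashes` list is the case Chris warns about: the kernel tree already carries entries for those symbols, and they should be dealt with before the OFED ones are appended to the Module.symvers in the kernel source directory.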
Hi Sebastien,

If it doesn't work when you put it in $LUSTRE/, then please put it in $LUSTRE/lnet/klnds/o2iblnd and try again. If it still doesn't work, please read OFED/doc/OFED_tips.txt; it has a section on problems like this.

Regards
Liang