Josephine Palencia
2009-May-23 01:32 UTC
[Lustre-discuss] builds: lustre-2.0.2alpha (1.9.181)+ HEAD (1.9.190) with kerberos enabled
A. Centos5.3, Lustre-2.0.2alpha (1.9.181) with kerberos enabled

Re: lustre-module complaining of wrong kernel

[root@mds02w x86_64]# rpm -ivh lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
error: Failed dependencies:
        kernel = 2.6.18-128.1.6-lustre-1.9.181 is needed by lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64

[root@mds02w x86_64]# uname -a
Linux mds02w.psc.teragrid.org 2.6.18-128.1.6-lustre-1.9.181 #3 SMP Tue May 19 03:15:43 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

The rpms built completely from the source.

[root@mds02w x86_64]# pwd
/usr/src/redhat/RPMS/x86_64
[root@mds02w x86_64]# ls
kernel-2.6.18128.1.6lustre1.9.181-2.x86_64.rpm
lustre-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
lustre-ldiskfs-4.0.1-2.6.18_128.1.6_lustre_1.9.181_200905190415.x86_64.rpm
lustre-ldiskfs-4.0.1-2.6.18_128.1.6_lustre_1.9.181_200905190427.x86_64.rpm
lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
lustre-source-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
lustre-tests-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm

The configure script identified the linux source path, and I booted into the right kernel before the lustre build. So I am not sure why I get the dependency error for the wrong kernel when I try to install the lustre-modules rpm.

[root@mds02w lustre-1.9.181]# pwd
/extra/lustre-1.9.181

$ ./configure --with-linux=/extra/linux-2.6.18-128.1.6 .... --enable-dependency-tracking --enable-posix-osd --enable-panic_dumplog --enable-gss --enable-health_write --enable-lru-resize --enable-liblustre-tests --enable-mindf --enable-quota --enable-lu_ref --with-ldiskfsprogs

I have done this many times with older 1.9.50 versions with no problems.

If I force --nodeps on lustre-modules install, system crashes as expected.
If I proceed with make install instead of make rpms, the resulting system crashes when I attempt to load the lustre module.

B. Centos5.3, HEAD (1.9.190) with kerberos

The lustre-modules rpm installs. However, the system crashes when I attempt to load the lustre module.

Crash message below:

[root@mds02w ~]# Assertion failure in journal_start() at fs/jbd/transaction.c:283: "handle->h_transaction->t_journal == journal"
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/jbd/transaction.c:283
invalid opcode: 0000 [1] SMP
last sysfs file: /class/misc/obd_psdev/dev
CPU 0
Modules linked in: lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ipt_MASQUERADE(U) iptable_nat(U) ip_nat(U) xt_state(U) ip_conntrack(U) nfnetlink(U) ipt_REJECT(U) xt_tcpudp(U) iptable_filter(U) ip_tables(U) x_tables(U) bridge(U) autofs4(U) hidp(U) l2cap(U) bluetooth(U) sunrpc(U) ipv6(U) xfrm_nalgo(U) crypto_api(U) dm_mirror(U) dm_multipath(U) scsi_dh(U) video(U) backlight(U) sbs(U) i2c_ec(U) button(U) battery(U) asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parport(U) sg(U) e100(U) tg3(U) mii(U) i2c_amd756(U) i2c_amd8111(U) k8_edac(U) k8temp(U) i2c_core(U) amd_rng(U) libphy(U) ide_cd(U) pcspkr(U) cdrom(U) hwmon(U) serio_raw(U) edac_mc(U) dm_raid45(U) dm_message(U) dm_region_hash(U) dm_log(U) dm_mod(U) dm_mem_cache(U) sata_sil(U) libata(U) shpchp(U) 3w_xxxx(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
Pid: 2790, comm: modprobe Tainted: G 2.6.18-prep #3
RIP: 0010:[<ffffffff880321c2>] [<ffffffff880321c2>] :jbd:journal_start+0x62/0x107
RSP: 0018:ffff810077701a28 EFLAGS: 00010282
RAX: 0000000000000073 RBX: ffff810027460ad8 RCX: ffffffff802f7aa8
RDX: ffffffff802f7aa8 RSI: 0000000000000000 RDI: ffffffff802f7aa0
RBP: ffff810001aa7400 R08: ffffffff802f7aa8 R09: 0000000000000046
R10: ffffffff803d9520 R11: 0000000000000000 R12: 000000000000000a
R13: ffff810040c2fa60 R14: 00000000000007c0 R15: 0000000000000780
FS: 00002abf4c117240(0000) GS:ffffffff803ac000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000002b04788 CR3: 0000000000201000 CR4: 00000000000006e0
Process modprobe (pid: 2790, threadinfo ffff810077700000, task ffff81007ef3b7e0)
Stack: 0000000000000040 ffff81003374ea68 000000000101e780 ffffffff8805223d
 0000000a00000000 0000000000000000 0000000000000000 0000000000000040
 ffff810040c2fa60 000000000101e780 0000000000000040 ffff810077701e98
Call Trace:
 [<ffffffff8805223d>] :ext3:ext3_prepare_write+0x42/0x17b
 [<ffffffff8000fc43>] generic_file_buffered_write+0x26c/0x6d3
 [<ffffffff8000e007>] current_fs_time+0x3b/0x40
 [<ffffffff800c888c>] zone_statistics+0x3e/0x6d
 [<ffffffff80016199>] __generic_file_aio_write_nolock+0x36c/0x3b8
 [<ffffffff80021169>] generic_file_aio_write+0x65/0xc1
 [<ffffffff8804e1a2>] :ext3:ext3_file_write+0x16/0x91
 [<ffffffff80017d30>] do_sync_write+0xc7/0x104
 [<ffffffff8000aed4>] release_pages+0x14e/0x15b
 [<ffffffff8009dbc4>] autoremove_wake_function+0x0/0x2e
 [<ffffffff8006dd2c>] do_gettimeofday+0x40/0x8f
 [<ffffffff8005a31f>] getnstimeofday+0x10/0x28
 [<ffffffff8005b486>] do_acct_process+0x517/0x54e
 [<ffffffff8000d0d2>] dput+0x2c/0x113
 [<ffffffff800494f8>] acct_process+0x45/0x50
 [<ffffffff80015305>] do_exit+0x2bb/0x91f
 [<ffffffff80048c1c>] cpuset_exit+0x0/0x6c
 [<ffffffff8005d28d>] tracesys+0xd5/0xe0

Code: 0f 0b 68 59 99 03 88 c2 1b 01 ff 43 0c e9 8b 00 00 00 48 8b
RIP [<ffffffff880321c2>] :jbd:journal_start+0x62/0x107
 RSP <ffff810077701a28>
<0>Kernel panic - not syncing: Fatal exception
<1>LustreError: dumping log to /tmp/lustre-log.1243027259.2790

The Lustre rpms of course work without kerberos.

Please advise.

Thanks,
-josephin
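
Editorial sketch, not part of the original post: rpm resolves dependencies against the capabilities recorded in its package database, not against the kernel that happens to be running, so it can help to put the two values side by side. The helper names required_kernel and kernel_matches are hypothetical and exist only for this illustration; the parsing assumes the "Failed dependencies" wording shown in the message above.

```shell
#!/bin/sh
# Hypothetical helpers for illustration only; they are not part of rpm or Lustre.

# required_kernel: read rpm's "Failed dependencies" output on stdin and
# print the kernel version the package asks for (the text after "kernel = ").
required_kernel() {
    sed -n 's/^[[:space:]]*kernel = \([^[:space:]]*\) is needed by .*/\1/p'
}

# kernel_matches: succeed if the required version ($1) is identical to the
# running kernel release ($2, typically the output of `uname -r`).
kernel_matches() {
    [ "$1" = "$2" ]
}
```

One possible use is `rpm -ivh --test <package>.rpm 2>&1 | required_kernel`, followed by a comparison with `uname -r`. If the two strings match, as they do in the output posted above, the running kernel is not the problem, and the missing piece is more likely an installed package that declares the "kernel = <version>" capability.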
Brian J. Murrell
2009-May-23 15:44 UTC
[Lustre-discuss] builds: lustre-2.0.2alpha (1.9.181)+ HEAD (1.9.190) with kerberos enabled
On Fri, 2009-05-22 at 21:32 -0400, Josephine Palencia wrote:
>
> A. Centos5.3, Lustre-2.0.2alpha (1.9.181) with kerberos enabled
>
> Re: lustre-module complaining of wrong kernel
>
> [root@mds02w x86_64]# rpm -ivh
> lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
> error: Failed dependencies:
> kernel = 2.6.18-128.1.6-lustre-1.9.181 is needed by
> lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64

What does "rpm -q --provides kernel-lustre" report?

As you might have noticed, we have transitioned to building our kernel with the vendor's spec. In that transition there was a bug, 19163, in which we didn't provide the "kernel = <version>" that is provided implicitly with the vendor's native binary package. I don't recall whether the fix for that bug landed on HEAD prior to or after 2.0.2a though.

But now that you point this all out, I think it would be better for lustre-modules to require "kernel-lustre = <version>" rather than "kernel = <version>". That way we know we have the lustre version of the kernel.

> The rpms built completely from the source.
> [root@mds02w x86_64]# pwd
> /usr/src/redhat/RPMS/x86_64
> [root@mds02w x86_64]# ls
> kernel-2.6.18128.1.6lustre1.9.181-2.x86_64.rpm

^ Hrm. This is strange. There should be a dash between those. Is it really missing in your /usr/src/redhat/RPMS/x86_64 dir?

> If I force --nodeps on lustre-modules install, system crashes as expected.

I wouldn't say it's expected to crash. All you have so far is an RPM dependency mismatch. Forcing that installation should not cause the system to crash. What does the crash look like?

> If I proceed with make install instead of make rpms, the resulting
> system crashes when I attempt to load lustre module.

> [root@mds02w ~]# Assertion failure in journal_start() at
> fs/jbd/transaction.c:283: "handle->h_transaction->t_journal == journal"
> ----------- [cut here ] --------- [please bite here ] ---------

This looks like a real bug, and if the crash you report from scenario A, after forcing the lustre-modules RPM to install, is the same as this one, this crash is unrelated to the RPM dependencies.

Cheers,
b.
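
Editorial sketch, not part of the original reply: the "rpm -q --provides kernel-lustre" question can be turned into a yes/no check on whether the capability "kernel = <version>" is declared. The function name provides_kernel is hypothetical, and the package name kernel-lustre is taken from the question in the reply; whether that package is actually installed under that name on a given system is an assumption.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; it is not part of rpm or Lustre.

# provides_kernel: read the output of `rpm -q --provides <package>` on stdin
# and succeed only if one line is exactly "kernel = $1".
# Trailing whitespace is stripped first so padded lines still compare equal.
provides_kernel() {
    sed 's/[[:space:]]*$//' | grep -Fqx "kernel = $1"
}
```

A possible invocation is `rpm -q --provides kernel-lustre | provides_kernel "2.6.18-128.1.6-lustre-1.9.181"`, using the version string from the dependency error. A non-zero exit status would be consistent with the missing "kernel = <version>" provide described for bug 19163, though it does not by itself confirm that bug.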