Josephine Palencia
2009-May-23  01:32 UTC
[Lustre-discuss] builds: lustre-2.0.2alpha (1.9.181)+ HEAD (1.9.190) with kerberos enabled
A.  Centos5.3, Lustre-2.0.2alpha (1.9.181) with kerberos enabled
Re:  lustre-module complaining of wrong kernel
[root at mds02w x86_64]# rpm -ivh 
lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
error: Failed dependencies:
         kernel = 2.6.18-128.1.6-lustre-1.9.181 is needed by 
lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64
[root at mds02w x86_64]# uname -a
Linux mds02w.psc.teragrid.org 2.6.18-128.1.6-lustre-1.9.181 #3 SMP Tue May 19
03:15:43 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
The rpms built completely from the source.
[root at mds02w x86_64]# pwd
/usr/src/redhat/RPMS/x86_64
[root at mds02w x86_64]# ls
kernel-2.6.18128.1.6lustre1.9.181-2.x86_64.rpm
lustre-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
lustre-ldiskfs-4.0.1-2.6.18_128.1.6_lustre_1.9.181_200905190415.x86_64.rpm
lustre-ldiskfs-4.0.1-2.6.18_128.1.6_lustre_1.9.181_200905190427.x86_64.rpm
lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
lustre-source-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
lustre-tests-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
Configure script identified linux source path.
Booted to the right kernel before lustre build.
So not sure why I get the dep error for wrong kernel when I try to install 
lustre-module rpm.
[root at mds02w lustre-1.9.181]# pwd
/extra/lustre-1.9.181
$ ./configure --with-linux=/extra/linux-2.6.18-128.1.6 ....
--enable-dependency-tracking --enable-posix-osd --enable-panic_dumplog 
--enable-gss --enable-health_write --enable-lru-resize --enable-liblustre-tests
--enable-mindf --enable-quota --enable-lu_ref --with-ldiskfsprogs
Have done this many times in older 1.9.50 versions with no problems.
If I force --nodeps on lustre-modules install, system crashes as expected.
If I proceed with make install instead of make rpms, the resulting 
system crashes when I attempt to load lustre module.
B. Centos5.3,  HEAD (1.9.190) with kerberos
Lustre-module rpm installs.
However the system crashes when I attempt to load lustre module
Crash message below:
[root at mds02w ~]# Assertion failure in journal_start() at 
fs/jbd/transaction.c:283: "handle->h_transaction->t_journal ==
journal"
----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at fs/jbd/transaction.c:283
invalid opcode: 0000 [1] SMP
last sysfs file: /class/misc/obd_psdev/dev
CPU 0
Modules linked in: lustre(U) lov(U) osc(U) lquota(U) mdc(U) fid(U) fld(U) 
ksocklnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) libcfs(U) ipt
_MASQUERADE(U) iptable_nat(U) ip_nat(U) xt_state(U) ip_conntrack(U) 
nfnetlink(U) ipt_REJECT(U) xt_tcpudp(U) iptable_filter(U) ip_tables(U
) x_tables(U) bridge(U) autofs4(U) hidp(U) l2cap(U) bluetooth(U) sunrpc(U) 
ipv6(U) xfrm_nalgo(U) crypto_api(U) dm_mirror(U) dm_multipath(
U) scsi_dh(U) video(U) backlight(U) sbs(U) i2c_ec(U) button(U) battery(U) 
asus_acpi(U) acpi_memhotplug(U) ac(U) parport_pc(U) lp(U) parpo
rt(U) sg(U) e100(U) tg3(U) mii(U) i2c_amd756(U) i2c_amd8111(U) k8_edac(U) 
k8temp(U) i2c_core(U) amd_rng(U) libphy(U) ide_cd(U) pcspkr(U)
cdrom(U) hwmon(U) serio_raw(U) edac_mc(U) dm_raid45(U) dm_message(U) 
dm_region_hash(U) dm_log(U) dm_mod(U) dm_mem_cache(U) sata_sil(U) li
bata(U) shpchp(U) 3w_xxxx(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) 
ohci_hcd(U) ehci_hcd(U)
Pid: 2790, comm: modprobe Tainted: G      2.6.18-prep #3
RIP: 0010:[<ffffffff880321c2>]  [<ffffffff880321c2>] 
:jbd:journal_start+0x62/0x107
RSP: 0018:ffff810077701a28  EFLAGS: 00010282
RAX: 0000000000000073 RBX: ffff810027460ad8 RCX: ffffffff802f7aa8
RDX: ffffffff802f7aa8 RSI: 0000000000000000 RDI: ffffffff802f7aa0
RBP: ffff810001aa7400 R08: ffffffff802f7aa8 R09: 0000000000000046
R10: ffffffff803d9520 R11: 0000000000000000 R12: 000000000000000a
R13: ffff810040c2fa60 R14: 00000000000007c0 R15: 0000000000000780
FS:  00002abf4c117240(0000) GS:ffffffff803ac000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000002b04788 CR3: 0000000000201000 CR4: 00000000000006e0
Process modprobe (pid: 2790, threadinfo ffff810077700000, task 
ffff81007ef3b7e0)
Stack:  0000000000000040 ffff81003374ea68 000000000101e780 ffffffff8805223d
  0000000a00000000 0000000000000000 0000000000000000 0000000000000040
  ffff810040c2fa60 000000000101e780 0000000000000040 ffff810077701e98
Call Trace:
  [<ffffffff8805223d>] :ext3:ext3_prepare_write+0x42/0x17b
  [<ffffffff8000fc43>] generic_file_buffered_write+0x26c/0x6d3
  [<ffffffff8000e007>] current_fs_time+0x3b/0x40
  [<ffffffff800c888c>] zone_statistics+0x3e/0x6d
  [<ffffffff80016199>] __generic_file_aio_write_nolock+0x36c/0x3b8
  [<ffffffff80021169>] generic_file_aio_write+0x65/0xc1
  [<ffffffff8804e1a2>] :ext3:ext3_file_write+0x16/0x91
  [<ffffffff80017d30>] do_sync_write+0xc7/0x104
  [<ffffffff8000aed4>] release_pages+0x14e/0x15b
  [<ffffffff8009dbc4>] autoremove_wake_function+0x0/0x2e
  [<ffffffff8006dd2c>] do_gettimeofday+0x40/0x8f
  [<ffffffff8005a31f>] getnstimeofday+0x10/0x28
  [<ffffffff8005b486>] do_acct_process+0x517/0x54e
  [<ffffffff8000d0d2>] dput+0x2c/0x113
  [<ffffffff800494f8>] acct_process+0x45/0x50
  [<ffffffff80015305>] do_exit+0x2bb/0x91f
  [<ffffffff80048c1c>] cpuset_exit+0x0/0x6c
  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Code: 0f 0b 68 59 99 03 88 c2 1b 01 ff 43 0c e9 8b 00 00 00 48 8b
RIP  [<ffffffff880321c2>] :jbd:journal_start+0x62/0x107
  RSP <ffff810077701a28>
  <0>Kernel panic - not syncing: Fatal exception
  <1>LustreError: dumping log to /tmp/lustre-log.1243027259.2790
Lustre rpms of course work without kerberos.
Please advise.
Thanks,
-josephin
Brian J. Murrell
2009-May-23  15:44 UTC
[Lustre-discuss] builds: lustre-2.0.2alpha (1.9.181)+ HEAD (1.9.190) with kerberos enabled
On Fri, 2009-05-22 at 21:32 -0400, Josephine Palencia wrote:
>
> A.  Centos5.3, Lustre-2.0.2alpha (1.9.181) with kerberos enabled
>
> Re:  lustre-module complaining of wrong kernel
>
> [root at mds02w x86_64]# rpm -ivh
> lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64.rpm
> error: Failed dependencies:
>          kernel = 2.6.18-128.1.6-lustre-1.9.181 is needed by
> lustre-modules-1.9.181-2.6.18_128.1.6_lustre_1.9.181_200905190414.x86_64

What does "rpm -q --provides kernel-lustre" report?

As you might have noticed, we have transitioned to building our kernel
with the vendor's spec.  In that transition there was a bug, 19163, in
which we didn't provide the "kernel = <version>" that is provided
implicitly with the vendor's native binary package.  I don't recall
whether the fix for that bug landed on HEAD prior to or after 2.0.2a
though.

But now that you point this all out, I think it would be better for
lustre-modules to require "kernel-lustre = <version>" rather than
"kernel = <version>".  That way we know we have the lustre version of
the kernel.

> The rpms built completely from the source.
> [root at mds02w x86_64]# pwd
> /usr/src/redhat/RPMS/x86_64
> [root at mds02w x86_64]# ls
> kernel-2.6.18128.1.6lustre1.9.181-2.x86_64.rpm
^
Hrm.  This is strange.  There should be a dash between those.  Is it
really missing in your /usr/src/redhat/RPMS/x86_64 dir?

> If I force --nodeps on lustre-modules install, system crashes as expected.

I wouldn't say it's expected to crash.  All you have so far is an RPM
dependency mismatch.  Forcing that installation should not cause the
system to crash.
What does the crash look like?

> If I proceed with make install instead of make rpms, the resulting
> system system crashes when I attempt to load lustre module.

> [root at mds02w ~]# Assertion failure in journal_start() at
> fs/jbd/transaction.c:283: "handle->h_transaction->t_journal == journal"
> ----------- [cut here ] --------- [please bite here ] ---------

This looks like a real bug, and if the crash you report from scenario A,
after forcing the lustre-modules RPM to install is the same as this one,
this crash is unrelated to the RPM dependencies.

Cheers,
b.
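
The dependency check discussed in this thread can be sketched roughly as follows. This is only an illustration, not something either poster ran: `has_capability` is a hypothetical helper, and the `provided` value is an assumed example of what a kernel package affected by the missing-Provides bug might report. The real lists would come from `rpm -qp --requires` on the lustre-modules package and `rpm -q --provides` on the installed kernel package, as noted in the comments.

```shell
#!/bin/sh
# Hypothetical helper (not part of rpm or Lustre): succeeds if the exact
# capability string $1 appears as a whole line in the provides list $2.
# Trailing whitespace is stripped first, since some rpm versions pad lines.
has_capability() {
    printf '%s\n' "$2" | sed 's/[[:space:]]*$//' | grep -Fqx -- "$1"
}

# Sample values for illustration. On a real system they would come from rpm:
#   required=$(rpm -qp --requires lustre-modules-*.rpm | grep '^kernel')
#   provided=$(rpm -q --provides kernel-lustre)
# The "required" string is the one from the error in the thread; the
# "provided" string is an assumption, not output captured from the system.
required='kernel = 2.6.18-128.1.6-lustre-1.9.181'
provided='kernel-lustre = 2.6.18-128.1.6-lustre-1.9.181'

if has_capability "$required" "$provided"; then
    echo "dependency satisfied"
else
    echo "missing: $required"
fi
```

If the installed kernel package does not list the exact "kernel = <version>" string, rpm reports a failed dependency even when `uname -a` shows the matching kernel running, because rpm consults its package database rather than the booted kernel.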