Bernd Schubert wrote:
> Hello Tamás,
>
> On Tuesday 17 June 2008 16:41:55 Papp Tamás wrote:
>
>> Dear All,
>>
>> Is there any reason not to use 2.6.22.x kernels newer than
>> 2.6.22.14, or should it work?
>>
>>
>> I've just compiled it with 2.6.22.19 and I can mount the cluster, but
>> after the first ls command it gives me an oops and gets stuck at this
>> stage.
>>
>
> I didn't have the time to test lustre-1.6.5, but it would be quite
> helpful if you could paste the oops.
>
Hello!
Sure..
This is from dmesg:
PM: Adding info for No Bus:lnet
Lustre: OBD class driver, info at clusterfs.com
Lustre Version: 1.6.5
Build Version:
1.6.5-19700101010000-PRISTINE-.usr.src.linux-2.6.22.19.-2.6.22.19
PM: Adding info for No Bus:obd_psdev
Lustre: Added LNI 192.168.0.123 at tcp [8/256]
Lustre: Accept secure, port 988
LustreError: 2007:0:(router_proc.c:1013:lnet_proc_init()) couldn't
create proc entry sys/lnet/stats
Lustre: Lustre Client File System; info at clusterfs.com
Lustre: Request x1 sent from MGC10.1.1.1 at tcp to NID 10.1.1.1 at tcp 5s ago
has timed out (limit 5s).
Lustre: Changing connection for MGC10.1.1.1 at tcp to
MGC10.1.1.1 at tcp_1/10.1.1.2 at tcp
Lustre: Request x3 sent from MGC10.1.1.1 at tcp to NID 10.1.1.2 at tcp 5s ago
has timed out (limit 5s).
LustreError: 2005:0:(client.c:716:ptlrpc_import_delay_req()) @@@
IMP_INVALID req at ffff810073341000 x4/t0
o501->MGS at MGC10.1.1.1@tcp_1:26/25 lens 136/248 e 0 to 100 dl 0 ref 1 fl
Rpc:/0/0 rc 0/0
LustreError: 15c-8: MGC10.1.1.1 at tcp: The configuration from log
'cubefs-client' failed (-108). This may be the result of communication
errors between this node and the MGS, a bad configuration, or other
errors. See the syslog for more information.
LustreError: 2005:0:(llite_lib.c:1061:ll_fill_super()) Unable to process
log: -108
Lustre: client ffff8100734e6000 umount complete
LustreError: 2005:0:(obd_mount.c:1951:lustre_fill_super()) Unable to
mount (-108)
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Removing info for No Bus:vcs1
PM: Removing info for No Bus:vcsa1
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Removing info for No Bus:vcs1
PM: Removing info for No Bus:vcsa1
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Removing info for No Bus:vcs1
PM: Removing info for No Bus:vcsa1
PM: Adding info for No Bus:vcs1
PM: Adding info for No Bus:vcsa1
PM: Adding info for No Bus:vcs3
PM: Adding info for No Bus:vcsa3
PM: Removing info for No Bus:vcs3
PM: Removing info for No Bus:vcsa3
PM: Adding info for No Bus:vcs3
PM: Adding info for No Bus:vcsa3
PM: Adding info for No Bus:vcs4
PM: Adding info for No Bus:vcsa4
PM: Removing info for No Bus:vcs4
PM: Removing info for No Bus:vcsa4
PM: Adding info for No Bus:vcs4
PM: Adding info for No Bus:vcsa4
PM: Adding info for No Bus:vcs2
PM: Adding info for No Bus:vcsa2
PM: Removing info for No Bus:vcs2
PM: Removing info for No Bus:vcsa2
PM: Adding info for No Bus:vcs2
PM: Adding info for No Bus:vcsa2
PM: Adding info for No Bus:vcs5
PM: Adding info for No Bus:vcsa5
PM: Removing info for No Bus:vcs5
PM: Removing info for No Bus:vcsa5
PM: Adding info for No Bus:vcs5
And this is from the messages log:
Jun 17 16:06:20 core-123 kernel: Lustre: OBD class driver,
info at clusterfs.com
Jun 17 16:06:20 core-123 kernel: Lustre Version: 1.6.5
Jun 17 16:06:20 core-123 kernel: Build Version:
1.6.5-19700101010000-PRISTINE-.usr.src.linux-2.6.22.19.-2.6.22.19
Jun 17 16:06:20 core-123 kernel: Lustre: Added LNI 192.168.0.123 at tcp [8/256]
Jun 17 16:06:20 core-123 kernel: Lustre: Accept secure, port 988
Jun 17 16:06:20 core-123 kernel: LustreError:
2014:0:(router_proc.c:1013:lnet_proc_init()) couldn't create proc entry
sys/lnet/stats
Jun 17 16:06:21 core-123 kernel: Lustre: Lustre Client File System;
info at clusterfs.com
Jun 17 16:06:21 core-123 kernel: Lustre:
cubefs-clilov-ffff810076303800.lov: set parameter stripesize=8388608
Jun 17 16:06:26 core-123 kernel: Lustre: Request x8 sent from
cubefs-MDT0000-mdc-ffff810076303800 to NID 10.1.1.2 at tcp 5s ago has timed
out (limit 5s).
Jun 17 16:06:46 core-123 kernel: Lustre: Changing connection for
cubefs-MDT0000-mdc-ffff810076303800 to 10.1.1.1 at tcp/10.1.1.1 at tcp
Jun 17 16:06:46 core-123 kernel: Lustre: Client cubefs-client has started
Jun 17 16:06:54 core-123 pcscd: winscard.c:219:SCardConnect() Reader
E-Gate 0 0 Not Found
Jun 17 16:06:54 core-123 pcscd:last message repeated 3 times
Jun 17 16:06:54 core-123 acpid: client connected from 2208[0:0]
Jun 17 16:06:58 core-123 acpid: client connected from 2218[0:0]
Jun 17 16:07:03 core-123 acpid: client connected from 2226[0:0]
Jun 17 16:09:16 core-123 kernel: Unable to handle kernel paging request
at 00000000ffffffff RIP:
Jun 17 16:09:16 core-123 kernel: [<ffffffff8108beef>]
kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:16 core-123 kernel: PGD 73cd9067 PUD 0
Jun 17 16:09:16 core-123 kernel: Oops: 0000 [1] SMP
Jun 17 16:09:16 core-123 kernel: CPU 0
Jun 17 16:09:16 core-123 kernel: Modules linked in: cifs mgc lustre lov
mdc lquota osc ksocklnd ptlrpc obdclass lnet lvfs libcfs rfcomm l2cap
bluetooth nfs lockd nfs_acl fuse sunrpc snd_hda_intel snd_seq_dummy
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer iTCO_wdt iTCO_vendor_support r8169 snd
soundcore snd_page_alloc floppy sg parport_pc parport ata_piix
ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd
ehci_hcd
Jun 17 16:09:16 core-123 kernel: Pid: 2339, comm: ll_sa_2337 Not tainted
2.6.22.19 #3
Jun 17 16:09:16 core-123 kernel: RIP: 0010:[<ffffffff8108beef>]
[<ffffffff8108beef>] kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:16 core-123 kernel: RSP: 0018:ffff810073d47da0 EFLAGS:
00010006
Jun 17 16:09:16 core-123 kernel: RAX: 0000000000000000 RBX:
0000000000000246 RCX: 00000000ffffffff
Jun 17 16:09:16 core-123 kernel: RDX: ffff81007fcf7210 RSI:
00000000000000d0 RDI: ffff81000103e800
Jun 17 16:09:16 core-123 kernel: RBP: ffff810073c9607c R08:
ffff810073d46000 R09: 0000000000000002
Jun 17 16:09:16 core-123 kernel: R10: 00000000ffffffff R11:
0000000000000001 R12: ffff810073d2e680
Jun 17 16:09:16 core-123 kernel: R13: ffff810073d47ee0 R14:
0000000000000000 R15: ffff81007945cab8
Jun 17 16:09:16 core-123 kernel: FS: 00002b9794ef7710(0000)
GS:ffffffff81362000(0000) knlGS:0000000000000000
Jun 17 16:09:16 core-123 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Jun 17 16:09:16 core-123 kernel: CR2: 00000000ffffffff CR3:
0000000073c41000 CR4: 00000000000006e0
Jun 17 16:09:16 core-123 kernel: Process ll_sa_2337 (pid: 2339,
threadinfo ffff810073d46000, task ffff810073c195f0)
Jun 17 16:09:16 core-123 kernel: Stack: ffff810073d47ee0
ffffffff810a4f12 ffff810073d47ee0 ffff810073c9607c
Jun 17 16:09:16 core-123 kernel: ffff81007b3e9580 0000000000000000
ffff810079c34b40 ffffffff884e9b87
Jun 17 16:09:16 core-123 kernel: 0000000000000000 0000000000000000
0000000000000000 0000000000000000
Jun 17 16:09:16 core-123 kernel: Call Trace:
Jun 17 16:09:16 core-123 kernel: [<ffffffff810a4f12>] d_alloc+0x22/0x1d0
Jun 17 16:09:16 core-123 kernel: [<ffffffff884e9b87>]
:lustre:ll_statahead_thread+0xed7/0x1610
Jun 17 16:09:16 core-123 kernel: [<ffffffff81029150>]
default_wake_function+0x0/0x10
Jun 17 16:09:16 core-123 kernel: [<ffffffff8100acc8>] child_rip+0xa/0x12
Jun 17 16:09:16 core-123 kernel: [<ffffffff884a9270>]
:lustre:ll_inode_permission+0x0/0xc0
Jun 17 16:09:16 core-123 kernel: [<ffffffff884e8cb0>]
:lustre:ll_statahead_thread+0x0/0x1610
Jun 17 16:09:16 core-123 kernel: [<ffffffff8100acbe>] child_rip+0x0/0x12
Jun 17 16:09:16 core-123 kernel:
Jun 17 16:09:16 core-123 kernel:
Jun 17 16:09:16 core-123 kernel: Code: 48 8b 04 c1 48 89 42 10 53 9d 5b
48 89 c8 c3 66 90 49 89 d0
Jun 17 16:09:16 core-123 kernel: RIP [<ffffffff8108beef>]
kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:16 core-123 kernel: RSP <ffff810073d47da0>
Jun 17 16:09:16 core-123 kernel: CR2: 00000000ffffffff
Jun 17 16:09:36 core-123 kernel: Unable to handle kernel paging request
at 00000000fffffffe RIP:
Jun 17 16:09:36 core-123 kernel: [<ffffffff8108beef>]
kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:36 core-123 kernel: PGD 73c73067 PUD 0
Jun 17 16:09:36 core-123 kernel: Oops: 0000 [2] SMP
Jun 17 16:09:36 core-123 kernel: CPU 0
Jun 17 16:09:36 core-123 kernel: Modules linked in: cifs mgc lustre lov
mdc lquota osc ksocklnd ptlrpc obdclass lnet lvfs libcfs rfcomm l2cap
bluetooth nfs lockd nfs_acl fuse sunrpc snd_hda_intel snd_seq_dummy
snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer iTCO_wdt iTCO_vendor_support r8169 snd
soundcore snd_page_alloc floppy sg parport_pc parport ata_piix
ata_generic libata sd_mod scsi_mod ext3 jbd mbcache uhci_hcd ohci_hcd
ehci_hcd
Jun 17 16:09:36 core-123 kernel: Pid: 2342, comm: sshd Not tainted
2.6.22.19 #3
Jun 17 16:09:36 core-123 kernel: RIP: 0010:[<ffffffff8108beef>]
[<ffffffff8108beef>] kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:36 core-123 kernel: RSP: 0018:ffff810073d4fbc8 EFLAGS:
00010002
Jun 17 16:09:36 core-123 kernel: RAX: 0000000000000000 RBX:
0000000000000246 RCX: 00000000fffffffe
Jun 17 16:09:36 core-123 kernel: RDX: ffff81007fcf7210 RSI:
00000000000000d0 RDI: ffff81000103e800
Jun 17 16:09:36 core-123 kernel: RBP: ffff81007e2df340 R08:
0000000000000001 R09: 0000000000000000
Jun 17 16:09:36 core-123 kernel: R10: 0000000000000085 R11:
0000000000000001 R12: ffff81007e2df340
Jun 17 16:09:36 core-123 kernel: R13: ffff810073d4fc78 R14:
0000000000000000 R15: ffff81007e2e4020
Jun 17 16:09:36 core-123 kernel: FS: 00002b21291a53b0(0000)
GS:ffffffff81362000(0000) knlGS:0000000000000000
Jun 17 16:09:36 core-123 kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
Jun 17 16:09:36 core-123 kernel: CR2: 00000000fffffffe CR3:
00000000741eb000 CR4: 00000000000006e0
Jun 17 16:09:36 core-123 kernel: Process sshd (pid: 2342, threadinfo
ffff810073d4e000, task ffff81007417eea0)
Jun 17 16:09:36 core-123 kernel: Stack: 0000000000000000
ffffffff810a4f12 0000000000000000 ffff81007e2df340
Jun 17 16:09:36 core-123 kernel: ffff810073d4fea8 ffff810073d4fc78
ffff810073d4fc88 ffffffff81099d0c
Jun 17 16:09:36 core-123 kernel: ffff810073d4fca8 ffff810037f36600
ffff81007e2e40d8 ffff810073d4c00c
Jun 17 16:09:36 core-123 kernel: Call Trace:
Jun 17 16:09:36 core-123 kernel: [<ffffffff810a4f12>] d_alloc+0x22/0x1d0
Jun 17 16:09:36 core-123 kernel: [<ffffffff81099d0c>]
do_lookup+0x19c/0x210
Jun 17 16:09:36 core-123 kernel: [<ffffffff8109bf22>]
__link_path_walk+0x882/0xe20
Jun 17 16:09:36 core-123 kernel: [<ffffffff810a3e5f>] dput+0x1f/0x130
Jun 17 16:09:36 core-123 kernel: [<ffffffff8109c51b>]
link_path_walk+0x5b/0x100
Jun 17 16:09:36 core-123 kernel: [<ffffffff8109c7dc>]
do_path_lookup+0x8c/0x260
Jun 17 16:09:36 core-123 kernel: [<ffffffff8109d67a>]
__path_lookup_intent_open+0x6a/0xd0
Jun 17 16:09:36 core-123 kernel: [<ffffffff8109d8ab>]
open_namei+0x8b/0x6f0
Jun 17 16:09:36 core-123 kernel: [<ffffffff81091024>]
sys_statfs+0x94/0xc0
Jun 17 16:09:36 core-123 kernel: [<ffffffff8107df8d>]
free_pages_and_swap_cache+0x8d/0xb0
Jun 17 16:09:36 core-123 kernel: [<ffffffff8109010c>]
do_filp_open+0x1c/0x50
Jun 17 16:09:36 core-123 kernel: [<ffffffff81090194>]
do_sys_open+0x54/0xf0
Jun 17 16:09:36 core-123 kernel: [<ffffffff8100a02c>] tracesys+0xdc/0xe1
Jun 17 16:09:36 core-123 kernel:
Jun 17 16:09:36 core-123 kernel:
Jun 17 16:09:36 core-123 kernel: Code: 48 8b 04 c1 48 89 42 10 53 9d 5b
48 89 c8 c3 66 90 49 89 d0
Jun 17 16:09:36 core-123 kernel: RIP [<ffffffff8108beef>]
kmem_cache_alloc+0x2f/0x60
Jun 17 16:09:36 core-123 kernel: RSP <ffff810073d4fbc8>
Jun 17 16:09:36 core-123 kernel: CR2: 00000000fffffffe
And I rebooted it.
The system is an up-to-date FC8.
Thank you,
tamas