Displaying 20 results from an estimated 500 matches similar to: "Lustre module not getting loaded in MDS"
2008 Mar 07
2
Multihomed question: want Lustre over IB and Ethernet
Chris,
Perhaps you need to perform some write_conf like command. I'm not sure if this is needed in 1.6 or not.
Shane
----- Original Message -----
From: lustre-discuss-bounces at lists.lustre.org <lustre-discuss-bounces at lists.lustre.org>
To: lustre-discuss <lustre-discuss at lists.lustre.org>
Sent: Fri Mar 07 12:03:17 2008
Subject: Re: [Lustre-discuss] Multihomed
2008 Jan 15
19
How do you make an MGS/OSS listen on 2 NICs?
I am running on CentOS 5 distribution without adding any updates from CentOS. I am using the lustre 1.6.4.1 kernel and software.
I have two NICs that run through different switches.
The Lustre options in my modprobe.conf look like this:
options lnet networks=tcp0(eth1,eth0)
My MGS seems to be only listening on the first interface however.
When I try and ping the 1st interface (eth1)
2008 Apr 15
5
o2ib module prevents shutdown
Hello,
Not sure if this is the right forum: I'm encountering difficulties
with o2ib which prevent an LNET shutdown from proceeding:
Unloading OpenIB kernel modules:NET: Unregistered protocal family 27
Failed to unload rdma_cm
Failed to unload rdma_cm
Failed to unload ib_cm
Failed to unload ib_sa
LustreError: 131-3: Received notification of device removal
Please shutdown LNET
2007 Oct 15
3
iptables rules for lustre 1.6.x and MGS recovery procedures
Hi,
I would like to know which TCP/UDP ports I should keep open in my
firewall policies on my MGS server so that I can have my MGS server
firewalled. Also, in the event of losing the MGT, would it be possible
to recreate the MGT without losing data or bringing the filesystem
down (i.e. by using cached information from the MDTs and OSTs)?
Thanks
Anand
2008 Feb 04
32
Luster clients getting evicted
on our cluster that has been running lustre for about 1 month. I have
1 MDT/MGS and 1 OSS with 2 OSTs.
Our cluster uses all GigE and has about 608 nodes and 1854 cores.
We have a lot of jobs that die and/or go into high IO wait; strace
shows processes stuck in fstat().
The big problem (I think, and I would like some feedback on it) is that
209 of these 608 nodes have in dmesg
2006 Sep 25
4
Re: [openib-general] problems with lustre o2ib module & ofed
It seems that lustre puts its modules in /lib/modules/2.6.16.21-0.8-default
despite the fact that my kernel is 2.6.16.21-0.8-smp !
uname -a
Linux n32 2.6.16.21-0.8-smp #4 SMP Sun Sep 24 08:47:30 BST 2006 i686 i686 i386 GNU/Linux
make[3]: Nothing to be done for `install-exec-am'.
/bin/sh ../../mkinstalldirs /lib/modules/2.6.16.21-0.8-default/kernel/fs/lustre
/usr/bin/install -c -m 644
2012 Nov 02
3
lctl ping of Pacemaker IP
Greetings!
I am working with Lustre-2.1.2 on RHEL 6.2. First I configured it
using the standard defaults over TCP/IP. Everything worked very
nicely using a real, static --mgsnode=a.b.c.x value which was the
actual IP of the MGS/MDS system1 node.
I am now trying to integrate it with Pacemaker-1.1.7. I believe I
have most of the set-up completed with a particular exception. The
"lctl
2008 Jan 10
4
1.6.4.1 - active client evicted
Hi!
We've started to poke and prod at Lustre 1.6.4.1, and it seems to
mostly work (we haven't had it OOPS on us yet like the earlier
1.6-versions did).
However, we had this weird incident where an active client (it was
copying 4GB files and running ls at the time) got evicted by the MDS
and all OSTs. After a while logs indicate that it did recover the
connection
2007 Nov 07
1
ll_cfg_requeue process timeouts
Hi,
Our environment is: 2.6.9-55.0.9.EL_lustre.1.6.3smp
I am getting the following errors from two OSSes:
...
Nov 7 10:39:51 storage09.beowulf.cluster kernel: LustreError:
23045:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID
req@00000100b410be00 x4190687/t0 o101->MGS@MGC10.143.245.201@tcp_0:26
lens 232/240 ref 1 fl Rpc:/0/0 rc 0/0
Nov 7 10:39:51
2008 Feb 22
0
lustre error
Dear All,
Yesterday evening our cluster stopped.
Two of our nodes tried to take the resource from each other; they
couldn't see the other side, as far as I could tell.
I stopped heartbeat and the resources, started them again, and the cluster
came back online and worked fine.
This morning I saw this in logs:
Feb 22 03:25:07 node4 kernel: Lustre:
7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall
2007 Oct 25
1
Error message
I'm seeing this error message on one of my OSSes but not the other
three. Any idea what is causing it?
Oct 25 13:58:56 oss2 kernel: LustreError:
3228:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID
req@f6b13200 x18040/t0 o101->MGS@MGC192.168.0.200@tcp_0:26 lens 176/184
ref 1 fl Rpc:/0/0 rc 0/0
Oct 25 13:58:56 oss2 kernel: LustreError:
2013 Dec 17
2
Setting up a lustre zfs dual mgs/mdt over tcp - help requested
Hi all,
Here is the situation:
I have 2 nodes, MDS1 and MDS2 (10.0.0.22, 10.0.0.23), that I wish to use as
failover MGS, active/active MDT with ZFS.
I have a JBOD shelf with 12 disks, seen by both nodes as DAS (the
shelf has 2 SAS ports, connected to a SAS HBA on each node), and I
am using Lustre 2.4 on CentOS 6.4 x64
I have created 3 zfs pools:
1. mgs:
# zpool
2010 Jun 22
7
lnet infiniband config
Hi all,
I'm getting my feet wet in the infiniband lake and of course I run into
some problems.
It would seem I got the compilation part of sles11 kernel 2.6.27 +
Lustre 1.8.3 + ofed 1.4.2 right, because it allows me to see and use the
infiniband fabric, and because ko2iblnd loads without any complaints.
In /etc/modprobe.d/lustre (this is a Debian system, hence this subdir of
2013 Apr 29
1
OSTs inactive on one client (only)
Hi everyone,
I have seen this question here before, but without a very
satisfactory answer. One of our half a dozen clients has
lost access to a set of OSTs:
> lfs osts
OBDS::
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID INACTIVE
3: lustre-OST0003_UUID INACTIVE
4: lustre-OST0004_UUID INACTIVE
5: lustre-OST0005_UUID ACTIVE
6: lustre-OST0006_UUID ACTIVE
2008 Mar 06
2
strange lustre errors
Hi,
On a few of the HPC cluster nodes, I am seeing a new Lustre
error that is pasted below. The volumes are working fine and there
is nothing on the OSS and MDS to report.
LustreError: 5080:0:(import.c:607:ptlrpc_connect_interpret())
data3-OST0000_UUID@192.168.2.98@tcp changed handle from
0xfe51139158c64fae to 0xfe511392a35878b3; copying, but this may
foreshadow disaster
2007 Oct 22
0
The mds_connect operation failed with -11
Hi, list:
I'm trying to configure Lustre with:
1 MGS -------------> 192.168.3.100 with mkfs.lustre --mgs /dev/md1 ;
mount -t lustre ...
1 MDT ------------> 192.168.3.101 with mkfs.lustre --fsname=datafs00
--mdt --mgsnode=192.168.3.100 /dev/sda3 ; mount -t lustre ...
4 ost -----------> 192.168.3.102-104 with mkfs.lustre --fsname=datafs00
--ost --mgsnode=192.168.3.100@tcp0
2008 Jan 31
2
lustre+samba
Dear All,
I am trying to use our cluster through a Samba share. Everything works fine, but
I think we should have -o flock at Lustre mount time.
Great, it works. But when I want to save a file on the share, I get
this in the logs:
Jan 31 10:45:24 opteron-ren-11 kernel: LustreError: 24836:0:(file.c:2309:ll_file_flock()) unknown fcntl lock type: 32
Jan 31 10:45:24 opteron-ren-11 kernel:
2008 Feb 07
2
Lustre behaviour when multiple network paths are available?
Hi there,
When Lustre is configured in an environment where there are multiple paths
to the same destination of the same length (i.e. two paths, each one hop
away), which path(s) will be used for sending and receiving data?
I have my cluster configured with two OSTs with two GigE NICs in each. I am
seeing identical performance metrics when I use LACP to aggregate, and when
I use two separate
2008 Feb 12
0
Lustre-discuss Digest, Vol 25, Issue 17
Hi,
I just want to know whether there are any alternative file systems for HP SFS.
I heard that there is Cluster Gateway from Polyserve. Can anybody please help me find out more about this Cluster Gateway?
Thanks and Regards,
Ashok Bharat
-----Original Message-----
From: lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss-request at lists.lustre.org
Sent: Tue 2/12/2008 3:18 AM
2008 Feb 14
2
kickstart file problem
I have a kickstart file that I am using to install multiple machines. If I install with no %post script, everything runs great. When I add the following %post section, it fails.
I have been working on this for a few days now without luck. Any help would be appreciated.
Here is the error, the script follows.
Traceback (most recent call first):
File