Porter Don wrote:
> 1) If I have a lustre lov set up as follows:
>
> node1: client
> node2: client and mds
> node3 & 4: ost
>
> If I reboot nodes 2-4 and leave the system mounted on node1, when the other
> nodes come up and restart lustre, node1 cannot seamlessly restore its state.
> In fact, the other three nodes seem to get messed up to the point that node
> 2 cannot mount the lov until all nodes, client and server, are rebooted and
> lustre is restarted.
>
> Is this how the system is supposed to work? If not, am I making a newbie
> mistake?

Lustre 1.x can recover from any single failure at a time -- but we
explicitly do not support recovery from multiple simultaneous failures.

Even if you just reboot node 2 here, you will not have a seamless
recovery, because you have failed two components simultaneously: a
client and an MDS. In this case, node1 will be told to flush its caches
and abort any in-progress operations.

If you try to start the MDS while one or more OSTs are down, it will
fail unless you explicitly tell lconf to ignore the inactive OSTs. This
is so that administrators do not accidentally start up file systems in
degraded mode, without all of the servers, in which some data is
inaccessible -- we thought it best for that decision to be made very
consciously and explicitly.

> 2) If I get a new client node, is there a way to have it join without
> bringing down all nodes? It would be ok to bring down the servers, but as
> noted above, doing so without bringing down all clients is causing me
> problems.

Adding new client nodes is standard practice -- you can mount a new
client normally, as if it had been there from the beginning.

If you are having problems mounting additional clients, please let us
know -- that is a bug!

Thanks--

-Phil
Phil,

Thanks for the help. As long as I know what to look for in dmesg, I
think that is ok. Yeah, running the abort command seems to fix things
up.

Thanks!

Don Porter

-----Original Message-----
From: Phil Schwan
To: Porter Don
Cc: 'lustre-discuss@lists.clusterfs.com'
Sent: 4/21/04 12:11 PM
Subject: RE: [Lustre-discuss] Can lustre dynamically add clients?

Hi Don--

On Tue, 2004-04-20 at 16:44, Porter Don wrote:
> Ok. I ran
>
> lmc -m orch_test.xml --add net --node client --nid '*' --nettype tcp
>
> on all mds and osd nodes.
>
> Then restarted the nodes. When I tried:
>
> [root@jawa051 root]# mount -t lustre jawa046:/mds1/client /mnt/lustre
> /sbin/mount.lustre: Invalid argument

This is not a great error message, I will be the first to agree. Let me
see what I can do about improving that, although it may take a lot of
plumbing to get a useful error back from the kernel.

> I also got this on the mds:
>
> [root@jawa046 root]# dmesg
> LustreError: 1556:(../ldlm/ldlm_lib.c:474:target_handle_connect()) denying
> connection for new client b992fe62-fd97-4670-ba25-27a0c7e943f1: 10 clients
> in recovery for 120s

This is the key message. If the MDS was not shut down cleanly, it will
wait for all old clients to reconnect, so that it can complete recovery.
Please see https://bugzilla.lustre.org/show_bug.cgi?id=2398 for more
details.

If you wait the 120 seconds, or perform the abort-recovery recipe
described in issue 2397, are you able to mount again?

-Phil
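[Editor's note: Don mentions knowing "what to look for in dmesg". As a hedged illustration of that, here is a small sketch that scans log text for the recovery-denial message quoted later in this thread. The pattern is taken from the LustreError lines in the thread; `count_recovery_denials` is a hypothetical helper written for this note, not a Lustre tool, and the exact message wording may differ between versions.]

```shell
#!/bin/sh
# Hypothetical helper: count how many client connection attempts were
# denied because the MDS was still in its recovery window. The pattern
# matches the "denying connection ... in recovery for" LustreError
# lines shown in this thread; adjust it if your version logs differently.
count_recovery_denials() {
    grep -c 'denying connection .* in recovery for'
}

# Typical usage on the MDS would be (illustrative only):
#   dmesg | count_recovery_denials
```

If the count is non-zero, the MDS is still waiting for its old clients; either wait out the recovery window or abort recovery as described in the thread.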
1) I got this to work by a setup like the following:

node 1: mds
node 2 & 3: ost
node 4: client

I could reboot node 2 and everything still worked.

2) So, how do I add a node 5 to the mix without adding an entry to the
config.xml file and restarting all other nodes? I suppose this is just
something I am having trouble finding in the documentation.

Thanks,

don

On Tue, 2004-04-13 at 00:58, Phil Schwan wrote:
> Porter Don wrote:
> >
> > 1) If I have a lustre lov set up as follows:
> >
> > node1: client
> > node2: client and mds
> > node3 & 4: ost
> >
> > If I reboot nodes 2-4 and leave the system mounted on node1, when the other
> > nodes come up and restart lustre, node1 cannot seamlessly restore its state.
> > In fact, the other three nodes seem to get messed up to the point that node
> > 2 cannot mount the lov until all nodes, client and server, are rebooted and
> > lustre is restarted.
> >
> > Is this how the system is supposed to work? If not, am I making a newbie
> > mistake?
>
> Lustre 1.x can recover from any single failure at a time -- but we
> explicitly do not support recovery from multiple simultaneous failures.
>
> Even if you just reboot node 2 here, you will not have a seamless
> recovery, because you have failed two components simultaneously: a
> client and an MDS. In this case, node1 will be told to flush its caches
> and abort any in-progress operations.
>
> If you try to start the MDS while one or more OSTs are down, it will
> fail unless you explicitly tell lconf to ignore the inactive OSTs. This
> is so that administrators do not accidentally start up file systems in
> degraded mode, without all of the servers, in which some data is
> inaccessible -- we thought it best for that decision to be made very
> consciously and explicitly.
>
> > 2) If I get a new client node, is there a way to have it join without
> > bringing down all nodes? It would be ok to bring down the servers, but as
> > noted above, doing so without bringing down all clients is causing me
> > problems.
>
> Adding new client nodes is standard practice -- you can mount it
> normally, as if it had been there from the beginning.
>
> If you are having problems mounting additional clients, please let us
> know -- that is a bug!
>
> Thanks--
>
> -Phil
Hi--

On Mon, 2004-04-19 at 21:31, Don Porter wrote:
>
> 2) so, how do I add a node 5 to the mix without adding an entry to the
> config.xml file and restarting all other nodes? I suppose this is just
> something I am having trouble finding in the documentation.

Ah! Now I understand what you're having trouble with. Clients are
usually configured with a '*' rule, so they all use the same profile.
For example:

lmc -m config.xml --add net --node client --nid '*' --nettype tcp

Then, to start any client, you run:

mount -t lustre mds.host.name:/mds_name/client_profile /mnt/lustre

In your configuration, assuming that you used the same service names as
the examples, this might well be:

mount -t lustre mds:/mds1/client /mnt/lustre

(or if you are still using lconf: "lconf --node client config.xml")

Hope that helps--

-Phil
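[Editor's note: the mount source above follows the pattern `<mds_host>:/<mds_service_name>/<client_profile>`. A minimal sketch of assembling it, using the hostname and service names assumed in this thread (`mds`, `mds1`, `client`):]

```shell
#!/bin/sh
# Assemble the Lustre 1.x client mount source from its three parts:
#   <mds_host>:/<mds_service_name>/<client_profile>
# The values below come from the examples in this thread and are
# illustrative only; substitute your own configuration's names.
mds_host=mds
mds_name=mds1
client_profile=client

src="${mds_host}:/${mds_name}/${client_profile}"
echo "$src"

# The assembled source would then be used as (not run here):
#   mount -t lustre "$src" /mnt/lustre
```

Because every client configured with the `'*'` wildcard uses the same profile, this one mount command works unchanged on any new client node.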
Ahh. That makes life SO much better. Having to specify each client at
startup time would be a major challenge to the usability of lustre.

I didn't see anything about the wildcard node id in this doc:

https://wiki.clusterfs.com/lustre/LustreHowto

Is there a more recent one? If not, it might be good to update that.

Thanks for the help,

Don

-----Original Message-----
From: Phil Schwan
To: Don Porter
Cc: 'lustre-discuss@lists.clusterfs.com'
Sent: 4/20/04 9:53 AM
Subject: Re: [Lustre-discuss] Can lustre dynamically add clients?

Hi--

On Mon, 2004-04-19 at 21:31, Don Porter wrote:
>
> 2) so, how do I add a node 5 to the mix without adding an entry to the
> config.xml file and restarting all other nodes? I suppose this is just
> something I am having trouble finding in the documentation.

Ah! Now I understand what you're having trouble with. Clients are
usually configured with a '*' rule, so they all use the same profile.
For example:

lmc -m config.xml --add net --node client --nid '*' --nettype tcp

Then, to start any client, you run:

mount -t lustre mds.host.name:/mds_name/client_profile /mnt/lustre

In your configuration, assuming that you used the same service names as
the examples, this might well be:

mount -t lustre mds:/mds1/client /mnt/lustre

(or if you are still using lconf: "lconf --node client config.xml")

Hope that helps--

-Phil
Ok. I ran

lmc -m orch_test.xml --add net --node client --nid '*' --nettype tcp

on all mds and osd nodes.

Then restarted the nodes. When I tried:

[root@jawa051 root]# mount -t lustre jawa046:/mds1/client /mnt/lustre
/sbin/mount.lustre: Invalid argument

[root@jawa051 root]# dmesg
LustreError: 3583:(client.c:445:ptlrpc_check_status()) @@@ type = PTL_RPC_MSG_ERR
  req@f6a14000 x11/t0 o38->mds1@MDS_PEER_UUID:12 lens 168/64 ref 1 fl RPC:R/0/50000 rc 0/-16
LustreError: 3583:(llite_lib.c:446:lustre_process_log()) cannot connect to mds1: rc = -16
LustreError: 3583:(llite_lib.c:547:lustre_fill_super()) No profile found: client
LustreError: 3583:(client.c:445:ptlrpc_check_status()) @@@ type = PTL_RPC_MSG_ERR
  req@f6d0c200 x12/t0 o38->mds1@MDS_PEER_UUID:12 lens 168/64 ref 1 fl RPC:R/0/50000 rc 0/-16
LustreError: 3583:(llite_lib.c:446:lustre_process_log()) cannot connect to mds1: rc = -16

I also got this on the mds:

[root@jawa046 root]# dmesg
LustreError: 1556:(../ldlm/ldlm_lib.c:474:target_handle_connect()) denying
connection for new client b992fe62-fd97-4670-ba25-27a0c7e943f1: 10 clients
in recovery for 120s
LustreError: 1556:(../ldlm/ldlm_lib.c:1056:target_send_reply()) @@@ processing
error (-16) req@f6c72c00 x11/t0 o38-><?>@:-1 lens 168/64 ref 0 fl ?phase?:/0/50000 rc -16/0
Lustre: 1474:(socknal_cb.c:1544:ksocknal_process_receive()) [c6368800] EOF from 0xa540450 ip 10.84.4.80:32806
LustreError: 1557:(../ldlm/ldlm_lib.c:474:target_handle_connect()) denying
connection for new client 2e71937c-2275-4c06-ac3c-36ab10215d60: 10 clients
in recovery for 120s
LustreError: 1557:(../ldlm/ldlm_lib.c:1056:target_send_reply()) @@@ processing
error (-16) req@f6ee8c00 x12/t0 o38-><?>@:-1 lens 168/64 ref 0 fl ?phase?:/0/50000 rc -16/0
Lustre: 1474:(socknal_cb.c:1544:ksocknal_process_receive()) [f71e1800] EOF from 0xa540450 ip 10.84.4.80:32809

Any suggestions? I can send my complete xml if you need it.
Thanks,

Don

-----Original Message-----
From: Phil Schwan
To: Don Porter
Cc: 'lustre-discuss@lists.clusterfs.com'
Sent: 4/20/04 9:53 AM
Subject: Re: [Lustre-discuss] Can lustre dynamically add clients?

Hi--

On Mon, 2004-04-19 at 21:31, Don Porter wrote:
>
> 2) so, how do I add a node 5 to the mix without adding an entry to the
> config.xml file and restarting all other nodes? I suppose this is just
> something I am having trouble finding in the documentation.

Ah! Now I understand what you're having trouble with. Clients are
usually configured with a '*' rule, so they all use the same profile.
For example:

lmc -m config.xml --add net --node client --nid '*' --nettype tcp

Then, to start any client, you run:

mount -t lustre mds.host.name:/mds_name/client_profile /mnt/lustre

In your configuration, assuming that you used the same service names as
the examples, this might well be:

mount -t lustre mds:/mds1/client /mnt/lustre

(or if you are still using lconf: "lconf --node client config.xml")

Hope that helps--

-Phil
Hi Don--

On Tue, 2004-04-20 at 16:44, Porter Don wrote:
> Ok. I ran
>
> lmc -m orch_test.xml --add net --node client --nid '*' --nettype tcp
>
> on all mds and osd nodes.
>
> Then restarted the nodes. When I tried:
>
> [root@jawa051 root]# mount -t lustre jawa046:/mds1/client /mnt/lustre
> /sbin/mount.lustre: Invalid argument

This is not a great error message, I will be the first to agree. Let me
see what I can do about improving that, although it may take a lot of
plumbing to get a useful error back from the kernel.

> I also got this on the mds:
>
> [root@jawa046 root]# dmesg
> LustreError: 1556:(../ldlm/ldlm_lib.c:474:target_handle_connect()) denying
> connection for new client b992fe62-fd97-4670-ba25-27a0c7e943f1: 10 clients
> in recovery for 120s

This is the key message. If the MDS was not shut down cleanly, it will
wait for all old clients to reconnect, so that it can complete recovery.
Please see https://bugzilla.lustre.org/show_bug.cgi?id=2398 for more
details.

If you wait the 120 seconds, or perform the abort-recovery recipe
described in issue 2397, are you able to mount again?

-Phil
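[Editor's note: "wait the 120 seconds, then try to mount again" can be automated with a simple retry loop. This is a generic sketch written for this note, not a Lustre tool; the mount command in the comment uses the hostnames from this thread and the timeout matches the 120 s recovery window in the log.]

```shell
#!/bin/sh
# Hypothetical retry helper: run a command repeatedly until it succeeds
# or the attempt budget is exhausted. With enough attempts this rides
# out the MDS recovery window (120 s in the log above), after which the
# client mount should be accepted.
retry_mount() {
    tries=$1; shift
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep 1   # in practice a longer delay between attempts is sensible
    done
    return 1
}

# Real usage would look like (not run here):
#   retry_mount 130 mount -t lustre jawa046:/mds1/client /mnt/lustre
```

Alternatively, aborting recovery (the recipe in bugzilla issue 2397) ends the wait immediately, at the cost of evicting the old clients.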
Hello all,

I have recently started using lustre 1.0.4 on a few x86 machines
running the provided 2.4.20-28.9_lustre.1.0.4smp kernel. I have
encountered two things that confuse me a bit.

1) If I have a lustre lov set up as follows:

node1: client
node2: client and mds
node3 & 4: ost

If I reboot nodes 2-4 and leave the system mounted on node1, when the
other nodes come up and restart lustre, node1 cannot seamlessly restore
its state. In fact, the other three nodes seem to get messed up to the
point that node 2 cannot mount the lov until all nodes, client and
server, are rebooted and lustre is restarted.

Is this how the system is supposed to work? If not, am I making a
newbie mistake?

2) If I get a new client node, is there a way to have it join without
bringing down all nodes? It would be ok to bring down the servers, but
as noted above, doing so without bringing down all clients is causing
me problems.

Any advice/suggestions/help would be greatly appreciated. Also, if I
can provide any more helpful information with this problem, I would be
happy to.

Thanks,

Don Porter