Anselm Strauss wrote:
> hi.
>
> i've read about the configuration where mds and oss server are on the
> same host. i was wondering if it is also possible to do failover in such
> a scenario, especially if it's possible to:
>
> 1) automatically/manually failover only one service (results in a
> load-balanced scenario),
> 2) achieve a clean failover if a host crashes and both services must
> failover at the same time?
>
> has anyone tried it?
Hi Anselm,
Before 1.4.4 this could deadlock because OSTs had to be up before
MDSs, but those deadlocks were fixed in 1.4.6. There could also be
problems at extreme loads if an MDS and OSS are running on the same
system.
But we think it is safe to try. To answer your questions:
Let's call your services 'mds-foo' and 'oss-foo'.
(1) you can failover one service with:
Option 1 - copy your XML file to /etc/lustre/config.xml, symlink the
service name to /etc/init.d/lustre, then you can
# ./mds-foo {stop|start}
or
# ./oss-foo {stop|start}
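The Option 1 setup could be scripted roughly as below. This is only a sketch: it assumes the per-service links live in /etc/init.d next to the lustre script, and the run() and setup_service() helpers and the DRY_RUN switch are made-up names for illustration, not part of Lustre. With DRY_RUN=1 (the default) it only prints what it would do.

```shell
#!/bin/sh
# Sketch of the Option 1 setup. DRY_RUN=1 (the default here) only prints
# the commands; set DRY_RUN=0 to really run them (as root).
DRY_RUN=${DRY_RUN:-1}

# run CMD ARGS...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# setup_service XMLFILE SERVICE...
# Installs the config file and creates one init link per service name.
setup_service() {
    xml=$1
    shift
    run cp "$xml" /etc/lustre/config.xml
    for svc in "$@"; do
        # Assumption: the link sits next to the lustre init script.
        run ln -s /etc/init.d/lustre "/etc/init.d/$svc"
    done
}
```

For example, "setup_service ./foo.xml mds-foo oss-foo" (foo.xml being a placeholder for your own XML file) would print one cp and two ln -s commands.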
Option 2 - call lconf directly.
If you look inside /etc/init.d/lustre, the current options look like
(assume mds-foo)
(start) # lconf --service mds-foo <XML FILE>
(stop) # lconf --service mds-foo --failover --cleanup <XML FILE>
which expands to these options:
(start) # lconf --group mds-foo --select mds-foo=HOSTNAME <XML FILE>
(stop) # lconf --group mds-foo --select mds-foo=HOSTNAME --failover --cleanup <XML FILE>
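If you would rather call lconf from your own scripts, the short forms above can be wrapped in a small helper. The following is a sketch only: lconf_cmd is a made-up function name, it just prints the command line instead of running it, and it uses nothing beyond the flags quoted from /etc/init.d/lustre above.

```shell
#!/bin/sh
# lconf_cmd ACTION SERVICE XMLFILE
# Prints the lconf command line for starting or stopping one service,
# using only the --service, --failover and --cleanup flags shown above.
lconf_cmd() {
    action=$1
    service=$2
    xml=$3
    case "$action" in
        start) echo "lconf --service $service $xml" ;;
        stop)  echo "lconf --service $service --failover --cleanup $xml" ;;
        *)     echo "usage: lconf_cmd {start|stop} SERVICE XMLFILE" >&2
               return 1 ;;
    esac
}
```

Printing the command first makes it easy to check by eye before piping it to sh or pasting it into a failover script.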
(2) both services can failover
On the primary node, given the above /etc/lustre/config.xml,
'/etc/init.d/lustre' will stop and start all the services on the node.
This is done by matching the hostname of the node to the service
description.
For the secondary node, you would have to start/stop both services
separately, as the implied match won't happen.
For most failover software I would always treat the MDS and OSS as
separate resources - this would make it simple to cover both the
automatic and manual cases with one set of scripts.
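As a rough illustration of the "separate resources" idea, a wrapper like the one below could be called once per service by whatever failover software you use, or by hand. It is a sketch under assumptions: it relies on the per-service links from Option 1 existing in /etc/init.d, and resource_action, takeover, INIT_DIR and DRY_RUN are hypothetical names, not anything shipped with Lustre or a particular failover package.

```shell
#!/bin/sh
# Hypothetical resource wrapper: each call handles exactly one service,
# so the MDS and the OSS can be moved independently or together.
INIT_DIR=${INIT_DIR:-/etc/init.d}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the command, 0 = run it

# resource_action SERVICE ACTION
# Runs (or prints) the per-service init link with start or stop.
resource_action() {
    name=$1
    action=$2
    case "$action" in
        start|stop) ;;
        *) echo "unsupported action: $action" >&2; return 2 ;;
    esac
    if [ "$DRY_RUN" = "1" ]; then
        echo "$INIT_DIR/$name $action"
    else
        "$INIT_DIR/$name" "$action"
    fi
}

# takeover SERVICE...
# Starts each listed service separately, as needed on the secondary node.
takeover() {
    for svc in "$@"; do
        resource_action "$svc" start || return 1
    done
}
```

A manual failover of just the MDS is then "resource_action mds-foo start" on the secondary node, and a full takeover after a crash is "takeover mds-foo oss-foo".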
It is worth noting that the MDS generally does quite a bit less work than
the OSS for most workloads. If the OSS is consuming your server, moving
the MDS off that server will in most cases not reduce your load very
much, if at all, since in many cases the application will do very few
metadata operations relative to data transactions. Thus the
'load-balancing' part of your scenario may not be very useful.
I think that CFS believes this works, and it is just because we haven't
had time to test this extensively that we do not officially "support"
it.
- Peter & Cliff
>
> sincerely,
> anselm strauss