On Feb 03, 2006 15:32 +0100, Slawomir Mroczek wrote:
> I've got two dual Xeon machines running SLES9 SP2 with 3GB RAM each.
> These two machines have access to shared storage, which is an HP MSA1000
> array in a RAID ADG (two parity disks) configuration plus a hot-spare disk.
> The array has two controllers and two FC switches, and each machine has one
> FC link to each array switch using two Qlogic HBAs. There is an FC
> multipath configuration running, and both machines can see one large
> LVM2 volume /dev/vg00/vol01 and one snapshot volume /dev/vg00/vol01snap.

It is very important to note that Lustre does not use shared storage
concurrently the way e.g. GFS does. Each Lustre service (MDS or OST) needs
its own dedicated volume. Lustre failover is done by moving a service from
its primary node to a backup node; a given volume should NEVER be accessed
by two nodes at the same time.

> And now I would like to put Lustre on this /dev/vg00/vol01. Each
> machine should have /mnt/lustre mounted.

There need to be at least 2 volumes: one for the MDS (size should be at
least 400MB + 4kB * number of files), and one or more for the OSTs (size
should be data size + 5%, no more than 2TB per OST). If your workload is
IO-bound, you may want to have OSTs on both nodes to improve performance.

> There will be only three directories on /mnt/lustre. Two for Oracle
> (RAC files (not sure about that right now) and database files, but each
> table will have a separate chunk file), and one where about 300,000 small
> files should go. These small files are input for another application
> feeding the Oracle DB with data, and these files will be pushed here by
> other hosts - CIFS will be used.

> It seems I need to run server and client on each machine with failover.
> I've read that server+client on one host is not a wise choice, but Mr.
> Andreas Dilger wrote there is nothing to worry about.

Well, what I likely wrote was that "this is probably OK for normal usage,
but is not a currently supported configuration". Only testing in your
environment can tell whether you will hit memory pressure and deadlocks
from the IO between the client and the OST on the same node. This is not a
configuration that our customers use.

> So far I don't know how to build a proper Lustre configuration with
> failover. I would like to ask you to guide me on how to make one, or
> show me some working example I could use.

I thought there have been several examples of failover configurations
posted to this list?

> Another question: is there any chance to use LVM2 snapshots with Lustre?

Because LVM is itself not cluster-aware, it is OK to use LVM for the
underlying devices, but you cannot do anything like lvresize on the device.
As for snapshots, it depends on how they are implemented. If the snapshot
is done by moving old data to the snapshot volume and leaving the "live"
volume intact, this is likely OK, but we have never tested it.

> Sorry for my bad english.

Much better than my ten words of Polish :-).

Na zdrowie, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
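The MDS/OST sizing rule in the reply above is easy to turn into a
back-of-the-envelope calculation. A minimal sketch (the function names are
made up, not part of any Lustre tool), applied to the ~300,000 small files
mentioned in the original post:

```python
def mds_size_bytes(n_files: int) -> int:
    """Minimum MDS volume size per the rule above: 400 MB base + 4 kB per file."""
    return 400 * 1024**2 + 4 * 1024 * n_files

def ost_size_bytes(data_bytes: int) -> int:
    """Minimum total OST capacity: expected data size plus 5% overhead."""
    return int(data_bytes * 1.05)

# ~300,000 small files as in the original post:
print(mds_size_bytes(300_000) / 1024**2)  # → 1571.875, i.e. plan ~1.6 GB for the MDS
```

Note that this is only the floor for the MDS volume; the OST volumes are
sized from the data separately, and each OST must stay under 2TB.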
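For reference, LVM2 snapshots are copy-on-write: old blocks are copied to
the snapshot volume and the "live" volume stays intact, which matches the
case described above as "likely OK" but untested. A hedged sketch of the
usual commands (volume names taken from the post, the 10G snapshot size is
an arbitrary assumption), run only on the node that currently owns the
volume, since LVM itself is not cluster-aware:

```shell
# Create a copy-on-write snapshot of the volume backing a Lustre target.
# Safest with the Lustre service on that volume stopped first; snapshots
# of a live target have not been tested by CFS.
lvcreate --snapshot --size 10G --name vol01snap /dev/vg00/vol01

# ...take a backup from /dev/vg00/vol01snap, then drop the snapshot:
lvremove -f /dev/vg00/vol01snap
```

Remember that lvresize on the underlying device remains off-limits either way.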
Slawomir Mroczek
2006-May-19 07:36 UTC
[Lustre-discuss] Lustre, server+client and failover
Hello. I'm at the planning and testing stage of one of my current projects.
I've decided to try implementing Lustre, but after reading the docs, this
mailing list and asking Google for help, I'm a little confused about
whether my decision was right. Let me tell you what I've got and what I'm
trying to get.

I've got two dual Xeon machines running SLES9 SP2 with 3GB RAM each. These
two machines have access to shared storage, which is an HP MSA1000 array in
a RAID ADG (two parity disks) configuration plus a hot-spare disk. The
array has two controllers and two FC switches, and each machine has one FC
link to each array switch using two Qlogic HBAs. There is an FC multipath
configuration running, and both machines can see one large LVM2 volume
/dev/vg00/vol01 and one snapshot volume /dev/vg00/vol01snap.

Network configuration: 4 Gbit ethernet links - eth0 is strictly for
management, eth1 and eth2 are bonded and dedicated to clients (and
connected to two stacked L3 Cisco Gbit switches), and eth3 is for the
interconnect.

The packages used are:
kernel-bigsmp-2.6.5-7.201_lustre.1.4.5.1.i686.rpm
kernel-source-2.6.5-7.201_lustre.1.4.5.1.i686.rpm
lustre-1.4.5.1-2.6.5_7.201_lustre.1.4.5.1bigsmp.i686.rpm
lustre-debuginfo-1.4.5.1-2.6.5_7.201_lustre.1.4.5.1bigsmp.i686.rpm
lustre-modules-1.4.5.1-2.6.5_7.201_lustre.1.4.5.1bigsmp.i686.rpm
lustre-source-1.4.5.1-2.6.5_7.201_lustre.1.4.5.1bigsmp.i686.rpm

And now I would like to put Lustre on this /dev/vg00/vol01. Each machine
should have /mnt/lustre mounted. There will be only three directories on
/mnt/lustre. Two for Oracle (RAC files (not sure about that right now) and
database files, but each table will have a separate chunk file), and one
where about 300,000 small files should go. These small files are input for
another application feeding the Oracle DB with data, and these files will
be pushed here by other hosts - CIFS will be used.

All of this should be highly available. For applications which are not
cluster-aware I'll use heartbeat. That is not the problem.
All I care about right now is having a good cluster FS there. I've tried
RHEL4 with GFS, and when the development team started testing their
applications, all I heard was complaining. I guess GFS was not what I was
looking for. So I switched to OCFS2. All ran fine, but sometimes the second
node would just decide to fence itself with a kernel panic while, e.g., the
first node was rebooting. Troubleshooting got us nowhere. OCFS2 seems to be
useless on production systems, and I don't want to go back to the old OCFS.
DRBD is not an option, because I would have to take care of the mount
points (local and remote). Of course it can be done with heartbeat, and I
have 3 production systems using DRBD with no problems, but it is not what I
want.

Could you please tell me if Lustre is what I want? It seems I need to run
server and client on each machine with failover. I've read that
server+client on one host is not a wise choice, but Mr. Andreas Dilger
wrote there is nothing to worry about. I believe him. I've read a lot about
Lustre. I've got a lot of tips, configuration files and other stuff, but so
far I don't know how to build a proper Lustre configuration with failover.
I would like to ask you to guide me on how to make one, or show me some
working example I could use. If that is no problem for you, of course.
Another question: is there any chance to use LVM2 snapshots with Lustre?

P.S. I know that maybe I'm asking too much, but it's Friday, I've spent the
last two weeks fighting with cluster file systems, and now I've got a
feeling that I'm making trivial mistakes and that is why Lustre doesn't
work for me. My next project is a 10-node cluster running OpenSSI, and I
would like to use Lustre there, too. It seems to be a tough month... Sorry
for my bad english.
--
Sławomir Mroczek