thr3ads.net - Lustre discuss - [Lustre-discuss] one server node fails, its all dead? [Feb 2009]

Robert Minvielle

2009-Feb-02 21:23 UTC

[Lustre-discuss] one server node fails, its all dead?

More testing today. I downed a server (OST) to see what would happen. Well,
it does follow the FAQ :) The FAQ states:

-- begin FAQ --
I don't need failover, and don't want shared storage. How will
this work?

If Lustre is configured without shared storage for failover, and a server node
fails, then a client that tries to use that node will pause until the failed
server is returned to operation. After a short delay (a configurable timeout
value), applications waiting for those nodes can be aborted with a signal (kill
or Ctrl-C), similar to the NFS soft-mount mode.

When the node is returned to service, applications which have not been aborted
will continue to run without errors or data loss.

-- end FAQ --

So, if I have a server that goes down, the clients are out of luck. I have
a hard time believing this is "acceptable". Ok, so it is "as good as" NFS,
but I mean really, if a single storage unit fails all of my clients can do
nothing? Am I missing something here or is this by design? The real reason
I ask is that I am testing Lustre against a few other DPFS to see if we will
move to Lustre. So far, some things are nice, and some are not nice. Writing
seems to be faster, but reading is slower (than my other test DPFSs). 
Contacting Sun to ask about support took forever. At least four days for them
to just call me back and tell me they could not give me a price without 
knowing how much storage I have (ugh, a pay per byte system, great). 

So, Lustre users, is it worth it? My setup would be 24 OST's with about
100TB of storage, 10G ethernet, RAID on each OST, at least 20 or so clients
needing pretty fast read/write, connected via 10G ethernet (yes, I know I 
need a SAN but the physical locations will not allow it and the price is
prohibitive, hence my looking at DPFSs)... Am I on the right track looking
at Lustre, or should I go elsewhere? I also need commercial support of some
kind (although it seems Sun is unsure of themselves here, they did not 
know who to contact when I contacted them "Lustre, we make a product
called Lustre? Hold please"...

Brian J. Murrell

2009-Feb-02 21:46 UTC

[Lustre-discuss] one server node fails, its all dead?

On Mon, 2009-02-02 at 15:23 -0600, Robert Minvielle wrote:
>
> So, if I have a server that goes down, the clients are out of luck.
Without failover configured, yes.
> I have
> a hard time believing this is "acceptable".
Well, that's completely subjective of course.  If it's not acceptable to
you, then you can configure a second node that has access to the (i.e.
shared) storage (i.e. OSTs or MDTs) for the failed node and service will
continue on after the clients discover that the primary has failed and
resume operations with the secondary.  All of this happens transparently
to the applications running on the clients.
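
A minimal sketch of the decision described above, as a toy model only: the
real client does this inside the Lustre stack with timeouts and recovery,
none of which is modelled here. The node names "oss1" and "oss2" and the
helper pick_server are hypothetical.

```python
from typing import Dict, List, Optional


def pick_server(candidates: List[str], is_up: Dict[str, bool]) -> Optional[str]:
    """Return the first reachable node serving a target, or None.

    candidates is ordered primary first, then any failover partner.
    None stands for "the client has to wait for the server to return".
    """
    for node in candidates:
        if is_up.get(node, False):
            return node
    return None


# Hypothetical cluster state: the primary is down, its partner is up.
state = {"oss1": False, "oss2": True}

# Without failover only the primary can serve the target, so I/O blocks.
print(pick_server(["oss1"], state))          # None

# With a failover partner on shared storage, I/O resumes on the secondary.
print(pick_server(["oss1", "oss2"], state))  # oss2
```

The only difference between the two calls is the length of the candidate
list, which is what configuring a failover pair adds.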
> Ok, so it is "as good as" NFS,
If you don't configure failover, yes.  If you configure failover then
it's better.
> but I mean really, if a single storage unit fails all of my clients can do
> nothing?
No, that's not true.  Even if you don't have failover configured, any
clients that do not attempt to access any files (or file stripes) on
that failed OST don't even notice and continue on merrily.
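
A minimal sketch of that point, assuming a simple round-robin stripe layout.
This is a toy model, not Lustre's actual object allocator; the file names
and both helper functions are hypothetical, and the count of 24 OSTs is
taken from the original post.

```python
from typing import Dict, List, Set


def stripe_layout(start_ost: int, stripe_count: int, num_osts: int) -> List[int]:
    """Toy round-robin layout: indices of the OSTs holding a file's stripes."""
    return [(start_ost + i) % num_osts for i in range(stripe_count)]


def affected_files(layouts: Dict[str, List[int]], failed_ost: int) -> Set[str]:
    """Files with at least one stripe on the failed OST."""
    return {name for name, osts in layouts.items() if failed_ost in osts}


NUM_OSTS = 24  # from the original post

layouts = {
    "a.dat": stripe_layout(0, 1, NUM_OSTS),   # [0]
    "b.dat": stripe_layout(5, 4, NUM_OSTS),   # [5, 6, 7, 8]
    "c.dat": stripe_layout(22, 4, NUM_OSTS),  # [22, 23, 0, 1]
}

# Only clients touching b.dat would block if OST 7 went away.
print(sorted(affected_files(layouts, failed_ost=7)))  # ['b.dat']
```

In this model, a wider stripe count spreads each file over more OSTs, so
more files are touched by any single OST failure.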
> Am I missing something here or is this by design?
It's by design.
> Contacting Sun to ask about support took forever. At least four days for them
> to just call me back and tell me they could not give me a price without 
> knowing how much storage I have (ugh, a pay per byte system, great). 
No.  You must have misunderstood.  We don't charge "per byte".  IIUC,
support costs are a function of how many OSSes you have.

b.

Craig Tierney

2009-Feb-02 21:52 UTC

[Lustre-discuss] one server node fails, its all dead?

Robert Minvielle wrote:
> More testing today. I downed a server (OST) to see what would happen. Well,
> it does follow the FAQ :) The FAQ states:
> 
> -- begin FAQ --
> I don't need failover, and don't want shared storage. How will this work?
> 
> If Lustre is configured without shared storage for failover, and a server
> node fails, then a client that tries to use that node will pause until the
> failed server is returned to operation. After a short delay (a configurable
> timeout value), applications waiting for those nodes can be aborted with a
> signal (kill or Ctrl-C), similar to the NFS soft-mount mode.
> 
> When the node is returned to service, applications which have not been
> aborted will continue to run without errors or data loss.
> 
> -- end FAQ --
> 
> So, if I have a server that goes down, the clients are out of luck. I have
> a hard time believing this is "acceptable". Ok, so it is "as good as" NFS,
> but I mean really, if a single storage unit fails all of my clients can do
> nothing? Am I missing something here or is this by design? 
It says that if a server node fails, then any client trying to access that
server node will block until its function is restored.  If you don't want
failover nor shared storage, what is Lustre supposed to do?  Other clients
can write data to other nodes, and they can read data if it is on a working
server node.

> The real reason
> I ask is that I am testing Lustre against a few other DPFS to see if we will
> move to Lustre. So far, some things are nice, and some are not nice. Writing
> seems to be faster, but reading is slower (than my other test DPFSs).
> Contacting Sun to ask about support took forever. At least four days for them
> to just call me back and tell me they could not give me a price without
> knowing how much storage I have (ugh, a pay per byte system, great).
> 
Performance is going to vary greatly with the storage hardware used.  Sun
can help there (if you can find the right people to talk to).
> So, Lustre users, is it worth it? My setup would be 24 OST's with about
> 100TB of storage, 10G ethernet, RAID on each OST, at least 20 or so clients
> needing pretty fast read/write, connected via 10G ethernet (yes, I know I 
> need a SAN but the physical locations will not allow it and the price is
> prohibitive, hence my looking at DPFSs)... Am I on the right track looking
> at Lustre, or should I go elsewhere? I also need commercial support of some
> kind (although it seems Sun is unsure of themselves here, they did not 
> know who to contact when I contacted them "Lustre, we make a product
> called Lustre? Hold please"... 
Other companies can provide Lustre support.  Data Direct Networks provides
a packaged solution with their hardware.  It can be configured with failover
and all the other goodies one would want to ensure maximum uptime.  Terascala
provides Lustre support with their hardware.  Although they are working on
(or may have delivered) failover configurations, their initial releases were
non-shared storage, no-failover configurations.

Craig


> _______________________________________________
> Lustre-discuss mailing list
> Lustre-discuss at lists.lustre.org
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
> 

-- 
Craig Tierney (craig.tierney at noaa.gov)

Andreas Dilger

2009-Feb-02 21:57 UTC

[Lustre-discuss] one server node fails, its all dead?

On Feb 02, 2009  15:23 -0600, Robert Minvielle wrote:
> So, if I have a server that goes down, the clients are out of luck. I have
> a hard time believing this is "acceptable". Ok, so it is "as good as" NFS,
> but I mean really, if a single storage unit fails all of my clients can do
> nothing? Am I missing something here or is this by design?
It is possible for clients to create new files while a server is down,
but as you can expect it isn't possible to read any data from the failed
server.  In some cases users have used DRBD to do device replication
instead of using shared storage.
> The real reason
> I ask is that I am testing Lustre against a few other DPFS to see if we will
> move to Lustre. So far, some things are nice, and some are not nice. Writing
> seems to be faster, but reading is slower (than my other test DPFSs).
> Contacting Sun to ask about support took forever. At least four days for them
> to just call me back and tell me they could not give me a price without 
> knowing how much storage I have (ugh, a pay per byte system, great). 
You can imagine that supporting the largest Lustre filesystem (1300+ OSTs
with 10PB of storage and 30k+ clients) will take more effort than supporting
a system with a handful of OSTs and clients.  The support price is not per
client, but rather per-OST, IIRC.
> So, Lustre users, is it worth it? My setup would be 24 OST's with about
> 100TB of storage, 10G ethernet, RAID on each OST, at least 20 or so clients
> needing pretty fast read/write, connected via 10G ethernet (yes, I know I 
> need a SAN but the physical locations will not allow it and the price is
> prohibitive, hence my looking at DPFSs)... Am I on the right track looking
> at Lustre, or should I go elsewhere? I also need commercial support of some
> kind (although it seems Sun is unsure of themselves here, they did not 
> know who to contact when I contacted them "Lustre, we make a product
> called Lustre? Hold please"... 
Well, Sun is a big company, and Lustre was only acquired a year ago and
does not necessarily generate a high call volume to the L1 support people,
so they are not necessarily going to have information immediately handy.

Note that Lustre itself does NOT need a SAN to work, unlike some other
cluster filesystems.  The only SAN requirement is for failover pairs of
servers.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

Kevin Van Maren

2009-Feb-03 14:46 UTC

[Lustre-discuss] one server node fails, its all dead?

On Feb 2, 2009, at 2:57 PM, Andreas Dilger <adilger at sun.com> wrote:
> On Feb 02, 2009  15:23 -0600, Robert Minvielle wrote:
>> So, if I have a server that goes down, the clients are out of luck.  
>> I have
>> a hard time believing this is "acceptable". Ok, so it is "as good
>> as" NFS,
>> but I mean really, if a single storage unit fails all of my clients  
>> can do
>> nothing? Am I missing something here or is this by design?
>
> It is possible for clients to create new files while a server is down,
> but as you can expect it isn't possible to read any data from the
> failed
> server.  In some cases users have used DRBD to do device replication
> instead of using shared storage.
>
>> The real reason
>> I ask is that I am testing Lustre against a few other DPFS to see  
>> if we will
>> move to Lustre. So far, some things are nice, and some are not  
>> nice. Writing
>> seems to be faster, but reading is slower (than my other test DPFSs).
>> Contacting Sun to ask about support took forever. At least four  
>> days for them
>> to just call me back and tell me they could not give me a price  
>> without
>> knowing how much storage I have (ugh, a pay per byte system, great).
>
> You can imagine that supporting the largest Lustre filesystem (1300+  
> OSTs
> with 10PB of storage and 30k+ clients) will take more effort than  
> supporting
> a system with a handful of OSTs and clients.  The support price is  
> not per
> client, but rather per-OST, IIRC.
Just to clarify: Lustre support prices are currently based on the  
number of OSS servers (machines serving OSTs), not the number of OSTs,  
and not the size of the storage.
>
>> So, Lustre users, is it worth it? My setup would be 24 OST's with
>> about
>> 100TB of storage, 10G ethernet, RAID on each OST, at least 20 or so  
>> clients
>> needing pretty fast read/write, connected via 10G ethernet (yes, I  
>> know I
>> need a SAN but the physical locations will not allow it and the  
>> price is
>> prohibitive, hence my looking at DPFSs)... Am I on the right track  
>> looking
>> at Lustre, or should I go elsewhere? I also need commercial support  
>> of some
>> kind (although it seems Sun is unsure of themselves here, they did  
>> not
>> know who to contact when I contacted them "Lustre, we make a product
>> called Lustre? Hold please"...
>
> Well, Sun is a big company, and Lustre was only acquired a year ago  
> and
> does not necessarily generate a high call volume to the L1 support  
> people,
> so they are not necessarily going to have information immediately  
> handy.
>
> Note that Lustre itself does NOT need a SAN to work, unlike some other
> cluster filesystems.  The only SAN requirement is for failover pairs  
> of
> servers.
Sun also has low-cost shared storage options for Lustre -- search for
Sun Storage Cluster.
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.