Hello,

Just signed up today, so please forgive me if this question has been covered recently. I am in a bit of a rush to get an answer on this as we need to make a decision soon; the idea of using Lustre was thrown into the mix very late in the decision-making process.

We are looking to procure a new storage solution which will predominantly be used for HPC output but will also be used as our main business-centric storage for day-to-day use, meaning the file system needs to be available 24/7/365. The last time I was involved in considering Lustre was about 6 years ago, and at that time it was being considered for scratch space for HPC usage only.

Our VMs and databases would remain on non-Lustre storage, as we already have that in place and it works well. The Lustre file system would potentially have everything else. Projects we work on typically take up to 2 years to complete, and during that time we would want all assets to remain on the file system.

Some of the vendors on our short list include HDS (BlueArc), Isilon and NetApp. Last week we started bouncing the idea of using Lustre around. I'd love to use it if it is considered stable enough to do so.

Your thoughts and/or comments would be greatly appreciated. Thanks for your time.

greg
Hammitt, Charles Allen
2013-Jan-17 19:20 UTC
[Lustre-discuss] is Luster ready for prime time?
Somewhat surprised that no one has responded yet, although it's likely that the responses would be rather subjective... including mine, of course!

Generally I would say that it would be interesting to know more about your datasets and intended workload; however, you mention this is to be used as your day-to-day main business storage, so I imagine those characteristics would greatly vary... mine certainly do, that much is for sure!

I don't really think uptime would be as much of an issue here; there are lots of redundancies, recovery mechanisms, and plenty of stable branches to choose from... the question becomes what the feature-set needs are, the performance and usability for different file types and workloads, and the general comfort level with greater complexity and somewhat fewer resources. That said, I'd personally be a bit wary of using it as a general filesystem for all your needs.

I do find it interesting that your short list is a wide-ranging mix of storage and filesystem types: traditional NAS, scale-out NAS, and then some block storage with a parallel filesystem in Lustre. Why no GPFS on the list for comparison?

I currently manage, or have used in the past [BlueArc], all the storage / filesystems on your list and more. The reason being that different storage and filesystem components have some things they are good at... while other things they might not be as good at doing. So I diversify by putting the different storage/filesystem component pieces in the areas where they excel best...

Regards,

Charles
Hi Greg,

In general, like all file systems, you're limited by how stable and reliable your hardware platform is. If you're building something yourself then Lustre becomes much more work, due to the need to keep up with stability patches as well as addressing issues directly related to your use case and hardware profile.

In my opinion Lustre is no less stable than any other file system technology, especially when you're talking about the 1.8 revisions (which are very stable); however, you have many more things which can go wrong, as you're usually talking about many more components which can fail. A correctly architected cluster with a proper failover environment should leave the file system trouble free, unless of course you hit a bug.

There are many people on this list (including myself) that run Lustre as a /home file system without issues. Again, in most cases issues are introduced when you're overtaxing your hardware, or you have a hardware failure and a poor failover environment. There are many vendors which can set up a very robust file system for you; however, remember that if you're looking for the cheapest option, you get what you pay for.

-cf
Greg,

I'm echoing Charles' comments a bit. Specific filesystems are not good at everything. While it is my opinion that Lustre can be very stable, and, as Colin stated, the underlying hardware and configuration are crucial to that end, the filesystem may not be the best performer at every data access model. Like every other filesystem, Lustre has use cases where it excels and others where the overhead may be less than optimal. Other filesystems and storage devices also suffer from "one size fits most". Many here would likely be biased toward Lustre, but many of those people have also used many other options on the market and ended up here.

--Jeff

--
------------------------------
Jeff Johnson
Co-Founder
Aeon Computing

jeff.johnson at aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101   f: 858-412-3845
m: 619-204-9061

4170 Morena Boulevard, Suite D - San Diego, CA 92117
Hi Charles,
I received a few challenging off-list email messages along with a few fishing ones, but it's all good. It's interesting how a post asking a question can make someone appear angry. 8)
Our IO profiles from the different segments of our business do vary greatly. The HPC side is more or less the typical load you would expect to see, depending on which software is in use for the job being run. We have hundreds of artists and administrative staff who use the file system in a variety of ways. Some examples would include, but are not limited to: saving out multiple revisions of Photoshop documents (typically in the hundreds of megs to 1+ gig range); video editing (stereoscopic 2k and 4k images, again from tens or hundreds of megs to gigs in size, including uncompressed video); Excel, Word and similar files; and thousands of project files (from software such as Maya, Nuke and similar) which also vary widely in size, from one to thousands of megs.
The intention is to keep our databases and VM requirements on the existing file system, which is comprised of about 100 10k SAS drives; it works well.
We did consider GPFS, but that consideration went out the door once I started talking to them and hammering some numbers into their online calculator. Things got a bit crazy quickly. They have different pricing for the different types and speeds of Intel CPUs, and I got the feeling they were trying to squeeze every penny out of customers they could. It felt very Brocade-ish and left a bad taste with us. It wouldn't have been much of a problem at some other shops I've worked at, but here we do have a finite budget to work within.
The NAS vendors could all be considered scale-out, I suspect. All three can scale out the storage and the front end: NetApp C-mode can have up to 24 heads, BlueArc goes up to 4 or 8 depending on the class, Isilon can go up to 24 nodes or more as well if memory serves me correctly, and they all have a single-namespace solution in place. They each have their limits, but for our use case those limits are largely academic; we will not hit the limits of their scalability before we are considering a forklift refresh. In our view, for what they offer it is pretty much a wash - any would meet our needs. NetApp still has a silly aggregate/volume size limit, though at least it is up to 90TB now (from 9 in the past, in formatted filesystem space); in April it is supposed to go much higher.
On the block storage idea in the mix - since all our HPC is Linux, those nodes would all become Lustre clients. To provide a gateway into the Lustre storage for non-Linux/Lustre hosts, the thinking was a clustered pair of Linux boxes running Samba/NFS which were also Lustre clients. It's just an idea being bounced around at this point. The data-serving requirements of the non-HPC parts of the business are much lower. The video editors would most likely stay on our existing storage solution, as that is working out very well for them, but even if we did put them onto the Lustre FS, I think they would be fine. Based on that, it didn't seem so crazy to consider block access in this manner. That said, I think we would be one of the first in M&E to do so - pioneers, if you will...
diversify - we will end up in the same boat for the same reasons.
thanks Charles,
greg
Hi Greg,
One of our customers had a similar requirement and we deployed Lustre
2.0.0.1 for them. This was in July 2011. Though there were a lot of
problems initially, all of them were sorted out over time. They are quite
happy with it now.
*Environment:*
It's a 150-artist studio with around 60 render nodes. The studio mainly uses
Mocha, After Effects, Silhouette, Synth Eye, Maya, and Nuke among others.
They mainly work on 3D Effects and Stereoscopy Conversions.
Around 45% of Artists and Render Nodes are on Linux and use native Lustre
Client. All others access it through Samba.
*Lustre Setup:*
It consists of 2 x Dell R610 as MDS Nodes, and 4 x Dell R710 as OSS Nodes.
2 x Dell MD3200 with 12 x 1TB SAS nearline disks are used for storage. Each Dell MD3200 is shared between 2 OSS nodes for H/A.
Since the original plan (which didn't happen) was to move to a 100% Linux environment, we didn't allocate separate Samba gateways and instead use the OSS nodes with CTDB for that role. Thankfully, we haven't had any issues with it yet.
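Incidentally, the general shape of that setup is just clustered Samba exporting a Lustre client mount. A minimal sketch of the relevant pieces (the share name, paths and files below are made up for illustration, not our actual configuration) -
--------------------------------------------------------------------------------------------------
# /etc/samba/smb.conf (relevant bits only)
[global]
    clustering = yes                     # let CTDB coordinate the Samba nodes
    # authentication / idmap settings omitted

[projects]
    path = /mnt/lustre/projects          # Lustre client mountpoint on the gateway / OSS node
    read only = no

# /etc/sysconfig/ctdb
CTDB_RECOVERY_LOCK=/mnt/lustre/.ctdb/reclock      # lock file on the shared (Lustre) filesystem
CTDB_NODES=/etc/ctdb/nodes                        # private IPs of the gateway nodes
CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses  # floating IPs handed out to clients
CTDB_MANAGES_SAMBA=yes
--------------------------------------------------------------------------------------------------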
*Performance:*
We get good throughput of 800 - 1000MB/s with Lustre caching. The disks themselves provide much lower speeds, but that is fine, as caching is in effect most of the time.
*Challenge:*
The challenge for us was to tune the storage for small files (10 - 50MB each, totalling tens of GBs). An average shot would consist of 2000 - 4000 .dpx images. Some scenes / shots also had millions of <1MB Maya cache files. This did tax the storage, especially the MDS; we fixed it to an extent by adding more RAM to the MDS.
*Suggestions:*
1. Get the real number of small files (I mean <1MB ones) created / used by
all software. These are the ones that could give you the most trouble. Do
not assume anything.
2. Get the file sizes, numbers and access patterns absolutely correct.
This is the key; it's easier to design and tune Lustre for large files and I/O.
(A rough way to survey an existing project tree is sketched just after this list.)
3. Network tuning is as important as storage tuning. Tune Switches, each
Workstation, Render Nodes, Samba / NFS Gateways, OSS Nodes, MDS Nodes,
everything.
4. Similarly, do not underestimate the Samba / NFS Gateway. Size and tune them
correctly too.
5. Use High Speed Switching like QDR Infiniband or 40GigE, especially for
backend connectivity between Samba/NFS Gateway and Lustre MDS/OSS Nodes.
6. As far as possible, have a fixed directory pattern for all projects.
Separate working files (Maya, Nuke, etc.) from the data, i.e. frames /
images, videos, etc., at the top directory level itself. This will help you
tune / manage the storage better: a different directory tree for different
file sizes or file access types.
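For example, something along these lines, run against a representative project tree (the path here is only an example), will give you the small-file count and a rough size distribution -
--------------------------------------------------------------------------------------------------
# Dump all file sizes once, then analyse (GNU find; /projects is an example path)
find /projects -type f -printf '%s\n' > /tmp/filesizes.txt

# Number of files under 1MB
awk '$1 < 1048576' /tmp/filesizes.txt | wc -l

# Rough histogram: file count per power-of-two size bucket (sizes in bytes)
awk '{ b = 1; while (b * 2 <= $1) b *= 2; h[b]++ }
     END { for (b in h) print b, h[b] }' /tmp/filesizes.txt | sort -n
--------------------------------------------------------------------------------------------------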
If designed and tuned right, I think Lustre is the best storage currently
available for your kind of work.
Hope this helps.
Regards,
Indivar Nair
Thanks very much Indivar, informative read. It is good to see that others in our sector are using the technology, and you have some good points.

Have a great day,
greg
Indivar,

I would be very interested to see what tuning parameters you have set to tune Lustre and the storage for small files. I have had similar setups in the past and been stumped by the small file performance.

--
Bobbie Lind
Hi Bobbie,
Small file performance is an issue.
It is the caching that balances it out. Due to the nature of the work, all
nodes in a given pool will always ask for the same set of files. So the
initial response to requests may be slow, but the subsequent ones are fine.
As I mentioned earlier, we also had problems with listing large directories. We worked around it by having a cron job on the Samba gateway stat the files in the large directories at regular intervals, thereby keeping the OSS vfs cache primed at all times.
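The job itself is trivial; a minimal sketch (the path, depth and interval here are made up - point it at your busiest directories) -
--------------------------------------------------------------------------------------------------
# /etc/cron.d/prime-lustre-cache  -- illustrative only
# stat everything under the busiest project trees every 10 minutes so that
# directory listings stay warm in the caches
*/10 * * * * root find /mnt/lustre/projects -maxdepth 3 -exec stat {} + > /dev/null 2>&1
--------------------------------------------------------------------------------------------------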
Play around with these parameters on the MDS, OSS and gateway nodes... it works out differently for everyone -
--------------------------------------------------------------------------------------------------------------------------------------------------------------
sysctl -w vm.vfs_cache_pressure=2
sysctl -w vm.dirty_ratio=15
sysctl -w vm.swappiness=90               # swapping out regularly makes more space for caches
sysctl -w vm.dirty_background_ratio=4
--------------------------------------------------------------------------------------------------------------------------------------------------------------
On the gateways / clients, run this each time after mounting Lustre -
--------------------------------------------------------------------------------------------------------------------------------------------------------------
# raise the number of concurrent RPCs each client keeps in flight per OST,
# and the amount of dirty data it may cache per OSC
pushd /proc/fs/lustre/osc
for ost in *-OST*
do
    echo 32 > ${ost}/max_rpcs_in_flight
done
popd
lctl set_param osc.*.max_dirty_mb=512
--------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------------------------------------------------------------------
/proc/fs/lustre/llite/<fsname>-<uid>/max_read_ahead_mb          # at the default of 40MB, as most of our files are in the 10MB range
/proc/fs/lustre/llite/<fsname>-<uid>/max_read_ahead_whole_mb    # set to 10MB
/proc/fs/lustre/llite/*/statahead_max                           # set to 8192
--------------------------------------------------------------------------------------------------------------------------------------------------------------
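The same llite values can also be set with lctl on each client, something like -
--------------------------------------------------------------------------------------------------
lctl set_param llite.*.max_read_ahead_mb=40          # left at the 40MB default
lctl set_param llite.*.max_read_ahead_whole_mb=10
lctl set_param llite.*.statahead_max=8192
--------------------------------------------------------------------------------------------------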
Regards,
Indivar Nair
On Tue, Jan 22, 2013 at 12:55 AM, Lind, Bobbie J <bobbie.j.lind at
intel.com>wrote:
> Indivar,
>
> I would be very interested to see what tuning parameters you have set to
> tune lustre and the storage for small files. I have had similar setups in
> the past and been stumped by the small file performance.
>
> --
> Bobbie Lind
>
>
>
> >Date: Mon, 21 Jan 2013 11:24:32 -0500
> >From: greg whynott <greg.whynott at gmail.com>
> >Subject: Re: [Lustre-discuss] is Luster ready for prime time?
> >To: Indivar Nair <indivar.nair at techterra.in>
> >Cc: "lustre-discuss at lists.lustre.org"
> > <lustre-discuss at lists.lustre.org>
> >Message-ID:
> > <CAKuzA1G4-W122LQrf3VKqADd> WrDgcAVx5hyAGJfZwwR8KKG2g at
mail.gmail.com>
> >Content-Type: text/plain; charset="utf-8"
> >
> >Thanks very much Indivar, informative read. it is good to see
others
> >in
> >our sector are using the technology and you have some good points.
> >
> >have a great day,
> >greg
> >
> >
> >
> >On Sat, Jan 19, 2013 at 6:52 AM, Indivar Nair
> ><indivar.nair at techterra.in>wrote:
> >
> >> Hi Greg,
> >>
> >> One of our customers had a similar requirement and we deployed
Lustre
> >> 2.0.0.1 for them. This was in July 2011. Though there were a lots
of
> >> problems initially, all of them were sorted out over time. They
are
> >>quite
> >> happy with it now.
> >>
> >> *Environment:*
> >> Its a 150 Artist studio with around 60 Render nodes. The studio
mainly
> >> uses Mocha, After Effects, Silhouette, Synth Eye, Maya, and Nuke
among
> >> others. They mainly work on 3D Effects and Stereoscopy
Conversions.
> >> Around 45% of Artists and Render Nodes are on Linux and use native
> >>Lustre
> >> Client. All others access it through Samba.
> >>
> >> *Lustre Setup:*
> >> It consists of 2 x Dell R610 as MDS Nodes, and 4 x Dell R710 as
OSS
> >>Nodes.
> >> 2 x Dell MD3200 with 12x1TB SAS Nearline Disks are used for
storage.
> >>Each
> >> Dell MD3200s are shared among 2 OSS nodes for H/A.
> >>
> >> Since the original plan (which didn''t happen) was to move
to a 100%
> >>Linux
> >> environment, we didn''t allocate separate Samba Gateways
and use the OSS
> >> nodes with CTDB for it. Thankfully, we haven''t had any
issues with that
> >>yet.
> >>
> >> *Performance:*
> >> We get a good THROUGHPUT of 800 - 1000MB/s with Lustre Caching.
The
> >>disks
> >> it self provide much lesser speeds. But that is fine, as caching
is in
> >> effect most of the time.
> >>
> >> *Challenge:*
> >> The challenge for us was to tune the storage for small files 10 -
50MB
> >> totalling to 10s of GBs. An average shot would consist of 2000 -
4000
> >>.dpx
> >> images. Some Scenes / Shots also had millions of <1MB Maya
Cache files.
> >> This did tax the storage, especially the MDS. Fixed it to an
extent by
> >> adding more RAM to MDS.
> >>
> >> *Suggestions:*
> >>
> >> 1. Get the real number of small files (I mean <1MB ones)
created / used
> >>by
> >> all software. These are the ones that could give you the most
trouble.
> >>Do
> >> not assume anything.
> >>
> >> 2. Get the file - sizes, numbers and access patterns absolutely
correct.
> >> This is the key.
> >> Its easier to design and tune Lustre for large files and I/O.
> >>
> >> 3. Network tuning is as important and storage tuning. Tune
Switches,
> >>each
> >> Workstation, Render Nodes, Samba / NFS Gateways, OSS Nodes, MDS
Nodes,
> >> everything.
> >>
> >> 4. Similarly do not undermine Samba / NFS Gateway. Size and tune
them
> >> correctly too.
> >>
> >> 5. Use High Speed Switching like QDR Infiniband or 40GigE,
especially
> >>for
> >> backend connectivity between Samba/NFS Gateway and Lustre MDS/OSS
Nodes.
> >>
> >> 6. As far as possible, have fixed directory pattern for all
projects.
> >> Separate working files (Maya, Nuke, etc.) from the data, i.e.
frames /
> >> images, videos, etc. at the top directory level it self. This will
help
> >>you
> >> tune / manage the storage better. Different directory tree for
different
> >> file sizes or file access types.
> >>
> >> If designed and tuned right, I think Lustre is best storage
currently
> >> available for your kind of work.
> >>
> >> Hope this helps.
> >>
> >> Regards,
> >>
> >>
> >> Indivar Nair
> >>
> >>
> >> On Fri, Jan 18, 2013 at 1:51 AM, greg whynott
> >><greg.whynott at gmail.com>wrote:
> >>
> >>> Hi Charles,
> >>>
> >>> I received a few off list challenging email messages along
with a few
> >>> fishing ones, but its all good. its interesting how a post
asking a
> >>> question can make someone appear angry. 8)
> >>>
> >>> Our IO profiles from the different segments of our business do vary
> >>> greatly. The HPC side is more or less the typical load you would expect
> >>> to see, depending on which software is in use for the job being run. We
> >>> have hundreds of artists and administrative staff who use the file
> >>> system in a variety of ways. Some examples would include, but are not
> >>> limited to: saving out multiple revisions of Photoshop documents
> >>> (typically in the hundreds of megs to 1+ gig range); video editing with
> >>> stereoscopic 2K and 4K images (again from tens or hundreds of megs to
> >>> gigs in size), including uncompressed video; Excel, Word and similar
> >>> files; and thousands of project files (from software such as Maya, Nuke
> >>> and similar), which also vary greatly in size, from one to thousands of
> >>> megs.
> >>>
> >>> The intention is to keep our databases and VM requirements on the
> >>> existing file system, which is comprised of about 100 10k SAS drives; it
> >>> works well.
> >>>
> >>> We did consider GPFS, but that consideration went out the door once I
> >>> started talking to them and hammering some numbers into their online
> >>> calculator. Things got a bit crazy quickly. They have different pricing
> >>> for the different types and speeds of Intel CPUs. I got the feeling they
> >>> were trying to squeeze every penny they could out of customers. It felt
> >>> very Brocade-ish and left a bad taste with us. That wouldn't have been
> >>> much of a problem at some other shops I've worked at, but here we do
> >>> have a finite budget to work within.
> >>>
> >>> The NAS vendors could all be considered scale-out, I suspect. All three
> >>> can scale out the storage and the front end. NetApp C-mode can have up
> >>> to 24 heads, BlueArc goes up to 4 or 8 depending on the class, and
> >>> Isilon can go up to 24 nodes or more as well if memory serves me
> >>> correctly, and they all have a single-namespace solution in place. They
> >>> each have their limits, but for our use case those limits are largely
> >>> academic: we will not hit the ceiling of their scalability before we are
> >>> considering a forklift refresh. In our view, for what they offer it is
> >>> pretty much a wash; any of them would meet our needs. NetApp still has a
> >>> silly aggregate/volume size limit, though at least it is up to 90TB now
> >>> (from 9 in the past, formatted filesystem space); in April it is
> >>> supposed to go much higher.
> >>>
> >>> The block storage idea in the mix: since all our HPC is Linux, those
> >>> hosts would all become Lustre clients. To provide a gateway into the
> >>> Lustre storage for non-Linux/Lustre hosts, the thinking was a clustered
> >>> pair of Linux boxes running Samba/NFS which were also Lustre clients.
> >>> It's just an idea being bounced around at this point. The data-serving
> >>> requirements of the non-HPC parts of the business are much smaller. The
> >>> video editors would most likely stay on our existing storage solution,
> >>> as that is working out very well for them, but even if we did put them
> >>> onto the Lustre FS, I think they would be fine. Based on that, it didn't
> >>> seem so crazy to consider block access in this manner. That said, I
> >>> think we would be one of the first in M&E to do so, pioneers if you
> >>> will...
> >>>
> >>>
> >>> On diversifying: we will end up in the same boat, for the same reasons.
> >>>
> >>>
> >>> thanks Charles,
> >>> greg