Dear Experts,

I would like to ask everybody: what would you advise to use as a storage
cluster, or as a distributed filesystem?

I did my own research of what I can do, but I hit a snag with my seemingly
best choice, so I finally decided to stay away from it and ask clever people
what they would use.

My requirements are:

1. I would like to have one big (say, comparable to a petabyte) filesystem,
accessible on more than one machine, composed of disk space leftovers on a
bunch of machines with 1 gigabit per second ethernet connections.

2. It can be a bit slow, as it is the sort of filesystem one would need for
backups (say, using bacula or bareos), and/or for long term storage of large
datasets, portions of which can be copied over to faster storage for
processing if necessary. I would be thinking of 1-2 TB of data written to it
daily.

3. It would be great to have it resilient to a single machine failure/reboot.

4. Metadata machines should be redundant (or at least the backup metadata
host should be manually convertible into the master metadata host if the
master fails fatally or its data gets corrupted).

What I would like to avoid/exclude:

1. Proprietary commercial solutions, as:

a. I would like to stay on as minimal a budget as possible.
b. I want to be able to predict that it will exist for a long time, and I
have better experience with my predictions of this sort about open source
projects as opposed to proprietary ones.

2. Open source solutions using portions of proprietary closed source
binaries/libraries (e.g., I would like to stay away from google proprietary
code/binaries/libraries/modules).

3. Kernel level modifications. I really would like to have this as
independent of the OS as I can, or rather available on multiple OSes (though
I do not like Java based things - just my personal experience with some of
them). I have a bunch of Linux boxes and a bunch of FreeBSD boxes, and I do
not want to exclude either of them if possible. Also, the need to have a
custom Linux kernel specifically scares me: Linux kernels get critical
updates often, and having customizations lag behind a needed critical update
is as unpleasant as having to reboot the machine because of a kernel update.

I'm not too scared of "split nature" projects: proprietary projects having an
open source satellite. I have mixed experience with those (using the open
source satellite, I mean). Some of them are indeed not neglected, and even
though you may be missing some features the commercial counterpart has, some
are really great: they are just missing commercial support, and maybe have a
bit sparse documentation, thus making you invest more effort into making them
work, which I don't mind: I can earn my sysadmin's salary here. I would say I
have more often had good experiences with those than bad ones (and I have a
list of early indications of a potential bad outcome, so I can more or less
predict my future with this kind of project).

<rant>
I really didn't mean to write this, but I figure it will probably surface
once I start getting your advice, so here it is. I did my research with my
requirements in mind and came up with a solution: moosefs. It is not reviewed
much (no reviews with criticism at all), and not much could be found (by me,
at least) in the way of howtos about customization, performance tuning, etc.
It installs without a hitch. It runs well, until you start stress-writing a
lot to it in parallel; then it started performing exponentially badly for me.
This is where extensive attempts to find performance tuning documentation
came up empty.
What made my decision to never ever use it in the future was the following.
I started migrating data back from moosefs to a local UFS filesystem (that is
a FreeBSD box) using rsync. What I observed was: source files changed their
timestamps after rsync had touched them. As if, instead of a creation
timestamp, moosefs keeps an access timestamp. This renders rsync from moosefs
useless, as you cannot re-run a failed rsync, and you obliterate some of the
metadata of the source (the "creation" timestamp). I wrote an e-mail to the
sourceforge moosefs mailing list, mentioning all this and the fact that I am
using open source moosefs. The next day they replied asking whether I use
version 3."this" or version 3."that", as they wanted to know in which of them
they have the bug. Whereas the latest open source version they have
everywhere, including sourceforge, is an older version: 2.0.88.
Basically, my decision was made. Sorry for venting it out here, but I figured
it would come up at some point once I started getting your advice.
</rant>

Thanks a lot for all your advice!

Valeri

++++++++++++++++++++++++++++++++++++++++
Valeri Galtsev
Sr System Administrator
Department of Astronomy and Astrophysics
Kavli Institute for Cosmological Physics
University of Chicago
Phone: 773-702-4247
++++++++++++++++++++++++++++++++++++++++
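A quick way to check whether merely reading files through such a mount bumps
the source timestamps is to compare stat output around a copy. A minimal
sketch, with all paths made up for illustration:

    # hypothetical paths: /mnt/mfs is the distributed-FS mount,
    # /tank/restore is the local UFS target
    stat /mnt/mfs/dataset/file.dat        # note atime/mtime before the copy
    rsync -a /mnt/mfs/dataset/ /tank/restore/dataset/
    stat /mnt/mfs/dataset/file.dat        # source times should be unchanged;
                                          # -a (archive) also preserves mtime
                                          # on the destination copy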
On 22/04/16 03:18 PM, Valeri Galtsev wrote:
> Dear Experts,
>
> I would like to ask everybody: what would you advise to use as a storage
> cluster, or as a distributed filesystem.
[...]
> 1. I would like to have one big (say, comparable to a petabyte) filesystem,
> accessible on more than one machine, composed of disk space leftovers on a
> bunch of machines with 1 gigabit per second ethernet connections.

This sounds like you want a cloud-type storage, like ceph or gluster. I
don't use either, so I can't speak to them in detail.

[...]
> 3. It would be great to have it resilient to a single machine failure/reboot.

HA solutions put a priority on resilience, not resource utilization
efficiency, so you need to pick your priority. If you put a priority on
resilience and availability, you'll want to do something like create two
machines with equal storage, configure them in single-primary (DRBD) and
use a floating IP to export the space over NFS or similar.

Then you would use pacemaker to manage the floating IP, fence (stonith) a
lost node, and promote drbd -> mount FS -> start nfsd -> start floating IP.

This is not efficient, but it is very resilient. All of this is 100% open
source.

[...]

Before you go any further, you need to decide what your priority is. If you
need resilience, prepare to invest in the back-end hardware. If it is more
important to scrape unused resources from everywhere, then resilience is not
going to be so good.

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?
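As a rough illustration of the stack described above, the pacemaker side
might be wired up with pcs along these lines. The resource names, addresses,
devices and the fencing agent are all assumptions, and exact syntax varies
between pacemaker/pcs versions; treat it as a sketch, not a recipe:

    # floating IP the clients mount NFS from (address is made up)
    pcs resource create nfs_ip ocf:heartbeat:IPaddr2 ip=192.168.1.200 cidr_netmask=24

    # DRBD resource "r0" (name assumed), promoted to primary on one node at a time
    pcs resource create drbd_r0 ocf:linbit:drbd drbd_resource=r0
    pcs resource master drbd_r0_ms drbd_r0 master-max=1 clone-max=2 notify=true

    # filesystem on top of DRBD, then the NFS server itself
    pcs resource create export_fs ocf:heartbeat:Filesystem \
        device=/dev/drbd0 directory=/srv/export fstype=xfs
    pcs resource create nfsd ocf:heartbeat:nfsserver

    # fencing (stonith); fence_ipmilan is just one possible agent
    pcs stonith create fence_node1 fence_ipmilan ipaddr=10.0.0.1 login=admin passwd=secret

    # ordering: promote drbd -> mount FS -> start nfsd -> bring up the floating IP
    pcs constraint order promote drbd_r0_ms then start export_fs
    pcs constraint order export_fs then nfsd
    pcs constraint order nfsd then nfs_ip
    pcs constraint colocation add export_fs with master drbd_r0_ms INFINITY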
Hi Valeri

On Fri, Apr 22, 2016 at 10:24 PM, Digimer <lists at alteeve.ca> wrote:
> On 22/04/16 03:18 PM, Valeri Galtsev wrote:
>> Dear Experts,
>>
>> I would like to ask everybody: what would you advise to use as a storage
>> cluster, or as a distributed filesystem.
[...]
>> 4. Metadata machines should be redundant (or at least the backup metadata
>> host should be manually convertible into the master metadata host if the
>> master fails fatally or its data gets corrupted).

Sounds like Hadoop HDFS might be worth looking into:
https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Overview

[...]
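For a feel of what day-to-day use of HDFS looks like from the command line
(the paths are made up; note that HDFS itself is Java-based, which may clash
with the reservation about Java expressed earlier in the thread):

    # create a directory in HDFS, copy a local archive in, and list it
    hdfs dfs -mkdir -p /backups/2016-04
    hdfs dfs -put /tank/dump.tar /backups/2016-04/
    hdfs dfs -ls /backups/2016-04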
On Fri, 22 Apr 2016, Digimer wrote:

> Then you would use pacemaker to manage the floating IP, fence (stonith)
> a lost node, and promote drbd -> mount FS -> start nfsd -> start
> floating IP.

My favorite acronym: stonith -- shoot the other node in the head.

--
Paul Heinlein
heinlein at madboa.com
45°38' N, 122°6' W
On 04/22/2016 12:24 PM, Digimer wrote:

>> My requirements are:
>
> This sounds like you want a cloud-type storage, like ceph or gluster.

I agree. I think either would work. A cluster with striping and mirroring
of volumes should fit all the requirements.
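In gluster terms, that striping-plus-mirroring layout usually corresponds to
a distributed-replicated volume: files are spread across pairs of bricks,
and each pair is mirrored. A minimal sketch, with hostnames, volume name and
brick paths made up:

    # replica 2: each file is mirrored on a pair of servers,
    # and files are distributed across the pairs
    gluster volume create backups replica 2 transport tcp \
        host1:/export/brick1 host2:/export/brick1 \
        host3:/export/brick1 host4:/export/brick1
    gluster volume start backups

    # clients mount it with the FUSE-based client, so no custom kernel is needed
    mount -t glusterfs host1:/backups /mnt/backups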