On 05/01/2017 01:13 PM, Pranith Kumar Karampuri wrote:
>
> On Mon, May 1, 2017 at 10:42 PM, Pranith Kumar Karampuri
> <pkarampu at redhat.com> wrote:
>
>
>
> On Mon, May 1, 2017 at 10:39 PM, Gandalf Corvotempesta
> <gandalf.corvotempesta at gmail.com> wrote:
>
> 2017-05-01 18:57 GMT+02:00 Pranith Kumar Karampuri
> <pkarampu at redhat.com>:
> > Yes, this is precisely what all the other SDS with metadata servers do.
> > They keep a map, in a metadata server, of which servers a particular
> > file/blob is stored on.
>
> Not exactly. Other SDSes have some servers dedicated to metadata and,
> personally, I don't like that approach.
>
> > GlusterFS doesn't do that. In GlusterFS, which bricks are replicated is
> > always given, and the distribute layer on top of the replication layer
> > does the job of distributing and fetching the data. Because replication
> > happens at the brick level rather than the file level, and distribute
> > sits on top of replication rather than working per file, there isn't
> > much metadata that needs to be stored per file. Hence no need for
> > separate metadata servers.
>
> And this is great, that's why I'm talking about embedding a sort of
> database to be stored on all nodes. No metadata servers, only a mapping
> between files and servers.
>
> > If you know the path of the file, you can always find out where the file
> > is stored using pathinfo. See Method-2 in the following link:
> > https://gluster.readthedocs.io/en/latest/Troubleshooting/gfid-to-path/
> >
> > You don't need any db.
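
For reference, the pathinfo lookup mentioned above can also be scripted. What follows is only a sketch: it assumes a Linux client with the volume mounted, it reads the `trusted.glusterfs.pathinfo` virtual xattr through Python's `os.getxattr`, and the helper names (`parse_pathinfo`, `locate`) and the sample string are made up for illustration. The exact layout of the returned string may differ between Gluster versions and volume types, so treat the regular expression as an assumption.

```python
import os
import re

# Virtual xattr served on a GlusterFS client mount.
PATHINFO_XATTR = "trusted.glusterfs.pathinfo"

# Assumed shape of one brick entry: <POSIX(brick_root):host:/full/path/on/brick>
_POSIX_RE = re.compile(r"<POSIX\(([^)]*)\):([^:>]+):([^>]+)>")


def parse_pathinfo(value):
    """Extract (brick root, host, backend path) for every brick holding the file."""
    return [
        {"brick": brick, "host": host, "path": path}
        for brick, host, path in _POSIX_RE.findall(value)
    ]


def locate(path_on_mount):
    """Ask the mount where a file lives (hypothetical helper, Linux only)."""
    raw = os.getxattr(path_on_mount, PATHINFO_XATTR)
    return parse_pathinfo(raw.decode())


# Illustrative value for a replica-2 volume; not captured from a real cluster.
sample = (
    "(<DISTRIBUTE:vol-dht> (<REPLICATE:vol-replicate-0> "
    "<POSIX(/bricks/b1):server1:/bricks/b1/dir/file> "
    "<POSIX(/bricks/b1):server2:/bricks/b1/dir/file>))"
)
copies = parse_pathinfo(sample)
```

On a real mount, `locate("/mnt/vol/dir/file")` would return one entry per brick that stores the file, with no database involved.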
>
> For the current gluster yes.
> I'm talking about a different thing.
>
> In a RAID, you have data stored somewhere on the array, with metadata
> defining how this data should be written or read. Obviously, RAID
> metadata must be stored in a fixed position, or you won't be able to
> read it.
>
> Something similar could be added in gluster (I don't know if it would
> be hard): you store a file mapping in a fixed position in gluster, then
> all gluster clients will be able to know where a file is by looking at
> this "metadata" stored in the fixed position.
>
> Like the ".gluster" directory. Gluster already uses some "internal"
> directories for internal operations (".shards", ".gluster", ".trash").
> Would a ".metadata" directory with the file mapping be hard to add?
>
> > Basically what you want, if I understood correctly, is:
> > If we add a 3rd node with just one disk, the data should automatically
> > rearrange itself, splitting into 3 categories (assuming replica-2):
> > 1) Files that are present in Node1, Node2
> > 2) Files that are present in Node2, Node3
> > 3) Files that are present in Node1, Node3
> >
> > As you can see, we arrive at a contradiction: every node should have at
> > least 2 bricks, but there is only 1 disk. We can't do what you are
> > asking without brick splitting, i.e. we need to split the disk into
> > 2 bricks.
Splitting the bricks need not be a post factum decision. We can start
with a larger brick count on a given node/disk count, and then spread
these bricks to newer nodes/bricks as they are added.

If I understand the Ceph PG count correctly, it works on a similar notion
until the cluster grows beyond the initial PG count (set for the pool), at
which point there is a lot more data movement (as the PG count has to be
increased, and hence existing PGs need to be further partitioned).
(Just using Ceph as an example; a similar approach exists for OpenStack
Swift with its partition power setting.)
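
A rough sketch of that idea in Python, to make the data-movement point concrete. This is not anything Gluster implements today: the brick count (`BRICKS`), the node names and the `rebalance` helper are all hypothetical, and replica placement is ignored to keep it short. The only thing it shows is that with a fixed pool of small bricks, adding a node means moving whole bricks rather than re-splitting a disk.

```python
from collections import Counter


def rebalance(owner, new_node):
    """Move just enough bricks to new_node to even out brick counts.

    owner maps brick id -> node name. Returns (new mapping, moved brick ids).
    """
    owner = dict(owner)
    nodes = sorted(set(owner.values()) | {new_node})
    target = len(owner) // len(nodes)  # bricks the new node should end up with
    moved = []
    while len(moved) < target:
        counts = Counter(owner.values())
        # Take a brick from whichever existing node currently holds the most.
        donor = max(
            (n for n in nodes if n != new_node),
            key=lambda n: (counts[n], n),
        )
        brick = min(b for b, n in owner.items() if n == donor)
        owner[brick] = new_node
        moved.append(brick)
    return owner, moved


# Hypothetical starting point: 12 small bricks spread over 2 nodes.
BRICKS = 12
layout = {b: f"node{1 + b % 2}" for b in range(BRICKS)}

layout3, moved = rebalance(layout, "node3")
moved_fraction = len(moved) / BRICKS
```

As long as the node count stays below the initial brick count, only the bricks handed to the new node move. Once the node count exceeds it, the bricks themselves have to be split, which is the heavier data movement described above.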
>
> I don't think so.
> Let's assume a replica 2.
>
> S1B1 + S2B1
>
> 1TB each, thus 1TB available (2TB/2)
>
> Adding a third 1TB disk should increase available space to
> 1.5TB (3TB/2)
>
>
> I agree it should. Question is how? What will be the resulting
> brick-map?
>
>
> I don't see any solution that we can do without at least 2 bricks on
> each of the 3 servers.
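
To put numbers on the "2 bricks on each of the 3 servers" case, here is one possible brick-map, sketched in Python. The chained pairing, the brick names (`b1`, `b2`) and both helper functions are illustrative assumptions, not an existing Gluster feature: each 1TB disk is split into two equal bricks and every replica pair spans two neighbouring servers.

```python
def chained_brick_map(nodes, disk_tb):
    """Split each node's disk into 2 bricks and pair them around a ring.

    Returns a list of replica sets; each set is two (node, brick, size_tb) tuples.
    """
    half = disk_tb / 2
    count = len(nodes)
    replica_sets = []
    for i, node in enumerate(nodes):
        partner = nodes[(i + 1) % count]  # next server in the ring
        replica_sets.append(((node, "b1", half), (partner, "b2", half)))
    return replica_sets


def usable_tb(replica_sets):
    """Usable space: each replica set contributes the size of its smallest brick."""
    return sum(min(size for _, _, size in rs) for rs in replica_sets)


replica_sets = chained_brick_map(["S1", "S2", "S3"], disk_tb=1.0)
for first, second in replica_sets:
    print(f"{first[0]}{first[1]} + {second[0]}{second[1]}")
print(usable_tb(replica_sets), "TB usable")
```

This gives the three pairs S1+S2, S2+S3 and S3+S1, i.e. the three file categories listed earlier, and the expected 3TB/2 of usable space, at the cost of every server carrying two bricks.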
>
>
>
>
> --
> Pranith
>
>