Serkan Çoban
2016-Apr-20  12:13 UTC
[Gluster-users] disperse volume file to subvolume mapping
Here are the steps I follow in detail and the relevant output from the bricks:
I am using the command below for volume creation:
gluster volume create v0 disperse 20 redundancy 4 \
1.1.1.{185..204}:/bricks/02 \
1.1.1.{205..224}:/bricks/02 \
1.1.1.{225..244}:/bricks/02 \
1.1.1.{185..204}:/bricks/03 \
1.1.1.{205..224}:/bricks/03 \
1.1.1.{225..244}:/bricks/03 \
1.1.1.{185..204}:/bricks/04 \
1.1.1.{205..224}:/bricks/04 \
1.1.1.{225..244}:/bricks/04 \
1.1.1.{185..204}:/bricks/05 \
1.1.1.{205..224}:/bricks/05 \
1.1.1.{225..244}:/bricks/05 \
1.1.1.{185..204}:/bricks/06 \
1.1.1.{205..224}:/bricks/06 \
1.1.1.{225..244}:/bricks/06 \
1.1.1.{185..204}:/bricks/07 \
1.1.1.{205..224}:/bricks/07 \
1.1.1.{225..244}:/bricks/07 \
1.1.1.{185..204}:/bricks/08 \
1.1.1.{205..224}:/bricks/08 \
1.1.1.{225..244}:/bricks/08 \
1.1.1.{185..204}:/bricks/09 \
1.1.1.{205..224}:/bricks/09 \
1.1.1.{225..244}:/bricks/09 \
1.1.1.{185..204}:/bricks/10 \
1.1.1.{205..224}:/bricks/10 \
1.1.1.{225..244}:/bricks/10 \
1.1.1.{185..204}:/bricks/11 \
1.1.1.{205..224}:/bricks/11 \
1.1.1.{225..244}:/bricks/11 \
1.1.1.{185..204}:/bricks/12 \
1.1.1.{205..224}:/bricks/12 \
1.1.1.{225..244}:/bricks/12 \
1.1.1.{185..204}:/bricks/13 \
1.1.1.{205..224}:/bricks/13 \
1.1.1.{225..244}:/bricks/13 \
1.1.1.{185..204}:/bricks/14 \
1.1.1.{205..224}:/bricks/14 \
1.1.1.{225..244}:/bricks/14 \
1.1.1.{185..204}:/bricks/15 \
1.1.1.{205..224}:/bricks/15 \
1.1.1.{225..244}:/bricks/15 \
1.1.1.{185..204}:/bricks/16 \
1.1.1.{205..224}:/bricks/16 \
1.1.1.{225..244}:/bricks/16 \
1.1.1.{185..204}:/bricks/17 \
1.1.1.{205..224}:/bricks/17 \
1.1.1.{225..244}:/bricks/17 \
1.1.1.{185..204}:/bricks/18 \
1.1.1.{205..224}:/bricks/18 \
1.1.1.{225..244}:/bricks/18 \
1.1.1.{185..204}:/bricks/19 \
1.1.1.{205..224}:/bricks/19 \
1.1.1.{225..244}:/bricks/19 \
1.1.1.{185..204}:/bricks/20 \
1.1.1.{205..224}:/bricks/20 \
1.1.1.{225..244}:/bricks/20 \
1.1.1.{185..204}:/bricks/21 \
1.1.1.{205..224}:/bricks/21 \
1.1.1.{225..244}:/bricks/21 \
1.1.1.{185..204}:/bricks/22 \
1.1.1.{205..224}:/bricks/22 \
1.1.1.{225..244}:/bricks/22 \
1.1.1.{185..204}:/bricks/23 \
1.1.1.{205..224}:/bricks/23 \
1.1.1.{225..244}:/bricks/23 \
1.1.1.{185..204}:/bricks/24 \
1.1.1.{205..224}:/bricks/24 \
1.1.1.{225..244}:/bricks/24 \
1.1.1.{185..204}:/bricks/25 \
1.1.1.{205..224}:/bricks/25 \
1.1.1.{225..244}:/bricks/25 \
1.1.1.{185..204}:/bricks/26 \
1.1.1.{205..224}:/bricks/26 \
1.1.1.{225..244}:/bricks/26 \
1.1.1.{185..204}:/bricks/27 \
1.1.1.{205..224}:/bricks/27 \
1.1.1.{225..244}:/bricks/27 force
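For readability, the same brick list (in the same order) can be generated with a small loop; this is just a sketch, assuming the IP ranges and brick paths above:
bricks=""
for b in $(seq -w 2 27); do
  for ip in $(seq 185 244); do
    bricks="$bricks 1.1.1.$ip:/bricks/$b"
  done
done
gluster volume create v0 disperse 20 redundancy 4 $bricks force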
then I mount the volume on 50 clients:
mount -t glusterfs 1.1.1.185:/v0 /mnt/gluster
then I make a directory from one of the clients and chmod it.
mkdir /mnt/gluster/s1 && chmod 777 /mnt/gluster/s1
then I start distcp on the clients. There are 1059 files of 8.8GB each in one folder, and
they will be copied to /mnt/gluster/s1 with 100 parallel copies, which means 2
copy jobs per client at the same time.
hadoop distcp -m 100 http://nn1:8020/path/to/teragen-10tb file:///mnt/gluster/s1
After the job finished, here is the status of the s1 directory on the bricks:
The s1 directory is present in all 1560 bricks.
The s1/teragen-10tb folder is present in all 1560 bricks.
Full listing of the files on the bricks:
https://www.dropbox.com/s/rbgdxmrtwz8oya8/teragen_list.zip?dl=0
You can ignore the .crc files in the brick output above; they are
checksum files...
As you can see, the part-m-xxxx files are written to only some of the bricks, on nodes 0205..0224.
All bricks have some files, but those files have zero size.
I increased the file descriptor limit to 65k, so that is not the issue...
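For reference, this is roughly how the descriptor limit was raised on the clients (exact values and config files may differ per distribution):
# check the current limit in the shell running the copy jobs
ulimit -n
# raise it for the session
ulimit -n 65536
# and persist it, e.g. in /etc/security/limits.conf:
#   *  soft  nofile  65536
#   *  hard  nofile  65536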
On Wed, Apr 20, 2016 at 9:34 AM, Xavier Hernandez <xhernandez at datalab.es> wrote:
> Hi Serkan,
>
> On 19/04/16 15:16, Serkan Çoban wrote:
>>>>>
>>>>> I assume that gluster is used to store the intermediate files before
>>>>> the reduce phase
>>
>> Nope, gluster is the destination for the distcp command: hadoop distcp -m
>> 50 http://nn1:8020/path/to/folder file:///mnt/gluster
>> This runs maps on datanodes, which all have /mnt/gluster mounted.
>
>
> I don't know hadoop, so I'm of little help here. However, it seems that -m 50
> means to execute 50 copies in parallel. This means that even if the
> distribution worked fine, at most 50 (probably fewer) of the 78 ec sets
> would be used in parallel.
>
>>
>>>>> This means that this is caused by some peculiarity of the mapreduce.
>>
>> Yes, but how can a client write 500 files to the gluster mount and have those
>> files written only to a subset of the subvolumes? I cannot use gluster as a
>> backup cluster if I cannot write with distcp.
>>
>
> All 500 files were created on only one of the 78 ec sets and the remaining
> 77 stayed empty?
>
>>>>> You should look at which files are created in each brick and how many
>>>>> while the process is running.
>>
>> Files were only created on nodes 185..204, or 205..224, or 225..244. Only on
>> 20 nodes in each test.
>
>
> How many files were there in each brick?
>
> Not sure if this can be related, but standard linux distributions have a
> default limit of 1024 open file descriptors. With such a big volume and
> such a massive copy, maybe this limit is affecting something?
>
> Are there any error or warning messages in the mount or bricks logs ?
>
>
> Xavi
>
>>
>> On Tue, Apr 19, 2016 at 1:05 PM, Xavier Hernandez <xhernandez at datalab.es>
>> wrote:
>>>
>>> Hi Serkan,
>>>
>>> moved to gluster-users since this doesn't belong to devel list.
>>>
>>> On 19/04/16 11:24, Serkan Çoban wrote:
>>>>
>>>>
>>>> I am copying 10,000 files to the gluster volume using mapreduce on the
>>>> clients. Each map process takes one file at a time and copies it to the
>>>> gluster volume.
>>>
>>>
>>>
>>> I assume that gluster is used to store the intermediate files before the
>>> reduce phase.
>>>
>>>> My disperse volume consists of 78 subvolumes of 16+4 disks each. So if I
>>>> copy >78 files in parallel, I expect each file to go to a different
>>>> subvolume, right?
>>>
>>>
>>>
>>> If you only copy 78 files, most probably you will get some subvolume empty
>>> and some other with more than one or two files. It's not an exact
>>> distribution, it's a statistically balanced distribution: over time and with
>>> enough files, each brick will contain an amount of files in the same order
>>> of magnitude, but they won't have the *same* number of files.
>>>
>>>> In my tests with fio I can see every file going to a
>>>> different subvolume, but when I start the mapreduce process from the clients,
>>>> only 78/3=26 subvolumes are used for writing files.
>>>
>>>
>>>
>>> This means that this is caused by some peculiarity of the mapreduce.
>>>
>>>> I see that clearly from the network traffic. Mapreduce on the client side can
>>>> be run multi-threaded. I tested with 1, 5 and 10 threads on each client, but
>>>> every time only 26 subvolumes were used.
>>>> How can I debug the issue further?
>>>
>>>
>>>
>>> You should look at which files are created in each brick, and how many,
>>> while the process is running.
>>>
>>> Xavi
>>>
>>>
>>>>
>>>> On Tue, Apr 19, 2016 at 11:22 AM, Xavier Hernandez
>>>> <xhernandez at datalab.es> wrote:
>>>>>
>>>>>
>>>>> Hi Serkan,
>>>>>
>>>>> On 19/04/16 09:18, Serkan Çoban wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hi, I just reinstalled a fresh 3.7.11 and I am seeing the same behavior.
>>>>>> 50 clients are copying part-0-xxxx named files using mapreduce to gluster,
>>>>>> using one thread per server, and they are using only 20 servers out of
>>>>>> 60. On the other hand, fio tests use all the servers. Is there anything
>>>>>> I can do to solve the issue?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Distribution of files to ec sets is done by dht. In theory, if you create
>>>>> many files, each ec set will receive the same amount of files. However, when
>>>>> the number of files is small enough, statistics can fail.
>>>>>
>>>>> Not sure what you are doing exactly, but a mapreduce procedure generally
>>>>> only creates a single output. In that case it makes sense that only one ec
>>>>> set is used. If you want to use all ec sets for a single file, you should
>>>>> enable sharding (I haven't tested that) or split the result into multiple
>>>>> files.
>>>>>
>>>>> Xavi
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Serkan
>>>>>>
>>>>>>
>>>>>> ---------- Forwarded message ----------
>>>>>> From: Serkan Çoban <cobanserkan at gmail.com>
>>>>>> Date: Mon, Apr 18, 2016 at 2:39 PM
>>>>>> Subject: disperse volume file to subvolume mapping
>>>>>> To: Gluster Users <gluster-users at gluster.org>
>>>>>>
>>>>>>
>>>>>> Hi, I have a problem where clients are using only 1/3 of the nodes in a
>>>>>> disperse volume for writing.
>>>>>> I am testing from 50 clients using 1 to 10 threads, with file names
>>>>>> part-0-xxxx.
>>>>>> What I see is that clients only use 20 nodes for writing. How is the
>>>>>> file-name-to-subvolume hashing done? Is this related to the file names
>>>>>> being similar?
>>>>>>
>>>>>> My cluster is 3.7.10 with 60 nodes, each with 26 disks. The disperse volume
>>>>>> is 78 x (16+4). Only 26 out of 78 subvolumes are used during writes..
>>>>>>
>>>>>
>>>
>
Xavier Hernandez
2016-Apr-21  07:00 UTC
[Gluster-users] disperse volume file to subvolume mapping
Hi Serkan,
I think the problem is in the temporary name that distcp gives to the
file while it's being copied, before renaming it to the real name. Do you
know what the structure of this name is?
DHT selects the subvolume (in this case the ec set) on which the file 
will be stored based on the name of the file. This has a problem when a 
file is being renamed, because this could change the subvolume where the 
file should be found.
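If you want to double-check where DHT actually placed a given file, the pathinfo virtual xattr can be queried from a client mount (the path below is just an example taken from your listing):
getfattr -n trusted.glusterfs.pathinfo /mnt/gluster/s1/teragen-10tb/part-m-00000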
DHT has a feature to avoid incorrect file placements when executing
renames, meant for the rsync case. What it does is check whether the file
name matches the following regular expression:
     ^\.(.+)\.[^.]+$
If a match is found, it only considers the part between parentheses to
calculate the destination subvolume.
This is useful for rsync because temporary file names are constructed in 
the following way: suppose the original filename is 'test'. The 
temporary filename while rsync is being executed is made by prepending a 
dot and appending '.<random chars>': .test.712hd
As you can see, the original name and the part of the name between
parentheses that matches the regular expression are the same. This means
that, after renaming the temporary file to its original filename,
both names are considered by DHT to belong to the same subvolume.
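As an illustration, this is what the expression captures for the rsync-style temporary name (using sed just to show the match):
$ echo '.test.712hd' | sed -E 's/^\.(.+)\.[^.]+$/\1/'
test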
In your case it's very probable that distcp uses a temporary name like
'.part.<number>'. In that case the portion of the name used to select
the subvolume is always 'part'. This would explain why all the files go to
the same subvolume. Once the file is renamed to its final name, DHT
realizes that it should go to another subvolume. At this point it
creates a link file (those files with access rights = '---------T') in
the correct subvolume, but it doesn't move the data. As you can see, these
link files are better balanced.
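A quick way to spot those link files on a brick (just a sketch; the paths are examples based on your listing) is to look for empty files that only have the sticky bit set, and to read their linkto xattr:
find /bricks/02/s1 -type f -perm 1000 -empty
getfattr -n trusted.glusterfs.dht.linkto -e text /bricks/02/s1/teragen-10tb/part-m-00042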
To solve this problem you have three options:
1. change the temporary filename used by distcp to correctly match the 
regular expression. I'm not sure if this can be configured, but if this 
is possible, this is the best option.
2. define the option 'extra-hash-regex' with an expression that matches
your temporary file names and captures the name that the file will finally
have. Depending on the differences between the original and temporary file
names, this option could be useless.
3. set the option 'rsync-hash-regex' to 'none'. This will prevent the
name conversion, so the files will be evenly distributed. However, this
will cause a lot of files to be placed in incorrect subvolumes, creating a lot
of link files until a rebalance is executed (see the example commands
after this list).
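As a sketch, options 2 and 3 would be set roughly like this (I'm writing the option names from memory, so please confirm them with 'gluster volume set help'; '<your-regex>' is a placeholder):
gluster volume set v0 cluster.extra-hash-regex '<your-regex>'
gluster volume set v0 cluster.rsync-hash-regex none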
Xavi