Subrata Ghosh
2015-May-26  11:26 UTC
[Gluster-users] Regarding the issues gluster DHT and Layouts of bricks
Hi Susant,
Extremely sorry for the belated reply, and thanks for your input. We will try
running a rebalance, then create a set of small files with randomly generated
names and check where each one falls under DM_TYPE DHT.
I have one short query:
We were also thinking of using (in case we need it) Translators/cluster/unify, along
with the NUFA or ALU scheduler. However, we noticed that Translators/cluster/unify
has become obsolete/legacy. In that case, have the ALU, NUFA and rr schedulers
automatically become obsolete as well?
Is the Translators/cluster/unify
From: Subrata Ghosh
Sent: Thursday, May 21, 2015 4:26 PM
To: gluster-devel at gluster.org; 'gluster-users at gluster.org'
Cc: Nobin Mathew; 'Susant Palai'; 'Vijay Bellur'
Subject: Regarding the issues gluster DHT and Layouts of bricks
Hi All,
Could you please guide us in solving the following DHT and brick layout problem
we are dealing with? Questions are marked bold.
Problem statement:
1.      We have a requirement to achieve maximum write and read performance, and
we have to meet some committed performance metrics.
               Our goal is to place each file on a different brick to get
optimal performance and also to observe the nature of the throughput. Hence we need
a mechanism to generate different hashes using gluster
glusterfs.gf_dm_hashfn
(assuming number of files: N, number of bricks: N) so that the files land on separate bricks.
-        How can we make sure each file has a different hash and falls on a different
brick?
-        To put the question another way: if I know the range of the brick layout,
or more precisely the hex value of the desired hash (so that the file will
be placed on the desired brick) that we need to generate from the Davies-Meyer
algorithm used in gluster, can we create a file name that produces it? That would
also solve our problem to some extent.
2.      We experimented to see how gluster decides to place a file on
a particular brick using gluster glusterfs.gf_dm_hashfn, and took
some ideas from articles like
http://gluster.readthedocs.org/en/latest/Features/dht/ and
https://joejulian.name/blog/dht-misses-are-expensive/ , which describe the
layout for a brick and how to calculate the hash for a file.
        The aim is to minimize collisions, i.e. to generate different hashes in such a
way that each file is placed on a different brick (file 1 => brick A, file 2 => brick B,
file 3 => brick C, file 4 => brick D).
               We use a script similar to the following to get the hash value for a file:
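import ctypes
import sys
# Assumption: "glusterfs" is the GlusterFS core library loaded through ctypes.
glusterfs = ctypes.cdll.LoadLibrary("libglusterfs.so.0")

def gf_dm_hashfn(filename):
    return ctypes.c_uint32(glusterfs.gf_dm_hashfn(
        filename,
        len(filename)))

if __name__ == "__main__":
    print hex(gf_dm_hashfn(sys.argv[1]).value)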
We can then calculate the hash for a filename:
# python gf_dm_hash.py file1
0x99d1b6fL
We then fetch the extended attribute to check the range and try to match the
hash value generated above against it:
getfattr -n trusted.glusterfs.dht -e hex file1
However, we are not able to follow exactly what happens from this point on: how
the hash value is matched to one of the layout assignments to yield what we call
a hashed location.
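As far as we understand it, the matching step is a range check of the file's hash against each brick's layout range. Below is a minimal Python 3 sketch of that check. It assumes the trusted.glusterfs.dht value printed by getfattr -e hex is four big-endian 32-bit words whose last two words are the inclusive start and end of the range; the xattr value used here is hypothetical and the helper names are ours, not Gluster's.

```python
import struct

def parse_dht_layout(xattr_hex):
    """Return (start, stop) from a trusted.glusterfs.dht hex string.

    Assumption: the value is 16 bytes, read as four big-endian 32-bit
    words, and the last two words are the inclusive hash range.
    """
    digits = xattr_hex[2:] if xattr_hex.startswith("0x") else xattr_hex
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError("expected a 16-byte layout value")
    _word0, _word1, start, stop = struct.unpack(">IIII", raw)
    return start, stop

def hash_in_layout(hash_value, layout):
    """True if the 32-bit hash falls inside the inclusive (start, stop) range."""
    start, stop = layout
    return start <= hash_value <= stop

# Hypothetical xattr value for a brick owning 0xc0000000..0xffffffff.
layout = parse_dht_layout("0x0000000100000000c0000000ffffffff")
print(hash_in_layout(0xc0070000, layout))  # True
print(hash_in_layout(0x099d1b6f, layout))  # False
```

Under this assumption, the brick whose range contains the hash would be the hashed location for that name within the directory.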
-         My question is: if I know the range of the brick layout (say 0xc0000000
to 0xffffffff, and from that range I select a hash such as 0xc0070000) where the next
file should be placed, can we generate a name for it (a kind of reverse of gluster
glusterfs.gf_dm_hashfn)?
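One approach we are considering, since we do not know of a way to invert the hash directly, is a brute-force search: generate candidate names and keep the first one whose hash falls inside the desired range. The Python 3 sketch below illustrates the idea. The function find_name_in_range and its parameters are hypothetical, and the demo uses CRC32 purely as a stand-in so that it runs without libglusterfs; in practice hash_fn would be the gf_dm_hashfn wrapper from the script earlier in this message.

```python
import itertools
import zlib

def find_name_in_range(hash_fn, start, stop, prefix="file", limit=1000000):
    """Try prefix0, prefix1, ... and return the first name whose hash
    lies in the inclusive range [start, stop], or None if none is found
    within `limit` attempts."""
    for i in itertools.islice(itertools.count(), limit):
        name = "%s%d" % (prefix, i)
        if start <= hash_fn(name) <= stop:
            return name
    return None

def stand_in_hash(name):
    # NOT the Gluster hash: CRC32 is used only so the sketch is runnable.
    return zlib.crc32(name.encode("utf-8")) & 0xffffffff

# Search for a name that would land in the 0xc0000000..0xffffffff range.
name = find_name_in_range(stand_in_hash, 0xc0000000, 0xffffffff)
print(name, hex(stand_in_hash(name)))
```

If each of N bricks owns roughly 1/N of the hash space, such a search should need about N attempts per name on average, so it looks practical for small N.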
PS: Susant, can you shed some light on this or suggest a method for the problem
we are trying to solve?
Thanks for your time.
Best Regards,
Subrata Ghosh