Subrata Ghosh
2015-May-26 11:26 UTC
[Gluster-users] Regarding the issues gluster DHT and Layouts of bricks
Hi Susant,

Extremely sorry for the belated reply, and thanks for your input. We will try running a rebalance, then create a set of small files with randomly generated names and check which brick each one falls on under DM_TYPE DHT.

I have one short query: we were also thinking of using (in case we need it) Translators/cluster/unify, along with the NUFA or ALU scheduler. However, we noticed that Translators/cluster/unify has become obsolete/legacy. In that case, have the ALU, NUFA and rr schedulers automatically become obsolete as well? Is the Translators/cluster/unify

From: Subrata Ghosh
Sent: Thursday, May 21, 2015 4:26 PM
To: gluster-devel at gluster.org; 'gluster-users at gluster.org'
Cc: Nobin Mathew; 'Susant Palai'; 'Vijay Bellur'
Subject: Regarding the issues gluster DHT and Layouts of bricks

Hi All,

Could you please guide us in solving the following DHT and brick layout problem we are dealing with? Questions are marked bold.

Problem statement:

1. We have a requirement to achieve maximum write and read performance, and we have to meet some committed performance metrics. Our goal is to place each file on a different brick to get optimal performance and also to observe the nature of the throughput. Hence we need a mechanism to generate different hashes using gluster's glusterfs.gf_dm_hashfn (assuming number of files: N, number of bricks: N) so that the files are placed on separate bricks.

   - How can we make sure each file has a different hash and falls on a different brick?
   - To put the question another way: if I know the range of the brick layout, or more precisely the hex value of the desired hash (so that the file will be placed on the desired brick) that we need to generate from the Davies-Meyer algorithm used in gluster, can we create a file name to match, so that this also solves our problem to some extent?

2.
We experimented to see how gluster decides which brick a file is placed on using glusterfs.gf_dm_hashfn, and took some ideas from articles such as http://gluster.readthedocs.org/en/latest/Features/dht/ and https://joejulian.name/blog/dht-misses-are-expensive/, which describe how to read the layout for a brick and calculate the hash for a file. The aim is to minimize collisions, or to generate different hashes in such a way that each file is placed on a different brick (file 1 => brick A, file 2 => brick B, file 3 => brick C, file 4 => brick D).

We use a similar script to get the hash value for a file:

    import ctypes
    import sys

    # Assumes libglusterfs.so.0 can be found by the dynamic loader.
    glusterfs = ctypes.cdll.LoadLibrary("libglusterfs.so.0")

    def gf_dm_hashfn(filename):
        # Call gluster's own hash so the value matches what DHT computes.
        return ctypes.c_uint32(glusterfs.gf_dm_hashfn(filename, len(filename)))

    if __name__ == "__main__":
        print hex(gf_dm_hashfn(sys.argv[1]).value)

We can then calculate the hash for a filename:

    # python gf_dm_hash.py file1
    0x99d1b6fL

The extended attribute is then fetched to check the range and try to match the hash value generated above:

    getfattr -n trusted.glusterfs.dht -e hex file1

However, this is as far as we can follow. We do not understand how the hash value is matched to one of the layout assignments to yield what is called the hashed location.

   - My question is: if I know the range of the brick layout (say 0xc0000000 to 0xffffffff) and select a hash in that range (say 0xc0070000) where the next file should be placed, can we generate the name (a kind of reverse of gluster's glusterfs.gf_dm_hashfn)?

PS: Susant, can you throw some light on this or suggest a method for what we are trying to solve?

Thanks for your time.

Best Regards,
Subrata Ghosh
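For reference, the matching step asked about above (hash value to layout range) and the "reverse" step (finding a name for a desired range) might be sketched roughly as follows. This is only an illustration under stated assumptions, not gluster's actual code: it assumes the trusted.glusterfs.dht value is four big-endian 32-bit words whose last two are the start and end of the range, the brick names and four equal ranges are made up, and stand_in_hash (a CRC32) is a placeholder used only so the sketch runs without libglusterfs. In practice the ctypes wrapper around glusterfs.gf_dm_hashfn would be passed in instead. Since the hash is not designed to be inverted, the sketch simply tries candidate names until one lands in the wanted range.

```python
import struct
import zlib


def parse_layout_range(xattr_hex):
    """Return (start, end) from a trusted.glusterfs.dht hex string.

    Assumption: the value is 16 bytes, i.e. four big-endian 32-bit
    words, and the last two words are the start and end of the range.
    """
    if xattr_hex.startswith("0x"):
        xattr_hex = xattr_hex[2:]
    raw = bytes.fromhex(xattr_hex)
    _word0, _word1, start, end = struct.unpack(">IIII", raw)
    return start, end


def brick_for_hash(hash_value, layouts):
    """Return the brick whose inclusive [start, end] range holds the hash."""
    for brick, (start, end) in layouts.items():
        if start <= hash_value <= end:
            return brick
    return None


def find_name_in_range(hash_fn, start, end, prefix="file", max_tries=1000000):
    """Try prefix0, prefix1, ... until a name hashes into [start, end]."""
    for i in range(max_tries):
        name = "%s%d" % (prefix, i)
        if start <= hash_fn(name) <= end:
            return name
    return None


def stand_in_hash(name):
    """Placeholder 32-bit hash (CRC32); NOT gluster's gf_dm_hashfn."""
    return zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF


# Hypothetical layout: four bricks, each owning a quarter of the hash space.
LAYOUTS = {
    "brick A": (0x00000000, 0x3FFFFFFF),
    "brick B": (0x40000000, 0x7FFFFFFF),
    "brick C": (0x80000000, 0xBFFFFFFF),
    "brick D": (0xC0000000, 0xFFFFFFFF),
}

if __name__ == "__main__":
    # The hash printed for "file1" in the message, matched against the layout.
    print(brick_for_hash(0x099D1B6F, LAYOUTS))
    # A candidate name for the range owned by the hypothetical brick D.
    start, end = LAYOUTS["brick D"]
    print(find_name_in_range(stand_in_hash, start, end))
```

With N bricks of roughly equal range, about one name in N should land on a given brick, so a search like this would normally finish after a handful of tries. The ranges are set per directory, so the xattr would have to be read for the directory the file is going into.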