On 03/05/2016 03:45 PM, Oleksandr Natalenko wrote:
> In order to estimate GlusterFS arbiter brick size, I've deployed a test setup
> with a replica 3 arbiter 1 volume within one node. Each brick is located on a
> separate HDD (XFS with inode size == 512). Using GlusterFS v3.7.6 + memleak
> patches. Volume options are kept at their defaults.
>
> Here is the script that creates files and folders in the mounted volume: [1]
>
> The script creates 1M files of random size (between 1 and 32768 bytes) and
> some number of folders. After running it I've got 1036637 folders. So, in
> total, it is 2036637 files and folders.
>
> The initial used space on each brick is 42M. After running the script I've got:
>
> replica bricks 1 and 2: 19867168 kbytes == 19G
> arbiter brick: 1872308 kbytes == 1.8G
>
> The number of inodes used on each brick is 3139091. So here goes the estimation.
>
> Dividing arbiter used space by files+folders we get:
>
> (1872308 - 42000)/2036637 == 899 bytes per file or folder
>
> Dividing arbiter used space by inodes we get:
>
> (1872308 - 42000)/3139091 == 583 bytes per inode
>
> Not sure which calculation is correct.

I think the first one is right because you still haven't used up all the
inodes (2036637 used vs. the maximum permissible 3139091). But again, this is
an approximation because not all files would be 899 bytes. For example, if
there are a thousand files present in a directory, then du <dirname> would be
more than du <file> because the directory will take some disk space to store
the dentries.

> I guess we should consider the one that accounts for inodes because of the
> .glusterfs/ folder data.
>
> Nevertheless, in contrast, documentation [2] says it should be 4096 bytes
> per file. Am I wrong with my calculations?

The 4KB is a conservative estimate, considering the fact that though the
arbiter brick does not store data, it still keeps a copy of both user and
gluster xattrs. For example, if the application sets a lot of xattrs, they can
consume a data block if they cannot be accommodated in the inode itself. Also,
there is the .glusterfs folder, like you said, which takes up some space.
Here is what I tried on an XFS brick:

[root@ravi4 brick]# touch file
[root@ravi4 brick]# ls -l file
-rw-r--r-- 1 root root 0 Mar  8 12:54 file
[root@ravi4 brick]# du file
0       file
[root@ravi4 brick]# for i in {1..100}
> do
> setfattr -n user.value$i -v value$i file
> done
[root@ravi4 brick]# ll -l file
-rw-r--r-- 1 root root 0 Mar  8 12:54 file
[root@ravi4 brick]# du -h file
4.0K    file

Hope this helps,
Ravi

> Pranith?
>
> [1] http://termbin.com/ka9x
> [2] http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
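[Note: the per-inode figure derived above can be re-measured on any arbiter
brick with a couple of df calls. The following is a minimal sketch, not part
of the original messages; the brick path /bricks/arbiter and the "before"
value of 42000 KB are placeholders standing in for your own measurements.]

    #!/bin/bash
    # Sketch: estimate arbiter-brick metadata bytes per allocated inode.
    # BRICK and USED_BEFORE_KB are placeholders, not values from this thread.
    BRICK=/bricks/arbiter
    USED_BEFORE_KB=42000                                         # brick usage (KB) before the test

    used_after_kb=$(df -kP "$BRICK" | awk 'NR==2 {print $3}')    # used KB now
    inodes_used=$(df -iP "$BRICK" | awk 'NR==2 {print $3}')      # allocated inodes

    # Average metadata footprint per allocated inode, in bytes.
    echo $(( (used_after_kb - USED_BEFORE_KB) * 1024 / inodes_used ))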
Hi.

On Tuesday, 8 March 2016, 19:13:05 EET Ravishankar N wrote:
> I think the first one is right because you still haven't used up all the
> inodes (2036637 used vs. the maximum permissible 3139091). But again, this
> is an approximation because not all files would be 899 bytes. For example,
> if there are a thousand files present in a directory, then du <dirname>
> would be more than du <file> because the directory will take some disk
> space to store the dentries.

I believe you've got me wrong. 2036637 is the number of files+folders.
3139091 is the number of inodes actually allocated on the underlying FS
(according to df -i output). The maximum inode count is much higher than
that, and I do not take it into account.

Also, I should probably recheck the results with 1000 files per folder to
make sure.

> The 4KB is a conservative estimate considering the fact that though the
> arbiter brick does not store data, it still keeps a copy of both user
> and gluster xattrs. For example, if the application sets a lot of
> xattrs, it can consume a data block if they cannot be accommodated on
> the inode itself. Also there is the .glusterfs folder like you said
> which would take up some space. Here is what I tried on an XFS brick:

4KB as an upper bound sounds reasonable to me, thanks. But the average value
will still be lower, I believe, as it is uncommon for apps to set lots of
xattrs, especially in an ordinary deployment.

Regards,
Oleksandr.
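[Note: taking the numbers discussed so far at face value, they can be turned
into a rough sizing rule. The helper below is a hypothetical back-of-the-envelope
sketch, not official guidance: the ~1 KiB/inode working average (with headroom)
comes from the measurements in this thread, and the 4 KiB/inode figure is the
documentation's conservative estimate.]

    # Hypothetical sizing helper; constants come from the estimates in this thread.
    estimate_arbiter_gib() {
        local entries=$1                              # expected files + directories
        echo "typical:    ~$(( entries / 1024 / 1024 )) GiB  (1 KiB/inode)"
        echo "worst case: ~$(( entries * 4 / 1024 / 1024 )) GiB  (4 KiB/inode)"
    }

    estimate_arbiter_gib 10000000                     # e.g. 10 million files/dirs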
OK, I've repeated the test with the following hierarchy:

* 10 top-level folders with 10 second-level folders each;
* 10 000 files in each second-level folder.

So this comes to 10 × 10 × 10000 = 1M files and 100 folders.

Initial brick used space: 33M
Initial inode count: 24

After the test:

* each replica brick took 18G, and the arbiter brick took 836M;
* inode count: 1066036

So:

(836 - 33) MiB / (1066036 - 24) inodes == ~790 bytes per inode
(the arithmetic is re-derived after the quoted message below).

So, yes, it is a slightly bigger value than in the previous test due to, I
guess, lots of files in one folder, but it is still far below 4k. Given that
a good engineer should plan for a 30% reserve, the ratio is about 1k per
stored inode.

Correct me if I'm missing something (regarding average workloads and not
corner cases).

Test script is here: [1]

Regards,
Oleksandr.

[1] http://termbin.com/qlvz

On Tuesday, 8 March 2016, 19:13:05 EET Ravishankar N wrote:
> On 03/05/2016 03:45 PM, Oleksandr Natalenko wrote:
> > In order to estimate GlusterFS arbiter brick size, I've deployed a test
> > setup with a replica 3 arbiter 1 volume within one node. Each brick is
> > located on a separate HDD (XFS with inode size == 512). Using GlusterFS
> > v3.7.6 + memleak patches. Volume options are kept at their defaults.
> >
> > Here is the script that creates files and folders in the mounted volume: [1]
> >
> > The script creates 1M files of random size (between 1 and 32768 bytes)
> > and some number of folders. After running it I've got 1036637 folders.
> > So, in total, it is 2036637 files and folders.
> >
> > The initial used space on each brick is 42M. After running the script
> > I've got:
> >
> > replica bricks 1 and 2: 19867168 kbytes == 19G
> > arbiter brick: 1872308 kbytes == 1.8G
> >
> > The number of inodes used on each brick is 3139091. So here goes the
> > estimation.
> >
> > Dividing arbiter used space by files+folders we get:
> >
> > (1872308 - 42000)/2036637 == 899 bytes per file or folder
> >
> > Dividing arbiter used space by inodes we get:
> >
> > (1872308 - 42000)/3139091 == 583 bytes per inode
> >
> > Not sure which calculation is correct.
>
> I think the first one is right because you still haven't used up all the
> inodes (2036637 used vs. the maximum permissible 3139091). But again, this
> is an approximation because not all files would be 899 bytes. For example,
> if there are a thousand files present in a directory, then du <dirname>
> would be more than du <file> because the directory will take some disk
> space to store the dentries.
>
> > I guess we should consider the one that accounts for inodes because of
> > the .glusterfs/ folder data.
> >
> > Nevertheless, in contrast, documentation [2] says it should be 4096 bytes
> > per file. Am I wrong with my calculations?
>
> The 4KB is a conservative estimate considering the fact that though the
> arbiter brick does not store data, it still keeps a copy of both user
> and gluster xattrs. For example, if the application sets a lot of
> xattrs, it can consume a data block if they cannot be accommodated on
> the inode itself. Also there is the .glusterfs folder like you said
> which would take up some space.
> Here is what I tried on an XFS brick:
>
> [root@ravi4 brick]# touch file
> [root@ravi4 brick]# ls -l file
> -rw-r--r-- 1 root root 0 Mar  8 12:54 file
> [root@ravi4 brick]# du file
> 0       file
> [root@ravi4 brick]# for i in {1..100}
> > do
> > setfattr -n user.value$i -v value$i file
> > done
> [root@ravi4 brick]# ll -l file
> -rw-r--r-- 1 root root 0 Mar  8 12:54 file
> [root@ravi4 brick]# du -h file
> 4.0K    file
>
> Hope this helps,
> Ravi
>
> > Pranith?
> >
> > [1] http://termbin.com/ka9x
> > [2] http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
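[Note: for reference, the ~790 bytes/inode figure from the follow-up test can
be re-derived with one line of shell arithmetic, assuming the "836M" and "33M"
values are MiB as du/df normally report.]

    # (836 - 33) MiB of arbiter usage spread over (1066036 - 24) new inodes:
    echo $(( (836 - 33) * 1024 * 1024 / (1066036 - 24) ))   # -> 789, i.e. ~790 bytes per inode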