On 03/05/2016 03:45 PM, Oleksandr Natalenko wrote:
> In order to estimate GlusterFS arbiter brick size, I've deployed a test setup
> with a replica 3 arbiter 1 volume within one node. Each brick is located on a
> separate HDD (XFS with inode size == 512). Using GlusterFS v3.7.6 + memleak
> patches. Volume options are kept at their defaults.
>
> Here is the script that creates files and folders in the mounted volume: [1]
>
> The script creates 1M files of random size (between 1 and 32768 bytes) and
> some number of folders. After running it I've got 1036637 folders. So, in
> total, it is 2036637 files and folders.
>
> The initial used space on each brick is 42M. After running the script I've got:
>
> replica bricks 1 and 2: 19867168 kbytes == 19G
> arbiter brick: 1872308 kbytes == 1.8G
>
> The number of inodes used on each brick is 3139091. So here goes the estimation.
>
> Dividing arbiter used space by files+folders we get:
>
> (1872308 - 42000)/2036637 == 899 bytes per file or folder
>
> Dividing arbiter used space by inodes we get:
>
> (1872308 - 42000)/3139091 == 583 bytes per inode
>
> Not sure which calculation is correct.

I think the first one is right because you still haven't used up all the
inodes (2036637 used vs. the maximum permissible 3139091). But again, this is
an approximation because not all files would be 899 bytes. For example, if
there are a thousand files present in a directory, then du <dirname> would be
more than du <file> because the directory will take some disk space to store
the dentries.

> I guess we should consider the one that accounts for inodes because of the
> .glusterfs/ folder data.
>
> Nevertheless, in contrast, documentation [2] says it should be 4096 bytes
> per file. Am I wrong with my calculations?

The 4KB is a conservative estimate, considering the fact that though the
arbiter brick does not store data, it still keeps a copy of both user and
gluster xattrs. For example, if the application sets a lot of xattrs, they can
consume a data block if they cannot be accommodated in the inode itself. Also,
there is the .glusterfs folder, like you said, which takes up some space.
Here is what I tried on an XFS brick:

[root@ravi4 brick]# touch file
[root@ravi4 brick]# ls -l file
-rw-r--r-- 1 root root 0 Mar  8 12:54 file
[root@ravi4 brick]# du file
0       file
[root@ravi4 brick]# for i in {1..100}
> do
> setfattr -n user.value$i -v value$i file
> done
[root@ravi4 brick]# ll -l file
-rw-r--r-- 1 root root 0 Mar  8 12:54 file
[root@ravi4 brick]# du -h file
4.0K    file

Hope this helps,
Ravi

> Pranith?
>
> [1] http://termbin.com/ka9x
> [2] http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
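[Note: the per-inode figure derived above can be re-measured on any arbiter
brick with a couple of df calls. The following is a minimal sketch, not part
of the original messages; the brick path /bricks/arbiter and the "before"
value of 42000 KB are placeholders standing in for your own measurements.]

    #!/bin/bash
    # Sketch: estimate arbiter-brick metadata bytes per allocated inode.
    # BRICK and USED_BEFORE_KB are placeholders, not values from this thread.
    BRICK=/bricks/arbiter
    USED_BEFORE_KB=42000                                         # brick usage (KB) before the test

    used_after_kb=$(df -kP "$BRICK" | awk 'NR==2 {print $3}')    # used KB now
    inodes_used=$(df -iP "$BRICK" | awk 'NR==2 {print $3}')      # allocated inodes

    # Average metadata footprint per allocated inode, in bytes.
    echo $(( (used_after_kb - USED_BEFORE_KB) * 1024 / inodes_used ))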
Hi.

On Tuesday, 8 March 2016, 19:13:05 EET Ravishankar N wrote:
> I think the first one is right because you still haven't used up all the
> inodes (2036637 used vs. the maximum permissible 3139091). But again, this
> is an approximation because not all files would be 899 bytes. For example,
> if there are a thousand files present in a directory, then du <dirname>
> would be more than du <file> because the directory will take some disk
> space to store the dentries.

I believe you've got me wrong. 2036637 is the number of files+folders.
3139091 is the number of inodes actually allocated on the underlying FS
(according to df -i output). The maximum inode count is much higher than
that, and I do not take it into account.

Also, I should probably recheck the results with 1000 files per folder to
make sure.

> The 4KB is a conservative estimate considering the fact that though the
> arbiter brick does not store data, it still keeps a copy of both user
> and gluster xattrs. For example, if the application sets a lot of
> xattrs, it can consume a data block if they cannot be accommodated on
> the inode itself. Also there is the .glusterfs folder like you said
> which would take up some space. Here is what I tried on an XFS brick:

4KB as an upper bound sounds reasonable to me, thanks. But the average value
will still be lower, I believe, as it is uncommon for apps to set lots of
xattrs, especially in an ordinary deployment.

Regards,
Oleksandr.
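[Note: taking the numbers discussed so far at face value, they can be turned
into a rough sizing rule. The helper below is a hypothetical back-of-the-envelope
sketch, not official guidance: the ~1 KiB/inode working average (with headroom)
comes from the measurements in this thread, and the 4 KiB/inode figure is the
documentation's conservative estimate.]

    # Hypothetical sizing helper; constants come from the estimates in this thread.
    estimate_arbiter_gib() {
        local entries=$1                              # expected files + directories
        echo "typical:    ~$(( entries / 1024 / 1024 )) GiB  (1 KiB/inode)"
        echo "worst case: ~$(( entries * 4 / 1024 / 1024 )) GiB  (4 KiB/inode)"
    }

    estimate_arbiter_gib 10000000                     # e.g. 10 million files/dirs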
OK, I've repeated the test with the following hierarchy:

* 10 top-level folders with 10 second-level folders each;
* 10 000 files in each second-level folder.

So this comes to 10 × 10 × 10000 = 1M files and 100 folders.

Initial brick used space: 33M
Initial inode count: 24

After the test:

* each replica brick took 18G, and the arbiter brick took 836M;
* inode count: 1066036

So:

(836 - 33) MiB / (1066036 - 24) inodes == ~790 bytes per inode
(the arithmetic is re-derived after the quoted message below).

So, yes, it is a slightly bigger value than in the previous test due to, I
guess, lots of files in one folder, but it is still far below 4k. Given that
a good engineer should plan for a 30% reserve, the ratio is about 1k per
stored inode.

Correct me if I'm missing something (regarding average workloads and not
corner cases).

Test script is here: [1]

Regards,
Oleksandr.

[1] http://termbin.com/qlvz

On Tuesday, 8 March 2016, 19:13:05 EET Ravishankar N wrote:
> On 03/05/2016 03:45 PM, Oleksandr Natalenko wrote:
> > In order to estimate GlusterFS arbiter brick size, I've deployed a test
> > setup with a replica 3 arbiter 1 volume within one node. Each brick is
> > located on a separate HDD (XFS with inode size == 512). Using GlusterFS
> > v3.7.6 + memleak patches. Volume options are kept at their defaults.
> >
> > Here is the script that creates files and folders in the mounted volume: [1]
> >
> > The script creates 1M files of random size (between 1 and 32768 bytes)
> > and some number of folders. After running it I've got 1036637 folders.
> > So, in total, it is 2036637 files and folders.
> >
> > The initial used space on each brick is 42M. After running the script
> > I've got:
> >
> > replica bricks 1 and 2: 19867168 kbytes == 19G
> > arbiter brick: 1872308 kbytes == 1.8G
> >
> > The number of inodes used on each brick is 3139091. So here goes the
> > estimation.
> >
> > Dividing arbiter used space by files+folders we get:
> >
> > (1872308 - 42000)/2036637 == 899 bytes per file or folder
> >
> > Dividing arbiter used space by inodes we get:
> >
> > (1872308 - 42000)/3139091 == 583 bytes per inode
> >
> > Not sure which calculation is correct.
>
> I think the first one is right because you still haven't used up all the
> inodes (2036637 used vs. the maximum permissible 3139091). But again, this
> is an approximation because not all files would be 899 bytes. For example,
> if there are a thousand files present in a directory, then du <dirname>
> would be more than du <file> because the directory will take some disk
> space to store the dentries.
>
> > I guess we should consider the one that accounts for inodes because of
> > the .glusterfs/ folder data.
> >
> > Nevertheless, in contrast, documentation [2] says it should be 4096 bytes
> > per file. Am I wrong with my calculations?
>
> The 4KB is a conservative estimate considering the fact that though the
> arbiter brick does not store data, it still keeps a copy of both user
> and gluster xattrs. For example, if the application sets a lot of
> xattrs, it can consume a data block if they cannot be accommodated on
> the inode itself. Also there is the .glusterfs folder like you said
> which would take up some space.
> Here is what I tried on an XFS brick:
>
> [root@ravi4 brick]# touch file
> [root@ravi4 brick]# ls -l file
> -rw-r--r-- 1 root root 0 Mar  8 12:54 file
> [root@ravi4 brick]# du file
> 0       file
> [root@ravi4 brick]# for i in {1..100}
> > do
> > setfattr -n user.value$i -v value$i file
> > done
> [root@ravi4 brick]# ll -l file
> -rw-r--r-- 1 root root 0 Mar  8 12:54 file
> [root@ravi4 brick]# du -h file
> 4.0K    file
>
> Hope this helps,
> Ravi
>
> > Pranith?
> >
> > [1] http://termbin.com/ka9x
> > [2] http://gluster.readthedocs.org/en/latest/Administrator%20Guide/arbiter-volumes-and-quorum/
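[Note: for reference, the ~790 bytes/inode figure from the follow-up test can
be re-derived with one line of shell arithmetic, assuming the "836M" and "33M"
values are MiB as du/df normally report.]

    # (836 - 33) MiB of arbiter usage spread over (1066036 - 24) new inodes:
    echo $(( (836 - 33) * 1024 * 1024 / (1066036 - 24) ))   # -> 789, i.e. ~790 bytes per inode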