Khawaja Shams
2012-Jul-06 15:29 UTC
[Gluster-users] Gluster striping taking up too much space
Hello,

We are trying to create a 40-brick striped Gluster volume across 4 machines. For the purposes of our benchmarks, we are trying to set it up without any replication. However, files are taking up about 40 times the storage space we expect. Writing a 256MB file with dd puts a 9.2GB file on the file system. The Gluster FAQ claims that ls doesn't track the file space properly and du does:
http://gluster.org/community/documentation//index.php/GlusterFS_Technical_FAQ#Stripe_behavior_not_working_as_expected

However, in our case, ls shows the file to be 256MB, while du shows it to be 9.2GB. After looking at the individual drives, we also noticed that each drive has 256MB of data, so the file seems to be getting replicated. Here is how we create the volumes:

    gluster create volume gluster-test stripe 40 transport tcp
        Gluster1:/data_f Gluster1:/data_g Gluster1:/data_h Gluster1:/data_i Gluster1:/data_j
        Gluster1:/data_k Gluster1:/data_l Gluster1:/data_m Gluster1:/data_n Gluster1:/data_o
        Gluster2:/data_f Gluster2:/data_g Gluster2:/data_h Gluster2:/data_i Gluster2:/data_j
        Gluster2:/data_k Gluster2:/data_l Gluster2:/data_m Gluster2:/data_n Gluster2:/data_o
        Gluster3:/data_f Gluster3:/data_g Gluster3:/data_h Gluster3:/data_i Gluster3:/data_j
        Gluster3:/data_k Gluster3:/data_l Gluster3:/data_m Gluster3:/data_n Gluster3:/data_o
        Gluster4:/data_f Gluster4:/data_g Gluster4:/data_h Gluster4:/data_i Gluster4:/data_j
        Gluster4:/data_k Gluster4:/data_l Gluster4:/data_m Gluster4:/data_n Gluster4:/data_o

    gluster --version
    glusterfs 3.3.0 built on Jul 2 2012 23:26:48
    Repository revision: git://git.gluster.com/glusterfs.git
    Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
    GlusterFS comes with ABSOLUTELY NO WARRANTY.
    You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

We do not have this issue if all the bricks are on the same machine. What are we doing wrong? Are there specific logs we should be looking at?

Thank you.
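For readers comparing the two numbers above: ls reports a file's logical length, while du sums the blocks the filesystem has actually allocated, so the two can legitimately disagree. The following is a minimal sketch of how to read both values for one file. The helper name size_report is made up for illustration, and st_blocks is only available on POSIX systems.

```python
import os
import tempfile


def size_report(path):
    """Return (apparent_bytes, allocated_bytes) for one file.

    size_report is a hypothetical helper, not part of any Gluster tool.
    """
    st = os.stat(path)
    # st_size is the logical length, the figure `ls -l` prints.
    # st_blocks counts 512-byte units allocated on disk, the figure `du` sums.
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    # Write a small scratch file and compare the two figures.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * (1024 * 1024))
        scratch = f.name
    apparent, allocated = size_report(scratch)
    print(f"apparent={apparent} bytes, allocated={allocated} bytes")
    os.unlink(scratch)
```

Running the same check against the file on the Gluster mount and against its pieces on each brick directory would show where the extra allocation is being counted.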
Jeff Darcy
2012-Jul-06 16:13 UTC
[Gluster-users] Gluster striping taking up too much space
On 07/06/2012 11:29 AM, Khawaja Shams wrote:
> We do not have this issue if all the bricks are on the same machine. What are
> we doing wrong? Are there specific logs we should be looking at?

Three questions come to mind.

(1) What underlying local filesystem (e.g. XFS, ext4) are you using for the bricks?

(2) What exactly is the "dd" command you're using? Block size is particularly important, as is whether you're trying to write a sparse file.

(3) Have you tried turning on the "cluster.stripe-coalesce" volume option, and did it make a difference?
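To make the arithmetic behind these questions concrete, here is a rough sketch of how a 256MB file might be laid out over 40 stripe bricks. It rests on two assumptions that the thread does not confirm: a 128KB stripe block size, and that without coalescing each brick keeps its chunks at their original file offsets, leaving holes in between. The function name stripe_layout is hypothetical.

```python
def stripe_layout(file_size, stripe_count, block_size=128 * 1024):
    """Return (data_bytes, apparent_size) lists, one entry per brick.

    Assumption: blocks are dealt round-robin to bricks and each chunk is
    stored at its original offset, so a brick file is sparse.
    """
    data = [0] * stripe_count
    apparent = [0] * stripe_count
    offset = 0
    while offset < file_size:
        chunk = min(block_size, file_size - offset)
        brick = (offset // block_size) % stripe_count
        data[brick] += chunk              # bytes really written to this brick
        apparent[brick] = offset + chunk  # logical end of the brick's file
        offset += chunk
    return data, apparent


if __name__ == "__main__":
    mib = 1024 * 1024
    data, apparent = stripe_layout(256 * mib, 40)
    print("total data written  :", sum(data) // mib, "MiB")
    print("largest brick file  :", max(apparent) // mib, "MiB apparent")
    print("sum of apparent size:", sum(apparent) // mib, "MiB")
```

Under these assumptions every brick file has an apparent size close to the full 256MB even though it holds only about 1/40 of the data. If the brick filesystem allocates the holes, for example through preallocation, du could approach the file size multiplied by the stripe count, which is why the local filesystem, the dd block size, and the coalesce option all matter.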