Sabuj Pattanayek
2012-Feb-23 23:12 UTC
[Gluster-users] default cluster.stripe-block-size for striped volumes on 3.0.x vs 3.3 beta (128kb), performance change if i reduce to a smaller block size?
Hi,

I've been migrating data from an old striped 3.0.x gluster install to a 3.3 beta install. I copied all the data from the old gluster striped volume to a regular XFS partition (4K blocksize), and it totaled 9.2TB. With the old setup I used the following option in a "volume stripe" block in the client configuration file:

volume stripe
  type cluster/stripe
  option block-size 2MB
  subvolumes ....
end-volume

IIRC, the data was using up about the same space on the old striped volume (9.2T). While copying the data back to the new v3.3 striped gluster volume on the same 5 servers/same brick filesystems (XFS w/4K blocksize), I noticed that the amount stored on disk increased by about 5x.

Currently, if I do a du -sh on the gluster fuse mount of the new striped volume I get 4.3TB. (I haven't finished copying all 9.2TB of data over; I stopped it prematurely because it looks like it will use up all the physical disk if I let it keep going.) However, if I do a du -sh at the filesystem / brick level on each of the 5 directories on the 5 servers that store the striped data, each one shows 4.1TB. So basically, 4.3TB of data from a 4K block size FS took up 20.5TB of storage on a 128KB block size striped gluster volume.

What is the correlation between the "option block-size" setting on client configs in cluster/stripe blocks in 3.0.x vs the cluster.stripe-block-size parameter in 3.3? If these settings mean what I think they mean, then basically a file that is 1M in size would be written out to the stripe in 128KB chunks across N servers, i.e. 128/N KB of data per brick? What happens when the stripe block size isn't evenly divisible by N (e.g. 128/5 = 25.6)? If the old block-size and new stripe-block-size options are describing the same thing, then wouldn't a 2MB block size from the old config cause more storage to be used up vs a 128KB block size?

Thanks,
Sabuj
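The question about how a 1M file is divided can be made concrete with a small sketch. It assumes plain round-robin striping, where whole stripe blocks are handed to bricks in turn (chunk i goes to brick i mod N) rather than each block being split N ways. That layout is an assumption for illustration, not something confirmed in this thread, and `stripe_layout` is a hypothetical helper, not a Gluster API.

```python
def stripe_layout(file_size, block_size, n_bricks):
    """Return the bytes of real data each brick would hold, assuming
    whole stripe blocks are assigned to bricks round-robin."""
    per_brick = [0] * n_bricks
    offset = 0
    while offset < file_size:
        # The last chunk may be shorter than a full stripe block.
        chunk = min(block_size, file_size - offset)
        brick = (offset // block_size) % n_bricks
        per_brick[brick] += chunk
        offset += block_size
    return per_brick


KIB = 1024
# A 1 MiB file, 128 KiB stripe blocks, 5 bricks (the numbers from the post).
layout = stripe_layout(1024 * KIB, 128 * KIB, 5)
for brick, nbytes in enumerate(layout):
    print(f"brick {brick}: {nbytes // KIB} KiB")
```

Under this assumption the block size never has to be divisible by the number of bricks: the 8 blocks of a 1 MiB file are simply dealt out unevenly, and the total stored across bricks still equals the file size.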
Sabuj Pattanayek
2012-Feb-24 06:50 UTC
[Gluster-users] default cluster.stripe-block-size for striped volumes on 3.0.x vs 3.3 beta (128kb), performance change if i reduce to a smaller block size?
This seems to be a bug in XFS, as Joe pointed out:

http://oss.sgi.com/archives/xfs/2011-06/msg00233.html
http://stackoverflow.com/questions/6940516/create-sparse-file-with-alternate-data-and-hole-on-ext3-and-xfs

It appears to be present in the XFS that ships natively with RHEL6 and RHEL5.
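The linked discussions concern sparse files that alternate data and holes. A minimal sketch of how one might check this on a brick filesystem follows. It assumes each brick keeps its stripe blocks at their original file offsets and leaves holes where the other bricks' blocks would be; that layout is an assumption here, and `write_striped_sparse` and `sizes` are hypothetical helpers. The script writes one brick's share of a file and compares the apparent size with the space actually allocated.

```python
import os
import tempfile


def write_striped_sparse(path, file_size, block_size, n_bricks, brick_index):
    """Write only the chunks owned by brick_index at their original
    offsets, leaving holes for chunks that belong to other bricks."""
    with open(path, "wb") as f:
        for chunk_no, offset in enumerate(range(0, file_size, block_size)):
            if chunk_no % n_bricks == brick_index:
                f.seek(offset)
                f.write(b"x" * min(block_size, file_size - offset))
        # Extend to the full length so a trailing hole is part of the file.
        f.truncate(file_size)


def sizes(path):
    """Return (apparent size, allocated bytes) for path."""
    st = os.stat(path)
    # st_blocks is counted in 512-byte units on Linux.
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "brick0.part")
    write_striped_sparse(path, 10 * 1024 * 1024, 128 * 1024, 5, 0)
    apparent, allocated = sizes(path)
    print(f"apparent: {apparent} bytes, allocated: {allocated} bytes")
```

If the filesystem preserves the holes, the allocated figure should be close to one fifth of the apparent size; if it fills them in, the two figures converge, which would match every brick reporting nearly the full data size. The result depends on the filesystem and its mount options, so no expected output is given.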