ElĂas David
2014-Jan-23 05:33 UTC
[Gluster-users] Several questions from replicas to performance
Hello everyone,

The place I work for is starting to look at Gluster to replace a Windows share we currently have. The number of files and their sizes vary a lot, from very small (~5 KB) to somewhat large (~25 GB).

Since we're still checking whether Gluster is a viable option for us, and we're still learning the filesystem, we're fairly sure that the problem we're seeing right now comes from our own inexperience.

Right now our biggest concern during our tests is ridiculously low performance. I'm talking about a 'cp -R /home/user/* /mnt/data', where /mnt/data is mounted with 'mount -t glusterfs 192.168.0.10:/VolName /mnt/data', that took something ridiculous like 12 hours or more to transfer a mere 226 GB of data (everything from ISOs to documents to flat files...).

Our setup right now is this: we have 5 servers (peers), each with two 2 TB WD Black disks formatted with 'mkfs.xfs -i size=512'. Each disk is a brick, so we have:

192.168.0.10 disk0 (2 TB) on /gv0/vol1
192.168.0.10 disk1 (2 TB) on /gv0/vol2
192.168.0.11 disk0 (2 TB) on /gv0/vol1
192.168.0.11 disk1 (2 TB) on /gv0/vol2
and so on...
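In case it helps, this is roughly how I've been timing the writes (a small sketch; on our setup the target would be a file on the Gluster mount, e.g. /mnt/data/zerofile, but here it's a temp file so the commands run anywhere):

```shell
#!/bin/sh
# Write a 64 MB test file with dd and report rough throughput,
# mimicking the larger 50 GB dd test from the message above.
TARGET=$(mktemp)                     # stand-in for a file on /mnt/data
start=$(date +%s)
dd if=/dev/zero of="$TARGET" bs=1M count=64 2>/dev/null
end=$(date +%s)
elapsed=$(( end - start ))
[ "$elapsed" -eq 0 ] && elapsed=1    # avoid divide-by-zero on fast local disks
size_mb=$(( $(stat -c %s "$TARGET") / 1024 / 1024 ))
echo "wrote ${size_mb} MB in ${elapsed}s (~$(( size_mb / elapsed )) MB/s)"
rm -f "$TARGET"
```

For scale: 226 GB over 12 hours works out to only around 5 MB/s, far below what a 1 Gb link can carry.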
Total: 2 TB x 10 disks = 20 TB

Now, not yet being really sure how replica counts work, we created a volume with "replica 5", as in:

gluster vol create Data replica 5 server1:/gv0/vol1/data server2:/gv0/vol1/data server3:/gv0/vol1/data server4:/gv0/vol1/data server5:/gv0/vol1/data server1:/gv0/vol2/data server2:/gv0/vol2/data server3:/gv0/vol2/data server4:/gv0/vol2/data server5:/gv0/vol2/data

Vol info goes like this:

Volume Name: Data
Type: Distributed-Replicate
Volume ID: 2c938585-d2bd-43cf-98d8-caab70033750
Status: Started
Number of Bricks: 2 x 5 = 10
Transport-type: tcp
Bricks:
Brick1: 192.168.0.10:/gv0/vol1/data
Brick2: 192.168.0.11:/gv0/vol1/data
Brick3: 192.168.0.12:/gv0/vol1/data
Brick4: 192.168.0.13:/gv0/vol1/data
Brick5: 192.168.0.14:/gv0/vol1/data
Brick6: 192.168.0.10:/gv0/vol2/data
Brick7: 192.168.0.11:/gv0/vol2/data
Brick8: 192.168.0.12:/gv0/vol2/data
Brick9: 192.168.0.13:/gv0/vol2/data
Brick10: 192.168.0.14:/gv0/vol2/data

The servers aren't bad, really: Intel(R) Xeon(R) X5450 @ 3.00 GHz (8 cores), 32 GB of RAM, and a 1 Gb link for Gluster.

As I said earlier, I mounted this volume on another machine on the LAN using 'mount -t glusterfs 192.168.0.10:/Data /mnt/data' and used a simple cp -R to put data on it. I also tested with 'dd if=/dev/zero of=/mnt/data/zerofile bs=1M count=50000'; this dd process has been running for about 5 hours now and has only just reached about 24 GB of the 50 GB file size...

I'm pretty sure this problem is caused solely by our ignorance of the filesystem, and that's why I'm asking you guys.

The servers are all running CentOS 6.5 with the glusterfs-* packages from the EPEL repo, glusterfs version 3.4.2.

Another question I'd like to add, if I may: after mounting the volume, df -h shows a total capacity of 3.7 TB, when I expected something like 10 TB given ten 2 TB disks in a replica 5 setup. Is this normal, or am I misunderstanding the replica count thing?
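While writing this up I tried to reason it through myself: if usable capacity is simply raw capacity divided by the replica count, the numbers would come out as follows (a quick sketch, assuming that rule holds — which is exactly what I'm unsure about):

```shell
#!/bin/sh
# Hypothetical rule: usable = raw / replica_count (decimal TB).
raw_tb=20   # 10 bricks x 2 TB
for replica in 2 5; do
    echo "replica ${replica}: $(( raw_tb / replica )) TB usable"
done
# replica 2: 10 TB usable
# replica 5: 4 TB usable
```

Note that 4 decimal TB is about 3.7 TiB, which is suspiciously close to what df -h reports.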
That's it. Sorry for the long message; I just wanted to be clear. Any input or info about this would be greatly appreciated. Thanks!
Dean Bruhn
2014-Jan-23 14:34 UTC
[Gluster-users] Several questions from replicas to performance
Elias,

Looks like you've got your replication setup turned around. The replica count is the number of times you want the data replicated across the volume, so right now every file you write is being written to 5 bricks. I suspect you want more of a replica 2. You would modify your volume create command to look more like this:

gluster vol create replica 2 server1:/gv0/vol1/data server2:/gv0/vol1/data server3:/gv0/vol1/data server4:/gv0/vol1/data server5:/gv0/vol1/data server1:/gv0/vol2/data server2:/gv0/vol2/data server3:/gv0/vol2/data server4:/gv0/vol2/data server5:/gv0/vol2/data

- Dean

On Jan 22, 2014, at 11:33 PM, Elías David <elias.moreno.tec at gmail.com> wrote:

> [...]
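Spelled out, the full sequence might look like the sketch below. This is an assumption-laden sketch, not a tested recipe: it reuses the volume name "Data" from the vol info above, and with replica 2 Gluster groups each consecutive pair of bricks in the create command into a replica set, so with this ordering no pair lands on the same server and usable capacity should roughly double to ~10 TB.

```shell
# Sketch: recreate the volume as distributed-replicate with replica 2.
# Replica sets are formed from consecutive brick pairs:
#   (server1/vol1, server2/vol1), (server3/vol1, server4/vol1),
#   (server5/vol1, server1/vol2), (server2/vol2, server3/vol2),
#   (server4/vol2, server5/vol2)
gluster volume stop Data
gluster volume delete Data
gluster volume create Data replica 2 \
    server1:/gv0/vol1/data server2:/gv0/vol1/data \
    server3:/gv0/vol1/data server4:/gv0/vol1/data \
    server5:/gv0/vol1/data server1:/gv0/vol2/data \
    server2:/gv0/vol2/data server3:/gv0/vol2/data \
    server4:/gv0/vol2/data server5:/gv0/vol2/data
gluster volume start Data
```

One caveat: reusing brick directories from a deleted volume can make the create step complain that the path "is already part of a volume", so the brick directories may need to be cleaned out (or recreated) first.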