Pranith Kumar Karampuri
2017-May-17 09:02 UTC
[Gluster-users] Slow write times to gluster disk
On Tue, May 16, 2017 at 9:38 PM, Joe Julian <joe at julianfamily.org> wrote:

> On 04/13/17 23:50, Pranith Kumar Karampuri wrote:
>>
>> On Sat, Apr 8, 2017 at 10:28 AM, Ravishankar N <ravishankar at redhat.com> wrote:
>>>
>>> Hi Pat,
>>>
>>> I'm assuming you are using gluster native (fuse mount). If it helps, you
>>> could try mounting it via gluster NFS (gnfs) and then see if there is an
>>> improvement in speed. Fuse mounts are slower than gnfs mounts, but you
>>> get the benefit of avoiding a single point of failure. Unlike fuse
>>> mounts, if the gluster node containing the gnfs server goes down, all
>>> mounts done using that node will fail. For fuse mounts, you could try
>>> tweaking the write-behind xlator settings to see if it helps. See the
>>> performance.write-behind and performance.write-behind-window-size
>>> options in `gluster volume set help`. Of course, even for gnfs mounts,
>>> you can achieve fail-over by using CTDB.
>>
>> Ravi,
>> Do you have any data that suggests fuse mounts are slower than gNFS
>> servers?
>>
>> Pat,
>> I see that I am late to the thread, but do you happen to have
>> "profile info" of the workload?
>
> I have done actual testing. For directory ops, NFS is faster due to the
> default cache settings in the kernel. For raw throughput, or ops on an
> open file, fuse is faster.
>
> I have yet to test this, but I expect that with the newer caching features
> in 3.8+, even directory op performance should be similar to NFS, and more
> accurate.

We are actually comparing fuse+gluster and kernel NFS on the same brick.
Did you get a chance to do this test at any point?

>> You can follow
>> https://gluster.readthedocs.io/en/latest/Administrator%20Guide/Monitoring%20Workload/
>> to get the information.
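The profile data Pranith is asking for can be gathered with the gluster CLI commands covered in that guide. A minimal sketch, run on one of the servers; the volume name `gdata` below is a placeholder, since the actual volume name is not shown in this thread:

    # start collecting per-brick, per-FOP latency and call counters
    gluster volume profile gdata start

    # run the slow workload (e.g. the dd tests below), then dump the counters
    gluster volume profile gdata info

    # stop collection once the output has been captured
    gluster volume profile gdata stop

The `info` output breaks down each file operation (WRITE, LOOKUP, FLUSH, ...) per brick, which usually shows whether the time is going into the writes themselves or into the surrounding metadata calls.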
>>> Thanks,
>>> Ravi
>>>
>>> On 04/08/2017 12:07 AM, Pat Haley wrote:
>>>>
>>>> Hi,
>>>>
>>>> We noticed a dramatic slowness when writing to a gluster disk when
>>>> compared to writing to an NFS disk. Specifically, when using dd (data
>>>> duplicator) to write a 4.3 GB file of zeros:
>>>>
>>>> - on NFS disk (/home): 9.5 Gb/s
>>>> - on gluster disk (/gdata): 508 Mb/s
>>>>
>>>> The gluster disk is 2 bricks joined together, no replication or
>>>> anything else. The hardware is (literally) the same:
>>>>
>>>> - one server with 70 hard disks and a hardware RAID card
>>>>   - 4 disks in a RAID-6 group (the NFS disk)
>>>>   - 32 disks in a RAID-6 group (the max allowed by the card, /mnt/brick1)
>>>>   - 32 disks in another RAID-6 group (/mnt/brick2)
>>>>   - 2 hot spares
>>>>
>>>> Some additional information and more test results (after changing the
>>>> log level):
>>>>
>>>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>>>> CentOS release 6.8 (Final)
>>>> RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS-3 3108
>>>> [Invader] (rev 02)
>>>>
>>>> *Create the file on /gdata (gluster)*
>>>> [root at mseas-data2 gdata]# dd if=/dev/zero of=/gdata/zero1 bs=1M count=1000
>>>> 1000+0 records in
>>>> 1000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 1.91876 s, *546 MB/s*
>>>>
>>>> *Create the file on /home (ext4)*
>>>> [root at mseas-data2 gdata]# dd if=/dev/zero of=/home/zero1 bs=1M count=1000
>>>> 1000+0 records in
>>>> 1000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 0.686021 s, *1.5 GB/s* - 3 times as fast
>>>>
>>>> *Copy from /gdata to /gdata (gluster to gluster)*
>>>> [root at mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>>> 2048000+0 records in
>>>> 2048000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 101.052 s, *10.4 MB/s* - realllyyy slooowww
>>>>
>>>> *Copy from /gdata to /gdata, 2nd time (gluster to gluster)*
>>>> [root at mseas-data2 gdata]# dd if=/gdata/zero1 of=/gdata/zero2
>>>> 2048000+0 records in
>>>> 2048000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 92.4904 s, *11.3 MB/s* - realllyyy slooowww again
>>>>
>>>> *Copy from /home to /home (ext4 to ext4)*
>>>> [root at mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero2
>>>> 2048000+0 records in
>>>> 2048000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 3.53263 s, *297 MB/s* - 30 times as fast
>>>>
>>>> *Copy from /home to /home (ext4 to ext4)*
>>>> [root at mseas-data2 gdata]# dd if=/home/zero1 of=/home/zero3
>>>> 2048000+0 records in
>>>> 2048000+0 records out
>>>> 1048576000 bytes (1.0 GB) copied, 4.1737 s, *251 MB/s* - 30 times as fast
>>>>
>>>> As a test, can we copy data directly to the xfs mountpoint (/mnt/brick1)
>>>> and bypass gluster?
>>>>
>>>> Any help you could give us would be appreciated.
>>>>
>>>> Thanks
>>>>
>>>> --
>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>> Pat Haley                          Email:  phaley at mit.edu
>>>> Center for Ocean Engineering       Phone:  (617) 253-6824
>>>> Dept. of Mechanical Engineering    Fax:    (617) 253-8125
>>>> MIT, Room 5-213                    http://web.mit.edu/phaley/www/
>>>> 77 Massachusetts Avenue
>>>> Cambridge, MA 02139-4301

--
Pranith
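Pat's closing question about bypassing gluster is easy to test directly, since the bricks are plain XFS filesystems. A sketch, with `ddtest.tmp` as a hypothetical scratch file name; writing straight into a brick is only safe as a throw-away test, so the file should be removed from the brick afterwards:

    # 1 GB write directly to the brick filesystem, bypassing gluster;
    # conv=fdatasync flushes to disk so the page cache doesn't inflate the number
    dd if=/dev/zero of=/mnt/brick1/ddtest.tmp bs=1M count=1000 conv=fdatasync

    # the same write through the fuse mount, for comparison
    dd if=/dev/zero of=/gdata/ddtest.tmp bs=1M count=1000 conv=fdatasync

    # the gluster-to-gluster copies above used dd's default 512-byte block
    # size (hence the 2048000 records per GB); an explicit bs= helps separate
    # per-request overhead from raw throughput
    dd if=/gdata/zero1 of=/gdata/zero2 bs=1M

If the direct-to-brick number comes out close to the ext4 results, the bottleneck is in the gluster/fuse path rather than in the RAID groups themselves.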
On 05/17/17 02:02, Pranith Kumar Karampuri wrote:
> On Tue, May 16, 2017 at 9:38 PM, Joe Julian <joe at julianfamily.org> wrote:
>> I have done actual testing. For directory ops, NFS is faster due to the
>> default cache settings in the kernel. For raw throughput, or ops on an
>> open file, fuse is faster.
>>
>> I have yet to test this, but I expect that with the newer caching
>> features in 3.8+, even directory op performance should be similar to
>> NFS, and more accurate.
>
> We are actually comparing fuse+gluster and kernel NFS on the same brick.
> Did you get a chance to do this test at any point?

No, that's not comparing like to like, and I've rarely had a use case to
which a single-store NFS was the answer.
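Ravi's earlier suggestions, mounting over gnfs and tuning the write-behind xlator, would look roughly like the following. The volume name `gdata`, the mount point /gdata-nfs, and the 4MB window size are placeholders (only the server name comes from the shell prompts earlier in the thread):

    # mount the volume through gluster's built-in NFSv3 server instead of fuse
    mkdir -p /gdata-nfs
    mount -t nfs -o vers=3 mseas-data2:/gdata /gdata-nfs

    # or, staying with the fuse mount, enlarge the write-behind window
    gluster volume set gdata performance.write-behind on
    gluster volume set gdata performance.write-behind-window-size 4MB

As with any of the options listed by `gluster volume set help`, it is worth benchmarking before and after rather than assuming an improvement.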