Ben Turner
2015-Nov-16 18:02 UTC
[Gluster-users] rsync to gluster mount: self-heal and bad performance
----- Original Message -----
> From: "Tiemen Ruiten" <t.ruiten at rdmedia.com>
> To: "Ben Turner" <bturner at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Monday, November 16, 2015 5:00:20 AM
> Subject: Re: [Gluster-users] rsync to gluster mount: self-heal and bad performance
>
> Hello Ben,
>
> Thank you for your answer. I don't see the same errors when just
> creating a number of files, e.g. touch test{0000..9999}. Performance is
> not great, but after a few minutes it finishes successfully.

We have made a lot of small-file perf enhancements; try the following:

-Run on RHEL 7, I have seen a good improvement on el7 over el6.
-Set lookup optimize on.
-Set client and server event threads to 4.
-There is a metadata perf regression that could be affecting your rsync
 as well; keep an eye on https://bugzilla.redhat.com/show_bug.cgi?id=1250803
 for the fix.
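The second and third items map onto volume options. From memory, on 3.7
the commands should look something like this (using your volume name from
the output below; double-check the names against gluster volume set help):

    # enable DHT lookup optimization
    gluster volume set lpxassets cluster.lookup-optimize on
    # bump client- and server-side event threads
    gluster volume set lpxassets client.event-threads 4
    gluster volume set lpxassets server.event-threads 4

You can verify they stuck under "Options Reconfigured" in gluster volume
info lpxassets.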
> I'm running rsync through lsyncd, the options are:
>
> /usr/bin/rsync --delete --ignore-errors -zslt -r $source $destination
>
> I'm running it over a LAN network, between two VMs. The volume is
> indeed mounted with --acl, but on the directory I'm syncing to I
> haven't set them explicitly:

Do you need ACLs? If not, can you try without that option? I am
wondering if there is a bug with ACLs that could be causing the
self-heals to happen. If we don't see it without ACLs, that gives us
somewhere to look.

> [tiemen at iron2 test]$ getfacl stg/
> # file: stg/
> # owner: root
> # group: rdcompany
> # flags: -s-
> user::rwx
> group::rwx
> other::r-x
>
> Volume options:
>
> [tiemen at iron2 test]$ sudo gluster volume info lpxassets
>
> Volume Name: lpxassets
> Type: Replicate
> Volume ID: fea00430-63b1-4a4e-bc38-b74d3732acf4
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: iron2:/data/brick/lpxassets
> Brick2: cobalt2:/data/brick/lpxassets
> Brick3: arbiter:/data/arbiter/lpxassets
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> cluster.quorum-type: auto
> cluster.enable-shared-storage: enable
> nfs-ganesha: disable
>
> Any other info I could provide?
>
> On 15 November 2015 at 18:11, Ben Turner <bturner at redhat.com> wrote:
>
> > ----- Original Message -----
> > > From: "Tiemen Ruiten" <t.ruiten at rdmedia.com>
> > > To: "gluster-users" <gluster-users at gluster.org>
> > > Sent: Sunday, November 15, 2015 5:22:08 AM
> > > Subject: Re: [Gluster-users] rsync to gluster mount: self-heal and bad performance
> > >
> > > Any other suggestions?
> >
> > You are correct, rsync should not cause a self-heal on every file. It
> > makes me think that Ernie is right and that something is
> > misconfigured. If you just create a bunch of files outside of rsync,
> > do you see the same behavior? What rsync command are you running, and
> > where are you syncing the data from? I see you have the acl mount
> > option; are you using ACLs?
> >
> > -b
> >
> > > On 13 November 2015 at 09:56, Tiemen Ruiten <t.ruiten at rdmedia.com> wrote:
> > >
> > > Hello Ernie, list,
> > >
> > > No, that's not the case. The volume is mounted through
> > > glusterfs-fuse - on the same server running one of the bricks. The
> > > fstab:
> > >
> > > # /etc/fstab
> > > # Created by anaconda on Tue Aug 18 18:10:49 2015
> > > #
> > > # Accessible filesystems, by reference, are maintained under '/dev/disk'
> > > # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
> > > #
> > > UUID=56778fed-bf3f-435e-8c32-edaa8c707f29 / xfs defaults 0 0
> > > UUID=a44e32ed-cfbe-4ba0-896f-1efff9397ba1 /boot xfs defaults 0 0
> > > UUID=a344d2bc-266d-4905-85b1-fbb7fe927659 swap swap defaults 0 0
> > > /dev/vdb1 /data/brick xfs defaults 1 2
> > > iron2:/lpxassets /mnt/lpxassets glusterfs _netdev,acl 0 0
> > >
> > > On 12 November 2015 at 22:50, Ernie Dunbar <maillist at lightspeed.ca> wrote:
> > >
> > > Hi Tiemen
> > >
> > > It sounds like you're trying to rsync files onto your Gluster
> > > server itself, rather than into the Gluster filesystem. You want to
> > > copy these files into the mounted filesystem (typically on some
> > > other system than the Gluster servers), because Gluster is designed
> > > to handle it that way.
> > >
> > > I can't remember the nitty-gritty details about why this is, but
> > > I've made this mistake before as well. Hope that helps. :)
> > >
> > > On 2015-11-12 11:31, Tiemen Ruiten wrote:
> > >
> > > Hello,
> > >
> > > While rsyncing to a directory mounted through glusterfs fuse,
> > > performance is very bad and it appears every synced file generates
> > > a (metadata) self-heal.
> > >
> > > The volume is mounted with option acl and ACLs are set on a
> > > subdirectory.
> > >
> > > Setup is as follows: two CentOS 7 VMs (KVM) with Gluster 3.7.6, and
> > > one physical CentOS 6 node, also Gluster 3.7.6, which functions as
> > > the arbiter. So it's a replica 3 arbiter 1 volume. The bricks are
> > > LVM volumes with an XFS filesystem.
> > >
> > > While I don't think I should expect top performance for rsync on
> > > Gluster, I wouldn't expect every synced file to trigger a
> > > self-heal. Anything I can do to improve this? Should I file a bug?
> > >
> > > Another thing that looks related: I see a lot of these messages,
> > > especially when doing IO:
> > >
> > > [2015-11-12 19:25:42.185904] I [dict.c:473:dict_get]
> > > (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x121)
> > > [0x7fdcc2d31161]
> > > -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk+0x242)
> > > [0x7fdcc2b1b212] -->/lib64/libglusterfs.so.0(dict_get+0xac)
> > > [0x7fdcd5e770cc] ) 0-dict: !this || key=system.posix_acl_default
> > > [Invalid argument]
> > >
> > > --
> > > Tiemen Ruiten
> > > Systems Engineer
> > > R&D Media
Tiemen Ruiten
2015-Nov-17 16:19 UTC
[Gluster-users] rsync to gluster mount: self-heal and bad performance
Thank you Ben. I'm already running on 7; I've set the options you
recommended and indeed file creation is faster now.

I double-checked my config and found out that the filesystem of the brick
on the arbiter node doesn't support ACLs: the underlying fs is ext4
without the acl mount option, while the other bricks are XFS (where ACLs
are always enabled). Do all the bricks need to support ACLs?

To keep things simple, and since ACLs aren't strictly needed for my
setup, it makes sense to remove them. I ran some tests and can now
confirm that I don't see the self-heals when the volume isn't mounted
with --acl. I don't have exact numbers, but I have the impression that
syncing is faster as well.
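For what it's worth, this is roughly how I checked the arbiter brick. The
device path is specific to my machine, and on ext4 the acl option can
also be baked into the superblock's default mount options, so fstab alone
doesn't tell the whole story:

    # is the brick filesystem currently mounted with acl?
    mount | grep /data/arbiter

    # check the ext4 superblock defaults (device path is just an example)
    tune2fs -l /dev/sdb1 | grep 'Default mount options'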
On 16 November 2015 at 19:02, Ben Turner <bturner at redhat.com> wrote:

> Do you need ACLs? If not can you try without that option? I am
> wondering if there is a bug with ACLs that could be causing the self
> heals to happen. If we don't see it without ACLs that can give us
> somewhere to look at.
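To answer that concretely: the test I ran looked roughly like this
(volume name and mount path as in my earlier mails):

    # remount the FUSE client without the acl option
    umount /mnt/lpxassets
    mount -t glusterfs iron2:/lpxassets /mnt/lpxassets

    # rerun the rsync, then check whether any heals are pending
    gluster volume heal lpxassets info

With acl gone from the mount, nothing shows up there anymore while the
sync runs.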
--
Tiemen Ruiten
Systems Engineer
R&D Media