Ben Turner
2015-Nov-16 18:02 UTC
[Gluster-users] rsync to gluster mount: self-heal and bad performance
----- Original Message -----
> From: "Tiemen Ruiten" <t.ruiten at rdmedia.com>
> To: "Ben Turner" <bturner at redhat.com>
> Cc: "gluster-users" <gluster-users at gluster.org>
> Sent: Monday, November 16, 2015 5:00:20 AM
> Subject: Re: [Gluster-users] rsync to gluster mount: self-heal and bad performance
>
> Hello Ben,
>
> Thank you for your answer. I don't see the same errors when just
> creating a number of files, e.g. touch test{0000..9999}. Performance is
> not great, but after a few minutes it finishes successfully.

We have made a lot of small-file perf enhancements; try the following:

-Run on RHEL 7, I have seen a good improvement on el7 over el6.
-Set lookup optimize on.
-Set client and server event threads to 4.
-There is a metadata perf regression that could be affecting your rsync
 as well; keep an eye on https://bugzilla.redhat.com/show_bug.cgi?id=1250803
 for the fix.
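The second and third items map onto volume options. From memory, on 3.7
the commands should look something like this (using your volume name from
the output below; double-check the names against gluster volume set help):

    # enable DHT lookup optimization
    gluster volume set lpxassets cluster.lookup-optimize on
    # bump client- and server-side event threads
    gluster volume set lpxassets client.event-threads 4
    gluster volume set lpxassets server.event-threads 4

You can verify they stuck under "Options Reconfigured" in gluster volume
info lpxassets.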
> I'm running rsync through lsyncd, the options are:
>
> /usr/bin/rsync --delete --ignore-errors -zslt -r $source $destination
>
> I'm running it over a LAN network, between two VMs. The volume is
> indeed mounted with --acl, but on the directory I'm syncing to I
> haven't set them explicitly:

Do you need ACLs? If not, can you try without that option? I am
wondering if there is a bug with ACLs that could be causing the
self-heals to happen. If we don't see it without ACLs, that gives us
somewhere to look.

> [tiemen at iron2 test]$ getfacl stg/
> # file: stg/
> # owner: root
> # group: rdcompany
> # flags: -s-
> user::rwx
> group::rwx
> other::r-x
>
> Volume options:
>
> [tiemen at iron2 test]$ sudo gluster volume info lpxassets
>
> Volume Name: lpxassets
> Type: Replicate
> Volume ID: fea00430-63b1-4a4e-bc38-b74d3732acf4
> Status: Started
> Number of Bricks: 1 x 3 = 3
> Transport-type: tcp
> Bricks:
> Brick1: iron2:/data/brick/lpxassets
> Brick2: cobalt2:/data/brick/lpxassets
> Brick3: arbiter:/data/arbiter/lpxassets
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> cluster.quorum-type: auto
> cluster.enable-shared-storage: enable
> nfs-ganesha: disable
>
> Any other info I could provide?
>
> On 15 November 2015 at 18:11, Ben Turner <bturner at redhat.com> wrote:
>
> > ----- Original Message -----
> > > From: "Tiemen Ruiten" <t.ruiten at rdmedia.com>
> > > To: "gluster-users" <gluster-users at gluster.org>
> > > Sent: Sunday, November 15, 2015 5:22:08 AM
> > > Subject: Re: [Gluster-users] rsync to gluster mount: self-heal and bad performance
> > >
> > > Any other suggestions?
> >
> > You are correct, rsync should not cause a self-heal on every file. It
> > makes me think that Ernie is right and that something is
> > misconfigured. If you just create a bunch of files outside of rsync,
> > do you see the same behavior? What rsync command are you running, and
> > where are you syncing the data from? I see you have the acl mount
> > option; are you using ACLs?
> >
> > -b
> >
> > > On 13 November 2015 at 09:56, Tiemen Ruiten <t.ruiten at rdmedia.com> wrote:
> > >
> > > Hello Ernie, list,
> > >
> > > No, that's not the case. The volume is mounted through
> > > glusterfs-fuse - on the same server running one of the bricks. The
> > > fstab:
> > >
> > > # /etc/fstab
> > > # Created by anaconda on Tue Aug 18 18:10:49 2015
> > > #
> > > # Accessible filesystems, by reference, are maintained under '/dev/disk'
> > > # See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
> > > #
> > > UUID=56778fed-bf3f-435e-8c32-edaa8c707f29 / xfs defaults 0 0
> > > UUID=a44e32ed-cfbe-4ba0-896f-1efff9397ba1 /boot xfs defaults 0 0
> > > UUID=a344d2bc-266d-4905-85b1-fbb7fe927659 swap swap defaults 0 0
> > > /dev/vdb1 /data/brick xfs defaults 1 2
> > > iron2:/lpxassets /mnt/lpxassets glusterfs _netdev,acl 0 0
> > >
> > > On 12 November 2015 at 22:50, Ernie Dunbar <maillist at lightspeed.ca> wrote:
> > >
> > > Hi Tiemen
> > >
> > > It sounds like you're trying to rsync files onto your Gluster
> > > server itself, rather than into the Gluster filesystem. You want to
> > > copy these files into the mounted filesystem (typically on some
> > > other system than the Gluster servers), because Gluster is designed
> > > to handle it that way.
> > >
> > > I can't remember the nitty-gritty details about why this is, but
> > > I've made this mistake before as well. Hope that helps. :)
> > >
> > > On 2015-11-12 11:31, Tiemen Ruiten wrote:
> > >
> > > Hello,
> > >
> > > While rsyncing to a directory mounted through glusterfs fuse,
> > > performance is very bad and it appears every synced file generates
> > > a (metadata) self-heal.
> > >
> > > The volume is mounted with option acl and ACLs are set on a
> > > subdirectory.
> > >
> > > Setup is as follows: two CentOS 7 VMs (KVM) with Gluster 3.7.6, and
> > > one physical CentOS 6 node, also Gluster 3.7.6, which functions as
> > > the arbiter. So it's a replica 3 arbiter 1 volume. The bricks are
> > > LVM volumes with an XFS filesystem.
> > >
> > > While I don't think I should expect top performance for rsync on
> > > Gluster, I wouldn't expect every synced file to trigger a
> > > self-heal. Anything I can do to improve this? Should I file a bug?
> > >
> > > Another thing that looks related: I see a lot of these messages,
> > > especially when doing IO:
> > >
> > > [2015-11-12 19:25:42.185904] I [dict.c:473:dict_get]
> > > (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x121)
> > > [0x7fdcc2d31161]
> > > -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk+0x242)
> > > [0x7fdcc2b1b212] -->/lib64/libglusterfs.so.0(dict_get+0xac)
> > > [0x7fdcd5e770cc] ) 0-dict: !this || key=system.posix_acl_default
> > > [Invalid argument]
> > >
> > > --
> > > Tiemen Ruiten
> > > Systems Engineer
> > > R&D Media
Tiemen Ruiten
2015-Nov-17 16:19 UTC
[Gluster-users] rsync to gluster mount: self-heal and bad performance
Thank you Ben. I'm already running on 7; I've set the options you
recommended and indeed file creation is faster now.

I double-checked my config and found out that the filesystem of the brick
on the arbiter node doesn't support ACLs: the underlying fs is ext4
without the acl mount option, while the other bricks are XFS (where ACLs
are always enabled). Do all the bricks need to support ACLs?

To keep things simple, and since ACLs aren't strictly needed for my
setup, it makes sense to remove them. I ran some tests and can now
confirm that I don't see the self-heals when the volume isn't mounted
with --acl. I don't have exact numbers, but I have the impression that
syncing is faster as well.
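For what it's worth, this is roughly how I checked the arbiter brick. The
device path is specific to my machine, and on ext4 the acl option can
also be baked into the superblock's default mount options, so fstab alone
doesn't tell the whole story:

    # is the brick filesystem currently mounted with acl?
    mount | grep /data/arbiter

    # check the ext4 superblock defaults (device path is just an example)
    tune2fs -l /dev/sdb1 | grep 'Default mount options'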
On 16 November 2015 at 19:02, Ben Turner <bturner at redhat.com> wrote:

> Do you need ACLs? If not can you try without that option? I am
> wondering if there is a bug with ACLs that could be causing the self
> heals to happen. If we don't see it without ACLs that can give us
> somewhere to look at.
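To answer that concretely: the test I ran looked roughly like this
(volume name and mount path as in my earlier mails):

    # remount the FUSE client without the acl option
    umount /mnt/lpxassets
    mount -t glusterfs iron2:/lpxassets /mnt/lpxassets

    # rerun the rsync, then check whether any heals are pending
    gluster volume heal lpxassets info

With acl gone from the mount, nothing shows up there anymore while the
sync runs.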
--
Tiemen Ruiten
Systems Engineer
R&D Media