Frank Ruehlemann
2018-Apr-23 13:22 UTC
[Gluster-users] Problems since 3.12.7: invisible files, strange rebalance size, setxattr failed during rebalance and broken unix rights
Hi,
after 2 years of running GlusterFS without major problems, we have
recently run into some strange errors.
After updating to 3.12.7, some users reported at least 4 broken
directories with some invisible files. The files exist on the bricks and
don't start with a dot, but aren't visible in "ls". Clients can still
access them by using the explicit path.
More information: https://bugzilla.redhat.com/show_bug.cgi?id=1564071
Also, since this update, "gluster volume rebalance $myvolume status"
reports more than 16900 PB (petabytes!) of rebalanced data for one of our
2 servers. The run time looks right, but the size of the transferred
files is absurd. That rebalance ran with 3.12.6 in March 2018, and its
log file listed no errors and a realistic size at the end.
We started a new rebalance today during a downtime of the corresponding
compute cluster, since these errors had started to spread and a rebalance
might help. The output of "gluster volume rebalance $myvolume status"
doesn't list any errors so far, and the numbers look realistic.
But we're seeing some strange errors reported in journald every few
minutes:
[2018-04-23 12:31:24.942377] E [MSGID: 113001] [posix.c:5983:_posix_handle_xattr_keyvalue_pair] 0-$myvolume-posix: setxattr failed on /srv/glusterfs/bricks/DATA112/data/.glusterfs/e6/a8/e6a8ce50-fda5-4bad-8d4d-acd25dafcaa2 while doing xattrop: key=trusted.glusterfs.quota.1ce02d3b-b7ae-4485-903c-2991de5350b6.contri.1 [No such file or directory]
The rebalance log file lists no errors.
Has anybody seen similar error messages during a rebalance?
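For anyone retracing this: the path in the message above follows the
brick's .glusterfs layout, where the first two and the next two hex
characters of the GFID form two directory levels. A minimal sketch that
rebuilds that handle path and checks whether it is present on the brick.
The function name gfid_to_handle is made up here for illustration, and
the check only makes sense when run on the server that holds the brick:

```shell
#!/bin/sh
# Build the .glusterfs handle path for a GFID on a given brick.
# $1 = brick root directory, $2 = GFID in its usual dashed form.
gfid_to_handle() {
    brick=$1
    gfid=$2
    first=$(printf '%s' "$gfid" | cut -c1-2)    # first two hex characters
    second=$(printf '%s' "$gfid" | cut -c3-4)   # next two hex characters
    printf '%s/.glusterfs/%s/%s/%s\n' "$brick" "$first" "$second" "$gfid"
}

# Brick and GFID taken from the log message above.
handle=$(gfid_to_handle /srv/glusterfs/bricks/DATA112/data \
                        e6a8ce50-fda5-4bad-8d4d-acd25dafcaa2)
echo "$handle"

if [ -e "$handle" ]; then
    echo "handle exists"
elif [ -L "$handle" ]; then
    echo "handle is a dangling symlink"
else
    echo "handle is missing"
fi
```

A missing or dangling handle would at least be consistent with the
"No such file or directory" part of the message; it does not by itself
explain why the handle is gone.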
We also see some duplicated files: there are two copies on different
bricks (we're running a distributed volume).
One copy looks like this:
$ ls -lah
-rwxr--r-- 2 $user $group 293 May 11 2017 config
The other one looks rather strange:
$ ls -lah
---------T 2 root $group 0 May 11 2017 config
Has anybody seen similar broken files?
We're using gluster 3.12 from the gluster.org repositories on a standard
Debian 9 with XFS-formatted bricks.
Hopefully somebody has an idea how to fix this.
At the very least, somebody in the future might find this thread, since
we didn't find anything when searching for these errors. If you're from
the future: Good luck! (^_^)
So far,
--
Frank Rühlemann
IT-Systemtechnik
UNIVERSITÄT ZU LÜBECK
IT-Service-Center
Ratzeburger Allee 160
23562 Lübeck
Tel +49 451 3101 2034
Fax +49 451 3101 2004
ruehlemann at itsc.uni-luebeck.de
www.itsc.uni-luebeck.de
Nithya Balachandran
2018-Apr-23 13:42 UTC
[Gluster-users] Problems since 3.12.7: invisible files, strange rebalance size, setxattr failed during rebalance and broken unix rights
Hi,
What is the output of 'gluster volume info' for this volume?
Regards,
Nithya
Frank Ruehlemann
2018-Apr-23 14:06 UTC
[Gluster-users] Problems since 3.12.7: invisible files, strange rebalance size, setxattr failed during rebalance and broken unix rights
Hi,
here it is.
# gluster volume info $myvolume
Volume Name: $myvolume
Type: Distribute
Volume ID: 0d210c70-e44f-46f1-862c-ef260514c9f1
Status: Started
Snapshot Count: 0
Number of Bricks: 23
Transport-type: tcp
Bricks:
Brick1: gluster02:/srv/glusterfs/bricks/DATA201/data
Brick2: gluster02:/srv/glusterfs/bricks/DATA202/data
Brick3: gluster02:/srv/glusterfs/bricks/DATA203/data
Brick4: gluster02:/srv/glusterfs/bricks/DATA204/data
Brick5: gluster02:/srv/glusterfs/bricks/DATA205/data
Brick6: gluster02:/srv/glusterfs/bricks/DATA206/data
Brick7: gluster02:/srv/glusterfs/bricks/DATA207/data
Brick8: gluster02:/srv/glusterfs/bricks/DATA208/data
Brick9: gluster01:/srv/glusterfs/bricks/DATA110/data
Brick10: gluster01:/srv/glusterfs/bricks/DATA111/data
Brick11: gluster01:/srv/glusterfs/bricks/DATA112/data
Brick12: gluster01:/srv/glusterfs/bricks/DATA113/data
Brick13: gluster01:/srv/glusterfs/bricks/DATA114/data
Brick14: gluster02:/srv/glusterfs/bricks/DATA209/data
Brick15: gluster01:/srv/glusterfs/bricks/DATA101/data
Brick16: gluster01:/srv/glusterfs/bricks/DATA102/data
Brick17: gluster01:/srv/glusterfs/bricks/DATA103/data
Brick18: gluster01:/srv/glusterfs/bricks/DATA104/data
Brick19: gluster01:/srv/glusterfs/bricks/DATA105/data
Brick20: gluster01:/srv/glusterfs/bricks/DATA106/data
Brick21: gluster01:/srv/glusterfs/bricks/DATA107/data
Brick22: gluster01:/srv/glusterfs/bricks/DATA108/data
Brick23: gluster01:/srv/glusterfs/bricks/DATA109/data
Options Reconfigured:
features.quota-deem-statfs: on
features.inode-quota: on
features.quota: on
auth.allow: $myipspace
performance.readdir-ahead: on
diagnostics.brick-log-level: WARNING
nfs.disable: on
transport.address-family: inet
nfs.addr-namelookup: off
diagnostics.brick-sys-log-level: WARNING
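With 23 bricks spread over two servers, it can help to see the split per
host at a glance. A small sketch, assuming the "BrickN: host:/path"
output format shown above; the function name count_bricks_per_host is
made up for illustration:

```shell
#!/bin/sh
# Count bricks per server from `gluster volume info` output read on stdin.
# Only lines of the form "BrickN: host:/path" are counted; the "Bricks:"
# header and the "Number of Bricks:" line do not match the pattern.
count_bricks_per_host() {
    awk -F'[: ]+' '/^Brick[0-9]+:/ { n[$2]++ }
                   END { for (h in n) print h, n[h] }' | sort
}

# Usage, on a node of the trusted pool ($myvolume as in this thread):
#   gluster volume info "$myvolume" | count_bricks_per_host
```

For the listing above this gives 14 bricks on gluster01 and 9 on
gluster02.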
Well, at least one thing got fixed by this reboot: "df -h" returns a
realistic size of the volume etc. again. This wasn't the case after our
update to 3.12.7.
Best Regards,
--
Frank Rühlemann
Nithya Balachandran
2018-Apr-23 16:21 UTC
[Gluster-users] Problems since 3.12.7: invisible files, strange rebalance size, setxattr failed during rebalance and broken unix rights
Hi,
On 23 April 2018 at 18:52, Frank Ruehlemann <ruehlemann at itsc.uni-luebeck.de> wrote:
> After updating to 3.12.7 [...] at least 4 broken
> directories with some invisible files. [...]
> More information: https://bugzilla.redhat.com/show_bug.cgi?id=1564071
I will continue the analysis for this issue in the bug.
> [...] The time looks right, but the size of [...] files is absurd.
> The rebalance was with 3.12.6 in March 2018.
This has been seen a few times and is because an incorrect value is stored
in the node_state.info file. However, I don't know what causes this
incorrect value to be stored. It is harmless and can be ignored.
> [...] setxattr failed [...] while doing xattrop [...]
> Has anybody seen similar error messages during a rebalance?
Are any directories being deleted/renamed during the rebalance? If yes,
this could be a valid message.
> The other one looks rather strange:
> $ ls -lah
> ---------T 2 root $group 0 May 11 2017 config
> Has anybody seen similar broken files?
This is fine as long as you only see a single file from the mount point.
The 'T' files are internal gluster files (called linkto files) and should
be invisible from the mount point.
Regards,
Nithya
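To double-check on a brick that such a file really is a linkto file, one
can look at its mode, its size and the trusted.glusterfs.dht.linkto
extended attribute. A rough sketch, assuming GNU stat and getfattr are
installed and that it is run as root on the brick server (trusted.*
attributes are not readable otherwise). The function names are made up
for illustration:

```shell
#!/bin/sh
# An idle DHT linkto file shows up on the brick as mode ---------T
# (octal 1000) with size 0.
# $1 = octal mode as printed by `stat -c %a`, $2 = size in bytes.
looks_like_linkto() {
    [ "$1" = "1000" ] && [ "$2" = "0" ]
}

# Inspect one file on a brick and, if it looks like a linkto file,
# print the xattr that names the subvolume holding the real data.
inspect_brick_file() {
    f=$1
    mode=$(stat -c '%a' "$f") || return 1
    size=$(stat -c '%s' "$f") || return 1
    if looks_like_linkto "$mode" "$size"; then
        echo "$f: looks like a linkto file"
        getfattr -n trusted.glusterfs.dht.linkto -e text "$f"
    else
        echo "$f: does not look like a linkto file (mode $mode, size $size)"
    fi
}

# Usage, with a path on the brick (not on the mount point):
#   inspect_brick_file /path/on/brick/config
```

The mode and size test is only a heuristic; the extended attribute is
the part that actually identifies a linkto file.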
Frank Ruehlemann
2018-Apr-24 08:26 UTC
[Gluster-users] Problems since 3.12.7: invisible files, strange rebalance size, setxattr failed during rebalance and broken unix rights
Hi,
thank you for your quick answer.
On Monday, 23.04.2018, at 21:51 +0530, Nithya Balachandran wrote:
> I will continue the analysis for this issue in the bug.
This would be very helpful. We saw your request for additional information
and will provide it as soon as possible.
> This has been seen a few times and is because an incorrect value is stored
> in the node_state.info file. However, I don't know what causes this
> incorrect value to be stored. It is harmless and can be ignored.
Ok. :)
> Are any directories being deleted/renamed during the rebalance? If yes,
> this could be a valid message.
No. We locked out all users and took down all clients that mount the
volume before we started the rebalance, to ensure that no client
interacts with it.
The messages continued during the last hours and occurred up to several
times per minute on all bricks of this volume, with some sporadic phases
without them.
> This is fine as long as you only see a single file from the mount point.
> The 'T' files are internal gluster files (called linkto files) and should
> be invisible from the mount point.
This is good to know. Yes, all files we saw so far had only one of those
files.
Thanks for your message. It helped a lot.
--
Frank Rühlemann