thr3ads.net - Gluster users - [Gluster-users] Gluster 3.12.14: wrong quota in Distributed Dispersed Volume [Nov 2018]


Hari Gowtham

2018-Nov-20 11:29 UTC

[Gluster-users] Gluster 3.12.14: wrong quota in Distributed Dispersed Volume

Reply inline.

On Tue, Nov 20, 2018 at 3:53 PM Gudrun Mareike Amedick
<g.amedick at uni-luebeck.de> wrote:
> Hi,
>
> I think I know what happened. According to the logs, the crawlers received
> a signum(15). They seem to have died before finishing, probably because
> there was too much to do simultaneously. I have disabled and re-enabled
> quota and will set the quotas again with more time.
>
> Is there a way to restart a crawler that was killed too soon?
No. Disabling and re-enabling quota starts a new crawl.
>
> If I restart a server while a crawler is running, will the crawler be
> restarted, too? We'll need to do some hardware fixing on one of the servers
> soon and I need to know whether I have to check the crawlers first before
> shutting it down.
During the shutdown of the server the crawl will be killed. (The data
usage shown will reflect only what has been crawled so far.)
The crawl won't be restarted when the server starts. Only quotad will
be restarted (which is not the same as the crawl).
For the crawl to happen you will have to restart quota (disable and
re-enable it).
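
The restart described above (disable quota, enable it again, then set the limits again, as Gudrun did) can be sketched as a small Python helper that only builds the command lines. This is a minimal sketch, not an official procedure: the function name, the volume name "myvol" and the example limit are hypothetical, and the `--mode=script` option (used to skip the confirmation prompt on disable) is assumed to be available on the installed version.

```python
from typing import Dict, List


def quota_restart_commands(volume: str, limits: Dict[str, str]) -> List[List[str]]:
    """Build the gluster CLI calls that restart quota and re-apply limits.

    `limits` maps a directory path inside the volume to a size such as "10TB".
    Nothing is executed here; the caller decides when to run the commands.
    """
    # --mode=script is assumed to suppress the interactive confirmation
    # that "quota disable" otherwise asks for.
    base = ["gluster", "--mode=script", "volume", "quota", volume]
    # Disabling and re-enabling quota is what starts a new crawl.
    commands = [base + ["disable"], base + ["enable"]]
    # In the thread the limits were set again after re-enabling quota.
    for path, size in sorted(limits.items()):
        commands.append(base + ["limit-usage", path, size])
    return commands


if __name__ == "__main__":
    # Hypothetical volume name and limit; print instead of executing.
    for argv in quota_restart_commands("myvol", {"/projects": "200TB"}):
        print(" ".join(argv))
```

Each list can be passed to `subprocess.run(argv, check=True)` on a server node. The values from "gluster volume quota $VOLUME list" are only trustworthy once the new crawl has finished.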
>
> Thanks for the pointers
>
> Gudrun Amedick
> On Tuesday, 20 Nov 2018, at 11:38 +0530, Hari Gowtham wrote:
> > Hi,
> >
> > Can you check whether the quota crawl has finished? Until it has
> > finished, the quota list will show incorrect values.
> > Judging by the under-accounting, it looks like the crawl has not yet
> > finished (it takes a lot of time, as it has to crawl the whole
> > filesystem).
> >
> > If the crawl has finished and the usage is still showing wrong values,
> > then there is likely an accounting issue.
> > The easy way to fix this is to restart quota (disable and re-enable it).
> > This will not cause any problems. The only downside is that the limits
> > won't be enforced while quota is disabled, and not until it is
> > re-enabled and the crawl finishes.
> > Or you can try using the quota fsck script
> > https://review.gluster.org/#/c/glusterfs/+/19179/ to fix your
> > accounting issue.
> >
> > Regards,
> > Hari.
> > On Mon, Nov 19, 2018 at 10:05 PM Frank Ruehlemann
> > <f.ruehlemann at uni-luebeck.de> wrote:
> > >
> > >
> > > Hi,
> > >
> > > we're running a Distributed Dispersed volume with Gluster 3.12.14 on
> > > Debian 9.6 (Stretch).
> > >
> > > We migrated our data (>300TB) from a pure Distributed volume into this
> > > Dispersed volume with cp, followed by multiple rsyncs.
> > > After the migration was successful we enabled quotas again with
> > > "gluster volume quota $VOLUME enable", which finished successfully.
> > > And we set our required quotas with
> > > "gluster volume quota $VOLUME limit-usage $PATH $QUOTA", which
> > > finished without errors too.
> > >
> > > But our "gluster volume quota $VOLUME list" shows wrong values.
> > > For example:
> > > A directory with ~170TB of data shows only 40.8TB Used.
> > > When we sum up all directories with quotas we're way under the ~310TB
> > > that "df -h /$volume" shows.
> > > And "df -h /$volume/$directory" shows wrong values for nearly all
> > > directories.
> > >
> > > All 72 8TB bricks and all quota daemons of the 6 servers are visible and
> > > online in "gluster volume status $VOLUME".
> > >
> > >
> > > In quotad.log I found multiple warnings like this:
> > > >
> > > > [2018-11-16 09:21:25.738901] W [dict.c:636:dict_unref] (-->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/features/quotad.so(+0x1d58) [0x7f6844be7d58] -->/usr/lib/x86_64-linux-gnu/glusterfs/3.12.14/xlator/features/quotad.so(+0x2b92) [0x7f6844be8b92] -->/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(dict_unref+0xc0) [0x7f684b0db640] ) 0-dict: dict is NULL [Invalid argument]
> > > In some brick logs I found those:
> > > >
> > > > [2018-11-19 07:23:30.932327] I [MSGID: 120020] [quota.c:2198:quota_unlink_cbk] 0-$VOLUME-quota: quota context not set inode (gfid:f100f7a9-0779-4b4c-880f-c8b3b4bdc49d) [Invalid argument]
> > > and (replaced the volume name with "$VOLUME") those:
> > > >
> > > > The message "W [MSGID: 120003] [quota.c:821:quota_build_ancestry_cbk] 0-$VOLUME-quota: parent is NULL [Invalid argument]" repeated 13 times between [2018-11-19 15:28:54.089404] and [2018-11-19 15:30:12.792175]
> > > > [2018-11-19 15:31:34.559348] W [MSGID: 120003] [quota.c:821:quota_build_ancestry_cbk] 0-$VOLUME-quota: parent is NULL [Invalid argument]
> > > I already found that setting the flag "trusted.glusterfs.quota.dirty"
> > > might help, but I'm unsure about the consequences that will be triggered.
> > > And I'm unsure about the necessary version flag.
> > >
> > > Does anyone have an idea how to fix this?
> > >
> > > Best Regards,
> > > --
> > > Frank Rühlemann
> > >    IT-Systemtechnik
> > >
> > > UNIVERSITÄT ZU LÜBECK
> > >     IT-Service-Center
> > >
> > >     Ratzeburger Allee 160
> > >     23562 Lübeck
> > >     Tel +49 451 3101 2034
> > >     Fax +49 451 3101 2004
> > >     ruehlemann at itsc.uni-luebeck.de
> > >     www.itsc.uni-luebeck.de
> > >
> >
> >


-- 
Regards,
Hari Gowtham.
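
The under-accounting Frank reports in the quoted message (a directory holding ~170TB listed as 40.8TB Used) can be put into numbers with a small comparison helper. This is only a sketch: the function name, the 5% tolerance and the example paths are hypothetical, and the measured sizes are assumed to come from an independent tool such as du.

```python
from typing import Dict, List, Tuple


def under_accounted(
    usage: Dict[str, Tuple[float, float]], tolerance: float = 0.05
) -> List[Tuple[str, float]]:
    """Return (path, fraction accounted) for directories whose quota usage
    falls short of the measured size by more than `tolerance`.

    `usage` maps a path to (quota_used, measured_size) in the same unit.
    """
    flagged = []
    for path, (quota_used, measured) in sorted(usage.items()):
        if measured <= 0:
            continue  # nothing to compare against
        fraction = quota_used / measured
        if fraction < 1.0 - tolerance:
            flagged.append((path, fraction))
    return flagged


if __name__ == "__main__":
    # Figures from the thread, in TB; the path itself is made up.
    for path, fraction in under_accounted({"/data": (40.8, 170.0)}):
        print(f"{path}: {fraction:.0%} of the measured size is accounted for")
```

For the example from the thread, 40.8TB out of ~170TB means only about 24% of the data has been accounted for, which is consistent with a crawl that stopped early.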

Gudrun Mareike Amedick

2018-Nov-21 15:24 UTC


Hi Hari,

I disabled and re-enabled the quota and I saw the crawlers starting. However,
this caused a pretty high load on my servers (200+) and this seems to
have gotten them killed again. At least, I have no crawlers running, the quotas
do not match the output of du -h, and the crawler logs all contain
this line:

[2018-11-20 14:16:35.180467] W [glusterfsd.c:1375:cleanup_and_exit] (-->/lib/x86_64-linux-gnu/libpthread.so.0(+0x7494) [0x7f0e3d6fe494] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xf5) [0x561eb7952d45] -->/usr/sbin/glusterfs(cleanup_and_exit+0x54) [0x561eb7952ba4] ) 0-: received signum(15), shutting down

I suspect this means my file attributes are not set correctly. Would the script
you sent me fix that? Also, the script seems to be part of the GlusterFS 5.0
Git repo. We are running 3.12. Would it still work on 3.12 (or 4.1,
since we'll be upgrading soon) or could it break things?
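
Checking many crawler logs for the shutdown line quoted above can be automated. The following is a minimal sketch that assumes each log entry is on a single line in the format shown in this message; the function name and the regular expression are my own and not part of GlusterFS.

```python
import re
from typing import Iterable, List, Tuple

# Matches the cleanup_and_exit entry that reports the received signal.
SIGNAL_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+\w\s+\[glusterfsd\.c:\d+:cleanup_and_exit\].*"
    r"received signum\((?P<sig>\d+)\), shutting down"
)


def find_shutdown_signals(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Return (timestamp, signal number) for every shutdown entry found."""
    hits = []
    for line in lines:
        match = SIGNAL_RE.search(line)
        if match:
            hits.append((match.group("ts"), int(match.group("sig"))))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_signals.py <crawler log file> [...]
    for name in sys.argv[1:]:
        with open(name, encoding="utf-8", errors="replace") as handle:
            for timestamp, signum in find_shutdown_signals(handle):
                print(f"{name}: signal {signum} at {timestamp}")
```

Signal 15 is SIGTERM, so a hit means the process was asked to terminate from outside rather than crashing on its own.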

Kind regards

Gudrun Amedick
