thr3ads.net - Gluster users - [Gluster-users] Gluster 3.10.5: used disk size reported by quota and du mismatch [Jul 2018]

Hari Gowtham

2018-Jul-11 07:16 UTC

[Gluster-users] Gluster 3.10.5: used disk size reported by quota and du mismatch

Hi,

There was an accounting issue in your setup.
The directories ans004/ftp/CMCC-CM2-VHR4-CTR/atm/hist and ans004/ftp/CMCC-CM2-VHR4
had wrong size values on them.

To fix it, you will have to set the dirty xattr (an internal gluster
xattr) on these directories, which marks them so that their values are
calculated again.
Then do a du on the mount after setting the xattrs. This triggers a
stat that calculates and updates the right values.

To set the dirty xattr:
setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 <path to the directory>
This has to be done for both directories, one after the other, on each brick.
Once done for all the bricks, issue the du command.
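
The steps above can be sketched as a small shell helper. This is only a sketch under assumptions: it assumes the brick layout /gluster/mnt1 to /gluster/mnt12 with a "brick" directory in each (taken from the check.sh script quoted later in this thread), it assumes both directories live under CSP/ on every brick, and the function name print_dirty_cmds is made up for illustration. It only prints the setfattr commands, so they can be reviewed before anything is changed.

```shell
# Sketch only: print (do not run) the setfattr commands for every brick
# on the local server.
# Assumptions: bricks are /gluster/mnt1/brick .. /gluster/mnt12/brick and
# the two affected directories sit under CSP/ on each brick.
BRICK_COUNT=12
DIRS="CSP/ans004/ftp/CMCC-CM2-VHR4-CTR/atm/hist CSP/ans004/ftp/CMCC-CM2-VHR4"

print_dirty_cmds() {
    i=1
    while [ "$i" -le "$BRICK_COUNT" ]; do
        # Both directories are handled one after the other on each brick.
        for d in $DIRS; do
            echo "setfattr -n trusted.glusterfs.quota.dirty -v 0x3100 /gluster/mnt$i/brick/$d"
        done
        i=$((i + 1))
    done
}
```

Calling print_dirty_cmds prints 24 commands (2 directories on 12 bricks). After checking them, they could be run as root on each server with "print_dirty_cmds | sh", followed by a du on the mounted volume, for example "du -s /tier2/CSP/ans004".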

Thanks to Sanoj for the guidance.
On Tue, Jul 10, 2018 at 6:37 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>
>
> Hi Hari,
>
> sorry for the late reply.
> Yes, the gluster volume is a single volume that is spread across all 3 nodes and has 36 bricks.
>
> Attached you can find a tar.gz file containing:
>
> - gluster volume status command output;
> - gluster volume info command output;
> - the output of the following script execution (it generated 3 files, one per server: s01.log, s02.log, s03.log).
>
> This is the "check.sh" script that has been executed on each server (servers are s01, s02, s03).
>
> #!/bin/bash
>
> #set -xv
>
> host=$(hostname)
>
> for i in {1..12}
> do
>  ./quota_fsck_new-6.py --full-logs --sub-dir CSP/ans004 /gluster/mnt$i/brick >> $host.log
> done
>
> Many thanks,
> Mauro
>
>
> On 10 Jul 2018, at 12:12, Hari Gowtham <hgowtham at redhat.com> wrote:
>
> Hi Mauro,
>
> Can you send the gluster v status command output?
>
> Is it a single volume that is spread across all 3 nodes and has 36 bricks?
> If yes, you will have to run it on all the bricks.
>
> In the command, use the sub-dir option if you are running it only for the
> directory where the limit is set. If you are running it on the brick
> mount path, you can remove it.
>
> The full-logs option will consume a lot of space, as it is going to record
> the xattrs for each entry inside the path we are running it on. This data
> is needed to cross-check and verify quota's marker functionality.
>
> To reduce resource consumption you can run it on one replica set alone
> (if it is a replicate volume).
> But it is better to run it on all the bricks if possible and if
> the size consumed is fine with you.
>
> Make sure you run it with the script link provided above by Sanoj (patch set 6).
> On Tue, Jul 10, 2018 at 2:54 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>
>
>
> Hi Hari,
>
> thank you very much for your answer.
> I will try to use the script mentioned above, pointing it at each backend brick.
>
> So, if I understand correctly, since I have a gluster cluster composed of 3 nodes (with 12 bricks on each node), I have to execute the script 36 times. Right?
>
> You can find below the "df" command output executed on a cluster node:
>
> /dev/mapper/cl_s01-gluster           100G   33M    100G   1% /gluster
> /dev/mapper/gluster_vgd-gluster_lvd  9,0T  5,6T    3,5T  62% /gluster/mnt2
> /dev/mapper/gluster_vge-gluster_lve  9,0T  5,7T    3,4T  63% /gluster/mnt3
> /dev/mapper/gluster_vgj-gluster_lvj  9,0T  5,7T    3,4T  63% /gluster/mnt8
> /dev/mapper/gluster_vgc-gluster_lvc  9,0T  5,6T    3,5T  62% /gluster/mnt1
> /dev/mapper/gluster_vgl-gluster_lvl  9,0T  5,8T    3,3T  65% /gluster/mnt10
> /dev/mapper/gluster_vgh-gluster_lvh  9,0T  5,7T    3,4T  64% /gluster/mnt6
> /dev/mapper/gluster_vgf-gluster_lvf  9,0T  5,7T    3,4T  63% /gluster/mnt4
> /dev/mapper/gluster_vgm-gluster_lvm  9,0T  5,4T    3,7T  60% /gluster/mnt11
> /dev/mapper/gluster_vgn-gluster_lvn  9,0T  5,4T    3,7T  60% /gluster/mnt12
> /dev/mapper/gluster_vgg-gluster_lvg  9,0T  5,7T    3,4T  64% /gluster/mnt5
> /dev/mapper/gluster_vgi-gluster_lvi  9,0T  5,7T    3,4T  63% /gluster/mnt7
> /dev/mapper/gluster_vgk-gluster_lvk  9,0T  5,8T    3,3T  65% /gluster/mnt9
>
> I will execute the following command and post the output here.
>
> ./quota_fsck_new.py --full-logs --sub-dir /gluster/mnt{1..12}
>
> Thank you again for your support.
> Regards,
> Mauro
>
> On 10 Jul 2018, at 11:02, Hari Gowtham <hgowtham at redhat.com> wrote:
>
> Hi,
>
> There is no explicit command to back up all the quota limits, as far as I
> understand. I need to look into this further.
> But you can do the following to back them up and set them again.
> Run "gluster volume quota <volname> list", which prints all the quota
> limits on that particular volume.
> You will have to make a note of the directories with their respective limits.
> Once noted down, you can disable quota on the volume and then enable it.
> Once enabled, you will have to set each limit explicitly on the volume.
>
> Before doing this, we suggest you try running the script
> mentioned above with the backend brick path instead of the mount path.
> You need to run this on the machines where the backend bricks are
> located and not on the mount.
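
The note-down-and-restore procedure described above could be sketched roughly as follows. This is a sketch under assumptions, not a tested procedure: the function name quota_restore_cmds is hypothetical, it assumes that directory paths contain no whitespace, and it assumes that the rounded sizes printed by the list command (such as 1.0TB) are acceptable as the restored hard limits. Soft limits are not restored. It reads saved "gluster volume quota <volname> list" output on stdin and prints one limit-usage command per directory for review.

```shell
# Sketch only: convert saved `gluster volume quota <vol> list` output
# (read from stdin) into limit-usage commands for the hard limits.
# Assumptions: paths have no whitespace; rounded sizes are good enough.
quota_restore_cmds() {
    vol=$1
    # Data rows start with "/"; column 1 is the path, column 2 the hard limit.
    awk -v vol="$vol" '$1 ~ /^\// { print "gluster volume quota " vol " limit-usage " $1 " " $2 }'
}
```

A possible sequence would be to save the limits first with "gluster volume quota tier2 list > quota-backup.txt", then disable and enable quota on the volume, and finally review the output of "quota_restore_cmds tier2 < quota-backup.txt" before running the printed commands.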
> On Mon, Jul 9, 2018 at 9:01 PM Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>
>
> Hi Sanoj,
>
> could you provide me with the command that I need in order to back up all quota limits?
> If there is no solution for this kind of problem, I would like to try to follow your "backup" suggestion.
>
> Do you think that I should contact gluster developers too?
>
> Thank you very much.
> Regards,
> Mauro
>
>
> On 05 Jul 2018, at 09:56, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>
> Hi Sanoj,
>
> unfortunately the output of the command execution was not helpful.
>
> [root at s01 ~]# find /tier2/CSP/ans004  | xargs getfattr -d -m. -e hex
> [root at s01 ~]#
>
> Do you have any other idea for detecting the cause of the issue?
>
> Thank you again,
> Mauro
>
>
> On 05 Jul 2018, at 09:08, Sanoj Unnikrishnan <sunnikri at redhat.com> wrote:
>
> Hi Mauro,
>
> Due to a script issue, not all the necessary xattrs were captured.
> Could you provide the xattrs with:
> find /tier2/CSP/ans004  | xargs getfattr -d -m. -e hex
>
> Meanwhile, if you are being impacted, you could do the following:
> - back up the quota limits
> - disable quota
> - enable quota
> - set the limits afresh.
>
> Please capture the xattr values first, so that we can find out what went wrong.
> Regards,
> Sanoj
>
>
> On Tue, Jul 3, 2018 at 4:09 PM, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>
>
> Dear Sanoj,
>
> thank you very much for your support.
> I just downloaded and executed the script you suggested.
>
> This is the full command I executed:
>
> ./quota_fsck_new.py --full-logs --sub-dir /tier2/CSP/ans004/ /gluster
>
> In attachment, you can find the logs generated by the script.
> What can I do now?
>
> Thank you very much for your patience.
> Mauro
>
>
>
>
> On 03 Jul 2018, at 11:34, Sanoj Unnikrishnan <sunnikri at redhat.com> wrote:
>
> Hi Mauro,
>
> This may be an issue with the update of the backend xattrs.
> To investigate the root cause further and provide a resolution, could you provide me with the logs from running the following fsck script?
> https://review.gluster.org/#/c/19179/6/extras/quota/quota_fsck.py
>
> Try running the script and reply with the logs generated.
>
> Thanks,
> Sanoj
>
>
> On Mon, Jul 2, 2018 at 2:21 PM, Mauro Tridici <mauro.tridici at cmcc.it> wrote:
>
>
> Dear Users,
>
> I just noticed that, after some data deletions executed inside the "/tier2/CSP/ans004" folder, the amount of used disk space reported by the quota command doesn't reflect the value indicated by the du command.
> Searching the web, it seems that this is a bug in previous versions of Gluster FS and that it was already fixed.
> In my case, unfortunately, the problem seems to be still here.
>
> How can I solve this issue? Is it possible to do it without scheduling a downtime period?
>
> Thank you very much in advance,
> Mauro
>
> [root at s01 ~]# glusterfs -V
> glusterfs 3.10.5
> Repository revision: git://git.gluster.org/glusterfs.git
> Copyright (c) 2006-2016 Red Hat, Inc. <https://www.gluster.org/>
> GlusterFS comes with ABSOLUTELY NO WARRANTY.
> It is licensed to you under your choice of the GNU Lesser
> General Public License, version 3 or any later version (LGPLv3
> or later), or the GNU General Public License, version 2 (GPLv2),
> in all cases as published by the Free Software Foundation.
>
> [root at s01 ~]# gluster volume quota tier2 list /CSP/ans004
>                 Path                   Hard-limit  Soft-limit      Used  Available  Soft-limit exceeded? Hard-limit exceeded?
> -------------------------------------------------------------------------------------------------------------------------------
> /CSP/ans004                                1.0TB     99%(1013.8GB)    3.9TB  0Bytes             Yes                  Yes
>
> [root at s01 ~]# du -hs /tier2/CSP/ans004/
> 295G /tier2/CSP/ans004/
>
>
>
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

-- 
Regards,
Hari Gowtham.

Mauro Tridici

2018-Jul-11 08:23 UTC

[Gluster-users] Gluster 3.10.5: used disk size reported by quota and du mismatch

Hi Hari, Hi Sanoj,

thank you very much for your patience and your support! 
The problem has been solved following your instructions :-)

N.B.: in order to reduce the running time, I executed the "du" command as follows:

for i in {1..12}
do
 du /gluster/mnt$i/brick/CSP/ans004/ftp
done

and not at the top "/gluster/mnt$i/brick" level of each brick.

I hope it was a correct idea :-)

Thank you again for helping me to solve this issue.
Have a good day.
Mauro


-------------------------
Mauro Tridici

Fondazione CMCC
CMCC Supercomputing Center
presso Complesso Ecotekne - Università del Salento -
Strada Prov.le Lecce - Monteroni sn
73100 Lecce  IT
http://www.cmcc.it

mobile: (+39) 327 5630841
email: mauro.tridici at cmcc.it
