Hi João,
There isn't a straightforward way of tracking the crawl's progress, but since
Gluster uses find and stat during the crawl, you can check whether it is still
running with the following command,
# ps aux | grep find
If the output contains a line of the form,
"root 1513 0.0 0.1 127224 2636 ? S 12:24 0.00
/usr/bin/find . -exec /usr/bin/stat {} \"
then the crawl is still going on.
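
The same check can be scripted. Below is a minimal Python sketch, assuming the
usual 11-column `ps aux` layout seen in the sample line above; the function
names are illustrative and are not part of Gluster.

```python
import subprocess


def crawl_processes(ps_output):
    """Return the `ps aux` lines that look like the find/stat crawl."""
    matches = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the command line stays in one piece.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        argv = fields[10].split()
        is_find = bool(argv) and argv[0].endswith("find")
        runs_stat = "-exec" in argv and any(a.endswith("stat") for a in argv)
        if is_find and runs_stat:
            matches.append(line)
    return matches


def crawl_running():
    """Run `ps aux` and report whether a find/stat crawl is visible."""
    result = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    )
    return bool(crawl_processes(result.stdout))
```

Unlike a plain `grep find`, the filter ignores the grep process itself, because
it only accepts commands whose first word ends in "find". It only tells you
that a crawl is running, not how far along it is.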
Thanks and Regards,
SRIJAN SIVAKUMAR
Associate Software Engineer
Red Hat
<https://www.redhat.com>
On Wed, Aug 19, 2020 at 1:46 AM João Baúto <
joao.bauto at neuro.fchampalimaud.org> wrote:
> Hi Srijan,
>
> Is there a way of getting the status of the crawl process?
> We are going to expand this cluster, adding 12 new bricks (around 500TB)
> and we rely heavily on the quota feature to control the space usage for
> each project. The crawl has been running since Saturday (nothing has changed)
> and we are unsure whether it's going to finish tomorrow or in weeks.
>
> Thank you!
> *João Baúto*
> ---------------
>
> *Scientific Computing and Software Platform*
> Champalimaud Research
> Champalimaud Center for the Unknown
> Av. Brasília, Doca de Pedrouços
> 1400-038 Lisbon, Portugal
> fchampalimaud.org <https://www.fchampalimaud.org/>
>
>
> Srijan Sivakumar <ssivakum at redhat.com> wrote on Sunday,
> 16/08/2020 at 06:11:
>
>> Hi João,
>>
>> Yes, it'll take some time given the file system size, as it has to change
>> the xattrs at each level and then crawl upwards.
>>
>> The stat is done by the script itself, so the crawl has been initiated.
>>
>> Regards,
>> Srijan Sivakumar
>>
>> On Sun 16 Aug, 2020, 04:58 João Baúto, <
>> joao.bauto at neuro.fchampalimaud.org> wrote:
>>
>>> Hi Srijan & Strahil,
>>>
>>> I ran the quota_fsck script mentioned in Hari's blog post on all bricks
>>> and it detected a lot of size mismatches.
>>>
>>> The script was executed as,
>>>
>>> - python quota_fsck.py --sub-dir projectB --fix-issues /mnt/tank
>>> /tank/volume2/brick (on all nodes and bricks)
>>>
>>> Here is a snippet from the script's output,
>>>
>>> Size Mismatch /tank/volume2/brick/projectB {'parents':
>>> {'00000000-0000-0000-0000-000000000001': {'contri_file_count':
>>> 18446744073035296610L, 'contri_size': 18446645297413872640L,
>>> 'contri_dir_count': 18446744073709527653L}}, 'version': '1', 'file_count':
>>> 18446744073035296610L, 'dirty': False, 'dir_count': 18446744073709527653L,
>>> 'size': 18446645297413872640L} 15204281691754
>>> MARKING DIRTY: /tank/volume2/brick/projectB
>>> stat on /mnt/tank/projectB
>>> Files verified : 683223
>>> Directories verified : 46823
>>> Objects Fixed : 705230
>>>
>>> Checking the xattr in the bricks I can see the directory in question
>>> marked as dirty,
>>> # getfattr -d -m. -e hex /tank/volume2/brick/projectB
>>> getfattr: Removing leading '/' from absolute path names
>>> # file: tank/volume2/brick/projectB
>>> trusted.gfid=0x3ca2bce0455945efa6662813ce20fc0c
>>> trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f372478000a7705
>>> trusted.glusterfs.dht=0xe1a4060c000000003ffffffe5ffffffc
>>> trusted.glusterfs.mdata=0x010000000000000000000000005f3724750000000013ddf679000000005ce2aff90000000007fdacb0000000005ce2aff90000000007fdacb0
>>> trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x00000ca6ccf7a80000000000000790a1000000000000b6ea
>>> trusted.glusterfs.quota.dirty=0x3100
>>> trusted.glusterfs.quota.limit-set.1=0x0000640000000000ffffffffffffffff
>>> trusted.glusterfs.quota.size.1=0x00000ca6ccf7a80000000000000790a1000000000000b6ea
>>>
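
To make the hex values above easier to read, here is a minimal Python sketch
that decodes a trusted.glusterfs.quota.size.1 value. It assumes the 24 bytes
are three big-endian signed 64-bit integers in the order size, file count,
directory count, which matches the field names in the quota_fsck output; the
function name is illustrative.

```python
import struct


def decode_quota_xattr(hex_value):
    """Split a 24-byte quota xattr into signed size, file and directory counts."""
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 24:
        raise ValueError("expected 24 bytes, got %d" % len(raw))
    # Assumed layout: three big-endian signed 64-bit integers.
    size, file_count, dir_count = struct.unpack(">qqq", raw)
    return {"size": size, "file_count": file_count, "dir_count": dir_count}


# The size.1 value from the dump above, written as its three 16-digit fields.
value = "0x" + "00000ca6ccf7a800" + "00000000000790a1" + "000000000000b6ea"
print(decode_quota_xattr(value))
```

Under that assumption the last field, 0xb6ea, reads as 46826 directories,
which is close to the 46823 directories the script reported as verified.
Reading the fields as signed also makes a broken value stand out: a field such
as 0xffffffffffffc112 decodes to a negative count.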
>>> Now, my question is: how do I trigger Gluster to recalculate the quota
>>> for this directory? Is it automatic but takes a while? I ask because the
>>> quota list did change, but not to a good "result".
>>>
>>> Path       Hard-limit  Soft-limit   Used       Available  Soft-limit exceeded?  Hard-limit exceeded?
>>> /projectB  100.0TB     80%(80.0TB)  16383.9PB  190.1TB    No                    No
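
The 16383.9PB figure is what a negative byte count looks like when it is
printed as an unsigned 64-bit number. The sketch below works this out for the
'size' value from the quota_fsck output earlier in this message; it assumes the
quota list prints binary units (1PB = 2**50 bytes), and the helper name is
illustrative.

```python
def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a two's-complement signed one."""
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    return value - 2**64 if value >= 2**63 else value


# 'size' as printed by quota_fsck for projectB (the trailing L is Python 2 notation).
reported_size = 18446645297413872640

signed_size = as_signed64(reported_size)  # the same bits read as a signed number
as_pib = reported_size / 2**50            # the unsigned value in units of 2**50 bytes

print(signed_size)
print(round(as_pib, 1))
```

This prints -98776295678976 and 16383.9, so the accounted size had gone
negative by roughly 98.8 TB and the listing shows the wrapped-around value.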
>>>
>>> I would like to avoid disabling and re-enabling quota on the volume, as
>>> that removes the configs.
>>>
>>> Thank you for all the help!
>>> *João Baúto*
>>>
>>>
>>> Srijan Sivakumar <ssivakum at redhat.com> wrote on Saturday,
>>> 15/08/2020 at 11:57:
>>>
>>>> Hi João,
>>>>
>>>> The quota accounting error is what we're looking at here. I think
>>>> you've already looked into the blog post by Hari and are using the script
>>>> to fix the accounting issue.
>>>> That should help you fix this issue.
>>>>
>>>> Let me know if you face any issues while using it.
>>>>
>>>> Regards,
>>>> Srijan Sivakumar
>>>>
>>>>
>>>> On Fri 14 Aug, 2020, 17:10 João Baúto, <
>>>> joao.bauto at neuro.fchampalimaud.org> wrote:
>>>>
>>>>> Hi Strahil,
>>>>>
>>>>> I have tried removing the quota for that specific directory and
>>>>> setting it again, but it didn't work (maybe it has to be a quota disable
>>>>> and enable in the volume options). I'm currently testing a solution
>>>>> by Hari with the quota_fsck.py script
>>>>> (https://medium.com/@harigowtham/glusterfs-quota-fix-accounting-840df33fcd3a)
>>>>> and it's detecting a lot of size mismatches in files.
>>>>>
>>>>> Thank you,
>>>>> *João Baúto*
>>>>>
>>>>>
>>>>> Strahil Nikolov <hunter86_bg at yahoo.com> wrote on Friday,
>>>>> 14/08/2020 at 10:16:
>>>>>
>>>>>> Hi João,
>>>>>>
>>>>>> Based on your output, it seems that the quota size is different on the
>>>>>> 2 bricks.
>>>>>>
>>>>>> Have you tried removing the quota and then recreating it? Maybe it
>>>>>> will be the easiest way to fix it.
>>>>>>
>>>>>> Best Regards,
>>>>>> Strahil Nikolov
>>>>>>
>>>>>>
>>>>>> On 14 August 2020 at 4:35:14 GMT+03:00, "João Baúto" <
>>>>>> joao.bauto at neuro.fchampalimaud.org> wrote:
>>>>>> >Hi all,
>>>>>> >
>>>>>> >We have a 4-node distributed cluster with 2 bricks per node running
>>>>>> >Gluster 7.7 + ZFS. We use directory quotas to limit the space used by
>>>>>> >our members on each project. Two days ago we noticed inconsistent
>>>>>> >space usage reported by Gluster in the quota list.
>>>>>> >
>>>>>> >A small snippet of gluster volume quota vol list,
>>>>>> >
>>>>>> >  Path       Hard-limit  Soft-limit   Used       Available  Soft-limit exceeded?  Hard-limit exceeded?
>>>>>> >/projectA   5.0TB       80%(4.0TB)   3.1TB      1.9TB      No                    No
>>>>>> >*/projectB  100.0TB     80%(80.0TB)  16383.4PB  740.9TB    No                    No*
>>>>>> >/projectC   70.0TB      80%(56.0TB)  50.0TB     20.0TB     No                    No
>>>>>> >
>>>>>> >The total space available in the cluster is 360TB, the quota for
>>>>>> >projectB is 100TB and, as you can see, it's reporting 16383.4PB used
>>>>>> >and 740TB available (already decreased from 750TB).
>>>>>> >
>>>>>> >There was an issue in Gluster 3.x related to the wrong directory quota
>>>>>> >(https://lists.gluster.org/pipermail/gluster-users/2016-February/025305.html
>>>>>> >and
>>>>>> >https://lists.gluster.org/pipermail/gluster-users/2018-November/035374.html)
>>>>>> >but it's marked as solved (not sure if the solution still applies).
>>>>>> >
>>>>>> >*On projectB*
>>>>>> ># getfattr -d -m . -e hex projectB
>>>>>> ># file: projectB
>>>>>> >trusted.gfid=0x3ca2bce0455945efa6662813ce20fc0c
>>>>>> >trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f35e69800098ed9
>>>>>> >trusted.glusterfs.dht=0xe1a4060c000000003ffffffe5ffffffc
>>>>>> >trusted.glusterfs.mdata=0x010000000000000000000000005f355c59000000000939079f000000005ce2aff90000000007fdacb0000000005ce2aff90000000007fdacb0
>>>>>> >trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x0000ab0f227a860000000000478e33acffffffffffffc112
>>>>>> >trusted.glusterfs.quota.dirty=0x3000
>>>>>> >trusted.glusterfs.quota.limit-set.1=0x0000640000000000ffffffffffffffff
>>>>>> >trusted.glusterfs.quota.size.1=0x0000ab0f227a860000000000478e33acffffffffffffc112
>>>>>> >
>>>>>> >*On projectA*
>>>>>> ># getfattr -d -m . -e hex projectA
>>>>>> ># file: projectA
>>>>>> >trusted.gfid=0x05b09ded19354c0eb544d22d4659582e
>>>>>> >trusted.glusterfs.9582685f-07fa-41fd-b9fc-ebab3a6989cf.xtime=0x5f1aeb9f00044c64
>>>>>> >trusted.glusterfs.dht=0xe1a4060c000000001fffffff3ffffffd
>>>>>> >trusted.glusterfs.mdata=0x010000000000000000000000005f1ac6a10000000018f30a4e000000005c338fab0000000017a3135a000000005b0694fb000000001584a21b
>>>>>> >trusted.glusterfs.quota.00000000-0000-0000-0000-000000000001.contri.1=0x0000067de3bbe20000000000000128610000000000033498
>>>>>> >trusted.glusterfs.quota.dirty=0x3000
>>>>>> >trusted.glusterfs.quota.limit-set.1=0x0000460000000000ffffffffffffffff
>>>>>> >trusted.glusterfs.quota.size.1=0x0000067de3bbe20000000000000128610000000000033498
>>>>>> >
>>>>>> >Any idea on what's happening and how to fix it?
>>>>>> >
>>>>>> >Thanks!
>>>>>> >*João Baúto*
>>>>>>
>>>>