Sunny Kumar
2019-Jul-29 05:56 UTC
[Gluster-users] Brick missing trusted.glusterfs.dht xattr
Hi Matthew,

Can you share the geo-rep logs and one more log file (changes-<brick-path>.log)? It will help to pinpoint the actual reason behind the failure.

/sunny

On Mon, Jul 29, 2019 at 9:13 AM Nithya Balachandran <nbalacha at redhat.com> wrote:
>
> On Sat, 27 Jul 2019 at 02:31, Matthew Benstead <matthewb at uvic.ca> wrote:
>>
>> Ok thank-you for explaining everything - that makes sense.
>>
>> Currently the brick file systems are pretty evenly distributed so I probably won't run the fix-layout right now.
>>
>> Would this state have any impact on geo-replication? I'm trying to geo-replicate this volume, but am getting a weird error: "Changelog register failed error=[Errno 21] Is a directory"
>
> It should not. Sunny, can you comment on this?
>
> Regards,
> Nithya
>
>> I assume this is related to something else, but I wasn't sure.
>>
>> Thanks,
>> -Matthew
>>
>> --
>> Matthew Benstead
>> System Administrator
>> Pacific Climate Impacts Consortium
>> University of Victoria, UH1
>> PO Box 1800, STN CSC
>> Victoria, BC, V8W 2Y2
>> Phone: +1-250-721-8432
>> Email: matthewb at uvic.ca
>>
>> On 7/26/19 12:02 AM, Nithya Balachandran wrote:
>>
>> On Fri, 26 Jul 2019 at 01:56, Matthew Benstead <matthewb at uvic.ca> wrote:
>>>
>>> Hi Nithya,
>>>
>>> Hmm... I don't remember if I did, but based on what I'm seeing it sounds like I probably didn't run rebalance or fix-layout.
>>>
>>> It looks like folders that haven't had any new files created have a dht of 0, while other folders have non-zero values.
>>>
>>> [root at gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/ | grep dht
>>> [root at gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/home | grep dht
>>> trusted.glusterfs.dht=0x00000000000000000000000000000000
>>> [root at gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/home/matthewb | grep dht
>>> trusted.glusterfs.dht=0x00000001000000004924921a6db6dbc7
>>>
>>> If I just run the fix-layout command will it re-create all of the dht values or just the missing ones?
>>
>> A fix-layout will recalculate the layouts entirely, so all the values will change. No files will be moved.
>> A rebalance will recalculate the layouts like the fix-layout but will also move files to their new locations based on the new layout ranges. This could take a lot of time depending on the number of files/directories on the volume. If you do this, I would recommend that you turn off lookup-optimize until the rebalance is over.
>>
>>> Since the brick is already fairly size balanced could I get away with running fix-layout but not rebalance? Or would the new dht layout mean slower accesses since the files may be expected on different bricks?
>>
>> The first access for a file will be slower. The next one will be faster as the location will be cached in the client's in-memory structures.
>> You may not need to run either a fix-layout or a rebalance if new file creations will be in directories created after the add-brick. Gluster will automatically include all 7 bricks for those directories.
>>
>> Regards,
>> Nithya
>>
>>> Thanks,
>>> -Matthew
>>>
>>> --
>>> Matthew Benstead
>>> System Administrator
>>> Pacific Climate Impacts Consortium
>>> University of Victoria, UH1
>>> PO Box 1800, STN CSC
>>> Victoria, BC, V8W 2Y2
>>> Phone: +1-250-721-8432
>>> Email: matthewb at uvic.ca
>>>
>>> On 7/24/19 9:30 PM, Nithya Balachandran wrote:
>>>
>>> On Wed, 24 Jul 2019 at 22:12, Matthew Benstead <matthewb at uvic.ca> wrote:
>>>>
>>>> So looking more closely at the trusted.glusterfs.dht attributes from the bricks it looks like they cover the entire range... and there is no range left for gluster07.
>>>>
>>>> The first 6 bricks range from 0x00000000 to 0xffffffff - so... is there a way to re-calculate what the dht values should be? Each of the bricks should have a gap:
>>>>
>>>> Gluster05 00000000 -> 2aaaaaa9
>>>> Gluster06 2aaaaaaa -> 55555553
>>>> Gluster01 55555554 -> 7ffffffd
>>>> Gluster02 7ffffffe -> aaaaaaa7
>>>> Gluster03 aaaaaaa8 -> d5555551
>>>> Gluster04 d5555552 -> ffffffff
>>>> Gluster07 None
>>>>
>>>> If we split the range into 7 servers that would be a gap of about 0x24924924 for each server.
>>>>
>>>> Now in terms of the gluster07 brick, about 2 years ago the RAID array the brick was stored on became corrupted. I ran the remove-brick force command, then provisioned a new server, ran the add-brick command and then restored the missing files from backup by copying them back to the main gluster mount (not the brick).
>>>>
>>> Did you run a rebalance after performing the add-brick? Without a rebalance/fix-layout, the layout for existing directories on the volume will not be updated to use the new brick as well.
>>>
>>> That the layout does not include the new brick in the root dir is in itself not a problem. Do you create a lot of files directly in the root of the volume? If yes, you might want to run a rebalance. Otherwise, if you mostly create files in newly added directories, you can probably ignore this. You can check the layout for directories on the volume and see if they incorporate brick7.
>>>
>>> I would expect a lookup on the root to have set an xattr on the brick with an empty layout range. The fact that the xattr does not exist at all on the brick is what I am looking into.
>>>
>>>> It looks like prior to that event this was the layout - which would make sense given the equal size of the 7 bricks:
>>>>
>>>> gluster02.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>> # file: /mnt/raid6-storage/storage
>>>> trusted.glusterfs.dht=0x000000010000000048bfff206d1ffe5f
>>>>
>>>> gluster05.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>> # file: /mnt/raid6-storage/storage
>>>> trusted.glusterfs.dht=0x0000000100000000b5dffce0da3ffc1f
>>>>
>>>> gluster04.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>> # file: /mnt/raid6-storage/storage
>>>> trusted.glusterfs.dht=0x0000000100000000917ffda0b5dffcdf
>>>>
>>>> gluster03.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>> # file: /mnt/raid6-storage/storage
>>>> trusted.glusterfs.dht=0x00000001000000006d1ffe60917ffd9f
>>>>
>>>> gluster01.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>> # file: /mnt/raid6-storage/storage
>>>> trusted.glusterfs.dht=0x0000000100000000245fffe048bfff1f
>>>>
>>>> gluster07.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>> # file: /mnt/raid6-storage/storage
>>>> trusted.glusterfs.dht=0x000000010000000000000000245fffdf
>>>>
>>>> gluster06.pcic.uvic.ca | SUCCESS | rc=0 >>
>>>> # file: /mnt/raid6-storage/storage
>>>> trusted.glusterfs.dht=0x0000000100000000da3ffc20ffffffff
>>>>
>>>> Which yields the following:
>>>>
>>>> 00000000 -> 245fffdf Gluster07
>>>> 245fffe0 -> 48bfff1f Gluster01
>>>> 48bfff20 -> 6d1ffe5f Gluster02
>>>> 6d1ffe60 -> 917ffd9f Gluster03
>>>> 917ffda0 -> b5dffcdf Gluster04
>>>> b5dffce0 -> da3ffc1f Gluster05
>>>> da3ffc20 -> ffffffff Gluster06
>>>>
>>>> Is there some way to get back to this?
>>>>
>>>> Thanks,
>>>> -Matthew
>>>>
>>>> --
>>>> Matthew Benstead
>>>> System Administrator
>>>> Pacific Climate Impacts Consortium
>>>> University of Victoria, UH1
>>>> PO Box 1800, STN CSC
>>>> Victoria, BC, V8W 2Y2
>>>> Phone: +1-250-721-8432
>>>> Email: matthewb at uvic.ca
>>>>
>>>> On 7/18/19 7:20 AM, Matthew Benstead wrote:
>>>>
>>>> Hi Nithya,
>>>>
>>>> No - it was added about a year and a half ago. I have tried re-mounting the volume on the server, but it didn't add the attr:
>>>>
>>>> [root at gluster07 ~]# umount /storage/
>>>> [root at gluster07 ~]# cat /etc/fstab | grep "/storage"
>>>> 10.0.231.56:/storage /storage glusterfs defaults,log-level=WARNING,backupvolfile-server=10.0.231.51 0 0
>>>> [root at gluster07 ~]# mount /storage/
>>>> [root at gluster07 ~]# df -h /storage/
>>>> Filesystem            Size  Used  Avail  Use%  Mounted on
>>>> 10.0.231.56:/storage  255T  194T  62T    77%   /storage
>>>> [root at gluster07 ~]# getfattr --absolute-names -m . -d -e hex /mnt/raid6-storage/storage/
>>>> # file: /mnt/raid6-storage/storage/
>>>> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>>> trusted.gfid=0x00000000000000000000000000000001
>>>> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x5d307baa00023ec0
>>>> trusted.glusterfs.quota.dirty=0x3000
>>>> trusted.glusterfs.quota.size.2=0x00001b71d5279e000000000000763e32000000000005cd53
>>>> trusted.glusterfs.volume-id=0x6f95525a94d74174bac4e1a18fe010a2
>>>>
>>>> Thanks,
>>>> -Matthew
>>>>
>>>> On 7/17/19 10:04 PM, Nithya Balachandran wrote:
>>>>
>>>> Hi Matthew,
>>>>
>>>> Was this node/brick added to the volume recently? If yes, try mounting the volume on a fresh mount point - that should create the xattr on this as well.
>>>>
>>>> Regards,
>>>> Nithya
>>>>
>>>> On Wed, 17 Jul 2019 at 21:01, Matthew Benstead <matthewb at uvic.ca> wrote:
>>>>>
>>>>> Hello,
>>>>>
>>>>> I've just noticed one brick in my 7 node distribute volume is missing
>>>>> the trusted.glusterfs.dht xattr... How can I fix this?
>>>>>
>>>>> I'm running glusterfs-5.3-2.el7.x86_64 on CentOS 7.
>>>>>
>>>>> All of the other nodes are fine, but gluster07 from the list below does
>>>>> not have the attribute.
>>>>>
>>>>> $ ansible -i hosts gluster-servers[0:6] ... -m shell -a "getfattr -m .
>>>>> --absolute-names -n trusted.glusterfs.dht -e hex
>>>>> /mnt/raid6-storage/storage"
>>>>> ...
>>>>> gluster05 | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x0000000100000000000000002aaaaaa9
>>>>>
>>>>> gluster03 | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x0000000100000000aaaaaaa8d5555551
>>>>>
>>>>> gluster04 | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x0000000100000000d5555552ffffffff
>>>>>
>>>>> gluster06 | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x00000001000000002aaaaaaa55555553
>>>>>
>>>>> gluster02 | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x00000001000000007ffffffeaaaaaaa7
>>>>>
>>>>> gluster07 | FAILED | rc=1 >>
>>>>> /mnt/raid6-storage/storage: trusted.glusterfs.dht: No such attribute
>>>>> non-zero return code
>>>>>
>>>>> gluster01 | SUCCESS | rc=0 >>
>>>>> # file: /mnt/raid6-storage/storage
>>>>> trusted.glusterfs.dht=0x0000000100000000555555547ffffffd
>>>>>
>>>>> Here are all of the attr's from the brick:
>>>>>
>>>>> [root at gluster07 ~]# getfattr --absolute-names -m . -d -e hex
>>>>> /mnt/raid6-storage/storage/
>>>>> # file: /mnt/raid6-storage/storage/
>>>>> security.selinux=0x756e636f6e66696e65645f753a6f626a6563745f723a756e6c6162656c65645f743a733000
>>>>> trusted.gfid=0x00000000000000000000000000000001
>>>>> trusted.glusterfs.6f95525a-94d7-4174-bac4-e1a18fe010a2.xtime=0x5d2dee800001fdf9
>>>>> trusted.glusterfs.quota.dirty=0x3000
>>>>> trusted.glusterfs.quota.size.2=0x00001b69498a1400000000000076332e000000000005cd03
>>>>> trusted.glusterfs.volume-id=0x6f95525a94d74174bac4e1a18fe010a2
>>>>>
>>>>> And here is the volume information:
>>>>>
>>>>> [root at gluster07 ~]# gluster volume info storage
>>>>>
>>>>> Volume Name: storage
>>>>> Type: Distribute
>>>>> Volume ID: 6f95525a-94d7-4174-bac4-e1a18fe010a2
>>>>> Status: Started
>>>>> Snapshot Count: 0
>>>>> Number of Bricks: 7
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: 10.0.231.50:/mnt/raid6-storage/storage
>>>>> Brick2: 10.0.231.51:/mnt/raid6-storage/storage
>>>>> Brick3: 10.0.231.52:/mnt/raid6-storage/storage
>>>>> Brick4: 10.0.231.53:/mnt/raid6-storage/storage
>>>>> Brick5: 10.0.231.54:/mnt/raid6-storage/storage
>>>>> Brick6: 10.0.231.55:/mnt/raid6-storage/storage
>>>>> Brick7: 10.0.231.56:/mnt/raid6-storage/storage
>>>>> Options Reconfigured:
>>>>> changelog.changelog: on
>>>>> features.quota-deem-statfs: on
>>>>> features.read-only: off
>>>>> features.inode-quota: on
>>>>> features.quota: on
>>>>> performance.readdir-ahead: on
>>>>> nfs.disable: on
>>>>> geo-replication.indexing: on
>>>>> geo-replication.ignore-pid-check: on
>>>>> transport.address-family: inet
>>>>>
>>>>> Thanks,
>>>>> -Matthew
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
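For reference, the fix-layout and rebalance operations discussed above map onto commands along the following lines. This is only a sketch, using the volume name "storage" from this thread; confirm the option names against your gluster version before running anything.

  # Recalculate directory layouts only; no files are migrated.
  gluster volume rebalance storage fix-layout start

  # Full rebalance: recalculates layouts and also migrates files to their new bricks.
  # As suggested above, disable lookup-optimize until the rebalance completes.
  gluster volume set storage cluster.lookup-optimize off
  gluster volume rebalance storage start
  gluster volume rebalance storage status
  gluster volume set storage cluster.lookup-optimize on

The per-brick range estimate quoted above (roughly 0x24924924 per brick for 7 equally weighted bricks) can be reproduced with a few lines of bash. The remainder handling here is an assumption that happens to match the 6-brick values shown earlier, with the last brick absorbing the rounding.

  n=7
  chunk=$(( 0xffffffff / n ))            # ~0x24924924 for 7 bricks
  for i in $(seq 0 $(( n - 1 ))); do
      start=$(( i * chunk ))
      if [ "$i" -eq $(( n - 1 )) ]; then
          end=$(( 0xffffffff ))          # last brick takes the remainder
      else
          end=$(( start + chunk - 1 ))
      fi
      printf 'brick %d: 0x%08x -> 0x%08x\n' "$i" "$start" "$end"
  done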
Matthew Benstead
2019-Jul-29 16:46 UTC
[Gluster-users] Brick missing trusted.glusterfs.dht xattr
Hi Sunny,

Yes, I have attached the gsyncd.log file. I couldn't find any changes-<brick-path>.log files...

Trying to start replication goes faulty right away:

[root at gluster01 ~]# rpm -q glusterfs
glusterfs-5.6-1.el7.x86_64
[root at gluster01 ~]# uname -r
3.10.0-957.21.3.el7.x86_64
[root at gluster01 ~]# cat /etc/centos-release
CentOS Linux release 7.6.1810 (Core)

[root at gluster01 ~]# gluster volume geo-replication storage root at 10.0.231.81::pcic-backup start
Starting geo-replication session between storage & 10.0.231.81::pcic-backup has been successful
[root at gluster01 ~]# gluster volume geo-replication storage root at 10.0.231.81::pcic-backup status

MASTER NODE    MASTER VOL    MASTER BRICK                  SLAVE USER    SLAVE                       SLAVE NODE    STATUS    CRAWL STATUS    LAST_SYNCED
-------------------------------------------------------------------------------------------------------------------------------------------------------
10.0.231.50    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.52    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.54    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.51    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.53    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.55    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A
10.0.231.56    storage       /mnt/raid6-storage/storage    root          10.0.231.81::pcic-backup    N/A           Faulty    N/A             N/A

[root at gluster01 ~]# gluster volume geo-replication storage root at 10.0.231.81::pcic-backup stop
Stopping geo-replication session between storage & 10.0.231.81::pcic-backup has been successful

This is the primary cluster:

[root at gluster01 ~]# gluster volume info storage

Volume Name: storage
Type: Distribute
Volume ID: 6f95525a-94d7-4174-bac4-e1a18fe010a2
Status: Started
Snapshot Count: 0
Number of Bricks: 7
Transport-type: tcp
Bricks:
Brick1: 10.0.231.50:/mnt/raid6-storage/storage
Brick2: 10.0.231.51:/mnt/raid6-storage/storage
Brick3: 10.0.231.52:/mnt/raid6-storage/storage
Brick4: 10.0.231.53:/mnt/raid6-storage/storage
Brick5: 10.0.231.54:/mnt/raid6-storage/storage
Brick6: 10.0.231.55:/mnt/raid6-storage/storage
Brick7: 10.0.231.56:/mnt/raid6-storage/storage
Options Reconfigured:
features.read-only: off
features.inode-quota: on
features.quota: on
performance.readdir-ahead: on
nfs.disable: on
geo-replication.indexing: on
geo-replication.ignore-pid-check: on
transport.address-family: inet
features.quota-deem-statfs: on
changelog.changelog: on
diagnostics.client-log-level: INFO

And this is the cluster I'm trying to replicate to:

[root at pcic-backup01 ~]# gluster volume info pcic-backup
Volume Name: pcic-backup
Type: Distribute
Volume ID: 2890bcde-a023-4feb-a0e5-e8ef8f337d4c
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.0.231.81:/pcic-backup01-zpool/brick
Brick2: 10.0.231.82:/pcic-backup02-zpool/brick
Options Reconfigured:
nfs.disable: on
transport.address-family: inet

Thanks,
-Matthew

On 7/28/19 10:56 PM, Sunny Kumar wrote:
> Hi Matthew,
>
> Can you share the geo-rep logs and one more log file
> (changes-<brick-path>.log)? It will help to pinpoint the actual
> reason behind the failure.
>
> /sunny
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gluster01-gsyncd.log
Type: text/x-log
Size: 8980 bytes
Desc: not available
URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20190729/a0c15867/attachment.bin>
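On the log files Sunny asked for: with default log locations, the master-side geo-replication logs (gsyncd.log and the changes-<brick-path>.log changelog logs) normally live under /var/log/glusterfs/geo-replication/ on each master node, in a directory named for the session (master volume, slave host and slave volume). These paths are an assumption based on the defaults, so adjust the session directory name to whatever actually exists on the nodes.

  # On each master node:
  ls /var/log/glusterfs/geo-replication/
  less /var/log/glusterfs/geo-replication/<session-dir>/gsyncd.log
  ls /var/log/glusterfs/geo-replication/<session-dir>/changes-*.log

  # The session config also records the exact log file paths in use:
  gluster volume geo-replication storage root@10.0.231.81::pcic-backup config | grep -i log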
Matthew Benstead
2019-Aug-09 21:33 UTC
[Gluster-users] Brick missing trusted.glusterfs.dht xattr
Hi Sunny,

Where would I find the changes-<brick-path>.log files? Is there anything else to help diagnose this?

Thanks,
-Matthew

--
Matthew Benstead
System Administrator
Pacific Climate Impacts Consortium <https://pacificclimate.org/>
University of Victoria, UH1
PO Box 1800, STN CSC
Victoria, BC, V8W 2Y2
Phone: +1-250-721-8432
Email: matthewb at uvic.ca

On 7/29/19 9:46 AM, Matthew Benstead wrote:
> Hi Sunny,
>
> Yes, I have attached the gsyncd.log file. I couldn't find any
> changes-<brick-path>.log files...
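On the question of what else might help with diagnosis, two further places may be worth a look. This is a sketch only, assuming the default on-brick changelog location and the brick path from this thread rather than anything verified on this cluster.

  # Changelogs are recorded under each brick while changelog.changelog is on:
  ls /mnt/raid6-storage/storage/.glusterfs/changelogs | head

  # The htime index used by geo-replication normally sits alongside them:
  ls /mnt/raid6-storage/storage/.glusterfs/changelogs/htime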