thr3ads.net - Gluster users - [Gluster-users] Confusion supreme [Jun 2024]

If this information is useful, please help other people find it:
Share via:

Zenon Panoussis

2024-Jun-26 14:07 UTC

[Gluster-users] Confusion supreme

Hello all

I have a mail store on a volume replica 3 with no arbiter. A while
ago the disk of one of the bricks failed and I was several days
late to notice it. When I did, I removed that brick from the volume,
replaced the failed disk, updated the OS on that machine from el8
to el9 and gluster on all three nodes from 10.3 to 11.1, added back
the brick and started a heal. Things appeared to work out OK, but
actually they did not. And this is what I have now.

# gluster volume info gv0

Volume Name: gv0
Type: Replicate
Volume ID: 1e3ca399-8e57-4ee8-997f-f64479199d23
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: zephyrosaurus:/vol/gfs/gv0
Brick2: alvarezsaurus:/vol/gfs/gv0
Brick3: nanosaurus:/vol/gfs/gv0
Options Reconfigured:
cluster.entry-self-heal: on
cluster.metadata-self-heal: on
cluster.data-self-heal: on
<snip>

On all three hosts:

# ls /vol/vmail/net/provocation/oracle/Maildir/cur
ls: reading directory '/vol/vmail/net/provocation/oracle/Maildir/cur':
Invalid argument

That is the glusterfs-mounted inbox of my mail.

If I list the bricks instead, I have different results on each host:

HostA
# ls /vol/gfs/gv0/net/provocation/oracle/Maildir/cur/ |wc -l
4848

HostB
# ls /vol/gfs/gv0/net/provocation/oracle/Maildir/cur/ |wc -l
522

HostC
# ls /vol/gfs/gv0/net/provocation/oracle/Maildir/cur/ |wc -l
4837

However,

# gluster volume heal gv0 info
Brick zephyrosaurus:/vol/gfs/gv0
/net/provocation/oracle/Maildir/cur
/net/provocation/oracle/Maildir/cur/1701712419.M379665P902306V000000000000002DI8264026770F33CFF_1.zephyrosaurus.nettheatre.org,S=14500:2,RS
/net/provocation/oracle/Maildir/cur/1701712390.M212926P902294V000000000000002DIA089A37BF7E58BB4_1.zephyrosaurus.nettheatre.org,S=19286:2,S
Status: Connected
Number of entries: 3

Brick alvarezsaurus:/vol/gfs/gv0
/net/provocation/oracle/Maildir/cur
Status: Connected
Number of entries: 1

Brick nanosaurus:/vol/gfs/gv0
/net/provocation/oracle/Maildir/cur
Status: Connected
Number of entries: 1

That's definitely not what it should be. There are at least 4300+
files missing from hostB and a dozen from hostC which are not queued
for healing. And nothing is in split-brain.

So I check the attributes of that Maildir/cur directory:

HostA
# getfattr -d -m . -e hex /vol/gfs/gv0/net/provocation/oracle/Maildir/cur
getfattr: Removing leading '/' from absolute path names
# file: vol/gfs/gv0/net/provocation/oracle/Maildir/cur
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000002394
trusted.afr.gv0-client-2=0x0000000000000000000000a0
trusted.afr.gv0-client-4=0x00000001000000000000001a
trusted.gfid=0xbf3ed8b7b2a8457d88f19482ae1ce73d
trusted.glusterfs.dht=0x000000000000000000000000ffffffff
trusted.glusterfs.mdata=0x010000000000000000520318006fbb5600000000001c00ba4800000000667c1663000000002e7f0c84a2925c3455bf9e00000000001098f13e

HostB
# getfattr -d -m . -e hex /vol/gfs/gv0/net/provocation/oracle/Maildir/cur
getfattr: Removing leading '/' from absolute path names
# file: vol/gfs/gv0/net/provocation/oracle/Maildir/cur
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-2=0x0000000000000000000000a0
trusted.gfid=0xbf3ed8b7b2a8457d88f19482ae1ce73d
trusted.glusterfs.dht=0x000000000000000000000000ffffffff
trusted.glusterfs.mdata=0x010000000000000000520318006fbb5600000000001c00ba4800000000667c1663000000002e7f0c84a2925c3455bf9e00000000001098f13e

HostC
# getfattr -d -m . -e hex /vol/gfs/gv0/net/provocation/oracle/Maildir/cur
getfattr: Removing leading '/' from absolute path names
# file: vol/gfs/gv0/net/provocation/oracle/Maildir/cur
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000002394
trusted.afr.gv0-client-3=0x000000000000000000000000
trusted.afr.gv0-client-4=0x000000010000000000000020
trusted.gfid=0xbf3ed8b7b2a8457d88f19482ae1ce73d
trusted.glusterfs.dht=0x000000000000000000000000ffffffff
trusted.glusterfs.mdata=0x010000000000000000520318006fbb5600000000001c00ba4800000000667c1663000000002e7f0c84a2925c3455bf9e00000000001098f13e

There is the explanation of this mess. In a replica 3 where
I should have exactly three clients, I have four clients and
none of the bricks have the same set of peers.

In a recent thread about similar problems, Ilias used the word
"clueless". It applies equally to me too: I have zero clue where
to begin or what to do. Any ideas anyone?

Cheers,

Z


-- 
????? ???????!
?????? ?????!

Zenon Panoussis

2024-Jun-26 17:46 UTC

head link

[Gluster-users] Confusion supreme

I should add that in /var/lib/glusterd/vols/gv0/gv0-shd.vol and
in all other configs in /var/lib/glusterd/ on all three machines
the nodes are consistently named

client-2: zephyrosaurus
client-3: alvarezsaurus
client-4: nanosaurus

This is normal. It was the second time that a brick was removed,
so client-0 and client-1 are gone.

So the problem is the file attibutes themselves. And there I see
things like

trusted.afr.gv0-client-0=0x000000000000000000000000
trusted.afr.gv0-client-1=0x000000000000000000000ab0
trusted.afr.gv0-client-3=0x000000000000000000000000
trusted.afr.gv0-client-4=0x000000000000000000000000

and

trusted.afr.gv0-client-3=0x000000000000000000000000
trusted.afr.gv0-client-4=0x000000000000000000000000

and other such, where the only thing that is consistent, is inconsistency.

When a brick is removed, shouldn't all files on the remaining bricks
be re-fattr'ed to remove the pointers to the non-existent brick?

I guess I can do this manually, but it will still leave me with
those files where the value of all trusted.afr.gv0-client(s) is
zero. How does healing deal with those?

Cheers,

Z


-- 
????? ???????!
?????? ?????!

Possibly Parallel Threads

Search for more maybe matching threads

Gluster users - Jun 2024 - Confusion supreme

[Gluster-users] Confusion supreme

[Gluster-users] Confusion supreme

Possibly Parallel Threads