My volume is replica 3 arbiter 1; maybe that makes a difference?
Brick processes tend to die quite often: I have to restart glusterd at
least once a day because "gluster v info | grep ' N '" reports at least
one missing brick (sometimes, even if all bricks are reported up, I have
to kill all glusterfs[d] processes and restart glusterd).
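In case it's useful, this is roughly the check I'd script for that (just
a sketch: it assumes the "Online" column is the second-to-last field of
the "Brick" lines in "gluster volume status" output, so adjust it to your
version's formatting):

#!/bin/bash
VOL=cluster_data
# list bricks whose Online column reports "N"
down=$(gluster volume status "$VOL" | awk '/^Brick/ && $(NF-1) == "N" {print $2}')
if [ -n "$down" ]; then
    echo "Bricks reported offline for $VOL:"
    echo "$down"
    # systemctl restart glusterd   # only if the automatic restart is really wanted
fi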
The 3 servers have 192GB RAM (that should be way more than enough!), 30
data bricks and 15 arbiters (the arbiters share a single SSD).
And I noticed that some "stale file handle" errors are not reported by
heal info.
root at str957-cluster:/# ls -l /scratch/extra/m******/PNG/PNGQuijote/ModGrav/fNL40/
ls: cannot access '/scratch/extra/m******/PNG/PNGQuijote/ModGrav/fNL40/output_21': Stale file handle
total 40
d????????? ? ? ? ? ? output_21
...
but "gluster v heal cluster_data info |grep output_21" returns
nothing. :(
Seems the other stale handles either got corrected by subsequent 'stat's
or became I/O errors.
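A way to dig further could be comparing the gfid xattr of that directory
directly on the bricks (rough sketch; the path below the brick root is a
placeholder, to be adjusted to wherever the dir actually lives):

for B in /srv/bricks/*/d; do
    echo "== $B"
    # a gfid mismatch between replicas can show up here even when heal info is silent
    getfattr -n trusted.gfid -e hex "$B/<relative-path>/output_21" 2>&1
done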
Diego.
On 12/02/2023 21:34, Strahil Nikolov wrote:
> The 2nd error indicates conflicts between the nodes. The only way that
> could happen on replica 3 is a gfid conflict (file/dir was renamed or
> recreated).
>
> Are you sure that all bricks are online? Usually 'Transport endpoint is
> not connected' indicates a brick down situation.
>
> First, start with all the stale file handles:
> check the md5sum on all bricks. If it differs somewhere, delete the gfid,
> move the file away from the brick and check in FUSE. If it's fine,
> touch it and the FUSE client will "heal" it.
>
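> A rough per-file sketch of those steps (untested; FILE is the path
> relative to the brick root, and the brick list and FUSE mountpoint are
> placeholders):
>
> FILE=path/relative/to/brick
> for B in /srv/bricks/*/d; do
>     [ -e "$B/$FILE" ] && md5sum "$B/$FILE"
> done
> # if one brick's copy differs: remove its .glusterfs gfid hardlink, move
> # the file off that brick, then verify and "touch" it through FUSE:
> # stat /mnt/scratch/$FILE && touch /mnt/scratch/$FILE
>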
> Best Regards,
> Strahil Nikolov
>
>
>
> On Tue, Feb 7, 2023 at 16:33, Diego Zuccato
> <diego.zuccato at unibo.it> wrote:
> The contents do not match exactly, but the only difference is the
> "option shared-brick-count" line that sometimes is 0 and sometimes 1.
>
> The command you gave could be useful for the files that still need
> healing with the source still present, but the files related to the
> stale gfids have been deleted, so "find -samefile" won't find anything.
>
> For the other files reported by heal info, I saved the output to
> 'healinfo', then:
>   for T in $(grep '^/' healinfo | sort | uniq); do stat /mnt/scratch$T > /dev/null; done
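>
> A variant that also records which paths still fail (same assumptions
> about 'healinfo' and the mountpoint):
>
>   grep '^/' healinfo | sort -u | while read -r T; do
>       stat "/mnt/scratch$T" > /dev/null 2>> stat-errors.log || echo "$T" >> still-failing.txt
>   done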
>
> but I still see a lot of 'Transport endpoint is not connected' and
> 'Stale file handle' errors :( And many 'No such file or directory'...
>
> I don't understand the first two errors, since /mnt/scratch has been
> freshly mounted after enabling client healing, and gluster v info does
> not highlight unconnected/down bricks.
>
> Diego
>
> On 06/02/2023 22:46, Strahil Nikolov wrote:
> > I'm not sure if the md5sum has to match, but at least the content
> > should do.
> > In modern versions of GlusterFS the client-side healing is disabled,
> > but it's worth trying.
> > You will need to enable cluster.metadata-self-heal,
> > cluster.data-self-heal and cluster.entry-self-heal, and then create a
> > small one-liner that identifies the names of the files/dirs from the
> > volume heal info, so you can stat them through the FUSE mount.
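> > (Enabling them would be something along these lines, with the volume
> > name taken from this thread:)
> >
> > gluster volume set cluster_data cluster.metadata-self-heal on
> > gluster volume set cluster_data cluster.data-self-heal on
> > gluster volume set cluster_data cluster.entry-self-heal on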
> >
> > Something like this:
> >
> >
> > for i in $(gluster volume heal <VOL> info | awk -F '<gfid:|>' '/gfid:/
> > {print $2}'); do find /PATH/TO/BRICK/ -samefile
> > /PATH/TO/BRICK/.glusterfs/${i:0:2}/${i:2:2}/$i | awk '!/.glusterfs/
> > {gsub("/PATH/TO/BRICK", "stat /MY/FUSE/MOUNTPOINT", $0); print $0}'; done
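> >
> > The same thing split over a few lines with comments, in case that's
> > easier to adapt (brick and mountpoint are still placeholders):
> >
> > VOL=cluster_data
> > BRICK=/PATH/TO/BRICK
> > MNT=/MY/FUSE/MOUNTPOINT
> > # pull the bare gfids out of heal info
> > gluster volume heal "$VOL" info | awk -F '<gfid:|>' '/gfid:/ {print $2}' |
> > while read -r i; do
> >     # resolve each gfid entry to its real path on the brick, skipping .glusterfs itself
> >     find "$BRICK" -samefile "$BRICK/.glusterfs/${i:0:2}/${i:2:2}/$i" 2>/dev/null |
> >         grep -v '/\.glusterfs/' |
> >         sed "s|^$BRICK|$MNT|" |
> >         while read -r p; do
> >             # stat through the FUSE mount to trigger the client-side heal
> >             stat "$p" > /dev/null
> >         done
> > done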
> >
> > Then just copy-paste the output and you will trigger the client-side
> > heal only on the affected gfids.
> >
> > Best Regards,
> > Strahil Nikolov
> > On Monday, 6 February 2023 at 10:19:02 GMT+2, Diego Zuccato
> > <diego.zuccato at unibo.it> wrote:
> >
> >
> > Oops... Re-including the list that got excluded in my previous answer :(
> >
> > I generated md5sums of all files in vols/ on clustor02 and compared to
> > the other nodes (clustor00 and clustor01).
> > There are differences in the volfiles' "shared-brick-count" values
> > (shouldn't it always be 1, since every data brick is on its own fs?
> > Quorum bricks, OTOH, share a single partition on SSD and should always
> > be 15, but in both cases sometimes it's 0).
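> > (A quick way to see the spread of values, as a sketch, assuming the
> > default /var/lib/glusterd layout and the volume name from this thread:)
> >
> > grep -r 'option shared-brick-count' /var/lib/glusterd/vols/cluster_data/ |
> >     awk '{print $NF}' | sort | uniq -c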
> >
> > I nearly got a stroke when I saw diff output for the 'info' files, but
> > once I sorted 'em their contents matched. Phew!
> >
> > Diego
> >
> > On 03/02/2023 19:01, Strahil Nikolov wrote:
> > > This one doesn't look good:
> > >
> > > [2023-02-03 07:45:46.896924 +0000] E [MSGID: 114079]
> > > [client-handshake.c:1253:client_query_portmap] 0-cluster_data-client-48:
> > > remote-subvolume not set in volfile []
> > >
> > > Can you compare all vol files in /var/lib/glusterd/vols/ between the
> > > nodes?
> > > I have the suspicion that there is a vol file mismatch (maybe
> > > /var/lib/glusterd/vols/<VOLUME_NAME>/*-shd.vol).
> > >
> > > Best Regards,
> > > Strahil Nikolov
> > >
> > > On Fri, Feb 3, 2023 at 12:20, Diego Zuccato
> > > <diego.zuccato at unibo.it> wrote:
> > > Can't see anything relevant in the glfsheal log, just messages related
> > > to the crash of one of the nodes (the one that had the mobo replaced...
> > > I fear some on-disk structures could have been silently damaged by RAM
> > > errors and that makes gluster processes crash, or it's just an issue
> > > with enabling brick-multiplex).
> > > -8<--
> > > [2023-02-03 07:45:46.896924 +0000] E [MSGID: 114079]
> > > [client-handshake.c:1253:client_query_portmap] 0-cluster_data-client-48:
> > > remote-subvolume not set in volfile []
> > > [2023-02-03 07:45:46.897282 +0000] E
> > > [rpc-clnt.c:331:saved_frames_unwind] (-->
> > > /lib/x86_64-linux-gnu/libglusterfs.so.0(_gf_log_callingfn+0x195)[0x7fce0c867b95]
> > > (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x72fc)[0x7fce0c0ca2fc] (-->
> > > /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_clnt_connection_cleanup+0x109)[0x7fce0c0d2419]
> > > (--> /lib/x86_64-linux-gnu/libgfrpc.so.0(+0x10308)[0x7fce0c0d3308] (-->
> > > /lib/x86_64-linux-gnu/libgfrpc.so.0(rpc_transport_notify+0x26)[0x7fce0c0ce7e6]
> > > ))))) 0-cluster_data-client-48: forced unwinding frame type(GF-DUMP)
> > > op(NULL(2)) called at 2023-02-03 07:45:46.891054 +0000 (xid=0x13)
> > > -8<--
> > >
> > > Well, actually I *KNOW* the files outside .glusterfs have been deleted
> > > (by me :) ). That's why I call those 'stale' gfids.
> > > Affected entries under .glusterfs usually have link count 1 => nothing
> > > 'find' can find.
> > > Since I already recovered those files (before deleting from the bricks),
> > > can the .glusterfs entries be deleted too or should I check something
> > > else?
> > > Maybe I should create a script that finds all files/dirs (not symlinks,
> > > IIUC) in .glusterfs on all bricks/arbiters and moves 'em to a temp dir?
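> > > Something like this, maybe (untested sketch: it assumes orphaned gfid
> > > entries are regular files with a single remaining hard link, and it
> > > skips the internal indices/ and changelogs/ dirs):
> > >
> > > for B in /srv/bricks/*/d; do
> > >     find "$B/.glusterfs" \( -path '*/indices' -o -path '*/changelogs' \) -prune \
> > >          -o -type f -links 1 -print
> > > done > orphan-gfids.txt
> > > # review the list first, then move the entries to a temp dir, e.g.:
> > > # while read -r f; do mv "$f" /srv/tmp-gfids/; done < orphan-gfids.txt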
> > >
> > > Diego
> > >
> > > On 02/02/2023 23:35, Strahil Nikolov wrote:
> > > > Any issues reported in /var/log/glusterfs/glfsheal-*.log ?
> > > >
> > > > The easiest way to identify the affected entries is to run:
> > > > find /FULL/PATH/TO/BRICK/ -samefile
> > > > /FULL/PATH/TO/BRICK/.glusterfs/57/e4/57e428c7-6bed-4eb3-b9bd-02ca4c46657a
> > > >
> > > > Best Regards,
> > > > Strahil Nikolov
> > > >
> > > > On Tuesday, 31 January 2023 at 11:58:24 GMT+2, Diego Zuccato
> > > > <diego.zuccato at unibo.it> wrote:
> > > >
> > > > Hello all.
> > > >
> > > > I've had one of the 3 nodes serving a "replica 3 arbiter 1" volume
> > > > down for some days (apparently RAM issues, but actually a failing
> > > > mobo).
> > > > The other nodes have had some issues (RAM exhaustion, an old problem
> > > > already ticketed but still no solution) and some brick processes
> > > > coredumped. Restarting the processes allowed the cluster to continue
> > > > working. Mostly.
> > > >
> > > > After the third server got fixed I started a heal, but files didn't
> > > > get healed and the count (by "ls -l
> > > > /srv/bricks/*/d/.glusterfs/indices/xattrop/ | grep ^- | wc -l") did
> > > > not decrease over 2 days. So, to recover, I copied files from the
> > > > bricks to temp storage (keeping both copies of conflicting files with
> > > > different contents), removed the files on bricks and arbiters, and
> > > > finally copied back from temp storage to the volume.
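> > > > (The per-brick breakdown of that count, as a sketch, in case the
> > > > distribution across bricks matters:)
> > > >
> > > > for B in /srv/bricks/*/d; do
> > > >     echo "$B: $(find "$B/.glusterfs/indices/xattrop/" -type f | wc -l)"
> > > > done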
> > > >
> > > > Now the files are accessible, but I still see lots of entries like
> > > > <gfid:57e428c7-6bed-4eb3-b9bd-02ca4c46657a>
> > > >
> > > > IIUC that's due to a mismatch between .glusterfs/ contents and the
> > > > normal hierarchy. Is there some tool to speed up the cleanup?
> > > >
> > > > Tks.
> > > >
>
--
Diego Zuccato
DIFA - Dip. di Fisica e Astronomia
Servizi Informatici
Alma Mater Studiorum - Università di Bologna
V.le Berti-Pichat 6/2 - 40127 Bologna - Italy
tel.: +39 051 20 95786