Ronny Adsetts
2022-Nov-03 14:44 UTC
[Gluster-users] Question on stale shards with distribute-replicate volume
Hi,

We have a 4 x (2 + 1) distribute-replicate volume with sharding enabled. We use the volume for storing backing files for iSCSI devices. The iSCSI devices are provided to our file server by tgtd using the glfs backing store type via libgfapi.

We had a problem the other day where one of the filesystems wouldn't re-mount following a rolling tgtd restart (we have 4 servers providing tgtd). I think the rolling restart was done too quickly, which meant there was a disconnect at the file server end (speculating).

After some investigation, and manually trying to copy the filesystem image file to a temporary location, I found 0 byte shards. Because I mounted the volume directly I got errors in the gluster logs (/var/log/glusterfs/srv-iscsi.log) for the volume. I get no errors in the gluster logs when this happens via libgfapi, though I did see tgtd errors.

The tgtd errors look like this:

tgtd[24080]: tgtd: bs_glfs_request(279) Error on read ffffffff 1000
tgtd: bs_glfs_request(370) io error 0x55da8d9820b0 2 28 -1 4096 376698519552, Stale file handle

Not sure how to figure out which shard is the issue out of that log entry. :-)

The gluster logs look like this:

[2022-11-01 16:51:28.496911] E [MSGID: 133010] [shard.c:2342:shard_common_lookup_shards_cbk] 0-iscsi-shard: Lookup on shard 5613 failed. Base file gfid = b42dc8f9-755e-46be-8418-4882a9f765e1 [Stale file handle]
[2022-11-01 19:17:09.060376] E [MSGID: 133010] [shard.c:2342:shard_common_lookup_shards_cbk] 0-iscsi-shard: Lookup on shard 5418 failed. Base file gfid = b42dc8f9-755e-46be-8418-4882a9f765e1 [Stale file handle]

So those were the two shards showing up as problematic. Checking the shard files showed that they were 0 byte with a trusted.glusterfs.dht.linkto value in the file attributes. There were other shard files of the same name with the correct size, so I guess the shards had been moved at some point, leaving the 0 byte linkto copies behind.

Anyway, moving the offending .shard and associated .glusterfs files out of the way meant I was able to first copy the file without error, and then run an "xfs_repair -L" on the filesystem and get it remounted. There was some data loss, but minor as far as I can tell.

The two shards I removed (replica 2 + arbiter) look like so:

ronny at cogline:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
total 0
drwxr-xr-x 2 root root 104 Nov  2 00:13 .
drwxr-xr-x 4 root root  38 Nov  2 00:05 ..
---------T 1 root root   0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418
---------T 1 root root   0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613

ronny at keratrix:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
total 0
drwxr-xr-x 2 root root 104 Nov  2 00:13 .
drwxr-xr-x 4 root root  38 Nov  2 00:07 ..
---------T 1 root root   0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418
---------T 1 root root   0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613

ronny at bellizen:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
total 0
drwxr-xr-x 2 root root 55 Nov  2 00:07 .
drwxr-xr-x 4 root root 38 Nov  2 00:07 ..
---------T 1 root root  0 Oct 25 10:49 b42dc8f9-755e-46be-8418-4882a9f765e1.5613

ronny at risca:~$ ls -al /tmp/publichomes-backup-stale-shards/.shard/
total 0
drwxr-xr-x 2 root root 55 Nov  2 00:13 .
drwxr-xr-x 4 root root 38 Nov  2 00:13 ..
---------T 1 root root  0 Aug 14 04:26 b42dc8f9-755e-46be-8418-4882a9f765e1.5418

So the first question is: did I do the right thing to get this resolved?
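For reference, a minimal sketch of how the shard copies named in those log entries could be located and inspected on each brick. The brick path below is a placeholder and would need adjusting; the gfid and shard numbers are taken from the log lines above. A 0-byte copy in mode ---------T carrying trusted.glusterfs.dht.linkto is a DHT link stub pointing at another subvolume, while the real shard copy has a non-zero size:

# run as root on each gluster server; BRICK is a placeholder for the real brick path
BRICK=/data/brick1/iscsi
GFID=b42dc8f9-755e-46be-8418-4882a9f765e1
for SHARD in 5418 5613; do
    F="$BRICK/.shard/$GFID.$SHARD"
    [ -e "$F" ] || continue
    ls -l "$F"
    # dump all xattrs in hex; a 0-byte file listing trusted.glusterfs.dht.linkto
    # is the stale link file, not the real shard data
    getfattr -d -m . -e hex "$F"
done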
The other, and more important, question relates to "Stale file handle" errors we are now seeing on a different filesystem. I only have tgtd log entries for this one, and wondered if anyone could help with taking a log entry and somehow figuring out which shard is the problematic one:

tgtd[3052]: tgtd: bs_glfs_request(370) io error 0x56404e0dc510 2 2a -1 1310720 428680884224, Stale file handle

Thanks for any help anyone can provide.

Ronny
--
Ronny Adsetts
Technical Director
Amazing Internet Ltd, London
t: +44 20 8977 8943
w: www.amazinginternet.com

Registered office: 85 Waldegrave Park, Twickenham, TW1 4TJ
Registered in England. Company No. 4042957
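On the question of mapping a tgtd io error to a shard: the last numeric field in the bs_glfs_request line appears to be the byte offset of the failed request, and the shard index is that offset divided by the volume's shard size, rounded down. Assuming the default features.shard-block-size of 64 MiB (worth confirming with "gluster volume get <volname> features.shard-block-size"), the offset 376698519552 from the earlier error maps to shard 5613, which matches the gluster log entry above. A minimal sketch of the arithmetic:

# offset is the last number before "Stale file handle" in the tgtd log line
OFFSET=428680884224
# assumption: default 64 MiB shard size; confirm with
#   gluster volume get <volname> features.shard-block-size
SHARD_SIZE=$((64 * 1024 * 1024))
echo "shard index: $((OFFSET / SHARD_SIZE))"
# sanity check against the earlier error:
# 376698519552 / 67108864 = 5613, matching the shard in the gluster log

On that assumption, the offset 428680884224 would land in shard 6387 of whichever base file backs that iSCSI device.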
Strahil Nikolov
2022-Nov-10 17:28 UTC
[Gluster-users] Question on stale shards with distribute-replicate volume
I only skimmed over this, so take everything I say with a grain of salt.

Based on the logs, the gfid for one of the cases is clear -> b42dc8f9-755e-46be-8418-4882a9f765e1, shard 5613. As there is a linkto, most probably the shard's location was on another subvolume, and in that case I would just "walk" over all the bricks and get the extended file attributes of the real ones.

I can't imagine why it happened, but I do suspect a gfid split-brain. If I were in your shoes, I would check the gfids and assume that those with the same gfid value are the good ones (usually the one that differs has an older timestamp), and I would remove the copy from the last brick and check if it fixes things for me.

Best Regards,
Strahil Nikolov
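To illustrate the gfid check being suggested, a minimal sketch, assuming a placeholder brick path and run as root on every server holding a copy of the shard. Copies that belong together should report identical trusted.gfid values; a copy whose gfid differs (usually the one with the older timestamp) would be the suspect to move aside, after taking a backup:

# assumption: placeholder brick path; repeat on each server that has this shard
BRICK=/data/brick1/iscsi
F="$BRICK/.shard/b42dc8f9-755e-46be-8418-4882a9f765e1.5613"
stat -c '%n: %s bytes, mtime %y' "$F"
# compare this value across all copies; a mismatch suggests a gfid split-brain
getfattr -n trusted.gfid -e hex "$F"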