Nithya Balachandran
2018-Jul-09 09:19 UTC
[Gluster-users] Files from one brick missing from readdir
Or even better, the brick on which those files exist and the gluster volume status output for the volume.

Thanks,
Nithya

On 9 July 2018 at 14:42, Nithya Balachandran <nbalacha at redhat.com> wrote:

> Thanks Hans. What are the names of the "missing" files?
>
> Regards,
> Nithya
>
> On 9 July 2018 at 13:30, Nithya Balachandran <nbalacha at redhat.com> wrote:
>
>> Hi Hans,
>>
>> Another user has reported something similar and we are still debugging
>> this.
>>
>> Would you mind taking a tcpdump of the client while listing the directory
>> from a FUSE client and sending it to me? Please use
>>
>> tcpdump -i any -s 0 -w /var/tmp/dirls.pcap tcp and not port 22
>>
>> Also, please send the output of gluster volume info and gluster volume
>> get <volname> all.
>>
>> Thanks,
>> Nithya
>>
>> On 9 July 2018 at 12:51, Hans Henrik Happe <happe at nbi.dk> wrote:
>>
>>> Hi,
>>>
>>> After an upgrade from 3.7 -> 3.10 -> 3.12.9 that seemed to go smoothly,
>>> we have experienced missing files and dirs when listing directories.
>>>
>>> We are using a distributed setup with 20 bricks (no redundancy from
>>> glusterfs).
>>>
>>> The dirs and files can be referenced directly, but do not show up in
>>> listings (readdir, i.e. ls). Renaming them works, but they still do
>>> not show up.
>>>
>>> The first time we discovered this, we noticed that files slowly
>>> reappeared and finally all were there. After that we started a
>>> fix-layout, which is still running (5 million dirs). After this we
>>> would compare brick files to the mounted fs.
>>>
>>> Yesterday we again discovered some missing files in a dir. After some
>>> poking around we found that all missing files were located on the same
>>> brick.
>>>
>>> Comparing dir xattrs did not give us a clue:
>>>
>>> Brick with missing files:
>>>
>>> # getfattr -m . -d -e hex backup
>>> # file: backup
>>> trusted.gfid=0x8613f6e0317141918b42d8c8063ffbce
>>> trusted.glusterfs.dht=0x0000000100000000b2169fa3bf11b4e7
>>> trusted.glusterfs.quota.6e0ab807-6eed-4af1-92b8-0db3ca7a19e0.contri.1=0x00000000d65e340000000000000000750000000000000001
>>> trusted.glusterfs.quota.dirty=0x3000
>>> trusted.glusterfs.quota.size.1=0x00000000d65e340000000000000000750000000000000001
>>>
>>> Other brick:
>>>
>>> # getfattr -m . -d -e hex backup
>>> # file: backup
>>> trusted.gfid=0x8613f6e0317141918b42d8c8063ffbce
>>> trusted.glusterfs.dht=0x0000000100000000bf11b4e8cbe5e488
>>> trusted.glusterfs.quota.6e0ab807-6eed-4af1-92b8-0db3ca7a19e0.contri.1=0x00000000b03aa80000000000000000700000000000000001
>>> trusted.glusterfs.quota.dirty=0x3000
>>> trusted.glusterfs.quota.size.1=0x00000000b03aa80000000000000000700000000000000001
>>>
>>> Has anyone experienced this, or does anyone have a clue as to what
>>> might be wrong?
>>>
>>> Cheers,
>>> Hans Henrik
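
For readers following the xattr comparison in the quoted message: the two trusted.glusterfs.dht values can be decoded to see which hash range each brick holds for the backup directory. The sketch below assumes the commonly described on-disk layout of four big-endian 32-bit fields, with the last two being the start and stop of the range; the meaning of the first two fields varies between releases, so they are returned undecoded. The function name decode_dht_layout is made up for this illustration and is not part of any Gluster tool.

```python
import struct


def decode_dht_layout(hex_value):
    """Decode a trusted.glusterfs.dht value as printed by `getfattr -e hex`.

    Assumption: the value is 16 bytes, four big-endian uint32 fields, and the
    last two fields are the start and stop of the directory's hash range on
    this brick. The first two fields are passed through as-is.
    """
    digits = hex_value[2:] if hex_value.lower().startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError("expected a 16-byte layout, got %d bytes" % len(raw))
    field0, field1, start, stop = struct.unpack(">IIII", raw)
    return {"header": (field0, field1), "start": start, "stop": stop}


# Values copied from the getfattr output in the thread.
missing_brick = decode_dht_layout("0x0000000100000000b2169fa3bf11b4e7")
other_brick = decode_dht_layout("0x0000000100000000bf11b4e8cbe5e488")

for name, layout in (("brick with missing files", missing_brick),
                     ("other brick", other_brick)):
    print("%s: 0x%08x - 0x%08x" % (name, layout["start"], layout["stop"]))

# True when the second range begins right after the first one ends.
print("contiguous:", missing_brick["stop"] + 1 == other_brick["start"])
```

Under that assumption the two ranges are adjacent and do not overlap, which fits Hans's remark that the directory xattrs gave no clue.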
Nithya Balachandran
2018-Jul-09 09:25 UTC
[Gluster-users] Files from one brick missing from readdir
Hi Hans,

Never mind - I found it. It looks like the same problem as reported by the other user. In both cases, this is a pure distribute volume.

See packet 154. The iatt is null for all entries. It looks like a .glusterfs gfid link is missing on that brick.

Would you prefer that I send the steps to recover in a private email?

Regards,
Nithya
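
As background to the ".glusterfs gfid link" mentioned above: each brick keeps an entry under .glusterfs for every gfid, at a path built from the first two pairs of hex digits followed by the full UUID. The sketch below computes that path from a trusted.gfid value so it can be checked on a brick. The brick root /bricks/brick1 and both function names are hypothetical, and the gfid used is that of the backup directory from the getfattr output; the thread does not say which entry's link was actually missing, and this is not the recovery procedure Nithya refers to.

```python
import os
import uuid
from pathlib import PurePosixPath


def gfid_backend_path(brick_root, gfid_hex):
    """Return the expected .glusterfs entry for a gfid on a brick.

    gfid_hex is a trusted.gfid value as printed by `getfattr -e hex`,
    with or without the leading 0x.
    """
    digits = gfid_hex[2:] if gfid_hex.lower().startswith("0x") else gfid_hex
    gfid = str(uuid.UUID(hex=digits))  # canonical dashed form
    return PurePosixPath(brick_root) / ".glusterfs" / gfid[:2] / gfid[2:4] / gfid


def gfid_link_present(path):
    """True if the entry exists; lexists also counts a dangling symlink."""
    return os.path.lexists(str(path))


# Hypothetical brick root; gfid taken from the getfattr output in the thread.
expected = gfid_backend_path("/bricks/brick1", "0x8613f6e0317141918b42d8c8063ffbce")
print(expected)
print("present on this host:", gfid_link_present(expected))
```

Run on the brick server with the real brick root, the second line reports whether the entry is there; on any other machine it will simply print False.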