Nithya Balachandran
2018-Jul-09 08:00 UTC
[Gluster-users] Files from one brick missing from readdir
Hi Hans,

Another user has reported something similar and we are still debugging this.

Would you mind taking a tcpdump of the client while listing the directory from a FUSE client and sending it to me? Please use

    tcpdump -i any -s 0 -w /var/tmp/dirls.pcap tcp and not port 22

Also, please send the output of gluster volume info and gluster volume get <volname> all.

Thanks,
Nithya

On 9 July 2018 at 12:51, Hans Henrik Happe <happe at nbi.dk> wrote:
> Hi,
>
> After an upgrade from 3.7 -> 3.10 -> 3.12.9 that seemed to go smoothly,
> we have experienced missing files and dirs when listing directories.
>
> We are using a distributed setup with 20 bricks (no redundancy from
> glusterfs).
>
> The dirs and files can be referenced directly, but do not show up in
> listings (readdir, i.e. ls). Renaming them works, but they still do
> not show up.
>
> The first time we discovered this, we noticed that files slowly
> reappeared and finally all were there. After that we started a
> fix-layout which is still running (5 million dirs). After this we plan
> to compare brick files to the mounted fs.
>
> Yesterday we again discovered some missing files in a dir. After some
> poking around we found that all missing files were located on the same
> brick.
>
> Comparing the dir xattrs did not give us a clue:
>
> Brick with missing files:
>
> # getfattr -m . -d -e hex backup
> # file: backup
> trusted.gfid=0x8613f6e0317141918b42d8c8063ffbce
> trusted.glusterfs.dht=0x0000000100000000b2169fa3bf11b4e7
> trusted.glusterfs.quota.6e0ab807-6eed-4af1-92b8-0db3ca7a19e0.contri.1=0x00000000d65e340000000000000000750000000000000001
> trusted.glusterfs.quota.dirty=0x3000
> trusted.glusterfs.quota.size.1=0x00000000d65e340000000000000000750000000000000001
>
> Other brick:
>
> # getfattr -m . -d -e hex backup
> # file: backup
> trusted.gfid=0x8613f6e0317141918b42d8c8063ffbce
> trusted.glusterfs.dht=0x0000000100000000bf11b4e8cbe5e488
> trusted.glusterfs.quota.6e0ab807-6eed-4af1-92b8-0db3ca7a19e0.contri.1=0x00000000b03aa80000000000000000700000000000000001
> trusted.glusterfs.quota.dirty=0x3000
> trusted.glusterfs.quota.size.1=0x00000000b03aa80000000000000000700000000000000001
>
> Has anyone experienced this, or does anyone have a clue as to what might be wrong?
>
> Cheers,
> Hans Henrik
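The two trusted.glusterfs.dht values quoted in the report can be read more easily once decoded. The sketch below is only an illustration and rests on an assumption not stated in the thread: that the 16-byte value is four big-endian 32-bit words and that the last two words are the start and stop of the directory's hash range on that brick. The function name decode_dht_layout is made up for this example.

```python
import struct


def decode_dht_layout(hex_value):
    """Split a trusted.glusterfs.dht value into four big-endian 32-bit words.

    Assumption: the last two words are the start and stop of the hash range.
    The first two words are returned untouched as "header".
    """
    digits = hex_value[2:] if hex_value.startswith("0x") else hex_value
    raw = bytes.fromhex(digits)
    if len(raw) != 16:
        raise ValueError("expected a 16-byte layout value")
    word0, word1, start, stop = struct.unpack(">IIII", raw)
    return {"header": (word0, word1), "start": start, "stop": stop}


# Values copied from the getfattr output in the original report.
missing = decode_dht_layout("0x0000000100000000b2169fa3bf11b4e7")
other = decode_dht_layout("0x0000000100000000bf11b4e8cbe5e488")

for label, layout in (("brick with missing files", missing), ("other brick", other)):
    print(f"{label}: 0x{layout['start']:08x} - 0x{layout['stop']:08x}")

# True when the second range begins right where the first one ends.
contiguous = other["start"] == missing["stop"] + 1
print("ranges are adjacent:", contiguous)
```

Under that assumption the two ranges sit directly next to each other with no gap or overlap, which would be consistent with the remark that the xattrs gave no clue. Checking all 20 bricks the same way would show whether the ranges together cover the full 32-bit hash space.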
Niels de Vos
2018-Jul-09 08:50 UTC
[Gluster-users] Files from one brick missing from readdir
On Mon, Jul 09, 2018 at 01:30:23PM +0530, Nithya Balachandran wrote:
> Hi Hans,
>
> Another user has reported something similar and we are still debugging this.
>
> Would you mind taking a tcpdump of the client while listing the directory
> from a FUSE client and sending it to me? Please use
> tcpdump -i any -s 0 -w /var/tmp/dirls.pcap tcp and not port 22

Last week I came across a patch that fixes a bug in the FUSE kernel module:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c6cdd51404b7ac12dd95173ddfc548c59ecf037f

This was included in linux-4.14 but has been marked for backporting to older kernels as well. If the description matches the experienced behaviour, running with an updated (or patched) kernel may help.

Niels
Nithya Balachandran
2018-Jul-09 09:12 UTC
[Gluster-users] Files from one brick missing from readdir
Thanks Hans.

What are the names of the "missing" files?

Regards,
Nithya
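One way to collect the names being asked for is to compare the directory as it appears on the bricks with the listing the FUSE mount returns, which is the comparison Hans mentioned planning. The following is a rough sketch, not a tested procedure for a production volume. It assumes the brick directories and the mount point are all reachable from the host where it runs, and the function name names_missing_from_mount and all paths are hypothetical. The demo uses temporary directories in place of real bricks.

```python
import os
import tempfile


def names_missing_from_mount(brick_dirs, mount_dir):
    """Return names present in any brick directory but absent from the mount listing."""
    on_bricks = set()
    for brick_dir in brick_dirs:
        for name in os.listdir(brick_dir):
            # .glusterfs holds brick-internal metadata and is not shown on the mount.
            if name != ".glusterfs":
                on_bricks.add(name)
    listed = set(os.listdir(mount_dir))  # the readdir result as the client sees it
    return sorted(on_bricks - listed)


# Demo: temporary directories stand in for two bricks and a mount point.
with tempfile.TemporaryDirectory() as root:
    brick1 = os.path.join(root, "brick1")
    brick2 = os.path.join(root, "brick2")
    mount = os.path.join(root, "mount")
    for directory in (brick1, brick2, mount):
        os.mkdir(directory)
    # a.txt lives on brick1 and is listed; b.txt lives on brick2 but is not listed.
    for path in (
        os.path.join(brick1, "a.txt"),
        os.path.join(brick2, "b.txt"),
        os.path.join(mount, "a.txt"),
    ):
        open(path, "w").close()
    demo_missing = names_missing_from_mount([brick1, brick2], mount)

print(demo_missing)
```

On a real volume the bricks usually sit on different servers, so the brick listings would have to be gathered per host and merged before the comparison.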