I found a solution after making a discovery. I logged into the host of the
brick with the worst file-count discrepancy - odroid4 - and killed the
gluster daemon there. All file counts across all clients then matched, so I
started the daemon again and ran this command to fix it up:
gluster volume replace-brick gvol0 odroid4:/srv/gfs-brick/gvol0
odroid4:/srv/gfs-brick/gvol0_2 commit force
...and that fixed it. It's disconcerting that Gluster can merrily hum along
with no problems showing up in the various status summaries while presenting
vastly different directory listings to different clients. Is this a known
problem, or should I open a bug report? Are there any particular error logs
I should monitor to be alerted to this bad state?
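
For the record, the full sequence was roughly the following. This is only a
sketch: it assumes glusterd is managed by systemd on these Ubuntu hosts, and
/mnt/share2 is a stand-in for wherever the volume is FUSE-mounted.

# on odroid4: stop the gluster daemon (the step that made the counts agree)
systemctl stop glusterd

# on any client: re-check the file count through the FUSE mount
find /mnt/share2 -type f | wc -l

# on odroid4: bring the daemon back up
systemctl start glusterd

# from any server in the pool: move the suspect brick to a fresh path
gluster volume replace-brick gvol0 odroid4:/srv/gfs-brick/gvol0 \
    odroid4:/srv/gfs-brick/gvol0_2 commit force
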
On Thu, Oct 29, 2020 at 8:39 PM James H <stormdig at gmail.com> wrote:
> Hi folks, I'm struggling to find a solution to missing files on FUSE
> mounts. Which files are missing differs from client to client. I can stat
> or ls a missing file directly by filename, but directory listings won't
> show it.
>
> So far I've (the corresponding commands are sketched just after this list):
>
> - verified heal info shows no files in need of healing and no split
> brain condition
> - verified the same number of clients are connected to each brick
> - verified the file counts on the bricks match
> - upgraded Gluster server and clients from 3.x to 6.x and 7.x
> - run a stat on all files
> - run a heal full
> - rebooted / remounted FUSE clients
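>
> For completeness, the checks above map to roughly these commands (a
> sketch; /mnt/share2 stands in for wherever the volume is FUSE-mounted):
>
> gluster volume heal gvol0 info               # pending heals
> gluster volume heal gvol0 info split-brain   # split-brain entries
> gluster volume status gvol0 clients          # clients connected per brick
> gluster volume heal gvol0 full               # kick off a full heal
> find /mnt/share2 -exec stat {} + >/dev/null  # stat every file via FUSE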
>
> File count from running a 'find' command on FUSE mounts on the brick
> hosts themselves. These counts should all be the same:
> 38823  fuse-odroid1-share2
> 38823  fuse-odroid2-share2
> 60962  fuse-odroid3-share2
>  7202  fuse-odroid4-share2
>
> ...and a FUSE mount on a separate server:
> 38823  fuse-phn2dsm-share2
>
> File count from running a 'find' command on the brick directories
> themselves:
> 43382  brick-odroid1-share2
> 43382  brick-odroid2-share2
> 43382  brick-arbiter-odroid3-share2
> 23075  brick-odroid3-share2
> 23075  brick-odroid4-share2
> 23075  brick-arbiter-odroid2-share2
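>
> For anyone repeating these counts, something like the following works (a
> sketch; /mnt/share2 stands in for the FUSE mount point, and the brick-side
> command prunes the .glusterfs metadata directory so Gluster's internal
> hardlinks don't inflate the totals):
>
> # through a FUSE mount
> find /mnt/share2 -type f | wc -l
>
> # directly on a brick
> find /srv/gfs-brick/gvol0 -path '*/.glusterfs' -prune -o -type f -print | wc -l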
>
> Here's some info about the setup:
>
> # gluster --version | head -1; cat /etc/lsb-release; uname -r
> glusterfs 7.8
> DISTRIB_ID=Ubuntu
> DISTRIB_RELEASE=18.04
> DISTRIB_CODENAME=bionic
> DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS"
> 4.14.157-171
>
> # gluster volume info
> Volume Name: gvol0
> Type: Distributed-Replicate
> Volume ID: 57e3a085-5fb7-417d-a71a-fed5cd0ae2d9
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 2 x (2 + 1) = 6
> Transport-type: tcp
> Bricks:
> Brick1: odroid1:/srv/gfs-brick/gvol0
> Brick2: odroid2:/srv/gfs-brick/gvol0
> Brick3: odroid3:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
> Brick4: odroid3:/srv/gfs-brick/gvol0_2
> Brick5: odroid4:/srv/gfs-brick/gvol0
> Brick6: odroid2:/srv/gfs-brick/gvol0-arbiter2 (arbiter)
> Options Reconfigured:
> cluster.self-heal-daemon: enable
> performance.readdir-ahead: yes
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> performance.quick-read: on
> cluster.shd-max-threads: 4
> performance.parallel-readdir: on
> cluster.server-quorum-type: server
> server.event-threads: 4
> client.event-threads: 4
> performance.nl-cache-timeout: 600
> performance.nl-cache: on
> network.inode-lru-limit: 200000
> performance.md-cache-timeout: 600
> performance.cache-samba-metadata: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> storage.fips-mode-rchecksum: on
> performance.client-io-threads: off
> nfs.disable: on
> transport.address-family: inet
> features.bitrot: on
> features.scrub: Active
> features.scrub-throttle: lazy
> features.scrub-freq: daily
> cluster.min-free-disk: 10%
>
> # gluster volume status gvol0 detail
> Status of volume: gvol0
>
> ------------------------------------------------------------------------------
> Brick : Brick odroid1:/srv/gfs-brick/gvol0
> TCP Port : 49152
> RDMA Port : 0
> Online : Y
> Pid : 702
> File System : xfs
> Device : /dev/sda
> Mount Options : rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
> Inode Size : 512
> Disk Space Free : 983.4GB
> Total Disk Space : 5.5TB
> Inode Count : 586052224
> Free Inodes : 585835873
>
> ------------------------------------------------------------------------------
> Brick : Brick odroid2:/srv/gfs-brick/gvol0
> TCP Port : 49152
> RDMA Port : 0
> Online : Y
> Pid : 30206
> File System : xfs
> Device : /dev/sda
> Mount Options : rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
> Inode Size : 512
> Disk Space Free : 983.3GB
> Total Disk Space : 5.5TB
> Inode Count : 586052224
> Free Inodes : 585711242
>
> ------------------------------------------------------------------------------
> Brick : Brick odroid3:/srv/gfs-brick/gvol0-arbiter2
> TCP Port : 49152
> RDMA Port : 0
> Online : Y
> Pid : 32449
> File System : xfs
> Device : /dev/sda
> Mount Options : rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
> Inode Size : 512
> Disk Space Free : 1.4TB
> Total Disk Space : 2.7TB
> Inode Count : 293026624
> Free Inodes : 292378835
>
> ------------------------------------------------------------------------------
> Brick : Brick odroid3:/srv/gfs-brick/gvol0_2
> TCP Port : 49153
> RDMA Port : 0
> Online : Y
> Pid : 32474
> File System : xfs
> Device : /dev/sda
> Mount Options : rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
> Inode Size : 512
> Disk Space Free : 1.4TB
> Total Disk Space : 2.7TB
> Inode Count : 293026624
> Free Inodes : 292378835
>
> ------------------------------------------------------------------------------
> Brick : Brick odroid4:/srv/gfs-brick/gvol0
> TCP Port : 49152
> RDMA Port : 0
> Online : Y
> Pid : 23138
> File System : xfs
> Device : /dev/sda
> Mount Options : rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
> Inode Size : 512
> Disk Space Free : 1.4TB
> Total Disk Space : 2.7TB
> Inode Count : 293026624
> Free Inodes : 292891910
>
> ------------------------------------------------------------------------------
> Brick : Brick odroid2:/srv/gfs-brick/gvol0-arbiter2
> TCP Port : 49153
> RDMA Port : 0
> Online : Y
> Pid : 30231
> File System : xfs
> Device : /dev/sda
> Mount Options : rw,noatime,nouuid,attr2,inode64,sunit=256,swidth=2560,noquota
> Inode Size : 512
> Disk Space Free : 983.3GB
> Total Disk Space : 5.5TB
> Inode Count : 586052224
> Free Inodes : 585711242
>