Niels Hendriks
2018-Apr-16 08:24 UTC
[Gluster-users] Gluster FUSE mount sometimes reports that files do not exist until ls is performed on parent directory
Hi,

We have a 3-node gluster setup where gluster is both the server and the client.

Every few days we have some $random file or directory that does not exist according to the FUSE mountpoint. When we try to access the file (stat, cat, etc.) the filesystem reports that the file/directory does not exist, even though it does. When we try to create the file/directory we receive the following error, which is also logged in /var/log/glusterfs/bricks/$brick.log:

[2018-04-10 12:51:26.755928] E [MSGID: 113027] [posix.c:1779:posix_mkdir] 0-www-posix: mkdir of /storage/gluster/path/to/dir failed [File exists]

We don't see this issue on all of the servers, only on the servers that did not create the file/directory (so 2 of the 3 gluster nodes).

The issue does not resolve itself automatically. However, when we perform an ls on the parent directory, the issue is resolved for the other nodes.

We are running glusterfs 3.12.6 on Debian 8.

Mount options in /etc/fstab:
/dev/storage-gluster/gluster /storage/gluster xfs rw,inode64,noatime,nouuid 0 2
localhost:/www /var/www glusterfs backup-volfile-servers=10.0.0.2:10.0.0.3,log-level=WARNING 0 0

gluster volume info www

Volume Name: www
Type: Replicate
Volume ID: e0579d53-f671-4868-863b-ba85c4cfacb3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3
Transport-type: tcp
Bricks:
Brick1: n01c01-gluster:/storage/gluster/www
Brick2: n02c01-gluster:/storage/gluster/www
Brick3: n03c01-gluster:/storage/gluster/www
Options Reconfigured:
performance.read-ahead: on
performance.client-io-threads: on
nfs.disable: on
transport.address-family: inet
performance.md-cache-timeout: 600
diagnostics.brick-log-level: WARNING
network.ping-timeout: 3
features.cache-invalidation: on
server.event-threads: 4
performance.cache-invalidation: on
performance.quick-read: on
features.cache-invalidation-timeout: 600
network.inode-lru-limit: 90000
performance.cache-priority: *.php:3,*.temp:3,*:1
performance.nl-cache: on
performance.cache-size: 1GB
performance.readdir-ahead: on
performance.write-behind: on
cluster.readdir-optimize: on
performance.io-thread-count: 64
client.event-threads: 4
cluster.lookup-optimize: on
performance.parallel-readdir: off
performance.write-behind-window-size: 4MB
performance.flush-behind: on
features.bitrot: on
features.scrub: Active
performance.io-cache: off
performance.stat-prefetch: on

We suspected that the md-cache could be the cause, but it has a timeout of 600 seconds, which would be strange since the issue can be present for hours (at which point we did an ls to fix it).

Does anyone have an idea of what could be the cause of this?

Thanks!
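The symptom and workaround described above can be sketched as a shell fragment; the path below is a hypothetical placeholder, since any file or directory can be affected (sketch only, requires an affected mount to reproduce):

```shell
# Placeholder path; substitute the entry reported as missing.
F=/var/www/path/to/dir

stat "$F"                        # fails with "No such file or directory"
mkdir "$F"                       # also fails; brick log shows [File exists]

ls "$(dirname "$F")" >/dev/null  # readdir on the parent clears the stale
                                 # negative entry on the affected nodes
stat "$F"                        # now succeeds
```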
Raghavendra Gowdappa
2018-Apr-16 08:37 UTC
[Gluster-users] Gluster FUSE mount sometimes reports that files do not exist until ls is performed on parent directory
On Mon, Apr 16, 2018 at 1:54 PM, Niels Hendriks <niels at nuvini.com> wrote:

> Hi,
>
> We have a 3-node gluster setup where gluster is both the server and the
> client. Every few days we have some $random file or directory that does
> not exist according to the FUSE mountpoint. When we try to access the
> file (stat, cat, etc.) the filesystem reports that the file/directory
> does not exist, even though it does. When we try to create the
> file/directory we receive the following error, which is also logged in
> /var/log/glusterfs/bricks/$brick.log:
>
> [2018-04-10 12:51:26.755928] E [MSGID: 113027] [posix.c:1779:posix_mkdir]
> 0-www-posix: mkdir of /storage/gluster/path/to/dir failed [File exists]
>
> We don't see this issue on all of the servers, only on the servers that
> did not create the file/directory (so 2 of the 3 gluster nodes).
>
> The issue does not resolve itself automatically. However, when we perform
> an ls on the parent directory, the issue is resolved for the other nodes.
>
> We are running glusterfs 3.12.6 on Debian 8.
>
> [mount options and volume info quoted in the original message above]
>
> We suspected that the md-cache could be the cause, but it has a timeout
> of 600 seconds, which would be strange since the issue can be present for
> hours (at which point we did an ls to fix it).
>
> Does anyone have an idea of what could be the cause of this?

For files, it could be because of:
* cluster.lookup-optimize set to on
* the data file being present on a non-hashed subvol, but the linkto file being absent on the hashed subvol

I see that lookup-optimize is on. Can you get the following information for the problematic file?

* name of the file
* all xattrs on the parent directory, from all bricks
* stat of the file, from all bricks where it is present
* all xattrs on the file, from all bricks where it is present

If you are seeing the problem on a directory:
* name of the directory
* xattrs of the directory and its parent, from all bricks

regards,
Raghavendra

> Thanks!
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
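The data requested above can be gathered on each of the 3 nodes with a short fragment; this is a minimal sketch, assuming the brick path from the volume info earlier in the thread, with ENTRY as a placeholder for the problematic file or directory (not the real path, which the reporter has not shared):

```shell
# Brick root taken from "gluster volume info www"; ENTRY is hypothetical,
# relative to the brick root.
BRICK=/storage/gluster/www
ENTRY=path/to/dir

# All xattrs on the parent directory, hex-encoded so binary values
# (gfid, afr changelogs, etc.) print cleanly.
getfattr -d -m . -e hex "$BRICK/$(dirname "$ENTRY")"

# stat and all xattrs of the entry itself, on every brick where it exists.
stat "$BRICK/$ENTRY"
getfattr -d -m . -e hex "$BRICK/$ENTRY"
```

Running this on all three nodes and diffing the output makes missing entries or mismatched xattrs easy to spot.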
Nithya Balachandran
2018-Apr-16 09:29 UTC
[Gluster-users] Gluster FUSE mount sometimes reports that files do not exist until ls is performed on parent directory
On 16 April 2018 at 14:07, Raghavendra Gowdappa <rgowdapp at redhat.com> wrote:

> On Mon, Apr 16, 2018 at 1:54 PM, Niels Hendriks <niels at nuvini.com> wrote:
>
>> [original report quoted earlier in the thread]
>
> For files, it could be because of:
> * cluster.lookup-optimize set to on
> * the data file being present on a non-hashed subvol, but the linkto file
> being absent on the hashed subvol

This is a pure replicate volume:

Type: Replicate
Volume ID: e0579d53-f671-4868-863b-ba85c4cfacb3
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 3 = 3

So it is unlikely to be a lookup-optimize problem.

> I see that lookup-optimize is on. Can you get the following information
> for the problematic file?
>
> * name of the file
> * all xattrs on the parent directory, from all bricks
> * stat of the file, from all bricks where it is present
> * all xattrs on the file, from all bricks where it is present
>
> If you are seeing the problem on a directory:
> * name of the directory
> * xattrs of the directory and its parent, from all bricks
>
> regards,
> Raghavendra
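Since the volume is a pure replicate (no DHT linkto files to go missing), one sanity check is whether the replicas disagree about the entry; a hedged sketch, with placeholder paths, using the AFR changelog xattrs and gluster's own heal report:

```shell
# Placeholders: brick root from the volume info, ENTRY is hypothetical.
BRICK=/storage/gluster/www
ENTRY=path/to/dir

# AFR changelog xattrs on each brick; nonzero trusted.afr.* values
# indicate pending self-heal against the named replica.
getfattr -d -m 'trusted.afr' -e hex "$BRICK/$ENTRY"

# Ask gluster which entries on the www volume still need healing.
gluster volume heal www info
```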