Morus Walter
2021-Sep-23 06:43 UTC
[Gluster-users] files missing, as they seem to be searched on the wrong subvolume
Hi,

we are in the process of switching from a setup with three subvolumes (each having two bricks and an arbiter) to a setup with two (new, larger) subvolumes, and ran into an issue where gluster no longer finds a number of files. The files are there when doing an ls, but cannot be accessed directly before that. Once the ls is done, they are accessible for some time. So far this seems to affect only old files (files that were already there before the volume changes started).

Our current understanding of the issue is that the distributed hash table translator assumes the file to be on the wrong subvolume and therefore does not find it. The directory lookup has to look into all subvolumes and therefore finds it. That result is somehow cached on the client and makes the file accessible as well for some time.

I'm not fully aware of the order of changes made to the volume (it was done by different people and one of them is on vacation at the moment), but I think it was something like this:

- we added a first new subvolume
- we started and committed the removal of the first two old subvolumes;
  for the removal gluster reported a number of errors and we did it
  forcefully anyway, as we did not find further info on the errors and
  how to get rid of them - probably something we should not have done
- we added the second new subvolume
- we started the removal of the last old subvolume

The last removal has not been committed yet.

The glusterfs version is 9.3 on Ubuntu 18.04.6.

We googled the issue; one thing we found was
https://github.com/gluster/glusterfs/issues/843
It suggests trying 'gluster volume set parallel-readdir disable', but that did not change the situation (which makes sense, since we do not have issues with readdir but with file access).

We looked into the logs but did not gain further insight.

Questions:
Is there any way to fix this with gluster commands/tools?
Would another rebalance (beyond the one that happened during the removal) likely fix the situation?

We figured out a way to fix the situation file by file by either copying or hardlinking the files into new folders and removing the old ones (for a folder foo, one would
 * create a folder foo_new,
 * hardlink all files from foo into foo_new,
 * rename foo to foo_old,
 * rename foo_new to foo,
 * delete foo_old;
a shell sketch of this is given below).

Any help appreciated.
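For illustration, one way to check the DHT theory above is to look at an affected file directly on a brick; the path below is only an example, and the xattr shown is the standard DHT link-to marker, so treat this as a sketch:

    # run on a brick server; a zero-byte entry with mode ---------T is a DHT link-to pointer
    ls -l /data/gluster/path/to/affected/file
    getfattr -n trusted.glusterfs.dht.linkto -e text /data/gluster/path/to/affected/file

And a minimal shell sketch of the per-folder workaround described above (the folder name foo is just the example from the text; this only covers plain files directly inside foo, not subdirectories):

    dir=foo
    mkdir "${dir}_new"
    ln "${dir}"/* "${dir}_new"/    # hard links, so no data is copied
    mv "${dir}" "${dir}_old"
    mv "${dir}_new" "${dir}"
    rm -rf "${dir}_old"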
best
  Morus

PS:

gluster volume info

Volume Name: webgate
Type: Distributed-Replicate
Volume ID: 383cf25e-f76c-4921-8d64-8bc41c908d57
Status: Started
Snapshot Count: 0
Number of Bricks: 3 x (2 + 1) = 9
Transport-type: tcp
Bricks:
Brick1: budgie-brick1.arriwebgate.com:/data/gluster
Brick2: budgie-brick2.arriwebgate.com:/data/gluster
Brick3: budgie-arbiter.arriwebgate.com:/data/gluster (arbiter)
Brick4: parrot-brick1.arriwebgate.com:/data/gluster
Brick5: parrot-brick2.arriwebgate.com:/data/gluster
Brick6: parrot-arbiter.arriwebgate.com:/data/gluster (arbiter)
Brick7: kiwi-brick1.arriwebgate.com:/data/gluster
Brick8: kiwi-brick2.arriwebgate.com:/data/gluster
Brick9: kiwi-arbiter.arriwebgate.com:/data/gluster (arbiter)
Options Reconfigured:
performance.write-behind-window-size: 1MB
storage.fips-mode-rchecksum: on
changelog.changelog: on
geo-replication.ignore-pid-check: on
geo-replication.indexing: on
cluster.rebal-throttle: lazy
transport.address-family: inet
nfs.disable: on
performance.client-io-threads: off
performance.cache-size: 4GB
performance.io-thread-count: 16
performance.readdir-ahead: on
client.event-threads: 8
server.event-threads: 8
config.transport: tcp
performance.read-ahead: off
features.cache-invalidation: on
features.cache-invalidation-timeout: 600
performance.stat-prefetch: on
performance.cache-invalidation: on
performance.md-cache-timeout: 600
network.inode-lru-limit: 1000000
performance.parallel-readdir: on
storage.owner-uid: 1000
storage.owner-gid: 1000
cluster.background-self-heal-count: 64
cluster.shd-max-threads: 4

gluster volume heal webgate info

Brick budgie-brick1.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

Brick budgie-brick2.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

Brick budgie-arbiter.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

Brick parrot-brick1.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

Brick parrot-brick2.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

Brick parrot-arbiter.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

Brick kiwi-brick1.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

Brick kiwi-brick2.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

Brick kiwi-arbiter.arriwebgate.com:/data/gluster
Status: Connected
Number of entries: 0

gluster volume status webgate

Status of volume: webgate
Gluster process                                       TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick budgie-brick1.arriwebgate.com:/data/gluster     49152     0          Y       854
Brick budgie-brick2.arriwebgate.com:/data/gluster     49152     0          Y       888
Brick budgie-arbiter.arriwebgate.com:/data/gluster    49152     0          Y       857
Brick parrot-brick1.arriwebgate.com:/data/gluster     49152     0          Y       1889
Brick parrot-brick2.arriwebgate.com:/data/gluster     49152     0          Y       2505
Brick parrot-arbiter.arriwebgate.com:/data/gluster    49152     0          Y       1439
Brick kiwi-brick1.arriwebgate.com:/data/gluster       49152     0          Y       24941
Brick kiwi-brick2.arriwebgate.com:/data/gluster       49152     0          Y       31448
Brick kiwi-arbiter.arriwebgate.com:/data/gluster      49152     0          Y       5483
Self-heal Daemon on localhost                         N/A       N/A        Y       24700
Self-heal Daemon on budgie-brick1.arriwebgate.com     N/A       N/A        Y       960
Self-heal Daemon on budgie-brick2.arriwebgate.com     N/A       N/A        Y       974
Self-heal Daemon on parrot-brick1.arriwebgate.com     N/A       N/A        Y       1811
Self-heal Daemon on parrot-brick2.arriwebgate.com     N/A       N/A        Y       2543
Self-heal Daemon on kiwi-brick2.arriwebgate.com       N/A       N/A        Y       31207
Self-heal Daemon on kiwi-arbiter.arriwebgate.com      N/A       N/A        Y       5229
Self-heal Daemon on parrot-arbiter.arriwebgate.com    N/A       N/A        Y       1466
Self-heal Daemon on budgie-arbiter.arriwebgate.com    N/A       N/A        Y       984

Task Status of Volume webgate
------------------------------------------------------------------------------
Task                 : Remove brick
ID                   : 6f67e1d4-23f4-46ca-a97a-2adb152ef294
Removed bricks:
budgie-brick1.arriwebgate.com:/data/gluster
budgie-brick2.arriwebgate.com:/data/gluster
budgie-arbiter.arriwebgate.com:/data/gluster
Status               : completed
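On the rebalance question in the message above, the relevant CLI looks roughly like the following (volume name webgate from the info above). This is only a sketch of the commands, not a recommendation; the pending remove-brick task would presumably have to be committed or stopped before a rebalance can be started:

    gluster volume rebalance webgate fix-layout start   # recalculate directory layouts only
    gluster volume rebalance webgate start               # full rebalance, also migrates files
    gluster volume rebalance webgate status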
Strahil Nikolov
2021-Sep-27 12:56 UTC
[Gluster-users] files missing, as they seem to be searched on the wrong subvolume
Have you tried with 'performance.stat-prefetch' or 'performance.parallel-readdir' set to disabled?

Best Regards,
Strahil Nikolov
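For concreteness, applying that suggestion to this volume would look roughly like this (volume name webgate taken from the volume info above; both options appear in its reconfigured options list):

    gluster volume set webgate performance.stat-prefetch off
    gluster volume set webgate performance.parallel-readdir off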