Erik Jacobson
2020-Feb-11 00:46 UTC
[Gluster-users] question on rebalance errors gluster 7.2 (adding to distributed/replicated)
My question: Are the errors and anomalies below something I need to investigate, or should I not be worried?

I installed gluster 7.2 on a test cluster to run some tests, to see whether we can gain enough confidence to put it on the 5,120-node supercomputer in place of gluster 4.1.6.

I started with a 3x2 volume with heavy optimizations for writes and NFS (6 nodes, distribute/replicate). I booted my NFS-root clients and kept them online. I then performed an add-brick operation to make it a 3x3 instead of a 3x2 (so 9 servers instead of 6).

The rebalance went much better for me than with gluster 4.1.6. However, I saw some errors. We first noticed them here: 14 failures on leader8 and a few on the others. These are the NEW nodes, so the data flow was from the old nodes to these three, each of which reports at least one failure:

[root@leader8 glusterfs]# gluster volume rebalance cm_shared status
Node                                    Rebalanced-files  size      scanned  failures  skipped  status     run time in h:m:s
----                                    ----------------  ----      -------  --------  -------  ------     -----------------
leader1.head.cm.eag.rdlabs.hpecorp.net  18933             596.4MB   181780   0         3760     completed  0:41:39
172.23.0.4                              18960             1.2GB     181831   0         3766     completed  0:41:39
172.23.0.5                              18691             1.2GB     181826   0         3716     completed  0:41:39
172.23.0.6                              14917             618.8MB   175758   0         3869     completed  0:35:40
172.23.0.7                              15114             573.5MB   175728   0         3853     completed  0:35:41
172.23.0.8                              14864             459.2MB   175742   0         3951     completed  0:35:40
172.23.0.9                              0                 0Bytes    11       3         0        completed  0:08:26
172.23.0.11                             0                 0Bytes    242      1         0        completed  0:08:25
localhost                               0                 0Bytes    5        14        0        completed  0:08:26
volume rebalance: cm_shared: success

My rebalance log is about 32 MB, and I find it's hard for people to help me when I post that much data, so I've tried to filter it here into two classes: anomalies and errors.

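For anyone who wants a summary rather than hand-picked excerpts, here is a minimal sketch of how a log of this size could be condensed. It assumes every entry follows the `[timestamp] LEVEL [MSGID: n] [file:line:function] xlator: message` layout seen in the excerpts in this post (with the MSGID part optional); the function name `summarize` and the script name in the usage note are hypothetical, not part of gluster.

```python
import re
import sys
from collections import Counter

# Layout assumed from the log excerpts in this post; lines that do not
# match (wrapped lines, blank lines) are skipped rather than guessed at.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<level>[A-Z])\s+"
    r"(?:\[MSGID:\s*(?P<msgid>\d+)\]\s+)?"
    r"\[(?P<src>[^\]]+)\]\s+(?P<xlator>\S+):\s*(?P<msg>.*)$"
)


def summarize(lines):
    """Count log entries per (level, MSGID or '-', reporting function)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        # src looks like "dht-rebalance.c:3439:gf_defrag_process_dir"
        func = match.group("src").split(":")[-1]
        counts[(match.group("level"), match.group("msgid") or "-", func)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as handle:
        for (level, msgid, func), n in summarize(handle).most_common():
            print(f"{n:8d}  {level}  {msgid:>7}  {func}")
```

Run as, for example, `python summarize_log.py cm_shared-rebalance.log`. Grouping by function rather than by full message text keeps per-path messages (such as the layout anomalies) from producing one bucket per directory.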
Errors (14 reported on this node):

[root@leader8 glusterfs]# grep -i "error from gf_defrag_get_entry" cm_shared-rebalance.log
[2020-02-10 23:23:55.286830] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:12.903496] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:15.226948] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:15.259480] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:15.398784] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:16.633033] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:16.645847] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:21.783528] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:24:22.307464] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:25:23.391256] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:26:34.203129] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:26:39.669243] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:27:42.615081] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry
[2020-02-10 23:28:53.942158] W [dht-rebalance.c:3439:gf_defrag_process_dir] 0-cm_shared-dht: Found error from gf_defrag_get_entry

Brick log errors around 23:23:55 (to match the first error above):

[2020-02-10 23:23:54.605681] W [MSGID: 113096] [posix-handle.c:834:posix_handle_soft] 0-cm_shared-posix: symlink ../../a4/3e/a43ef7fd-08eb-434c-8168-96a92059d186/LC_MESSAGES -> /data/brick_cm_shared/.glusterfs/10/d9/10d97106-49b1-4c5e-a86f-b8e70c9ef838 failed [File exists]
[2020-02-10 23:23:54.883387] W [MSGID: 113096] [posix-handle.c:834:posix_handle_soft] 0-cm_shared-posix: symlink ../../7d/66/7d66930c-3bd0-40c8-9473-897fcd2f8c11/LC_MESSAGES -> /data/brick_cm_shared/.glusterfs/7c/41/7c412877-2443-43a8-9c7a-67ada4d96a13 failed [File exists]
[2020-02-10 23:23:55.284155] W [MSGID: 113096] [posix-handle.c:834:posix_handle_soft] 0-cm_shared-posix: symlink ../../a0/2c/a02c8b2d-f587-4c58-9de9-7928828e37e5/LC_MESSAGES -> /data/brick_cm_shared/.glusterfs/eb/79/eb79298d-a65e-41f3-a9a8-da4634879e88 failed [File exists]
[2020-02-10 23:23:55.284178] E [MSGID: 113020] [posix-entry-ops.c:835:posix_mkdir] 0-cm_shared-posix: setting gfid on /data/brick_cm_shared/image/images_ro_nfs/rhel8.0/usr/share/vim/vim80/lang/zh_CN.UTF-8/LC_MESSAGES failed [File exists]
[2020-02-10 23:23:55.284913] W [MSGID: 113103] [posix-entry-ops.c:247:posix_lookup] 0-cm_shared-posix: Found stale gfid handle /data/brick_cm_shared/.glusterfs/eb/79/eb79298d-a65e-41f3-a9a8-da4634879e88, removing it. [No such file or directory]
[2020-02-10 23:23:57.218664] W [MSGID: 113096] [posix-handle.c:834:posix_handle_soft] 0-cm_shared-posix: symlink ../../86/c2/86c2e694-d00b-4dcf-8383-60ce0cb07275/html -> /data/brick_cm_shared/.glusterfs/5c/f0/5cf0cc7d-86fe-4ba2-bea5-1d8ad3616274 failed [File exists]

Example anomalies - normal root files:

[2020-02-10 23:28:18.816012] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_dist/rhel8.0/usr/lib64/python3.6/email (gfid = 4194dca6-dcc9-409b-a162-58e90b8db63d). Holes=1 overlaps=0
[2020-02-10 23:28:18.822869] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_dist/rhel8.0/usr/lib64/python3.6/email/__pycache__ (gfid = 07e4e462-de25-4840-99dc-f4235b4b45bf). Holes=1 overlaps=0
[2020-02-10 23:28:18.834924] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_dist/rhel8.0/usr/lib64/python3.6/email/mime (gfid = f882e53c-43c6-48ea-9230-c0bc7eee901f). Holes=1 overlaps=0
...

Example anomalies - sparse files with XFS images used as node-writable space (these are just the directories that hold the sparse files, not the sparse files themselves):

[2020-02-10 23:26:07.231529] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_rw_nfs/n2521 (gfid = 3b65777c-5fc5-4213-9525-294e74a560ca). Holes=1 overlaps=0
[2020-02-10 23:26:07.237923] I [MSGID: 109063] [dht-layout.c:647:dht_layout_normalize] 0-cm_shared-dht: Found anomalies in /image/images_rw_nfs/n2521/rhel8.0-aarch64 (gfid = f822683d-7136-4d5c-8df5-94f1b84afc03).
Holes=1 overlaps=0

Volume info:

[root@leader8 glusterfs]# gluster volume status cm_shared
Status of volume: cm_shared
Gluster process                                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 172.23.0.3:/data/brick_cm_shared                      49152     0          Y       36543
Brick 172.23.0.4:/data/brick_cm_shared                      49152     0          Y       34371
Brick 172.23.0.5:/data/brick_cm_shared                      49152     0          Y       34451
Brick 172.23.0.6:/data/brick_cm_shared                      49152     0          Y       35685
Brick 172.23.0.7:/data/brick_cm_shared                      49152     0          Y       34068
Brick 172.23.0.8:/data/brick_cm_shared                      49152     0          Y       35093
Brick 172.23.0.9:/data/brick_cm_shared                      49154     0          Y       31940
Brick 172.23.0.10:/data/brick_cm_shared                     49154     0          Y       32420
Brick 172.23.0.11:/data/brick_cm_shared                     49154     0          Y       32906
Self-heal Daemon on localhost                               N/A       N/A        Y       32063
NFS Server on localhost                                     2049      0          Y       32493
Self-heal Daemon on 172.23.0.4                              N/A       N/A        Y       34435
NFS Server on 172.23.0.4                                    2049      0          Y       9636
Self-heal Daemon on 172.23.0.5                              N/A       N/A        Y       34514
NFS Server on 172.23.0.5                                    2049      0          Y       11483
Self-heal Daemon on 172.23.0.7                              N/A       N/A        Y       34131
NFS Server on 172.23.0.7                                    2049      0          Y       12294
Self-heal Daemon on 172.23.0.6                              N/A       N/A        Y       35752
NFS Server on 172.23.0.6                                    2049      0          Y       4699
Self-heal Daemon on leader1.head.cm.eag.rdlabs.hpecorp.net  N/A       N/A        Y       36626
NFS Server on leader1.head.cm.eag.rdlabs.hpecorp.net        2049      0          Y       8736
Self-heal Daemon on 172.23.0.9                              N/A       N/A        Y       31583
NFS Server on 172.23.0.9                                    2049      0          Y       31996
Self-heal Daemon on 172.23.0.11                             N/A       N/A        Y       32550
NFS Server on 172.23.0.11                                   2049      0          Y       32962
Self-heal Daemon on 172.23.0.8                              N/A       N/A        Y       35160
NFS Server on 172.23.0.8                                    2049      0          Y       2250

Task Status of Volume cm_shared
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : f42c98ad-801a-4376-94ea-7dff698f8241
Status               : completed

Commands used to grow:

ssh leader1 gluster volume add-brick cm_shared 172.23.0.9://data/brick_cm_shared 172.23.0.10://data/brick_cm_shared 172.23.0.11://data/brick_cm_shared
volume add-brick: success
ssh leader1 gluster volume rebalance cm_shared start
volume rebalance: cm_shared: success: Rebalance on cm_shared has been started successfully. Use rebalance status command to check status of the rebalance process.

All volume data/settings:

[root@leader8 glusterfs]# gluster volume get cm_shared all
Option                                          Value
------                                          -----
cluster.lookup-unhashed                         auto
cluster.lookup-optimize                         on
cluster.min-free-disk                           10%
cluster.min-free-inodes                         5%
cluster.rebalance-stats                         off
cluster.subvols-per-directory                   (null)
cluster.readdir-optimize                        off
cluster.rsync-hash-regex                        (null)
cluster.extra-hash-regex                        (null)
cluster.dht-xattr-name                          trusted.glusterfs.dht
cluster.randomize-hash-range-by-gfid            off
cluster.rebal-throttle                          normal
cluster.lock-migration                          off
cluster.force-migration                         off
cluster.local-volume-name                       (null)
cluster.weighted-rebalance                      on
cluster.switch-pattern                          (null)
cluster.entry-change-log                        on
cluster.read-subvolume                          (null)
cluster.read-subvolume-index                    -1
cluster.read-hash-mode                          1
cluster.background-self-heal-count              8
cluster.metadata-self-heal                      off
cluster.data-self-heal                          off
cluster.entry-self-heal                         off
cluster.self-heal-daemon                        on
cluster.heal-timeout                            600
cluster.self-heal-window-size                   1
cluster.data-change-log                         on
cluster.metadata-change-log                     on
cluster.data-self-heal-algorithm                (null)
cluster.eager-lock                              on
disperse.eager-lock                             on
disperse.other-eager-lock                       on
disperse.eager-lock-timeout                     1
disperse.other-eager-lock-timeout               1
cluster.quorum-type                             auto
cluster.quorum-count                            (null)
cluster.choose-local                            true
cluster.self-heal-readdir-size                  1KB
cluster.post-op-delay-secs                      1
cluster.ensure-durability                       on
cluster.consistent-metadata                     no
cluster.heal-wait-queue-length                  128
cluster.favorite-child-policy                   none
cluster.full-lock                               yes
cluster.optimistic-change-log                   on
diagnostics.latency-measurement                 off
diagnostics.dump-fd-stats                       off
diagnostics.count-fop-hits                      off
diagnostics.brick-log-level                     INFO
diagnostics.client-log-level                    INFO
diagnostics.brick-sys-log-level                 CRITICAL
diagnostics.client-sys-log-level                CRITICAL
diagnostics.brick-logger                        (null)
diagnostics.client-logger                       (null)
diagnostics.brick-log-format                    (null)
diagnostics.client-log-format                   (null)
diagnostics.brick-log-buf-size                  5
diagnostics.client-log-buf-size                 5
diagnostics.brick-log-flush-timeout             120
diagnostics.client-log-flush-timeout            120
diagnostics.stats-dump-interval                 0
diagnostics.fop-sample-interval                 0
diagnostics.stats-dump-format                   json
diagnostics.fop-sample-buf-size                 65535
diagnostics.stats-dnscache-ttl-sec              86400
performance.cache-max-file-size                 0
performance.cache-min-file-size                 0
performance.cache-refresh-timeout               60
performance.cache-priority
performance.cache-size                          8GB
performance.io-thread-count                     32
performance.high-prio-threads                   16
performance.normal-prio-threads                 16
performance.low-prio-threads                    16
performance.least-prio-threads                  1
performance.enable-least-priority               on
performance.iot-watchdog-secs                   (null)
performance.iot-cleanup-disconnected-reqs       off
performance.iot-pass-through                    false
performance.io-cache-pass-through               false
performance.cache-size                          8GB
performance.qr-cache-timeout                    1
performance.cache-invalidation                  on
performance.ctime-invalidation                  false
performance.flush-behind                        on
performance.nfs.flush-behind                    on
performance.write-behind-window-size            1024MB
performance.resync-failed-syncs-after-fsync     off
performance.nfs.write-behind-window-size        1MB
performance.strict-o-direct                     off
performance.nfs.strict-o-direct                 off
performance.strict-write-ordering               off
performance.nfs.strict-write-ordering           off
performance.write-behind-trickling-writes       off
performance.aggregate-size                      2048KB
performance.nfs.write-behind-trickling-writes   on
performance.lazy-open                           yes
performance.read-after-open                     yes
performance.open-behind-pass-through            false
performance.read-ahead-page-count               4
performance.read-ahead-pass-through             false
performance.readdir-ahead-pass-through          false
performance.md-cache-pass-through               false
performance.md-cache-timeout                    600
performance.cache-swift-metadata                true
performance.cache-samba-metadata                false
performance.cache-capability-xattrs             true
performance.cache-ima-xattrs                    true
performance.md-cache-statfs                     off
performance.xattr-cache-list
performance.nl-cache-pass-through               false
network.frame-timeout                           1800
network.ping-timeout                            42
network.tcp-window-size                         (null)
client.ssl                                      off
network.remote-dio                              disable
client.event-threads                            32
client.tcp-user-timeout                         0
client.keepalive-time                           20
client.keepalive-interval                       2
client.keepalive-count                          9
network.tcp-window-size                         (null)
network.inode-lru-limit                         1000000
auth.allow                                      *
auth.reject                                     (null)
transport.keepalive                             1
server.allow-insecure                           on
server.root-squash                              off
server.all-squash                               off
server.anonuid                                  65534
server.anongid                                  65534
server.statedump-path                           /var/run/gluster
server.outstanding-rpc-limit                    1024
server.ssl                                      off
auth.ssl-allow                                  *
server.manage-gids                              off
server.dynamic-auth                             on
client.send-gids                                on
server.gid-timeout                              300
server.own-thread                               (null)
server.event-threads                            32
server.tcp-user-timeout                         42
server.keepalive-time                           20
server.keepalive-interval                       2
server.keepalive-count                          9
transport.listen-backlog                        16384
transport.address-family                        inet
performance.write-behind                        on
performance.read-ahead                          on
performance.readdir-ahead                       on
performance.io-cache                            on
performance.open-behind                         on
performance.quick-read                          on
performance.nl-cache                            off
performance.stat-prefetch                       on
performance.client-io-threads                   on
performance.nfs.write-behind                    on
performance.nfs.read-ahead                      off
performance.nfs.io-cache                        on
performance.nfs.quick-read                      off
performance.nfs.stat-prefetch                   off
performance.nfs.io-threads                      off
performance.force-readdirp                      true
performance.cache-invalidation                  on
performance.global-cache-invalidation           true
features.uss                                    off
features.snapshot-directory                     .snaps
features.show-snapshot-directory                off
features.tag-namespaces                         off
network.compression                             off
network.compression.window-size                 -15
network.compression.mem-level                   8
network.compression.min-size                    0
network.compression.compression-level           -1
network.compression.debug                       false
features.default-soft-limit                     80%
features.soft-timeout                           60
features.hard-timeout                           5
features.alert-time                             86400
features.quota-deem-statfs                      off
geo-replication.indexing                        off
geo-replication.indexing                        off
geo-replication.ignore-pid-check                off
geo-replication.ignore-pid-check                off
features.quota                                  off
features.inode-quota                            off
features.bitrot                                 disable
debug.trace                                     off
debug.log-history                               no
debug.log-file                                  no
debug.exclude-ops                               (null)
debug.include-ops                               (null)
debug.error-gen                                 off
debug.error-failure                             (null)
debug.error-number                              (null)
debug.random-failure                            off
debug.error-fops                                (null)
nfs.enable-ino32                                no
nfs.mem-factor                                  15
nfs.export-dirs                                 on
nfs.export-volumes                              on
nfs.addr-namelookup                             off
nfs.dynamic-volumes                             off
nfs.register-with-portmap                       on
nfs.outstanding-rpc-limit                       1024
nfs.port                                        2049
nfs.rpc-auth-unix                               on
nfs.rpc-auth-null                               on
nfs.rpc-auth-allow                              all
nfs.rpc-auth-reject                             none
nfs.ports-insecure                              off
nfs.trusted-sync                                off
nfs.trusted-write                               off
nfs.volume-access                               read-write
nfs.export-dir
nfs.disable                                     off
nfs.nlm                                         off
nfs.acl                                         on
nfs.mount-udp                                   off
nfs.mount-rmtab                                 /-
nfs.rpc-statd                                   /sbin/rpc.statd
nfs.server-aux-gids                             off
nfs.drc                                         off
nfs.drc-size                                    0x20000
nfs.read-size                                   (1 * 1048576ULL)
nfs.write-size                                  (1 * 1048576ULL)
nfs.readdir-size                                (1 * 1048576ULL)
nfs.rdirplus                                    on
nfs.event-threads                               2
nfs.exports-auth-enable                         on
nfs.auth-refresh-interval-sec                   360
nfs.auth-cache-ttl-sec                          360
features.read-only                              off
features.worm                                   off
features.worm-file-level                        off
features.worm-files-deletable                   on
features.default-retention-period               120
features.retention-mode                         relax
features.auto-commit-period                     180
storage.linux-aio                               off
storage.batch-fsync-mode                        reverse-fsync
storage.batch-fsync-delay-usec                  0
storage.owner-uid                               -1
storage.owner-gid                               -1
storage.node-uuid-pathinfo                      off
storage.health-check-interval                   30
storage.build-pgfid                             off
storage.gfid2path                               on
storage.gfid2path-separator                     :
storage.reserve                                 1
storage.reserve-size                            0
storage.health-check-timeout                    10
storage.fips-mode-rchecksum                     on
storage.force-create-mode                       0000
storage.force-directory-mode                    0000
storage.create-mask                             0777
storage.create-directory-mask                   0777
storage.max-hardlinks                           0
features.ctime                                  on
config.gfproxyd                                 off
cluster.server-quorum-type                      off
cluster.server-quorum-ratio                     51
changelog.changelog                             off
changelog.changelog-dir                         {{ brick.path }}/.glusterfs/changelogs
changelog.encoding                              ascii
changelog.rollover-time                         15
changelog.fsync-interval                        5
changelog.changelog-barrier-timeout             120
changelog.capture-del-path                      off
features.barrier                                disable
features.barrier-timeout                        120
features.trash                                  off
features.trash-dir                              .trashcan
features.trash-eliminate-path                   (null)
features.trash-max-filesize                     5MB
features.trash-internal-op                      off
cluster.enable-shared-storage                   disable
locks.trace                                     off
locks.mandatory-locking                         off
cluster.disperse-self-heal-daemon               enable
cluster.quorum-reads                            no
client.bind-insecure                            (null)
features.shard                                  off
features.shard-block-size                       64MB
features.shard-lru-limit                        16384
features.shard-deletion-rate                    100
features.scrub-throttle                         lazy
features.scrub-freq                             biweekly
features.scrub                                  false
features.expiry-time                            120
features.cache-invalidation                     on
features.cache-invalidation-timeout             600
features.leases                                 off
features.lease-lock-recall-timeout              60
disperse.background-heals                       8
disperse.heal-wait-qlength                      128
cluster.heal-timeout                            600
dht.force-readdirp                              on
disperse.read-policy                            gfid-hash
cluster.shd-max-threads                         1
cluster.shd-wait-qlength                        1024
cluster.locking-scheme                          full
cluster.granular-entry-heal                     no
features.locks-revocation-secs                  0
features.locks-revocation-clear-all             false
features.locks-revocation-max-blocked           0
features.locks-monkey-unlocking                 false
features.locks-notify-contention                no
features.locks-notify-contention-delay          5
disperse.shd-max-threads                        1
disperse.shd-wait-qlength                       1024
disperse.cpu-extensions                         auto
disperse.self-heal-window-size                  1
cluster.use-compound-fops                       off
performance.parallel-readdir                    on
performance.rda-request-size                    131072
performance.rda-low-wmark                       4096
performance.rda-high-wmark                      128KB
performance.rda-cache-limit                     10MB
performance.nl-cache-positive-entry             false
performance.nl-cache-limit                      10MB
performance.nl-cache-timeout                    60
cluster.brick-multiplex                         disable
glusterd.vol_count_per_thread                   100
cluster.max-bricks-per-process                  250
disperse.optimistic-change-log                  on
disperse.stripe-cache                           4
cluster.halo-enabled                            False
cluster.halo-shd-max-latency                    99999
cluster.halo-nfsd-max-latency                   5
cluster.halo-max-latency                        5
cluster.halo-max-replicas                       99999
cluster.halo-min-replicas                       2
features.selinux                                on
cluster.daemon-log-level                        INFO
debug.delay-gen                                 off
delay-gen.delay-percentage                      10%
delay-gen.delay-duration                        100000
delay-gen.enable
disperse.parallel-writes                        on
features.sdfs                                   off
features.cloudsync                              off
features.ctime                                  on
ctime.noatime                                   on
features.cloudsync-storetype                    (null)
features.enforce-mandatory-lock                 off
config.global-threading                         off
config.client-threads                           16
config.brick-threads                            16
features.cloudsync-remote-read                  off
features.cloudsync-store-id                     (null)
features.cloudsync-product-id                   (null)
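
The brick-log excerpt earlier in this message was lined up by hand with the first rebalance warning at 23:23:55. A minimal sketch of how that pairing could be automated for all 14 warnings follows. It assumes both logs use the `[YYYY-MM-DD HH:MM:SS.ffffff] LEVEL` prefix seen above; the function names `parse` and `brick_entries_near` and the one-second default window are hypothetical choices for illustration, not anything gluster provides.

```python
import re
from datetime import datetime, timedelta

# Timestamp and level prefix as seen in the excerpts in this post.
TS_RE = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\]\s+([A-Z])\s"
)


def parse(line):
    """Return (timestamp, level) for a log line, or None if it has no prefix."""
    match = TS_RE.match(line)
    if match is None:
        return None
    stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
    return stamp, match.group(2)


def brick_entries_near(rebalance_lines, brick_lines, needle,
                       window_s=1.0, levels=("W", "E")):
    """Pair each rebalance line containing `needle` with nearby brick entries.

    Returns a list of (rebalance timestamp, [brick lines within window_s
    seconds before or after it]), keeping only the given log levels.
    """
    window = timedelta(seconds=window_s)
    brick = []
    for line in brick_lines:
        parsed = parse(line)
        if parsed is not None and parsed[1] in levels:
            brick.append((parsed[0], line.rstrip("\n")))

    pairs = []
    for line in rebalance_lines:
        if needle not in line:
            continue
        parsed = parse(line)
        if parsed is None:
            continue
        stamp = parsed[0]
        near = [text for b_stamp, text in brick if abs(b_stamp - stamp) <= window]
        pairs.append((stamp, near))
    return pairs
```

Called with the open rebalance log, the open brick log, and `"gf_defrag_get_entry"` as the needle, this would list the brick-side warnings and errors around each reported failure. Time proximity is only a heuristic here, so a match does not prove the brick entry caused the rebalance warning.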