Jeff Byers
2015-Dec-15 01:10 UTC
[Gluster-users] No longer possible to "recreate" a GlusterFS "Distributed-Replicate" volume from its bricks?
It seems that it is no longer possible to stop and delete a GlusterFS volume, and then clean and recreate it from the component bricks, when the volume is "Distributed-Replicate". The same steps succeed, with the correct data, for a simple "Replicate" or "Distribute" volume. In the failing "Distributed-Replicate" case, the "rebalance" always fails after the recreate. The "heal" command succeeds, and "heal info split-brain" shows no entries, but about half of the files are missing and there are many "(Possible split-brain)" warnings in the logfile.

The gluster "Distributed-Replicate" volume "recreate" procedure works fine in GlusterFS versions 3.2.7 and 3.4.2, but not in glusterfs 3.6.5, 3.6.7, or 3.7.6. Perhaps the "recreate" procedure has changed, or I'm doing something wrong that only matters in the newer GlusterFS versions. Details below. Any ideas how to make it work again?

Thanks.

~ Jeff Byers ~

# glusterd -V
glusterfs 3.7.6 built on Dec 14 2015 07:05:12

################################################
# Failing "Distributed-Replicate" recreate case.
################################################

# mountpoint /exports/test-dir/
/exports/test-dir/ is a mountpoint
# mount | grep test-dir
/dev/sdu on /exports/test-dir type xfs (rw,noatime,nodiratime,barrier,nouuid,inode64,logbufs=8,logbsize=256k)
# mkdir /exports/test-dir/test-brick-1a
# mkdir /exports/test-dir/test-brick-1b
# mkdir /exports/test-dir/test-brick-2a
# mkdir /exports/test-dir/test-brick-2b
# gluster volume create test-replica-dist replica 2 transport tcp 10.10.60.169:/exports/test-dir/test-brick-1a 10.10.60.169:/exports/test-dir/test-brick-2a 10.10.60.169:/exports/test-dir/test-brick-1b 10.10.60.169:/exports/test-dir/test-brick-2b force
volume create: test-replica-dist: success: please start the volume to access data
# gluster volume start test-replica-dist
volume start: test-replica-dist: success
# gluster volume info test-replica-dist
Volume Name: test-replica-dist
Type: Distributed-Replicate
Volume ID: c8de4e65-2304-4801-a244-6511f39fc0c9
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.10.60.169:/exports/test-dir/test-brick-1a
Brick2: 10.10.60.169:/exports/test-dir/test-brick-2a
Brick3: 10.10.60.169:/exports/test-dir/test-brick-1b
Brick4: 10.10.60.169:/exports/test-dir/test-brick-2b
Options Reconfigured:
snap-activate-on-create: enable
# mkdir /mnt/test-replica-dist
# mount -t glusterfs -o acl,log-level=WARNING 127.0.0.1:/test-replica-dist /mnt/test-replica-dist/
# cp -rf /lib64/ /mnt/test-replica-dist/
# diff -r /lib64/ /mnt/test-replica-dist/lib64/
# umount /mnt/test-replica-dist
# gluster volume stop test-replica-dist
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: test-replica-dist: success
# gluster volume delete test-replica-dist
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: test-replica-dist: success
# gluster_clear_xattrs.sh /exports/test-dir/test-brick-1a
removing all .glusterfs directories in progress: /exports/test-dir/test-brick-1a
xattr clean-up in progress: /exports/test-dir/test-brick-1a
/exports/test-dir/test-brick-1a ready to be used as a glusterfs brick
# gluster_clear_xattrs.sh /exports/test-dir/test-brick-1b
removing all .glusterfs directories in progress: /exports/test-dir/test-brick-1b
xattr clean-up in progress: /exports/test-dir/test-brick-1b
/exports/test-dir/test-brick-1b ready to be used as a glusterfs brick
# gluster_clear_xattrs.sh /exports/test-dir/test-brick-2a
removing all .glusterfs directories in progress: /exports/test-dir/test-brick-2a
xattr clean-up in progress: /exports/test-dir/test-brick-2a
/exports/test-dir/test-brick-2a ready to be used as a glusterfs brick
# gluster_clear_xattrs.sh /exports/test-dir/test-brick-2b
removing all .glusterfs directories in progress: /exports/test-dir/test-brick-2b
xattr clean-up in progress: /exports/test-dir/test-brick-2b
/exports/test-dir/test-brick-2b ready to be used as a glusterfs brick
# gluster volume create test-replica-dist replica 2 transport tcp 10.10.60.169:/exports/test-dir/test-brick-1a 10.10.60.169:/exports/test-dir/test-brick-2a 10.10.60.169:/exports/test-dir/test-brick-1b 10.10.60.169:/exports/test-dir/test-brick-2b force
volume create: test-replica-dist: success: please start the volume to access data
# gluster volume start test-replica-dist
volume start: test-replica-dist: success
# mount -t glusterfs -o acl,log-level=WARNING 127.0.0.1:/test-replica-dist /mnt/test-replica-dist/
# diff -r /lib64/ /mnt/test-replica-dist/lib64/
Only in /lib64/device-mapper: libdevmapper-event-lvm2thin.so
Only in /lib64/multipath: libcheckcciss_tur.so
Only in /lib64/multipath: libcheckemc_clariion.so
Only in /lib64/multipath: libcheckhp_sw.so
Only in /lib64/multipath: libprioconst.so
Only in /lib64/multipath: libpriordac.so
Only in /lib64/multipath: libprioweighted.so
Only in /lib64/rtkaio: librtkaio-2.12.so
Only in /lib64/rtkaio: librt.so.1
Only in /lib64/xtables: libip6t_ah.so
Only in /lib64/xtables: libip6t_dst.so
Only in /lib64/xtables: libip6t_eui64.so
Only in /lib64/xtables: libip6t_frag.so
Only in /lib64/xtables: libip6t_HL.so
Only in /lib64/xtables: libip6t_icmp6.so
Only in /lib64/xtables: libip6t_LOG.so
Only in /lib64/xtables: libip6t_mh.so
Only in /lib64/xtables: libip6t_REJECT.so
Only in /lib64/xtables: libip6t_set.so
Only in /lib64/xtables: libipt_ah.so
Only in /lib64/xtables: libipt_ecn.so
Only in /lib64/xtables: libipt_ECN.so
Only in /lib64/xtables: libipt_icmp.so
Only in /lib64/xtables: libipt_MIRROR.so
Only in /lib64/xtables: libipt_realm.so
Only in /lib64/xtables: libipt_REDIRECT.so
Only in /lib64/xtables: libipt_REJECT.so
Only in /lib64/xtables: libipt_SAME.so
Only in /lib64/xtables: libipt_SET.so
Only in /lib64/xtables: libipt_SNAT.so
Only in /lib64/xtables: libipt_ttl.so
Only in /lib64/xtables: libipt_ULOG.so
Only in /lib64/xtables: libipt_unclean.so
Only in /lib64/xtables: libxt_CHECKSUM.so
Only in /lib64/xtables: libxt_cluster.so
Only in /lib64/xtables: libxt_connbytes.so
Only in /lib64/xtables: libxt_connlimit.so
Only in /lib64/xtables: libxt_CONNMARK.so
Only in /lib64/xtables: libxt_CONNSECMARK.so
Only in /lib64/xtables: libxt_conntrack.so
Only in /lib64/xtables: libxt_dccp.so
Only in /lib64/xtables: libxt_dscp.so
Only in /lib64/xtables: libxt_iprange.so
Only in /lib64/xtables: libxt_length.so
Only in /lib64/xtables: libxt_limit.so
Only in /lib64/xtables: libxt_mac.so
Only in /lib64/xtables: libxt_multiport.so
Only in /lib64/xtables: libxt_osf.so
Only in /lib64/xtables: libxt_physdev.so
Only in /lib64/xtables: libxt_pkttype.so
Only in /lib64/xtables: libxt_policy.so
Only in /lib64/xtables: libxt_quota.so
Only in /lib64/xtables: libxt_RATEEST.so
Only in /lib64/xtables: libxt_sctp.so
Only in /lib64/xtables: libxt_SECMARK.so
Only in /lib64/xtables: libxt_socket.so
Only in /lib64/xtables: libxt_statistic.so
Only in /lib64/xtables: libxt_string.so
Only in /lib64/xtables: libxt_tcpmss.so
Only in /lib64/xtables: libxt_TCPOPTSTRIP.so
Only in /lib64/xtables: libxt_time.so
Only in /lib64/xtables: libxt_TOS.so
Only in /lib64/xtables: libxt_TPROXY.so
Only in /lib64/xtables: libxt_TRACE.so
Only in /lib64/xtables: libxt_udp.so
# gluster volume rebalance test-replica-dist start
volume rebalance: test-replica-dist: success: Rebalance on test-replica-dist has been started successfully.
Use rebalance status command to check status of the rebalance process. ID: ccf76757-c3df-4ae2-af2d-b82f8283d821
# gluster volume rebalance test-replica-dist status
     Node   Rebalanced-files     size   scanned   failures   skipped   status   run time in secs
---------   ----------------   ------   -------   --------   -------   ------   ----------------
localhost                  0   0Bytes       249          3         0   failed               0.00
volume rebalance: test-replica-dist: success
# gluster volume heal test-replica-dist full
Launching heal operation to perform full self heal on volume test-replica-dist has been successful
Use heal info commands to check status
# gluster volume heal test-replica-dist info split-brain
Brick 10.10.60.169:/exports/test-dir/test-brick-1a
Number of entries in split-brain: 0
Brick 10.10.60.169:/exports/test-dir/test-brick-2a
Number of entries in split-brain: 0
Brick 10.10.60.169:/exports/test-dir/test-brick-1b
Number of entries in split-brain: 0
Brick 10.10.60.169:/exports/test-dir/test-brick-2b
Number of entries in split-brain: 0
# diff -r /lib64/ /mnt/test-replica-dist/lib64/
Only in /lib64/device-mapper: libdevmapper-event-lvm2thin.so
Only in /lib64/multipath: libcheckcciss_tur.so
Only in /lib64/multipath: libcheckemc_clariion.so
...
Only in /lib64/xtables: libxt_TRACE.so
Only in /lib64/xtables: libxt_udp.so
# view /var/log/glusterfs/test-replica-dist-rebalance.log
[2015-12-14 23:06:25.432546] E [dht-rebalance.c:2949:gf_defrag_fix_layout] 0-test-replica-dist-dht: //.trashcan gfid not present
[2015-12-14 23:06:25.433196] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-test-replica-dist-dht: fixing the layout of /lib64
[2015-12-14 23:06:25.433217] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-test-replica-dist-dht: subvolume 0 (test-replica-dist-replicate-0): 1014 chunks
[2015-12-14 23:06:25.433228] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-test-replica-dist-dht: subvolume 1 (test-replica-dist-replicate-1): 1014 chunks
[2015-12-14 23:06:25.434564] I [dht-rebalance.c:2446:gf_defrag_process_dir] 0-test-replica-dist-dht: migrate data called on /lib64
[2015-12-14 23:06:25.562584] I [dht-rebalance.c:2656:gf_defrag_process_dir] 0-test-replica-dist-dht: Migration operation on dir /lib64 took 0.13 secs
[2015-12-14 23:06:25.568177] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: /lib64/xtables (0e2a4e5c-7c8b-4f8b-979a-9dbd73de6ecc) [No data available]
[2015-12-14 23:06:25.568206] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: /lib64/xtables (0e2a4e5c-7c8b-4f8b-979a-9dbd73de6ecc) [No data available]
[2015-12-14 23:06:25.568557] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:06:25.568581] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:06:25.569942] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:06:25.569964] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:06:25.570117] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-test-replica-dist-dht: Found anomalies in /lib64/xtables (gfid = 0e2a4e5c-7c8b-4f8b-979a-9dbd73de6ecc). Holes=1 overlaps=0
[2015-12-14 23:06:25.570142] W [MSGID: 109005] [dht-selfheal.c:1805:dht_selfheal_directory] 0-test-replica-dist-dht: Directory selfheal failed : 1 subvolumes have unrecoverable errors. path /lib64/xtables, gfid = 0e2a4e5c-7c8b-4f8b-979a-9dbd73de6ecc
[2015-12-14 23:06:25.570179] I [MSGID: 109081] [dht-common.c:3810:dht_setxattr] 0-test-replica-dist-dht: fixing the layout of /lib64/xtables
[2015-12-14 23:06:25.570206] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-test-replica-dist-dht: subvolume 0 (test-replica-dist-replicate-0): 1014 chunks
[2015-12-14 23:06:25.570219] I [MSGID: 109045] [dht-selfheal.c:1509:dht_fix_layout_of_directory] 0-test-replica-dist-dht: subvolume 1 (test-replica-dist-replicate-1): 1014 chunks
[2015-12-14 23:06:25.570889] E [dht-rebalance.c:2992:gf_defrag_fix_layout] 0-test-replica-dist-dht: Setxattr failed for /lib64/xtables
[2015-12-14 23:06:25.571049] E [MSGID: 109016] [dht-rebalance.c:3006:gf_defrag_fix_layout] 0-test-replica-dist-dht: Fix layout failed for /lib64
[2015-12-14 23:06:25.571182] I [dht-rebalance.c:2085:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 5
[2015-12-14 23:06:25.571255] I [dht-rebalance.c:2085:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 6
[2015-12-14 23:06:25.571281] I [dht-rebalance.c:2085:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 7
[2015-12-14 23:06:25.571302] I [dht-rebalance.c:2085:gf_defrag_task] 0-DHT: Thread wokeup. defrag->current_thread_count: 8
[2015-12-14 23:06:25.571627] I [MSGID: 109028] [dht-rebalance.c:3485:gf_defrag_status_get] 0-test-replica-dist-dht: Rebalance is failed. Time taken is 0.00 secs
[2015-12-14 23:06:25.571647] I [MSGID: 109028] [dht-rebalance.c:3489:gf_defrag_status_get] 0-test-replica-dist-dht: Files migrated: 0, size: 0, lookups: 249, failures: 3, skipped: 0
# view /var/log/glusterfs/mnt-test-replica-dist-.log
[2015-12-14 23:05:03.797439] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.800523] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: /lib64/xtables-1.4.7/libxt_TPROXY.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.800626] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: /lib64/xtables-1.4.7/libxt_TPROXY.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.802003] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.802100] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.804886] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: /lib64/xtables-1.4.7/libxt_TRACE.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.804989] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: /lib64/xtables-1.4.7/libxt_TRACE.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.806342] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.806477] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.809396] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-0: remote operation failed. Path: /lib64/xtables-1.4.7/libxt_u32.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.809430] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-1: remote operation failed. Path: /lib64/xtables-1.4.7/libxt_u32.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.810841] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.810905] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.813705] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: /lib64/xtables-1.4.7/libxt_udp.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.813824] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: /lib64/xtables-1.4.7/libxt_udp.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.815201] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.815295] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.818854] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-0: remote operation failed. Path: /lib64/ZIPScanner.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.818895] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-1: remote operation failed. Path: /lib64/ZIPScanner.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.820314] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.820342] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.821992] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.822001] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.822207] W [fuse-resolve.c:65:fuse_resolve_entry_cbk] 0-fuse: 056bce17-a4c2-4e13-a352-40f783c4804a/ZIPScanner.so: failed to resolve (No data available)
[2015-12-14 23:05:03.822574] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-0: remote operation failed. Path: /lib64/ZIPScanner.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.822600] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-1: remote operation failed. Path: /lib64/ZIPScanner.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.824000] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-0: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:03.824076] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-1: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:01.386125] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: /lib64 (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:01.386151] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: /lib64 (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:05:01.411283] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: /lib64/CFBScanner.so (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:07:38.979973] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-test-replica-dist-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid 8a2836d2-dd2c-46dc-9a1e-437c5444a704. (Possible split-brain)
[2015-12-14 23:07:38.983422] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-test-replica-dist-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid 43c8e284-1cd0-48a8-b8e5-c075092eeaa7. (Possible split-brain)
[2015-12-14 23:07:39.031276] W [MSGID: 109011] [dht-layout.c:191:dht_layout_search] 0-test-replica-dist-dht: no subvolume for hash (value) = 827357797
[2015-12-14 23:07:39.031587] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 3874: LOOKUP() /lib64/device-mapper/libdevmapper-event-lvm2thin.so => -1 (Stale file handle)
[2015-12-14 23:07:39.032043] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: /lib64/device-mapper (43c8e284-1cd0-48a8-b8e5-c075092eeaa7) [No data available]
[2015-12-14 23:07:39.032090] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: /lib64/device-mapper (43c8e284-1cd0-48a8-b8e5-c075092eeaa7) [No data available]
[2015-12-14 23:07:39.033510] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:07:39.033523] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:07:39.034759] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-2: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:07:39.034790] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-test-replica-dist-client-3: remote operation failed. Path: (null) (00000000-0000-0000-0000-000000000000) [No data available]
[2015-12-14 23:07:39.035335] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 3876: LOOKUP() /lib64/device-mapper/libdevmapper-event-lvm2thin.so => -1 (Stale file handle)
[2015-12-14 23:07:39.037553] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 3881: LOOKUP() /lib64/device-mapper/libdevmapper-event-lvm2thin.so => -1 (Stale file handle)
[2015-12-14 23:07:39.037976] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 3883: LOOKUP() /lib64/device-mapper/libdevmapper-event-lvm2thin.so => -1 (Stale file handle)
The message "W [MSGID: 109011] [dht-layout.c:191:dht_layout_search] 0-test-replica-dist-dht: no subvolume for hash (value) = 827357797" repeated 3 times between [2015-12-14 23:07:39.031276] and [2015-12-14 23:07:39.037686]
[2015-12-14 23:07:39.190729] W [MSGID: 109011] [dht-layout.c:191:dht_layout_search] 0-test-replica-dist-dht: no subvolume for hash (value) = 3553706931
[2015-12-14 23:07:39.190781] W [MSGID: 109011] [dht-layout.c:191:dht_layout_search] 0-test-replica-dist-dht: no subvolume for hash (value) = 3781782680
[2015-12-14 23:07:39.190990] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-test-replica-dist-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid eeab01c5-af5f-49f8-bd06-01471f405c84. (Possible split-brain)
[2015-12-14 23:07:39.214980] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-test-replica-dist-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid fe99280a-10f4-4e2f-8483-99d63461fa9e. (Possible split-brain)
[2015-12-14 23:07:39.227809] W [MSGID: 109011] [dht-layout.c:191:dht_layout_search] 0-test-replica-dist-dht: no subvolume for hash (value) = 3553706931
[2015-12-14 23:07:39.227837] W [MSGID: 109011] [dht-layout.c:191:dht_layout_search] 0-test-replica-dist-dht: no subvolume for hash (value) = 3781782680
[2015-12-14 23:07:39.228015] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-test-replica-dist-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid 862461d1-38ef-4f3a-8216-c9d9dedde1af. (Possible split-brain)
[2015-12-14 23:07:39.264828] W [MSGID: 109011] [dht-layout.c:191:dht_layout_search] 0-test-replica-dist-dht: no subvolume for hash (value) = 3553706931
[2015-12-14 23:07:39.264862] W [MSGID: 109011] [dht-layout.c:191:dht_layout_search] 0-test-replica-dist-dht: no subvolume for hash (value) = 3781782680
[2015-12-14 23:07:39.266617] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-test-replica-dist-replicate-1: Unreadable subvolume -1 found with event generation 2 for gfid 0e2a4e5c-7c8b-4f8b-979a-9dbd73de6ecc. (Possible split-brain)

##########################################
# Successful "Distributed" recreate case.
##########################################

# mkdir /exports/test-dir/test-brick-1
# mkdir /exports/test-dir/test-brick-2
# gluster volume create test-dist transport tcp 10.10.60.169:/exports/test-dir/test-brick-1 10.10.60.169:/exports/test-dir/test-brick-2
volume create: test-dist: success: please start the volume to access data
# gluster volume start test-dist
volume start: test-dist: success
# gluster volume info test-dist
Volume Name: test-dist
Type: Distribute
Volume ID: 385a8546-1776-45be-8ae4-cd94ed37f2a5
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: 10.10.60.169:/exports/test-dir/test-brick-1
Brick2: 10.10.60.169:/exports/test-dir/test-brick-2
Options Reconfigured:
snap-activate-on-create: enable
# mkdir /mnt/test-dist
# mount -t glusterfs -o acl,log-level=WARNING 127.0.0.1:/test-dist /mnt/test-dist/
# cp -rf /lib64/ /mnt/test-dist/
# diff -r /lib64/ /mnt/test-dist/lib64/
# umount /mnt/test-dist/
# gluster volume stop test-dist
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: test-dist: success
# gluster volume delete test-dist
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: test-dist: success
# gluster_clear_xattrs.sh /exports/test-dir/test-brick-1
removing all .glusterfs directories in progress: /exports/test-dir/test-brick-1
xattr clean-up in progress: /exports/test-dir/test-brick-1
/exports/test-dir/test-brick-1 ready to be used as a glusterfs brick
# gluster_clear_xattrs.sh /exports/test-dir/test-brick-2
removing all .glusterfs directories in progress: /exports/test-dir/test-brick-2
xattr clean-up in progress: /exports/test-dir/test-brick-2
/exports/test-dir/test-brick-2 ready to be used as a glusterfs brick
# gluster volume create test-dist transport tcp 10.10.60.169:/exports/test-dir/test-brick-1 10.10.60.169:/exports/test-dir/test-brick-2
volume create: test-dist: success: please start the volume to access data
# gluster volume start test-dist
volume start: test-dist: success
# mount -t glusterfs -o acl,log-level=WARNING 127.0.0.1:/test-dist /mnt/test-dist/
# diff -r /lib64/ /mnt/test-dist/lib64/
# gluster volume rebalance test-dist start
volume rebalance: test-dist: success: Rebalance on test-dist has been started successfully.
Use rebalance status command to check status of the rebalance process. ID: 63163e88-0b81-40cf-9050-4af12bf31acd
# gluster volume rebalance test-dist status
     Node   Rebalanced-files     size   scanned   failures   skipped   status      run time in secs
---------   ----------------   ------   -------   --------   -------   ---------   ----------------
localhost                  0   0Bytes       537          0         0   completed               1.00
volume rebalance: test-dist: success

##########################################
# Successful "Replicate" recreate case.
##########################################

# mkdir /exports/test-dir/test-brick-3a
# mkdir /exports/test-dir/test-brick-3b
# gluster volume create test-replica replica 2 transport tcp 10.10.60.169:/exports/test-dir/test-brick-3a 10.10.60.169:/exports/test-dir/test-brick-3b force
volume create: test-replica: success: please start the volume to access data
# gluster volume start test-replica
volume start: test-replica: success
# gluster volume info test-replica
Volume Name: test-replica
Type: Replicate
Volume ID: 1e66af41-a29f-45ba-b25d-b4b16d2a66d9
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.10.60.169:/exports/test-dir/test-brick-3a
Brick2: 10.10.60.169:/exports/test-dir/test-brick-3b
Options Reconfigured:
snap-activate-on-create: enable
# mkdir /mnt/test-replica
# mount -t glusterfs -o acl,log-level=WARNING 127.0.0.1:/test-replica /mnt/test-replica
# cp -rf /lib64/ /mnt/test-replica/
# diff -r /lib64/ /mnt/test-replica/lib64/
# umount /mnt/test-replica
# gluster volume stop test-replica
Stopping volume will make its data inaccessible. Do you want to continue? (y/n) y
volume stop: test-replica: success
# gluster volume delete test-replica
Deleting volume will erase all information about the volume. Do you want to continue? (y/n) y
volume delete: test-replica: success
# gluster_clear_xattrs.sh /exports/test-dir/test-brick-3a
removing all .glusterfs directories in progress: /exports/test-dir/test-brick-3a
xattr clean-up in progress: /exports/test-dir/test-brick-3a
/exports/test-dir/test-brick-3a ready to be used as a glusterfs brick
# gluster_clear_xattrs.sh /exports/test-dir/test-brick-3b
removing all .glusterfs directories in progress: /exports/test-dir/test-brick-3b
xattr clean-up in progress: /exports/test-dir/test-brick-3b
/exports/test-dir/test-brick-3b ready to be used as a glusterfs brick
# gluster volume create test-replica replica 2 transport tcp 10.10.60.169:/exports/test-dir/test-brick-3a 10.10.60.169:/exports/test-dir/test-brick-3b force
volume create: test-replica: success: please start the volume to access data
# gluster volume start test-replica
volume start: test-replica: success
# mount -t glusterfs -o acl,log-level=WARNING 127.0.0.1:/test-replica /mnt/test-replica
# diff -r /lib64/ /mnt/test-replica/lib64/
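P.S. For context, the brick clean-up before each recreate is the usual two steps: remove the .glusterfs metadata directory, then strip the GlusterFS trusted.* xattrs. The sketch below is a hypothetical stand-in for gluster_clear_xattrs.sh (whose actual contents are not shown above), demonstrated on a throwaway directory so it can be run safely; point BRICK at a real brick path in practice.

```shell
#!/bin/sh
# Hypothetical sketch of a brick clean-up, standing in for the
# gluster_clear_xattrs.sh script used above (its contents are not shown).
# Demonstrated on a throwaway directory created here; on a real brick,
# set BRICK to the brick path and run as root.
BRICK=$(mktemp -d)
mkdir -p "$BRICK/.glusterfs/00/00" "$BRICK/lib64"
touch "$BRICK/lib64/libfoo.so"

# Step 1: remove the .glusterfs directory (the gfid hard-link store).
rm -rf "$BRICK/.glusterfs"

# Step 2: strip the GlusterFS trusted.* xattrs from every file and
# directory (requires root on a real brick; a no-op on this demo tree).
if command -v setfattr >/dev/null 2>&1; then
    for attr in trusted.gfid trusted.glusterfs.volume-id trusted.glusterfs.dht; do
        find "$BRICK" -exec setfattr -x "$attr" {} \; 2>/dev/null
    done
fi

test ! -d "$BRICK/.glusterfs" && echo "$BRICK ready to be used as a glusterfs brick"
```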
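P.P.S. If it helps with diagnosis: the "Holes=1 overlaps=0" anomaly in the rebalance log means a directory's DHT hash ranges no longer cover the full hash space. Those ranges are stored in the trusted.glusterfs.dht xattr on each brick's copy of the directory, so they can be compared across bricks directly. These are hypothetical diagnostic commands I have not run in the session above:

```shell
# Hypothetical diagnostic (not part of the session above): dump the DHT
# layout ranges stored on each brick's copy of a problem directory.
# Each value encodes the hash range assigned to that replica set; a gap
# between the ranges matches the "Holes=1" anomaly in the rebalance log.
for brick in /exports/test-dir/test-brick-1a /exports/test-dir/test-brick-2a \
             /exports/test-dir/test-brick-1b /exports/test-dir/test-brick-2b; do
    echo "== $brick/lib64/xtables"
    getfattr -n trusted.glusterfs.dht -e hex "$brick/lib64/xtables" 2>/dev/null
done
```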