Hi,

I tried to rebalance my volume several times, but every attempt failed. The Gluster version is 3.7.4. Status and info:

[root@d001 ~]# gluster volume rebalance FastVol status
                                    Node Rebalanced-files          size       scanned      failures       skipped               status   run time in secs
                               ---------      -----------   -----------   -----------   -----------   -----------         ------------     --------------
                               localhost            51251       553.4MB       3422092             0             0               failed           13211.00
volume rebalance: FastVol: success:

[root@d001 ~]# gluster volume status
Status of volume: FastVol
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick d001:/mnt/c1/brick                    49154     0          Y       32111
Brick d001:/mnt/b1/brick                    49155     0          Y       24557

Task Status of Volume FastVol
------------------------------------------------------------------------------
Task                 : Rebalance
ID                   : cf1b25a0-4e33-4abf-9bb9-64cfd7bad115
Status               : failed

[root@d001 ~]# gluster volume info

Volume Name: FastVol
Type: Distribute
Volume ID: dbee250a-e3fe-4448-b905-b76c5ba80b25
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: d001:/mnt/c1/brick
Brick2: d001:/mnt/b1/brick
Options Reconfigured:
nfs.disable: true
auth.allow: 127.0.0.1,10.*

I checked FastVol-rebalance.log and found no errors when searching for " E ", but searching for " W " turned up some warnings:

[2015-12-11 15:51:57.402661] W [MSGID: 109009] [dht-common.c:569:dht_lookup_dir_cbk] 0-FastVol-dht: /for_ybest_fsdir/user/Weixin.oClDcji6/7g/yg/3LAF3uXtLlQndyFA: gfid different on FastVol-client-1. gfid local = 393d4a1a-20b8-49b2-0000-000000000000, gfid subvol = 393d4a1a-20b8-49b2-8f79-cb17472579e2
[2015-12-11 15:52:12.071984] W [MSGID: 109009] [dht-common.c:569:dht_lookup_dir_cbk] 0-FastVol-dht: /for_ybest_fsdir/user/Weixin.oClDcji6/ZA: gfid different on FastVol-client-1. gfid local = 5de2d8a9-954a-437a-8a4f-fe6ab30b646d, gfid subvol = 5de2d8a9-954a-437a-8a4f-fe6ab30b646d
[2015-12-11 16:04:24.346027] W [MSGID: 109009] [dht-common.c:569:dht_lookup_dir_cbk] 0-FastVol-dht: /for_ybest_fsdir/user/Weixin.oClDcjtd/2q: gfid different on FastVol-client-1. gfid local = 49c3a238-c204-4b05-0000-000000000000, gfid subvol = 49c3a238-c204-4b05-85ea-9a400044def6
[2015-12-11 17:55:46.232418] W [MSGID: 109009] [dht-common.c:569:dht_lookup_dir_cbk] 0-FastVol-dht: /for_ybest_fsdir/user/li/ur/on/gzhi/linkwrap/49138: gfid different on FastVol-client-1. gfid local = ae68fd66-36c8-4bd7-0000-000000000000, gfid subvol = ae68fd66-36c8-4bd7-a183-94390fb5704c

I also checked etc-glusterfs-glusterd.vol.log, but found no errors or warnings after the rebalance task had started. The latest lines in etc-glusterfs-glusterd.vol.log:

[2015-12-11 16:03:26.709198] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/b87982e05d7252cd3efe66bb7c634115.socket failed (Invalid argument)
[2015-12-11 16:03:29.709626] W [socket.c:588:__socket_rwv] 0-nfs: readv on /var/run/gluster/b87982e05d7252cd3efe66bb7c634115.socket failed (Invalid argument)
[2015-12-11 16:03:30.315759] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped
[2015-12-11 16:03:30.318867] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
[2015-12-11 16:03:30.323944] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2015-12-11 16:03:30.326917] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2015-12-11 16:03:30.329868] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2015-12-11 16:03:30.371050] I [run.c:190:runner_log] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7f819756863b] (--> /usr/lib64/libglusterfs.so.0(runner_log+0x105)[0x7f81975bd5a5] (--> /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x4cc)[0x7f818c027cbc] (--> /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(+0xeefd2)[0x7f818c027fd2] (--> /lib64/libpthread.so.0(+0x79d1)[0x7f81966509d1] ))))) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh --volname=FastVol -o nfs.disable=on --gd-workdir=/var/lib/glusterd
[2015-12-11 16:03:30.389063] I [run.c:190:runner_log] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7f819756863b] (--> /usr/lib64/libglusterfs.so.0(runner_log+0x105)[0x7f81975bd5a5] (--> /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x4cc)[0x7f818c027cbc] (--> /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(+0xeefd2)[0x7f818c027fd2] (--> /lib64/libpthread.so.0(+0x79d1)[0x7f81966509d1] ))))) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S32gluster_enable_shared_storage.sh --volname=FastVol -o nfs.disable=on --gd-workdir=/var/lib/glusterd
The message "I [MSGID: 106006] [glusterd-svc-mgmt.c:323:glusterd_svc_common_rpc_notify] 0-management: nfs has disconnected from glusterd." repeated 39 times between [2015-12-11 16:01:32.695911] and [2015-12-11 16:03:29.709689]
[2015-12-11 16:05:44.813587] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: glustershd already stopped
[2015-12-11 16:05:44.823077] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: quotad already stopped
[2015-12-11 16:05:44.825986] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: bitd already stopped
[2015-12-11 16:05:44.829007] I [MSGID: 106132] [glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: scrub already stopped
[2015-12-11 16:05:44.865623] I [run.c:190:runner_log] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7f819756863b] (--> /usr/lib64/libglusterfs.so.0(runner_log+0x105)[0x7f81975bd5a5] (--> /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x4cc)[0x7f818c027cbc] (--> /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(+0xeefd2)[0x7f818c027fd2] (--> /lib64/libpthread.so.0(+0x79d1)[0x7f81966509d1] ))))) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S30samba-set.sh --volname=FastVol -o nfs.disable=true --gd-workdir=/var/lib/glusterd
[2015-12-11 16:05:44.873447] I [run.c:190:runner_log] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7f819756863b] (--> /usr/lib64/libglusterfs.so.0(runner_log+0x105)[0x7f81975bd5a5] (--> /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(glusterd_hooks_run_hooks+0x4cc)[0x7f818c027cbc] (--> /usr/lib64/glusterfs/3.7.4/xlator/mgmt/glusterd.so(+0xeefd2)[0x7f818c027fd2] (--> /lib64/libpthread.so.0(+0x79d1)[0x7f81966509d1] ))))) 0-management: Ran script: /var/lib/glusterd/hooks/1/set/post/S32gluster_enable_shared_storage.sh --volname=FastVol -o nfs.disable=true --gd-workdir=/var/lib/glusterd
[2015-12-11 19:26:37.779065] W [socket.c:588:__socket_rwv] 0-management: readv on /var/run/gluster/gluster-rebalance-dbee250a-e3fe-4448-b905-b76c5ba80b25.sock failed (No data available)
[2015-12-11 19:26:38.220385] I [MSGID: 106007] [glusterd-rebalance.c:162:__glusterd_defrag_notify] 0-management: Rebalance process for volume FastVol has disconnected.
[2015-12-11 19:26:38.220446] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-management: size=588 max=1 total=1235
[2015-12-11 19:26:38.220462] I [MSGID: 101053] [mem-pool.c:616:mem_pool_destroy] 0-management: size=124 max=1 total=1235
[2015-12-12 01:11:13.920354] I [MSGID: 106488] [glusterd-handler.c:1463:__glusterd_handle_cli_get_volume] 0-glusterd: Received get vol req
[2015-12-12 01:11:34.302028] I [MSGID: 106499] [glusterd-handler.c:4258:__glusterd_handle_status_volume] 0-management: Received status volume req for volume FastVol

Thanks,
Cloudor
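
For reference, the search for " E " and " W " described above can be sketched in Python. This is only an illustration under stated assumptions: the helper name lines_with_severity and the regular expression are hypothetical and not part of Gluster, and the sample lines are copied from the logs in this post. It assumes each log entry starts with a bracketed timestamp followed by a one-letter severity.

```python
import re

# Assumption: a log entry begins with "[timestamp] <severity letter> ".
# Anchoring on that prefix avoids matching an " E " or " W " that only
# appears inside the message text.
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<sev>[A-Z])\s")


def lines_with_severity(lines, severity):
    """Return the log lines whose severity letter equals `severity`."""
    matches = []
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("sev") == severity:
            matches.append(line)
    return matches


# Sample entries copied from the logs quoted in this post.
sample = [
    "[2015-12-11 16:03:30.315759] I [MSGID: 106132] "
    "[glusterd-proc-mgmt.c:83:glusterd_proc_stop] 0-management: nfs already stopped",
    "[2015-12-11 16:04:24.346027] W [MSGID: 109009] "
    "[dht-common.c:569:dht_lookup_dir_cbk] 0-FastVol-dht: "
    "/for_ybest_fsdir/user/Weixin.oClDcjtd/2q: gfid different on FastVol-client-1. "
    "gfid local = 49c3a238-c204-4b05-0000-000000000000, "
    "gfid subvol = 49c3a238-c204-4b05-85ea-9a400044def6",
    "[2015-12-11 19:26:38.220385] I [MSGID: 106007] "
    "[glusterd-rebalance.c:162:__glusterd_defrag_notify] 0-management: "
    "Rebalance process for volume FastVol has disconnected.",
]

warnings = lines_with_severity(sample, "W")
errors = lines_with_severity(sample, "E")
print(len(warnings), "warning(s),", len(errors), "error(s)")
```

To run it against a real file, pass the open file object in place of sample; the location of the log file depends on the installation.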