Matthew Temple
2013-Feb-21 14:33 UTC
[Gluster-users] One brick in replicated/distributed volume not being written to.
Hi, all.

I thought I had everything set correctly on my volume, but something is wrong. Here is the volume, made of 4 bricks:

Volume Name: gf2
Type: Distributed-Replicate
Volume ID: a9e64630-9166-4957-8243-e2933791b24b
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: gf2ibp-1:/mnt/d0-0
Brick2: gf2ibp-1r:/mnt/d0-0
Brick3: gf2ibp-2:/mnt/d0-0
Brick4: gf2ibp-2r:/mnt/d0-0

I have volume gf2 mounted on a computer we call "rcapps", and about 6 TB have been written to the volume. When I look at /mnt/d0-0 on all four bricks, three look correct, but Brick1 has only 48 GB written to it. Brick2, which should replicate Brick1, has 4 TB. Brick3 and Brick4 seem to have the same amount of data as each other.

The status of the volume looks correct:

gluster> volume status gf2
Status of volume: gf2
Gluster process                                Port    Online  Pid
------------------------------------------------------------------------------
Brick gf2ibp-1:/mnt/d0-0                       24011   Y       30754
Brick gf2ibp-1r:/mnt/d0-0                      24011   Y       17824
Brick gf2ibp-2:/mnt/d0-0                       24011   Y       31516
Brick gf2ibp-2r:/mnt/d0-0                      24011   Y       29119
NFS Server on localhost                        38467   Y       30760
Self-heal Daemon on localhost                  N/A     Y       30766
NFS Server on gf2ibp-2                         38467   Y       31522
Self-heal Daemon on gf2ibp-2                   N/A     Y       31528
NFS Server on gf2ibp-2r                        38467   Y       29125
Self-heal Daemon on gf2ibp-2r                  N/A     Y       29131
NFS Server on gf2ibp-1r                        38467   Y       17830
Self-heal Daemon on gf2ibp-1r                  N/A     Y       17836

I then noticed that I had the firewall turned on for Brick2 (even though it could still be written to), so I turned it off.

I thought I should try to heal the volume, but when I tried this through the gluster console, the operation failed. In the glusterd log I see the following (it can't get a lock that is held by itself?):

[2013-02-21 09:24:39.501612] I [glusterd-volume-ops.c:492:glusterd_handle_cli_heal_volume] 0-management: Received heal vol req for volume gf2
[2013-02-21 09:24:39.501732] E [glusterd-utils.c:277:glusterd_lock] 0-glusterd: Unable to get lock for uuid: f5edea20-9467-48ed-b4f1-dc566a9b6d02, lock held by: f5edea20-9467-48ed-b4f1-dc566a9b6d02
[2013-02-21 09:24:39.501759] E [glusterd-handler.c:458:glusterd_op_txn_begin] 0-management: Unable to acquire local lock, ret: -1

And here is what I see in cli.log, which I can't interpret:

[2013-02-21 09:31:38.689316] W [cli-rl.c:116:cli_rl_process_line] 0-glusterfs: failed to process line
[2013-02-21 09:31:48.952950] I [cli-rpc-ops.c:5928:gf_cli3_1_heal_volume_cbk] 0-cli: Received resp to heal volume
[2013-02-21 09:31:48.953366] W [dict.c:2339:dict_unserialize] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x120) [0x333440f8b0] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x333440f0b5] (-->gluster(gf_cli3_1_heal_volume_cbk+0x2e3) [0x41ca43]))) 0-dict: buf is null!
[2013-02-21 09:31:48.953410] E [cli-rpc-ops.c:5968:gf_cli3_1_heal_volume_cbk] 0-: Unable to allocate memory
[2013-02-21 09:31:48.953490] W [cli-rl.c:116:cli_rl_process_line] 0-glusterfs: failed to process line
[2013-02-21 09:31:56.419708] I [cli-rpc-ops.c:5928:gf_cli3_1_heal_volume_cbk] 0-cli: Received resp to heal volume
[2013-02-21 09:31:56.419859] W [dict.c:2339:dict_unserialize] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x120) [0x333440f8b0] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5) [0x333440f0b5] (-->gluster(gf_cli3_1_heal_volume_cbk+0x2e3) [0x41ca43]))) 0-dict: buf is null!
[2013-02-21 09:31:56.419894] E [cli-rpc-ops.c:5968:gf_cli3_1_heal_volume_cbk] 0-: Unable to allocate memory
[2013-02-21 09:31:56.419979] W [cli-rl.c:116:cli_rl_process_line] 0-glusterfs: failed to process line

Any ideas of what I should do next?
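In case it helps to see what I was about to try, here is the rough sequence I had in mind, sketched from memory (I'm assuming the 3.3-style "volume heal" syntax, and that restarting glusterd on the node holding the stale lock is harmless; please correct me if either assumption is wrong):

# See which files the self-heal daemon still thinks need healing
gluster volume heal gf2 info

# If the "lock held by: <own uuid>" error is just a stale cluster lock
# left over from an earlier failed transaction, restart glusterd on
# that node to release it
service glusterd restart

# Then trigger a full self-heal so the good replica repopulates Brick1
gluster volume heal gf2 full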
Right now I have one pair of bricks that replicates fine and one pair that does not, in a distributed/replicated cluster. I need to get Brick2 to send its files back to Brick1.

Thanks in advance.

Matt Temple

------
Matt Temple
Director, Research Computing
Dana-Farber Cancer Institute.
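P.S. To convince myself that Brick2 (gf2ibp-1r) really holds the data that Brick1 (gf2ibp-1) is missing, my plan was to compare the extended attributes on the same file on both bricks, roughly like this (the file path below is just a placeholder, and I'm not certain I'm reading the trusted.afr.* changelog keys correctly):

# Run against the same file on gf2ibp-1 and on gf2ibp-1r and compare
getfattr -d -m . -e hex /mnt/d0-0/path/to/some/file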