Mark Morlino
2013-Oct-07 20:11 UTC
[Gluster-users] glusterd service fails to start on one peer
I'm hoping that someone here can point me in the right direction to help me solve a problem I am having.

I've got 3 gluster peers, and for some reason glusterd will not start on one of them. All are running glusterfs version 3.4.0-8.el6 on CentOS 6.4 (2.6.32-358.el6.x86_64).

In /var/log/glusterfs/etc-glusterfs-glusterd.vol.log I see this error repeated 36 times (alternating between brick-0 and brick-1):

    E [glusterd-store.c:1845:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0

This makes some sense to me, since I have 18 replica 2 volumes, for a total of 36 bricks.

Then there are a few more "I" messages, and this is the rest of the file:

    E [glusterd-store.c:2472:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
    E [xlator.c:390:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
    E [graph.c:292:glusterfs_graph_init] 0-management: initializing translator failed
    E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
    W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/sbin/glusterd(main+0x5d2) [0x406802] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xb7) [0x4051b7] (-->/usr/sbin/glusterd(glusterfs_process_volfp+0x103) [0x4050c3]))) 0-: received signum (0), shutting down

Here are the contents of /etc/glusterfs/glusterd.vol:

    volume management
        type mgmt/glusterd
        option working-directory /var/lib/glusterd
        option transport-type socket,rdma
        option transport.socket.keepalive-time 10
        option transport.socket.keepalive-interval 2
        option transport.socket.read-fail-log off
    end-volume

glusterd.vol is the same on all of the peers, and the other ones work.

Any help on where to look next would be greatly appreciated.

Thanks,
Mark
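The arithmetic above (18 replica-2 volumes, so 36 bricks, so 36 "Unknown key" lines) can be sanity-checked with grep. A minimal sketch; the log lines here are simulated stand-ins, and on the affected peer you would instead grep the real file at /var/log/glusterfs/etc-glusterfs-glusterd.vol.log:

```shell
# Build a simulated log with one brick-0 and one brick-1 error per volume,
# mirroring the pattern reported above (18 replica-2 volumes).
log=$(mktemp)
for i in $(seq 1 18); do
  printf 'E [glusterd-store.c:1845:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0\n' >> "$log"
  printf 'E [glusterd-store.c:1845:glusterd_store_retrieve_volume] 0-: Unknown key: brick-1\n' >> "$log"
done

# Count the errors; if this matches your total brick count (replica count
# times number of volumes), the messages line up with the volume store.
count=$(grep -c 'Unknown key: brick-' "$log")
echo "$count"
```

If the count matched the brick total but glusterd still refused to start, that would suggest the "Unknown key" lines are a symptom rather than the root cause, which is what turned out to be the case here.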
Mark Morlino
2013-Oct-07 20:44 UTC
[Gluster-users] glusterd service fails to start on one peer
So, I guess I figured it out. I had been looking for a volume problem based on the log messages, but it turns out it was a peer definition problem: one of the files in /var/lib/glusterd/peers was empty.

I was able to determine where to look from the output of running glusterd in the foreground with debug logging:

    /usr/sbin/glusterd --debug --pid-file=/var/run/glusterd.pid

Then I copied the missing file over from one of the other peers, since each peer has a file for each of the other 2 peers.

On Mon, Oct 7, 2013 at 12:11 PM, Mark Morlino <mark at gina.alaska.edu> wrote:
> I'm hoping that someone here can point me in the right direction to help
> me solve a problem I am having. [...]
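The empty-peer-file failure mode described above can be checked for directly. A minimal sketch using a simulated peers directory so it is safe to run anywhere; the real location is /var/lib/glusterd/peers, and the file names and contents below (peer names, UUID) are made up for illustration:

```shell
# Simulate /var/lib/glusterd/peers. Each healthy file describes one *other*
# peer in the pool (on a real system the file name is that peer's UUID).
peers=$(mktemp -d)
printf 'uuid=00000000-0000-0000-0000-000000000001\nstate=3\nhostname1=peer2\n' > "$peers/peer2-uuid"
: > "$peers/peer3-uuid"   # zero-length file, like the one that broke glusterd

# Zero-length peer files are the red flag:
empty=$(find "$peers" -type f -size 0)
echo "$empty"

# On the real cluster the fix is what the post describes: copy the
# corresponding file from a healthy peer that also knows about that node,
# e.g. (hypothetical host/path):
#   scp peer2:/var/lib/glusterd/peers/<uuid> /var/lib/glusterd/peers/
# then start glusterd again.
```

Running glusterd with --debug in the foreground, as above, is what surfaced the bad file; the find check is just a quicker way to spot the same corruption before digging through debug output.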