Mark Morlino
2013-Oct-07 20:11 UTC
[Gluster-users] glusterd service fails to start on one peer
I'm hoping that someone here can point me in the right direction to help me solve a problem I am having.

I've got 3 gluster peers, and for some reason glusterd will not start on one of them. All are running glusterfs version 3.4.0-8.el6 on CentOS 6.4 (2.6.32-358.el6.x86_64).

In /var/log/glusterfs/etc-glusterfs-glusterd.vol.log I see this error repeated 36 times (alternating between brick-0 and brick-1):

    E [glusterd-store.c:1845:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0

This makes some sense to me, since I have 18 replica 2 volumes, for a total of 36 bricks.

Then there are a few more "I" messages, and this is the rest of the file:

    E [glusterd-store.c:2472:glusterd_resolve_all_bricks] 0-glusterd: resolve brick failed in restore
    E [xlator.c:390:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
    E [graph.c:292:glusterfs_graph_init] 0-management: initializing translator failed
    E [graph.c:479:glusterfs_graph_activate] 0-graph: init failed
    W [glusterfsd.c:1002:cleanup_and_exit] (-->/usr/sbin/glusterd(main+0x5d2) [0x406802] (-->/usr/sbin/glusterd(glusterfs_volumes_init+0xb7) [0x4051b7] (-->/usr/sbin/glusterd(glusterfs_process_volfp+0x103) [0x4050c3]))) 0-: received signum (0), shutting down

Here are the contents of /etc/glusterfs/glusterd.vol:

    volume management
        type mgmt/glusterd
        option working-directory /var/lib/glusterd
        option transport-type socket,rdma
        option transport.socket.keepalive-time 10
        option transport.socket.keepalive-interval 2
        option transport.socket.read-fail-log off
    end-volume

glusterd.vol is the same on all of the peers, and the other ones work.

Any help on where to look next would be greatly appreciated.

Thanks,
Mark
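The arithmetic above (18 replica-2 volumes, so 36 bricks, so 36 "Unknown key" lines) can be sanity-checked with grep. A minimal sketch; the log lines here are simulated stand-ins, and on the affected peer you would instead grep the real file at /var/log/glusterfs/etc-glusterfs-glusterd.vol.log:

```shell
# Build a simulated log with one brick-0 and one brick-1 error per volume,
# mirroring the pattern reported above (18 replica-2 volumes).
log=$(mktemp)
for i in $(seq 1 18); do
  printf 'E [glusterd-store.c:1845:glusterd_store_retrieve_volume] 0-: Unknown key: brick-0\n' >> "$log"
  printf 'E [glusterd-store.c:1845:glusterd_store_retrieve_volume] 0-: Unknown key: brick-1\n' >> "$log"
done

# Count the errors; if this matches your total brick count (replica count
# times number of volumes), the messages line up with the volume store.
count=$(grep -c 'Unknown key: brick-' "$log")
echo "$count"
```

If the count matched the brick total but glusterd still refused to start, that would suggest the "Unknown key" lines are a symptom rather than the root cause, which is what turned out to be the case here.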
Mark Morlino
2013-Oct-07 20:44 UTC
[Gluster-users] glusterd service fails to start on one peer
So, I guess I figured it out. I had been looking for a volume problem based on the log messages, but it turns out it was a peer definition problem: one of the files in /var/lib/glusterd/peers was empty.

I was able to determine where to look from the output of running glusterd in the foreground with debug logging:

    /usr/sbin/glusterd --debug --pid-file=/var/run/glusterd.pid

Then I copied the missing file over from one of the other peers, since each peer has a file for each of the other 2 peers.

On Mon, Oct 7, 2013 at 12:11 PM, Mark Morlino <mark at gina.alaska.edu> wrote:
> I'm hoping that someone here can point me in the right direction to help
> me solve a problem I am having. [...]
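The empty-peer-file failure mode described above can be checked for directly. A minimal sketch using a simulated peers directory so it is safe to run anywhere; the real location is /var/lib/glusterd/peers, and the file names and contents below (peer names, UUID) are made up for illustration:

```shell
# Simulate /var/lib/glusterd/peers. Each healthy file describes one *other*
# peer in the pool (on a real system the file name is that peer's UUID).
peers=$(mktemp -d)
printf 'uuid=00000000-0000-0000-0000-000000000001\nstate=3\nhostname1=peer2\n' > "$peers/peer2-uuid"
: > "$peers/peer3-uuid"   # zero-length file, like the one that broke glusterd

# Zero-length peer files are the red flag:
empty=$(find "$peers" -type f -size 0)
echo "$empty"

# On the real cluster the fix is what the post describes: copy the
# corresponding file from a healthy peer that also knows about that node,
# e.g. (hypothetical host/path):
#   scp peer2:/var/lib/glusterd/peers/<uuid> /var/lib/glusterd/peers/
# then start glusterd again.
```

Running glusterd with --debug in the foreground, as above, is what surfaced the bad file; the find check is just a quicker way to spot the same corruption before digging through debug output.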