Sahina Bose
2015-Mar-13 05:57 UTC
[Gluster-users] [ovirt-users] Gluster services won't start any more
Did you upgrade glusterfs on the node? Looks like there's some problem with your volume file?

[Adding gluster-users for further help]

On 03/12/2015 03:57 PM, RASTELLI Alessandro wrote:
> Hi,
> tonight - without any apparent reason - the /var/log/gluster directory filled up disk space of one node.
> I shut down services, cleaned logs, rebooted but services won't start any more.
>
> glusterd log says:
> [2015-03-12 09:08:14.919478] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.6.2 (args: /usr/sbin/glusterd --pid-file=/var/run/glusterd.pid)
> [2015-03-12 09:08:14.935111] I [glusterd.c:1214:init] 0-management: Maximum allowed open file descriptors set to 65536
> [2015-03-12 09:08:14.935142] I [glusterd.c:1259:init] 0-management: Using /var/lib/glusterd as working directory
> [2015-03-12 09:08:14.953202] W [rdma.c:4221:__gf_rdma_ctx_create] 0-rpc-transport/rdma: rdma_cm event channel creation failed (No such device)
> [2015-03-12 09:08:14.953221] E [rdma.c:4519:init] 0-rdma.management: Failed to initialize IB Device
> [2015-03-12 09:08:14.953229] E [rpc-transport.c:333:rpc_transport_load] 0-rpc-transport: 'rdma' initialization failed
> [2015-03-12 09:08:14.953280] W [rpcsvc.c:1524:rpcsvc_transport_create] 0-rpc-service: cannot create listener, initing the transport failed
> [2015-03-12 09:08:14.956004] I [glusterd.c:413:glusterd_check_gsync_present] 0-glusterd: geo-replication module not installed in the system
> [2015-03-12 09:08:14.958341] I [glusterd-store.c:2063:glusterd_restore_op_version] 0-management: Detected new install. Setting op-version to maximum : 30600
> [2015-03-12 09:08:15.166709] E [xlator.c:425:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
> [2015-03-12 09:08:15.166729] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
> [2015-03-12 09:08:15.166737] E [graph.c:525:glusterfs_graph_activate] 0-graph: init failed
> [2015-03-12 09:08:15.166987] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (0), shutting down
>
> Can you please help?
> Thank you
>
> Alessandro
> _______________________________________________
> Users mailing list
> Users at ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
Krishnan Parthasarathi
2015-Mar-13 06:49 UTC
[Gluster-users] [ovirt-users] Gluster services won't start any more
> > [glusterd-store.c:2063:glusterd_restore_op_version] 0-management: Detected
> > new install. Setting op-version to maximum : 30600

The above message indicates that the /var/lib/glusterd/glusterd.info file, which carries the identity (UUID) of the node and the operating version of the glusterd binary, was empty. This _shouldn't_ happen. We need to check for messages in the glusterd log around the time the /var filesystem was full to understand why this happened.

> > [2015-03-12 09:08:15.166709] E [xlator.c:425:xlator_init] 0-management:
> > Initialization of volume 'management' failed, review your volfile again
> > [2015-03-12 09:08:15.166729] E [graph.c:322:glusterfs_graph_init]
> > 0-management: initializing translator failed
> > [2015-03-12 09:08:15.166737] E [graph.c:525:glusterfs_graph_activate]
> > 0-graph: init failed

As part of the 'init' process, glusterd resolves the identities of the daemons that need to be spawned as part of hosting volumes. The resolution fails if the identity of this node changes between a stop and a start of the glusterd service. Glusterd won't start until this inconsistency is resolved.

> > [2015-03-12 09:08:15.166987] W [glusterfsd.c:1194:cleanup_and_exit] (-->
> > 0-: received signum (0), shutting down
> >
> > Can you please help?

To get out of this situation, we need to reconstruct the configuration files that are 'out of date' with respect to the cluster. This could be tedious, but it is possible if the other nodes didn't have their /var filesystem fill up. Each glusterd maintains its own copy of the volume and peer configuration under /var/lib/glusterd.

* /var/lib/glusterd/peers - Holds one file for every peer, excluding 'self'.

Since every other node holds a peer file for this node, we can determine this node's identity with the help of the remaining nodes in the cluster. This means we can reconstruct /var/lib/glusterd/glusterd.info on this node.
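The lookup described above could be sketched roughly as follows. This is only an illustration, not a supported recovery tool. It assumes that each peer file is a set of key=value lines with a 'uuid' key and one or more 'hostname*' keys, and that glusterd.info holds 'UUID=' and 'operating-version=' lines. That layout is not confirmed anywhere in this thread, so check it against a healthy node first. The function names, host names and UUIDs are hypothetical. The operating version should come from a healthy node's glusterd.info; 30600 is just the value shown in the log above.

```python
from typing import Dict, Optional


def parse_key_values(text: str) -> Dict[str, str]:
    """Parse 'key=value' lines into a dict, skipping blank or malformed lines."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        fields[key.strip()] = value.strip()
    return fields


def find_uuid_for_host(peer_files: Dict[str, str], hostname: str) -> Optional[str]:
    """Return the UUID that a healthy node has recorded for `hostname`.

    `peer_files` maps a peer file name to its contents, as copied from
    /var/lib/glusterd/peers on a healthy node. The 'uuid' and 'hostname*'
    keys are an assumed layout, not something stated in this thread.
    """
    for _name, text in sorted(peer_files.items()):
        fields = parse_key_values(text)
        hostnames = {v for k, v in fields.items() if k.startswith("hostname")}
        if hostname in hostnames:
            return fields.get("uuid")
    return None


def render_glusterd_info(uuid: str, operating_version: int) -> str:
    """Build candidate glusterd.info contents (assumed two-line layout)."""
    return f"UUID={uuid}\noperating-version={operating_version}\n"


if __name__ == "__main__":
    # Hypothetical peer files from a healthy node; all values are made up.
    sample_peers = {
        "peer-a": "uuid=11111111-aaaa\nstate=3\nhostname1=node1.example\n",
        "peer-b": "uuid=22222222-bbbb\nstate=3\nhostname1=node2.example\n",
    }
    found = find_uuid_for_host(sample_peers, "node2.example")
    if found is not None:
        print(render_glusterd_info(found, 30600), end="")
```

The sketch only prints candidate contents rather than writing to /var/lib/glusterd, so the result can be compared with what the other nodes report before anything is changed on the broken node.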
For other files under /var/lib/glusterd that are empty, we can use the fact that each node has a copy of the configuration, which can be used to reconstruct them.

Hope that helps,
kp