Deepak Naidu
2017-Feb-24 17:36 UTC
[Gluster-users] volume start: data0: failed: Commit failed on localhost.
Thanks Rafi for the workaround.

>> To find the root cause we need to get logs for the first failure of volume start or volume stop.

Below are the exact steps to reproduce the issue; the log file contents from the /var/log/glusterfs folder are attached (parsed).

STEPS to reproduce the issue

root@hostname:~# gluster volume create home-folder transport tcp,rdma storageN1:/gluster/disk1/home-folder storageN2:/gluster/disk1/home-folder
volume create: home-folder: success: please start the volume to access data
root@hostname:~# gluster volume info home-folder
Volume Name: home-folder
Type: Distribute
Volume ID: 09abd02a-b760-459f-afde-95b374eafc53
Status: Created
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp,rdma
Bricks:
Brick1: storageN1:/gluster/disk1/home-folder
Brick2: storageN2:/gluster/disk1/home-folder
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: on

root@hostname:~# gluster volume status home-folder
Volume home-folder is not started
root@hostname:~# gluster volume start home-folder
volume start: home-folder: failed: Commit failed on localhost. Please check log file for details.
root@hostname:~# gluster volume status home-folder
Volume home-folder is not started

root@hostname:~# gluster volume start home-folder force
volume start: home-folder: success
root@hostname:~# gluster volume status home-folder
Status of volume: home-folder
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick storageN1:/gluster/disk1/home-folder  N/A       N/A        Y       17737
Brick storageN2:/gluster/disk1/home-folder  N/A       N/A        N       N/A

Task Status of Volume home-folder
------------------------------------------------------------------------------
There are no active volume tasks

root@hostname:~# gluster volume info home-folder
Volume Name: home-folder
Type: Distribute
Volume ID: 09abd02a-b760-459f-afde-95b374eafc53
Status: Started
Snapshot Count: 0
Number of Bricks: 2
Transport-type: tcp,rdma
Bricks:
Brick1: storageN1:/gluster/disk1/home-folder
Brick2: storageN2:/gluster/disk1/home-folder
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: on
root@hostname:~#
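For reference, a minimal sketch of pulling the relevant failure lines out of the glusterd log (assuming the default /var/log/glusterfs directory; the exact filename varies by GlusterFS version, e.g. glusterd.log or etc-glusterfs-glusterd.vol.log):

# locate the glusterd log (name differs across versions)
ls /var/log/glusterfs/*glusterd*.log

# pull the commit/staging/prevalidation failures around the first failed start
grep -iE 'commit failed|staging failed|pre.?validat' /var/log/glusterfs/*glusterd*.log | head -n 50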
--
Deepak

From: Mohammed Rafi K C [mailto:rkavunga at redhat.com]
Sent: Friday, February 24, 2017 1:08 AM
To: Deepak Naidu; gluster-users at gluster.org
Subject: Re: [Gluster-users] volume start: data0: failed: Commit failed on localhost.

It looks like it ended up in a split-brain kind of situation. To find the root cause we need to get logs for the first failure of volume start or volume stop.

Or, to work around it, you can do a volume start force.

Regards
Rafi KC

On 02/24/2017 01:36 PM, Deepak Naidu wrote:

I keep getting this error when my config.transport is set to both tcp,rdma. The volume doesn't start; I get the below error during volume start.

To get around this, I end up deleting the volume, then configuring either only rdma or only tcp. Maybe I am missing something; I am just trying to get the volume up.

root@hostname:~# gluster volume start data0
volume start: data0: failed: Commit failed on localhost. Please check log file for details.
root@hostname:~#

root@hostname:~# gluster volume status data0
Staging failed on storageN2. Error: Volume data0 is not started
root@hostname:~#

============
[2017-02-24 08:00:29.923516] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data0
[2017-02-24 08:00:29.926140] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on storageN2. Error: Volume data0 is not started
[2017-02-24 08:00:33.770505] I [MSGID: 106499] [glusterd-handler.c:4349:__glusterd_handle_status_volume] 0-management: Received status volume req for volume data0
[2017-02-24 08:00:33.772824] E [MSGID: 106153] [glusterd-syncop.c:113:gd_collate_errors] 0-glusterd: Staging failed on storageN2. Error: Volume data0 is not started
============
[2017-02-24 08:01:36.305165] E [MSGID: 106537] [glusterd-volume-ops.c:1660:glusterd_op_stage_start_volume] 0-management: Volume data0 already started
[2017-02-24 08:01:36.305191] W [MSGID: 106122] [glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start prevalidation failed.
[2017-02-24 08:01:36.305198] E [MSGID: 106122] [glusterd-mgmt.c:884:glusterd_mgmt_v3_pre_validate] 0-management: Pre Validation failed for operation Start on local node
[2017-02-24 08:01:36.305205] E [MSGID: 106122] [glusterd-mgmt.c:2009:glusterd_mgmt_v3_initiate_all_phases] 0-management: Pre Validation Failed

--
Deepak

Attachment: Commit-failed.log <http://lists.gluster.org/pipermail/gluster-users/attachments/20170224/246802de/attachment.obj>
Atin Mukherjee
2017-Feb-25 15:16 UTC
[Gluster-users] volume start: data0: failed: Commit failed on localhost.
On Fri, Feb 24, 2017 at 11:06 PM, Deepak Naidu <dnaidu at nvidia.com> wrote:

> Thanks Rafi for the workaround.
>
> >> To find the root cause we need to get logs for the first failure of
> >> volume start or volume stop.
>
> Below are the exact steps to reproduce the issue; the log file contents
> from the /var/log/glusterfs folder are attached (parsed).
>
> root@hostname:~# gluster volume start home-folder
> volume start: home-folder: failed: Commit failed on localhost. Please
> check log file for details.
> root@hostname:~# gluster volume status home-folder
> Volume home-folder is not started
>
> root@hostname:~# gluster volume start home-folder force
> volume start: home-folder: success
> [...]
> [2017-02-24 08:01:36.305165] E [MSGID: 106537] [glusterd-volume-ops.c:1660:glusterd_op_stage_start_volume] 0-management: Volume data0 already started
> [2017-02-24 08:01:36.305191] W [MSGID: 106122] [glusterd-mgmt.c:198:gd_mgmt_v3_pre_validate_fn] 0-management: Volume start prevalidation failed.
> [...]

What you are seeing is one of the side effects of BZ https://bugzilla.redhat.com/show_bug.cgi?id=1386578, and https://review.gluster.org/#/c/15687/ has already been posted for review. So in this case, although volume status shows that the volume is not started, the brick process(es) actually do start. As a workaround, please use volume start force one more time.
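To confirm what is going on, something like the following helps (a minimal sketch; glusterfsd is the brick process, and home-folder stands in for your volume name):

# the bricks may be running even though glusterd reports the volume as not started
pgrep -af glusterfsd

# if so, a second forced start brings glusterd's view back in sync
gluster volume start home-folder force
gluster volume status home-folder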
@Rafi - We should try to get this patch in early and make it part of the next release.

--
~ Atin (atinm)