Alessandro Briosi
2017-Feb-21 08:53 UTC
[Gluster-users] possible gluster error causing kvm to shutdown
Hi all,
I have had a couple of times now a KVM VM which was suddenly shut down
(without any apparent reason).

At the time this happened, the only thing I can find in the logs is
related to Gluster.

The stops happened at 16.19 on the 13th and at 03.34 on the 19th (times
are local time, which is GMT+1). I think, though, that the Gluster logs
are in GMT.

This is from the 1st node (which also runs the KVM and basically should
be a client of itself):

[2017-02-07 22:29:15.030197] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-datastore1-client-1: Server lk version = 1
[2017-02-19 05:22:07.747187] I [MSGID: 108026] [afr-self-heal-common.c:1173:afr_log_selfheal] 0-datastore1-replicate-0: Completed data selfheal on 9e66f0d2-501b-4cf9-80db-f423e2e2ef0f. sources=[1] sinks=0

This is from the 2nd node:

[2017-02-07 22:29:15.044422] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 0-datastore1-client-0: Server lk version = 1
[2017-02-08 00:13:58.612483] I [MSGID: 108026] [afr-self-heal-common.c:1173:afr_log_selfheal] 0-datastore1-replicate-0: Completed data selfheal on b32ccae9-01ed-406c-988f-64394e4cb37c. sources=[0] sinks=1
[2017-02-13 16:44:10.570176] I [MSGID: 108026] [afr-self-heal-common.c:1173:afr_log_selfheal] 0-datastore1-replicate-0: Completed data selfheal on bc8f6a7e-31e5-4b48-946c-f779a4b2e64f. sources=[1] sinks=0
[2017-02-19 04:30:46.049524] I [MSGID: 108026] [afr-self-heal-common.c:1173:afr_log_selfheal] 0-datastore1-replicate-0: Completed data selfheal on bc8f6a7e-31e5-4b48-946c-f779a4b2e64f. sources=[1] sinks=0

Could this be the cause?

This is the current volume configuration. I'll be adding an additional
node in the near future, but I need to have this stable first.

Volume Name: datastore1
Type: Replicate
Volume ID: e4dbbf6e-11e6-4b36-bab0-c37647ef6ad6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: srvpve1g:/data/brick1/brick
Brick2: srvpve2g:/data/brick1/brick
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

Thanks,
Alessandro
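
P.S. To compare the stop times with the log entries, I convert the local
(GMT+1) times to UTC; a quick sketch with GNU date, assuming the gluster
logs really are in GMT:

    # convert the local stop times to UTC to match the log timestamps
    date -u -d '2017-02-13 16:19 +0100'   # -> Mon Feb 13 15:19:00 UTC 2017
    date -u -d '2017-02-19 03:34 +0100'   # -> Sun Feb 19 02:34:00 UTC 2017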
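
To check whether anything is still pending heal or is in split-brain
around those times, I can run the standard heal queries (gluster 3.x CLI,
volume name as above):

    # entries currently needing heal, per brick
    gluster volume heal datastore1 info
    # entries in split-brain, if any
    gluster volume heal datastore1 info split-brain
    # number of entries pending heal, per brick
    gluster volume heal datastore1 statistics heal-count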
Alessandro Briosi
2017-Feb-21 15:53 UTC
[Gluster-users] possible gluster error causing kvm to shutdown
On 21/02/2017 09:53, Alessandro Briosi wrote:
> Hi all,
> I have had a couple of times now a KVM VM which was suddenly shut down
> (without any apparent reason).
>
> At the time this happened, the only thing I can find in the logs is
> related to Gluster.
>
> The stops happened at 16.19 on the 13th and at 03.34 on the 19th (times
> are local time, which is GMT+1). I think, though, that the Gluster logs
> are in GMT.
>
> This is from the 1st node (which also runs the KVM and basically should
> be a client of itself):
>
> [2017-02-07 22:29:15.030197] I [MSGID: 114035]
> [client-handshake.c:202:client_set_lk_version_cbk]
> 0-datastore1-client-1: Server lk version = 1
> [2017-02-19 05:22:07.747187] I [MSGID: 108026]
> [afr-self-heal-common.c:1173:afr_log_selfheal] 0-datastore1-replicate-0:
> Completed data selfheal on 9e66f0d2-501b-4cf9-80db-f423e2e2ef0f.
> sources=[1] sinks=0
>
> This is from the 2nd node:
>
> [2017-02-07 22:29:15.044422] I [MSGID: 114035]
> [client-handshake.c:202:client_set_lk_version_cbk]
> 0-datastore1-client-0: Server lk version = 1
> [2017-02-08 00:13:58.612483] I [MSGID: 108026]
> [afr-self-heal-common.c:1173:afr_log_selfheal] 0-datastore1-replicate-0:
> Completed data selfheal on b32ccae9-01ed-406c-988f-64394e4cb37c.
> sources=[0] sinks=1
> [2017-02-13 16:44:10.570176] I [MSGID: 108026]
> [afr-self-heal-common.c:1173:afr_log_selfheal] 0-datastore1-replicate-0:
> Completed data selfheal on bc8f6a7e-31e5-4b48-946c-f779a4b2e64f.
> sources=[1] sinks=0
> [2017-02-19 04:30:46.049524] I [MSGID: 108026]
> [afr-self-heal-common.c:1173:afr_log_selfheal] 0-datastore1-replicate-0:
> Completed data selfheal on bc8f6a7e-31e5-4b48-946c-f779a4b2e64f.
> sources=[1] sinks=0
>
> Could this be the cause?
>
> This is the current volume configuration. I'll be adding an additional
> node in the near future, but I need to have this stable first.
>
> Volume Name: datastore1
> Type: Replicate
> Volume ID: e4dbbf6e-11e6-4b36-bab0-c37647ef6ad6
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x 2 = 2
> Transport-type: tcp
> Bricks:
> Brick1: srvpve1g:/data/brick1/brick
> Brick2: srvpve2g:/data/brick1/brick
> Options Reconfigured:
> nfs.disable: on
> performance.readdir-ahead: on
> transport.address-family: inet
>
> Thanks,
> Alessandro

Nobody has any clue on this? Should I provide more information/logs?

From what I understand, a self-heal was triggered, but I have no idea why
it happened, or why the KVM was shut down. The Gluster client is supposed
to be a client of both servers, for failover. Also, there are other VMs
running and they did not get shut down.

Alessandro
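
For reference, this is roughly what I mean by the client having both
servers for failover; a sketch of a FUSE mount with a backup volfile
server (the mount point is just an example, and on older gluster
versions the option is spelled backupvolfile-server):

    # mount datastore1 from srvpve1g, falling back to srvpve2g for the
    # volfile if the first node is unreachable at mount time
    mount -t glusterfs -o backup-volfile-servers=srvpve2g \
        srvpve1g:/datastore1 /mnt/pve/datastore1

As far as I know, that option only covers fetching the volfile at mount
time; once mounted, the client talks to both bricks directly, which is
why I'd expect a VM to survive a problem on a single node.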