Hi all,

I am still experiencing VM crashes with GlusterFS. The VM I had problems with (it kept crashing) was moved off Gluster and has had no problems since. Now another VM is doing the same: it just shuts down.

Gluster is at 3.8.13. I know the current releases are 3.10 and 3.12, but I had trouble upgrading another cluster to 3.10 (although the processes were off and no files were in use, Gluster had to heal the files that it found in split-brain), so I'm a bit afraid to repeat the process in a customer production environment.

BTW, are there any specific flags that should be turned on when using the storage for VM disk images (besides sharding)?

Anyway, this is the log on the machine running the VM, and it is the only place where I find any errors.

This is the storage:

Volume Name: datastore2
Type: Replicate
Volume ID: c95ebb5f-6e04-4f09-91b9-bbbe63d83aea
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: srvpve2g:/data/brick2/brick
Brick2: srvpve3g:/data/brick2/brick
Brick3: srvpve1g:/data/brick2/brick (arbiter)
Options Reconfigured:
nfs.disable: on
performance.readdir-ahead: on
transport.address-family: inet

[2017-12-31 05:25:01.724213] I [glusterfsd-mgmt.c:1600:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2018-01-02 07:56:32.763516] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore1-server: disconnecting connection from srvpve4-1652-2017/05/03-19:41:30:493103-datastore1-client-1-0-2
[2018-01-02 07:56:32.763554] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore1-server: Shutting down connection srvpve4-1652-2017/05/03-19:41:30:493103-datastore1-client-1-0-2
[2018-01-03 07:58:37.342761] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore1-server: accepted client from srvpve4-1573-2018/01/03-07:58:40:761060-datastore1-client-1-0-0 (version: 3.8.13)

root@srvpve2:/var/log/glusterfs/bricks# more data-brick2-brick.log
[2017-12-31 05:25:01.725328] I [glusterfsd-mgmt.c:1600:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2018-01-02 07:56:28.258762] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting connection from srvpve4-1563-2017/05/03-19:41:25:843535-datastore2-client-0-0-2
[2018-01-02 07:56:28.258841] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down connection srvpve4-1563-2017/05/03-19:41:25:843535-datastore2-client-0-0-2
[2018-01-02 12:42:01.140746] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting connection from srvpve2-63690-2017/12/07-07:58:46:36094-datastore2-client-0-0-0
[2018-01-02 12:42:01.140756] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting connection from srvpve2-63690-2017/12/07-07:58:44:188020-datastore2-client-0-0-0
[2018-01-02 12:42:01.140829] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on /images/201/vm-201-disk-2.qcow2
[2018-01-02 12:42:01.140830] W [inodelk.c:399:pl_inodelk_log_cleanup] 0-datastore2-server: releasing lock on a8d82b3d-1cf9-45cf-9858-d8546710b49c held by {client=0x7f4c840efc50, pid=0 lk-owner=5c6000d0397f0000}
[2018-01-02 12:42:01.140849] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on /images/201/vm-201-disk-1.qcow2
[2018-01-02 12:42:01.140897] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down connection srvpve2-63690-2017/12/07-07:58:46:36094-datastore2-client-0-0-0
[2018-01-02 12:42:01.140897] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down connection srvpve2-63690-2017/12/07-07:58:44:188020-datastore2-client-0-0-0
[2018-01-02 12:51:19.397620] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted client from srvpve2-105503-2018/01/02-12:51:19:377360-datastore2-client-0-0-0 (version: 3.8.13)
[2018-01-02 12:51:19.422927] E [MSGID: 113107] [posix.c:1079:posix_seek] 0-datastore2-posix: seek failed on fd 18 length 43000922112 [No such device or address]
[2018-01-02 12:51:19.423003] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2 (a8d82b3d-1cf9-45cf-9858-d8546710b49c) ==> (No such device or address) [No such device or address]
[2018-01-02 12:51:20.017984] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted client from srvpve2-105503-2018/01/02-12:51:20:1706-datastore2-client-0-0-0 (version: 3.8.13)
[2018-01-02 12:51:20.044434] E [MSGID: 113107] [posix.c:1079:posix_seek] 0-datastore2-posix: seek failed on fd 19 length 859684929536 [No such device or address]
[2018-01-02 12:51:20.044481] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2 (66d9eefb-ee55-40ad-9f44-c55d1e809006) ==> (No such device or address) [No such device or address]
[2018-01-03 07:58:37.489609] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted client from srvpve4-1682-2018/01/03-07:58:41:84178-datastore2-client-0-0-0 (version: 3.8.13)
[2018-01-03 11:54:36.010664] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting connection from srvpve2-105503-2018/01/02-12:51:20:1706-datastore2-client-0-0-0
[2018-01-03 11:54:36.010674] I [MSGID: 115036] [server.c:548:server_rpc_notify] 0-datastore2-server: disconnecting connection from srvpve2-105503-2018/01/02-12:51:19:377360-datastore2-client-0-0-0
[2018-01-03 11:54:36.010730] W [inodelk.c:399:pl_inodelk_log_cleanup] 0-datastore2-server: releasing lock on a8d82b3d-1cf9-45cf-9858-d8546710b49c held by {client=0x7f4c7c0047b0, pid=0 lk-owner=5c60c0912b7f0000}
[2018-01-03 11:54:36.010730] W [inodelk.c:399:pl_inodelk_log_cleanup] 0-datastore2-server: releasing lock on 66d9eefb-ee55-40ad-9f44-c55d1e809006 held by {client=0x7f4c840e9210, pid=0 lk-owner=5c807b8b2b7f0000}
[2018-01-03 11:54:36.010795] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on /images/201/vm-201-disk-1.qcow2
[2018-01-03 11:54:36.010799] I [MSGID: 115013] [server-helpers.c:293:do_fd_cleanup] 0-datastore2-server: fd cleanup on /images/201/vm-201-disk-2.qcow2
[2018-01-03 11:54:36.010904] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down connection srvpve2-105503-2018/01/02-12:51:19:377360-datastore2-client-0-0-0
[2018-01-03 11:54:36.010931] I [MSGID: 101055] [client_t.c:415:gf_client_unref] 0-datastore2-server: Shutting down connection srvpve2-105503-2018/01/02-12:51:20:1706-datastore2-client-0-0-0
[2018-01-03 12:36:13.188660] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted client from srvpve2-195919-2018/01/03-12:36:13:106797-datastore2-client-0-0-0 (version: 3.8.13)
[2018-01-03 12:36:13.238446] E [MSGID: 113107] [posix.c:1079:posix_seek] 0-datastore2-posix: seek failed on fd 19 length 43000922112 [No such device or address]
[2018-01-03 12:36:13.238506] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2 (a8d82b3d-1cf9-45cf-9858-d8546710b49c) ==> (No such device or address) [No such device or address]
[2018-01-03 12:36:15.082870] I [MSGID: 115029] [server-handshake.c:692:server_setvolume] 0-datastore2-server: accepted client from srvpve2-195919-2018/01/03-12:36:15:65991-datastore2-client-0-0-0 (version: 3.8.13)
[2018-01-03 12:36:15.107917] E [MSGID: 113107] [posix.c:1079:posix_seek] 0-datastore2-posix: seek failed on fd 21 length 859684929536 [No such device or address]
[2018-01-03 12:36:15.107956] E [MSGID: 115089] [server-rpc-fops.c:2007:server_seek_cbk] 0-datastore2-server: 18: SEEK-2 (66d9eefb-ee55-40ad-9f44-c55d1e809006) ==> (No such device or address) [No such device or address]