FYI I set up replica 3 (no arbiter this time) and did the same thing - rebooted one node during heavy file IO on the VM - and IO stopped.

As I mentioned either here or in another thread, this behavior is caused by the high default of network.ping-timeout. My main problem used to be that setting it to low values like 3s or even 2s did not prevent the FS from being remounted read-only (at least with arbiter), and the docs describe reconnects as very costly. If I set ping-timeout to 1s, the read-only remount disaster is now prevented. However, I find it very strange, because in the past I actually did end up with a read-only filesystem despite the low ping-timeout.

With replica 3, after a node reboot iftop shows data flowing to only one of the two remaining nodes, and there is no entry in heal info for the volume. An explanation would be very much appreciated ;-)

A few minutes later I reverted back to replica 3 with arbiter (group virt, ping-timeout 1). All nodes are up. During the first fio run the VM disconnected my ssh session, so I reconnected and saw ext4 problems in dmesg. I deleted the VM and started a new one. Glustershd.log fills with metadata heals shortly after the fio job starts, but this time the system is stable. Rebooting one of the nodes does not cause any problem (watching the heal log and the i/o on the VM).

So I decided to put more stress on the VM's disk - I added a second fio job with direct=1 and started it (now both are running) while one gluster node was still booting. What happened? One fio job reports "Bus error" and the VM segfaults when trying to run dmesg...

Is this gfapi related? Is this a bug in arbiter?
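(For reference, the tuning described above boils down to roughly the following commands - the volume name "gv0" is just a placeholder, not the real one:)

    # apply the virt group profile, then drop the client ping timeout
    gluster volume set gv0 group virt
    gluster volume set gv0 network.ping-timeout 1
    # check for pending heals after rebooting a node
    gluster volume heal gv0 info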
Back to replica 3 w/o arbiter. Two fio jobs running (direct=1 and direct=0), rebooting one node... and the VM dmesg looks like:

[ 483.862664] blk_update_request: I/O error, dev vda, sector 23125016
[ 483.898034] blk_update_request: I/O error, dev vda, sector 2161832
[ 483.901103] blk_update_request: I/O error, dev vda, sector 2161832
[ 483.904045] Aborting journal on device vda1-8.
[ 483.906959] blk_update_request: I/O error, dev vda, sector 2099200
[ 483.908306] blk_update_request: I/O error, dev vda, sector 2099200
[ 483.909585] Buffer I/O error on dev vda1, logical block 262144, lost sync page write
[ 483.911121] blk_update_request: I/O error, dev vda, sector 2048
[ 483.912192] blk_update_request: I/O error, dev vda, sector 2048
[ 483.913221] Buffer I/O error on dev vda1, logical block 0, lost sync page write
[ 483.914546] EXT4-fs error (device vda1): ext4_journal_check_start:56: Detected aborted journal
[ 483.916230] EXT4-fs (vda1): Remounting filesystem read-only
[ 483.917231] EXT4-fs (vda1): previous I/O error to superblock detected
[ 483.917353] JBD2: Error -5 detected when updating journal superblock for vda1-8.
[ 483.921106] blk_update_request: I/O error, dev vda, sector 2048
[ 483.922147] blk_update_request: I/O error, dev vda, sector 2048
[ 483.923107] Buffer I/O error on dev vda1, logical block 0, lost sync page write

The root fs is read-only even with the 1s ping-timeout...

I really hope I have been an idiot for almost a year now and someone shows me what I am doing completely wrong, because I dream about joining the hordes of fellow colleagues who store multiple VMs in gluster and have never had a problem with it. I also suspect the CentOS libvirt version to be the cause.

-ps
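(For anyone who wants to reproduce: the two jobs running inside the guest are roughly equivalent to the following - file names, sizes and runtimes are only examples, not the exact job files:)

    # buffered and O_DIRECT random-write jobs, run in parallel inside the VM
    fio --name=buffered --filename=/root/fio-buffered.dat --size=4g \
        --rw=randwrite --bs=4k --direct=0 --ioengine=libaio --iodepth=16 \
        --runtime=300 --time_based &
    fio --name=direct --filename=/root/fio-direct.dat --size=4g \
        --rw=randwrite --bs=4k --direct=1 --ioengine=libaio --iodepth=16 \
        --runtime=300 --time_based &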
lemonnierk at ulrar.net
2017-Sep-08 09:42 UTC
[Gluster-users] GlusterFS as virtual machine storage
Oh, you really don't want to go below 30s, I was told. I'm using 30 seconds for the timeout, and indeed when a node goes down the VMs freeze for 30 seconds, but I've never seen them go read-only because of that.

I _only_ use virtio though, maybe it's that. What are you using?
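(For comparison, the conservative setting and a quick way to check what the guest disks are using would be something like this - the volume and domain names are placeholders:)

    # the 30-second client ping timeout mentioned above (volume name is an example)
    gluster volume set gv0 network.ping-timeout 30
    # check whether a guest uses virtio and a gfapi-backed disk (domain name is an example)
    virsh dumpxml myvm | grep -E "bus='virtio'|protocol='gluster'"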