On 30-08-2017 17:07, Ivan Rossi wrote:
> There has been a bug associated with sharding that led to VM corruption
> that has been around for a long time (difficult to reproduce, I
> understood). I have not seen reports on that for some time after the
> last fix, so hopefully now VM hosting is stable.

Mmmm... this is precisely the kind of bug that scares me... data corruption :|
Any more information on what causes it and how to resolve it? Even if it is a
solved bug in newer Gluster releases, knowledge of how to treat it would be
valuable.

Thanks.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8
lemonnierk at ulrar.net
2017-Sep-03 22:25 UTC
[Gluster-users] GlusterFS as virtual machine storage
On Sun, Sep 03, 2017 at 10:21:33PM +0200, Gionatan Danti wrote:
> On 30-08-2017 17:07, Ivan Rossi wrote:
> > There has been a bug associated with sharding that led to VM corruption
> > that has been around for a long time (difficult to reproduce, I
> > understood). I have not seen reports on that for some time after the
> > last fix, so hopefully now VM hosting is stable.
>
> Mmmm... this is precisely the kind of bug that scares me... data
> corruption :|
> Any more information on what causes it and how to resolve it? Even if it
> is a solved bug in newer Gluster releases, knowledge of how to treat it
> would be valuable.
>

I don't have a solution; instead of growing my volumes I just create new
ones. I couldn't tell you if it's solved in recent releases, I never had the
courage to try it out :) It's a bit hard to trigger too, so having it work
once might not be enough.
Hi all,
I promised to do some testing and I finally found some time and
infrastructure.
So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created a
replicated volume with arbiter (2+1) and a VM on KVM (via OpenStack)
with its disk accessible through gfapi. The volume option group is set to
virt (gluster volume set gv_openstack_1 group virt). The VM runs a current
(all packages updated) Ubuntu Xenial.
I set up the following fio job:
[job1]
ioengine=libaio
size=1g
loops=16
bs=512k
direct=1
filename=/tmp/fio.data2
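To make the load produced by this job concrete, here is a minimal Python sketch that reads the job definition and works out how many I/O requests it issues. It assumes fio's default 1024-based size suffixes and fio's default of sequential reads when no rw= line is present; the helper names parse_size and summarize are made up for this illustration.

```python
import configparser

# The job file from the message, embedded as a string for illustration.
JOB = """
[job1]
ioengine=libaio
size=1g
loops=16
bs=512k
direct=1
filename=/tmp/fio.data2
"""

# Assumption: fio's default 1024-based suffixes (kb_base=1024).
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a value such as '512k' or '1g' to a byte count."""
    text = text.strip().lower()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def summarize(job_text, section="job1"):
    """Return request counts and total volume for one job section."""
    parser = configparser.ConfigParser()
    parser.read_string(job_text)
    job = parser[section]
    size = parse_size(job["size"])
    block = parse_size(job["bs"])
    loops = int(job["loops"])
    return {
        "ios_per_loop": size // block,
        "total_ios": (size // block) * loops,
        "total_bytes": size * loops,
        # No rw= line in the job, so fall back to fio's default of "read".
        "rw": job.get("rw", "read"),
    }


summary = summarize(JOB)
print(summary)
```

Under those assumptions the job issues 2048 requests of 512 KiB per pass and 16 GiB in total, all with direct=1, so the page cache in the guest does not hide a stall of the backing storage.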
When I run fio fio.job and reboot one of the data nodes, the I/O statistics
reported by fio drop to 0KB/0KB and 0 IOPS. After a while, the root
filesystem gets remounted read-only.
If you care about infrastructure, setup details etc., do not hesitate to ask.
Gluster info on volume:
Volume Name: gv_openstack_1
Type: Replicate
Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gfs-2.san:/export/gfs/gv_1
Brick2: gfs-3.san:/export/gfs/gv_1
Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: enable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
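When comparing setups, it can help to pull the quorum and sharding settings out of this listing mechanically. The following is a rough Python sketch that parses the "Options Reconfigured" section as pasted above; the function name parse_reconfigured and the WATCHED list are illustrative choices, not part of any Gluster tooling.

```python
# The "Options Reconfigured" section exactly as pasted in the message.
VOLUME_INFO = """\
Options Reconfigured:
nfs.disable: on
transport.address-family: inet
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
performance.low-prio-threads: 32
network.remote-dio: enable
cluster.eager-lock: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
cluster.data-self-heal-algorithm: full
cluster.locking-scheme: granular
cluster.shd-max-threads: 8
cluster.shd-wait-qlength: 10000
features.shard: on
user.cifs: off
"""


def parse_reconfigured(text):
    """Collect 'key: value' lines that follow 'Options Reconfigured:'."""
    options = {}
    in_options = False
    for line in text.splitlines():
        line = line.strip()
        if line == "Options Reconfigured:":
            in_options = True
            continue
        if in_options and ": " in line:
            key, value = line.split(": ", 1)
            options[key] = value
    return options


# Hypothetical selection of options worth checking for VM workloads.
WATCHED = [
    "cluster.quorum-type",
    "cluster.server-quorum-type",
    "cluster.server-quorum-ratio",
    "features.shard",
]

options = parse_reconfigured(VOLUME_INFO)
report = {name: options.get(name, "(not listed)") for name in WATCHED}
for name, value in report.items():
    print(f"{name}: {value}")
```

For the listing above this shows that client and server quorum are enabled and sharding is on, while cluster.server-quorum-ratio does not appear among the reconfigured options.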
Partial KVM XML dump:
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='gluster' name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
    <host name='10.0.1.201' port='24007'/>
  </source>
  <backingStore/>
  <target dev='vda' bus='virtio'/>
  <serial>77ebfd13-6a92-4f38-b036-e9e55d752e1e</serial>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
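As a small aid for reading such a dump, the Python sketch below extracts the Gluster volume, image name and listed hosts from the disk definition. It only uses the standard library XML parser on the fragment pasted above; the function name gluster_endpoints is invented for this example.

```python
import xml.etree.ElementTree as ET

# The disk definition from the message, with wrapped lines rejoined.
DISK_XML = """
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='gluster' name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
    <host name='10.0.1.201' port='24007'/>
  </source>
  <backingStore/>
  <target dev='vda' bus='virtio'/>
  <serial>77ebfd13-6a92-4f38-b036-e9e55d752e1e</serial>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
"""


def gluster_endpoints(disk_xml):
    """Return volume, image and host list for a gluster-backed disk."""
    disk = ET.fromstring(disk_xml.strip())
    source = disk.find("source")
    if source is None or source.get("protocol") != "gluster":
        return None
    # The name attribute has the form "<volume>/<image path>".
    volume, _, image = source.get("name").partition("/")
    hosts = [(h.get("name"), int(h.get("port"))) for h in source.findall("host")]
    return {"volume": volume, "image": image, "hosts": hosts}


endpoints = gluster_endpoints(DISK_XML)
print(endpoints)
```

For this dump the result contains a single host entry, 10.0.1.201 on port 24007, for the volume gv_openstack_1.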
Networking is LACP on the data nodes, a stack of Juniper EX4550s (10Gbps
SFP+), a separate VLAN for Gluster traffic, and SSD-only storage on all
Gluster nodes (including the arbiter).
I would really love to know what I am doing wrong, because this has been my
experience with Gluster for a long time and it is a reason I would not
recommend it as a VM storage backend in production environments where you
cannot start/stop VMs on your own (e.g. providing private clouds for
customers).
-ps
On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti <g.danti at assyoma.it> wrote:
> On 30-08-2017 17:07, Ivan Rossi wrote:
>>
>> There has been a bug associated with sharding that led to VM corruption
>> that has been around for a long time (difficult to reproduce I
>> understood). I have not seen reports on that for some time after the
>> last fix, so hopefully now VM hosting is stable.
>
>
> Mmmm... this is precisely the kind of bug that scares me... data corruption
> :|
> Any more information on what causes it and how to resolve? Even if in newer
> Gluster releases it is a solved bug, knowledge on how to treat it would be
> valuable.
>
>
> Thanks.
>
> --
> Danti Gionatan
> Supporto Tecnico
> Assyoma S.r.l. - www.assyoma.it
> email: g.danti at assyoma.it - info at assyoma.it
> GPG public key ID: FF5F32A8
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users
You need to set cluster.server-quorum-ratio to 51%.

On 6 September 2017 at 10:12, Pavel Szalbot <pavel.szalbot at gmail.com> wrote:
> [...]
> When I run fio fio.job and reboot one of the data nodes, IO statistics
> reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
> filesystem gets remounted as read-only.
> [...]
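To see what a 51% server quorum ratio would mean for the three-server pool described in this thread, here is a small Python sketch of the arithmetic. It rests on a simplifying assumption that is not taken from the thread: quorum is treated as met while the share of reachable servers is at least the configured ratio. The function name server_quorum_met is made up for the illustration. The option is normally applied pool-wide, along the lines of gluster volume set all cluster.server-quorum-ratio 51%.

```python
def server_quorum_met(active, total, ratio_percent):
    """Assumed model: quorum holds while active/total >= ratio_percent/100."""
    # Integer arithmetic avoids floating-point rounding at the boundary.
    return active * 100 >= ratio_percent * total


TOTAL_SERVERS = 3   # gfs-2, gfs-3 and the arbiter host from the volume info
RATIO_PERCENT = 51  # value suggested in the reply

# Evaluate the model for every possible number of reachable servers.
table = {
    active: server_quorum_met(active, TOTAL_SERVERS, RATIO_PERCENT)
    for active in range(TOTAL_SERVERS + 1)
}

for active, ok in table.items():
    state = "quorum met" if ok else "quorum lost"
    print(f"{active}/{TOTAL_SERVERS} servers up: {state}")
```

Under this model, rebooting a single node leaves 2 of 3 servers up, which still satisfies a 51% ratio; losing two nodes does not.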