thr3ads.net - Gluster users - [Gluster-users] GlusterFS as virtual machine storage [Sep 2017]


Alastair Neil

2017-Sep-06 19:59 UTC

[Gluster-users] GlusterFS as virtual machine storage

You need to set

cluster.server-quorum-ratio 51%
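
As far as I know this option is pool-wide and is usually applied with `gluster volume set all cluster.server-quorum-ratio 51%`. The arithmetic behind the 51% value can be sketched as below. This is only a toy model: the function name is hypothetical, and the rule "quorum holds when the share of reachable servers is at least the ratio" is an assumption, not glusterd's actual code.

```python
def server_quorum_met(servers_up: int, servers_total: int,
                      ratio_percent: float = 51.0) -> bool:
    """Toy check (assumed rule): quorum holds when the share of
    reachable servers is at least ratio_percent."""
    if servers_total <= 0:
        raise ValueError("servers_total must be positive")
    # Compare without division to avoid floating-point rounding surprises.
    return servers_up * 100 >= ratio_percent * servers_total


# Three servers, as in the setup described in this thread.
for up in range(4):
    print(f"{up}/3 servers up -> quorum met: {server_quorum_met(up, 3)}")
```

Under this assumption, a three-server pool keeps server quorum with one node rebooting (2 of 3 is about 67%), and a two-server pool would lose it as soon as either node goes down (1 of 2 is 50%).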

On 6 September 2017 at 10:12, Pavel Szalbot <pavel.szalbot at gmail.com>
wrote:
> Hi all,
>
> I promised to do some testing and I finally found some time and
> infrastructure.
>
> So I have 3 servers with Gluster 3.10.5 on CentOS 7. I created a
> replicated volume with arbiter (2+1) and a VM on KVM (via OpenStack)
> with its disk accessible through gfapi. The volume's option group is set
> to virt (gluster volume set gv_openstack_1 group virt). The VM runs
> current (all packages updated) Ubuntu Xenial.
>
> I set up the following fio job:
>
> [job1]
> ioengine=libaio
> size=1g
> loops=16
> bs=512k
> direct=1
> filename=/tmp/fio.data2
>
> When I run fio fio.job and reboot one of the data nodes, IO statistics
> reported by fio drop to 0KB/0KB and 0 IOPS. After a while, root
> filesystem gets remounted as read-only.
>
> If you care about infrastructure, setup details etc., do not hesitate to
> ask.
>
> Gluster info on volume:
>
> Volume Name: gv_openstack_1
> Type: Replicate
> Volume ID: 2425ae63-3765-4b5e-915b-e132e0d3fff1
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 1 x (2 + 1) = 3
> Transport-type: tcp
> Bricks:
> Brick1: gfs-2.san:/export/gfs/gv_1
> Brick2: gfs-3.san:/export/gfs/gv_1
> Brick3: docker3.san:/export/gfs/gv_1 (arbiter)
> Options Reconfigured:
> nfs.disable: on
> transport.address-family: inet
> performance.quick-read: off
> performance.read-ahead: off
> performance.io-cache: off
> performance.stat-prefetch: off
> performance.low-prio-threads: 32
> network.remote-dio: enable
> cluster.eager-lock: enable
> cluster.quorum-type: auto
> cluster.server-quorum-type: server
> cluster.data-self-heal-algorithm: full
> cluster.locking-scheme: granular
> cluster.shd-max-threads: 8
> cluster.shd-wait-qlength: 10000
> features.shard: on
> user.cifs: off
>
> Partial KVM XML dump:
>
>     <disk type='network' device='disk'>
>       <driver name='qemu' type='raw' cache='none'/>
>       <source protocol='gluster' name='gv_openstack_1/volume-77ebfd13-6a92-4f38-b036-e9e55d752e1e'>
>         <host name='10.0.1.201' port='24007'/>
>       </source>
>       <backingStore/>
>       <target dev='vda' bus='virtio'/>
>       <serial>77ebfd13-6a92-4f38-b036-e9e55d752e1e</serial>
>       <alias name='virtio-disk0'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
>     </disk>
>
> Networking is LACP on the data nodes, a stack of Juniper EX4550s (10Gbps
> SFP+), a separate VLAN for Gluster traffic, and SSD-only storage on all
> Gluster nodes (including the arbiter).
>
> I would really love to know what I am doing wrong, because this has been
> my experience with Gluster for a long time and is a reason I would not
> recommend it as a VM storage backend in a production environment where
> you cannot start/stop VMs on your own (e.g. providing private clouds for
> customers).
> -ps
>
>
> On Sun, Sep 3, 2017 at 10:21 PM, Gionatan Danti <g.danti at assyoma.it>
> wrote:
> > On 30-08-2017 17:07 Ivan Rossi wrote:
> >>
> >> There has been a bug associated with sharding that led to VM corruption
> >> and that was around for a long time (difficult to reproduce, as I
> >> understood). I have not seen reports of it for some time after the
> >> last fix, so hopefully VM hosting is now stable.
> >
> >
> > Mmmm... this is precisely the kind of bug that scares me... data
> > corruption :|
> > Any more information on what causes it and how to resolve it? Even if it
> > is a solved bug in newer Gluster releases, knowledge on how to treat it
> > would be valuable.
> >
> >
> > Thanks.
> >
> > --
> > Danti Gionatan
> > Supporto Tecnico
> > Assyoma S.r.l. - www.assyoma.it
> > email: g.danti at assyoma.it - info at assyoma.it
> > GPG public key ID: FF5F32A8
> > _______________________________________________
> > Gluster-users mailing list
> > Gluster-users at gluster.org
> > http://lists.gluster.org/mailman/listinfo/gluster-users

lemonnierk at ulrar.net

2017-Sep-06 20:06 UTC


Mh, I never had to do that and I never had that problem. Is that an
arbiter-specific thing? With replica 3 it just works.

On Wed, Sep 06, 2017 at 03:59:14PM -0400, Alastair Neil wrote:
> you need to set
>
> cluster.server-quorum-ratio             51%

Alastair Neil

2017-Sep-07 15:52 UTC


*shrug* I don't use an arbiter for VM workloads, just straight replica 3.
There are some gotchas with using an arbiter for VM workloads. If
quorum-type is auto and a brick that is not the arbiter drops out, then if
the brick that is still up is dirty as far as the arbiter is concerned
(i.e. the only good copy is on the down brick), you will get ENOTCONN and
your VMs will halt on IO.
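
The gotcha described above can be illustrated with a small toy model. It is a sketch of the scenario only, not Gluster's AFR logic: the `Brick` class, the brick names and the `vm_io_errno` function are hypothetical, and treating loss of quorum as ENOTCONN as well is an assumption made to keep the example small.

```python
import errno
from dataclasses import dataclass


@dataclass
class Brick:
    name: str               # hypothetical label, not a real brick path
    up: bool
    good: bool = False      # data brick holds an up-to-date copy
    arbiter: bool = False   # arbiter stores metadata only, never file data


def vm_io_errno(bricks):
    """Return 0 if I/O can proceed, else errno.ENOTCONN (toy model only)."""
    reachable = [b for b in bricks if b.up]
    # quorum-type auto, modelled as: more than half of the bricks reachable.
    # Assumption: loss of quorum is reported as ENOTCONN here too.
    if 2 * len(reachable) <= len(bricks):
        return errno.ENOTCONN
    # Quorum alone is not enough: a reachable data brick must be good.
    if any(b.good and not b.arbiter for b in reachable):
        return 0
    return errno.ENOTCONN


scenario = [
    Brick("data-1", up=True, good=False),    # up, but blamed as dirty
    Brick("data-2", up=False, good=True),    # only good copy, now down
    Brick("arbiter", up=True, arbiter=True),
]
print(errno.errorcode.get(vm_io_errno(scenario), "OK"))
```

With straight replica 3 the third brick would also hold data, so a good copy would be more likely to remain reachable when one brick drops out. That is presumably why the problem shows up with an arbiter.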

On 6 September 2017 at 16:06, <lemonnierk at ulrar.net> wrote:
> Mh, I never had to do that and I never had that problem. Is that an
> arbiter specific thing ? With replica 3 it just works.
