Pavel.

Is there a difference between the native client (fuse) and libgfapi
in regards to the crashing/read-only behaviour?

We use Rep2 + Arb and can shut down a node cleanly, without issue on
our VMs. We do it all the time for upgrades and maintenance.

However, we are still on the native client as we haven't had time to
work on libgfapi yet. Maybe that is more tolerant.

We have Linux VMs, mostly with XFS filesystems.

During the downtime, the VMs continue to run at normal speed.

In this case we migrated the VMs to data node 2 (c2g.gluster) and
shut down c1g.gluster to do some upgrades.

# gluster peer status
Number of Peers: 2

Hostname: c1g.gluster
Uuid: 91be2005-30e6-462b-a66e-773913cacab6
State: Peer in Cluster (Disconnected)

Hostname: arb-c2.gluster
Uuid: 20862755-e54e-4b79-96a8-59e78c6a6a2e
State: Peer in Cluster (Connected)

# gluster volume status
Status of volume: brick1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick c2g.gluster:/GLUSTER/brick1           49152     0          Y       5194
Brick arb-c2.gluster:/GLUSTER/brick1        49152     0          Y       3647
Self-heal Daemon on localhost               N/A       N/A        Y       5214
Self-heal Daemon on arb-c2.gluster          N/A       N/A        Y       3667

Task Status of Volume brick1
------------------------------------------------------------------------------
There are no active volume tasks

When we bring the c1g node back, we do see a "pause" in the VMs as
the shards heal. By pause I mean a terminal session gets spongy, but
that passes pretty quickly.

Also, are your VMs mounted in libvirt with caching? We always use
cache='none' so we can migrate around easily.

Finally, you seem to be using oVirt/RHEV. Is it possible that your
platform is triggering a protective response on the VMs (by
suspending them)?

-wk

On 9/8/2017 5:13 AM, Gandalf Corvotempesta wrote:
> 2017-09-08 14:11 GMT+02:00 Pavel Szalbot <pavel.szalbot at gmail.com>:
>> Gandalf, SIGKILL (killall -9 glusterfsd) did not stop I/O, even
>> after a few minutes. SIGTERM on the other hand causes a crash, but
>> this time it is not a read-only remount, but around 10 IOPS tops
>> and 2 IOPS on average.
>> -ps
> So, it seems to be resilient to server crashes but not to server
> shutdowns :)
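A replica-2-plus-arbiter volume like the one shown in the status
output above is created by passing an arbiter count to gluster volume
create. The sketch below reuses the host and brick names from WK's
output, but the exact command used for this cluster is an assumption:

    # Two data bricks plus one arbiter brick: the arbiter stores only
    # file metadata, providing quorum without a third full data copy.
    gluster volume create brick1 replica 3 arbiter 1 \
        c1g.gluster:/GLUSTER/brick1 \
        c2g.gluster:/GLUSTER/brick1 \
        arb-c2.gluster:/GLUSTER/brick1
    gluster volume start brick1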
Hi,

On Sat, Sep 9, 2017 at 2:35 AM, WK <wkmail at bneit.com> wrote:
> Pavel.
>
> Is there a difference between the native client (fuse) and libgfapi
> in regards to the crashing/read-only behaviour?

I switched to FUSE now and the VM crashed (read-only remount)
immediately after one node started rebooting.

I then tried to mount the same volume with mount.glusterfs on a
different server (not a VM), running Ubuntu Xenial and gluster client
3.10.5:

mount -t glusterfs -o backupvolfile-server=10.0.1.202 \
    10.0.1.201:/gv_openstack_1 /mnt/gv_openstack_1/

I ran the fio job I described earlier. As soon as I ran killall
glusterfsd, fio reported:

fio: io_u error on file /mnt/gv_openstack_1/fio.data: Transport
endpoint is not connected: read offset=7022575616, buflen=262144
fio: pid=7205, err=107/file:io_u.c:1582, func=io_u error,
error=Transport endpoint is not connected

And crashed. I still cannot believe I am the only one experiencing
these problems, which tells me there must be some problem in my
setup. However, I have never experienced any crashes while all nodes
were up. Ever. I suspected the disks and the network as culprits, but
we run SMART tests frequently (short and long), the bricks are on
RAID10 (6x SSD), and the switches are Juniper EX4550s (shallow packet
buffer, but no drops in the statistics), pretty much dedicated to
Gluster; we have run many VMs on it and stored and heavily used other
data there. And neither the gluster logs nor the system logs provide
any hint of a HW/network failure.

> We use Rep2 + Arb and can shut down a node cleanly, without issue
> on our VMs. We do it all the time for upgrades and maintenance.
>
> However, we are still on the native client as we haven't had time
> to work on libgfapi yet. Maybe that is more tolerant.
>
> We have Linux VMs, mostly with XFS filesystems.

We use whatever the official cloud (OpenStack) images provide; all
tests I describe are on Ubuntu Xenial VMs with ext4.

> During the downtime, the VMs continue to run at normal speed.
>
> In this case we migrated the VMs to data node 2 (c2g.gluster) and
> shut down c1g.gluster to do some upgrades.
>
> # gluster peer status
> Number of Peers: 2
>
> Hostname: c1g.gluster
> Uuid: 91be2005-30e6-462b-a66e-773913cacab6
> State: Peer in Cluster (Disconnected)
>
> Hostname: arb-c2.gluster
> Uuid: 20862755-e54e-4b79-96a8-59e78c6a6a2e
> State: Peer in Cluster (Connected)
>
> # gluster volume status
> Status of volume: brick1
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick c2g.gluster:/GLUSTER/brick1           49152     0          Y       5194
> Brick arb-c2.gluster:/GLUSTER/brick1        49152     0          Y       3647
> Self-heal Daemon on localhost               N/A       N/A        Y       5214
> Self-heal Daemon on arb-c2.gluster          N/A       N/A        Y       3667
>
> Task Status of Volume brick1
> ------------------------------------------------------------------------------
> There are no active volume tasks
>
> When we bring the c1g node back, we do see a "pause" in the VMs as
> the shards heal. By pause I mean a terminal session gets spongy,
> but that passes pretty quickly.

Hmm, do you see any errors in the VM's dmesg? Or any other reason for
the "sponginess"?

> Also, are your VMs mounted in libvirt with caching?
> We always use cache='none' so we can migrate around easily.

No cache, virtio:

<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source protocol='gluster'
          name='gv_openstack_1/volume-3a7eaf5a-8348-4f01-b59f-f28cd8cea771'>
    <host name='10.0.1.201' port='24007'/>
  </source>
  <backingStore/>
  <target dev='vda' bus='virtio'/>
  <serial>3a7eaf5a-8348-4f01-b59f-f28cd8cea771</serial>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04'
           function='0x0'/>
</disk>

> Finally, you seem to be using oVirt/RHEV. Is it possible that your
> platform is triggering a protective response on the VMs (by
> suspending them)?

No, this is an OpenStack environment; I am not aware of any
protective mechanisms.

-ps
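The fio job itself is never quoted in the thread. A minimal job
consistent with the error output above (a 262144-byte read at a ~7 GB
offset suggests a large file and a 256k block size) might look like
the following sketch; every parameter here is an illustrative
assumption, not Pavel's actual job file:

    [global]
    directory=/mnt/gv_openstack_1
    filename=fio.data   ; the file named in the error message
    size=8g             ; large enough for the ~7 GB offset seen above
    bs=256k             ; 262144 bytes, the buflen of the failed read
    rw=randrw
    direct=1
    ioengine=libaio
    iodepth=4
    runtime=600
    time_based=1

    [stress]            ; single job, inherits everything from [global]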
Sorry, I had not started glusterfsd on the node I shut down
yesterday, and now I killed another one during the FUSE test, so it
had to crash immediately (only one of the three nodes was actually
up). This situation definitely occurred for the first time (yesterday
only one node had been killed).

Using FUSE seems to be OK with replica 3. So this can be gfapi
related, or maybe rather libvirt related.

I tried ioengine=gfapi with fio and the job survived a reboot.

-ps

On Sat, Sep 9, 2017 at 8:05 AM, Pavel Szalbot <pavel.szalbot at gmail.com> wrote:
> [...]
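For comparison, a fio job using the libgfapi engine that Pavel
mentions might look like the sketch below. ioengine=gfapi with the
volume and brick options is fio's GlusterFS engine; all values are
illustrative assumptions rather than Pavel's actual job:

    [global]
    ioengine=gfapi         ; talk to the volume directly, no FUSE mount
    volume=gv_openstack_1  ; Gluster volume name
    brick=10.0.1.201       ; a node that serves the volume
    rw=randrw
    bs=256k
    size=8g
    runtime=600
    time_based=1

    [gfapi-test]
    filename=fio.data      ; path inside the volume, not a local path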
On 9/8/2017 11:05 PM, Pavel Szalbot wrote:
>> When we bring the c1g node back, we do see a "pause" in the VMs as
>> the shards heal. By pause I mean a terminal session gets spongy,
>> but that passes pretty quickly.
>
> Hmm, do you see any errors in the VM's dmesg? Or any other reason
> for the "sponginess"?

No, it is a "load" issue. It has to be the shards being collected and
written to the now out-of-date copy that just came back online. It
feels the same as when a VM is hit with a dictionary attack or when
you run a huge copy operation on it.
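The heal backlog behind this kind of load can be watched from any
node with the standard heal commands, shown here against the volume
name used earlier in the thread:

    # Number of entries still waiting to be healed, per brick
    gluster volume heal brick1 statistics heal-count

    # List the files/shards currently queued for self-heal
    gluster volume heal brick1 info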