On 11/05/2017 14:09, Niels de Vos wrote:
> On Thu, May 11, 2017 at 12:35:42PM +0530, Krutika Dhananjay wrote:
>> Niels,
>>
>> Alessandro's configuration does not have shard enabled. So it has
>> definitely not got anything to do with shard not supporting the seek fop.
> Yes, but in case sharding had been enabled, the seek FOP would be
> handled correctly (detected as not supported at all).
>
> I'm still not sure how arbiter prevents doing shards though. We normally
> advise to use sharding *and* (optional) arbiter for VM workloads;
> arbiter without sharding has not been tested much. In addition, the seek
> functionality is only available in recent kernels, so there has been
> little testing on CentOS or similar enterprise Linux distributions.

Where is it stated that arbiter should be used with sharding? Or that
arbiter functionality without sharding is still in a "testing" phase?
I thought that having 3 full replicas on a 3-node cluster would have been
a waste of space. (I only need to survive losing 1 host at a time, and
that's fine.)

Anyway, this had already happened with the same VM before there was any
arbiter, and I thought it was for some strange reason a "quorum" issue
which made the file unavailable in gluster, though there were no clues in
the logs. So I added the arbiter brick, but it happened again last week.

The first VM I reported going down was created on a volume with arbiter
enabled from the start, so I doubt it has anything to do with arbiter.

Could it be a load problem? Though the hosts are really not being used
that much.

Anyway, this is a brief description of my setup:

3 Dell servers with RAID 10 SAS disks.
Each server has 2 bonded 1 Gbps NICs dedicated to gluster (plus 2
dedicated to the proxmox cluster and 2 for communication with the hosts
on the LAN), each bond on its own VLAN in the switch.
Jumbo frames are enabled on the NICs and switches.

Each server is a proxmox host with gluster installed and configured as
both server and client.

The RAID holds a thin-provisioned LVM pool which is divided into 3 bricks
(2 big ones for the data and 1 small one for the arbiter).
Each thin LV is formatted as XFS and mounted as a brick.
There are 3 volumes configured as replica 3 with arbiter (so 2 bricks
really holding the data). Volumes are:

datastore1: data on srv1 and srv2, arbiter on srv3
datastore2: data on srv2 and srv3, arbiter on srv1
datastore3: data on srv1 and srv3, arbiter on srv2

On each datastore there is basically one main VM (plus some others which
are not so important), so 3 VMs matter most.

datastore1 was converted from replica 2 to replica 3 with arbiter; the
other 2 were created as described above.

The VM on the first datastore crashed several times (even when there was
no arbiter, which made me think there was a split brain that gluster could
not handle).

Last week the 2nd VM (on datastore2) also crashed, and that's when I
started this thread (before that, since no special errors were logged, I
thought the cause could be something inside the VM).

So far the 3rd VM has never crashed.

Any help on this would be really appreciated. I know the problem could
also be somewhere else, but I have other setups without gluster which
simply work. That's why I want to start the VM under gdb, to check next
time why the kvm process shuts down.

Alessandro
On Thu, May 11, 2017 at 7:19 PM, Alessandro Briosi <ab1 at metalit.com> wrote:

> On 11/05/2017 14:09, Niels de Vos wrote:
>> On Thu, May 11, 2017 at 12:35:42PM +0530, Krutika Dhananjay wrote:
>>> Niels,
>>>
>>> Alessandro's configuration does not have shard enabled. So it has
>>> definitely not got anything to do with shard not supporting the seek fop.
>> Yes, but in case sharding had been enabled, the seek FOP would be
>> handled correctly (detected as not supported at all).
>>
>> I'm still not sure how arbiter prevents doing shards though. We normally
>> advise to use sharding *and* (optional) arbiter for VM workloads;
>> arbiter without sharding has not been tested much. In addition, the seek
>> functionality is only available in recent kernels, so there has been
>> little testing on CentOS or similar enterprise Linux distributions.
>
> Where is it stated that arbiter should be used with sharding?

This information is inaccurate. arbiter can be used independent of
sharding.

> Or that arbiter functionality without sharding is still in a "testing"
> phase? I thought that having 3 full replicas on a 3-node cluster would
> have been a waste of space. (I only need to survive losing 1 host at a
> time, and that's fine.)
>
> [...]
>
> Alessandro
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://lists.gluster.org/mailman/listinfo/gluster-users

--
Pranith
On Thu, May 11, 2017 at 03:49:27PM +0200, Alessandro Briosi wrote:
> On 11/05/2017 14:09, Niels de Vos wrote:
>> On Thu, May 11, 2017 at 12:35:42PM +0530, Krutika Dhananjay wrote:
>>> Niels,
>>>
>>> Alessandro's configuration does not have shard enabled. So it has
>>> definitely not got anything to do with shard not supporting the seek fop.
>> Yes, but in case sharding had been enabled, the seek FOP would be
>> handled correctly (detected as not supported at all).
>>
>> I'm still not sure how arbiter prevents doing shards though. We normally
>> advise to use sharding *and* (optional) arbiter for VM workloads;
>> arbiter without sharding has not been tested much. In addition, the seek
>> functionality is only available in recent kernels, so there has been
>> little testing on CentOS or similar enterprise Linux distributions.
>
> Where is it stated that arbiter should be used with sharding?
> Or that arbiter functionality without sharding is still in a "testing"
> phase? I thought that having 3 full replicas on a 3-node cluster would
> have been a waste of space. (I only need to survive losing 1 host at a
> time, and that's fine.)

There is no "arbiter should be used with sharding"; our recommendation is
to use sharding for VM workloads, with an optional arbiter. But we still
expect VMs on non-sharded volumes to work just fine, with or without
arbiter.

> Anyway, this had already happened with the same VM before there was any
> arbiter, and I thought it was for some strange reason a "quorum" issue
> which made the file unavailable in gluster, though there were no clues
> in the logs. So I added the arbiter brick, but it happened again last
> week.

If it is always the same VM, I wonder if there could be a small filesystem
corruption in that VM? Were there any actions done on the storage of that
VM, like resizing the block device (VM image) or something like that?

Systems can sometimes try to access data outside of the block device when
the device was resized but the filesystem on it was not. This would
'trick' the filesystem into thinking it has more space to access than the
block device provides. If filesystem access in the VM goes past the end of
the block device, and this gets through to Gluster, which then does a seek
with that too-large offset, the log you posted would be the result.

> [...]
>
> Each server is a proxmox host with gluster installed and configured as
> both server and client.

Do you know how proxmox accesses the VM images? Does it use QEMU+gfapi or
is it all over a FUSE mount? New versions of QEMU+gfapi have seek support,
and only new versions of the Linux kernel support seek over FUSE. In order
to track down where the problem is, we need to look into the client (QEMU
or FUSE) that does the seek with an invalid offset.

> [...]
>
> I know the problem could also be somewhere else, but I have other setups
> without gluster which simply work. That's why I want to start the VM
> under gdb, to check next time why the kvm process shuts down.

If the problem in the log from the brick is any clue, I would say that
QEMU aborts when the seek fails. Somehow the seek got executed with a
too-high offset (past the size of the file), and that returned an error.
We'll need to find out what makes QEMU (or FUSE) think the file is larger
than it actually is on the brick.

If you have a way of reproducing it, you could enable more verbose logging
on the client side (the diagnostics.client-log-level volume option), but
if you run many VMs, that may accumulate a lot of logs.

You should probably open a bug so that we have all the troubleshooting and
debugging details in one location. Once we find the problem we can move
the bug to the right component.

https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS

HTH,
Niels
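[Editor's note] The failure mode Niels describes, a seek issued with an
offset past the end of the file, can be reproduced at the syscall level.
The sketch below is not taken from the thread: it uses a plain temporary
file as a stand-in for the VM image, an arbitrary 4096-byte size, and
Linux-specific SEEK_DATA semantics. It shows that seeking for data beyond
the end of the file fails with ENXIO, which is the kind of error the brick
would return for such a request.

```python
import errno
import os
import tempfile

# Plain file standing in for a VM image; 4096 bytes is an arbitrary size.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"\0" * 4096)
    image = f.name

fd = os.open(image, os.O_RDONLY)
err = None
try:
    # The size of the image as the brick actually sees it.
    size = os.lseek(fd, 0, os.SEEK_END)

    # A guest that believes the image is larger than it really is (e.g.
    # the block device was resized but the filesystem was not) would in
    # effect issue a seek like this, past the end of the file:
    try:
        os.lseek(fd, size + 8192, os.SEEK_DATA)
    except OSError as e:
        err = e.errno  # ENXIO: no data at or beyond this offset

    print("size:", size, "error:", errno.errorcode.get(err, err))
finally:
    os.close(fd)
    os.unlink(image)
```

If this hypothesis holds, comparing the image size on the brick with the
virtual disk size the guest reports would be a useful first check before
attaching gdb to the kvm process.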