I'm revisiting Gluster for the purpose of hosting virtual machine images (KVM). I was considering the following configuration:

2 Nodes
- 1 brick per node (replica 2)
- 2 x 1 GbE, LACP bonded
- Bricks hosted on ZFS
- VM images accessed via the block driver (gfapi)

ZFS config:
- RAID 10 (striped mirrors)
- SSD SLOG and L2ARC
- 4 GB RAM
- Compression (lz4)

Does that seem like a sane layout?

Question: with the gfapi driver, does the VM image appear as a file on the host (ZFS) file system?

Background: I currently have our VMs hosted on Ceph using a similar config as above, minus ZFS. I've found that the performance for such a small setup is terrible, the maintenance headache is high, and when a drive drops out the performance gets *really* bad. Last time I checked, Gluster was much slower at healing large files than Ceph; I'm hoping that has improved :)

--
Lindsay
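A rough sketch of how the layout above would be built; hostnames, device, pool and volume names are placeholders rather than anything from the thread, so treat it as an illustration, not a recipe:

    # ZFS: striped mirrors ("RAID 10"), SSD log and cache devices, lz4 compression
    zpool create tank mirror sda sdb mirror sdc sdd log sde cache sdf
    zfs set compression=lz4 tank
    zfs create tank/brick

    # Gluster: one brick per node, two-way replication
    gluster volume create vmstore replica 2 node1:/tank/brick node2:/tank/brick
    gluster volume start vmstore

    # QEMU attaching an image over libgfapi; the image is still stored as an
    # ordinary file under /tank/brick on each node holding a replica, it is
    # just accessed without going through a FUSE mount
    qemu-system-x86_64 ... -drive file=gluster://node1/vmstore/vm1.qcow2,format=qcow2,if=virtio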
Hello Lindsay,

From personal experience: a two-node volume can get you into trouble when one of the nodes goes down unexpectedly/crashes. At the very least, you should have an arbiter volume (introduced in Gluster 3.7) on a separate physical node.

We are running oVirt VMs on top of a two-node Gluster cluster, and a few months ago I ended up transferring several terabytes from one node to the other because it was the fastest way to resolve the split-brain issues after a crash of Gluster on one of the nodes. In effect, the second node did not give us any redundancy, because the VM images in split-brain would not be available for writes.

I don't think 4 GB is enough RAM, especially if you have a large L2ARC: every L2ARC entry needs an entry in the ARC as well, which is always in RAM. RAM is relatively cheap nowadays, so go for at least 16 or 32 GB.

You should also count the number of spindles you have and make sure it doesn't exceed the number of VMs you're running by much, to get decent disk IO performance.

--
Tiemen Ruiten
Systems Engineer
R&D Media
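For reference, the arbiter configuration described above adds a third brick that holds only file names and metadata, no data, and is created roughly like this (host and volume names are placeholders):

    gluster volume create vmstore replica 3 arbiter 1 \
        node1:/tank/brick node2:/tank/brick node3:/arbiter/brick
    gluster volume start vmstore

    # Listing files currently in split-brain on an existing volume
    gluster volume heal vmstore info split-brain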
Hi,

I'm actually doing this on a pretty similar system. I'm using oVirt with KVM, RAIDZ on 4 disks with lz4 and also dedup. My brick nodes are also oVirt nodes (VM hosts) and have 32/48 GB RAM; ZFS may use up to 18 GB of it, so a little more than your setup. oVirt needs rep=3; I have 3 bricks per node.

I have no complaints about speed. oVirt is using thin-provisioned VM disks as one big file per disk, and self-heal operations do need their time, but with little impact as far as I have seen. Performance is all about network speed, I would say. Running VMs directly on the bricks may improve your VMs by using the L2ARC...

Frank
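A ZFS layer like the one Frank describes might look roughly like this on ZFS on Linux; device names and the 18 GB ARC cap are illustrative values, not taken from his setup:

    # RAIDZ across 4 disks, lz4 compression, deduplication enabled
    zpool create tank raidz sda sdb sdc sdd
    zfs set compression=lz4 tank
    zfs set dedup=on tank

    # Cap the ARC at roughly 18 GB (value in bytes), e.g. in /etc/modprobe.d/zfs.conf
    options zfs zfs_arc_max=19327352832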
On 30 September 2015 at 18:36, Tiemen Ruiten <t.ruiten at rdmedia.com> wrote:

> At the very least, you should have an arbiter volume (introduced in
> Gluster 3.7) on a separate physical node.

Running Proxmox (Debian Wheezy), so I'm limited to 3.6; however, I do have a third peer node for voting purposes.

> I don't think 4 GB is enough RAM, especially if you have a large L2ARC

Learned my lesson with earlier ZFS setups :) 1 GB ZIL, 10 GB L2ARC.

> You should also count the number of spindles you have and make sure it
> doesn't exceed the number of VMs you're running by much, to get decent
> disk IO performance.

New one to me - did you mean the reverse? The number of VMs should not exceed the number of spindles?

thanks,

--
Lindsay
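A third peer used purely for quorum voting would be added and put to use along these lines; hostname and volume name are placeholders, and server-side quorum has been available since well before the 3.6 line:

    gluster peer probe node3
    gluster volume set vmstore cluster.server-quorum-type server
    gluster volume set all cluster.server-quorum-ratio 51%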
On 30 September 2015 at 18:50, Frank Rothenstein <f.rothenstein at bodden-kliniken.de> wrote:

> My brick nodes are also oVirt nodes (VM hosts)

I should have said that my brick nodes are also VM nodes (Proxmox). 64 GB RAM, E5-2620 CPU.

> oVirt needs rep=3; I have 3 bricks per node.

Are your bricks separate disks? I assumed it would be better to let ZFS handle multiple disks with striping/caching and just present one brick (zpool dataset) to Gluster.

> Running VMs directly on the bricks may improve your VMs by using the
> L2ARC...

Not sure what you mean by that - you run the VM directly from the brick on ZFS rather than via the Gluster mount? Doesn't that mess with the replication?

thanks,

--
Lindsay
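The two brick layouts being compared here would look roughly like this, with placeholder device, pool and dataset names:

    # One brick per node: ZFS aggregates the disks, Gluster sees a single path
    zpool create tank mirror sda sdb mirror sdc sdd
    zfs create tank/brick

    # One brick per disk: each spindle becomes its own pool/dataset and brick
    zpool create d1 sda
    zpool create d2 sdb
    zfs create d1/brick
    zfs create d2/brick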
On 30 September 2015 at 18:36, Tiemen Ruiten <t.ruiten at rdmedia.com> wrote:

> From personal experience: a two-node volume can get you into trouble when
> one of the nodes goes down unexpectedly/crashes. At the very least, you
> should have an arbiter volume (introduced in Gluster 3.7) on a separate
> physical node.

I've introduced a third node for full replica 3 now. Surprised and pleased that there is no real performance drop.

This is where the docs get a bit frustrating :(
https://gluster.readthedocs.org/en/release-3.7.0/Features/server-quorum/ discusses server quorum and its settings, but doesn't say what the default values are, and there is no way of reading a volume's effective settings in Gluster.

--
Lindsay
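On reading settings: `gluster volume info <vol>` lists only options that have been explicitly reconfigured, so untouched defaults never show up there; if memory serves, later 3.7 releases add a `gluster volume get` command that dumps effective values including defaults. A sketch, with a placeholder volume name:

    gluster volume info vmstore        # shows only options that have been set on the volume
    gluster volume get vmstore all     # 3.7.x and later: effective values, including defaults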