Torbjørn Thorsen
2013-Feb-27 15:14 UTC
[Gluster-users] Performance in VM guests when hosting VM images on Gluster
I'm seeing less-than-stellar performance on my Gluster deployment when hosting VM images on the FUSE mount.
I've seen that this topic has surfaced before, but my googling and perusing of the list archive haven't turned up anything conclusive.

I'm on a 2-node distribute+replicate cluster, and the clients use Gluster via the FUSE mount.

torbjorn at storage01:~$ sudo gluster volume info

Volume Name: gluster0
Type: Distributed-Replicate
Volume ID: 81bbf681-ecdb-4866-9b45-41d5d2df7b35
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: storage01.gluster.trollweb.net:/srv/gluster/brick0
Brick2: storage02.gluster.trollweb.net:/srv/gluster/brick0
Brick3: storage01.gluster.trollweb.net:/srv/gluster/brick1
Brick4: storage02.gluster.trollweb.net:/srv/gluster/brick1

The naive dd case from one client, on the dom0, looks like this:

torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
2097152000 bytes (2.1 GB) copied, 22.9161 s, 91.5 MB/s

The clients see each node on a separate 1Gbps NIC, so this is pretty close to the expected transfer rate.

Writing with the sync flag, from dom0, looks like this:

torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000 oflag=sync
2097152000 bytes (2.1 GB) copied, 51.1271 s, 41.0 MB/s

If we use a file on the Gluster mount as backing for a loop device and do a sync write:

torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
2097152000 bytes (2.1 GB) copied, 56.3729 s, 37.2 MB/s

The Xen instances are managed by Ganeti, using the loopback interface over a file on Gluster.
Inside the Xen instance, the performance is not quite what I was hoping for:

torbjorn at hennec:~$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
2097152000 bytes (2.1 GB) copied, 1267.39 s, 1.7 MB/s

The transfer rate is similar when using the sync or direct flags with dd.

Are these expected performance levels?
A couple of threads [1] talk about performance and seem to indicate that my situation isn't unique.
However, I'm under the impression that others are using a similar setup with much better performance.

[1]:
* http://www.gluster.org/pipermail/gluster-users/2012-January/032369.html
* http://www.gluster.org/pipermail/gluster-users/2012-July/033763.html

--
Best regards,
Torbjørn Thorsen
Developer / operations technician

Trollweb Solutions AS
- Professional Magento Partner
www.trollweb.no

Daytime phone: +47 51215300
Evening/weekend phone: for customers with a service agreement

Visiting address: Luramyrveien 40, 4313 Sandnes
Postal address: Maurholen 57, 4316 Sandnes

Remember that all our standard terms and conditions always apply
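For readers who want to reproduce the loop-device test above, a rough sketch follows; the mount point and backing-file name are illustrative assumptions, not taken from the message:

# mount the volume via FUSE, create a backing file on it, and map it to a loop device
$ sudo mount -t glusterfs storage01.gluster.trollweb.net:/gluster0 /mnt/gluster0
$ sudo dd if=/dev/zero of=/mnt/gluster0/loop-backing.img bs=1024k count=2000
$ sudo losetup /dev/loop1 /mnt/gluster0/loop-backing.img

# sync writes against the loop device, then detach it when finished
$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
$ sudo losetup -d /dev/loop1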
Brian Foster
2013-Feb-27 20:46 UTC
[Gluster-users] Performance in VM guests when hosting VM images on Gluster
On 02/27/2013 10:14 AM, Torbjørn Thorsen wrote:
> I'm seeing less-than-stellar performance on my Gluster deployment when
> hosting VM images on the FUSE mount.
> I've seen that this topic has surfaced before, but my googling and
> perusing of the list archive haven't turned up anything conclusive.
>
> I'm on a 2-node distribute+replicate cluster, and the clients use Gluster
> via the FUSE mount.
>
> torbjorn at storage01:~$ sudo gluster volume info
>
> Volume Name: gluster0
> Type: Distributed-Replicate
> Volume ID: 81bbf681-ecdb-4866-9b45-41d5d2df7b35
> Status: Started
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: storage01.gluster.trollweb.net:/srv/gluster/brick0
> Brick2: storage02.gluster.trollweb.net:/srv/gluster/brick0
> Brick3: storage01.gluster.trollweb.net:/srv/gluster/brick1
> Brick4: storage02.gluster.trollweb.net:/srv/gluster/brick1
>
> The naive dd case from one client, on the dom0, looks like this:
>
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
> 2097152000 bytes (2.1 GB) copied, 22.9161 s, 91.5 MB/s
>
> The clients see each node on a separate 1Gbps NIC, so this is pretty
> close to the expected transfer rate.
>
> Writing with the sync flag, from dom0, looks like this:
>
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000 oflag=sync
> 2097152000 bytes (2.1 GB) copied, 51.1271 s, 41.0 MB/s
>
> If we use a file on the Gluster mount as backing for a loop device
> and do a sync write:
>
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
> 2097152000 bytes (2.1 GB) copied, 56.3729 s, 37.2 MB/s
>

What you might want to try is comparing each case with gluster profiling enabled on your volume (e.g., run 'gluster ... profile info' to clear the interval stats, run your test, run another 'profile info' and see how many write requests occurred, then divide the amount of data transferred by the number of requests).

Running similar tests on a couple of random servers here brings me from 70-80 MB/s down to 10 MB/s over loop. The profile data clearly shows that loop is breaking what were previously 128k (max) write requests into 4k requests. I don't know enough about the block layer to say why that occurs, but I'd be suspicious of the combination of the block interface on top of a filesystem (fuse) with synchronous request submission (no caching, writes are immediately submitted to the client fs).

That said, I'm on an older kernel (or an older loop driver anyway, I think), and your throughput above doesn't seem to be much worse with loop alone...

Brian

> The Xen instances are managed by Ganeti, using the loopback interface
> over a file on Gluster.
> Inside the Xen instance, the performance is not quite what I was hoping for:
>
> torbjorn at hennec:~$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
> 2097152000 bytes (2.1 GB) copied, 1267.39 s, 1.7 MB/s
>
> The transfer rate is similar when using the sync or direct flags with dd.
>
> Are these expected performance levels?
> A couple of threads [1] talk about performance and seem to indicate that my
> situation isn't unique.
> However, I'm under the impression that others are using a similar setup
> with much better performance.
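A minimal sketch of the profiling workflow Brian describes above; the volume name gluster0 is taken from the thread, and the exact output layout varies between GlusterFS versions:

# enable profiling on the volume (run on one of the storage nodes)
$ sudo gluster volume profile gluster0 start

# dump stats once to reset the interval counters, then run the dd test on the client
$ sudo gluster volume profile gluster0 info > /dev/null

# afterwards, the interval section covers only the test; compare the write block-size
# histogram and the number of WRITE fops against the amount of data transferred
$ sudo gluster volume profile gluster0 info

# disable profiling when finished
$ sudo gluster volume profile gluster0 stop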
Torbjørn Thorsen
2013-Mar-05 12:57 UTC
[Gluster-users] Fwd: Performance in VM guests when hosting VM images on Gluster
On Fri, Mar 1, 2013 at 7:01 PM, Brian Foster <bfoster at redhat.com> wrote:
> On 03/01/2013 11:48 AM, Torbjørn Thorsen wrote:
>> On Thu, Feb 28, 2013 at 4:54 PM, Brian Foster <bfoster at redhat.com> wrote:
>> All writes are done with sync, so I don't quite understand how cache
>> flushing comes in.
>>
>
> Flushing doesn't seem to be a factor; I was just noting previously that
> the only slowdown I noticed in my brief tests was associated with flushing.
>
> Note again, though, that loop seems to flush on close(). I suspect a
> reason for this is so 'losetup -d' can return immediately, but that's
> just a guess. IOW, if you hadn't used oflag=sync, the close() issued by
> dd before it actually exits would result in flushing the buffers
> associated with the loop device to the backing store. You are using
> oflag=sync, so that doesn't really matter.

Ah, I see. I thought you meant close() on the fd that was backing the loop device, but now I see what you mean.
Doing a non-sync dd run towards a loop device, it felt like that was the case: I was seeing high throughput, but pressing ^C didn't stop dd, and I'm guessing that's because it was blocking on close().

>> ...
>> To me it seems that a fresh loop device does mostly 64kb writes,
>> and at some point during a 24 hour window, changes to doing 4kb writes?
>>
>
> Yeah, interesting data. One thing I was curious about is whether
> write-behind or some other caching translator was behind this one way or
> another (including the possibility that the higher throughput value is
> actually due to a bug, rather than the other way around). If I
> understand the io-stats translator correctly, however, these request-size
> metrics should match the size of the requests coming into gluster and
> thus suggest something else is going on.
>
> Regardless, I think it's best to narrow the problem down and rule out as
> much as possible. Could you try some of the commands in my previous
> email to disable performance translators and see if it affects
> throughput? For example, does disabling any particular translator
> degrade throughput consistently (even on new loop devices)? If so, does
> re-enabling a particular translator enhance throughput on an already
> mapped and "degraded" loop (without unmapping/remapping the loop)?

I was running with defaults; no configuration had been done after installing Gluster.

If I disable the write-behind translator, I immediately see pretty much the same speeds as the "degraded loop", i.e. ~3 MB/s.
Gluster profiling tells me the same story: all writes are now 4 KB requests.
If write-behind is disabled, the loop device is slow even if it's fresh.

Enabling write-behind, even while dd is writing to the loop device, seems to increase the speed right away, without needing a new fd to the device.
A degraded loop device without an open fd will be fast again after a toggle of write-behind.
However, it seems that an open fd will keep the loop device slow.
I've only tested that with Xen, as that was the only thing I had with a long-lived open fd to a loop device.

> Also, what gluster and kernel versions are you on?

# uname -a
Linux xen-storage01 2.6.32-5-xen-amd64 #1 SMP Sun May 6 08:57:29 UTC 2012 x86_64 GNU/Linux

# dpkg -l | grep $(uname -r)
ii  linux-image-2.6.32-5-xen-amd64   2.6.32-46   Linux 2.6.32 for 64-bit PCs, Xen dom0 support

# dpkg -l | grep gluster
ii  glusterfs-client   3.3.1-1   clustered file-system (client package)
ii  glusterfs-common   3.3.1-1   GlusterFS common libraries and translator modules

--
Best regards,
Torbjørn Thorsen
Developer / operations technician

Trollweb Solutions AS
- Professional Magento Partner
www.trollweb.no

Daytime phone: +47 51215300
Evening/weekend phone: for customers with a service agreement

Visiting address: Luramyrveien 40, 4313 Sandnes
Postal address: Maurholen 57, 4316 Sandnes

Remember that all our standard terms and conditions always apply
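For completeness, a minimal sketch of the write-behind toggling described in the message above; the volume name gluster0 comes from the thread, the commands are run on one of the storage nodes, and option handling may differ slightly between GlusterFS versions:

# show the volume and any reconfigured options
$ sudo gluster volume info gluster0

# disable the write-behind translator (the thread reports loop throughput dropping to ~3 MB/s)
$ sudo gluster volume set gluster0 performance.write-behind off

# re-enable it, or reset the option back to its default
$ sudo gluster volume set gluster0 performance.write-behind on
$ sudo gluster volume reset gluster0 performance.write-behind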