Torbjørn Thorsen
2013-Feb-27 15:14 UTC
[Gluster-users] Performance in VM guests when hosting VM images on Gluster
I'm seeing less-than-stellar performance on my Gluster deployment when hosting VM images on the FUSE mount.
I've seen that this topic has surfaced before, but my googling and perusing of the list archive haven't turned up anything conclusive.

I'm on a 2-node distribute+replicate cluster, and the clients use Gluster via the FUSE mount.

torbjorn at storage01:~$ sudo gluster volume info

Volume Name: gluster0
Type: Distributed-Replicate
Volume ID: 81bbf681-ecdb-4866-9b45-41d5d2df7b35
Status: Started
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: storage01.gluster.trollweb.net:/srv/gluster/brick0
Brick2: storage02.gluster.trollweb.net:/srv/gluster/brick0
Brick3: storage01.gluster.trollweb.net:/srv/gluster/brick1
Brick4: storage02.gluster.trollweb.net:/srv/gluster/brick1

The naive dd case from one client, on the dom0, looks like this:

torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
2097152000 bytes (2.1 GB) copied, 22.9161 s, 91.5 MB/s

The clients see each node on a separate 1Gbps NIC, so this is pretty close to the expected transfer rate.

Writing with the sync flag, from dom0, looks like this:

torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000 oflag=sync
2097152000 bytes (2.1 GB) copied, 51.1271 s, 41.0 MB/s

If we use a file on the Gluster mount as backing for a loop device and do a sync write:

torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
2097152000 bytes (2.1 GB) copied, 56.3729 s, 37.2 MB/s

The Xen instances are managed by Ganeti, using the loopback interface over a file on Gluster.
Inside the Xen instance, the performance is not quite what I was hoping for:

torbjorn at hennec:~$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
2097152000 bytes (2.1 GB) copied, 1267.39 s, 1.7 MB/s

The transfer rate is similar when using the sync or direct flags with dd.

Are these expected performance levels?
A couple of threads [1] talk about performance and seem to indicate that my situation isn't unique.
However, I'm under the impression that others are using a similar setup with much better performance.

[1]:
* http://www.gluster.org/pipermail/gluster-users/2012-January/032369.html
* http://www.gluster.org/pipermail/gluster-users/2012-July/033763.html

--
Best regards,
Torbjørn Thorsen
Developer / operations technician

Trollweb Solutions AS
- Professional Magento Partner
www.trollweb.no

Daytime phone: +47 51215300
Evening/weekend phone: for customers with a service agreement

Visiting address: Luramyrveien 40, 4313 Sandnes
Postal address: Maurholen 57, 4316 Sandnes

Remember that all our standard terms and conditions always apply
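For readers who want to reproduce the loop-device test above, a rough sketch follows; the mount point and backing-file name are illustrative assumptions, not taken from the message:

# mount the volume via FUSE, create a backing file on it, and map it to a loop device
$ sudo mount -t glusterfs storage01.gluster.trollweb.net:/gluster0 /mnt/gluster0
$ sudo dd if=/dev/zero of=/mnt/gluster0/loop-backing.img bs=1024k count=2000
$ sudo losetup /dev/loop1 /mnt/gluster0/loop-backing.img

# sync writes against the loop device, then detach it when finished
$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
$ sudo losetup -d /dev/loop1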
Brian Foster
2013-Feb-27 20:46 UTC
[Gluster-users] Performance in VM guests when hosting VM images on Gluster
On 02/27/2013 10:14 AM, Torbjørn Thorsen wrote:
> I'm seeing less-than-stellar performance on my Gluster deployment when
> hosting VM images on the FUSE mount.
> I've seen that this topic has surfaced before, but my googling and
> perusing of the list archive haven't turned up anything conclusive.
>
> I'm on a 2-node distribute+replicate cluster, and the clients use Gluster
> via the FUSE mount.
>
> torbjorn at storage01:~$ sudo gluster volume info
>
> Volume Name: gluster0
> Type: Distributed-Replicate
> Volume ID: 81bbf681-ecdb-4866-9b45-41d5d2df7b35
> Status: Started
> Number of Bricks: 2 x 2 = 4
> Transport-type: tcp
> Bricks:
> Brick1: storage01.gluster.trollweb.net:/srv/gluster/brick0
> Brick2: storage02.gluster.trollweb.net:/srv/gluster/brick0
> Brick3: storage01.gluster.trollweb.net:/srv/gluster/brick1
> Brick4: storage02.gluster.trollweb.net:/srv/gluster/brick1
>
> The naive dd case from one client, on the dom0, looks like this:
>
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
> 2097152000 bytes (2.1 GB) copied, 22.9161 s, 91.5 MB/s
>
> The clients see each node on a separate 1Gbps NIC, so this is pretty
> close to the expected transfer rate.
>
> Writing with the sync flag, from dom0, looks like this:
>
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000 oflag=sync
> 2097152000 bytes (2.1 GB) copied, 51.1271 s, 41.0 MB/s
>
> If we use a file on the Gluster mount as backing for a loop device
> and do a sync write:
>
> torbjorn at xen01:/srv/ganeti/shared-file-storage/tmp$ sudo dd if=/dev/zero of=/dev/loop1 bs=1024k count=2000 oflag=sync
> 2097152000 bytes (2.1 GB) copied, 56.3729 s, 37.2 MB/s
>

What you might want to try is comparing each case with gluster profiling enabled on your volume (e.g., run 'gluster ... profile info' to clear the interval stats, run your test, run another 'profile info' and see how many write requests occurred, then divide the amount of data transferred by the number of requests).

Running similar tests on a couple of random servers here brings me from 70-80 MB/s down to 10 MB/s over loop. The profile data clearly shows that loop is breaking what were previously 128k (max) write requests into 4k requests. I don't know enough about the block layer to say why that occurs, but I'd be suspicious of the combination of the block interface on top of a filesystem (fuse) with synchronous request submission (no caching, writes are immediately submitted to the client fs).

That said, I'm on an older kernel (or an older loop driver anyway, I think), and your throughput above doesn't seem to be much worse with loop alone...

Brian

> The Xen instances are managed by Ganeti, using the loopback interface
> over a file on Gluster.
> Inside the Xen instance, the performance is not quite what I was hoping for:
>
> torbjorn at hennec:~$ sudo dd if=/dev/zero of=bigfile bs=1024k count=2000
> 2097152000 bytes (2.1 GB) copied, 1267.39 s, 1.7 MB/s
>
> The transfer rate is similar when using the sync or direct flags with dd.
>
> Are these expected performance levels?
> A couple of threads [1] talk about performance and seem to indicate that my
> situation isn't unique.
> However, I'm under the impression that others are using a similar setup
> with much better performance.
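A minimal sketch of the profiling workflow Brian describes above; the volume name gluster0 is taken from the thread, and the exact output layout varies between GlusterFS versions:

# enable profiling on the volume (run on one of the storage nodes)
$ sudo gluster volume profile gluster0 start

# dump stats once to reset the interval counters, then run the dd test on the client
$ sudo gluster volume profile gluster0 info > /dev/null

# afterwards, the interval section covers only the test; compare the write block-size
# histogram and the number of WRITE fops against the amount of data transferred
$ sudo gluster volume profile gluster0 info

# disable profiling when finished
$ sudo gluster volume profile gluster0 stop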
Torbjørn Thorsen
2013-Mar-05 12:57 UTC
[Gluster-users] Fwd: Performance in VM guests when hosting VM images on Gluster
On Fri, Mar 1, 2013 at 7:01 PM, Brian Foster <bfoster at redhat.com> wrote:
> On 03/01/2013 11:48 AM, Torbjørn Thorsen wrote:
>> On Thu, Feb 28, 2013 at 4:54 PM, Brian Foster <bfoster at redhat.com> wrote:
>> All writes are done with sync, so I don't quite understand how cache
>> flushing comes in.
>>
>
> Flushing doesn't seem to be a factor; I was just noting previously that
> the only slowdown I noticed in my brief tests was associated with flushing.
>
> Note again, though, that loop seems to flush on close(). I suspect a
> reason for this is so 'losetup -d' can return immediately, but that's
> just a guess. IOW, if you hadn't used oflag=sync, the close() issued by
> dd before it actually exits would result in flushing the buffers
> associated with the loop device to the backing store. You are using
> oflag=sync, so that doesn't really matter.

Ah, I see. I thought you meant close() on the fd that was backing the loop device, but now I see what you mean.
Doing a non-sync dd run towards a loop device, it felt like that was the case: I was seeing high throughput, but pressing ^C didn't stop dd, and I'm guessing that's because it was blocking on close().

>> ...
>> To me it seems that a fresh loop device does mostly 64kb writes,
>> and at some point during a 24 hour window, changes to doing 4kb writes?
>>
>
> Yeah, interesting data. One thing I was curious about is whether
> write-behind or some other caching translator was behind this one way or
> another (including the possibility that the higher throughput value is
> actually due to a bug, rather than the other way around). If I
> understand the io-stats translator correctly, however, these request-size
> metrics should match the size of the requests coming into gluster and
> thus suggest something else is going on.
>
> Regardless, I think it's best to narrow the problem down and rule out as
> much as possible. Could you try some of the commands in my previous
> email to disable performance translators and see if it affects
> throughput? For example, does disabling any particular translator
> degrade throughput consistently (even on new loop devices)? If so, does
> re-enabling a particular translator enhance throughput on an already
> mapped and "degraded" loop (without unmapping/remapping the loop)?

I was running with defaults; no configuration had been done after installing Gluster.

If I disable the write-behind translator, I immediately see pretty much the same speeds as the "degraded loop", i.e. ~3 MB/s.
Gluster profiling tells me the same story: all writes are now 4 KB requests.
If write-behind is disabled, the loop device is slow even if it's fresh.

Enabling write-behind, even while dd is writing to the loop device, seems to increase the speed right away, without needing a new fd to the device.
A degraded loop device without an open fd will be fast again after a toggle of write-behind.
However, it seems that an open fd will keep the loop device slow.
I've only tested that with Xen, as that was the only thing I had with a long-lived open fd to a loop device.

> Also, what gluster and kernel versions are you on?

# uname -a
Linux xen-storage01 2.6.32-5-xen-amd64 #1 SMP Sun May 6 08:57:29 UTC 2012 x86_64 GNU/Linux

# dpkg -l | grep $(uname -r)
ii  linux-image-2.6.32-5-xen-amd64   2.6.32-46   Linux 2.6.32 for 64-bit PCs, Xen dom0 support

# dpkg -l | grep gluster
ii  glusterfs-client   3.3.1-1   clustered file-system (client package)
ii  glusterfs-common   3.3.1-1   GlusterFS common libraries and translator modules

--
Best regards,
Torbjørn Thorsen
Developer / operations technician

Trollweb Solutions AS
- Professional Magento Partner
www.trollweb.no

Daytime phone: +47 51215300
Evening/weekend phone: for customers with a service agreement

Visiting address: Luramyrveien 40, 4313 Sandnes
Postal address: Maurholen 57, 4316 Sandnes

Remember that all our standard terms and conditions always apply
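For completeness, a minimal sketch of the write-behind toggling described in the message above; the volume name gluster0 comes from the thread, the commands are run on one of the storage nodes, and option handling may differ slightly between GlusterFS versions:

# show the volume and any reconfigured options
$ sudo gluster volume info gluster0

# disable the write-behind translator (the thread reports loop throughput dropping to ~3 MB/s)
$ sudo gluster volume set gluster0 performance.write-behind off

# re-enable it, or reset the option back to its default
$ sudo gluster volume set gluster0 performance.write-behind on
$ sudo gluster volume reset gluster0 performance.write-behind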