Joe Whitney
2011-Jul-26 21:31 UTC
[Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4
Hello,

I have consistently seen poor performance when a domain is serving a file from a locally-attached storage device over the xenbridge "network" to another, client domain on the same host. I have reduced the problem to the following very simple scenario involving two domUs: one client and one server. For my purposes, the only difference is that the server has an SSD mounted (as a block device) at /mnt. Each has 1 vcpu and 512 MB RAM on a 4-hyperthreaded-core machine (shows as 8 "cores" in dom0).

server: eth1 IP address 192.168.62.110
client: eth1 IP address 192.168.62.202

(In the following, I execute "echo 3 > /proc/sys/vm/drop_caches" on dom0 before each command shown.)

First, to test the speed of tearing through a random gigabyte of data I put there for the purpose:

  server# time cat /mnt/randgig > /dev/null
  ~4s (4 seconds; times here are averages over several runs, dropping caches between)

Now let's test the speed of the "network" between client and server without interference from the disk:

  server# dd if=/dev/zero bs=4096 count=262144 | nc -lvv -p 3500 -q0
  client# time nc 192.168.62.110 3500 > /dev/null
  ~3.5s

Finally, let's actually transfer data from disk to the client:

  server# cat /mnt/randgig | nc -lvv -p 3500 -q0
  client# time nc 192.168.62.110 3500 > /dev/null
  ~18.8s

So you see, it is much slower to both read from disk and transfer over the (xenbridge) network than to do either alone, even though (in theory) I have enough processors (4 or 8, depending on how you count) to do all the work.

If I move the client to a different (identically configured) host attached by 1 Gbit ethernet through a switch, I get these revised times:

  transfer a gig of /dev/zero from server to client: 9.5s instead of 3.5s
  transfer a gig of /mnt/randgig from server to client: 14.2s instead of 18.8s (!)

This further confirms that there is some bad interaction between disk and network I/O scheduling, presumably in the dom0 backend, but I am not sure how to tell for certain.

I have tried every combination of number of vcpus, pinning vcpus, etc. on both domUs and dom0. I have also tried the experiment with dom0 as the server; the main difference is that performance is worse in all cases, but it is still better when the client is on a different host.

So in summary my questions are:

1) Why is it so much slower to transfer a file from disk over the xenbridge network than either reading from the disk or sending bytes over the network alone?
2) What can I do about it?

I have searched in vain for any hint of this problem, except that the Xen documentation says somewhere that I should pin and fix the number of dom0 cpus when doing I/O-intensive work in the guests, but I have tried this to no avail.

I would appreciate any insights.

Best,
Joe Whitney
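(For concreteness, the pinning I have tried looks roughly like the following with the xm toolstack. "server" and "client" are the domU names from above; the specific physical-CPU numbers shown are just one of the combinations I tried, and reducing dom0's vcpus can also be done at boot with dom0_max_vcpus on the hypervisor command line.)

  # dom0: reduce to one vcpu and pin it to physical CPU 0
  xm vcpu-set Domain-0 1
  xm vcpu-pin Domain-0 0 0

  # give each guest its own physical core (core numbers are arbitrary examples)
  xm vcpu-pin server 0 2
  xm vcpu-pin client 0 4

  # verify the resulting placement
  xm vcpu-list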
Todd Deshane
2011-Jul-28 15:30 UTC
Re: [Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4
On Tue, Jul 26, 2011 at 9:31 PM, Joe Whitney <jwhitney@cs.toronto.edu> wrote:
> [original message quoted in full; trimmed]

Have you tried making a bridge manually to see if it performs similarly?

What is the CPU load like during each of these (both dom0 and domU) cases?

Thanks,
Todd

--
Todd Deshane
http://www.linkedin.com/in/deshantm
http://www.xen.org/products/cloudxen.html
http://runningxen.com/
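(For the load question, something along these lines while each transfer is running would be enough; xentop ships with the Xen tools and vmstat with procps:)

  # in dom0, during a transfer:
  xentop -d 1      # per-domain CPU%, refreshed every second

  # in each domU (server and client), during the same transfer:
  vmstat 1         # watch the us/sy/wa/id columns, especially I/O wait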
Joe Whitney
2011-Jul-29 18:11 UTC
Re: [Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4
Hi Todd, can you clarify what you mean by "making a bridge manually"? I'll run some tests again re: load on client, server and dom0.

On Thu, Jul 28, 2011 at 11:30 AM, Todd Deshane <todd.deshane@xen.org> wrote:
> [earlier messages quoted in full; trimmed]
Walter Robert Ditzler
2011-Aug-03 15:19 UTC
RE: [Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4
a bridge defined directly on the network stack!

  auto lo
  iface lo inet loopback

  auto br0
  iface br0 inet static
          address 10.41.10.42
          netmask 255.255.255.0
          network 10.41.10.0
          broadcast 10.41.10.255
          gateway 10.41.10.1
          bridge_ports eth0
          bridge_stp on
          bridge_maxwait 0

and in the guest config:

  vif = [ 'bridge=br0,mac=xx:xx:xx:xx:xx:xx' ]

that helped me with performance.

thanks
walter

From: Joe Whitney
Sent: Friday, 29 July 2011 20:11
To: Todd Deshane
Cc: xen-users@lists.xensource.com
Subject: Re: [Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4

> [earlier messages quoted in full; trimmed]
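(A rough sketch of putting that config into effect, assuming Debian-style networking, the xm toolstack, and a guest config file at /etc/xen/client.cfg, which is a made-up path here; simply rebooting dom0 after editing /etc/network/interfaces also works:)

  # dom0: bring the bridge up and check its ports
  ifup br0
  brctl show       # br0 should list eth0, and later the guests' vif devices

  # guest config: attach the vif to br0 (MAC left as a placeholder)
  #   vif = [ 'bridge=br0,mac=xx:xx:xx:xx:xx:xx' ]

  # restart the guest so the new vif setting takes effect
  xm shutdown client
  xm create /etc/xen/client.cfg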