Joe Whitney
2011-Jul-26 21:31 UTC
[Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4
Hello,

I have consistently seen poor performance when a domain is serving a file from a locally-attached storage device over the xenbridge "network" to another, client domain on the same host. I have reduced the problem to the following very simple scenario involving two domUs: one client and one server. For my purposes, the only difference is that the server has an SSD mounted (as a block device) at /mnt. Each has 1 vcpu and 512 MB RAM on a 4-hyperthreaded-core machine (shows as 8 "cores" in dom0).

server: eth1 IP address 192.168.62.110
client: eth1 IP address 192.168.62.202

(In the following, I execute "echo 3 > /proc/sys/vm/drop_caches" on dom0 before each command shown.)

First, to test the speed of tearing through a random gigabyte of data I put there for the purpose:

  server# time cat /mnt/randgig > /dev/null
  ~4s (4 seconds; times here are averages over several runs, dropping caches between)

Now let's test the speed of the "network" between client and server without interference from the disk:

  server# dd if=/dev/zero bs=4096 count=262144 | nc -lvv -p 3500 -q0
  client# time nc 192.168.62.110 3500 > /dev/null
  ~3.5s

Finally, let's actually transfer data from disk to the client:

  server# cat /mnt/randgig | nc -lvv -p 3500 -q0
  client# time nc 192.168.62.110 3500 > /dev/null
  ~18.8s

So you see, it is much slower to both read from disk and transfer over the (xenbridge) network than to do either alone, even though (in theory) I have enough processors (4 or 8, depending on how you count) to do all the work.

If I move the client to a different (identically configured) host attached by 1 Gbit ethernet through a switch, I get these revised times:

  transfer a gig of /dev/zero from server to client: 9.5s instead of 3.5s
  transfer a gig of /mnt/randgig from server to client: 14.2s instead of 18.8s (!)

This further confirms that there is some bad interaction between disk and network I/O scheduling, presumably in the dom0 backend, but I am not sure how to tell for certain.

I have tried every combination of number of vcpus, pinning vcpus, etc. on both domUs and dom0. I have also tried the experiment with dom0 as the server; the main difference is that performance is worse in all cases, but it is still better when the client is on a different host.

So in summary my questions are:

1) Why is it so much slower to transfer a file from disk over the xenbridge network than either reading from the disk or sending bytes over the network alone?
2) What can I do about it?

I have searched in vain for any hint of this problem, except that the Xen documentation says somewhere that I should pin and fix the number of dom0 cpus when doing I/O-intensive work in the guests, but I have tried this to no avail.

I would appreciate any insights.

Best,
Joe Whitney
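(For concreteness, the pinning I have tried looks roughly like the following with the xm toolstack. "server" and "client" are the domU names from above; the specific physical-CPU numbers shown are just one of the combinations I tried, and reducing dom0's vcpus can also be done at boot with dom0_max_vcpus on the hypervisor command line.)

  # dom0: reduce to one vcpu and pin it to physical CPU 0
  xm vcpu-set Domain-0 1
  xm vcpu-pin Domain-0 0 0

  # give each guest its own physical core (core numbers are arbitrary examples)
  xm vcpu-pin server 0 2
  xm vcpu-pin client 0 4

  # verify the resulting placement
  xm vcpu-list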
Todd Deshane
2011-Jul-28 15:30 UTC
Re: [Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4
On Tue, Jul 26, 2011 at 9:31 PM, Joe Whitney <jwhitney@cs.toronto.edu> wrote:
> [original message quoted in full; trimmed]

Have you tried making a bridge manually to see if it performs similarly?

What is the CPU load like during each of these (both dom0 and domU) cases?

Thanks,
Todd

--
Todd Deshane
http://www.linkedin.com/in/deshantm
http://www.xen.org/products/cloudxen.html
http://runningxen.com/
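(For the load question, something along these lines while each transfer is running would be enough; xentop ships with the Xen tools and vmstat with procps:)

  # in dom0, during a transfer:
  xentop -d 1      # per-domain CPU%, refreshed every second

  # in each domU (server and client), during the same transfer:
  vmstat 1         # watch the us/sy/wa/id columns, especially I/O wait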
Joe Whitney
2011-Jul-29 18:11 UTC
Re: [Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4
Hi Todd, can you clarify what you mean by "making a bridge manually"? I'll run some tests again re: load on client, server and dom0.

On Thu, Jul 28, 2011 at 11:30 AM, Todd Deshane <todd.deshane@xen.org> wrote:
> [earlier messages quoted in full; trimmed]
Walter Robert Ditzler
2011-Aug-03 15:19 UTC
RE: [Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4
a bridge defined directly on the network stack!

  auto lo
  iface lo inet loopback

  auto br0
  iface br0 inet static
          address 10.41.10.42
          netmask 255.255.255.0
          network 10.41.10.0
          broadcast 10.41.10.255
          gateway 10.41.10.1
          bridge_ports eth0
          bridge_stp on
          bridge_maxwait 0

and in the guest config:

  vif = [ 'bridge=br0,mac=xx:xx:xx:xx:xx:xx' ]

that helped me with performance.

thanks
walter

From: Joe Whitney
Sent: Friday, 29 July 2011 20:11
To: Todd Deshane
Cc: xen-users@lists.xensource.com
Subject: Re: [Xen-users] Performance issues when serving a file from one domain to another (same host) on xen 3.4

> [earlier messages quoted in full; trimmed]
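(A rough sketch of putting that config into effect, assuming Debian-style networking, the xm toolstack, and a guest config file at /etc/xen/client.cfg, which is a made-up path here; simply rebooting dom0 after editing /etc/network/interfaces also works:)

  # dom0: bring the bridge up and check its ports
  ifup br0
  brctl show       # br0 should list eth0, and later the guests' vif devices

  # guest config: attach the vif to br0 (MAC left as a placeholder)
  #   vif = [ 'bridge=br0,mac=xx:xx:xx:xx:xx:xx' ]

  # restart the guest so the new vif setting takes effect
  xm shutdown client
  xm create /etc/xen/client.cfg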