Hi there,

We are using the latest version of XCP on 6 hosts. When we issue a VM.start or VM.start_on XML-RPC call, it says:

{'Status': 'Failure', 'ErrorDescription': ['SESSION_INVALID', 'OpaqueRef:cfb6df14-387d-40a1-cc27-d5962cba7712']}

However, if I put VM.start in a loop, it succeeds after maybe 20-30 tries. But VM.start_on does not succeed even after 70 tries. One more observation: VM.clone succeeds after 7-8 tries, while VM.hard_shutdown works fine.

Can you guide me on this issue?

Akshay
IIRC, there was a fairly nasty session caching bug that could cause issues like this. It's been fixed since, and the upcoming 0.5 release (coming in a week or so) should include the fix.

Cheers,

Jon

On 1 Jun 2010, at 18:49, AkshayKumar Mehta wrote:

> We are using the latest version of XCP on 6 hosts. When we issue a VM.start or VM.start_on XML-RPC call, it says: {'Status': 'Failure', 'ErrorDescription': ['SESSION_INVALID', ...]}
Hi Jonathan Ludlam,

Thanks! Let me know how to update to 0.5 without disturbing the existing configuration.

Again, thanks for your quick reply.

Akshay
Andreas Olsowski
2010-Jun-01 21:17 UTC
[Xen-devel] slow live migration / xc_restore on xen4 pvops
Hi,

In preparation for our soon-to-arrive central storage array, I wanted to test live migration and Remus replication, and stumbled upon a problem. When migrating a test VM (512 MB RAM, idle) between my 3 servers, two of them are extremely slow in "receiving" the VM. There is little to no CPU utilization from xc_restore until shortly before migration is complete. The same goes for xm restore.

The xend.log contains:

[2010-06-01 21:16:27 5211] DEBUG (XendCheckpoint:286) restore:shadow=0x0, _static_max=0x20000000, _static_min=0x0,
[2010-06-01 21:16:27 5211] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen/bin/xc_restore 48 43 1 2 0 0 0 0
[2010-06-01 21:16:27 5211] INFO (XendCheckpoint:423) xc_domain_restore start: p2m_size = 20000
[2010-06-01 21:16:27 5211] INFO (XendCheckpoint:423) Reloading memory pages: 0%
[2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal error: Error when reading batch size
[2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal error: error when buffering batch, finishing

...when receiving a VM via live migration finally finishes. You can see the large gap in the timestamps. The VM is perfectly fine after that; it just takes way too long.

First, let me explain my server setup; detailed information on trying to narrow down the error follows. I have 3 servers running Xen 4 with 2.6.31.13-pvops as kernel, the current kernel from Jeremy's xen/master git branch. The guests are running vanilla 2.6.32.11 kernels.

The 3 servers differ slightly in hardware: two are Dell PE 2950 and one is a Dell R710. The 2950s have 2 quad-core Xeon CPUs (L5335 and L5410); the R710 has 2 quad-core Xeon E5520. All machines have 24 GB of RAM. They are called "tarballerina" (E5520), "xenturio1" (L5335) and "xenturio2" (L5410).

Currently I use tarballerina for testing purposes, but I don't consider anything in my setup "stable". xenturio1 has 27 guests running, xenturio2 25. No guest does anything that would even put a dent into the system's performance (LDAP servers, RADIUS, department webservers, etc.).

I created a test VM on my current central iSCSI storage, called "hatest", that idles around, has 2 VCPUs and 512 MB of RAM.

First I tested xm save/restore:

tarballerina:~# time xm restore /var/saverestore-t.mem
real 0m13.227s  user 0m0.090s  sys 0m0.023s

xenturio1:~# time xm restore /var/saverestore-x1.mem
real 4m15.173s  user 0m0.138s  sys 0m0.029s

When migrating to xenturio1 or 2, the migration takes 181 to 278 seconds; when migrating to tarballerina, it takes roughly 30 seconds:

tarballerina:~# time xm migrate --live hatest 10.0.1.98
real 3m57.971s  user 0m0.086s  sys 0m0.029s

xenturio1:~# time xm migrate --live hatest 10.0.1.100
real 0m43.588s  user 0m0.123s  sys 0m0.034s

--- attempt at narrowing it down ---

My first guess was that, since tarballerina had almost no guests running that did anything, it could be an issue of memory usage by the tapdisk2 processes (each dom0 has been mem-set to 4096M). I then started almost all VMs that I have on tarballerina:

tarballerina:~# time xm save saverestore-t /var/saverestore-t.mem
real 0m2.884s

tarballerina:~# time xm restore /var/saverestore-t.mem
real 0m15.594s

I tried this several times; sometimes it took 30+ seconds.

Then I started 2 VMs that run load- and IO-generating processes (stress, dd, openssl encryption, md5sum). But this didn't affect xm restore performance; it still was quite fast:

tarballerina:~# time xm save saverestore-t /var/saverestore-t.mem
real 0m7.476s  user 0m0.101s  sys 0m0.022s

tarballerina:~# time xm restore /var/saverestore-t.mem
real 0m45.544s  user 0m0.094s  sys 0m0.022s

I tried several times again; restore took 17 to 45 seconds.

Then I tried migrating the test VM to tarballerina again: still fast, in spite of several VMs, including the load- and IO-generating ones. This ate almost all available RAM. CPU times for xc_restore according to the target machine's "top":

tarballerina -> xenturio1: 0:05:xx, CPU 2-4%, near the end 40%.
xenturio1 -> tarballerina: 0:04:xx, CPU 4-8%, near the end 54%.

tarballerina:~# time xm migrate --live hatest 10.0.1.98
real 3m29.779s  user 0m0.102s  sys 0m0.017s

xenturio1:~# time xm migrate --live hatest 10.0.1.100
real 0m28.386s  user 0m0.154s  sys 0m0.032s

So my attempt at narrowing the problem down failed; it's neither the free memory of the dom0 nor the load, IO or memory the other domUs utilize.

--- end attempt ---

More info (xm list, meminfo, table with migration times, etc.) on my setup can be found here:
http://andiolsi.rz.uni-lueneburg.de/node/37

There was another guy who has the same error in his logfile; this might be unrelated or not:
http://lists.xensource.com/archives/html/xen-users/2010-05/msg00318.html

Further information can be provided on demand.

With best regards

---
Andreas Olsowski <andreas.olsowski@uni.leuphana.de>
Leuphana Universität Lüneburg
System- und Netzwerktechnik
Rechenzentrum, Geb 7, Raum 15
Scharnhorststr. 1
21335 Lüneburg

Tel: ++49 4131 / 6771309
Keir Fraser
2010-Jun-02 07:11 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
Hi Andreas,

This is an interesting bug, to be sure. I think you need to modify the restore code to get a better idea of what's going on. The file in the Xen tree is tools/libxc/xc_domain_restore.c. You will see it contains many DBGPRINTF and DPRINTF calls, some of which are commented out, and some of which may 'log' at too low a priority level to make it to the log file. For your purposes you might change them to ERROR calls, as they will definitely get properly logged.

One area of possible concern is that our read function (RDEXACT, which is a macro mapping to rdexact) was modified for Remus to have a select() call with a timeout of 1000ms. Do I entirely trust it? Not when we have the inexplicable behaviour that you're seeing. So you might try mapping RDEXACT() to read_exact() instead (which is what we already do when building for __MINIOS__).

This all assumes you know your way around C code at least a little bit.

-- Keir
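For illustration, the experiment Keir suggests is a one-line macro remap in tools/libxc/xc_domain_restore.c. A minimal sketch, assuming the xen-unstable naming he describes (as the thread notes below, the released 4.0.0 tree does not carry RDEXACT at all):

    /* Original (xen-unstable): rdexact() wraps read_exact() in a
     * select() with a 1000ms timeout, added for Remus; MiniOS builds
     * already bypass it. */
    #ifdef __MINIOS__
    #define RDEXACT read_exact
    #else
    #define RDEXACT rdexact
    #endif

    /* Experiment: force the plain blocking read on all builds, to rule
     * the select() wrapper out. */
    #undef RDEXACT
    #define RDEXACT read_exact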
Andreas Olsowski
2010-Jun-02 15:46 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
Hi Keir,

I changed all DPRINTF calls to ERROR, and the commented-out // DPRINTF calls to ERROR as well. There are no DBGPRINTF calls in my xc_domain_restore.c, though.

This is the new xend.log output; of course in this case the "ERROR Internal error:" prefix is actually debug output.

xenturio1:~# tail -f /var/log/xen/xend.log
[2010-06-02 15:44:19 5468] DEBUG (XendCheckpoint:286) restore:shadow=0x0, _static_max=0x20000000, _static_min=0x0,
[2010-06-02 15:44:19 5468] DEBUG (XendCheckpoint:305) [xc_restore]: /usr/lib/xen/bin/xc_restore 50 51 1 2 0 0 0 0
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: xc_domain_restore start: p2m_size = 20000
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: Reloading memory pages: 0%
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of -7 pages
[2010-06-02 15:44:19 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:49:02 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
[2010-06-02 15:49:03 5468] INFO (XendCheckpoint:423) ERROR Internal error: reading batch of 1024 pages
...
[2010-06-02 15:49:09 5468] INFO (XendCheckpoint:423) ERROR Internal err100%
...

One can see the time gap between the first and the following memory batch reads. After that, restoration works as expected. You might notice that you have "0%" and then "100%" and no steps in between, whereas with xc_save you have them; is that intentional, or maybe another symptom of the same problem?

As for the read_exact stuff:

tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H RDEXACT {} \;
tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H rdexact {} \;

There are no RDEXACT/rdexact matches in my Xen source code.

In a few hours I will shut down all virtual machines on one of the hosts experiencing slow xc_restores, maybe reboot it, and check whether xc_restore is any faster without load or utilization on the machine.

I'll check in with results later.

--
Andreas Olsowski <andreas.olsowski@uni.leuphana.de>
Leuphana Universität Lüneburg
System- und Netzwerktechnik
Rechenzentrum, Geb 7, Raum 15
Scharnhorststr. 1
21335 Lüneburg

Tel: ++49 4131 / 6771309
Keir Fraser
2010-Jun-02 15:55 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On 02/06/2010 16:46, "Andreas Olsowski" <andreas.olsowski@uni.leuphana.de> wrote:

> One can see the time gap between the first and the following memory batch reads. After that, restoration works as expected. You might notice that you have "0%" and then "100%" and no steps in between, whereas with xc_save you have them; is that intentional, or maybe another symptom of the same problem?

Does the log look similar for a restore on a fast system (except the timestamps, of course)?

> As for the read_exact stuff:
> tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H RDEXACT {} \;
> tarballerina:/usr/src/xen-4.0.0# find . -type f -iname \*.c -exec grep -H rdexact {} \;
>
> There are no RDEXACT/rdexact matches in my Xen source code.

Ah, because you're using 4.0. Well, I wouldn't worry about it just now anyway. It may be more fruitful to continue looking for a concrete behavioural difference between a fast and a slow restore, apart from merely timing, by inspecting logs.

-- Keir
Ian Jackson
2010-Jun-02 16:18 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
Andreas Olsowski writes ("[Xen-devel] slow live migration / xc_restore on xen4 pvops"):

> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal error: Error when reading batch size
> [2010-06-01 21:20:57 5211] INFO (XendCheckpoint:423) ERROR Internal error: error when buffering batch, finishing

These errors, and the slowness of migrations, are caused by changes made to support Remus. Previously, a migration would be regarded as complete as soon as the final information, including CPU states, was received at the migration target. xc_domain_restore would return immediately at that point.

Since the Remus patches, xc_domain_restore waits until it gets an IO error, and also has a very short timeout which induces IO errors if nothing is received within it. This is correct in the Remus case but wrong in the normal case.

The code should be changed so that xc_domain_restore
 (a) takes an explicit parameter for the IO timeout, which should default to something much longer than the 100ms or so of the Remus case, and
 (b) gets told whether
   (i) it should return immediately after receiving the "tail" which contains the CPU state; or
   (ii) it should attempt to keep reading after receiving the "tail" and only return when the connection fails.

In the case (b)(i), which should be the usual case, the behaviour should be that which we would get if changeset 20406:0f893b8f7c15 was reverted. The offending code is mostly this, from 20406:

+    // DPRINTF("Buffered checkpoint\n");
+
+    if ( pagebuf_get(&pagebuf, io_fd, xc_handle, dom) ) {
+        ERROR("error when buffering batch, finishing\n");
+        goto finish;
+    }
+    memset(&tmptail, 0, sizeof(tmptail));
+    if ( buffer_tail(&tmptail, io_fd, max_vcpu_id, vcpumap,
+                     ext_vcpucontext) < 0 ) {
+        ERROR ("error buffering image tail, finishing");
+        goto finish;
+    }
+    tailbuf_free(&tailbuf);
+    memcpy(&tailbuf, &tmptail, sizeof(tailbuf));
+
+    goto loadpages;
+
+  finish:

Ian.
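For illustration, Ian's proposal amounts to two extra knobs on the restore entry point. A sketch of such an interface (the two new parameters are hypothetical names, not an actual libxc prototype):

    /* (a) an explicit IO timeout, defaulting well above the ~100ms
     *     Remus heartbeat, and
     * (b) a flag selecting between returning right after the tail
     *     (normal migration) and reading until the connection fails
     *     (Remus checkpointing). */
    int xc_domain_restore(int xc_handle, int io_fd, uint32_t dom,
                          /* ... existing arguments unchanged ... */
                          unsigned long io_timeout_ms,    /* (a) */
                          int checkpointed_stream);       /* (b) */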
Ian Jackson
2010-Jun-02 16:20 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
I wrote:

> These errors, and the slowness of migrations, [...]

Actually, looking at your log, it has a 4min30s delay in it, which is quite striking and well beyond the kind of delay which ought to occur due to the problem I just wrote about. Does the same problem happen with xl?

Ian.
Keir Fraser
2010-Jun-02 16:24 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On 02/06/2010 17:18, "Ian Jackson" <Ian.Jackson@eu.citrix.com> wrote:

> These errors, and the slowness of migrations, are caused by changes made to support Remus. Previously, a migration would be regarded as complete as soon as the final information, including CPU states, was received at the migration target. xc_domain_restore would return immediately at that point.

This probably needs someone with Remus knowledge to take a look, to keep all cases working correctly. I'll Cc Brendan. It'd be good to get this fixed for a 4.0.1 in a few weeks.

-- Keir
Brendan Cully
2010-Jun-02 16:27 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On Wednesday, 02 June 2010 at 17:18, Ian Jackson wrote:

> Since the Remus patches, xc_domain_restore waits until it gets an IO error, and also has a very short timeout which induces IO errors if nothing is received within it. This is correct in the Remus case but wrong in the normal case.
>
> The code should be changed so that xc_domain_restore
>  (a) takes an explicit parameter for the IO timeout, which should default to something much longer than the 100ms or so of the Remus case, and
>  (b) gets told whether
>    (i) it should return immediately after receiving the "tail" which contains the CPU state; or
>    (ii) it should attempt to keep reading after receiving the "tail" and only return when the connection fails.

I'm going to have a look at this today, but the way the code was originally written, I don't believe this should have been a problem:

1. Reads are only supposed to be able to time out after the entire first checkpoint has been received (IOW, this wouldn't kick in until normal migration had already completed).

2. In normal migration, the sender should close the fd after sending all data, immediately triggering an IO error on the receiver and completing the restore.

I did try to avoid disturbing regular live migration as much as possible when I wrote the code. I suspect some other regression has crept in, and I'll investigate.
Andreas Olsowski
2010-Jun-02 22:59 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
I did some further research now and shut down all virtual machines on xenturio1. After that I got (3 runs; xm save takes ~5 seconds, and user and sys times are always negligible, so I removed those to reduce text):

xenturio1:~# time xm restore /var/saverestore-x1.mem
real 0m25.349s / 0m27.456s / 0m27.208s

So the fact that there were running machines did impact the performance of xc_restore.

I proceeded to create 20 "dummy" VMs with 1 GB RAM and 4 VCPUs each (dom0 has 4096M fixed, 24 GB total available):

xenturio1:~# for i in {1..20} ; do echo creating dummy$i ; xt vm create dummy$i -vlan 27 -mem 1024 -cpus 4 ; done
creating dummy1
vm/create> successfully created vm 'dummy1'
...
creating dummy20
vm/create> successfully created vm 'dummy20'

and started them:

for i in {1..20} ; do echo starting dummy$i ; xm start dummy$i ; done

So my memory allocation should now be 100% (4 GB dom0 plus 20 GB domUs), but then why did I have 512 MB to spare for "saverestore-x1"? Oh well, onwards. Once again I ran a save/restore, 3 times to be sure (the additional results are edited into the output).

With 20 running VMs:
xenturio1:~# time xm restore /var/saverestore-x1.mem
real 1m16.375s / 0m31.306s / 1m10.214s

With 16 running VMs:
real 1m49.741s / 1m38.696s / 0m55.615s

With 12 running VMs:
real 1m3.101s / 2m4.254s / 1m27.193s

With 8 running VMs:
real 0m36.867s / 0m43.513s / 0m33.199s

With 4 running VMs:
real 0m40.454s / 0m44.929s / 1m7.215s

Keep in mind, those domUs don't do anything at all; they just idle. What is going on there? The results seem completely random; running more domUs can be faster than running fewer. How is that even possible?

So I deleted the dummy VMs and started the productive domUs again, in 3 steps, to take further measurements.

After the first batch:
real 0m23.968s / 1m22.133s / 1m24.420s

After the second batch:
real 1m54.310s / 1m11.340s / 1m47.643s

After the third batch:
real 1m52.065s / 1m34.517s / 2m8.644s / 1m25.473s / 1m35.943s / 1m45.074s / 1m48.407s / 1m18.277s / 1m18.931s / 1m27.458s

So my current guess is that xc_restore speed depends on the amount of used memory, or rather how much is being grabbed by running processes. Does that make any sense? But if that is so, explain this: I started 3 VMs running "stress" that produce:

load average: 30.94, 30.04, 21.00
Mem: 5909844k total, 4020480k used, 1889364k free, 288k buffers

But still:

tarballerina:~# time xm restore /var/saverestore-t.mem
real 0m38.654s

Why doesn't xc_restore slow down on tarballerina, no matter what I do? Again: all 3 machines have 24 GB RAM and 2x quad-core Xeons, and dom0 is fixed to 4096M RAM. All use the same Xen 4 sources and the same kernels with the same configs. Is the Xeon E5520 with DDR3 really this much faster than the L5335 and L5410 with DDR2?

If someone were to tell me that this is expected behaviour, I wouldn't like it, but at least I could accept it. Are machines doing plenty of CPU and memory utilization not a good measurement in this or any case?

I think tomorrow night I will migrate all machines from xenturio1 to tarballerina, but first I have to verify that all VLANs are available; that I cannot do right now.

---
Andreas
Brendan Cully
2010-Jun-03 01:04 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On Wednesday, 02 June 2010 at 17:24, Keir Fraser wrote:

> This probably needs someone with Remus knowledge to take a look, to keep all cases working correctly. I'll Cc Brendan. It'd be good to get this fixed for a 4.0.1 in a few weeks.

I've done a bit of profiling of the restore code and observed the slowness here too. It looks to me like it's probably related to the superpage changes. The big hit appears to be at the front of the restore process, during calls to allocate_mfn_list under the normal_page case. It looks like we're calling xc_domain_memory_populate_physmap once per page here, instead of batching the allocation? I haven't had time to investigate further today, but I think this is the culprit.
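For illustration, the pattern Brendan suspects, sketched against the libxc call of the time (the loop is a simplification, not the actual restore code):

    /* Suspected: one hypercall, with its own lock_pages()/unlock round
     * trip, per 4k page... */
    for ( i = 0; i < nr_pages; i++ )
        rc = xc_domain_memory_populate_physmap(xc_handle, dom,
                                               1 /* nr_extents */,
                                               0 /* extent_order */,
                                               0 /* mem_flags */,
                                               &pfns[i]);

    /* ...versus one batched hypercall for the whole set: */
    rc = xc_domain_memory_populate_physmap(xc_handle, dom, nr_pages,
                                           0, 0, pfns);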
Hi Jonathan Ludlam,

Is it possible for me to download from the repository? An unstable version is fine until the release comes out. Let me know the steps/link to the repository.

Akshay
Brendan Cully
2010-Jun-03 04:31 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On Wednesday, 02 June 2010 at 18:04, Brendan Cully wrote:

> I've done a bit of profiling of the restore code and observed the slowness here too. It looks to me like it's probably related to the superpage changes. The big hit appears to be at the front of the restore process, during calls to allocate_mfn_list under the normal_page case. It looks like we're calling xc_domain_memory_populate_physmap once per page here, instead of batching the allocation? I haven't had time to investigate further today, but I think this is the culprit.

By the way, this only seems to matter on pvops -- restore is still pretty quick on 2.6.18. I'm somewhat surprised that there'd be any significant difference in allocating guest memory between the two kernels (isn't this almost entirely Xen's responsibility?), but it does explain why this wasn't noticed until recently.
Keir Fraser
2010-Jun-03 05:47 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On 03/06/2010 02:04, "Brendan Cully" <Brendan@cs.ubc.ca> wrote:

> I've done a bit of profiling of the restore code and observed the slowness here too. It looks to me like it's probably related to the superpage changes. The big hit appears to be at the front of the restore process, during calls to allocate_mfn_list under the normal_page case. It looks like we're calling xc_domain_memory_populate_physmap once per page here, instead of batching the allocation? I haven't had time to investigate further today, but I think this is the culprit.

Ccing Edwin Zhai. He wrote the superpage logic for domain restore.

-- Keir
Brendan Cully
2010-Jun-03 06:45 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On Thursday, 03 June 2010 at 06:47, Keir Fraser wrote:

> Ccing Edwin Zhai. He wrote the superpage logic for domain restore.

Here's some data on the slowdown going from 2.6.18 to pvops dom0. I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable to measure the time to do the allocation:

kernel:  min call time, max call time
2.6.18:  4 us,          72 us
pvops:   202 us,        10696 us (!)

It looks like pvops is dramatically slower to perform the xc_domain_memory_populate_physmap call! I'll attach the patch and raw data below.
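The patch and raw data were attachments to the original mail and are not reproduced in the archive. A hedged reconstruction of the kind of timing wrapper Brendan describes (helper name hypothetical):

    #include <stdio.h>
    #include <sys/time.h>

    /* microseconds elapsed between two gettimeofday() samples */
    static unsigned long elapsed_us(const struct timeval *a,
                                    const struct timeval *b)
    {
        return (b->tv_sec - a->tv_sec) * 1000000UL
             + (b->tv_usec - a->tv_usec);
    }

    /* around the allocation in uncanonicalize_pagetable(): */
    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    /* ... allocate_mfn_list(...) exactly as in the original code ... */
    gettimeofday(&t1, NULL);
    fprintf(stderr, "allocate_mfn_list: %lu us\n", elapsed_us(&t0, &t1));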
Jeremy Fitzhardinge
2010-Jun-03 06:53 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On 06/02/2010 11:45 PM, Brendan Cully wrote:

> I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable to measure the time to do the allocation.
>
> kernel:  min call time, max call time
> 2.6.18:  4 us,          72 us
> pvops:   202 us,        10696 us (!)
>
> It looks like pvops is dramatically slower to perform the xc_domain_memory_populate_physmap call!

That appears to be implemented as a raw hypercall, so the kernel has very little to do with it. The only thing I can see there that might be relevant is that the mlock calls could be slow for some reason?

J
Brendan Cully
2010-Jun-03 06:55 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On Wednesday, 02 June 2010 at 23:45, Brendan Cully wrote:

> kernel:  min call time, max call time
> 2.6.18:  4 us,          72 us
> pvops:   202 us,        10696 us (!)
>
> It looks like pvops is dramatically slower to perform the xc_domain_memory_populate_physmap call!

Looking at changeset 20841:

  Allow certain performance-critical hypercall wrappers to register data
  buffers via a new interface which allows them to be 'bounced' into a
  pre-mlock'ed page-sized per-thread data area. This saves the cost of
  mlock/munlock on every such hypercall, which can be very expensive on
  modern kernels.

...maybe the lock_pages call in xc_memory_op (called from xc_domain_memory_populate_physmap) has gotten very expensive? Especially considering this hypercall is now issued once per page.
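The mechanism changeset 20841 describes can be pictured with a small, self-contained sketch (names hypothetical, not the actual libxc code): lock a per-thread page once, then copy hypercall argument buffers through it, rather than paying mlock()/munlock() on every call.

    #include <stdlib.h>
    #include <string.h>
    #include <sys/mman.h>

    #define BOUNCE_SIZE 4096

    static __thread void *bounce;  /* pre-locked, one page per thread */

    /* Copy a hypercall argument buffer into pre-locked memory; the
     * mlock() cost is paid once per thread, not once per hypercall. */
    static void *hcall_bounce(const void *src, size_t len)
    {
        if ( len > BOUNCE_SIZE )
            return NULL;
        if ( !bounce )
        {
            bounce = malloc(BOUNCE_SIZE);
            if ( !bounce || mlock(bounce, BOUNCE_SIZE) )
                return NULL;
        }
        memcpy(bounce, src, len);
        return bounce;
    }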
Keir Fraser
2010-Jun-03 07:12 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On 03/06/2010 07:55, "Brendan Cully" <Brendan@cs.ubc.ca> wrote:

> ...maybe the lock_pages call in xc_memory_op (called from xc_domain_memory_populate_physmap) has gotten very expensive? Especially considering this hypercall is now issued once per page.

Maybe there are two issues here, then. I mean, there's slow, and there's 10ms for a presumably in-core kernel operation, which is rather mad. Getting our batching back for 4k allocations is the most critical thing though, of course.

-- Keir
Zhai, Edwin
2010-Jun-03 08:58 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
I assume this is a PV domU rather than HVM, right?

1. We need to check whether superpages are the culprit, using SP_check1.patch.

2. If that fixes the problem, we need to check further where the extra cost comes from: the speculative algorithm, or the superpage population hypercall, using SP_check2.patch.

If SP_check2.patch works, the culprit is the new allocation hypercall (so guest creation also suffers); else, it is the speculative algorithm.

Does that make sense?

Thanks,
edwin

Brendan Cully wrote:

> I've done a bit of profiling of the restore code and observed the slowness here too. It looks to me like it's probably related to the superpage changes. The big hit appears to be at the front of the restore process, during calls to allocate_mfn_list under the normal_page case. It looks like we're calling xc_domain_memory_populate_physmap once per page here, instead of batching the allocation? I haven't had time to investigate further today, but I think this is the culprit.

--
best rgds,
edwin
Ian Jackson
2010-Jun-03 10:01 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
Brendan Cully writes ("Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops"):

> 2. in normal migration, the sender should close the fd after sending all data, immediately triggering an IO error on the receiver and completing the restore.

This is not true. In normal migration, the fd is used by the machinery which surrounds xc_domain_restore (in xc_save and also in xl or xend). In any case, it would be quite wrong for a library function like xc_domain_restore to eat the fd.

It's not necessary for xc_domain_restore to behave this way in all cases; all that's needed is parameters to tell it how to behave.

> I did try to avoid disturbing regular live migration as much as possible when I wrote the code. I suspect some other regression has crept in, and I'll investigate.

The short timeout is another regression. A normal live migration or restore should not fall over just because no data is available for 100ms.

Ian.
There aren't any binaries at the moment, but you could build your own from the public repositories - I sent a script out on the 25th of March that will compile everything up. You'll need to replace the xapi and fe executables (xapi is output in xen-api.hg/ocaml/xapi/ and fe is found in xen-api-libs.hg/forking_executioner/). I'll reattach the script to this mail.

Cheers,

Jon
Brendan Cully
2010-Jun-03 15:03 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On Thursday, 03 June 2010 at 11:01, Ian Jackson wrote:

> > 2. in normal migration, the sender should close the fd after sending all data, immediately triggering an IO error on the receiver and completing the restore.
>
> This is not true. In normal migration, the fd is used by the machinery which surrounds xc_domain_restore (in xc_save and also in xl or xend). In any case, it would be quite wrong for a library function like xc_domain_restore to eat the fd.

The sender closes the fd, as it always has. xc_domain_restore has always consumed the entire contents of the fd, because the qemu tail has no length header under normal migration. There's no behavioural difference here that I can see.

> It's not necessary for xc_domain_restore to behave this way in all cases; all that's needed is parameters to tell it how to behave.

I have no objection to a more explicit interface. The current form is simply Remus trying to be as invisible as possible to the rest of the tool stack.

> > I did try to avoid disturbing regular live migration as much as possible when I wrote the code. I suspect some other regression has crept in, and I'll investigate.
>
> The short timeout is another regression. A normal live migration or restore should not fall over just because no data is available for 100ms.

(The timeout is 1s, by the way.) For some reason you clipped the bit of my previous message where I say this doesn't happen:

1. Reads are only supposed to be able to time out after the entire first checkpoint has been received (IOW, this wouldn't kick in until normal migration had already completed).

Let's take a look at read_exact_timed in xc_domain_restore:

    if ( completed ) {
        /* expect a heartbeat every HEARBEAT_MS ms maximum */
        tv.tv_sec = HEARTBEAT_MS / 1000;
        tv.tv_usec = (HEARTBEAT_MS % 1000) * 1000;

        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        len = select(fd + 1, &rfds, NULL, NULL, &tv);
        if ( !FD_ISSET(fd, &rfds) ) {
            fprintf(stderr, "read_exact_timed failed (select returned %zd)\n", len);
            return -1;
        }
    }

'completed' is not set until the first entire checkpoint (i.e., the entirety of a non-Remus migration) has completed. So, no issue.

I see no evidence that Remus has anything to do with the live migration performance regression discussed in this thread, and I haven't seen any other reported issues either. I think the mlock issue is a much more likely candidate.
Keir Fraser
2010-Jun-03 15:18 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On 03/06/2010 16:03, "Brendan Cully" <brendan@cs.ubc.ca> wrote:

> I see no evidence that Remus has anything to do with the live migration performance regression discussed in this thread, and I haven't seen any other reported issues either. I think the mlock issue is a much more likely candidate.

I agree, it's probably lack of batching plus expensive mlocks. The performance difference between the machines under test is either because one runs out of 2MB superpage extents before the other (for some reason), or because mlock operations are much more likely to take a slow path in the kernel (possibly including disk I/O) on one of them. We need to get batching back, and Edwin is on the case for that: I hope Andreas will try out Edwin's patch to work towards that. We can also reduce mlock cost by mlocking some domain_restore arrays across the entire restore operation, I should imagine.

-- Keir
Ian Jackson
2010-Jun-03 17:15 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
Brendan Cully writes ("Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops"):

> The sender closes the fd, as it always has. xc_domain_restore has always consumed the entire contents of the fd, because the qemu tail has no length header under normal migration. There's no behavioural difference here that I can see.

No, that is not the case. Look for example at "save" in XendCheckpoint.py in xend, where the save code:

 1. Converts the domain config to sxp and writes it to the fd
 2. Calls xc_save (which calls xc_domain_save)
 3. Writes the qemu save file to the fd

> I have no objection to a more explicit interface. The current form is simply Remus trying to be as invisible as possible to the rest of the tool stack.

My complaint is that that is not currently the case.

> 1. reads are only supposed to be able to time out after the entire first checkpoint has been received (IOW, this wouldn't kick in until normal migration had already completed)

OMG, I hadn't noticed that you had introduced a static variable for that; I had assumed that "read_exact_timed" was roughly what it said on the tin.

I think I shall stop now before I become more rude.

Ian.
Thanks! I have already downloaded them. I will build and deploy them.

Akshay

________________________________
From: Jonathan Ludlam [mailto:Jonathan.Ludlam@eu.citrix.com]
Sent: Thursday, June 03, 2010 3:25 AM
To: AkshayKumar Mehta
Cc: xen-devel@lists.xensource.com; Pradeep Padala
Subject: Re: [Xen-devel] XCP

There aren't any binaries at the moment, but you could build your own
from the public repositories - I sent a script out on the 25th of
March that will compile everything up. You'll need to replace the xapi
and fe executables (xapi is output in xen-api.hg/ocaml/xapi/ and fe is
found in xen-api-libs.hg/forking_executioner/). I'll reattach the
script to this mail.

Cheers,
Jon
Brendan Cully
2010-Jun-03 17:29 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
On Thursday, 03 June 2010 at 18:15, Ian Jackson wrote:
> Brendan Cully writes ("Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops"):
> > The sender closes the fd, as it always has. xc_domain_restore has
> > always consumed the entire contents of the fd, because the qemu tail
> > has no length header under normal migration. There is no behavioural
> > difference here that I can see.
>
> No, that is not the case. Look for example at "save" in
> XendCheckpoint.py in xend, where the save code:
> 1. converts the domain config to sxp and writes it to the fd;
> 2. calls xc_save (which calls xc_domain_save);
> 3. writes the qemu save file to the fd.

4. (in XendDomain) closes the fd. Again, this is the _sender_. I fail
to see your point.

> > I have no objection to a more explicit interface. The current form is
> > simply Remus trying to be as invisible as possible to the rest of the
> > tool stack.
>
> My complaint is that that is not currently the case.
>
> > 1. reads are only supposed to be able to time out after the entire
> > first checkpoint has been received (IOW this wouldn't kick in until
> > normal migration had already completed)
>
> OMG, I hadn't noticed that you had introduced a static variable for
> that; I had assumed that "read_exact_timed" was roughly what it said
> on the tin.
>
> I think I shall stop now before I become more rude.

Feel free to reply if you have an actual Remus-caused regression
instead of FUD based on misreading the code. I'd certainly be
interested in fixing something real.
Ian Jackson
2010-Jun-03 18:02 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
Brendan Cully writes ("Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops"):
> On Thursday, 03 June 2010 at 18:15, Ian Jackson wrote:
> > No, that is not the case. Look for example at "save" in
> > XendCheckpoint.py in xend, where the save code:
> > 1. converts the domain config to sxp and writes it to the fd;
> > 2. calls xc_save (which calls xc_domain_save);
> > 3. writes the qemu save file to the fd.
>
> 4. (in XendDomain) closes the fd. Again, this is the _sender_. I fail
> to see your point.

In the receiver this corresponds to the qemu savefile being read from
the fd after xc_domain_restore has returned. So the fd remains
readable after xc_domain_restore, and the save image data sent by
xc_domain_save and received by xc_domain_restore is self-delimiting.

In Xen 3.4 this is easily seen in XendCheckpoint.py, where the
corresponding receive logic is clearly visible. In Xen 4.x this is
different because of the Remus patches.

Ian.
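The matching receive side, again as an illustrative sketch rather than
the real XendCheckpoint.py logic: under the pre-Remus (Xen 3.4)
behaviour, xc_domain_restore stops at the end of the self-delimiting
libxc image and leaves the qemu tail on the still-open fd for its
caller (helper names are hypothetical):

    /* Hypothetical helpers standing in for the real (Python) xend code. */
    extern int read_config_sxp(int fd);
    extern int do_xc_domain_restore(int fd);
    extern int read_qemu_tail(int fd);

    /* Illustrative sketch of the 3.4-style receive path, not real code. */
    static int receive_domain(int fd)
    {
        if ( read_config_sxp(fd) < 0 )         /* 1. domain config (sxp) */
            return -1;
        if ( do_xc_domain_restore(fd) < 0 )    /* 2. consumes only the
                                                *    self-delimiting image */
            return -1;
        return read_qemu_tail(fd);             /* 3. fd is still readable:
                                                *    the caller, not libxc,
                                                *    reads the qemu tail */
    }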
Keir Fraser
2010-Jun-09 13:32 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
Edwin, Dave,

The issue is clearly that xc_domain_restore now only ever issues
populate_physmap requests for a single extent at a time. This might be
okay when allocating superpages, but that is rarely the case for PV
guests (it depends on a rare domain config parameter) and is not
guaranteed for HVM guests either. The resulting performance is
unacceptable, especially when the kernel's underlying mlock() is slow.

It looks to me like the root cause is Dave McCracken's patch
xen-unstable:19639, which Edwin Zhai's patch xen-unstable:20126 merely
builds upon.

Ultimately I don't care who fixes it, but I would like a fix for
4.0.1, which releases in the next few weeks, and if I have to do it
myself I will simply hack out the above two changesets. I'd rather
have domain restore working in reasonable time than the relatively
small performance boost of guest superpage mappings.

Thanks,
Keir

On 03/06/2010 09:58, "Zhai, Edwin" <edwin.zhai@intel.com> wrote:

> I assume this is a PV domU rather than HVM, right?
>
> 1. We need to check whether superpages are the culprit, using
> SP_check1.patch.
>
> 2. If that fixes the problem, we need to check further where the
> extra cost comes from: the speculative algorithm, or the superpage
> population hypercall, using SP_check2.patch.
>
> If SP_check2.patch works, the culprit is the new allocation hypercall
> (so guest creation also suffers); otherwise, it is the speculative
> algorithm.
>
> Does that make sense?
>
> Thanks,
> edwin
>
> Brendan Cully wrote:
> > On Thursday, 03 June 2010 at 06:47, Keir Fraser wrote:
> >> On 03/06/2010 02:04, "Brendan Cully" <Brendan@cs.ubc.ca> wrote:
> >>> I've done a bit of profiling of the restore code and observed the
> >>> slowness here too. It looks to me like it's probably related to
> >>> the superpage changes. The big hit appears to be at the front of
> >>> the restore process during calls to allocate_mfn_list, under the
> >>> normal_page case. It looks like we're calling
> >>> xc_domain_memory_populate_physmap once per page here, instead of
> >>> batching the allocation? I haven't had time to investigate further
> >>> today, but I think this is the culprit.
> >>
> >> Cc'ing Edwin Zhai. He wrote the superpage logic for domain restore.
> >
> > Here's some data on the slowdown going from 2.6.18 to pvops dom0:
> >
> > I wrapped the call to allocate_mfn_list in uncanonicalize_pagetable
> > to measure the time taken to do the allocation.
> >
> >   kernel    min call time    max call time
> >   2.6.18           4 us            72 us
> >   pvops          202 us         10696 us (!)
> >
> > It looks like pvops is dramatically slower to perform the
> > xc_domain_memory_populate_physmap call!
> >
> > I'll attach the patch and raw data below.
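A hedged sketch of the batching being asked for, assuming the Xen
4.0-era libxc call (the exact signature should be checked against
tools/libxc/xenctrl.h; BATCH and the surrounding code are
illustrative): one populate_physmap hypercall per batch of pfns rather
than one per page.

    #include <xenctrl.h>   /* assumed Xen 4.0-era header */

    #define BATCH 1024

    /* Sketch: one hypercall (and one pinned hypercall buffer) per
     * BATCH 4k pages (extent_order 0) instead of one per page. */
    static int populate_batched(int xc_handle, uint32_t domid,
                                xen_pfn_t *pfns, unsigned long count)
    {
        unsigned long done = 0;

        while ( done < count )
        {
            unsigned long n = (count - done > BATCH) ? BATCH : (count - done);
            if ( xc_domain_memory_populate_physmap(xc_handle, domid, n,
                                                   0 /* order */, 0 /* flags */,
                                                   pfns + done) != 0 )
                return -1;
            done += n;
        }
        return 0;
    }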
Keir Fraser
2010-Jun-10 09:27 UTC
Re: [Xen-devel] slow live migration / xc_restore on xen4 pvops
Andreas,

You can check whether this is fixed by the latest changes in
http://xenbits.xensource.com/xen-4.0-testing.hg. You should only need
to rebuild and reinstall tools/libxc.

Thanks,
Keir

On 02/06/2010 23:59, "Andreas Olsowski" <andreas.olsowski@uni.leuphana.de> wrote:

> I did some further research and shut down all virtual machines on
> xenturio1. After that I got (3 runs; xm save takes ~5 seconds; user
> and sys times are always negligible, so I removed them to reduce
> text):
>
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 0m25.349s 0m27.456s 0m27.208s
>
> So the fact that there were running machines did impact the
> performance of xc_restore.
>
> I proceeded to create 20 "dummy" VMs with 1 GiB of RAM and 4 VCPUs
> each (dom0 has 4096M fixed, 24 GiB total available):
>
> xenturio1:~# for i in {1..20} ; do echo creating dummy$i ; xt vm create
> dummy$i -vlan 27 -mem 1024 -cpus 4 ; done
> creating dummy1
> vm/create> successfully created vm 'dummy1'
> ...
> creating dummy20
> vm/create> successfully created vm 'dummy20'
>
> and started them:
>
> for i in {1..20} ; do echo starting dummy$i ; xm start dummy$i ; done
>
> So my memory allocation should now be 100% (4 GiB dom0, 20 GiB
> domUs) - but then why did I have 512 MiB to spare for
> "saverestore-x1"? Oh well, onwards.
>
> Once again I ran a save/restore, 3 times to be sure (the additional
> results are edited into the output).
>
> With 20 running VMs:
> xenturio1:~# time xm restore /var/saverestore-x1.mem
> real 1m16.375s 0m31.306s 1m10.214s
>
> With 16 running VMs:
> real 1m49.741s 1m38.696s 0m55.615s
>
> With 12 running VMs:
> real 1m3.101s 2m4.254s 1m27.193s
>
> With 8 running VMs:
> real 0m36.867s 0m43.513s 0m33.199s
>
> With 4 running VMs:
> real 0m40.454s 0m44.929s 1m7.215s
>
> Keep in mind, those domUs don't do anything at all; they just idle.
> What is going on there? The results seem completely random - running
> more domUs can be faster than running fewer. How is that even
> possible?
>
> So I deleted the dummy VMs and started the productive domUs again,
> in 3 batches, to take further measurements.
>
> After the first batch:
> real 0m23.968s 1m22.133s 1m24.420s
>
> After the second batch:
> real 1m54.310s 1m11.340s 1m47.643s
>
> After the third batch:
> real 1m52.065s 1m34.517s 2m8.644s 1m25.473s 1m35.943s 1m45.074s
> 1m48.407s 1m18.277s 1m18.931s 1m27.458s
>
> So my current guess is that xc_restore speed depends on the amount
> of used memory, or rather on how much is being grabbed by running
> processes. Does that make any sense?
>
> But if that is so, explain this: I started 3 VMs running "stress"
> that produce:
>
> load average: 30.94, 30.04, 21.00
> Mem: 5909844k total, 4020480k used, 1889364k free, 288k buffers
>
> But still:
> tarballerina:~# time xm restore /var/saverestore-t.mem
> real 0m38.654s
>
> Why doesn't xc_restore slow down on tarballerina, no matter what I
> do? Again: all 3 machines have 24 GiB of RAM and 2x quad-core Xeons,
> and dom0 is fixed to 4096M of RAM. All use the same Xen 4 sources
> and the same kernels with the same configs.
>
> Is the Xeon E5520 with DDR3 really this much faster than the L5335
> and L5410 with DDR2?
>
> If someone were to tell me that this is expected behaviour I
> wouldn't like it, but at least I could accept it. Are machines doing
> plenty of CPU and memory utilization not a good measurement in this
> or any case?
>
> I think tomorrow night I will migrate all machines from xenturio1 to
> tarballerina, but first I have to verify that all vlans are
> available, and that I cannot do right now.
>
> ---
>
> Andreas
Hi Jon,

I am facing some issues with XCP 0.5:

1. The slave machine frequently hangs, and on reboot loses its NIC
information, although running ifconfig shows the interfaces and the
network can still be pinged.

2. It fails to migrate VMs (though I get a Success in the status of
the returned object).

Question: if I point a fresh installation of XCP master and slaves (a
new pool) at a pre-existing storage repository, how can I get the old
VMs back from the VHD files in the new setup? (This is the only pool
utilizing the storage repository.)

regards
Akshay