Hi:
Greeting first.

I was trying to run about 24 HVMs (currently only Linux, later Windows will be involved) on one physical server with 24GB memory and 16 CPUs. Each VM is configured with 2GB memory, and I reserved 8GB memory for dom0. For safety reasons, only domain U's memory is allowed to balloon.

Inside domain U, I used the xenballoond provided by XenSource to periodically write /proc/meminfo into xenstore in dom0 (/local/domain/did/memory/meminfo). In domain 0, I wrote a Python script to read the meminfo and, like the strategy Xen provides, use Committed_AS to calculate the domain U balloon target. The time interval is 1 second.

Inside each VM, I set up an Apache server for testing. Well, I'd like to say the result is not so good. It appears there is too much read/write traffic on xenstore: when I put some stress (using ab) on the guest domains, the CPU usage of xenstored goes up to 100%, so the monitor running in dom0 also responds quite slowly. Also, in the ab test, Committed_AS grows very fast and reaches maxmem in a short time, but in fact the guest really needs only a small amount of memory, so I guess there is more that should be taken into consideration for ballooning.

For the xenstore issue, I first plan to write a C program inside domain U to replace xenballoond to see whether the situation improves. If not, how about setting up an event channel directly between domU and dom0? Would that be faster?

Regarding the balloon strategy, I would do it like this: when there is enough memory, just fulfill the guest balloon requests, and when memory is short, distribute memory evenly among the guests that request inflation.

Does anyone have a better suggestion? Thanks in advance.
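Here is a rough sketch of the kind of dom0 monitor loop I mean, shown in C against libxenstore purely for illustration (my actual monitor is the Python script mentioned above; the 64MB slack and the min/max bounds below are arbitrary example values, and writing memory/target directly is what "xm mem-set" does under the hood):

/* Minimal sketch of a dom0 balloon monitor, assuming the layout described
 * above: each guest mirrors /proc/meminfo to
 * /local/domain/<domid>/memory/meminfo, and dom0 sets the balloon target
 * by writing a KiB value to /local/domain/<domid>/memory/target.
 * Error handling and domain enumeration are kept trivial on purpose. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <xs.h>                         /* libxenstore (link with -lxenstore) */

/* Pull "Committed_AS: <n> kB" out of a copy of /proc/meminfo. */
static long committed_as_kb(const char *meminfo)
{
    const char *p = strstr(meminfo, "Committed_AS:");
    return p ? strtol(p + strlen("Committed_AS:"), NULL, 10) : -1;
}

int main(void)
{
    struct xs_handle *xsh = xs_daemon_open();
    const int domid = 1;                /* example guest; iterate over all guests in practice */
    const long slack_kb = 64 * 1024;    /* arbitrary headroom above Committed_AS */
    const long min_kb = 256 * 1024;     /* never balloon below this */
    const long max_kb = 2048 * 1024;    /* the guest's maxmem */
    char path[64], val[32], *meminfo;
    unsigned int len;
    long kb, target;

    if (!xsh)
        return 1;
    for (;;) {
        snprintf(path, sizeof(path), "/local/domain/%d/memory/meminfo", domid);
        meminfo = xs_read(xsh, XBT_NULL, path, &len);
        if (meminfo) {
            kb = committed_as_kb(meminfo);
            if (kb > 0) {
                target = kb + slack_kb;
                if (target < min_kb) target = min_kb;
                if (target > max_kb) target = max_kb;
                snprintf(path, sizeof(path), "/local/domain/%d/memory/target", domid);
                snprintf(val, sizeof(val), "%ld", target);
                xs_write(xsh, XBT_NULL, path, val, strlen(val));
            }
            free(meminfo);
        }
        sleep(1);                       /* the 1-second interval described above */
    }
}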
Since /proc/meminfo is currently sent to domain 0 via xenstore, which in my opinion is slow, what I want to do is this: there is a shared page between domU and dom0, domU periodically updates the meminfo in the page, and on the other side dom0 retrieves the updated data to calculate the target, which the guest then uses for ballooning.

The problem I met is that I currently don't know how to implement a shared page between dom0 and domU. Would it be something like dom0 allocating an unbound event channel, waiting for the guest to connect, and transferring the data through the grant table? Or does someone have a more efficient way? Many thanks.

> From: tinnycloud@hotmail.com
> To: xen-devel@lists.xensource.com
> CC: dan.magenheimer@oracle.com; George.Dunlap@eu.citrix.com
> Subject: Xen balloon driver discuss
> Date: Sun, 21 Nov 2010 14:26:01 +0800
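The pattern I have in mind is roughly the following (an untested sketch against the 2.6.18 xenlinux in-kernel API; the names meminfo_page and "meminfo-gref" are made up for the example): the guest allocates a page, grants dom0 access to it, and publishes the grant reference under its own xenstore directory, so that dom0 can map the page once (e.g. through the gntdev/libxc grant mapping interface) and then read the statistics without any further xenstore traffic.

/* DomU-side sketch: share one page of memory statistics with dom0.
 * Assumes the 2.6.18 xenlinux in-kernel API (gnttab_grant_foreign_access,
 * virt_to_mfn, xenbus_printf); the names meminfo_page and "meminfo-gref"
 * are made up for this example. */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/gfp.h>
#include <xen/xenbus.h>
#include <xen/gnttab.h>

static void *meminfo_page;
static int meminfo_gref;

static int __init meminfo_share_init(void)
{
    meminfo_page = (void *)__get_free_page(GFP_KERNEL);
    if (!meminfo_page)
        return -ENOMEM;

    /* Grant dom0 (domid 0) read-only access to the page. */
    meminfo_gref = gnttab_grant_foreign_access(0, virt_to_mfn(meminfo_page), 1);
    if (meminfo_gref < 0) {
        free_page((unsigned long)meminfo_page);
        return meminfo_gref;
    }

    /* Publish the grant reference under this domain's xenstore home
     * (relative paths resolve to /local/domain/<domid>/...), so the
     * dom0 monitor knows which grant to map. */
    return xenbus_printf(XBT_NIL, "memory", "meminfo-gref", "%d", meminfo_gref);
}

static void __exit meminfo_share_exit(void)
{
    gnttab_end_foreign_access(meminfo_gref, 1, (unsigned long)meminfo_page);
}

module_init(meminfo_share_init);
module_exit(meminfo_share_exit);
MODULE_LICENSE("GPL");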
Xenstore IS slow, and you could improve xenballoond performance by only sending the single Committed_AS value from xenballoond in domU to dom0 instead of all of /proc/meminfo. But you are making an assumption that getting memory utilization information from domU to dom0 FASTER (e.g. with a shared page) will provide better ballooning results. I have not found this to be the case, which is what led to my investigation into self-ballooning, which led to Transcendent Memory. See the 2010 Xen Summit for more information.

In your last paragraph, regarding the balloon strategy: the problem is that it is not easy to define "enough memory" and "shortage of memory" within any guest, and almost impossible to define them and effectively load balance across many guests. See my Linux Plumber's Conference presentation (with complete speaker notes) here:

http://oss.oracle.com/projects/tmem/dist/documentation/presentations/MemMgmtVirtEnv-LPC2010-Final.pdf
http://oss.oracle.com/projects/tmem/dist/documentation/presentations/MemMgmtVirtEnv-LPC2010-SpkNotes.pdf

From: MaoXiaoyun [mailto:tinnycloud@hotmail.com]
Sent: Sunday, November 21, 2010 9:33 PM
To: xen devel
Cc: Dan Magenheimer; george.dunlap@eu.citrix.com
Subject: RE: Xen balloon driver discuss
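As a concrete illustration of the first suggestion, the guest-side agent only needs to extract and publish the one Committed_AS line instead of mirroring all of /proc/meminfo. An untested sketch (the memory/committed_as key is just an example name, and it shells out to the standard xenstore-write utility the same way xenballoond does):

/* Sketch of a guest-side agent that publishes only Committed_AS (in KiB)
 * once per second instead of mirroring all of /proc/meminfo.  It shells
 * out to the standard xenstore-write utility; "memory/committed_as" is
 * an arbitrary key name for this example (relative xenstore paths
 * resolve under /local/domain/<domid>). */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static long read_committed_as_kb(void)
{
    char line[256];
    long kb = -1;
    FILE *f = fopen("/proc/meminfo", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f))
        if (sscanf(line, "Committed_AS: %ld kB", &kb) == 1)
            break;
    fclose(f);
    return kb;
}

int main(void)
{
    char cmd[128];
    long kb;

    for (;;) {
        kb = read_committed_as_kb();
        if (kb >= 0) {
            snprintf(cmd, sizeof(cmd),
                     "xenstore-write memory/committed_as %ld", kb);
            system(cmd);
        }
        sleep(1);
    }
}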
Hi Dan:

Thanks for your presentation summarizing memory overcommit; it is really vivid and a great help. I guess the strategy I have in mind these days falls into the solution Set C in the PDF.

The tmem solution you worked out for memory overcommit is both efficient and effective. I guess I will give it a try on Linux guests.

The real situation I have is that most of the VMs running on the host are Windows, so I have to come up with policies to balance the memory. Policies are all workload dependent, but the good news is that the host workload is configurable and not very heavy, so I will try to figure out a favorable policy. The policies referred to in the PDF are a good start for me.

Today, instead of trying to implement the "/proc/meminfo" exchange with shared pages, I hacked the balloon driver to have another workqueue periodically write meminfo into xenstore through xenbus, which solves the problem of xenstored's high CPU utilization.

Later I will try to find out more about how Citrix does it. Thanks for your help. Or do you have any better idea for Windows guests?

From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com]
Date: 2010.11.23 1:47
To: MaoXiaoyun; xen devel
CC: george.dunlap@eu.citrix.com
Subject: RE: Xen balloon driver discuss
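The hack is roughly along these lines (an untested sketch only: it uses a kernel thread instead of the extra workqueue for brevity, and it publishes just a couple of si_meminfo() fields rather than the full meminfo described above):

/* Sketch of exporting guest memory stats from inside the guest kernel:
 * a thread wakes up once a second and writes a few numbers under the
 * guest's own xenstore directory via xenbus (relative paths resolve to
 * /local/domain/<domid>/...).  The real change described above used an
 * extra workqueue in the balloon driver and exported the meminfo fields
 * needed by the dom0 monitor; this only shows the plumbing. */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/err.h>
#include <linux/mm.h>
#include <xen/xenbus.h>

static struct task_struct *meminfo_task;

static int meminfo_thread(void *unused)
{
    struct sysinfo si;

    while (!kthread_should_stop()) {
        si_meminfo(&si);
        xenbus_printf(XBT_NIL, "memory", "totalram_kb", "%lu",
                      si.totalram * (si.mem_unit / 1024));
        xenbus_printf(XBT_NIL, "memory", "freeram_kb", "%lu",
                      si.freeram * (si.mem_unit / 1024));
        msleep_interruptible(1000);     /* the 1-second interval */
    }
    return 0;
}

static int __init meminfo_export_init(void)
{
    meminfo_task = kthread_run(meminfo_thread, NULL, "xen-meminfo");
    return IS_ERR(meminfo_task) ? PTR_ERR(meminfo_task) : 0;
}

static void __exit meminfo_export_exit(void)
{
    kthread_stop(meminfo_task);
}

module_init(meminfo_export_init);
module_exit(meminfo_export_exit);
MODULE_LICENSE("GPL");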
Hi Dan:

I have set up the benchmark to test the balloon driver, but unfortunately Xen crashed with a memory panic. Before I attach the detailed output from the serial port (which takes time on the next run), I am afraid I might have missed something in the test environment.

My dom0 kernel is 2.6.31, pvops. Currently there is no drivers/xen/balloon.c in this kernel source tree, so I built xen-balloon.ko and xen-platform-pci.ko from linux-2.6.18.x86_64 and installed them in domU, which is Red Hat 5.4.

What I did is put a C program in each domU (24 HVMs in total); the program allocates memory and fills it with random strings repeatedly. And in dom0, a Python monitor collects the meminfo from xenstore and calculates the balloon target from Committed_AS. The panic happens when the program is running in just one domU.

I am writing to ask whether my balloon driver is out of date, and where I can get the latest source code. I've googled a lot, but still have a lot of confusion about those source trees.

Many thanks.

From: tinnycloud [mailto:tinnycloud@hotmail.com]
Date: 2010.11.23 22:58
To: 'Dan Magenheimer'; 'xen devel'
CC: 'george.dunlap@eu.citrix.com'
Subject: re: Xen balloon driver discuss
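For reference, a sketch of the kind of in-guest stress program I mean (the 64MB chunk size and the 1GB cap are arbitrary example values):

/* Simple memory-stress sketch: repeatedly allocate chunks and keep
 * filling them with pseudo-random bytes so the pages stay dirty and
 * resident.  Chunk size and total cap are arbitrary example values. */
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define CHUNK_BYTES (64UL * 1024 * 1024)    /* 64MB per allocation */
#define MAX_CHUNKS  16                      /* cap the program at ~1GB */

int main(void)
{
    char *chunks[MAX_CHUNKS];
    int n = 0, i;

    srand(getpid());
    for (;;) {
        if (n < MAX_CHUNKS) {
            chunks[n] = malloc(CHUNK_BYTES);
            if (chunks[n])
                n++;
        }
        /* Touch every chunk again so the guest keeps the pages in use. */
        for (i = 0; i < n; i++)
            memset(chunks[i], rand() & 0xff, CHUNK_BYTES);
        sleep(1);
    }
}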
Am I understanding correctly that you are running each linux-2.6.18 as HVM (not PV)? I didn't think that the linux-2.6.18 balloon driver worked at all in an HVM guest. You also didn't say what version of Xen you are using. If you are running xen-unstable, you should also provide the changeset number.

In any case, no load of HVM guests should ever crash Xen itself, but if you are running HVM guests, I probably can't help much as I almost never run HVM guests.

From: cloudroot [mailto:cloudroot@sina.com]
Sent: Friday, November 26, 2010 11:55 PM
To: tinnycloud; Dan Magenheimer; xen devel
Cc: george.dunlap@eu.citrix.com
Subject: re: Xen balloon driver discuss
On Sat, Nov 27, 2010 at 02:54:46PM +0800, cloudroot wrote:
> I have set up the benchmark to test the balloon driver, but unfortunately
> Xen crashed with a memory panic.
>
> My dom0 kernel is 2.6.31, pvops.

You should switch to a 2.6.32 based dom0 kernel. The 2.6.31 tree is not supported or maintained anymore, so switch to the xen/stable-2.6.32.x branch in Jeremy's git tree.

Other than that, I haven't tried ballooning with HVM guests, so I'm not sure if that should work with the EL5 kernel.

-- Pasi
Hi Dan:

You are right, the HVM guest kernel is kernel-2.6.18-164.el5.src.rpm, coming from ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/. Currently the balloon driver is compiled from this kernel. (So I am afraid the driver may be out of date, and I plan to get a newer balloon.c from xenlinux and put it into this kernel to compile a new xen-balloon.ko.)

My Xen is 4.0.0, again with the pvops 2.6.31 dom0 kernel.

Actually, I have two problems: the first is a PoD ("populate-on-demand" memory) issue, and the second is a Xen panic (I will run more tests and report in another reply). I have googled a bit and applied the patch from http://lists.xensource.com/archives/html/xen-devel/2010-07/msg01404.html, but it doesn't work for me.

-------------------------------------------------Domain Crash Case---------------------------------------------

The issue is easy to reproduce. I started one HVM with the command line: xm cr hvm.linux.balloon maxmem=2048 memory=512. The guest works well at first, but it crashed as soon as I logged into it through VNC. The serial output is:

blktap_sysfs_create: adding attributes for dev ffff8801224df000
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory! tot_pages 132088 pod_entries 9489
(XEN) domain_crash called from p2m.c:1127
(XEN) Domain 4 reported crashed by domain 0 on cpu#0:
(XEN) printk: 31 messages suppressed.
(XEN) grant_table.c:555:d0 Iomem mapping not permitted ffffffffffffffff (domain 4)
blktap_sysfs_destroy
blktap_sysfs_create: adding attributes for dev ffff88012259ca00

-------------------------------------------------Xen Crash Case---------------------------------------------

In addition, if the guest is started like: xm cr hvm.linux.balloon maxmem=2048 memory=400, I get:

blktap_sysfs_destroy
blktap_sysfs_create: adding attributes for dev ffff8801224df000
(XEN) p2m_pod_demand_populate: Out of populate-on-demand memory! tot_pages 132088 pod_entries 9489
(XEN) domain_crash called from p2m.c:1127
(XEN) Domain 4 reported crashed by domain 0 on cpu#0:
(XEN) printk: 31 messages suppressed.
(XEN) grant_table.c:555:d0 Iomem mapping not permitted ffffffffffffffff (domain 4)
blktap_sysfs_destroy
blktap_sysfs_create: adding attributes for dev ffff88012259ca00
blktap_sysfs_destroy
blktap_sysfs_create: adding attributes for dev ffff88012259c600
(XEN) Error: p2m lock held by p2m_change_type
(XEN) Xen BUG at p2m-ept.c:38
(XEN) ----[ Xen-4.0.0  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    6
(XEN) RIP:    e008:[<ffff82c4801df2aa>] ept_pod_check_and_populate+0x13a/0x150
(XEN) RFLAGS: 0000000000010282   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: ffff83063fdc0000   rcx: 0000000000000092
(XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c48021e844
(XEN) rbp: ffff83023fefff28   rsp: ffff83023feffc18   r8:  0000000000000001
(XEN) r9:  0000000000000001   r10: 0000000000000000   r11: ffff82c4801318d0
(XEN) r12: ffff8302f5914ef8   r13: 0000000000000001   r14: 0000000000000000
(XEN) r15: 0000000000003bdf   cr0: 0000000080050033   cr4: 00000000000026f0
(XEN) cr3: 000000063fc2e000   cr2: 00002ba99c046000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff83023feffc18:
(XEN)    0000000000000002 0000000000000000 0000000000000000 ffff83063fdc0000
(XEN)    ffff8302f5914ef8 0000000000000001 ffff83023feffc70 ffff82c4801df46e
(XEN)    0000000000000000 ffff83023feffcc4 0000000000003bdf 00000000000001df
(XEN)    ffff8302f5914000 ffff83063fdc0000 ffff83023fefff28 0000000000003bdf
(XEN)    0000000000000002 0000000000000001 0000000000000030 ffff82c4801bafe4
(XEN)    ffff8302f89dc000 000000043fefff28 ffff83023fefff28 0000000000003bdf
(XEN)    00000000002f9223 0000000000000030 ffff83023fefff28 ffff82c48019bab1
(XEN)    0000000000000000 00000001bdc62000 0000000000000000 0000000000000182
(XEN)    ffff8300bdc62000 ffff82c4801b3824 ffff83063fdc0348 07008300bdc62000
(XEN)    ffff83023fe808d0 0000000000000040 000000063fc3601e 0000000000000000
(XEN)    ffff83023fefff28 ffff82c480167d17 ffff82c4802509c0 0000000000000000
(XEN)    0000000003bdf000 000000000001c000 ffff83023feffdc8 0000000000000080
(XEN)    ffff82c480250dd0 0000000000003bdf 00ff82c480250080 ffff82c480250dc0
(XEN)    ffff82c480250080 ffff82c480250dc0 0000000000004040 0000000000000000
(XEN)    0000000000004040 0000000000000040 ffff82c4801447da 0000000000000080
(XEN)    ffff83023fefff28 0000000000000092 ffff82c4801a7f6c 00000000000000fc
(XEN)    0000000000000092 0000000000000006 ffff8300bdc63760 0000000000000006
(XEN)    ffff82c48025c100 ffff82c480250100 ffff82c480250100 0000000000000292
(XEN)    ffff8300bdc637f0 00000249b30f6a00 0000000000000292 ffff82c4801a9383
(XEN)    00000000000000ef ffff8300bdc62000 ffff8300bdc62000 ffff8300bdc637e8
(XEN) Xen call trace:
(XEN)    [<ffff82c4801df2aa>] ept_pod_check_and_populate+0x13a/0x150
(XEN)    [<ffff82c4801df46e>] ept_get_entry+0x1ae/0x1c0
(XEN)    [<ffff82c4801bafe4>] p2m_change_type+0x144/0x1b0
(XEN)    [<ffff82c48019bab1>] hvm_hap_nested_page_fault+0x121/0x190
(XEN)    [<ffff82c4801b3824>] vmx_vmexit_handler+0x304/0x1a90
(XEN)    [<ffff82c480167d17>] __smp_call_function_interrupt+0x57/0x90
(XEN)    [<ffff82c4801447da>] __find_next_bit+0x6a/0x70
(XEN)    [<ffff82c4801a7f6c>] vpic_get_highest_priority_irq+0x2c/0xa0
(XEN)    [<ffff82c4801a9383>] pt_update_irq+0x33/0x1e0
(XEN)    [<ffff82c4801a6042>] vlapic_has_pending_irq+0x42/0x70
(XEN)    [<ffff82c4801a0c88>] hvm_vcpu_has_pending_irq+0x88/0xa0
(XEN)    [<ffff82c4801b263b>] vmx_vmenter_helper+0x5b/0x150
(XEN)    [<ffff82c4801ada63>] vmx_asm_do_vmentry+0x0/0xdd
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 6:
(XEN) Xen BUG at p2m-ept.c:38
(XEN) ****************************************
(XEN)
(XEN) Manual reset required ('noreboot' specified)

---------------------------------------Works configuration--------------------------------------------------

And if the guest is started like: xm cr hvm.linux.balloon maxmem=1024 memory=512, it can be logged into successfully through VNC.

Any idea what is happening? PoD is new to me; I will try to learn more. Thanks.

From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com]
Date: 2010.11.28 10:36
To: tinnycloud; xen devel
CC: george.dunlap@eu.citrix.com
Subject: RE: Xen balloon driver discuss
Hi George:

I read http://lists.xensource.com/archives/html/xen-devel/2010-07/msg01404.html more carefully, and got my printout of the first call of p2m_pod_demand_populate(), which is:

houyi-chunk2.dev.sd.aliyun.com login: blktap_sysfs_create: adding attributes for dev ffff880122466400
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 523776

And memory/target under /local/domain/1/ is 524288. So 523776 is less than 524288; I think the problem is similar, right? But the question is why the patch doesn't work for me.

Many thanks.

From: tinnycloud [mailto:tinnycloud@hotmail.com]
Date: November 29, 2010 12:21
To: 'Dan Magenheimer'; 'xen devel'
CC: 'george.dunlap@eu.citrix.com'
Subject: re: Xen balloon driver discuss
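One thing I am not sure about is the units: if tot_pages and pod_entries are counts of 4KiB pages while memory/target in xenstore is in KiB, then the two numbers above are not directly comparable. A quick arithmetic check under that assumption:

/* Unit sanity check for the numbers above, assuming tot_pages and
 * pod_entries are counts of 4KiB pages while memory/target is in KiB. */
#include <stdio.h>

int main(void)
{
    const unsigned long tot_pages   = 132088;   /* from the PoD printk */
    const unsigned long pod_entries = 523776;   /* from the PoD printk */
    const unsigned long target_kib  = 524288;   /* /local/domain/1/memory/target */

    printf("tot_pages   = %lu pages = %lu MiB\n", tot_pages, tot_pages * 4 / 1024);
    printf("pod_entries = %lu pages = %lu MiB\n", pod_entries, pod_entries * 4 / 1024);
    printf("target      = %lu KiB = %lu MiB\n", target_kib, target_kib / 1024);
    return 0;
}
/* With these inputs: tot_pages is about 515 MiB, pod_entries is 2046 MiB
 * (close to maxmem=2048M), and target is 512 MiB (the memory= value). */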
(XEN) grant_table.c:555:d0 Iomem mapping not permitted ffffffffffffffff (domain 4) blktap_sysfs_destroy blktap_sysfs_create: adding attributes for dev ffff88012259ca00 blktap_sysfs_destroy blktap_sysfs_create: adding attributes for dev ffff88012259c600 (XEN) Error: p2m lock held by p2m_change_type (XEN) Xen BUG at p2m-ept.c:38 (XEN) ----[ Xen-4.0.0 x86_64 debug=n Not tainted ]---- (XEN) CPU: 6 (XEN) RIP: e008:[<ffff82c4801df2aa>] ept_pod_check_and_populate+0x13a/0x150 (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor (XEN) rax: 0000000000000000 rbx: ffff83063fdc0000 rcx: 0000000000000092 (XEN) rdx: 000000000000000a rsi: 000000000000000a rdi: ffff82c48021e844 (XEN) rbp: ffff83023fefff28 rsp: ffff83023feffc18 r8: 0000000000000001 (XEN) r9: 0000000000000001 r10: 0000000000000000 r11: ffff82c4801318d0 (XEN) r12: ffff8302f5914ef8 r13: 0000000000000001 r14: 0000000000000000 (XEN) r15: 0000000000003bdf cr0: 0000000080050033 cr4: 00000000000026f0 (XEN) cr3: 000000063fc2e000 cr2: 00002ba99c046000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008 (XEN) Xen stack trace from rsp=ffff83023feffc18: (XEN) 0000000000000002 0000000000000000 0000000000000000 ffff83063fdc0000 (XEN) ffff8302f5914ef8 0000000000000001 ffff83023feffc70 ffff82c4801df46e (XEN) 0000000000000000 ffff83023feffcc4 0000000000003bdf 00000000000001df (XEN) ffff8302f5914000 ffff83063fdc0000 ffff83023fefff28 0000000000003bdf (XEN) 0000000000000002 0000000000000001 0000000000000030 ffff82c4801bafe4 (XEN) ffff8302f89dc000 000000043fefff28 ffff83023fefff28 0000000000003bdf (XEN) 00000000002f9223 0000000000000030 ffff83023fefff28 ffff82c48019bab1 (XEN) 0000000000000000 00000001bdc62000 0000000000000000 0000000000000182 (XEN) ffff8300bdc62000 ffff82c4801b3824 ffff83063fdc0348 07008300bdc62000 (XEN) ffff83023fe808d0 0000000000000040 000000063fc3601e 0000000000000000 (XEN) ffff83023fefff28 ffff82c480167d17 ffff82c4802509c0 0000000000000000 (XEN) 0000000003bdf000 000000000001c000 ffff83023feffdc8 0000000000000080 (XEN) ffff82c480250dd0 0000000000003bdf 00ff82c480250080 ffff82c480250dc0 (XEN) ffff82c480250080 ffff82c480250dc0 0000000000004040 0000000000000000 (XEN) 0000000000004040 0000000000000040 ffff82c4801447da 0000000000000080 (XEN) ffff83023fefff28 0000000000000092 ffff82c4801a7f6c 00000000000000fc (XEN) 0000000000000092 0000000000000006 ffff8300bdc63760 0000000000000006 (XEN) ffff82c48025c100 ffff82c480250100 ffff82c480250100 0000000000000292 (XEN) ffff8300bdc637f0 00000249b30f6a00 0000000000000292 ffff82c4801a9383 (XEN) 00000000000000ef ffff8300bdc62000 ffff8300bdc62000 ffff8300bdc637e8 (XEN) Xen call trace: (XEN) [<ffff82c4801df2aa>] ept_pod_check_and_populate+0x13a/0x150 (XEN) [<ffff82c4801df46e>] ept_get_entry+0x1ae/0x1c0 (XEN) [<ffff82c4801bafe4>] p2m_change_type+0x144/0x1b0 (XEN) [<ffff82c48019bab1>] hvm_hap_nested_page_fault+0x121/0x190 (XEN) [<ffff82c4801b3824>] vmx_vmexit_handler+0x304/0x1a90 (XEN) [<ffff82c480167d17>] __smp_call_function_interrupt+0x57/0x90 (XEN) [<ffff82c4801447da>] __find_next_bit+0x6a/0x70 (XEN) [<ffff82c4801a7f6c>] vpic_get_highest_priority_irq+0x2c/0xa0 (XEN) [<ffff82c4801a9383>] pt_update_irq+0x33/0x1e0 (XEN) [<ffff82c4801a6042>] vlapic_has_pending_irq+0x42/0x70 (XEN) [<ffff82c4801a0c88>] hvm_vcpu_has_pending_irq+0x88/0xa0 (XEN) [<ffff82c4801b263b>] vmx_vmenter_helper+0x5b/0x150 (XEN) [<ffff82c4801ada63>] vmx_asm_do_vmentry+0x0/0xdd (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 6: (XEN) Xen BUG at p2m-ept.c:38 (XEN) **************************************** (XEN) 
(XEN) Manual reset required (''noreboot'' specified) ---------------------------------------Works configuration-------------------------------------------------- And if starts guest like xm cr hvm.linux.balloon maxmem=1024 memory=512 the guest can be successfully logon through VNC Any idea on what happens? PoD is new to me, I will try to know more, thanks. From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com] Date: 2010.11.28 10:36 sent: tinnycloud; xen devel cc: george.dunlap@eu.citrix.com subject: RE: Xen balloon driver discuss Am I understanding correctly that you are running each linux-2.6.18 as HVM (not PV)? I didn’t think that the linux-2.6.18 balloon driver worked at all in an HVM guest. You also didn’t say what version of Xen you are using. If you are running xen-unstable, you should also provide the changeset number. In any case, any load of HVM guests should never crash Xen itself, but if you are running HVM guests, I probably can’t help much as I almost never run HVM guests. From: cloudroot [mailto:cloudroot@sina.com] Sent: Friday, November 26, 2010 11:55 PM To: tinnycloud; Dan Magenheimer; xen devel Cc: george.dunlap@eu.citrix.com Subject: re: Xen balloon driver discuss Hi Dan: I have set the benchmark to test balloon driver, but unfortunately the Xen crashed on memory Panic. Before I attach the details output from serial port(which takes time on next run), I am afraid of I might miss something on test environment. My dom0 kernel is 2.6.31, pvops. Well currently there is no driver/xen/balloon.c on this kernel source tree, so I build the xen-balloon.ko, Xen-platform-pci.ko form linux-2.6.18.x86_64, and installed in Dom U, which is redhat 5.4. What I did is put a C program in the each Dom U(total 24 HVM), the program will allocate the memory and fill it with random string repeatly. And in dom0, a phthon monitor will collect the meminfo from xenstore and calculate the target to balloon from Committed_AS. The panic happens when the program is running in just one Dom. I am writing to ask whether my balloon driver is out of date, or where can I get the latest source code, I’ve googled a lot, but still have a lot of confusion on those source tree. Many thanks. From: tinnycloud [mailto:tinnycloud@hotmail.com] Date: 2010.11.23 22:58 TO: ''Dan Magenheimer''; ''xen devel'' CC: ''george.dunlap@eu.citrix.com'' Subject: re: Xen balloon driver discuss HI Dan: Appreciate for your presentation in summarizing the memory overcommit, really vivid and in great help. Well, I guess recently days the strategy in my mind will fall into the solution Set C in pdf. The tmem solution your worked out for memory overcommit is both efficient and effective. I guess I will have a try on Linux Guest. The real situation I have is most of the running VMs on host are windows. So I had to come up those policies to balance the memory. Although policies are all workload dependent. Good news is host workload is configurable, and not very heavy So I will try to figure out some favorable policy. The policies referred in pdf are good start for me. Today, instead of trying to implement “/proc/meminfo” with shared pages, I hacked the balloon driver to have another workqueue periodically write meminfo into xenstore through xenbus, which solve the problem of xenstrore high CPU utilization problem. Later I will try to google more on how Citrix does. Thanks for your help, or do you have any better idea for windows guest? 
From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com]
Date: 2010.11.23 1:47
To: MaoXiaoyun; xen devel
Cc: george.dunlap@eu.citrix.com
Subject: RE: Xen balloon driver discuss

Xenstore IS slow, and you could improve xenballoond performance by only sending the single Committed_AS value from xenballoond in domU to dom0 instead of all of /proc/meminfo. But you are making an assumption that getting memory utilization information from domU to dom0 FASTER (e.g. with a shared page) will provide better ballooning results. I have not found this to be the case, which is what led to my investigation into self-ballooning, which led to Transcendent Memory. See the 2010 Xen Summit for more information.

In your last paragraph below, "Regards balloon strategy", the problem is that it is not easy to define "enough memory" and "shortage of memory" within any guest, and almost impossible to define them and effectively load balance across many guests. See my Linux Plumber's Conference presentation (with complete speaker notes) here:

http://oss.oracle.com/projects/tmem/dist/documentation/presentations/MemMgmtVirtEnv-LPC2010-Final.pdf
http://oss.oracle.com/projects/tmem/dist/documentation/presentations/MemMgmtVirtEnv-LPC2010-SpkNotes.pdf
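As a concrete illustration of Dan's suggestion to send only the single Committed_AS value, here is a minimal domU-side sketch (this is not the xenballoond shipped with Xen). It assumes the xenstore-write utility from the Xen guest tools is available inside the guest, and the node name memory/committed_as is only a placeholder that must match whatever the dom0 monitor reads:

#!/usr/bin/env python
# Minimal sketch (not the stock xenballoond): push only the Committed_AS
# value from domU to dom0 via xenstore once per second, instead of
# mirroring all of /proc/meminfo.  Assumes the xenstore-write utility
# from the Xen guest tools is installed; "memory/committed_as" is an
# example node name (relative xenstore paths resolve under
# /local/domain/<domid>/ for the calling domain).
import subprocess
import time

def committed_as_kb():
    # Committed_AS is reported by the kernel in KiB.
    f = open("/proc/meminfo")
    try:
        for line in f:
            if line.startswith("Committed_AS:"):
                return int(line.split()[1])
    finally:
        f.close()
    raise RuntimeError("Committed_AS not found in /proc/meminfo")

while True:
    subprocess.call(["xenstore-write", "memory/committed_as",
                     str(committed_as_kb())])
    time.sleep(1)

The dom0 monitor would then read the same node (for example with xenstore-read) and keep its Committed_AS-based policy unchanged.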
I am also interested in tinnycloud's problem. It looks like the PoD cache has been used up, here:

    if ( p2md->pod.count == 0 )
        goto out_of_memory;

George, would you please take a look at this problem and, if possible, tell us a little more about what the PoD cache means? Is it a memory pool for PoD allocation?
Well, I forgot to print out the pod count; please see below. Thanks.

blktap_sysfs_destroy
blktap_sysfs_create: adding attributes for dev ffff88015456b200
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 523776 pod_count 130560
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 523264 pod_count 130048
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 522752 pod_count 129536
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 522240 pod_count 129024
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 521728 pod_count 128512
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 521216 pod_count 128000
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 520704 pod_count 127488
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 520192 pod_count 126976
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 519680 pod_count 126464

It looks like a 512M PoD cache is too small, since if I xm mem-set the HVM to 1G before I log into it through VNC, the guest won't crash. So the solution is to enlarge the PoD cache, like

xm cr hvm.linux.balloon maxmem=2048 memory=768

am I right?

To: 'xen devel'; george.dunlap@eu.citrix.com
Cc: 'tinnycloud'; 'Dan Magenheimer'
Subject: Re: Xen balloon driver discuss

Hi George:

I read http://lists.xensource.com/archives/html/xen-devel/2010-07/msg01404.html more carefully, and got my printout of the first call of p2m_pod_demand_populate(), which is:

houyi-chunk2.dev.sd.aliyun.com login: blktap_sysfs_create: adding attributes for dev ffff880122466400
(XEN) p2m_pod_demand_populate: =========pulate-on-demand memory! tot_pages 132088 pod_entries 523776

And memory/target under /local/domain/1/ is 524288. So 523776 is less than 524288; I think the problem is similar, right? But the question is why the patch doesn't work for me.

Many thanks.
No, you're confusing two things. pod_entries is the number of entries in the p2m table that have neither been populated with memory, nor been reclaimed by the balloon driver. Are you sure the balloon driver is actually working?

Chu: Yes, the PoD "cache" is the memory pool which is used to populate PoD entries. "Cache" is a bad name, I should have called it "pool" to begin with.

-George
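For reference, converting the page counts from the printouts earlier in the thread (4 KiB pages) into megabytes makes the distinction between pod_entries and the PoD cache concrete. This is back-of-the-envelope arithmetic, not output from any Xen tool, and the comments reflect a reading of the fields, not authoritative definitions:

# Rough arithmetic for the figures printed above (4 KiB pages -> MiB),
# taken from the guest started with maxmem=2048 memory=512.
PAGE_KIB = 4

def pages_to_mib(pages):
    return pages * PAGE_KIB / 1024.0

tot_pages   = 132088   # pages currently allocated to the domain (~516 MiB)
pod_entries = 523776   # outstanding PoD entries in the p2m (~2046 MiB)
pod_count   = 130560   # pages left in the PoD cache/pool (~510 MiB)

for name, pages in [("tot_pages", tot_pages),
                    ("pod_entries", pod_entries),
                    ("pod_count", pod_count)]:
    print "%-12s %7d pages = %7.1f MiB" % (name, pages, pages_to_mib(pages))

# pod_entries covers nearly the whole 2 GiB maxmem, while the cache only
# holds about 510 MiB; once the guest touches more than that before the
# balloon driver reclaims pages, demand-populate runs out of cache.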
FYI, the balloon driver in 2.6.18 was meant to be working at some point. The xen tree has some drivers which will compile for 2.6.18 externally and will run in HVM mode. More modern kernels need Stefano's pv-on-hvm patch series to be able to access xenstore (which is a requisite for a working balloon driver).

-George
Well, I think I know the problem. PoD expects the guest domain to balloon down to its target soon after it starts, but in fact I only have the balloon driver installed in domain U; no actual ballooning is going on.

So if we run out of PoD cache before ballooning works, Xen will crash the domain (goto out_of_memory), and in this situation domain U swap is not available (dom U can't use swap memory), right?

And when ballooning actually works, the PoD cache will finally decrease to 0 and no longer be used any more, right?

In my understanding, the PoD cache is much like a memory pool used for domain initialization. This reminds me of tmem, which is a pool of all host memory. But tmem needs OS modification, while PoD supports HVM; could we use this method to implement a tmem-like memory overcommit?
On 29/11/10 10:55, tinnycloud wrote:
> So if we run out of PoD cache before ballooning works, Xen will crash
> the domain (goto out_of_memory),

That's right; PoD is only meant to allow a guest to run from boot until the balloon driver can load. It's to allow a guest to "boot ballooned."

> and in this situation domain U swap is not available (dom U can't use
> swap memory), right?

I don't believe swap and PoD are integrated at the moment, no.

> And when ballooning actually works, the PoD cache will finally decrease
> to 0 and no longer be used any more, right?

Conceptually, yes. What actually happens is that ballooning will reduce it so that pod_entries==cache_size. Entries will stay PoD until the guest touches them. It's likely that eventually the guest will touch all the pages, at which point the PoD cache will be 0.

> could we use this method to implement a tmem-like memory overcommit?

PoD does require guest knowledge -- it requires the balloon driver to be loaded soon after boot so that the guest will limit its memory usage. It also doesn't allow overcommit: memory in the PoD cache is already allocated to the VM, and can't be used for something else.

You can't do overcommit without either:
* The guest knowing that it might not get the memory back, and being OK with that (tmem), or
* Swapping, which doesn't require PoD at all.

If you're thinking about scanning for zero pages and automatically reclaiming them, for instance, you have to be able to deal with a situation where the guest decides to use a page you've reclaimed but you've already given your last free page to someone else, and there are no more zero pages anywhere on the system. That would mean either just pausing the VM indefinitely, or choosing another guest page to swap out.

-George
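To put rough numbers on the "boot ballooned" model described above, here is an illustrative calculation for a maxmem=2048, memory=512 guest like the one earlier in the thread. It is arithmetic only; real counts come out slightly lower because of memory holes and toolstack overhead:

# Back-of-the-envelope view of "boot ballooned" for maxmem=2048 memory=512.
# Illustrative arithmetic only; actual counts are a little lower because
# of e820 holes, video RAM and toolstack overheads.
PAGE_KIB = 4

def mib_to_pages(mib):
    return mib * 1024 / PAGE_KIB

maxmem_mib = 2048   # static maximum: all of it starts as PoD entries
target_mib = 512    # "memory=" at boot, roughly the PoD cache size

pod_entries_at_boot = mib_to_pages(maxmem_mib)   # ~524288 pages
pod_cache_at_boot   = mib_to_pages(target_mib)   # ~131072 pages

# The balloon driver must return about (maxmem - target) worth of pages
# soon after boot, so that pod_entries falls to the cache size and every
# remaining PoD entry can still be backed by a cache page when touched.
pages_to_balloon_out = pod_entries_at_boot - pod_cache_at_boot

print "PoD entries at boot : %d pages" % pod_entries_at_boot
print "PoD cache at boot   : %d pages" % pod_cache_at_boot
print "balloon must reclaim: %d pages (%d MiB)" % (
    pages_to_balloon_out, pages_to_balloon_out * PAGE_KIB / 1024)

The pod_entries value of 523776 seen in the earlier printouts sits just below the 524288 pages that 2 GiB corresponds to, which is consistent with this picture.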
Hi George:

Thanks for the details. I understand more now, but I still have some confusion.

1. Is it necessary to balloon to max-target right at domain U startup? (For xm cr xxx.hvm maxmem=2048 memory=1024, max-target is 2048-1024.) Say, is it safe to balloon so that the guest has only 512M of memory in total? Or 1536M (in this situation, I guess the PoD entries will also be reduced and the extra 512M will be added to the PoD cache, right)?

2. Suppose we had a Xen-wide PoD memory pool that is accessible to every guest domain: when a guest needs a page, it gets the page from the pool, and we can still use the balloon strategy to have guests free pages back to the pool. So as long as the total memory in use by all domains is less than host physical memory, it is safe. And when no memory is available from the host, a domain that needs new memory may pause to wait for others to free some, or use swap. Is that possible?

3. If item 2 is possible, it looks more like tmem. What will tmem do when the memory requested is larger than host physical memory?

I will have a detailed look tomorrow. Thanks for your kind help.

From my iPad
Yes, sorry, I was thinking about the upstream balloon driver, which fails to init if (!xen_pv_domain()).

The only other problem I can think of in the RH5 balloon driver is that, I think, there is no minimum-size check... i.e. if you try to balloon to a very small size (which can happen accidentally if you use the wrong units with /proc/xen/balloon), the guest kernel will crash.
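Dan's warning suggests the dom0 policy should clamp whatever target it computes before pushing it to a guest. A minimal sketch, assuming the xm toolstack (xm mem-set takes a size in MiB) and an arbitrary 256 MiB floor that would need tuning per guest and workload:

# Sketch of a guard for the dom0 policy script, following Dan's warning:
# never hand the guest a target below some safe floor, since the RH5
# balloon driver will not stop you from ballooning the kernel to death.
# The 256 MiB floor and the use of "xm mem-set" are assumptions to adapt
# to your own policy and toolstack.
import subprocess

MIN_TARGET_MIB = 256    # assumed per-guest floor; tune per workload

def apply_balloon_target(domname, target_mib, maxmem_mib):
    # Clamp into [MIN_TARGET_MIB, maxmem_mib] before telling the guest.
    clamped = max(MIN_TARGET_MIB, min(int(target_mib), maxmem_mib))
    subprocess.call(["xm", "mem-set", domname, str(clamped)])
    return clamped

# Example: a Committed_AS-driven target of 90 MiB gets raised to 256 MiB.
# apply_balloon_target("hvm-guest-01", 90, 2048)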
> > > > The real situation I have is most of the running VMs on host are > > windows. So I had to come up those policies to balance the memory. > > > > Although policies are all workload dependent. Good news is host > workload > > is configurable, and not very heavy > > > > So I will try to figure out some favorable policy. The policies > referred > > in pdf are good start for me. > > > > Today, instead of trying to implement "/proc/meminfo" with shared > pages, > > I hacked the balloon driver to have another > > > > workqueue periodically write meminfo into xenstore through xenbus, > which > > solve the problem of xenstrore high CPU > > > > utilization problem. > > > > Later I will try to google more on how Citrix does. > > > > Thanks for your help, or do you have any better idea for windows > guest? > > > > *Sent:* Dan Magenheimer [mailto:dan.magenheimer@oracle.com] > > *Date:* 2010.11.23 1:47 > > *To:* MaoXiaoyun; xen devel > > *CC:* george.dunlap@eu.citrix.com > > *Subject:* RE: Xen balloon driver discuss > > > > Xenstore IS slow and you could improve xenballoond performance by > only > > sending the single CommittedAS value from xenballoond in domU to dom0 > > instead of all of /proc/meminfo. But you are making an assumption > that > > getting memory utilization information from domU to dom0 FASTER (e.g. > > with a shared page) will provide better ballooning results. I have > not > > found this to be the case, which is what led to my investigation into > > self-ballooning, which led to Transcendent Memory. See the 2010 Xen > > Summit for more information. > > > > In your last paragraph below "Regards balloon strategy", the problem > is > > it is not easy to define "enough memory" and "shortage of memory" > within > > any guest and almost impossible to define it and effectively load > > balance across many guests. See my Linux Plumber''s Conference > > presentation (with complete speaker notes) here: > > > > > http://oss.oracle.com/projects/tmem/dist/documentation/presentations/Me > mMgmtVirtEnv-LPC2010-Final.pdf > > > > > http://oss.oracle.com/projects/tmem/dist/documentation/presentations/Me > mMgmtVirtEnv-LPC2010-SpkNotes.pdf > > > > *From:* MaoXiaoyun [mailto:tinnycloud@hotmail.com] > > *Sent:* Sunday, November 21, 2010 9:33 PM > > *To:* xen devel > > *Cc:* Dan Magenheimer; george.dunlap@eu.citrix.com > > *Subject:* RE: Xen balloon driver discuss > > > > > > Since currently /cpu/meminfo is sent to domain 0 via xenstore, which > in > > my opinoin is slow. > > What I want to do is: there is a shared page between domU and dom0, > and > > domU periodically > > update the meminfo into the page, while on the other side dom0 > retrive > > the updated data for > > caculating the target, which is used by guest for balloning. > > > > The problem I met is, currently I don''t know how to implement a > shared > > page between > > dom0 and domU. > > Would it like dom 0 alloc a unbound event and wait guest to connect, > and > > transfer date through > > grant table? > > Or someone has more efficient way? > > many thanks. > > > >> From: tinnycloud@hotmail.com > >> To: xen-devel@lists.xensource.com > >> CC: dan.magenheimer@oracle.com; George.Dunlap@eu.citrix.com > >> Subject: Xen balloon driver discuss > >> Date: Sun, 21 Nov 2010 14:26:01 +0800 > >> > >> Hi: > >> Greeting first. > >> > >> I was trying to run about 24 HVMS (currently only Linux, later will > >> involve Windows) on one physical server with 24GB memory, 16CPUs. 
> >> Each VM is configured with 2GB memory, and I reserved 8GB memory > for > >> dom0. > >> For safety reason, only domain U''s memory is allowed to balloon. > >> > >> Inside domain U, I used xenballooned provide by xensource, > >> periodically write /proc/meminfo into xenstore in dom > >> 0(/local/domain/did/memory/meminfo). > >> And in domain 0, I wrote a python script to read the meminfo, like > >> xen provided strategy, use Committed_AS to calculate the domain U > balloon > >> target. > >> The time interval is ! 1 seconds. > >> > >> Inside each VM, I setup a apache server for test. Well, I''d > >> like to say the result is not so good. > >> It appears that too much read/write on xenstore, when I give some > of > >> the stress(by using ab) to guest domains, > >> the CPU usage of xenstore is up to 100%. Thus the monitor running > in > >> dom0 also response quite slowly. > >> Also, in ab test, the Committed_AS grows very fast, reach to maxmem > >> in short time, but in fact the only a small amount > >> of memory guest really need, so I guess there should be some more > to > >> be taken into consideration for ballooning. > >> > >> For xenstore issue, I first plan to wrote a C program inside domain > >> U to replace xenballoond to see whether the situation > >> will be refined. If not, how about set up event channel directly > for > >> domU and dom0, would it be faster? > >> > >> Regards balloon strategy, I would do like this, when there ! are > >> enough memory , just fulfill the guest balloon request, and when > shortage > >> of memory, distribute memory evenly on the guests those request > >> inflation. > >> > >> Does anyone have better suggestion, thanks in advance. > >> > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
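A recurring suggestion above is to push just the single Committed_AS number (rather than all of /proc/meminfo) from the guest to dom0 and turn it into a balloon target. Below is a minimal user-space C sketch of that calculation, purely as an illustration: the 64 MiB headroom, the 256 MiB floor and the clamp to MemTotal are assumptions of mine, not values taken from xenballoond or from any script in this thread.

/* target_from_meminfo.c - toy Committed_AS-based balloon target.
 * The headroom and floor values are arbitrary assumptions. */
#include <stdio.h>
#include <string.h>

static long read_meminfo_kib(const char *key)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[128];
    long val = -1;

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        size_t n = strlen(key);
        if (!strncmp(line, key, n) && line[n] == ':') {
            sscanf(line + n + 1, "%ld", &val);
            break;
        }
    }
    fclose(f);
    return val;
}

int main(void)
{
    const long headroom_kib = 64 * 1024;   /* assumed safety margin */
    const long floor_kib    = 256 * 1024;  /* assumed minimum size  */
    long committed = read_meminfo_kib("Committed_AS");
    long total     = read_meminfo_kib("MemTotal");
    long target;

    if (committed < 0 || total < 0) {
        fprintf(stderr, "could not parse /proc/meminfo\n");
        return 1;
    }
    target = committed + headroom_kib;
    if (target < floor_kib)
        target = floor_kib;
    if (target > total)
        target = total;   /* never ask for more than the guest reports */

    /* This single number, in KiB, is the kind of value that would be
     * sent to dom0 once per interval instead of the whole meminfo. */
    printf("Committed_AS=%ld kB MemTotal=%ld kB -> target=%ld kB\n",
           committed, total, target);
    return 0;
}

Compile and run it inside a guest with cc -o target_from_meminfo target_from_meminfo.c; the printed target is the single value that would be pushed to dom0 each interval.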
I also think the current PoD implementation is a little strange; it is different from what I had imagined. In my mind, as tinnycloud mentioned, the PoD cache should be a pool as large as the idle memory in the VMM, shared by all guests. If a guest's usage stays smaller than the predefined PoD cache size, the unused part could be lent to others. Conversely, if a guest's balloon driver is late to start, the VMM should populate more memory to satisfy it (supposing the VMM has enough memory). With the current PoD, the balloon driver has to start as soon as possible, otherwise the guest will be crashed once the PoD cache is exhausted (supposing the emergency sweep does not help). That is dangerous, although in most cases it works well.

George, I wonder why you implemented it this way? It looks better to use a resilient PoD cache instead of a fixed one. Your wonderful work is appreciated; I just want to understand your reasoning.

On Nov 29, 2010 at 7:19 PM, George Dunlap <George.Dunlap@eu.citrix.com> wrote:
> On 29/11/10 10:55, tinnycloud wrote:
> > So that is, if we run out of PoD cache before balloon works, Xen will
> > crash domain (goto out_of_memory),
>
> That's right; PoD is only meant to allow a guest to run from boot until
> the balloon driver can load. It's to allow a guest to "boot ballooned."
>
> > and at this situation, domain U swap (dom U can't use swap memory) is not
> > available, right?
>
> I don't believe swap and PoD are integrated at the moment, no.
>
> > And when balloon actually works, the pod cache will finally decrease to
> > 0, and no longer be used any more, right?
>
> Conceptually, yes. What actually happens is that ballooning will reduce
> it so that pod_entries==cache_size. Entries will stay PoD until the
> guest touches them. It's likely that eventually the guest will touch
> all the pages, at which point the PoD cache will be 0.
>
> > could we use this method to implement a tmem-like memory overcommit?
>
> PoD does require guest knowledge -- it requires the balloon driver to be
> loaded soon after boot so that the guest will limit its memory usage.
> It also doesn't allow overcommit. Memory in the PoD cache is already
> allocated to the VM, and can't be used for something else.
>
> You can't do overcommit without either:
> * The guest knowing that it might not get the memory back, and being OK
> with that (tmem), or
> * Swapping, which doesn't require PoD at all.
>
> If you're thinking about scanning for zero pages and automatically
> reclaiming them, for instance, you have to be able to deal with a
> situation where the guest decides to use a page you've reclaimed but
> you've already given your last free page to someone else, and there are
> no more zero pages anywhere on the system. That would mean either just
> pausing the VM indefinitely, or choosing another guest page to swap out.
>
> -George
>
> > *From:* Chu Rui [mailto:ruichu@gmail.com]
> > *TO:* tinnycloud
> > *CC:* xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com;
> > dan.magenheimer@oracle.com
> > *Subject:* Re: [Xen-devel] Xen balloon driver discuss
> >
> > I am also interested in tinnycloud's problem.
> >
> > It looks like the PoD cache has been used up like this:
> >
> > if ( p2md->pod.count == 0 )
> > goto out_of_memory;
> >
> > George, would you please take a look at this problem and, if possible,
> > tell us a little more about what the PoD cache means? Is it a memory pool
> > for PoD allocation?
> > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
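To keep the two PoD counters in the exchange above straight, here is a toy model in plain C, not the real p2m code: pod.entry_count is the number of p2m entries still marked populate-on-demand, and pod.count is the number of pages sitting in the domain's PoD cache. The crash condition mirrors the if ( p2md->pod.count == 0 ) goto out_of_memory; check quoted above; the starting figures are the maxmem=2048M / memory=512M numbers that appear later in the thread, and everything else is simplified guesswork.

/* pod_toy.c - toy model of PoD accounting (not Xen code). */
#include <stdio.h>

struct pod {
    long entry_count;   /* p2m entries still marked populate-on-demand */
    long count;         /* pages held in this domain's PoD cache       */
};

/* Guest touches a PoD page: back it from the cache, or fail. */
static int demand_populate(struct pod *p)
{
    if (p->count == 0)
        return -1;       /* out_of_memory -> the domain is crashed     */
    p->count--;          /* one cache page now backs a real gfn        */
    p->entry_count--;    /* one fewer outstanding PoD entry            */
    return 0;
}

/* Balloon driver hands a page back: one PoD "liability" disappears. */
static void balloon_out_one_page(struct pod *p)
{
    p->entry_count--;
    if (p->entry_count < p->count)
        p->count = p->entry_count;   /* toy mirror of out_entry_check  */
}

int main(void)
{
    /* maxmem=2048M, memory=512M style start; figures from this thread. */
    struct pod a = { 523776, 130560 };
    struct pod b = a;
    long touched = 0;

    /* Scenario A: balloon driver never loads; guest keeps touching pages. */
    while (demand_populate(&a) == 0)
        touched++;
    printf("A: touched %ld pages (%ld MiB), then out_of_memory -> crash\n",
           touched, touched * 4 / 1024);

    /* Scenario B: balloon driver runs first and removes entries until the
     * outstanding entries match the cache ("boot ballooned"). */
    while (b.entry_count > b.count)
        balloon_out_one_page(&b);
    printf("B: entries=%ld cache=%ld -> every remaining entry is backed\n",
           b.entry_count, b.count);
    return 0;
}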
On 29/11/10 15:41, hotmaim wrote:> Appreciate for the details, I get more understandings but still have some confusions. > 1.Is it necessery to balloon to max-target at dom U right startup?(for, xm cr xxx.hvm maxmem=2048 memory=1024; max-target is 2048-1024) Say is it safe to balloon to let the guest has only 512M memory in total? or 1536 M(in this situation, i guess the pod entry will > also reduce and extra 512M memory will be added to Pod cache,right)?I''m sorry, I can''t figure out what you mean. The tools will set "target" to the value of "memory". The balloon driver is supposed to see how many total pages it has (2048M) and "inflate" the balloon until the number of pages is at the target (1024M in your example above).> 2. Suppose we have a xen wide PoD memorym pool, that is accessable for every guest domains, when the guest needs a page, it get the page from the pool, and we can still use > balloon strategy to have the guest free pages back to the pool. So if the amount of all domain > memory inuse is less than host physial memory, it is safe. And when no memory available from > host, domain need new memory may pause for for waiting for others to free, or use swap memory, is it possible?We already have a pool of free memory accessible to all the guest domains: It''s called the Xen free page list. :-) One of the explicit purposes of PoD is to set aside a fixed amount of memory for a guest, so that no other domains / processes can claim it. It''s guaranteed that memory, and as long as it has a working balloon driver, shouldn''t have any issues using it properly. Sharing it with other VMs would undermine this, and make it pretty much the same as the Xen free page list. I''m not an expert in tmem, but as I understand it, the whole point of tmem is to use knowledge of the guest OS to be able to throw away certain data. You can''t get guest-specific knowledge without modifying the guest OS to have it tell Xen somehow. It sounds like what you''re advocating is *allocate*-on-demand (as opposed to PoD, which allocates all the memory at the beginning but *populates* the p2m table on demand): tell all the guests they have more memory than is available total, assuming that only some of them are going to try to use all of it; and allocating the memory as it''s used. This works well for processes, but operating systems are typically built with the assumption that memory not used is memory completely wasted. They therefore keep disk cache pages and unused memory pages around "just in case", and I predict that any guest which has an active workload will eventually use all the memory it''s been told it has, even if it''s only actively using a small portion of it. At that point, Xen will be forced to try to guess which page is the least important to have around and swap it out. Alternately, the tools could slowly balloon down all of the guests as the memory starts to run out; but then you have a situation where the guest that gets the most memory is the one that touched it first, not the one which actually needs it. At any rate, PoD is meant to solve exactly one problem: booting "ballooned". At the moment it doesn''t lend itself to other solutions. -George> 2010-11-29,19:19,George Dunlap<George.Dunlap@eu.citrix.com> : > >> On 29/11/10 10:55, tinnycloud wrote: >>> So that is, if we run out of PoD cache before balloon works, Xen will >>> crash domain(goto out_of_memory), >> >> That''s right; PoD is only meant to allow a guest to run from boot until >> the balloon driver can load. 
It''s to allow a guest to "boot ballooned." >> >>> and at this situation, domain U swap(dom U can’t use swap memory) is not >>> available , right? >> >> I don''t believe swap and PoD are integrated at the moment, no. >> >>> And when balloon actually works, the pod cached will finally decrease to >>> 0, and no longer be used any more, right? >> >> Conceptually, yes. What actually happens is that ballooning will reduce >> it so that pod_entries==cache_size. Entries will stay PoD until the >> guest touches them. It''s likely that eventually the guest will touch >> all the pages, at which point the PoD cache will be 0. >> >>> could we use this method to implement a tmem like memory overcommit? >> >> PoD does require guest knowledge -- it requires the balloon driver to be >> loaded soon after boot so the so the guest will limit its memory usage. >> It also doesn''t allow overcommit. Memory in the PoD cache is already >> allocated to the VM, and can''t be used for something else. >> >> You can''t to overcommit without either: >> * The guest knowing that it might not get the memory back, and being OK >> with that (tmem), or >> * Swapping, which doesn''t require PoD at all. >> >> If you''re thinking about scanning for zero pages and automatically >> reclaiming them, for instance, you have to be able to deal with a >> situation where the guest decides to use a page you''ve reclaimed but >> you''ve already given your last free page to someone else, and there are >> no more zero pages anywhere on the system. That would mean either just >> pausing the VM indefinitely, or choosing another guest page to swap out. >> >> -George >> >>> >>> *From:* Chu Rui [mailto:ruichu@gmail.com] >>> *TO:* tinnycloud >>> *CC:* xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; >>> dan.magenheimer@oracle.com >>> *Subject:* Re: [Xen-devel] Xen balloon driver discuss >>> >>> I am also interested with tinnycloud''s problem. >>> >>> It looks that the pod cache has been used up like this: >>> >>> if ( p2md->pod.count == 0 ) >>> goto out_of_memory; >>> >>> George, would you please take a look on this problem, and, if possbile, >>> tell a little more about what does PoD cache mean? Is it a memory pool >>> for PoD allocation? >>> >> >>_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
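George's description of "booting ballooned" boils down to a simple control loop in the guest: compare the number of pages the driver believes it owns with the target the toolstack wrote (which is in KiB in xenstore) and release the difference. The sketch below is a user-space model of that loop, not the actual Linux driver; current_pages and the target name are borrowed from the driver, while the helper functions are stand-ins for the real increase/decrease-reservation hypercalls.

/* balloon_loop_toy.c - toy model of the guest balloon control loop. */
#include <stdio.h>

#define PAGE_KIB 4L

/* What the driver believes the guest owns, in pages. */
static long current_pages = 524288;     /* e.g. maxmem=2048M */

/* Stand-in for handing one page back to the hypervisor
 * (the real driver issues a decrease-reservation hypercall). */
static void release_one_page(void)
{
    current_pages--;
}

/* Stand-in for reclaiming one page (increase reservation). */
static void reclaim_one_page(void)
{
    current_pages++;
}

/* target_kib is what the toolstack writes to memory/target in xenstore. */
static void balloon_to_target(long target_kib)
{
    long target_pages = target_kib / PAGE_KIB;

    while (current_pages > target_pages)
        release_one_page();          /* inflate the balloon */
    while (current_pages < target_pages)
        reclaim_one_page();          /* deflate the balloon */
}

int main(void)
{
    balloon_to_target(524288);       /* 512 MiB, as in this thread */
    printf("current_pages=%ld (%ld MiB)\n",
           current_pages, current_pages * PAGE_KIB / 1024);
    return 0;
}

If current_pages starts out lower than the number of outstanding PoD entries, as diagnosed further down in this thread, this loop stops early and the PoD entries never converge on the cache size.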
On 30/11/10 03:51, Chu Rui wrote:
> George, I wonder why you implemented it this way? It looks better to
> use a resilient PoD cache instead of a fixed one. Your wonderful work
> is appreciated; I just want to understand your reasoning.

The main reason it was implemented this way was to make things predictable for the toolstack. The XenServer control stack recently implemented an automatic dynamic memory control functionality that would allow you to simply set some ranges for memory, and it would automatically change the ballooning for you. To do that effectively, it needs to know how much memory is in use by every guest, and be able to guarantee that a guest can get memory if it needs it. Furthermore, XenServer has a High Availability (HA) option, which will allow you to guarantee that if a given number of hosts go down, certain VMs can be guaranteed to restart on other hosts. For both of these options, having strong control of the memory is a necessity.

As I said in another e-mail, the Xen free page list is an already-existing, system-wide pool from which VMs can (in theory) allocate RAM. We could make it such that when a VM runs too low on PoD memory, it pauses and notifies the toolstack somehow. A toolstack which wanted to be more flexible with the balloon drivers could then increase its PoD cache size from the free pool if available; if not available, it could even create more free memory by ballooning down other VMs. Xen doesn't need to be involved.

If you want to implement that functionality, I'm sure your patches would be welcome. :-)

-George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
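The "pause and notify the toolstack" extension George outlines would live in dom0 policy code rather than in Xen. A hypothetical sketch of that policy follows; none of these functions (handle_pod_low, grow_pod_cache, make_room_by_ballooning_others) exist in the real toolstack, they only name the steps he describes, and the numbers are made up.

/* pod_low_policy.c - hypothetical dom0 policy for a "PoD low" event. */
#include <stdio.h>

/* Toy host state, in MiB for readability. */
static long host_free_mib = 128;
static long other_guests_surplus_mib = 512;  /* what ballooning could free */

/* Grow the paused domain's PoD cache by 'mib' MiB. */
static void grow_pod_cache(int domid, long mib)
{
    printf("dom%d: PoD cache grown by %ld MiB, domain unpaused\n", domid, mib);
}

/* Balloon other guests down to free up to 'mib' MiB; returns what was freed. */
static long make_room_by_ballooning_others(long mib)
{
    long got = mib < other_guests_surplus_mib ? mib : other_guests_surplus_mib;
    other_guests_surplus_mib -= got;
    printf("ballooned other guests down by %ld MiB\n", got);
    return got;
}

/* Called when the hypervisor reports dom 'domid' is low on PoD cache. */
static void handle_pod_low(int domid, long needed_mib)
{
    long available = host_free_mib;

    if (available < needed_mib)
        available += make_room_by_ballooning_others(needed_mib - available);

    if (available >= needed_mib) {
        host_free_mib = available - needed_mib;
        grow_pod_cache(domid, needed_mib);
    } else {
        /* Last resort: keep the domain paused, or start host swapping. */
        printf("dom%d stays paused: only %ld of %ld MiB found\n",
               domid, available, needed_mib);
    }
}

int main(void)
{
    handle_pod_low(4, 256);
    return 0;
}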
Thank you for your kindly help. Well, on last mail, you mentioned that balloon will make pod_entries equal to cache_size as soon as it start to work when guest starts up.>From my understanding, if we start guest such as:xm cr xxx.hvm maxmem=2048 memory=512 then, we should set the /local/domain/did/memory/target to 522240 ( ( 512M-2M) * 1204, 2M for VGA in your another patch? ) to tell the balloon driver in guest to inflate, right? And when balloon driver balloon to let guest memory has this target, I think pod_entires will equal to cached_size, right? I did some experiment on this, the result shows different. Step 1. xm cr xxx.hvm maxmem=2048 memory=512 at the very beginning, I printed out domain tot_pages, 1320288, pod.entry_count 523776, that is 2046M, pod.count 130560, that is 512M (XEN) tot_pages 132088 pod_entries 523776 pod_count 130560 currently, /local/domain/did/memory/target in default will be written to 524288 after guest start up, balloon driver will balloon, when finish, I can see pod.entry_count reduce to 23552, pod,count 14063 (XEN) DomPage list too long to display (XEN) Tot pages 132088 PoD entries=23552 cachesize=14063 Step 2. In my understanding, /local/domain/did/memory/target should be at least 510 * 1024 , and then pod_entries will equal to cache_size I use 500, So I did: xm mem-set domain_id 500 then I can see pod.entry_count reduce to 22338, pod,count 15921, still not equal (XEN) Memory pages belonging to domain 4: (XEN) DomPage list too long to display (XEN) Tot pages 132088 PoD entries=22338 cachesize=15921 Step 3. Only after I did : xm mem-set domain_id 470 Pod_entries is equal to pod.count (XEN) DomPage list too long to display (XEN) Tot pages 130825 PoD entries=14677 cachesize=14677 Later from the code, I learnt that those two values are forced to be equal, in 700 out_entry_check: 701 /* If we''ve reduced our "liabilities" beyond our "assets", free some */ 702 if ( p2md->pod.entry_count < p2md->pod.count ) 703 { 704 p2m_pod_set_cache_target(d, p2md->pod.entry_count); 705 } 706 So in conclude, it looks like something goes wrong, the PoD entries should equal to cachesize(pod.count) as soon as the balloon driver inflate to max - target, right? Many thanks. ---------------------------------------------------------------------------- ------------------ From: George Dunlap [mailto:George.Dunlap@eu.citrix.com] to: hotmaim cc: Chu Rui; xen-devel@lists.xensource.com; Dan Magenheimer Sub: Re: [Xen-devel] Xen balloon driver discuss On 29/11/10 15:41, hotmaim wrote:> Appreciate for the details, I get more understandings but stillhave some confusions.> 1.Is it necessery to balloon to max-target at dom U rightstartup?(for, xm cr xxx.hvm maxmem=2048 memory=1024; max-target is 2048-1024) Say is it safe to balloon to let the guest has only 512M memory in total? or 1536 M(in this situation, i guess the pod entry will> also reduce and extra 512M memory will be added to Pod cache,right)?I''m sorry, I can''t figure out what you mean. The tools will set "target" to the value of "memory". The balloon driver is supposed to see how many total pages it has (2048M) and "inflate" the balloon until the number of pages is at the target (1024M in your example above).> 2. Suppose we have a xen wide PoD memorym pool, that is accessable forevery guest domains, when the guest needs a page, it get the page from the pool, and we can still use> balloon strategy to have the guest free pages back to the pool. So if theamount of all domain> memory inuse is less than host physial memory, it is safe. 
And when nomemory available from> host, domain need new memory may pause for for waiting for others to free,or use swap memory, is it possible? We already have a pool of free memory accessible to all the guest domains: It''s called the Xen free page list. :-) One of the explicit purposes of PoD is to set aside a fixed amount of memory for a guest, so that no other domains / processes can claim it. It''s guaranteed that memory, and as long as it has a working balloon driver, shouldn''t have any issues using it properly. Sharing it with other VMs would undermine this, and make it pretty much the same as the Xen free page list. I''m not an expert in tmem, but as I understand it, the whole point of tmem is to use knowledge of the guest OS to be able to throw away certain data. You can''t get guest-specific knowledge without modifying the guest OS to have it tell Xen somehow. It sounds like what you''re advocating is *allocate*-on-demand (as opposed to PoD, which allocates all the memory at the beginning but *populates* the p2m table on demand): tell all the guests they have more memory than is available total, assuming that only some of them are going to try to use all of it; and allocating the memory as it''s used. This works well for processes, but operating systems are typically built with the assumption that memory not used is memory completely wasted. They therefore keep disk cache pages and unused memory pages around "just in case", and I predict that any guest which has an active workload will eventually use all the memory it''s been told it has, even if it''s only actively using a small portion of it. At that point, Xen will be forced to try to guess which page is the least important to have around and swap it out. Alternately, the tools could slowly balloon down all of the guests as the memory starts to run out; but then you have a situation where the guest that gets the most memory is the one that touched it first, not the one which actually needs it. At any rate, PoD is meant to solve exactly one problem: booting "ballooned". At the moment it doesn''t lend itself to other solutions. -George> 2010-11-29,19:19,George Dunlap<George.Dunlap@eu.citrix.com> : > >> On 29/11/10 10:55, tinnycloud wrote: >>> So that is, if we run out of PoD cache before balloon works, Xen will >>> crash domain(goto out_of_memory), >> >> That''s right; PoD is only meant to allow a guest to run from boot until >> the balloon driver can load. It''s to allow a guest to "boot ballooned." >> >>> and at this situation, domain U swap(dom U can’t use swap memory) isnot>>> available , right? >> >> I don''t believe swap and PoD are integrated at the moment, no. >> >>> And when balloon actually works, the pod cached will finally decrease to >>> 0, and no longer be used any more, right? >> >> Conceptually, yes. What actually happens is that ballooning will reduce >> it so that pod_entries==cache_size. Entries will stay PoD until the >> guest touches them. It''s likely that eventually the guest will touch >> all the pages, at which point the PoD cache will be 0. >> >>> could we use this method to implement a tmem like memory overcommit? >> >> PoD does require guest knowledge -- it requires the balloon driver to be >> loaded soon after boot so the so the guest will limit its memory usage. >> It also doesn''t allow overcommit. Memory in the PoD cache is already >> allocated to the VM, and can''t be used for something else. 
>> >> You can''t to overcommit without either: >> * The guest knowing that it might not get the memory back, and being OK >> with that (tmem), or >> * Swapping, which doesn''t require PoD at all. >> >> If you''re thinking about scanning for zero pages and automatically >> reclaiming them, for instance, you have to be able to deal with a >> situation where the guest decides to use a page you''ve reclaimed but >> you''ve already given your last free page to someone else, and there are >> no more zero pages anywhere on the system. That would mean either just >> pausing the VM indefinitely, or choosing another guest page to swap out. >> >> -George >> >>> >>> *From:* Chu Rui [mailto:ruichu@gmail.com] >>> *TO:* tinnycloud >>> *CC:* xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; >>> dan.magenheimer@oracle.com >>> *Subject:* Re: [Xen-devel] Xen balloon driver discuss >>> >>> I am also interested with tinnycloud''s problem. >>> >>> It looks that the pod cache has been used up like this: >>> >>> if ( p2md->pod.count == 0 ) >>> goto out_of_memory; >>> >>> George, would you please take a look on this problem, and, if possbile, >>> tell a little more about what does PoD cache mean? Is it a memory pool >>> for PoD allocation? >>> >> >>_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
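Much of the confusion in the experiment above is unit bookkeeping: the PoD counters are in 4 KiB pages, while memory/target in xenstore is in KiB. The helper below converts the reported figures; note that 130560 pages is 510 MiB, which is 512 MiB minus the 2 MiB VGA hole mentioned in the message, and that (512 - 2) * 1024 = 522240 KiB, matching the 522240 figure given there.

/* pod_units.c - convert the figures reported in this experiment. */
#include <stdio.h>

static void pages(const char *what, long p)
{
    printf("%-12s %8ld pages = %8ld KiB = %6.1f MiB\n",
           what, p, p * 4, p * 4 / 1024.0);
}

int main(void)
{
    pages("pod entries", 523776);   /* 2046 MiB: 2048 minus the 2 MiB hole */
    pages("pod cache",   130560);   /*  510 MiB: 512 minus the 2 MiB hole  */
    pages("tot_pages",   132088);   /* about 516 MiB                       */

    printf("%-12s %8ld KiB  = %6.1f MiB (memory/target is in KiB)\n",
           "target", 524288L, 524288 / 1024.0);
    printf("%-12s %8ld KiB  = %6.1f MiB ((512 - 2) * 1024)\n",
           "510M target", 522240L, 522240 / 1024.0);
    return 0;
}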
> One of the explicit purposes of PoD is to set aside a fixed amount of
> memory for a guest, so that no other domains / processes can claim it.
> It's guaranteed that memory, and as long as it has a working balloon
> driver, shouldn't have any issues using it properly. Sharing it with
> other VMs would undermine this, and make it pretty much the same as the
> Xen free page list.
> :
> It sounds like what you're advocating is *allocate*-on-demand (as
> opposed to PoD, which allocates all the memory at the beginning but
> *populates* the p2m table on demand): tell all the guests they have more
> memory than is available total, assuming that only some of them are
> going to try to use all of it; and allocating the memory as it's used.
> This works well for processes, but operating systems are typically built
> with the assumption that memory not used is memory completely wasted.
> They therefore keep disk cache pages and unused memory pages around
> "just in case", and I predict that any guest which has an active
> workload will eventually use all the memory it's been told it has, even
> if it's only actively using a small portion of it. At that point, Xen
> will be forced to try to guess which page is the least important to have
> around and swap it out.

Maybe another key point about PoD is worth mentioning here (and probably very obvious to George and possibly mentioned somewhere else in this thread and I just missed it): The guest will *crash* if it attempts to write to a PoD page and Xen has no real physical page to back it. Or alternately, the guest must be stopped (perhaps for a long time) until Xen does have a real physical page to back it. Real Windows guest users won't like that, so the memory should be pre-allocated and remain reserved for that guest. Or the toolset/dom0 must implement host-swapping, which has all sorts of nasty unpredictable performance issues.

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Hi George: I think I know the problem, it is due to the balloon driver I used it out of date. My Guest kernel is from(ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/, kernel-2.6.18-164.el5.src.rpm, so as the balloon driver ) The problem is at the very beginning, Pod Entry total is different from the current_pages pages in balloon. (at the beginning, both Pod Entry and current_pages shall point to the same value, that is total memory allocated for guest, But in fact, Pod Entry is 523776 < current_pages is 514879 So from Pod aspect, the balloon need to inflate to 523776 - target, but the balloon driver only inflate 514879 -target This is the problem. ) So later I will try to get the balloon.c from xenlinux to build a new driver, to see if solve the problem. Thanks. ----- ----- From: tinnycloud [mailto:tinnycloud@hotmail.com] Sent: 2010.11.30 21:59 To: ''George Dunlap'' cc: ''Chu Rui''; ''xen-devel@lists.xensource.com''; ''Dan Magenheimer'' Subject: re: [Xen-devel] Xen balloon driver discuss Thank you for your kindly help. Well, on last mail, you mentioned that balloon will make pod_entries equal to cache_size as soon as it start to work when guest starts up.>From my understanding, if we start guest such as:xm cr xxx.hvm maxmem=2048 memory=512 then, we should set the /local/domain/did/memory/target to 522240 ( ( 512M-2M) * 1204, 2M for VGA in your another patch? ) to tell the balloon driver in guest to inflate, right? And when balloon driver balloon to let guest memory has this target, I think pod_entires will equal to cached_size, right? I did some experiment on this, the result shows different. Step 1. xm cr xxx.hvm maxmem=2048 memory=512 at the very beginning, I printed out domain tot_pages, 1320288, pod.entry_count 523776, that is 2046M, pod.count 130560, that is 512M (XEN) tot_pages 132088 pod_entries 523776 pod_count 130560 currently, /local/domain/did/memory/target in default will be written to 524288 after guest start up, balloon driver will balloon, when finish, I can see pod.entry_count reduce to 23552, pod,count 14063 (XEN) DomPage list too long to display (XEN) Tot pages 132088 PoD entries=23552 cachesize=14063 Step 2. In my understanding, /local/domain/did/memory/target should be at least 510 * 1024 , and then pod_entries will equal to cache_size I use 500, So I did: xm mem-set domain_id 500 then I can see pod.entry_count reduce to 22338, pod,count 15921, still not equal (XEN) Memory pages belonging to domain 4: (XEN) DomPage list too long to display (XEN) Tot pages 132088 PoD entries=22338 cachesize=15921 Step 3. Only after I did : xm mem-set domain_id 470 Pod_entries is equal to pod.count (XEN) DomPage list too long to display (XEN) Tot pages 130825 PoD entries=14677 cachesize=14677 Later from the code, I learnt that those two values are forced to be equal, in 700 out_entry_check: 701 /* If we''ve reduced our "liabilities" beyond our "assets", free some */ 702 if ( p2md->pod.entry_count < p2md->pod.count ) 703 { 704 p2m_pod_set_cache_target(d, p2md->pod.entry_count); 705 } 706 So in conclude, it looks like something goes wrong, the PoD entries should equal to cachesize(pod.count) as soon as the balloon driver inflate to max - target, right? Many thanks. 
---------------------------------------------------------------------------- ------------------ From: George Dunlap [mailto:George.Dunlap@eu.citrix.com] to: hotmaim cc: Chu Rui; xen-devel@lists.xensource.com; Dan Magenheimer Sub: Re: [Xen-devel] Xen balloon driver discuss On 29/11/10 15:41, hotmaim wrote:> Appreciate for the details, I get more understandings but stillhave some confusions.> 1.Is it necessery to balloon to max-target at dom U rightstartup?(for, xm cr xxx.hvm maxmem=2048 memory=1024; max-target is 2048-1024) Say is it safe to balloon to let the guest has only 512M memory in total? or 1536 M(in this situation, i guess the pod entry will> also reduce and extra 512M memory will be added to Pod cache,right)?I''m sorry, I can''t figure out what you mean. The tools will set "target" to the value of "memory". The balloon driver is supposed to see how many total pages it has (2048M) and "inflate" the balloon until the number of pages is at the target (1024M in your example above).> 2. Suppose we have a xen wide PoD memorym pool, that is accessable forevery guest domains, when the guest needs a page, it get the page from the pool, and we can still use> balloon strategy to have the guest free pages back to the pool. So if theamount of all domain> memory inuse is less than host physial memory, it is safe. And when nomemory available from> host, domain need new memory may pause for for waiting for others to free,or use swap memory, is it possible? We already have a pool of free memory accessible to all the guest domains: It''s called the Xen free page list. :-) One of the explicit purposes of PoD is to set aside a fixed amount of memory for a guest, so that no other domains / processes can claim it. It''s guaranteed that memory, and as long as it has a working balloon driver, shouldn''t have any issues using it properly. Sharing it with other VMs would undermine this, and make it pretty much the same as the Xen free page list. I''m not an expert in tmem, but as I understand it, the whole point of tmem is to use knowledge of the guest OS to be able to throw away certain data. You can''t get guest-specific knowledge without modifying the guest OS to have it tell Xen somehow. It sounds like what you''re advocating is *allocate*-on-demand (as opposed to PoD, which allocates all the memory at the beginning but *populates* the p2m table on demand): tell all the guests they have more memory than is available total, assuming that only some of them are going to try to use all of it; and allocating the memory as it''s used. This works well for processes, but operating systems are typically built with the assumption that memory not used is memory completely wasted. They therefore keep disk cache pages and unused memory pages around "just in case", and I predict that any guest which has an active workload will eventually use all the memory it''s been told it has, even if it''s only actively using a small portion of it. At that point, Xen will be forced to try to guess which page is the least important to have around and swap it out. Alternately, the tools could slowly balloon down all of the guests as the memory starts to run out; but then you have a situation where the guest that gets the most memory is the one that touched it first, not the one which actually needs it. At any rate, PoD is meant to solve exactly one problem: booting "ballooned". At the moment it doesn''t lend itself to other solutions. 
-George> 2010-11-29,19:19,George Dunlap<George.Dunlap@eu.citrix.com> : > >> On 29/11/10 10:55, tinnycloud wrote: >>> So that is, if we run out of PoD cache before balloon works, Xen will >>> crash domain(goto out_of_memory), >> >> That''s right; PoD is only meant to allow a guest to run from boot until >> the balloon driver can load. It''s to allow a guest to "boot ballooned." >> >>> and at this situation, domain U swap(dom U can’t use swap memory) isnot>>> available , right? >> >> I don''t believe swap and PoD are integrated at the moment, no. >> >>> And when balloon actually works, the pod cached will finally decrease to >>> 0, and no longer be used any more, right? >> >> Conceptually, yes. What actually happens is that ballooning will reduce >> it so that pod_entries==cache_size. Entries will stay PoD until the >> guest touches them. It''s likely that eventually the guest will touch >> all the pages, at which point the PoD cache will be 0. >> >>> could we use this method to implement a tmem like memory overcommit? >> >> PoD does require guest knowledge -- it requires the balloon driver to be >> loaded soon after boot so the so the guest will limit its memory usage. >> It also doesn''t allow overcommit. Memory in the PoD cache is already >> allocated to the VM, and can''t be used for something else. >> >> You can''t to overcommit without either: >> * The guest knowing that it might not get the memory back, and being OK >> with that (tmem), or >> * Swapping, which doesn''t require PoD at all. >> >> If you''re thinking about scanning for zero pages and automatically >> reclaiming them, for instance, you have to be able to deal with a >> situation where the guest decides to use a page you''ve reclaimed but >> you''ve already given your last free page to someone else, and there are >> no more zero pages anywhere on the system. That would mean either just >> pausing the VM indefinitely, or choosing another guest page to swap out. >> >> -George >> >>> >>> *From:* Chu Rui [mailto:ruichu@gmail.com] >>> *TO:* tinnycloud >>> *CC:* xen-devel@lists.xensource.com; George.Dunlap@eu.citrix.com; >>> dan.magenheimer@oracle.com >>> *Subject:* Re: [Xen-devel] Xen balloon driver discuss >>> >>> I am also interested with tinnycloud''s problem. >>> >>> It looks that the pod cache has been used up like this: >>> >>> if ( p2md->pod.count == 0 ) >>> goto out_of_memory; >>> >>> George, would you please take a look on this problem, and, if possbile, >>> tell a little more about what does PoD cache mean? Is it a memory pool >>> for PoD allocation? >>> >> >>_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
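The mismatch diagnosed at the top of the previous message can be quantified: the p2m starts with 523776 PoD entries while the RH5 driver starts from current_pages = 514879, so for any given target the driver balloons out 8897 pages (about 35 MiB) too few. Working that through with the figures from the thread (sketch below) says the entries can only meet the cache once the target drops to roughly 475 MiB, which lines up with the observation that a 500 MiB target was still not enough while 470 MiB was.

/* pod_gap.c - why the RH5 driver never brings entries down to the cache. */
#include <stdio.h>

int main(void)
{
    long pod_entries   = 523776;  /* initial PoD entries (pages)         */
    long pod_cache     = 130560;  /* initial PoD cache   (pages)         */
    long current_pages = 514879;  /* what the RH5 driver thinks it owns  */
    long gap           = pod_entries - current_pages;

    /* Ballooning must remove (entries - cache) pages for the two counters
     * to meet; the driver can remove at most (current_pages - target). */
    long max_target_pages = current_pages - (pod_entries - pod_cache);

    printf("driver starts %ld pages (%.1f MiB) short of the PoD view\n",
           gap, gap * 4 / 1024.0);
    printf("entries == cache only once target <= %ld pages = %.1f MiB\n",
           max_target_pages, max_target_pages * 4 / 1024.0);
    return 0;
}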
To build a new balloon driver is much easier than I thought, so I quickly get more results Create domain like: xm cr xxx.hvm maxmem=2048 memory=512 In new the cur_page is 525312(larger than Pod Entry, so after balloon pod_entry == pod_cached well be satisfied), that is 2052M, later I found that this number comes from domain->max_pages. In /local/domain/did/memory/target is 524288, that is 512M Inside guest, from /proc/meminfo, the total memory is 482236KB, that is 470.93M Strange is balloon driver holds memory = 2052 - 512 = 1540M And the guest actually has 470.93M 1540 + 470.93 = 2010.93 < 2048 So I wonder where is the memory goes (2048-2010.93)? Thanks. ----- ----- date: 2010,12,1 13:07 To: ''tinnycloud''; ''George Dunlap'' CC: ''Chu Rui''; xen-devel@lists.xensource.com; ''Dan Magenheimer'' Subject: re: [Xen-devel] Xen balloon driver discuss Hi George: I think I know the problem, it is due to the balloon driver I used it out of date. My Guest kernel is from(ftp://ftp.redhat.com/pub/redhat/linux/enterprise/5Server/en/os/SRPMS/, kernel-2.6.18-164.el5.src.rpm, so as the balloon driver ) The problem is at the very beginning, Pod Entry total is different from the current_pages pages in balloon. (at the beginning, both Pod Entry and current_pages shall point to the same value, that is total memory allocated for guest, But in fact, Pod Entry is 523776 < current_pages is 514879 So from Pod aspect, the balloon need to inflate to 523776 - target, but the balloon driver only inflate 514879 -target This is the problem. ) So later I will try to get the balloon.c from xenlinux to build a new driver, to see if solve the problem. Thanks. ----- ----- From: tinnycloud [mailto:tinnycloud@hotmail.com] Sent: 2010.11.30 21:59 To: ''George Dunlap'' cc: ''Chu Rui''; ''xen-devel@lists.xensource.com''; ''Dan Magenheimer'' Subject: re: [Xen-devel] Xen balloon driver discuss Thank you for your kindly help. Well, on last mail, you mentioned that balloon will make pod_entries equal to cache_size as soon as it start to work when guest starts up.>From my understanding, if we start guest such as:xm cr xxx.hvm maxmem=2048 memory=512 then, we should set the /local/domain/did/memory/target to 522240 ( ( 512M-2M) * 1204, 2M for VGA in your another patch? ) to tell the balloon driver in guest to inflate, right? And when balloon driver balloon to let guest memory has this target, I think pod_entires will equal to cached_size, right? I did some experiment on this, the result shows different. Step 1. xm cr xxx.hvm maxmem=2048 memory=512 at the very beginning, I printed out domain tot_pages, 1320288, pod.entry_count 523776, that is 2046M, pod.count 130560, that is 512M (XEN) tot_pages 132088 pod_entries 523776 pod_count 130560 currently, /local/domain/did/memory/target in default will be written to 524288 after guest start up, balloon driver will balloon, when finish, I can see pod.entry_count reduce to 23552, pod,count 14063 (XEN) DomPage list too long to display (XEN) Tot pages 132088 PoD entries=23552 cachesize=14063 Step 2. In my understanding, /local/domain/did/memory/target should be at least 510 * 1024 , and then pod_entries will equal to cache_size I use 500, So I did: xm mem-set domain_id 500 then I can see pod.entry_count reduce to 22338, pod,count 15921, still not equal (XEN) Memory pages belonging to domain 4: (XEN) DomPage list too long to display (XEN) Tot pages 132088 PoD entries=22338 cachesize=15921 Step 3. 
Only after I did : xm mem-set domain_id 470 Pod_entries is equal to pod.count (XEN) DomPage list too long to display (XEN) Tot pages 130825 PoD entries=14677 cachesize=14677 Later from the code, I learnt that those two values are forced to be equal, in 700 out_entry_check: 701 /* If we''ve reduced our "liabilities" beyond our "assets", free some */ 702 if ( p2md->pod.entry_count < p2md->pod.count ) 703 { 704 p2m_pod_set_cache_target(d, p2md->pod.entry_count); 705 } 706 So in conclude, it looks like something goes wrong, the PoD entries should equal to cachesize(pod.count) as soon as the balloon driver inflate to max - target, right? Many thanks. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
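The "missing" memory in this last experiment is most likely an accounting artifact rather than memory lost by Xen: MemTotal in /proc/meminfo only counts pages the guest kernel manages, so memory reserved at boot (kernel text and data, mem_map, bootmem allocations) and the VGA hole never show up in it. That explanation is an assumption on my part, not something confirmed in the thread, but the arithmetic below at least pins down the size of the unexplained slice.

/* missing_mem.c - tally the figures from the rebuilt-driver experiment. */
#include <stdio.h>

int main(void)
{
    double max_pages_mib = 525312 * 4 / 1024.0;  /* domain->max_pages     */
    double target_mib    = 524288 / 1024.0;      /* memory/target, in KiB */
    double memtotal_mib  = 482236 / 1024.0;      /* guest /proc/meminfo   */
    double balloon_mib   = max_pages_mib - target_mib;

    printf("max_pages  = %.2f MiB\n", max_pages_mib);
    printf("balloon    = %.2f MiB (max_pages - target)\n", balloon_mib);
    printf("MemTotal   = %.2f MiB\n", memtotal_mib);
    printf("unaccounted vs 2048 MiB      : %.2f MiB\n",
           2048.0 - balloon_mib - memtotal_mib);
    printf("unaccounted vs max_pages     : %.2f MiB\n",
           max_pages_mib - balloon_mib - memtotal_mib);
    return 0;
}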
tinnycloud
2011-Jan-12 14:41 UTC
[Xen-devel] strange CPU utilization, could related to credit schedule ?
Hi George:

We see quite strange CPU usage behavior in one of our DomUs (a 2008 HVM guest). In total, our host has 16 physical CPUs and 9 VMs.

Most of the time all the VMs work fine and the CPU usage is low and reasonable. But at every high-workload time (say 9:00-11:00 AM; there are 8 VMs, each a web server, and customers access the pages at this time), we log into the 9th VM, which is idle, and find that its CPU usage is at 85%. That doesn't make any sense, since we have no task running; the usage is also distributed evenly across most of the processes.

I wonder if it is related to the CPU scheduling algorithm in Xen. After going through
http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00414.html
I can't find an assumption that explains our situation. What do you think?

Many thanks.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
George Dunlap
2011-Jan-12 16:41 UTC
Re: [Xen-devel] strange CPU utilization, could related to credit schedule ?
Where is that 85% number coming from -- is this from within the VM, or from xentop? If it''s Windows reporting from within the VM, one hypothesis is that it has to do with processing and running with virtual time. It may simply be a side effect of the VM only getting a small percentage of the cpu. If it''s xentop, it''s probably the vm reacting somehow to getting only a small percentage of the CPU. We saw something like this with early versions of Windows 2k3, but that problem was addressed in later service packs. At any rate, to find out what Windows is doing would require a bit more investigation. :-) -George On Wed, Jan 12, 2011 at 2:41 PM, tinnycloud <tinnycloud@hotmail.com> wrote:> Hi Geogre: > > We have quite strange CPU usage behaivor in one of our DomU(2008 > HVM) > Totally, our host has 16 physical CPU, and 9 VMS. > > Most of time, the all VMs works fine, the CPU usage are low and > resonable, > But at every high workload time(say 9:00-11:00AM, there are 8 VMs, each is a > web server, > cutomers accesses the page at this time), we login into the 9th VM which > is idle, find that > its CPU usage is at 85%, doesn''t make any sense since we have no task > running, also the > usage distrbutes evenly across most of the processes. > > I wonder if it relates to CPU schedule algorithm in Xen. > After go > through http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00414.html > I can''t figure out any assumptiones explains our situation. > So what do u think of this? > > Many thanks. > > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
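George is distinguishing two different numbers here: what xentop derives from the CPU time Xen actually granted the vCPU, and what the guest derives from its own tick-based sampling. The toy calculation below only illustrates the "virtual time" hypothesis and makes no claim about what Windows 2008 really does; the 85% busy fraction per slice is assumed purely to mirror the reported figure.

/* apparent_util.c - toy: hypervisor-side vs in-guest CPU utilization. */
#include <stdio.h>

int main(void)
{
    double wall_s        = 300.0;  /* observation window                    */
    double sched_s       = 15.0;   /* pCPU time the vCPU actually got: 5%   */
    double busy_fraction = 0.85;   /* assumed share of each slice spent on
                                      backlog processed right after wakeup  */

    double xentop_pct = 100.0 * sched_s / wall_s;
    double guest_pct  = 100.0 * busy_fraction;  /* in-guest sampling only
                                                   sees time while running
                                                   (assumption)             */
    double real_work  = 100.0 * sched_s * busy_fraction / wall_s;

    printf("xentop-style utilization : %5.1f%% of a pCPU\n", xentop_pct);
    printf("in-guest (tick sampling) : %5.1f%%\n", guest_pct);
    printf("actual work done         : %5.1f%% of a pCPU\n", real_work);
    return 0;
}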
MaoXiaoyun
2011-Jan-13 04:29 UTC
RE: [Xen-devel] strange CPU utilization, could related to credit schedule ?
The 85% is from inside the VM. I forgot to mention that the 8 VMs each have 2 VCPUs, and the 9th VM, which is 2008, has 8 VCPUs. We are still trying to reproduce the scene.

I have questions about VM idle. How does Xen know a VM is idle? When a VM is idle, what is the VCPU state in Xen, blocked or runnable, and how is the CPU utilization calculated? (I assume that an idle VM finishes its physical CPU use before the time slice ends, its state becomes blocked, and it is then put onto the *inactive* queue, right? But is it possible for the VM's VCPU to come back to the *active* queue while the VM is still idle, so that we would see the VCPU shifting between the two queues?) Also, when the VM's load comes up, will its priority be set to BOOST, putting it at the head of the *active* queue so that it is scheduled earlier?

> Date: Wed, 12 Jan 2011 16:41:07 +0000
> Subject: Re: [Xen-devel] strange CPU utilization, could related to credit schedule ?
> From: George.Dunlap@eu.citrix.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com
>
> Where is that 85% number coming from -- is this from within the VM, or
> from xentop?
>
> If it's Windows reporting from within the VM, one hypothesis is that
> it has to do with processing and running with virtual time. It may
> simply be a side effect of the VM only getting a small percentage of
> the cpu.
>
> If it's xentop, it's probably the vm reacting somehow to getting only
> a small percentage of the CPU. We saw something like this with early
> versions of Windows 2k3, but that problem was addressed in later
> service packs. At any rate, to find out what Windows is doing would
> require a bit more investigation. :-)
>
> -George
>
> On Wed, Jan 12, 2011 at 2:41 PM, tinnycloud <tinnycloud@hotmail.com> wrote:
> > Hi Geogre:
> >
> > We have quite strange CPU usage behaivor in one of our DomU(2008
> > HVM)
> > Totally, our host has 16 physical CPU, and 9 VMS.
> >
> > Most of time, the all VMs works fine, the CPU usage are low and
> > resonable,
> > But at every high workload time(say 9:00-11:00AM, there are 8 VMs, each is a
> > web server,
> > cutomers accesses the page at this time), we login into the 9th VM which
> > is idle, find that
> > its CPU usage is at 85%, doesn't make any sense since we have no task
> > running, also the
> > usage distrbutes evenly across most of the processes.
> >
> > I wonder if it relates to CPU schedule algorithm in Xen.
> > After go
> > through http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00414.html
> > I can't figure out any assumptiones explains our situation.
> > So what do u think of this?
> >
> > Many thanks.
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xensource.com
> > http://lists.xensource.com/xen-devel
> >

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
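For the blocked/runnable part of the question, here is a conceptual model only, not Xen's code: an idle guest ends up executing an idle/halt operation that traps into Xen, the vCPU is then blocked until an event is pending for it, and CPU time is charged only while it actually runs. The credit scheduler's active/inactive bookkeeping and BOOST are separate accounting details layered on top of this and are not modelled here.

/* vcpu_states_toy.c - conceptual vCPU state model, not Xen's scheduler. */
#include <stdio.h>

enum vcpu_state { V_RUNNING, V_RUNNABLE, V_BLOCKED };

struct vcpu {
    enum vcpu_state state;
    double cpu_ms;          /* time charged to this vCPU */
};

/* Guest has nothing to do: its idle loop halts/yields into the hypervisor. */
static void guest_idles(struct vcpu *v)
{
    v->state = V_BLOCKED;   /* not on any run queue */
}

/* An event (e.g. a network interrupt) arrives for a blocked vCPU. */
static void event_arrives(struct vcpu *v)
{
    if (v->state == V_BLOCKED)
        v->state = V_RUNNABLE;   /* eligible to be picked by the scheduler */
}

/* Scheduler picks it and it runs for 'ms' before idling again. */
static void runs_for(struct vcpu *v, double ms)
{
    v->state = V_RUNNING;
    v->cpu_ms += ms;             /* only running time is charged */
    guest_idles(v);
}

int main(void)
{
    struct vcpu v = { V_BLOCKED, 0.0 };
    double wall_ms = 1000.0;

    /* Ten short bursts of work in a one-second window. */
    for (int i = 0; i < 10; i++) {
        event_arrives(&v);
        runs_for(&v, 2.0);
    }
    printf("charged %.0f ms over %.0f ms wall time -> %.1f%% utilization\n",
           v.cpu_ms, wall_ms, 100.0 * v.cpu_ms / wall_ms);
    return 0;
}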
MaoXiaoyun
2011-Jan-17 03:52 UTC
RE: [Xen-devel] strange CPU utilization, could related to credit schedule ?
Hi George:

I've been looking into the credit scheduler again and again. Well, I'm not smart enough to understand it fully. Could you help clarify the points below?

1. From the algorithm, since a domain's credits are directly proportional to its weight, I think that if there are two CPU-bound domains with the same weight, they will accumulate the same CPU time no matter how many VCPUs they have, right?

2. If 1 is true, what is the difference between domains with the same weight but different numbers of VCPUs (say one has 4 VCPUs and another has 8)?

3. I can't fully understand the problems with "credit 1 schedule" in your ppt "Xenschedulerstatus":
(1) Client hypervisors and audio/video: Audio VM: 5% CPU; 2x kernel-build VMs: 97% CPU; 30-40 audio skips over 5 minutes. Do you mean the kernel-build VMs have a big impact on the audio VM, and does the CSCHED_PRI_TS_BOOST priority solve this?
(2) Not fair to latency-sensitive workloads: network scp: "fair share" 50%, usage 20-30%.
(3) Load balancing: 64 threads (4 x 8 x 2); unpredictable; not scalable; power management, hyperthreads.

Could you help explain more? Many, many thanks; these confusions really give me a headache, and I feel a bit silly.

From: tinnycloud@hotmail.com
To: george.dunlap@eu.citrix.com; xen-devel@lists.xensource.com
Subject: RE: [Xen-devel] strange CPU utilization, could related to credit schedule ?
Date: Thu, 13 Jan 2011 12:29:05 +0800

The 85% is from inside the VM. I forgot to mention that the 8 VMs each have 2 VCPUs, and the 9th VM, which is 2008, has 8 VCPUs. We are still trying to reproduce the scene. I have questions about VM idle. How does Xen know a VM is idle? When a VM is idle, what is the VCPU state in Xen, blocked or runnable, and how is the CPU utilization calculated? (I assume that an idle VM finishes its physical CPU use before the time slice ends, its state becomes blocked, and it is then put onto the *inactive* queue, right? But is it possible for the VM's VCPU to come back to the *active* queue while the VM is still idle, so that we would see the VCPU shifting between the two queues?) Also, when the VM's load comes up, will its priority be set to BOOST, putting it at the head of the *active* queue so that it is scheduled earlier?

> Date: Wed, 12 Jan 2011 16:41:07 +0000
> Subject: Re: [Xen-devel] strange CPU utilization, could related to credit schedule ?
> From: George.Dunlap@eu.citrix.com
> To: tinnycloud@hotmail.com
> CC: xen-devel@lists.xensource.com
>
> Where is that 85% number coming from -- is this from within the VM, or
> from xentop?
>
> If it's Windows reporting from within the VM, one hypothesis is that
> it has to do with processing and running with virtual time. It may
> simply be a side effect of the VM only getting a small percentage of
> the cpu.
>
> If it's xentop, it's probably the vm reacting somehow to getting only
> a small percentage of the CPU. We saw something like this with early
> versions of Windows 2k3, but that problem was addressed in later
> service packs. At any rate, to find out what Windows is doing would
> require a bit more investigation. :-)
>
> -George
>
> On Wed, Jan 12, 2011 at 2:41 PM, tinnycloud <tinnycloud@hotmail.com> wrote:
> > Hi Geogre:
> >
> > We have quite strange CPU usage behaivor in one of our DomU(2008
> > HVM)
> > Totally, our host has 16 physical CPU, and 9 VMS.
> > > > Most of time, the all VMs works fine, the CPU usage are low and > > resonable, > > But at every high workload time(say 9:00-11:00AM, there are 8 VMs, each is a > > web server, > > cutomers accesses the page at this time), we login into the 9th VM which > > is idle, find that > > its CPU usage is at 85%, doesn't make any sense since we have no task > > running, also the > > usage distrbutes evenly across most of the processes. > > > > I wonder if it relates to CPU schedule algorithm in Xen. > > After go > > through http://lists.xensource.com/archives/html/xen-devel/2010-07/msg00414.html > > I can't figure out any assumptiones explains our situation. > > So what do u think of this? > > > > Many thanks. > > > > > > > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Jan-17 10:41 UTC
Re: [Xen-devel] strange CPU utilization, could related to credit schedule ?
On Mon, Jan 17, 2011 at 3:52 AM, MaoXiaoyun <tinnycloud@hotmail.com> wrote:
> Hi George:
> 1. From the algorithm, since a domain's credits are directly proportional
> to its weight, I think that if there are two CPU-bound domains with the
> same weight, they will accumulate the same CPU time no matter how many
> VCPUs they have, right?

It used to be the case, yes. But since that is very counter-intuitive, some months ago I introduced a change such that the weight is calculated on a per-vcpu basis. If you look in csched_acct(), when accounting credit, the weight of a domain is multiplied by sdom->active_vcpu_count.

> 2. If 1 is true, what is the difference between domains with the same
> weight but different numbers of VCPUs (say one has 4 VCPUs and another
> has 8)?

If two domains have the same number of "active" vcpus (4 each, for example) they'll get the same amount of CPU time. But if the 8-vcpu domain has 8 vcpus in "active" mode, it will get twice as much time.

But this is a recent change; in earlier versions of Xen (before 3.4 for sure, and possibly 4.0, I can't remember), if two VMs are given the same weight, they'll get the same cpu time.

> 3. I can't fully understand the problems with "credit 1 schedule" in your
> ppt "Xenschedulerstatus":
> (1) Client hypervisors and audio/video: Audio VM: 5% CPU; 2x kernel-build
> VMs: 97% CPU; 30-40 audio skips over 5 minutes.
> Do you mean the kernel-build VMs have a big impact on the audio VM, and
> does the CSCHED_PRI_TS_BOOST priority solve this?

BOOST does not solve this problem. I think I described the problem in the paper: BOOST is an unstable place to be -- you can't stay there very long. The way BOOST works is this:
* You are put into BOOST if your credits reach a certain threshold (30ms worth of credit)
* You are taken out of BOOST if you are interrupted by a scheduler "tick"

If you run at about 5% (or about 1/20 of the time), you can expect to be running on average every 20 ticks. Since timer ticks happen every 10ms, that means you can expect to stay in BOOST for an average of 200ms.

So no matter how little cpu you use, you'll flip back and forth between BOOST and normal, often several times per second.

> Many, many thanks; these confusions really give me a headache, and I feel
> a bit silly.

Not at all! Understanding scheduling is really hard. It probably took me about six months to really understand what was going on. :-)

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
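George's two quantitative points can be reproduced with a few lines of arithmetic: credit is handed out in proportion to weight multiplied by active_vcpu_count (as he says csched_acct() now does), and the expected stay in BOOST for a mostly idle vCPU is roughly one 10 ms tick period divided by its CPU fraction. The weight value of 256 below is only a default used for illustration; the real accounting has more moving parts.

/* credit_math.c - arithmetic behind the reply above (illustrative). */
#include <stdio.h>

int main(void)
{
    /* Two domains, equal weight, different active vCPU counts. */
    int weight = 256;
    int dom_a_vcpus = 4, dom_b_vcpus = 8;
    double share_a = (double)weight * dom_a_vcpus;
    double share_b = (double)weight * dom_b_vcpus;
    double total   = share_a + share_b;

    printf("dom A gets %.0f%% of the credit, dom B gets %.0f%%\n",
           100.0 * share_a / total, 100.0 * share_b / total);

    /* Expected BOOST residence for a vCPU using about 5 percent of a pCPU:
     * a 10 ms tick lands on it roughly once per (tick / fraction). */
    double tick_ms = 10.0, cpu_fraction = 0.05;
    printf("expected BOOST residence at %.0f%% cpu: ~%.0f ms\n",
           100.0 * cpu_fraction, tick_ms / cpu_fraction);
    return 0;
}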
kim.jin
2011-Jan-17 10:51 UTC
Re: Re: [Xen-devel] strange CPU utilization, could related to credit schedule ?
Then, how about the frequency of CPU? e.g., one VM have 1GHz CPUs, but the other have 2GHz CPUs. ------------------ Best Regards! Kim King 2011-01-17 ------------------------------------------------------------- George Dunlap 2011-01-17 18:41:35 MaoXiaoyun xen devel Re: [Xen-devel] strange CPU utilization, could related to creditschedule ?>On Mon, Jan 17, 2011 at 3:52 AM, MaoXiaoyun <tinnycloud@hotmail.com> wrote: >> Hi George: >> 1. From the algorithm, since domains credits is direct proportion >> to its weight, >> I think if there are two cpu-bound domains with same weight, no matter how >> many >> vcpus they have, they will have the same CPU times accmulated, right? > >It used to be the case, yes. But since that is very >counter-intuitive, some months ago I introduced a change such that the >weight is calculated on a per-vcpu basis. If you look in >csched_acct(), when accounting credit, weight of a domain is >multiplied by sdom->active_vcpu_count. > >> 2. if 1 is true, what the different between domains with same >> weight but have >> different VCPUS(say one has 4 vcpus, another has 8)? > >If two domains have the same number of "active" vcpus (4 each, for >example) they'll get the same amount of CPU time. But if the 8-vcpu >domain has 8 vcpus in "active" mode, it will get twice as much time. > >But this is a recent change; in earlier versions of Xen (before 3.4 >for sure, and possibly 4.0, I can't remember), if two VMs are given >the same weight, they'll get the same cpu time. > >> 3. I am fully understand the problems of "credit 1 schedule "in your >> ppt of "Xenschedulerstatus" >> >> (1)Client hypervisors and audio/video >> Audio VM: 5% CPU >> 2x Kernel-build VMs: 97% cpu >> 30-40 audio skips over 5 minutes >> >> Do you mean "kernel-build VMs" has great impact on "Audio VM", and does >> priority CSCHED_PRI_TS_BOOST >> solve this? > >BOOST does not solve this problem. I think I described the problem in >the paper: BOOST is an unstable place to be -- you can't stay there >very long. The way BOOST works is this: >* You are put into BOOST if your credits reach a certain threshold >(30ms worth of credit) >* You are taken out of BOOST if you are interrupted by a scheduler "tick" > >If you run at about 5% (or about 1/20 of the time), you can expect to >be running on average every 20 ticks. Since timer ticks happen every >10ms, that means you can expect to stay in BOOST for an average of >200ms. > >So no matter how little cpu you use, you'll flip back and forth >between BOOST and normal, often several times per second. > >> many many thanks, those confusions really makes me headache, I am a bit of >> silly. > >不是! 懂scheduling非常难. It probably took me about six months to really >understand what was going on. :-) > > -George > >_______________________________________________ >Xen-devel mailing list >Xen-devel@lists.xensource.com >http://lists.xensource.com/xen-devel >._______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2011-Jan-17 10:56 UTC
Re: Re: [Xen-devel] strange CPU utilization, could related to credit schedule ?
On Mon, Jan 17, 2011 at 10:51 AM, kim.jin <kim.jin@stromasys.com> wrote:> Then, how about the frequency of CPU? e.g., one VM have 1GHz CPUs, but the other have 2GHz CPUs.Do you mean, if someone is using CPU frequency scaling? -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
kim.jin
2011-Jan-17 11:30 UTC
Re: Re: Re: [Xen-devel] strange CPU utilization, could related to credit schedule ?
>On Mon, Jan 17, 2011 at 10:51 AM, kim.jin <kim.jin@stromasys.com> wrote:
>> Then, how about the frequency of CPU? e.g., one VM have 1GHz CPUs, but the other have 2GHz CPUs.
>
>Do you mean, if someone is using CPU frequency scaling?

Something similar. Does the new algorithm take the frequency of the vCPUs into account?

> -George

Best Regards!
Kim King
2011-01-17

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel