Dan Magenheimer
2010-Aug-31 22:16 UTC
[Xen-devel] RE: Queries on Tmem and Difference Engine.
> Thanks for reply.
> Now I am very clear with my queries on tmem.

I'm glad it was helpful! I hope it is OK to post replies
on the list so that answers to your very good questions
can be read and discussed by others.

> I am exploring to solve fragmentation problem through buddy allocation
> technique.
> http://www.kernel.org/doc/gorman/html/understand/understand009.html
> http://en.wikipedia.org/wiki/Buddy_memory_allocation
> Do you have any other thoughts on this concept of fragmentation?

Xen already uses a buddy allocator. The problem is that tmem
works around Xen's allocator for various performance reasons.
And since tmem absorbs all physical memory in the system,
tmem must be involved in freeing ephemeral pages, and many,
many pages might need to be freed to obtain a buddy.

Solving this for tmem still means there is a problem for other
Xen dynamic memory management solutions (such as ballooning and
page sharing). I think the right solution for the fragmentation
problem is that no code in Xen should attempt to allocate
order>0 pages unless it is prepared to fail and fall back to
instead allocating and using a set of 2**order individual pages.

> Is there some discussion going on sharing guest via tmem??
> I tried to search a lot but didn't find anything worthwhile.

I'm not sure what you are asking. Is this question about
shared ephemeral pools that require multiple guests sharing
a clustered filesystem? Or are you asking about finding
guests that have tmem-enabled kernels?

Thanks,
Dan

> On Tue, Aug 31, 2010 at 8:20 PM, Dan Magenheimer
> <dan.magenheimer@oracle.com> wrote:
> >> Subject: Queries on Tmem and Difference Engine.
> >>
> >> Hi,
> >> I have gone through presentation on "Transcendent Memory on Xen"
> >>
> >> http://oss.oracle.com/projects/tmem/dist/documentation/presentations/TranscendentMemoryXenSummit2010.pdf
> >> read some paper on tmem and have some good idea about tmem. But still
> >> have few questions on it.
> >
> > Hi Ashwin --
> >
> > Thank you for your interest in tmem!
> >
> >> In tmem pool the deduplication is performed only on pages in
> >> ephemeral pools why it's not performed on
> >> persistent pool? Since by deduplication we are saving memory.
> >
> > First, there is an accounting issue. Persistent pages "owned" by
> > a domain count against each domain's maxmem allocation. If a
> > domain attempts to put a persistent page and the domain has
> > already used up its maxmem, the put fails. This is important for
> > avoiding denial-of-service attacks. So if persistent
> > pages are deduplicated, what happens in the following:
> >
> > - domX puts a persistent page with contents ABC
> > - domY puts a persistent page with contents ABC, but domY
> >   is already at maxmem... but since the page can be deduplicated
> >   and takes no additional memory it is accepted by tmem
> > - domX flushes the page containing ABC
> > - who owns the persistent ABC page in tmem? Has domY exceeded
> >   maxmem or not?
> >
> > and there are other similar scenarios.
> >
> > Second, I wasn't sure that there would be many opportunities
> > for deduplication in swap pages which are, by definition, dirty.
> > Deduplication takes some additional memory for data structures
> > and may take a great deal of additional CPU time, even if
> > no deduplication occurs. So it is important to use it only
> > if it is fairly certain that there will be some value.
> > This is something you could measure for your project since,
> > in a test environment, you do not need to worry about
> > denial-of-service.
> >
> > Interestingly, if the accounting problem were solved, the
> > flexibility tmem has defined for handling "duplicate puts"
> > nicely avoids the CoW-overcommitment problem seen by Difference
> > Engine and Satori and VMware. If memory is exhausted and
> > a domain attempts a persistent put that would cause a CoW,
> > the put can be simply rejected by tmem. So host swapping
> > is never required.
> >
> >> In difference engine by Diwaker Gupta, Guest VM shares the pages.
> >> www.usenix.org/publications/login/2009-04/openpdfs/gupta.pdf This
> >> seems kind of over-committing RAM
> >> for Guest OS. There are many discussions going on VMware
> >> memory-overcommit feature using sharing.
> >> Over-committing exists in Xen-server5.0 at HVM, Why such feature is not
> >> provided at PV? and what is the
> >> status of the difference engine? Is it included in Xen ?
> >
> > You can find my opinion of host swapping in the linux kernel mailing
> > list here: http://lkml.org/lkml/2010/5/2/49
> > (and you might find the entire thread interesting).
> >
> > The Difference Engine code was never submitted to Xen. A
> > version of the Satori code was submitted to Xen in December 2009
> > and is in Xen 4.0.
> >
> >> In above presentation, it has mention that "inter guest shared
> >> memory" is under investigation and fragmentation
> >> is an outstanding issues. What is the status of implementation ? I
> >> would like to carry out this project.
> >> What are your suggestions on it?
> >
> > Shared persistent pools have never been implemented in tmem
> > although most of the code is already there because shared ephemeral
> > pools and non-shared persistent pools are supported.
> >
> > I am not a networking expert, but I believe they would be useful
> > for networking between two guests: If two guests discover they are
> > on the same host, tmem can serve as a transport layer. If you
> > are very interested in networking, exploring this might make
> > a good project.
> >
> > Fragmentation: since tmem absorbs all free memory in the system one
> > page at a time, if Xen attempts to allocate memory of order>0 (order==1
> > means two consecutive physical pages, order==2 means four consecutive
> > physical pages, order==3 means eight, etc), the allocation will fail.
> > The worst problem may be fixed soon though others still must be fixed:
> > http://lists.xensource.com/archives/html/xen-devel/2010-08/msg01350.html
> > This might also make a good Xen-related project.
> >
> > I hope this answers your questions!
> > Dan
> >
> >
>
> --
> With Regards,
> Ashwin Vasani
> B.E. (Fourth Year)
> Computer Engineering,
> Pune Institute of Computer Technology.
> +91 9960405802

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
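
The accounting scenario from the quoted reply (domX and domY both putting a persistent page with contents ABC) can be made concrete with a small toy model. This is only a sketch under stated assumptions: `ToyPersistentPool`, its methods, the `scenario` helper, and the one-page maxmem limits are hypothetical names and values chosen for illustration, and they are not tmem's actual data structures or interface.

```python
class ToyPersistentPool:
    """Toy model: persistent puts are charged against a per-domain maxmem.

    Hypothetical illustration only; not how tmem is implemented.
    """

    def __init__(self, maxmem, dedup):
        self.maxmem = dict(maxmem)               # domain -> page limit
        self.dedup = dedup                       # deduplicate identical contents?
        self.charged = {d: 0 for d in maxmem}    # pages charged per domain
        self.holders = {}                        # contents -> [(domain, was_charged)]

    def put(self, dom, contents):
        entries = self.holders.setdefault(contents, [])
        # With dedup on, a page whose contents are already stored costs
        # no extra memory, so this toy model does not charge for it.
        free_ride = self.dedup and bool(entries)
        if not free_ride:
            if self.charged[dom] >= self.maxmem[dom]:
                return False                     # domain is at maxmem: put fails
            self.charged[dom] += 1
        entries.append((dom, not free_ride))
        return True

    def flush(self, dom, contents):
        entries = self.holders.get(contents, [])
        for i, (holder, was_charged) in enumerate(entries):
            if holder == dom:
                del entries[i]
                if was_charged:
                    self.charged[dom] -= 1       # the charge leaves with the flusher
                return True
        return False

    def pages_held(self, dom):
        return sum(1 for entries in self.holders.values()
                   for holder, _ in entries if holder == dom)


def scenario(dedup):
    pool = ToyPersistentPool({"domX": 1, "domY": 1}, dedup=dedup)
    pool.put("domY", "XYZ")                      # domY is now at maxmem
    pool.put("domX", "ABC")
    accepted = pool.put("domY", "ABC")           # duplicate contents
    pool.flush("domX", "ABC")
    return accepted, pool.pages_held("domY"), sum(pool.charged.values())


if __name__ == "__main__":
    print("dedup on: ", scenario(dedup=True))
    print("dedup off:", scenario(dedup=False))
```

With deduplication on, this toy model ends with domY holding two pages against a limit of one, while only one page is charged to anyone. That is the ownership question raised in the reply. With deduplication off, the duplicate put is simply rejected.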
Dan Magenheimer
2010-Sep-13 21:56 UTC
[Xen-devel] RE: Queries on Tmem and Difference Engine.
> From: Dan Magenheimer
> Subject: RE: Queries on Tmem and Difference Engine.
>
> > Thanks for reply.
> > Now I am very clear with my queries on tmem.
>
> I'm glad it was helpful! I hope it is OK to post replies
> on the list so that answers to your very good questions
> can be read and discussed by others.
>
> > I am exploring to solve fragmentation problem through buddy allocation
> > technique.
> > http://www.kernel.org/doc/gorman/html/understand/understand009.html
> > http://en.wikipedia.org/wiki/Buddy_memory_allocation
> > Do you have any other thoughts on this concept of fragmentation?
>
> Xen already uses a buddy allocator. The problem is that tmem
> works around Xen's allocator for various performance reasons.
> And since tmem absorbs all physical memory in the system,
> tmem must be involved in freeing ephemeral pages, and many,
> many pages might need to be freed to obtain a buddy.
>
> Solving this for tmem still means there is a problem for other
> Xen dynamic memory management solutions (such as ballooning and
> page sharing). I think the right solution for the fragmentation
> problem is that no code in Xen should attempt to allocate
> order>0 pages unless it is prepared to fail and fall back to
> instead allocating and using a set of 2**order individual pages.

After thinking about this some more, adding changes so that
tmem can use and release ephemeral pools in 2MB chunks might
be very useful for domains that request "huge pages". This
may have performance issues, so it should probably only be
enabled with a command line option (e.g. tmem_2mb).
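
The "be prepared to fail and fall back" rule from the quoted reply can be sketched as follows. This is a minimal illustration, not Xen's allocator: free memory is modelled as a Python set of page frame numbers, and the function names `alloc_contiguous` and `alloc_with_fallback` are hypothetical. An order-n request wants 2**n consecutive, naturally aligned pages (order==1 is two pages, order==2 is four). If no such run exists, the caller takes 2**n individual pages instead. Assuming 4 KiB pages, which the thread does not state, a 2MB chunk would correspond to order 9 (512 pages).

```python
def alloc_contiguous(free, order):
    """Try to take 2**order consecutive, aligned frames from the free set.

    Returns the list of frames, or None if no such run is free.
    """
    n = 1 << order
    for start in sorted(free):
        if start % n == 0 and all(start + i in free for i in range(n)):
            block = list(range(start, start + n))
            free.difference_update(block)
            return block
    return None


def alloc_with_fallback(free, order):
    """Prefer a contiguous block; otherwise fall back to individual pages.

    Returns (frames, contiguous). frames is None if even the fallback
    cannot be satisfied because fewer than 2**order pages are free.
    """
    block = alloc_contiguous(free, order)
    if block is not None:
        return block, True
    n = 1 << order
    if len(free) < n:
        return None, False
    pages = sorted(free)[:n]          # any 2**order free pages will do
    free.difference_update(pages)
    return pages, False


if __name__ == "__main__":
    # Fragmented memory: only frames 8 and 9 form an aligned pair.
    free = {0, 2, 4, 6, 8, 9}
    print(alloc_with_fallback(free, 1))   # contiguous pair is available
    print(alloc_with_fallback(free, 2))   # no aligned run of four: falls back
    print(alloc_with_fallback(free, 0))   # nothing left
```

The caller has to cope with a non-contiguous result, which is the point of the rule: code that cannot use scattered pages should not be requesting order>0 allocations in a system where tmem has absorbed free memory one page at a time.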