The aim is to implement Xen memory deduplication with minimum overhead. Our approach to deduplication is as follows.

In most cases, DomUs run one of a small set of well-known operating systems such as Linux, FreeBSD and Microsoft Windows. In such an environment, many domains share read-only filesystems that contain the operating system and frequently used program files and libraries. Each domain has its own writable filesystems for storing data and temporary files. In this configuration, multiple pages scattered across different domains often hold the contents of the same disk block.

To deduplicate these pages, we intend to add a data structure in Dom0 that stores the disk block number and the machine frame number (MFN) whenever a read request for read-only code (or data) is made. When another DomU requests the same block and Dom0 receives the I/O (DMA) request, Dom0 first checks the data structure for an entry for that block. If it finds one, it returns the MFN of the already-read page and maps it to the requesting domain's PFN, so blocks that have already been read need no further disk I/O. This deduplicates the read-only pages accessed by multiple domains without the overhead of hashing page contents.

Test case scenario:

Consider a Dom0 Linux kernel using a filesystem with deduplication enabled. We install a DomU with its virtual disk stored as an image file (.img) on that filesystem, then make multiple copies of the image to deploy multiple DomUs running the same kernel. Because deduplication is enabled in the filesystem, all the blocks of the copies initially point to the same disk blocks. When the kernels are booted, the programs (code segments) they load consume memory only once. As these OSes start to write to their own virtual filesystems, the affected blocks of each image are COWed by the host filesystem and get different block numbers.
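The block-number-to-MFN lookup described above can be modelled in a few lines. This is only a minimal Python sketch, not Xen code: the names `BlockCache`, `read_block`, `block_to_mfn` and `p2m`, and the `(device, block)` key, are hypothetical, and plain dictionaries stand in for the disk and for machine memory.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class BlockCache:
    """Toy model of the proposed Dom0 table: physical disk block -> MFN."""
    disk: dict                                        # (device, block) -> bytes, stands in for real storage
    frames: dict = field(default_factory=dict)        # MFN -> page contents
    block_to_mfn: dict = field(default_factory=dict)  # (device, block) -> MFN
    p2m: dict = field(default_factory=dict)           # (domid, pfn) -> MFN
    disk_reads: int = 0                               # number of real disk reads performed
    _next_mfn: count = field(default_factory=lambda: count(0x1000))

    def read_block(self, domid, pfn, device, block):
        """Serve a read request from a domain and return the MFN backing it."""
        key = (device, block)
        mfn = self.block_to_mfn.get(key)
        if mfn is None:
            # Miss: perform the disk read into a fresh frame and remember it.
            mfn = next(self._next_mfn)
            self.frames[mfn] = self.disk[key]
            self.disk_reads += 1
            self.block_to_mfn[key] = mfn
        # Hit or miss, map the frame into the requesting domain
        # (a real system would have to map it read-only).
        self.p2m[(domid, pfn)] = mfn
        return mfn


cache = BlockCache(disk={("sda", 7): b"shared code page", ("sda", 8): b"COWed copy"})
first = cache.read_block(domid=1, pfn=0x10, device="sda", block=7)
second = cache.read_block(domid=2, pfn=0x99, device="sda", block=7)   # same block: no disk read
third = cache.read_block(domid=2, pfn=0x9A, device="sda", block=8)    # different block: new frame
```

Note that the key is the physical block on the host, so the lookup only hits while the image copies still point to the same disk block; once the host filesystem has COWed a block, its number changes and the read goes to disk as usual.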
Is such an approach already implemented? We intend to implement this as a project. What are the likely challenges?

Regards,
Aditya Gadre
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On Sat, Oct 09, 2010 at 11:26:23PM +0530, Aditya Gadre wrote:
> Now as these OSs start to write to their own virtual
> filesystems the blocks of the image will be COW'ed by the filesystem
> resulting in different block number.
>
> Is such an approach implemented? We intend to implement this as a project.
> What are the suspected challenges?

Yeah, I think the image COW is possible using the Xen blktap2 VHD support, and maybe also the Xen qcow* stuff.

Also check the Xen 4.0 wiki page for more info about memory sharing etc.:
http://wiki.xensource.com/xenwiki/Xen4.0

-- Pasi
I'm not an expert on it, but I believe this sounds very similar to the page sharing implementation that already exists in Xen 4.0. The implementation in Xen only works on HVM guests, and only on machines that have EPT, though. The patches (which were accepted into Xen) were posted here:

http://lists.xensource.com/archives/html/xen-devel/2009-12/msg00797.html

From: Aditya Gadre [mailto:adivb2003@gmail.com]
Sent: Saturday, October 09, 2010 11:56 AM
To: Xen-devel@lists.xensource.com
Subject: [Xen-devel] Xen Memory De-duplication

Aim is to implement Xen Memory Deduplication with minimum overhead.
[...]
This kind of implementation will require the disk blocks from different DomUs to be mapped to the same physical disk block. For example:

1) Shared read-only filesystem
2) Union-based filesystem
3) Virtual machine images deployed on a host filesystem which has deduplication enabled

What kind of filesystem arrangement is used in production environments for DomUs on hosts that run a large number of VMs, as in a cloud environment?

On Sun, Oct 10, 2010 at 5:10 AM, Dan Magenheimer <dan.magenheimer@oracle.com> wrote:
> I'm not an expert on it but I believe this sounds very similar to the
> page sharing implementation that already exists in Xen 4.0. The
> implementation in Xen only works on HVM guests and only on machines that
> have EPT though. The patches (which were accepted into Xen) were posted
> here:
>
> http://lists.xensource.com/archives/html/xen-devel/2009-12/msg00797.html
On Sun, Oct 10, 2010 at 10:54:58AM +0530, Aditya Gadre wrote:
> This kind of implementation will require the disk blocks from different
> DomUs to be mapped to same physical disk block.
> For example,
> 1) Shared read only filesystem
> 2) Union based filesystem
> 3) Virtual machine images deployed on a host filesystem which has
> deduplication enabled

I guess Xen blktap qcow* images should do? And maybe blktap2 VHD?

-- Pasi

> What kind of arrangement of filesystem is used in production environments
> for DomUs which host large number of VMs as in cloud environment?
Not sure about the DMA part, but I suggest you also take a look at the Satori project code (the memshr modules) in Xen.

http://www.usenix.org/events/usenix09/tech/slides/milos.pdf

On Sun, Oct 10, 2010 at 5:34 AM, Pasi Kärkkäinen <pasik@iki.fi> wrote:
> I guess Xen blktap qcow* images should do? And maybe blktap2 VHD?
>
> -- Pasi

--
perception is but an offspring of its own self
At 18:56 +0100 on 09 Oct (1286650583), Aditya Gadre wrote:
> Is such an approach implemented? We intend to implement this as a
> project. What are the suspected challenges?

Yes, this was implemented last year; the patches are in the xen-unstable tree. They hook the read path in blktap to detect duplicate reads of the same block and turn them into copy-on-write mappings in the hypervisor.

Cheers,

Tim.

--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
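To illustrate the copy-on-write step mentioned in the message above, here is a minimal Python model of what happens when a domain writes to a frame it shares with another domain. It is a sketch under stated assumptions, not the actual blktap or hypervisor code: the class `SharedFrames` and its methods are hypothetical names, and each PFN is assumed to be unmapped before it is first mapped.

```python
class SharedFrames:
    """Toy model of copy-on-write sharing of machine frames between domains."""

    def __init__(self):
        self.frames = {}     # MFN -> bytearray with the page contents
        self.refcount = {}   # MFN -> number of (domid, pfn) mappings
        self.p2m = {}        # (domid, pfn) -> MFN
        self._next_mfn = 0

    def _alloc(self, data):
        """Allocate a fresh frame holding a private copy of `data`."""
        mfn = self._next_mfn
        self._next_mfn += 1
        self.frames[mfn] = bytearray(data)
        self.refcount[mfn] = 0
        return mfn

    def _map(self, domid, pfn, mfn):
        self.p2m[(domid, pfn)] = mfn
        self.refcount[mfn] += 1

    def map_new(self, domid, pfn, data):
        """First read of a block: the data lands in a new frame."""
        mfn = self._alloc(data)
        self._map(domid, pfn, mfn)
        return mfn

    def share(self, domid, pfn, mfn):
        """Duplicate read detected: map the existing frame instead of reading again."""
        self._map(domid, pfn, mfn)

    def write(self, domid, pfn, offset, payload):
        """Guest write; breaks sharing first if the frame has other users."""
        mfn = self.p2m[(domid, pfn)]
        if self.refcount[mfn] > 1:
            # Copy-on-write: give the writer a private copy, leave the
            # original frame untouched for the remaining sharers.
            self.refcount[mfn] -= 1
            mfn = self._alloc(self.frames[mfn])
            self._map(domid, pfn, mfn)
        self.frames[mfn][offset:offset + len(payload)] = payload
        return mfn
```

The reference count is what makes the sharing safe: a frame is written in place only when exactly one mapping is left.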
Aditya Gadre wrote:
> This kind of implementation will require the disk blocks from
> different DomUs to be mapped to same physical disk block.
> For example,
> 1) Shared read only filesystem
> 2) Union based filesystem
> 3) Virtual machine images deployed on a host filesystem which has
> deduplication enabled
>
> What kind of arrangement of filesystem is used in production
> environments for DomUs which host large number of VMs as in cloud
> environment?

I don't know about others, but for us (e.g. at GPLHost), none of what you described above is doable. Each VM has its own LVM partition, and we won't have a shared filesystem among many VMs. Never ever. We don't use virtual machine *images* either.

What would be nicer would be a more general approach, and maybe the possibility to use a filesystem that is already mounted on the dom0. Why? Because most of the time, what is wasted is the free space in each LVM, in the setup I described above.

Thomas
At 11:20 +0100 on 12 Oct (1286882408), Thomas Goirand wrote:
> What would be nicer, would be a more general approach, and
> maybe have the possibility to use a filesystem that is already
> mounted on the dom0.

Do you want something more than NFS/CIFS mounts already offer?

Tim.