Oleksandr Dmytryshyn
2013-Nov-25 16:21 UTC
[PATCH] xen: arm: add missing flushing dcache to the copy to/clean guest functions
Without flushing the dcache the hypervisor couldn't copy the device tree
correctly when booting the kernel dom0 Image (the memory holding the device
tree is corrupted). As a result, when we try to load the kernel dom0 Image,
dom0 frequently hangs. This issue is not reproduced with the kernel dom0
zImage because the zImage decompressor code flushes the whole dcache before
starting the decompressed kernel Image. When the hypervisor loads the kernel
uImage or initrd, that memory region isn't corrupted because the hypervisor
code flushes the dcache.

Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
---
 xen/arch/arm/guestcopy.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
index d146cd6..28d3151 100644
--- a/xen/arch/arm/guestcopy.c
+++ b/xen/arch/arm/guestcopy.c
@@ -24,6 +24,7 @@ unsigned long raw_copy_to_guest(void *to, const void *from, unsigned len)
         p = map_domain_page(g>>PAGE_SHIFT);
         p += offset;
         memcpy(p, from, size);
+        flush_xen_dcache_va_range(p, size);

         unmap_domain_page(p - offset);
         len -= size;
@@ -54,6 +55,7 @@ unsigned long raw_clear_guest(void *to, unsigned len)
         p = map_domain_page(g>>PAGE_SHIFT);
         p += offset;
         memset(p, 0x00, size);
+        flush_xen_dcache_va_range(p, size);

         unmap_domain_page(p - offset);
         len -= size;
--
1.8.2.rc2
Ian Campbell
2013-Nov-25 16:33 UTC
Re: [PATCH] xen: arm: add missing flushing dcache to the copy to/clean guest functions
On Mon, 2013-11-25 at 18:21 +0200, Oleksandr Dmytryshyn wrote:

Thanks for the patch.

> Without flushing the dcache the hypervisor couldn't copy the device tree
> correctly when booting the kernel dom0 Image (the memory holding the
> device tree is corrupted). As a result, when we try to load the kernel
> dom0 Image, dom0 frequently hangs. This issue is not reproduced with the
> kernel dom0 zImage because the zImage decompressor code flushes the whole
> dcache before starting the decompressed kernel Image. When the hypervisor
> loads the kernel uImage or initrd, that memory region isn't corrupted
> because the hypervisor code flushes the dcache.

So if not then when/how is this reproduced?

In general I would like to try and keep flushes out of this code path
because these functions are used in the hypercall path, and we have
decreed that guests must have caching enabled to make hypercalls (at
least those which take in-memory arguments).

I think the right fix is to do the flushes in domain_build.c, similar to
how kernel_zimage_load does it. This might need an open-coded version of
copy_to_user. Or better, introduce a flushing variant which shares most
code with the normal one via a common internal function.

Or perhaps we should flush all of the new guest's RAM after building. I
think Julien was looking at doing something along those lines for the
domU building case.
Ian.

> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
> ---
>  xen/arch/arm/guestcopy.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
> index d146cd6..28d3151 100644
> --- a/xen/arch/arm/guestcopy.c
> +++ b/xen/arch/arm/guestcopy.c
> @@ -24,6 +24,7 @@ unsigned long raw_copy_to_guest(void *to, const void *from, unsigned len)
>          p = map_domain_page(g>>PAGE_SHIFT);
>          p += offset;
>          memcpy(p, from, size);
> +        flush_xen_dcache_va_range(p, size);
>
>          unmap_domain_page(p - offset);
>          len -= size;
> @@ -54,6 +55,7 @@ unsigned long raw_clear_guest(void *to, unsigned len)
>          p = map_domain_page(g>>PAGE_SHIFT);
>          p += offset;
>          memset(p, 0x00, size);
> +        flush_xen_dcache_va_range(p, size);
>
>          unmap_domain_page(p - offset);
>          len -= size;
Oleksandr Dmytryshyn
2013-Nov-25 17:04 UTC
Re: [PATCH] xen: arm: add missing flushing dcache to the copy to/clean guest functions
Hi, Ian.

This issue is reproduced when we try to start the kernel Image
(uncompressed image) instead of the zImage (Dom0).

We've written a patch for the hypervisor. With this patch the hypervisor
can boot not only a zImage, but a uImage too. A uImage consists of the
input image with a 64-byte header; we use the 'mkimage' utility to
generate the uImage. Now the hypervisor analyses the uImage header,
removes the header, copies the rest to the destination and simply jumps
to the start of the copied image.

Normal case: we used a uImage generated from the kernel zImage. The Dom0
kernel boots normally.

Bad case: all the same, but for the Dom0 kernel we used a uImage
generated from the kernel Image (with the same parameters as the
zImage). In this case the kernel frequently hangs.

I'll rework and push a new version of this patch. I'll introduce a new
function with dcache flushing which will be based on the copy_to_user
function.

Oleksandr Dmytryshyn | Product Engineering and Development
GlobalLogic
P x3657  M +38.067.382.2525
www.globallogic.com
http://www.globallogic.com/email_disclaimer.txt

On Mon, Nov 25, 2013 at 6:33 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Mon, 2013-11-25 at 18:21 +0200, Oleksandr Dmytryshyn wrote:
>
> Thanks for the patch.
>
>> Without flushing the dcache the hypervisor couldn't copy the device tree
>> correctly when booting the kernel dom0 Image (the memory holding the
>> device tree is corrupted). As a result, when we try to load the kernel
>> dom0 Image, dom0 frequently hangs. This issue is not reproduced with the
>> kernel dom0 zImage because the zImage decompressor code flushes the whole
>> dcache before starting the decompressed kernel Image. When the hypervisor
>> loads the kernel uImage or initrd, that memory region isn't corrupted
>> because the hypervisor code flushes the dcache.
>
> So if not then when/how is this reproduced?
>
> In general I would like to try and keep flushes out of this code path
> because these functions are used in the hypercall path, and we have
> decreed that guests must have caching enabled to make hypercalls (at
> least those which take in-memory arguments).
>
> I think the right fix is to do the flushes in domain_build.c, similar to
> how kernel_zimage_load does it. This might need an open-coded version of
> copy_to_user. Or better, introduce a flushing variant which shares most
> code with the normal one via a common internal function.
>
> Or perhaps we should flush all of the new guest's RAM after building. I
> think Julien was looking at doing something along those lines for the
> domU building case.
>
> Ian.
>
>> Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com>
>> ---
>>  xen/arch/arm/guestcopy.c | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/xen/arch/arm/guestcopy.c b/xen/arch/arm/guestcopy.c
>> index d146cd6..28d3151 100644
>> --- a/xen/arch/arm/guestcopy.c
>> +++ b/xen/arch/arm/guestcopy.c
>> @@ -24,6 +24,7 @@ unsigned long raw_copy_to_guest(void *to, const void *from, unsigned len)
>>          p = map_domain_page(g>>PAGE_SHIFT);
>>          p += offset;
>>          memcpy(p, from, size);
>> +        flush_xen_dcache_va_range(p, size);
>>
>>          unmap_domain_page(p - offset);
>>          len -= size;
>> @@ -54,6 +55,7 @@ unsigned long raw_clear_guest(void *to, unsigned len)
>>          p = map_domain_page(g>>PAGE_SHIFT);
>>          p += offset;
>>          memset(p, 0x00, size);
>> +        flush_xen_dcache_va_range(p, size);
>>
>>          unmap_domain_page(p - offset);
>>          len -= size;

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Julien Grall
2013-Nov-25 17:06 UTC
Re: [PATCH] xen: arm: add missing flushing dcache to the copy to/clean guest functions
On 11/25/2013 04:33 PM, Ian Campbell wrote:
> On Mon, 2013-11-25 at 18:21 +0200, Oleksandr Dmytryshyn wrote:
>
> Thanks for the patch.
>
>> Without flushing the dcache the hypervisor couldn't copy the device tree
>> correctly when booting the kernel dom0 Image (the memory holding the
>> device tree is corrupted). As a result, when we try to load the kernel
>> dom0 Image, dom0 frequently hangs. This issue is not reproduced with the
>> kernel dom0 zImage because the zImage decompressor code flushes the whole
>> dcache before starting the decompressed kernel Image. When the hypervisor
>> loads the kernel uImage or initrd, that memory region isn't corrupted
>> because the hypervisor code flushes the dcache.
>
> So if not then when/how is this reproduced?
>
> In general I would like to try and keep flushes out of this code path
> because these functions are used in the hypercall path, and we have
> decreed that guests must have caching enabled to make hypercalls (at
> least those which take in-memory arguments).
>
> I think the right fix is to do the flushes in domain_build.c, similar to
> how kernel_zimage_load does it. This might need an open-coded version of
> copy_to_user. Or better, introduce a flushing variant which shares most
> code with the normal one via a common internal function.
>
> Or perhaps we should flush all of the new guest's RAM after building. I
> think Julien was looking at doing something along those lines for the
> domU building case.

I was planning to handle only domU, but the issue can also happen with
dom0 (which is, after all, a guest like any other :)).

The main reason behind both issues is that Linux starts with the cache
disabled. So if Xen/Dom0 copy data into a guest, some data can stay in
the cache for a while. In that case the guest will see corrupted data.

Flushing all the new RAM seems a bit complicated and slow (AFAIK, you
can only flush the data cache page by page). I would stick with a
similar solution and check whether the guest is already running or not.
Ian, what do you think?

--
Julien Grall
Oleksandr Dmytryshyn
2013-Nov-26 08:15 UTC
Re: [PATCH] xen: arm: add missing flushing dcache to the copy to/clean guest functions
Hi to all.

Currently our Dom0 kernel size is about 30 MBytes (on a Panda5 board).

I've measured the flushing time in 2 cases:
1. Flushing about 30 MBytes (flush_xen_dcache_va_range() function) - 1280 uS
2. Full dcache flush (I've ported this from u-boot to the hypervisor) - 93 uS

I used an OMAP5 ES2.0 (Panda5 board) for the test, and the 32K_TIMER in
the OMAP5 to measure the flushing time.

The problem exists only for the device tree image (for Dom0), because the
kernel Dom0 image (and the ramdisk image) are copied with dcache
flushing.

Oleksandr Dmytryshyn | Product Engineering and Development
GlobalLogic
P x3657  M +38.067.382.2525
www.globallogic.com
http://www.globallogic.com/email_disclaimer.txt

On Mon, Nov 25, 2013 at 7:06 PM, Julien Grall <julien.grall@citrix.com> wrote:
> On 11/25/2013 04:33 PM, Ian Campbell wrote:
>> On Mon, 2013-11-25 at 18:21 +0200, Oleksandr Dmytryshyn wrote:
>>
>> Thanks for the patch.
>>
>>> Without flushing the dcache the hypervisor couldn't copy the device tree
>>> correctly when booting the kernel dom0 Image (the memory holding the
>>> device tree is corrupted). As a result, when we try to load the kernel
>>> dom0 Image, dom0 frequently hangs. This issue is not reproduced with the
>>> kernel dom0 zImage because the zImage decompressor code flushes the whole
>>> dcache before starting the decompressed kernel Image. When the hypervisor
>>> loads the kernel uImage or initrd, that memory region isn't corrupted
>>> because the hypervisor code flushes the dcache.
>>
>> So if not then when/how is this reproduced?
>>
>> In general I would like to try and keep flushes out of this code path
>> because these functions are used in the hypercall path, and we have
>> decreed that guests must have caching enabled to make hypercalls (at
>> least those which take in-memory arguments).
>>
>> I think the right fix is to do the flushes in domain_build.c, similar to
>> how kernel_zimage_load does it. This might need an open-coded version of
>> copy_to_user.
>> Or better, introduce a flushing variant which shares most
>> code with the normal one via a common internal function.
>>
>> Or perhaps we should flush all of the new guest's RAM after building. I
>> think Julien was looking at doing something along those lines for the
>> domU building case.
>
> I was planning to handle only domU, but the issue can also happen with
> dom0 (which is, after all, a guest like any other :)).
>
> The main reason behind both issues is that Linux starts with the cache
> disabled. So if Xen/Dom0 copy data into a guest, some data can stay in
> the cache for a while. In that case the guest will see corrupted data.
>
> Flushing all the new RAM seems a bit complicated and slow (AFAIK, you
> can only flush the data cache page by page). I would stick with a
> similar solution and check whether the guest is already running or not.
>
> Ian, what do you think?
>
> --
> Julien Grall
Stefano Stabellini
2013-Dec-01 16:21 UTC
Re: [PATCH] xen: arm: add missing flushing dcache to the copy to/clean guest functions
On Tue, 26 Nov 2013, Oleksandr Dmytryshyn wrote:
> Hi to all.
>
> Currently our Dom0 kernel size is about 30 MBytes (on a Panda5 board).
>
> I've measured the flushing time in 2 cases:
> 1. Flushing about 30 MBytes (flush_xen_dcache_va_range() function) - 1280 uS
> 2. Full dcache flush (I've ported this from u-boot to the hypervisor) - 93 uS

The difference is huge!

Does the "full dcache flush" cost 93 uS no matter the address range we
want to flush?

How long does it take to flush only the kernel, initramfs and DTB using
flush_xen_dcache_va_range?

It is probably worth sending a patch with your full dcache flush ported
from u-boot and starting to use that.
Ian Campbell
2013-Dec-02 10:10 UTC
Re: [PATCH] xen: arm: add missing flushing dcache to the copy to/clean guest functions
On Sun, 2013-12-01 at 16:21 +0000, Stefano Stabellini wrote:
> On Tue, 26 Nov 2013, Oleksandr Dmytryshyn wrote:
>> Hi to all.
>>
>> Currently our Dom0 kernel size is about 30 MBytes (on a Panda5 board).
>>
>> I've measured the flushing time in 2 cases:
>> 1. Flushing about 30 MBytes (flush_xen_dcache_va_range() function) - 1280 uS
>> 2. Full dcache flush (I've ported this from u-boot to the hypervisor) - 93 uS
>
> The difference is huge!
>
> Does the "full dcache flush" cost 93 uS no matter the address range we
> want to flush?

I assume it uses flushes by set/way, so it will depend on the
processor's cache parameters, but it will almost certainly require far
fewer individual operations than a page-by-page flush of even a
moderately sized region.

These timings don't take into account the "collateral damage" of
flushing a bunch of unrelated data out of the cache which will then need
reloading, though.

> How long does it take to flush only the kernel, initramfs and DTB using
> flush_xen_dcache_va_range?
>
> It is probably worth sending a patch with your full dcache flush ported
> from u-boot and starting to use that.

Yes, license allowing, I'd be happy to have such a patch. I suppose this
is mostly code from arch/arm/cpu/armv7/cache_v7.c, which is GPL-2.0+.

It would be useful to have such a patch around, e.g. for debugging and
testing, even if it doesn't actually get used in practice... In the past
I've hacked in the Linux kernel's asm version of these loops. I suppose
they are more optimal, but I am in favour of taking the simpler C
versions unless/until measurement shows that an asm version is better.

Ian.