Rik van Riel
2006-Jul-28 07:06 UTC
[Xen-devel] [PATCH][RFC] open HVM backing storage with O_SYNC
I noticed that the new qemu-dm code has DMA_MULTI_THREAD defined, so
I/O already overlaps with CPU run time of the guest domain. This
means that we might as well open the backing storage with O_SYNC, so
writes done by the guest hit the disk when the guest expects them to,
and in the order the guest expects them to.

I am now running my postgresql HVM test domain (which has had its
database eaten a number of times by the async write behaviour) with
this patch, and will try to abuse it heavily over the next few days.

Any comments on this patch?

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
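The patch itself is not quoted in the thread; as a rough sketch of what
opening the backing image with O_SYNC amounts to (the path and the code
around the open() call are illustrative assumptions, not qemu's actual
block-raw code):

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        /* Without O_SYNC, write() only dirties the dom0 page cache, so a
         * crash can lose data the guest already considers committed. */
        int fd = open("/var/lib/xen/images/hvm-disk.img", O_RDWR | O_SYNC);
        if (fd < 0) {
            perror("open");
            return 1;
        }
        /* The IDE emulation would issue its reads and writes here; with
         * O_SYNC each write() returns only once the data is on disk. */
        close(fd);
        return 0;
    }

The trade-off is the one the thread is about: write ordering and
durability the guest can rely on, at the cost of the write-back caching
that made the asynchronous behaviour fast.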
Rik van Riel
2006-Jul-28 17:02 UTC
Re: [Xen-devel] [PATCH][RFC] open HVM backing storage with O_SYNC
Rik van Riel wrote:
> Any comments on this patch?

I got some comments from Alan, who would like to see this behaviour
tunable with hdparm from inside the guest. This requires larger
qemu changes though, to be specific an ->fsync callback into each
of the backing store drivers, so that is something for the qemu
mailing list.

The current bottleneck seems to be that MAX_MULT_COUNT is only 16.
I will try raising this to 256 so we can transport a lot more data
per world and domain switch...
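A hypothetical sketch of the per-driver callback being described: if the
guest toggled its write cache with hdparm, the IDE emulation could flush
through such a hook instead of relying on O_SYNC. The struct and names
below are invented for illustration and are not qemu's real block-driver
interface.

    #include <stddef.h>
    #include <sys/types.h>
    #include <unistd.h>

    struct backing_store_driver {
        ssize_t (*bs_read)(int fd, void *buf, size_t len, off_t off);
        ssize_t (*bs_write)(int fd, const void *buf, size_t len, off_t off);
        int     (*bs_fsync)(int fd);   /* the new callback under discussion */
    };

    /* For a raw image backend, flushing is just fsync() on the image file. */
    static int raw_bs_fsync(int fd)
    {
        return fsync(fd);
    }

    /* Called by the IDE emulation after a write when the guest has turned
     * its write cache off (hdparm -W 0 inside the guest). */
    static int maybe_flush(const struct backing_store_driver *drv, int fd,
                           int guest_write_cache_on)
    {
        return guest_write_cache_on ? 0 : drv->bs_fsync(fd);
    }

    int main(void)
    {
        struct backing_store_driver raw = { NULL, NULL, raw_bs_fsync };
        /* Exercise the sketch: fd 0 stands in for the image file here. */
        return maybe_flush(&raw, 0, 0) ? 1 : 0;
    }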
Rik van Riel
2006-Jul-28 20:21 UTC
Re: [Xen-devel] [PATCH][RFC] open HVM backing storage with O_SYNC
Rik van Riel wrote:
> Rik van Riel wrote:
>
>> Any comments on this patch?
>
> I got some comments from Alan, who would like to see this behaviour
> tunable with hdparm from inside the guest. This requires larger
> qemu changes though, to be specific an ->fsync callback into each
> of the backing store drivers, so that is something for the qemu
> mailing list.

Considering the AIO-based development going on in the qemu community,
I think we should stick with the O_SYNC band-aid. The idea Alan
described would just be a fancier band-aid.

> The current bottleneck seems to be that MAX_MULT_COUNT is only 16.

Upon closer inspection of the code, this seems to not be the case for
LBA48 transfers.
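For rough scale, some back-of-the-envelope arithmetic assuming 512-byte
sectors; that MAX_MULT_COUNT bounds the sectors moved per multi-sector
data block is an assumption based on the name, not a reading of the
qemu code:

    #include <stdio.h>

    int main(void)
    {
        const unsigned int sector = 512;   /* bytes per IDE sector */

        /* data per multi-sector block at the current and proposed limits */
        printf("mult count  16: %6u bytes (%3u KiB) per block\n",
               16 * sector, 16 * sector / 1024);
        printf("mult count 256: %6u bytes (%3u KiB) per block\n",
               256 * sector, 256 * sector / 1024);

        /* an LBA48 command carries a 16-bit sector count, so one command
         * can already describe up to 65536 sectors */
        printf("LBA48 max per command: %u bytes (%u MiB)\n",
               65536u * sector, 65536u * sector / (1024 * 1024));
        return 0;
    }

That would fit the observation above that the limit does not bite for
LBA48 transfers.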
Christian Limpach
2006-Jul-29 00:44 UTC
Re: [Xen-devel] [PATCH][RFC] open HVM backing storage with O_SYNC
On 7/28/06, Rik van Riel <riel@redhat.com> wrote:
> Rik van Riel wrote:
>> Rik van Riel wrote:
>>
>>> Any comments on this patch?
>>
>> I got some comments from Alan, who would like to see this behaviour
>> tunable with hdparm from inside the guest. This requires larger
>> qemu changes though, to be specific an ->fsync callback into each
>> of the backing store drivers, so that is something for the qemu
>> mailing list.
>
> Considering the AIO-based development going on in the qemu community,
> I think we should stick with the O_SYNC band-aid. The idea Alan
> described would just be a fancier band-aid.

Another possibility would be to integrate blktap/tapdisk into qemu,
which will provide asynchronous completion events and hide the
immediate AIO interaction from qemu. This should also make using qemu
inside a stub domain easier, since the code to talk to tapdisk will be
very similar to the blkfront code. Also, this is somewhat required to
use tap devices for HVM domains; the alternative of using blkfront
within dom0 to export the device for qemu to use doesn't sound too
appealing.

Do you fancy looking into this?

>> The current bottleneck seems to be that MAX_MULT_COUNT is only 16.
>
> Upon closer inspection of the code, this seems to not be the case for
> LBA48 transfers.

Any other ideas what could be the bottleneck then?

    christian
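A hypothetical sketch of the asynchronous-completion model being
proposed, seen from the device model's side: submit a request, let the
guest keep running, and finish the emulated command from a completion
callback. The names are invented for illustration and do not correspond
to the real blktap/tapdisk or qemu interfaces.

    #include <stddef.h>
    #include <stdint.h>

    typedef void (*io_completion_cb)(void *opaque, int result);

    struct io_request {
        uint64_t         sector;       /* first sector of the transfer */
        unsigned int     nr_sectors;   /* transfer length in sectors */
        void            *buffer;       /* guest data being read or written */
        int              is_write;
        io_completion_cb complete;     /* invoked when the backend is done */
        void            *opaque;       /* emulation state handed back to it */
    };

    /* The device model hands the request to the backend (tapdisk in the
     * proposal, e.g. over a shared ring) and returns at once, so the guest
     * keeps running while the I/O is in flight. */
    static void backend_submit(struct io_request *req)
    {
        /* ... queue req and return; complete() fires later ... */
        req->complete(req->opaque, 0);   /* placeholder: complete at once */
    }

    /* Completion handler: only now would the emulated controller raise its
     * interrupt, so the guest sees the command finish when the backend
     * really has handled the data. */
    static void ide_write_done(void *opaque, int result)
    {
        (void)opaque;
        (void)result;
        /* ... update IDE status registers and assert the IRQ here ... */
    }

    int main(void)
    {
        static char data[512];
        struct io_request req = { 0, 1, data, 1, ide_write_done, NULL };
        backend_submit(&req);
        return 0;
    }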
Rik van Riel
2006-Jul-30 22:45 UTC
Re: [Xen-devel] [PATCH][RFC] open HVM backing storage with O_SYNC
Christian Limpach wrote:
> Another possibility would be to integrate blktap/tapdisk into qemu,
> which will provide asynchronous completion events and hide the
> immediate AIO interaction from qemu. This should also make using qemu
> inside a stub domain easier

Sounds like a very good idea indeed.

> Do you fancy looking into this?

Unfortunately we've got some nasty blocker bugs left for Fedora Core 6
which we're trying to track down first...

>>> The current bottleneck seems to be that MAX_MULT_COUNT is only 16.
>>
>> Upon closer inspection of the code, this seems to not be the case for
>> LBA48 transfers.
>
> Any other ideas what could be the bottleneck then?

Probably scheduling latency. I'm running 2 VT domains on this system,
and both qemu-dm processes are taking up to 25% of the CPU each, on a
3GHz system. When running top inside the VT guest, a lot of CPU time
is spent in "hi" and "si" time, which is irq code being emulated by
qemu-dm.

Of course, with qemu-dm taking this much CPU time, it'll have a lower
CPU priority and will not get scheduled immediately. Still fast enough
to have 10000+ context switches/second, but apparently not quite fast
enough for the VT guest to have decent performance under heavy network
traffic...
Rik van Riel
2006-Aug-02 23:26 UTC
Re: [Xen-devel] [PATCH][RFC] open HVM backing storage with O_SYNC
Rik van Riel wrote:
> Christian Limpach wrote:
>> Any other ideas what could be the bottleneck then?
>
> Probably scheduling latency.

After switching to the rtl8139 network emulation (which now works
well), the CPU use of both qemu-dm and my VT guests dramatically
decreased and performance is a lot better now.

I'll let you know what the next bottleneck is once I run into it :)
Christian Limpach
2006-Aug-04 09:29 UTC
Re: [Xen-devel] [PATCH][RFC] open HVM backing storage with O_SYNC
On 7/28/06, Rik van Riel <riel@redhat.com> wrote:
> I noticed that the new qemu-dm code has DMA_MULTI_THREAD defined, so
> I/O already overlaps with CPU run time of the guest domain. This
> means that we might as well open the backing storage with O_SYNC, so
> writes done by the guest hit the disk when the guest expects them to,
> and in the order the guest expects them to.
>
> I am now running my postgresql HVM test domain (which has had its
> database eaten a number of times by the async write behaviour) with
> this patch, and will try to abuse it heavily over the next few days.
>
> Any comments on this patch?

Applied, thanks!

    christian