Andres Lagar-Cavilla
2011-Nov-23 04:52 UTC
Re: [PATCH 3 of 3] RFC: xenpaging: use waitqueue in ept_get
Hi Olaf,

thanks for posting these RFC patches, great work!

I have several comments. Most importantly, I want to hash out the interactions between these wait queues and the work I've been doing on p2m synchronization. I've been running Xen with synchronized (i.e. locking) p2m lookups for a few weeks now with little/no trouble. You can refer to a patch I posted to the list previously, which I can resend. (I'm waiting on feedback on some previous patches I sent to keep pushing on this work.)

- I think the waitqueue should be part of struct arch_domain. It is an x86 construct that applies only to the base-level p2m of the domain, not to every possible p2m in a nested setting.

- The same could be said of the p2m.pod struct, by the way, and that's a patch I want to get upstreamed too.

- The right place to wait is not ept->get_entry, imho, but rather the implementation-independent caller (get_gfn_type_access). More p2m implementations could be added in the future (however unlikely that may be) that can also perform paging.

- With locking p2m lookups, one needs to be careful about in_atomic. I haven't performed a full audit of all callers, but this is what I can say:
  1. there are no code paths that perform recursive p2m lookups, *on different gfns*, with p2m_query_t != p2m_query
  2. there are recursive lookups of different gfns, with p2m_query_t == p2m_query, most notably PoD sweeps. These do not represent a problem here, since those paths will skip over gfns that are paged.

Why is this important? Because we need to be in a position where a code snippet such as

    get_gfn_type_access(gfn) {
    retry:
        p2m_lock();
        mfn = p2m->get_entry(gfn, &p2mt);
        if (p2mt == paging) {
            p2m_unlock();
            sleep_on_waitqueue();
            goto retry;
        }
        ...
    }

works. This will get us a non-racy p2m that sends vcpus to sleep without holding locks. Makes sense?

- What is the plan for grant operations for pv-on-hvm drivers? Those will enter get_gfn* holding the domain lock, and thus in an atomic context.
Andres

> Date: Tue, 22 Nov 2011 22:13:25 +0100
> From: Olaf Hering <olaf@aepfle.de>
> To: xen-devel@lists.xensource.com
> Subject: [Xen-devel] [PATCH 3 of 3] RFC: xenpaging: use waitqueue in ept_get
> Message-ID: <9d63ecd3969bb7a2e398.1321996405@probook.site>
> Content-Type: text/plain; charset="us-ascii"
>
> # HG changeset patch
> # User Olaf Hering <olaf@aepfle.de>
> # Date 1321996199 -3600
> # Node ID 9d63ecd3969bb7a2e39841f6c859b4c23f750642
> # Parent de6860cb9205b68d1287482288d1b7b9d0255609
> RFC: xenpaging: use waitqueue in ept_get
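The retry pattern Andres sketches above can be illustrated with a small standalone mock. All names here (mock_p2m, get_gfn_type_access_mock, the instantly-resolving sleep_on_waitqueue) are stand-ins, not Xen's actual interfaces; the point is the invariant that the lock is dropped before the vcpu sleeps and the lookup restarts from scratch after wakeup.

```c
/* Minimal single-threaded mock of the lock/check/unlock-then-sleep
 * retry pattern. All names are stand-ins for the real Xen primitives,
 * not their actual signatures. */
#include <assert.h>

enum p2m_type { p2m_ram, p2m_paging };

struct mock_p2m {
    enum p2m_type type;   /* current type of the (single) mock gfn */
    int lock_held;
    int wakeups;          /* how many times the lookup had to sleep */
};

void p2m_lock(struct mock_p2m *p)   { assert(!p->lock_held); p->lock_held = 1; }
void p2m_unlock(struct mock_p2m *p) { assert(p->lock_held);  p->lock_held = 0; }

/* Stand-in for sleeping on the paging waitqueue. Here the "pager"
 * instantly pages the frame back in; in Xen the vcpu would block until
 * the paging response arrives. */
void sleep_on_waitqueue(struct mock_p2m *p)
{
    assert(!p->lock_held);   /* key invariant: never sleep holding the lock */
    p->wakeups++;
    p->type = p2m_ram;       /* simulate xenpaging resolving the page-in */
}

/* The retry loop: look up under the lock; if the entry is paged out,
 * drop the lock, sleep, and redo the lookup from scratch. */
enum p2m_type get_gfn_type_access_mock(struct mock_p2m *p)
{
    enum p2m_type t;
retry:
    p2m_lock(p);
    t = p->type;             /* mfn = p2m->get_entry(gfn, &p2mt) in Xen */
    if (t == p2m_paging) {
        p2m_unlock(p);
        sleep_on_waitqueue(p);
        goto retry;
    }
    p2m_unlock(p);
    return t;
}
```

The design point is that nothing from before the sleep is trusted afterwards: the lock is retaken and the type rechecked, which is also what makes broadcast wakeups safe.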
Olaf Hering
2011-Nov-23 16:37 UTC
Re: [PATCH 3 of 3] RFC: xenpaging: use waitqueue in ept_get
On Tue, Nov 22, Andres Lagar-Cavilla wrote:

> Hi Olaf,
>
> thanks for posting these RFC patches, great work!
>
> I have several comments. Most importantly, I want to hash out the
> interactions between these wait queues and the work I've been doing on p2m
> synchronization. I've been running Xen with synchronized (i.e. locking)
> p2m lookups for a few weeks now with little/no trouble. You can refer to a
> patch I posted to the list previously, which I can resend. (I'm waiting on
> feedback on some previous patches I sent to keep pushing on this work.)
>
> - I think the waitqueue should be part of struct arch_domain. It is an x86
> construct that applies only to the base-level p2m of the domain, not to
> every possible p2m in a nested setting.

Good point, I will move it. On the other hand, its current location can't be the final solution. A wake_up would start all waiting vcpus, not just the ones waiting for a certain gfn (in the case of paging). There needs to be a better way to do a selective wake_up.

> - The right place to wait is not ept->get_entry, imho, but rather the
> implementation-independent caller (get_gfn_type_access). More p2m
> implementations could be added in the future (however unlikely that may
> be) that can also perform paging.

The wait could probably be moved one level up, yes.

> - With locking p2m lookups, one needs to be careful about in_atomic. I
> haven't performed a full audit of all callers, but this is what I can say:
> 1. there are no code paths that perform recursive p2m lookups, *on
> different gfns*, with p2m_query_t != p2m_query
> 2. there are recursive lookups of different gfns, with p2m_query_t ==
> p2m_query, most notably PoD sweeps. These do not represent a problem here,
> since those paths will skip over gfns that are paged.
>
> Why is this important? Because we need to be in a position where a code
> snippet such as
>
>     get_gfn_type_access(gfn) {
>     retry:
>         p2m_lock();
>         mfn = p2m->get_entry(gfn, &p2mt);
>         if (p2mt == paging) {
>             p2m_unlock();
>             sleep_on_waitqueue();
>             goto retry;
>         }
>         ...
>     }
>
> works. This will get us a non-racy p2m that sends vcpus to sleep without
> holding locks. Makes sense?

Something like that can be done if needed, yes.

> - What is the plan for grant operations for pv-on-hvm drivers? Those will
> enter get_gfn* holding the domain lock, and thus in an atomic context.

Is that a new thing? So far my PVonHVM guests did not encounter that issue. I see do_grant_table_op() taking the domain_lock, but is this code path really entered from the guest or rather from dom0? __get_paged_frame() returns GNTST_eagain, and that needs to be handled by callers of do_grant_table_op. linux-2.6.18-xen.hg has code to retry the grant operation in that case.

Olaf
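The caller-side retry that Olaf describes — loop on GNTST_eagain until xenpaging has restored the frame, giving up after a bound — can be sketched roughly as follows. The mock hypercall, the function names other than the GNTST_* codes, and the GNTST_eagain numeric value are illustrative (see Xen's public/grant_table.h for the real definitions); real frontends also delay between attempts.

```c
/* Sketch of the GNTST_eagain retry loop from the linux-2.6.18-xen.hg
 * frontends. Names and values are illustrative, not Xen's ABI. */
#include <assert.h>

#define GNTST_okay     0
#define GNTST_eagain (-12)  /* illustrative; see public/grant_table.h */

/* Mock hypercall: fails with GNTST_eagain the first 'pending' times,
 * standing in for a frame that xenpaging has not yet paged back in. */
struct mock_gnttab { int pending; };

int mock_grant_table_op(struct mock_gnttab *g)
{
    if (g->pending > 0) {
        g->pending--;
        return GNTST_eagain;
    }
    return GNTST_okay;
}

/* Bounded retry loop: retry on GNTST_eagain (with a delay in real
 * code), give up after a maximum number of attempts. */
int gnttab_op_retry(struct mock_gnttab *g, int max_retries)
{
    int rc = GNTST_eagain;
    int i;
    for (i = 0; i <= max_retries; i++) {
        rc = mock_grant_table_op(g);
        if (rc != GNTST_eagain)
            return rc;
        /* real code: msleep(delay) before retrying */
    }
    return rc;  /* still GNTST_eagain: caller must handle the failure */
}
```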
Andres Lagar-Cavilla
2011-Nov-23 17:36 UTC
Re: [PATCH 3 of 3] RFC: xenpaging: use waitqueue in ept_get
Below:

> On Tue, Nov 22, Andres Lagar-Cavilla wrote:
>
>> Hi Olaf,
>>
>> thanks for posting these RFC patches, great work!
>>
>> [...]
>>
>> - I think the waitqueue should be part of struct arch_domain. It is an
>> x86 construct that applies only to the base-level p2m of the domain, not
>> to every possible p2m in a nested setting.
>
> Good point, I will move it. On the other hand, its current location can't
> be the final solution. A wake_up would start all waiting vcpus, not just
> the ones waiting for a certain gfn (in the case of paging). There needs to
> be a better way to do a selective wake_up.

As long as you wrap the wait queue go-to-sleep action in a while loop that rechecks the sleep condition before exiting the loop, this should be fine. It's a standard idiom. There is an argument against spurious wake-ups, but imho it's no biggie.

>> - The right place to wait is not ept->get_entry, imho, but rather the
>> implementation-independent caller (get_gfn_type_access). More p2m
>> implementations could be added in the future (however unlikely that may
>> be) that can also perform paging.
>
> The wait could probably be moved one level up, yes.
>
>> - With locking p2m lookups, one needs to be careful about in_atomic. I
>> haven't performed a full audit of all callers, but this is what I can say:
>> 1. there are no code paths that perform recursive p2m lookups, *on
>> different gfns*, with p2m_query_t != p2m_query
>> 2. there are recursive lookups of different gfns, with p2m_query_t ==
>> p2m_query, most notably PoD sweeps. These do not represent a problem here,
>> since those paths will skip over gfns that are paged.
>>
>> Why is this important? Because we need to be in a position where a code
>> snippet such as [...] works. This will get us a non-racy p2m that sends
>> vcpus to sleep without holding locks. Makes sense?
>
> Something like that can be done if needed, yes.
>
>> - What is the plan for grant operations for pv-on-hvm drivers? Those will
>> enter get_gfn* holding the domain lock, and thus in an atomic context.
>
> Is that a new thing? So far my PVonHVM guests did not encounter that
> issue. I see do_grant_table_op() taking the domain_lock, but is this
> code path really entered from the guest or rather from dom0?
> __get_paged_frame() returns GNTST_eagain, and that needs to be handled
> by callers of do_grant_table_op. linux-2.6.18-xen.hg has code to retry
> the grant operation in that case.

I'm on your side here. I've seen posts in the list about putting dom0 into an hvm container using ept, though.

Andres

> Olaf
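The "standard idiom" Andres refers to — re-checking the sleep condition in a while loop so that a broadcast wake_up meant for some other gfn is harmless — might look like the following single-threaded mock (all names here are illustrative, not Xen's):

```c
/* Mock of the condition-recheck idiom: a broadcast wake_up may wake
 * vcpus whose gfn is still paged out, so the sleeper re-checks its own
 * condition and goes back to sleep on a spurious wakeup. */
#include <assert.h>

struct mock_sleeper {
    int my_gfn_ready;    /* the condition this vcpu is waiting for */
    int spurious_left;   /* wakeups destined for *other* gfns */
    int times_slept;
};

/* Block until any wakeup arrives. Each call models one wake_up() on
 * the shared waitqueue; only the last one is "ours". */
void wait_event_mock(struct mock_sleeper *s)
{
    s->times_slept++;
    if (s->spurious_left > 0)
        s->spurious_left--;       /* woken, but not for our gfn */
    else
        s->my_gfn_ready = 1;      /* this wakeup resolves our page-in */
}

/* The while-loop wrapper: tolerate wakeups for other gfns by
 * re-checking the condition before proceeding. */
int sleep_until_ready(struct mock_sleeper *s)
{
    while (!s->my_gfn_ready)
        wait_event_mock(s);
    return s->times_slept;
}
```

With this wrapper in place, a single shared waitqueue per domain stays correct even without selective wake_up; selective wakeup then becomes purely a performance optimization.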