thr3ads.net - Xen devel - [Xen-devel] Re: mem-event interface [Jun 2010]

Grzegorz Milos

2010-Jun-23 22:19 UTC

[Xen-devel] Re: mem-event interface

[From Gregor]

There are two major events that the memory sharing code needs to
communicate over the hypervisor/userspace boundary:
1. GFN unsharing failed due to lack of memory. This will be called the
'OOM event' from now on.
2. MFN is no longer sharable (actually an opaque sharing handle would
be communicated instead of the MFN). 'Handle invalidate event' from
now on.

The requirements on the OOM event are relatively similar to the
page-in event. The way this should operate is that the faulting VCPU
is paused, and the pager is requested to free up some memory. When it
does so, it should generate an appropriate response, and wake the
VCPU up again using a domctl. The event is going to be low volume,
and since it is going to be handled synchronously, likely in tens of
ms, there are no particular requirements on the efficiency.
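
The OOM flow described above can be sketched as a small simulation. This is only an illustration under stated assumptions, not Xen code: `Hypervisor`, `Vcpu`, `OomRequest`, `pager_handle_oom` and `resume` (a stand-in for the domctl) are all hypothetical names, and the ring is modelled as a plain queue.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Vcpu:
    vcpu_id: int
    paused: bool = False


@dataclass
class OomRequest:
    vcpu_id: int
    gfn: int


@dataclass
class Hypervisor:
    """Hypothetical model of the hypervisor side, not real Xen state."""
    free_pages: int
    vcpus: dict
    ring: deque = field(default_factory=deque)

    def unshare(self, vcpu_id, gfn):
        """Try to unshare a GFN; on OOM, pause the VCPU and queue a request."""
        if self.free_pages > 0:
            self.free_pages -= 1
            return True
        self.vcpus[vcpu_id].paused = True
        self.ring.append(OomRequest(vcpu_id, gfn))
        return False

    def resume(self, vcpu_id):
        """Stand-in for the domctl that wakes the VCPU up again."""
        self.vcpus[vcpu_id].paused = False


def pager_handle_oom(hv, pages_to_free):
    """Pager side: free some memory, then respond and wake each waiting VCPU."""
    handled = []
    while hv.ring:
        req = hv.ring.popleft()
        hv.free_pages += pages_to_free  # pretend the pager evicted pages
        hv.resume(req.vcpu_id)
        handled.append(req)
    return handled
```

After the pager has run, the faulting VCPU would simply retry the unshare, which now succeeds because memory is available.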

The handle invalidate event type is less important in the short term
because the userspace sharing daemon is designed to be resilient to
stale sharing state. However, if it is missing, sharing will become
progressively less effective as time goes on. The idea is that
the hypervisor communicates which sharing handles are no longer valid,
so that the sharing daemon only attempts to share pages in the
correct state. This would be a relatively high-volume event, but it
doesn't need to be accurate (i.e. events can be dropped if they are
not consumed quickly enough). As such, this event should be batch
delivered, in an asynchronous fashion.
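
A lossy, batch-delivered queue of the kind described above might look roughly like this. It is a sketch only: `InvalidateQueue`, `post` and `drain` are hypothetical names, and the drop-newest-when-full policy is an assumption (the thread only says events may be dropped).

```python
from collections import deque


class InvalidateQueue:
    """Bounded queue of invalidated sharing handles (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = deque()
        self.dropped = 0

    def post(self, handle):
        """Producer side: never blocks; drops the event if the queue is full."""
        if len(self.pending) >= self.capacity:
            self.dropped += 1
            return False
        self.pending.append(handle)
        return True

    def drain(self):
        """Consumer side: take everything queued so far as one batch."""
        batch = list(self.pending)
        self.pending.clear()
        return batch
```

Because the producer never waits for the consumer, a slow sharing daemon only loses freshness, which matches the point that the daemon is resilient to out-of-date sharing state.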

The OOM event is coded up in Xen, but it will not be consumed properly
in the pager. If I remember correctly, I didn't want to interfere with
the page-in events because the event interface assumed that mem-event
responses are inserted onto the ring in precisely the same order as
the requests. This may not be the case when we start mixing different
event types. WRT the handle invalidation, the relevant hooks exist
in Xen, and in the mem sharing daemon, but there is no way to
communicate events to two different consumers atm.

Since the requirements on the two different sharing event types are
substantially different, I think it may be easier if separate channels
(i.e. separate rings) were used to transfer them. This would also fix
the multiple consumers issue relatively easily. Of course you may know
of some other mem events that wouldn't fit in that scheme.
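
To illustrate the separate-channels idea, here is a minimal sketch with one queue per event type, so that each consumer sees only its own events and ordering is preserved within a type. `EventType`, `MemEventChannels`, `post` and `consume` are hypothetical names, and plain queues stand in for the shared rings.

```python
from collections import deque
from enum import Enum


class EventType(Enum):
    PAGE_IN = "page_in"
    OOM = "oom"
    HANDLE_INVALIDATE = "handle_invalidate"


class MemEventChannels:
    """One queue per event type (a stand-in for one ring per type)."""

    def __init__(self):
        self.rings = {etype: deque() for etype in EventType}

    def post(self, etype, payload):
        """Producer side: place the event on the ring for its type."""
        self.rings[etype].append(payload)

    def consume(self, etype):
        """Consumer side: pop the oldest event of this type, or None."""
        ring = self.rings[etype]
        return ring.popleft() if ring else None
```

With this split, the pager could keep assuming in-order responses on the page-in channel while the sharing daemon drains the invalidate channel at its own pace.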

I remember that there was someone working on external anti-virus
software, which prompted the whole mem-event work. I don't remember
his/her name or affiliation (could you remind me?), but maybe he/she
would be interested in working on some of this?

Thanks
Gregor

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Grzegorz Milos

2010-Jun-23 22:21 UTC

[Xen-devel] Re: mem-event interface

[From Patrick]

I think the idea of multiple rings is a good one. We'll register the
clients in Xen and when a mem_event is reached, we can just iterate
through the list of listeners to see who needs a notification.
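
As a rough illustration of registering clients and iterating over the listeners, something like the following could work. `ListenerRegistry`, `register` and `notify` are hypothetical names, and callbacks stand in for whatever per-client ring or event channel would really be used.

```python
from collections import defaultdict


class ListenerRegistry:
    """Maps an event type to the listeners that asked for it (hypothetical)."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def register(self, event_type, callback):
        """Record that a client wants notifications for this event type."""
        self._listeners[event_type].append(callback)

    def notify(self, event_type, payload):
        """Walk the listeners for this event type; return how many were notified."""
        delivered = 0
        for callback in self._listeners.get(event_type, []):
            callback(payload)
            delivered += 1
        return delivered
```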

The person working on the anti-virus stuff is Bryan Payne from Georgia
Tech. I've CCed him so we can get his input on this stuff as
well. It's better to hash out a proper interface now rather than
continually changing it around.


Patrick


Grzegorz Milos

2010-Jun-23 22:21 UTC

[Xen-devel] Re: mem-event interface

[From Bryan]

Bryan D. Payne, to Patrick, me, george.dunlap, Andrew, Steven
Jun 16 (7 days ago)
	
Patrick, thanks for the inclusion.

Since I'm coming in the middle of this discussion, forgive me if I've
missed something.  But is the idea here to create a more general
interface that could support various different types of memory events
+ notification?  And the two events listed below are just a subset of
the events that could / would be supported?

In general, I like the sound of where this is going but I would like
to see support for notification of events such as when a domU reads /
writes / execs a pre-specified byte(s) of memory.  As such, there
would need to be a notification path (as discussed below) and also a
control path to set up the memory regions that the user app cares
about.
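
One way to picture the control path is a "watch" that names a byte range and the access types of interest, which the notification path then checks on each access. This is a sketch only; `Access`, `Watch` and `matches` are hypothetical names and not part of any existing interface.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Access(Flag):
    READ = auto()
    WRITE = auto()
    EXEC = auto()


@dataclass(frozen=True)
class Watch:
    """A region of guest memory and the access types a client cares about."""
    start: int
    length: int
    access: Access

    def matches(self, addr, kind):
        """True if an access of type `kind` at `addr` should be reported."""
        in_range = self.start <= addr < self.start + self.length
        return in_range and bool(self.access & kind)
```

For example, `Watch(0x1000, 4, Access.READ | Access.WRITE)` would report a write to `0x1002` but not an exec at the same address.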

-bryan


Grzegorz Milos

2010-Jun-23 22:22 UTC

[Xen-devel] Re: mem-event interface

[From Patrick]
> Since I'm coming in the middle of this discussion, forgive me if I've
> missed something.  But is the idea here to create a more general
> interface that could support various different types of memory events
> + notification?  And the two events listed below are just a subset of
> the events that could / would be supported?

That's correct.

> In general, I like the sound of where this is going but I would like
> to see support for notification of events such as when a domU reads /
> writes / execs a pre-specified byte(s) of memory.  As such, there
> would need to be a notification path (as discussed below) and also a
> control path to setup the memory regions that the user app cares
> about.
Sub-page events are something I would like to have included as well.
Currently the control path is basically just "nominating" a page (for
either swapping or sharing). It's not entirely clear to me the best
way to go about this. With swapping and sharing we have code in Xen to
handle both cases. However, to just receive notifications (like
"read", "write", "execute") I don't think we need specialised support
(or at least just once to handle the notifications).

I'm thinking it might be good to have a daemon to handle these events
in user-space and register clients with the user-space daemon. Each
client would get a unique client ID which could be used to identify
who should get the response. This way, we could just register that
somebody is interested in that page (or byte, etc) and let the
user-space tool handle most of the complex logic (i.e. which of the
clients should that particular notification go to).

This requires some notion of priority for memory areas. For example,
if one client requests notification for access to a byte of page foo
and another requests notification for access to any of page foo, then
we only need Xen to store that it should notify for page foo and just
send along which byte(s) of the page were accessed as well; the
user-space daemon can then determine if both clients should be
notified or just the one. Similarly, if one client requests async
notification and another requests sync notification, then Xen only
needs to know to do sync notification.

What are everybody's thoughts on this? Does it seem reasonable or have
I gone completely mad?
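
The coalescing described above can be sketched as follows: the daemon keeps the fine-grained registrations, Xen would only need one entry per page (marked sync if any client asked for sync), and the daemon works out which clients a given access concerns. All names here (`MemEventDaemon`, `Registration`, `xen_state`, `dispatch`) are hypothetical, and a 4 KiB page size is assumed.

```python
from dataclasses import dataclass
from itertools import count

PAGE_SIZE = 4096  # assumption: 4 KiB pages


@dataclass(frozen=True)
class Registration:
    client_id: int
    pfn: int
    offset: int
    length: int
    sync: bool


class MemEventDaemon:
    """User-space daemon that fans one Xen notification out to its clients."""

    def __init__(self):
        self._ids = count(1)
        self._regs = []

    def add_client(self):
        """Hand out a unique client ID."""
        return next(self._ids)

    def watch(self, client_id, pfn, offset=0, length=PAGE_SIZE, sync=False):
        """Register interest in a byte range of a page (whole page by default)."""
        self._regs.append(Registration(client_id, pfn, offset, length, sync))

    def xen_state(self):
        """What Xen would store: one entry per page, sync if any client wants sync."""
        state = {}
        for reg in self._regs:
            state[reg.pfn] = state.get(reg.pfn, False) or reg.sync
        return state

    def dispatch(self, pfn, offset):
        """Given the page and byte that was accessed, list the clients to notify."""
        return sorted({
            r.client_id for r in self._regs
            if r.pfn == pfn and r.offset <= offset < r.offset + r.length
        })
```

If one client watches a single byte of a page synchronously and another watches the whole page asynchronously, `xen_state()` holds a single sync entry for that page, and `dispatch` decides per access whether both clients or only the whole-page client should hear about it.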


Patrick


Grzegorz Milos

2010-Jun-23 22:22 UTC

[Xen-devel] Re: mem-event interface

[From Bryan]
> needs to know to do sync notification). What's everybody thoughts on
> this? Does it seem reasonable or have I gone completely mad?
I like this idea as it keeps Xen as simple as possible and should also
help to reduce the number of notifications sent from Xen up to user
space (e.g., one notification to the daemon could then be pushed out
to multiple clients that care about it).

For what it's worth, I'd be happy to build such a daemon into
XenAccess.  This may be a logical place for it since XenAccess is
already doing address translations and such, so it would be easier for
a client app to specify an address range of interest as a virtual
address or physical address.  This would prevent the need to repeat
some of that address translation functionality in yet another library.

Alternatively, we could provide the daemon functionality in libxc or
some other Xen library and only provide support for low level
addresses (e.g., pfn + offset).  Then XenAccess could build on top of
that to offer higher level addresses (e.g., pa or va) using its
existing translation mechanisms.  This approach would more closely
mirror the current division of labor between XenAccess and libxc.

-bryan


Grzegorz Milos

2010-Jun-23 22:23 UTC

[Xen-devel] Re: mem-event interface

[From Patrick]
> I like this idea as it keeps Xen as simple as possible and should also
> help to reduce the number of notifications sent from Xen up to user
> space (e.g., one notification to the daemon could then be pushed out
> to multiple clients that care about it).
Yeah, that was my general thinking as well. So the immediate change to
the mem_event interface for this would be a way to specify sub-page
level stuff. The best way to approach this is probably by specifying a
start and end range (or more likely start address and size). This way
things like swapping and sharing would specify the start address of
the page they're interested in and PAGE_SIZE (or, more realistically,
there would be an additional lib call to do page-level stuff, which
would just take the pfn and do this translation under the hood).
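
A minimal sketch of the start-address-plus-size idea, with a page-level wrapper that takes a pfn and does the translation under the hood. The function names (`watch_range`, `watch_page`, `pages_covered`) are hypothetical, and 4 KiB pages are assumed.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


def watch_range(start, size):
    """Generic form: a region is a start address plus a size in bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return (start, size)


def watch_page(pfn):
    """Page-level convenience call: translate a pfn into (start, PAGE_SIZE)."""
    return watch_range(pfn << PAGE_SHIFT, PAGE_SIZE)


def pages_covered(start, size):
    """List the pfns that a byte range touches."""
    first = start >> PAGE_SHIFT
    last = (start + size - 1) >> PAGE_SHIFT
    return list(range(first, last + 1))
```

Swapping and sharing would only ever call the page-level wrapper, while sub-page clients pass an arbitrary range; a range that straddles a page boundary simply covers more than one pfn.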

> For what it's worth, I'd be happy to build such a daemon into
> XenAccess.  This may be a logical place for it since XenAccess is
> already doing address translations and such, so it would be easier for
> a client app to specify an address range of interest as a virtual
> address or physical address.  This would prevent the need to repeat
> some of that address translation functionality in yet another library.
>
> Alternatively, we could provide the daemon functionality in libxc or
> some other Xen library and only provide support for low level
> addresses (e.g., pfn + offset).  Then XenAccess could build on top of
> that to offer higher level addresses (e.g., pa or va) using its
> existing translation mechanisms.  This approach would more closely
> mirror the current division of labor between XenAccess and libxc.
This sounds good to me. I'd lean towards the second approach as I
think it's the better long-term solution. I'm a bit rusty on my
XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
translation code out into the library and have the mem_event daemon
use that? I do remember reading through and borrowing XenAccess code
(or at least the general mechanism) to do address translation stuff
for other projects, so it seems like having a general way to do that
would be a win. I think I did it with the CoW stuff, which I actually
want to port to the mem_event interface as well, both to have it
available and as another example of neat things we can do with the
interface.


Patrick

On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
<grzegorz.milos@gmail.com> wrote:> [From Bryan]
>
>> needs to know to do sync notification). What''s everybody
thoughts on
>> this? Does it seem reasonable or have I gone completely mad?
>
> I like this idea as it keeps Xen as simple as possible and should also
> help to reduce the number of notifications sent from Xen up to user
> space (e.g., one notification to the daemon could then be pushed out
> to multiple clients that care about it).
>
> For what it''s worth, I''d be happy to build such a daemon
into
> XenAccess.  This may be a logical place for it since XenAccess is
> already doing address translations and such, so it would be easier for
> a client app to specify an address range of interest as a virtual
> address or physical address.  This would prevent the need to repeat
> some of that address translation functionality in yet another library.
>
> Alternatively, we could provide the daemon functionality in libxc or
> some other Xen library and only provide support for low level
> addresses (e.g., pfn + offset).  Then XenAccess could build on top of
> that to offer higher level addresses (e.g., pa or va) using its
> existing translation mechanisms.  This approach would more closely
> mirror the current division of labor between XenAccess and libxc.
>
> -bryan
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Grzegorz Milos

2010-Jun-23 22:23 UTC

[Xen-devel] Re: mem-event interface

[From Bryan]

> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
> translation code out into the library and have the mem_event daemon
> use that? I do remember reading through and borrowing XenAccess code

This is certainly doable.  But if we decide to make a Xen library
depend on XenAccess, then it would make sense to include XenAccess as
part of the Xen distribution, IMHO.  This probably isn't too
unreasonable to consider, but we'd want to make sure that the
XenAccess configuration is either simplified or eliminated to avoid
causing headaches for the average person using this stuff.  Something
to think about...

-bryan

On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
<grzegorz.milos@gmail.com> wrote:
> [From Patrick]
>
>> I like this idea as it keeps Xen as simple as possible and should also
>> help to reduce the number of notifications sent from Xen up to user
>> space (e.g., one notification to the daemon could then be pushed out
>> to multiple clients that care about it).
>
> Yeah, that was my general thinking as well. So the immediate change to
> the mem_event interface for this would be a way to specify sub-page
> level stuff. The best way to approach this is probably by specifying a
> start and end range (or more likely start address and size). This way
> things like swapping and sharing would specify the start address of
> the page they're interested in and PAGE_SIZE (or, more realistically
> there would be an additional lib call to do page-level stuff, which
> would just take the pfn and do this translation under the hood).
>
>
>> For what it's worth, I'd be happy to build such a daemon into
>> XenAccess.  This may be a logical place for it since XenAccess is
>> already doing address translations and such, so it would be easier for
>> a client app to specify an address range of interest as a virtual
>> address or physical address.  This would prevent the need to repeat
>> some of that address translation functionality in yet another library.
>>
>> Alternatively, we could provide the daemon functionality in libxc or
>> some other Xen library and only provide support for low level
>> addresses (e.g., pfn + offset).  Then XenAccess could build on top of
>> that to offer higher level addresses (e.g., pa or va) using its
>> existing translation mechanisms.  This approach would more closely
>> mirror the current division of labor between XenAccess and libxc.
>
> This sounds good to me. I'd lean towards the second approach as I
> think it's the better long-term solution. I'm a bit rusty on my
> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
> translation code out into the library and have the mem_event daemon
> use that? I do remember reading through and borrowing XenAccess code
> (or at least the general mechanism) to do address translation stuff
> for other projects, so it seems like having a general way to do that
> would be a win. I think I did it with the CoW stuff, which I actually
> want to port to the mem_event interface as well, both to have it
> available and as another example of neat things we can do with the
> interface.
>
>
> Patrick
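
The start-address-plus-size registration and the user-space fan-out quoted above can be sketched roughly as follows. This is only an illustration under stated assumptions: 4 KiB pages, and a hypothetical `Dispatcher` class, `Interest` record and `page_interest` helper that do not exist in Xen, libxc or XenAccess. The `sync` flag reflects the sync/async point raised earlier in the thread.

```python
from dataclasses import dataclass

PAGE_SHIFT = 12                 # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


@dataclass(frozen=True)
class Interest:
    """One client's registration: a byte range given as start + size."""
    client_id: int
    start: int                  # guest physical address of the first byte
    size: int                   # length of the range in bytes
    sync: bool = False          # True if the client wants sync notification


def page_interest(client_id, pfn, sync=False):
    """Page-level helper: turn a pfn into a start/size registration."""
    return Interest(client_id, pfn << PAGE_SHIFT, PAGE_SIZE, sync)


class Dispatcher:
    """Hypothetical user-space daemon state (not an existing Xen API)."""

    def __init__(self):
        self.interests = []

    def register(self, interest):
        if interest.size <= 0:
            raise ValueError("size must be positive")
        self.interests.append(interest)

    def pages_for_xen(self):
        """Coarse view for the hypervisor: pfn -> True if any client is sync."""
        pages = {}
        for it in self.interests:
            first = it.start >> PAGE_SHIFT
            last = (it.start + it.size - 1) >> PAGE_SHIFT
            for pfn in range(first, last + 1):
                pages[pfn] = pages.get(pfn, False) or it.sync
        return pages

    def clients_for(self, addr, length=1):
        """Clients whose range overlaps the accessed bytes [addr, addr+length)."""
        end = addr + length
        return sorted({it.client_id for it in self.interests
                       if it.start < end and addr < it.start + it.size})


# Client 1 watches all of page 0x10; client 2 watches one byte in it, sync.
d = Dispatcher()
d.register(page_interest(1, 0x10))
d.register(Interest(2, 0x10040, 1, sync=True))
```

With these two registrations the hypervisor only needs to know that page 0x10 is watched synchronously, while the daemon decides per access whether one or both clients are notified.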

Grzegorz Milos

2010-Jun-23 22:24 UTC

[Xen-devel] Re: mem-event interface

[From Patrick]

I guess I'm more envisioning integrating all this with libxc and
having XenAccess et al. use that. Keeping it as a separate, VM
introspection library makes sense too. In any case, I think having
XenAccess as part of Xen is a good move. VM introspection is a useful
thing to have and I think a lot of projects could benefit from it.


Patrick

On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
<grzegorz.milos@gmail.com> wrote:
> [From Bryan]
>
>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
>> translation code out into the library and have the mem_event daemon
>> use that? I do remember reading through and borrowing XenAccess code
>
> This is certainly doable.  But if we decide to make a Xen library
> depend on XenAccess, then it would make sense to include XenAccess as
> part of the Xen distribution, IMHO.  This probably isn't too
> unreasonable to consider, but we'd want to make sure that the
> XenAccess configuration is either simplified or eliminated to avoid
> causing headaches for the average person using this stuff.  Something
> to think about...
>
> -bryan

Grzegorz Milos

2010-Jun-23 22:24 UTC

[Xen-devel] Re: mem-event interface

[From Bryan]

> I guess I'm more envisioning integrating all this with libxc and
> having XenAccess et al. use that. Keeping it as a separate, VM
> introspection library makes sense too. In any case, I think having
> XenAccess as part of Xen is a good move. VM introspection is a useful
> thing to have and I think a lot of projects could benefit from it.

From my experience, the address translations can actually be pretty
tricky.  This is a big chunk of what XenAccess does, and it requires
some memory analysis in the domU to find necessary page tables and
such.  So it may be more than you really want to add to libxc.  But if
you go down this route, then I could certainly simplify the XenAccess
code, so I wouldn't complain about that :-)

-bryan

On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos
<grzegorz.milos@gmail.com> wrote:
> [From Patrick]
>
> I guess I'm more envisioning integrating all this with libxc and
> having XenAccess et al. use that. Keeping it as a separate, VM
> introspection library makes sense too. In any case, I think having
> XenAccess as part of Xen is a good move. VM introspection is a useful
> thing to have and I think a lot of projects could benefit from it.
>
>
> Patrick
>
> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
> <grzegorz.milos@gmail.com> wrote:
>> [From Bryan]
>>
>>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
>>> translation code out into the library and have the mem_event daemon
>>> use that? I do remember reading through and borrowing XenAccess code
>>
>> This is certainly doable.  But if we decide to make a Xen library
>> depend on XenAccess, then it would make sense to include XenAccess as
>> part of the Xen distribution, IMHO.  This probably isn't too
>> unreasonable to consider, but we'd want to make sure that the
>> XenAccess configuration is either simplified or eliminated to avoid
>> causing headaches for the average person using this stuff.  Something
>> to think about...
>>
>> -bryan
>>
>> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
>> <grzegorz.milos@gmail.com> wrote:
>>> [From Patrick]
>>>
>>>> I like this idea as it keeps Xen as simple as possible and should also
>>>> help to reduce the number of notifications sent from Xen up to user
>>>> space (e.g., one notification to the daemon could then be pushed out
>>>> to multiple clients that care about it).
>>>
>>> Yeah, that was my general thinking as well. So the immediate change to
>>> the mem_event interface for this would be a way to specify sub-page
>>> level stuff. The best way to approach this is probably by specifying a
>>> start and end range (or more likely start address and size). This way
>>> things like swapping and sharing would specify the start address of
>>> the page they're interested in and PAGE_SIZE (or, more realistically
>>> there would be an additional lib call to do page-level stuff, which
>>> would just take the pfn and do this translation under the hood).
>>>
>>>
>>>> For what it's worth, I'd be happy to build such a daemon into
>>>> XenAccess.  This may be a logical place for it since XenAccess is
>>>> already doing address translations and such, so it would be easier for
>>>> a client app to specify an address range of interest as a virtual
>>>> address or physical address.  This would prevent the need to repeat
>>>> some of that address translation functionality in yet another
library.
>>>>
>>>> Alternatively, we could provide the daemon functionality in
libxc or
>>>> some other Xen library and only provide support for low level
>>>> addresses (e.g., pfn + offset).  Then XenAccess could build on
top of
>>>> that to offer higher level addresses (e.g., pa or va) using its
>>>> existing translation mechanisms.  This approach would more
closely
>>>> mirror the current division of labor between XenAccess and
libxc.
>>>
>>> This sounds good to me. I''d lean towards  the second
approach as I
>>> think it''s the better long-term solution. I''m a
bit rusty on my
>>> XenAccess, but how feasible is it to even move some of the
gva/pfn/mfn
>>> translation code out into the library and have the mem_event daemon
>>> use that? I do remember reading through and borrowing XenAccess
code
>>> (or at least the general mechanism) to do address translation stuff
>>> for other projects, so it seems like having a general way to do
that
>>> would be a win. I think I did it with the CoW stuff, which I
actually
>>> want to port to the mem_event interface as well, both to have it
>>> available and as another example of neat things we can do with the
>>> interface.
>>>
>>>
>>> Patrick
>>>
>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
>>> <grzegorz.milos@gmail.com> wrote:
>>>> [From Bryan]
>>>>
>>>>> needs to know to do sync notification). What''s
everybody thoughts on
>>>>> this? Does it seem reasonable or have I gone completely
mad?
>>>>
>>>> I like this idea as it keeps Xen as simple as possible and
should also
>>>> help to reduce the number of notifications sent from Xen up to
user
>>>> space (e.g., one notification to the daemon could then be
pushed out
>>>> to multiple clients that care about it).
>>>>
>>>> For what it''s worth, I''d be happy to build
such a daemon into
>>>> XenAccess.  This may be a logical place for it since XenAccess
is
>>>> already doing address translations and such, so it would be
easier for
>>>> a client app to specify an address range of interest as a
virtual
>>>> address or physical address.  This would prevent the need to
repeat
>>>> some of that address translation functionality in yet another
library.
>>>>
>>>> Alternatively, we could provide the daemon functionality in
libxc or
>>>> some other Xen library and only provide support for low level
>>>> addresses (e.g., pfn + offset).  Then XenAccess could build on
top of
>>>> that to offer higher level addresses (e.g., pa or va) using its
>>>> existing translation mechanisms.  This approach would more
closely
>>>> mirror the current division of labor between XenAccess and
libxc.
>>>>
>>>> -bryan
>>>>
>>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
>>>> <grzegorz.milos@gmail.com> wrote:
>>>>> [From Patrick]
>>>>>
>>>>>> Since I''m coming in the middle of this
discussion, forgive me if I''ve
>>>>>> missed something.  But is the idea here to create a
more general
>>>>>> interface that could support various different types of
memory events
>>>>>> + notification?  And the two events listed below are
just a subset of
>>>>>> the events that could / would be supported?
>>>>>
>>>>> That''s correct.
>>>>>
>>>>>
>>>>>> In general, I like the sound of where this is going but
I would like
>>>>>> to see support for notification of events such as when
a domU reads /
>>>>>> writes / execs a pre-specified byte(s) of memory.  As
such, there
>>>>>> would need to be a notification path (as discussed
below) and also a
>>>>>> control path to setup the memory regions that the user
app cares
>>>>>> about.
>>>>>
>>>>> Sub-page events is something I would like to have included
as well.
>>>>> Currently the control path is basically just
"nominating" a page (for
>>>>> either swapping or sharing). It''s not entirely
clear to me the best
>>>>> way to go about this. With swapping and sharing we have
code in Xen to
>>>>> handle both cases. However, to just receive notifications
(like
>>>>> "read", "write", "execute") I
don''t think we need specialised support
>>>>> (or at least just once to handle the notifications).
I''m thinking it
>>>>> might be good to have a daemon to handle these events in
user-space
>>>>> and register clients with the user-space daemon. Each
client would get
>>>>> a unique client ID which could be used to identify who
should get the
>>>>> response. This way, we could just register that somebody is
interested
>>>>> in that page (or byte, etc) and let the user-space tool
handle most of
>>>>> the complex logic (i.e. which of the clients should that
particular
>>>>> notification go to). This requires some notion of priority
for memory
>>>>> areas (e.g. if one client requests notification for access
to a byte
>>>>> of page foo and another requests notification for access to
any of
>>>>> page foo, then we only need Xen to store that it should
notify for
>>>>> page foo and just send along which byte(s) of the page were
accessed
>>>>> as well, then the user-space daemon can determine if both
clients
>>>>> should be notified or just the one) (e.g. if one client
requests async
>>>>> notification and another requests sync notification, then
Xen only
>>>>> needs to know to do sync notification). What''s
everybody thoughts on
>>>>> this? Does it seem reasonable or have I gone completely
mad?
>>>>>
>>>>>
>>>>> Patrick
>>>>>
>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos
>>>>> <grzegorz.milos@gmail.com> wrote:
>>>>>> [From Bryan]
>>>>>>
>>>>>> Bryan D. Payne
>>>>>>  to Patrick, me, george.dunlap, Andrew, Steven
>>>>>>
>>>>>> show details Jun 16 (7 days ago)
>>>>>>
>>>>>> Patrick, thanks for the inclusion.
>>>>>>
>>>>>> Since I''m coming in the middle of this
discussion, forgive me if I''ve
>>>>>> missed something.  But is the idea here to create a
more general
>>>>>> interface that could support various different types of
memory events
>>>>>> + notification?  And the two events listed below are
just a subset of
>>>>>> the events that could / would be supported?
>>>>>>
>>>>>> In general, I like the sound of where this is going but
I would like
>>>>>> to see support for notification of events such as when
a domU reads /
>>>>>> writes / execs a pre-specified byte(s) of memory.  As
such, there
>>>>>> would need to be a notification path (as discussed
below) and also a
>>>>>> control path to setup the memory regions that the user
app cares
>>>>>> about.
>>>>>>
>>>>>> -bryan
>>>>>>
>>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos
>>>>>> <grzegorz.milos@gmail.com> wrote:
>>>>>>> [From Patrick]
>>>>>>>
>>>>>>> I think the idea of multiple rings is a good one.
We''ll register the
>>>>>>> clients in Xen and when an mem_event is reached, we
can just iterate
>>>>>>> through the list of listeners to see who needs a
notification.
>>>>>>>
>>>>>>> The person working on the anti-virus stuff is Bryan
Payne from Georgia
>>>>>>> Tech. I''ve CCed him as well so we can get
his input on this stuff as
>>>>>>> well. It''s better to hash out a proper
interface now rather than
>>>>>>> continually changing it around.
>>>>>>>
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>> On Wed, Jun 23, 2010 at 11:19 PM, Grzegorz Milos
>>>>>>> <grzegorz.milos@gmail.com> wrote:
>>>>>>>> [From Gregor]
>>>>>>>>
>>>>>>>> There are two major events that the memory
sharing code needs to
>>>>>>>> communicate over the hypervisor/userspace
boundary:
>>>>>>>> 1. GFN unsharing failed due to lack of memory.
This will be called the
>>>>>>>> ''OOM event'' from now on.
>>>>>>>> 2. MFN is no longer sharable (actually an
opaque sharing handle would
>>>>>>>> be communicated instead of the MFN).
''Handle invalidate event'' from
>>>>>>>> now on.
>>>>>>>>
>>>>>>>> The requirements on the OOM event are
relatively similar to the
>>>>>>>> page-in event. The way this should operate is
that the faulting VCPU
>>>>>>>> is paused, and the pager is requested to free
up some memory. When it
>>>>>>>> does so, it should generate an appropriate
response, and wake up the
>>>>>>>> VCPU back again using a domctl. The event is
going to be low volume,
>>>>>>>> and since it is going to be handled
synchronously, likely in tens of
>>>>>>>> ms, there are no particular requirements on the
efficiency.
>>>>>>>>
>>>>>>>> Handle invalidate event type is less important
in the short term
>>>>>>>> because the userspace sharing daemon is
designed to be resilient to
>>>>>>>> unfresh sharing state. However, if it is
missing it will make the
>>>>>>>> sharing progressively less effective as time
goes on. The idea is that
>>>>>>>> the hypervisor communicates which sharing
handles are no longer valid,
>>>>>>>> such that the sharing daemon only attempts to
share pages in the
>>>>>>>> correct state. This would be relatively high
volume event, but it
>>>>>>>> doesn''t need to be accurate (i.e.
events can be dropped if they are
>>>>>>>> not consumed quickly enough). As such this
event should be batch
>>>>>>>> delivered, in an asynchronous fashion.
>>>>>>>>
>>>>>>>> The OOM event is coded up in Xen, but it will
not be consumed properly
>>>>>>>> in the pager. If I remember correctly, I
didn''t want to interfere with
>>>>>>>> the page-in events because the event interface
assumed that mem-event
>>>>>>>> responses are inserted onto the ring in
precisely the same order as
>>>>>>>> the requests. This may not be the case when we
start mixing different
>>>>>>>> event types. WRT to the handle invalidation,
the relevant hooks exist
>>>>>>>> in Xen, and in the mem sharing daemon, but
there is no way to
>>>>>>>> communicate events to two different consumers
atm.
>>>>>>>>
>>>>>>>> Since the requirements on the two different
sharing event types are
>>>>>>>> substantially different, I think it may be
easier if separate channels
>>>>>>>> (i.e. separate rings) were used to transfer
them. This would also fix
>>>>>>>> the multiple consumers issue relatively easily.
Of course you may know
>>>>>>>> of some other mem events that wouldn''t
fit in that scheme.
>>>>>>>>
>>>>>>>> I remember that there was someone working on an
external anti-virus
>>>>>>>> software, which prompted the whole mem-event
work. I don''t remember
>>>>>>>> his/hers name or affiliation (could you remind
me?), but maybe he/she
>>>>>>>> would be interested in working on some of this?
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Gregor
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Grzegorz Milos

2010-Jun-23 22:25 UTC

[Xen-devel] Re: mem-event interface

[From Patrick]

Ah. Well, as long as it's in its own library or API or whatever so
other applications can take advantage of it, then it's fine by me :)
libintrospec or something like that.


Patrick


On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos
<grzegorz.milos@gmail.com> wrote:
> [From Bryan]
>
>> I guess I'm more envisioning integrating all this with libxc and
>> having XenAccess et al. use that. Keeping it as a separate, VM
>> introspection library makes sense too. In any case, I think having
>> XenAccess as part of Xen is a good move. VM introspection is a useful
>> thing to have and I think a lot of projects could benefit from it.
>
> From my experience, the address translations can actually be pretty
> tricky.  This is a big chunk of what XenAccess does, and it requires
> some memory analysis in the domU to find necessary page tables and
> such.  So it may be more than you really want to add to libxc.  But if
> you go down this route, then I could certainly simplify the XenAccess
> code, so I wouldn't complain about that :-)
>
> -bryan

Grzegorz Milos

2010-Jun-23 22:25 UTC

[Xen-devel] Re: mem-event interface

[From Gregor]

Joining the discussion after a few days :), I think I agree with the
general design decision of how to split the code between Xen, libxc
and a separate lib.

However, I'm a bit wary about putting anything non-essential in libxc,
and it seems like the event demux might be quite complex and dependent
on the type of events you are handling. Therefore we don't want to end
up with a really complex daemon in libxc. Instead I think we should try
to make use of multiple rings in order to alleviate some of the demux
headaches (sharing related events would go to the memshr daemon
through one ring, paging to the pager through another, introspection
events to XenAccess etc.), and then do further demux in the relevant
daemon.
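
A minimal sketch of that split, to make the two demux levels concrete.
Everything here is hypothetical: the event type names, the ring names
and the classes are made up for illustration and are not the mem_event
code. A bounded queue stands in for a shared-memory ring. The
hypervisor side only picks a ring by event type; the finer dispatch
happens inside the daemon that owns the ring.

```python
from collections import deque

# Hypothetical event types and ring names; not real Xen identifiers.
RING_FOR_EVENT = {
    "share_oom": "memshr",
    "handle_invalidate": "memshr",
    "page_in": "pager",
    "mem_access": "introspection",
}


class Ring:
    """Bounded FIFO standing in for a shared-memory ring."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = deque()

    def put(self, event):
        if len(self.slots) >= self.capacity:
            return False  # ring full; the caller decides what to do
        self.slots.append(event)
        return True

    def get(self):
        return self.slots.popleft() if self.slots else None


class Hypervisor:
    """Coarse demux only: choose a ring from the event type."""

    def __init__(self, capacity=8):
        names = set(RING_FOR_EVENT.values())
        self.rings = {name: Ring(capacity) for name in names}

    def raise_event(self, event_type, payload):
        ring = self.rings[RING_FOR_EVENT[event_type]]
        return ring.put({"type": event_type, "payload": payload})


def daemon_step(ring, handlers):
    """Fine demux inside one daemon: dispatch on the event type."""
    event = ring.get()
    if event is None:
        return None
    return handlers[event["type"]](event["payload"])
```

With this layout a page-in event never shows up on the memshr ring, so
each daemon only needs handlers for its own event types.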

This could potentially introduce some inefficiencies (e.g. one memory
access could generate multiple events), and could cause the daemons to
step on each other's toes, but I don't think that's going to be a
problem in practice, because the types of events we are interested in
intercepting at the moment seem to be disjoint enough.

Also, the handling of sync vs. async events, as well as supporting
batching and out-of-order replies, may already be complex enough
without having to worry about demultiplexing ;). So let's do
things in small steps. I think the priority should be teaching Xen to
handle multiple rings (the last time I looked at the mem_event code it
couldn't). What do you think?
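
To show why sync and async events want different channels, here is a
rough sketch under stated assumptions. The class and field names are
hypothetical and this is not how mem_event is implemented. The sync
channel tags each request with an id, so replies can come back in any
order. The async channel is lossy and drained in batches, which matches
the earlier point that handle-invalidate events may be dropped.

```python
from collections import deque
from itertools import count


class SyncChannel:
    """Sync events: a paused VCPU waits for the reply to its request."""

    def __init__(self):
        self._next_id = count(1)
        self.pending = {}  # request id -> paused vcpu

    def request(self, vcpu):
        req_id = next(self._next_id)
        self.pending[req_id] = vcpu  # vcpu stays paused until a reply
        return req_id

    def respond(self, req_id):
        # Replies are matched by id, so they may arrive in any order.
        return self.pending.pop(req_id)  # the vcpu to unpause


class AsyncChannel:
    """Async events: lossy, consumed in batches."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()
        self.dropped = 0

    def post(self, event):
        if len(self.queue) >= self.capacity:
            self.dropped += 1  # acceptable for advisory events
            return
        self.queue.append(event)

    def drain(self):
        batch = list(self.queue)
        self.queue.clear()
        return batch
```

Matching by id removes the assumption that responses are put on the
ring in the same order as the requests, at the cost of carrying one
extra field per request.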

Thanks
Gregor


On Wed, Jun 23, 2010 at 11:25 PM, Grzegorz Milos
<grzegorz.milos@gmail.com> wrote:
> [From Patrick]
>
> Ah. Well, as long as it's in its own library or API or whatever so
> other applications can take advantage of it, then it's fine by me :)
> libintrospec or something like that.
>
>
> Patrick
>
>
> On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos
> <grzegorz.milos@gmail.com> wrote:
>> [From Bryan]
>>
>>> I guess I'm more envisioning integrating all this with libxc and
>>> having XenAccess et al. use that. Keeping it as a separate, VM
>>> introspection library makes sense too. In any case, I think having
>>> XenAccess as part of Xen is a good move. VM introspection is a useful
>>> thing to have and I think a lot of projects could benefit from it.
>>
>> From my experience, the address translations can actually be pretty
>> tricky.  This is a big chunk of what XenAccess does, and it requires
>> some memory analysis in the domU to find necessary page tables and
>> such.  So it may be more than you really want to add to libxc.  But if
>> you go down this route, then I could certainly simplify the XenAccess
>> code, so I wouldn't complain about that :-)
>>
>> -bryan
>>
>> On Wed, Jun 23, 2010 at 11:24 PM, Grzegorz Milos
>> <grzegorz.milos@gmail.com> wrote:
>>> [From Patrick]
>>>
>>> I guess I'm more envisioning integrating all this with libxc and
>>> having XenAccess et al. use that. Keeping it as a separate, VM
>>> introspection library makes sense too. In any case, I think having
>>> XenAccess as part of Xen is a good move. VM introspection is a useful
>>> thing to have and I think a lot of projects could benefit from it.
>>>
>>>
>>> Patrick
>>>
>>> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
>>> <grzegorz.milos@gmail.com> wrote:
>>>> [From Bryan]
>>>>
>>>>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
>>>>> translation code out into the library and have the mem_event daemon
>>>>> use that? I do remember reading through and borrowing XenAccess code
>>>>
>>>> This is certainly doable.  But if we decide to make a Xen library
>>>> depend on XenAccess, then it would make sense to include XenAccess as
>>>> part of the Xen distribution, IMHO.  This probably isn't too
>>>> unreasonable to consider, but we'd want to make sure that the
>>>> XenAccess configuration is either simplified or eliminated to avoid
>>>> causing headaches for the average person using this stuff.  Something
>>>> to think about...
>>>>
>>>> -bryan
>>>>
>>>> On Wed, Jun 23, 2010 at 11:23 PM, Grzegorz Milos
>>>> <grzegorz.milos@gmail.com> wrote:
>>>>> [From Patrick]
>>>>>
>>>>>> I like this idea as it keeps Xen as simple as possible and should also
>>>>>> help to reduce the number of notifications sent from Xen up to user
>>>>>> space (e.g., one notification to the daemon could then be pushed out
>>>>>> to multiple clients that care about it).
>>>>>
>>>>> Yeah, that was my general thinking as well. So the immediate change to
>>>>> the mem_event interface for this would be a way to specify sub-page
>>>>> level stuff. The best way to approach this is probably by specifying a
>>>>> start and end range (or more likely start address and size). This way
>>>>> things like swapping and sharing would specify the start address of
>>>>> the page they're interested in and PAGE_SIZE (or, more realistically
>>>>> there would be an additional lib call to do page-level stuff, which
>>>>> would just take the pfn and do this translation under the hood).
>>>>>
>>>>>
>>>>>> For what it's worth, I'd be happy to build such a daemon into
>>>>>> XenAccess.  This may be a logical place for it since XenAccess is
>>>>>> already doing address translations and such, so it would be easier for
>>>>>> a client app to specify an address range of interest as a virtual
>>>>>> address or physical address.  This would prevent the need to repeat
>>>>>> some of that address translation functionality in yet another library.
>>>>>>
>>>>>> Alternatively, we could provide the daemon functionality in libxc or
>>>>>> some other Xen library and only provide support for low level
>>>>>> addresses (e.g., pfn + offset).  Then XenAccess could build on top of
>>>>>> that to offer higher level addresses (e.g., pa or va) using its
>>>>>> existing translation mechanisms.  This approach would more closely
>>>>>> mirror the current division of labor between XenAccess and libxc.
>>>>>
>>>>> This sounds good to me. I'd lean towards the second approach as I
>>>>> think it's the better long-term solution. I'm a bit rusty on my
>>>>> XenAccess, but how feasible is it to even move some of the gva/pfn/mfn
>>>>> translation code out into the library and have the mem_event daemon
>>>>> use that? I do remember reading through and borrowing XenAccess code
>>>>> (or at least the general mechanism) to do address translation stuff
>>>>> for other projects, so it seems like having a general way to do that
>>>>> would be a win. I think I did it with the CoW stuff, which I actually
>>>>> want to port to the mem_event interface as well, both to have it
>>>>> available and as another example of neat things we can do with the
>>>>> interface.
>>>>>
>>>>>
>>>>> Patrick
>>>>>
>>>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
>>>>> <grzegorz.milos@gmail.com> wrote:
>>>>>> [From Bryan]
>>>>>>
>>>>>>> needs to know to do sync notification).
What''s everybody thoughts on
>>>>>>> this? Does it seem reasonable or have I gone
completely mad?
>>>>>>
>>>>>> I like this idea as it keeps Xen as simple as possible
and should also
>>>>>> help to reduce the number of notifications sent from
Xen up to user
>>>>>> space (e.g., one notification to the daemon could then
be pushed out
>>>>>> to multiple clients that care about it).
>>>>>>
>>>>>> For what it''s worth, I''d be happy to
build such a daemon into
>>>>>> XenAccess.  This may be a logical place for it since
XenAccess is
>>>>>> already doing address translations and such, so it
would be easier for
>>>>>> a client app to specify an address range of interest as
a virtual
>>>>>> address or physical address.  This would prevent the
need to repeat
>>>>>> some of that address translation functionality in yet
another library.
>>>>>>
>>>>>> Alternatively, we could provide the daemon
functionality in libxc or
>>>>>> some other Xen library and only provide support for low
level
>>>>>> addresses (e.g., pfn + offset).  Then XenAccess could
build on top of
>>>>>> that to offer higher level addresses (e.g., pa or va)
using its
>>>>>> existing translation mechanisms.  This approach would
more closely
>>>>>> mirror the current division of labor between XenAccess
and libxc.
>>>>>>
>>>>>> -bryan
>>>>>>
>>>>>> On Wed, Jun 23, 2010 at 11:22 PM, Grzegorz Milos
>>>>>> <grzegorz.milos@gmail.com> wrote:
>>>>>>> [From Patrick]
>>>>>>>
>>>>>>>> Since I''m coming in the middle of this
discussion, forgive me if I''ve
>>>>>>>> missed something.  But is the idea here to
create a more general
>>>>>>>> interface that could support various different
types of memory events
>>>>>>>> + notification?  And the two events listed
below are just a subset of
>>>>>>>> the events that could / would be supported?
>>>>>>>
>>>>>>> That''s correct.
>>>>>>>
>>>>>>>
>>>>>>>> In general, I like the sound of where this is
going but I would like
>>>>>>>> to see support for notification of events such
as when a domU reads /
>>>>>>>> writes / execs a pre-specified byte(s) of
memory.  As such, there
>>>>>>>> would need to be a notification path (as
discussed below) and also a
>>>>>>>> control path to setup the memory regions that
the user app cares
>>>>>>>> about.
>>>>>>>
>>>>>>> Sub-page events is something I would like to have
included as well.
>>>>>>> Currently the control path is basically just
"nominating" a page (for
>>>>>>> either swapping or sharing). It''s not
entirely clear to me the best
>>>>>>> way to go about this. With swapping and sharing we
have code in Xen to
>>>>>>> handle both cases. However, to just receive
notifications (like
>>>>>>> "read", "write",
"execute") I don''t think we need specialised support
>>>>>>> (or at least just once to handle the
notifications). I''m thinking it
>>>>>>> might be good to have a daemon to handle these
events in user-space
>>>>>>> and register clients with the user-space daemon.
Each client would get
>>>>>>> a unique client ID which could be used to identify
who should get the
>>>>>>> response. This way, we could just register that
somebody is interested
>>>>>>> in that page (or byte, etc) and let the user-space
tool handle most of
>>>>>>> the complex logic (i.e. which of the clients should
that particular
>>>>>>> notification go to). This requires some notion of
priority for memory
>>>>>>> areas (e.g. if one client requests notification for
access to a byte
>>>>>>> of page foo and another requests notification for
access to any of
>>>>>>> page foo, then we only need Xen to store that it
should notify for
>>>>>>> page foo and just send along which byte(s) of the
page were accessed
>>>>>>> as well, then the user-space daemon can determine
if both clients
>>>>>>> should be notified or just the one) (e.g. if one
client requests async
>>>>>>> notification and another requests sync
notification, then Xen only
>>>>>>> needs to know to do sync notification).
What''s everybody thoughts on
>>>>>>> this? Does it seem reasonable or have I gone
completely mad?
>>>>>>>
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz Milos
>>>>>>> <grzegorz.milos@gmail.com> wrote:
>>>>>>>> [From Bryan]
>>>>>>>>
>>>>>>>> Bryan D. Payne
>>>>>>>>  to Patrick, me, george.dunlap, Andrew, Steven
>>>>>>>>
>>>>>>>> show details Jun 16 (7 days ago)
>>>>>>>>
>>>>>>>> Patrick, thanks for the inclusion.
>>>>>>>>
>>>>>>>> Since I''m coming in the middle of this
discussion, forgive me if I''ve
>>>>>>>> missed something.  But is the idea here to
create a more general
>>>>>>>> interface that could support various different
types of memory events
>>>>>>>> + notification?  And the two events listed
below are just a subset of
>>>>>>>> the events that could / would be supported?
>>>>>>>>
>>>>>>>> In general, I like the sound of where this is
going but I would like
>>>>>>>> to see support for notification of events such
as when a domU reads /
>>>>>>>> writes / execs a pre-specified byte(s) of
memory.  As such, there
>>>>>>>> would need to be a notification path (as
discussed below) and also a
>>>>>>>> control path to setup the memory regions that
the user app cares
>>>>>>>> about.
>>>>>>>>
>>>>>>>> -bryan
>>>>>>>>
>>>>>>>> On Wed, Jun 23, 2010 at 11:21 PM, Grzegorz
Milos
>>>>>>>> <grzegorz.milos@gmail.com> wrote:
>>>>>>>>> [From Patrick]
>>>>>>>>>
>>>>>>>>> I think the idea of multiple rings is a
good one. We''ll register the
>>>>>>>>> clients in Xen and when an mem_event is
reached, we can just iterate
>>>>>>>>> through the list of listeners to see who
needs a notification.
>>>>>>>>>
>>>>>>>>> The person working on the anti-virus stuff
is Bryan Payne from Georgia
>>>>>>>>> Tech. I''ve CCed him as well so we
can get his input on this stuff as
>>>>>>>>> well. It''s better to hash out a
proper interface now rather than
>>>>>>>>> continually changing it around.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Patrick
>>>>>>>>>
>>>>>>>>> On Wed, Jun 23, 2010 at 11:19 PM, Grzegorz
Milos
>>>>>>>>> <grzegorz.milos@gmail.com> wrote:
>>>>>>>>>> [From Gregor]
>>>>>>>>>>
>>>>>>>>>> There are two major events that the
memory sharing code needs to
>>>>>>>>>> communicate over the
hypervisor/userspace boundary:
>>>>>>>>>> 1. GFN unsharing failed due to lack of
memory. This will be called the
>>>>>>>>>> ''OOM event'' from now
on.
>>>>>>>>>> 2. MFN is no longer sharable (actually
an opaque sharing handle would
>>>>>>>>>> be communicated instead of the MFN).
''Handle invalidate event'' from
>>>>>>>>>> now on.
>>>>>>>>>>
>>>>>>>>>> The requirements on the OOM event are
relatively similar to the
>>>>>>>>>> page-in event. The way this should
operate is that the faulting VCPU
>>>>>>>>>> is paused, and the pager is requested
to free up some memory. When it
>>>>>>>>>> does so, it should generate an
appropriate response, and wake up the
>>>>>>>>>> VCPU back again using a domctl. The
event is going to be low volume,
>>>>>>>>>> and since it is going to be handled
synchronously, likely in tens of
>>>>>>>>>> ms, there are no particular
requirements on the efficiency.
>>>>>>>>>>
>>>>>>>>>> Handle invalidate event type is less
important in the short term
>>>>>>>>>> because the userspace sharing daemon is
designed to be resilient to
>>>>>>>>>> unfresh sharing state. However, if it
is missing it will make the
>>>>>>>>>> sharing progressively less effective as
time goes on. The idea is that
>>>>>>>>>> the hypervisor communicates which
sharing handles are no longer valid,
>>>>>>>>>> such that the sharing daemon only
attempts to share pages in the
>>>>>>>>>> correct state. This would be relatively
high volume event, but it
>>>>>>>>>> doesn''t need to be accurate
(i.e. events can be dropped if they are
>>>>>>>>>> not consumed quickly enough). As such
this event should be batch
>>>>>>>>>> delivered, in an asynchronous fashion.
>>>>>>>>>>
>>>>>>>>>> The OOM event is coded up in Xen, but
it will not be consumed properly
>>>>>>>>>> in the pager. If I remember correctly,
I didn''t want to interfere with
>>>>>>>>>> the page-in events because the event
interface assumed that mem-event
>>>>>>>>>> responses are inserted onto the ring in
precisely the same order as
>>>>>>>>>> the requests. This may not be the case
when we start mixing different
>>>>>>>>>> event types. WRT to the handle
invalidation, the relevant hooks exist
>>>>>>>>>> in Xen, and in the mem sharing daemon,
but there is no way to
>>>>>>>>>> communicate events to two different
consumers atm.
>>>>>>>>>>
>>>>>>>>>> Since the requirements on the two
different sharing event types are
>>>>>>>>>> substantially different, I think it may
be easier if separate channels
>>>>>>>>>> (i.e. separate rings) were used to
transfer them. This would also fix
>>>>>>>>>> the multiple consumers issue relatively
easily. Of course you may know
>>>>>>>>>> of some other mem events that
wouldn''t fit in that scheme.
>>>>>>>>>>
>>>>>>>>>> I remember that there was someone
working on an external anti-virus
>>>>>>>>>> software, which prompted the whole
mem-event work. I don''t remember
>>>>>>>>>> his/hers name or affiliation (could you
remind me?), but maybe he/she
>>>>>>>>>> would be interested in working on some
of this?
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Gregor
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Dan Magenheimer

2010-Jun-23 23:12 UTC

RE: [Xen-devel] Re: mem-event interface

Hi Gregor --

I assume you are posting this offlist discussion for
participation and feedback.  You moved quickly from
claiming a vague need into very specific mechanisms,
so pardon me if I need to take a step back.  The
page sharing code was added very quickly to xen-unstable
last year without (afaict) much review or iteration,
so there are probably other developers who could use some
additional background.  I appreciate that you are
moving this phase into open discussion!

I gather the 'OOM event' occurs when a guest tries to write
to memory on a page that it thinks it owns, but the page
is actually transparently shared.  As a result, the
write must fail and instead some hypervisor swapping
activity must occur, apparently driven by a userland
process in dom0 to some swap disks that are configured
and owned by dom0?  If this is correct, why is it
necessary for address/sub-page/translation information
to be included in the event... it is likely that it
won't be this specific page that is swapped out,
correct?

I'm not clear on why/when the "handle invalidate" event
might occur.  Could you explain more?

I still have to raise a general objection to hypervisor
swapping in any real world workload.  The VMware users I've
talked to hate it and turn off page sharing because of it.
While there are definitely some workloads where page
sharing can have a huge advantage (essentially by being so
homogeneous and "static" across many guests as to avoid
any swapping), it is not widely used because of swapping.

I had vaguely thought you had managed to avoid the worst
of the swapping problems but I don't recall why/how...
and I had thought that any swapping that did exist was
solved by the page sharing code as submitted, but
never had a chance to dig deeper.  I gather I was
wrong and this discussion is the next step toward making
page sharing functional in real world corner cases?
(I have had questions about page sharing in 4.0 and
have said, basically, I don't know and, since we are
not shipping a 4.0-based hypervisor yet, we will
have to wait and see.)

Thanks,
Dan
> -----Original Message-----
> From: Grzegorz Milos [mailto:grzegorz.milos@gmail.com]
> Sent: Wednesday, June 23, 2010 4:19 PM
> To: Xen-Devel (E-mail); george.dunlap@eu.citrix.com; Andrew Peace;
> Steven Hand; Patrick Colp; Bryan D. Payne
> Subject: [Xen-devel] Re: mem-event interface
> 
> [From Gregor]
> 
> There are two major events that the memory sharing code needs to
> communicate over the hypervisor/userspace boundary:
> 1. GFN unsharing failed due to lack of memory. This will be called the
> 'OOM event' from now on.
> 2. MFN is no longer sharable (actually an opaque sharing handle would
> be communicated instead of the MFN). 'Handle invalidate event' from
> now on.
> 
> The requirements on the OOM event are relatively similar to the
> page-in event. The way this should operate is that the faulting VCPU
> is paused, and the pager is requested to free up some memory. When it
> does so, it should generate an appropriate response, and wake up the
> VCPU back again using a domctl. The event is going to be low volume,
> and since it is going to be handled synchronously, likely in tens of
> ms, there are no particular requirements on the efficiency.
> 
> Handle invalidate event type is less important in the short term
> because the userspace sharing daemon is designed to be resilient to
> unfresh sharing state. However, if it is missing it will make the
> sharing progressively less effective as time goes on. The idea is that
> the hypervisor communicates which sharing handles are no longer valid,
> such that the sharing daemon only attempts to share pages in the
> correct state. This would be relatively high volume event, but it
> doesn't need to be accurate (i.e. events can be dropped if they are
> not consumed quickly enough). As such this event should be batch
> delivered, in an asynchronous fashion.
> 
> The OOM event is coded up in Xen, but it will not be consumed properly
> in the pager. If I remember correctly, I didn't want to interfere with
> the page-in events because the event interface assumed that mem-event
> responses are inserted onto the ring in precisely the same order as
> the requests. This may not be the case when we start mixing different
> event types. WRT to the handle invalidation, the relevant hooks exist
> in Xen, and in the mem sharing daemon, but there is no way to
> communicate events to two different consumers atm.
> 
> Since the requirements on the two different sharing event types are
> substantially different, I think it may be easier if separate channels
> (i.e. separate rings) were used to transfer them. This would also fix
> the multiple consumers issue relatively easily. Of course you may know
> of some other mem events that wouldn't fit in that scheme.
> 
> I remember that there was someone working on an external anti-virus
> software, which prompted the whole mem-event work. I don't remember
> his/hers name or affiliation (could you remind me?), but maybe he/she
> would be interested in working on some of this?
> 
> Thanks
> Gregor
> 

Tim Deegan

2010-Jun-24 09:18 UTC

Re: [Xen-devel] Re: mem-event interface

At 23:25 +0100 on 23 Jun (1277335526), Grzegorz Milos wrote:
> However, I'm a bit wary about putting anything non-essential in libxc,
> and it seems like the event demux might be quite complex and dependant
> on the type of events you are handling. Therefore we don't want to end
> up with really complex daemon in libxc. Instead I think we should try
> to make use of multiple rings in order to alleviate some of the demux
> headaches (sharing related events would go to the memshr daemon
> through one ring, paging to the pager through another, introspection
> events to XenAccess etc.), and then do further demux in the relevant
> daemon.

I agree that multiple rings are a good idea here - especially if we want
to disaggregate and have event handlers in multiple domains.

Maybe the ring-registering interface could take a type and a rangeset -
that would reduce the amount of extra chatter at the cost of some more
overhead in Xen.
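
As a rough illustration of the "type and a rangeset" registration
suggested above, here is a minimal C sketch. Every name in it
(mem_event_type, gfn_range, ring_reg, ring_wants, route_event) is
hypothetical and invented for this example; none of it is taken from
the Xen tree, and a plain array of ranges stands in for Xen's real
rangeset code. It only shows the shape of the idea: each ring declares
one event type plus the frame ranges it cares about, and the
hypervisor side filters on both before queueing anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event classes; the names are illustrative, not Xen's. */
enum mem_event_type { MEM_EVENT_PAGING, MEM_EVENT_SHARING, MEM_EVENT_ACCESS };

/* Inclusive range of guest frame numbers a listener cares about. */
struct gfn_range { uint64_t start, end; };

/* One registered ring: an event type plus a small set of ranges. */
struct ring_reg {
    enum mem_event_type type;
    const struct gfn_range *ranges;
    size_t nr_ranges;
};

/* True if this ring should receive an event of `type` on `gfn`. */
static bool ring_wants(const struct ring_reg *r,
                       enum mem_event_type type, uint64_t gfn)
{
    if (r->type != type)
        return false;
    for (size_t i = 0; i < r->nr_ranges; i++)
        if (gfn >= r->ranges[i].start && gfn <= r->ranges[i].end)
            return true;
    return false;
}

/* Store the indices of matching rings in `out`; return how many matched
 * (capped at out_cap). */
static size_t route_event(const struct ring_reg *rings, size_t nr_rings,
                          enum mem_event_type type, uint64_t gfn,
                          size_t *out, size_t out_cap)
{
    size_t n = 0;

    assert(out != NULL || out_cap == 0);
    for (size_t i = 0; i < nr_rings && n < out_cap; i++)
        if (ring_wants(&rings[i], type, gfn))
            out[n++] = i;
    return n;
}
```

The trade-off is the one named above: the filtering loop runs inside
Xen on every event (more overhead there), but rings whose type or
ranges do not match never see the event (less chatter towards user
space).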
> This could potentially introduce some inefficiencies (e.g. one memory
> access could generate multiple events), and could cause the daemons to
> step on each other toes, but I don't think that's going to be a
> problem in practice, because the types of events we are interested in
> intercepting at the moment seem to be disjoint enough.
> 
> Also, the complexity of handling sync vs. async events, as well as
> supporting batching and out-of-order replies, may already be complex
> enough without having to worry about demultiplexing ;). So let's do
> things in small steps. I think the priority should be teaching Xen to
> handle multiple rings (the last time I looked at the mem_event code it
> couldn't). What do you think?
> 
> Thanks
> Gregor
-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


Tim Deegan

2010-Jun-24 09:26 UTC


Re: [Xen-devel] Re: mem-event interface

At 00:12 +0100 on 24 Jun (1277338337), Dan Magenheimer wrote:
> I gather the 'OOM event' occurs when a guest tries to write
> to memory on a page that it thinks it owns, but the page
> is actually transparently shared.  As a result, the
> write must fail and instead some hypervisor swapping
> activity must occur, apparently driven by a userland
> process in dom0 to some swap disks that are configured
> and owned by dom0?  If this is correct, why is it
> necessary for address/sub-page/translation information
> to be included in the event... it is likely that it
> won't be this specific page that is swapped out,
> correct?

It would be nice to use the same 'memory event' interface for other
things (like out-of-domain virus scanners and other security stuff) that
might want to operate on sub-page areas.  I think it's a good idea to
put that in the interface now even if the initial users (sharing and
swapping) only operate on pages.

So though in this case the address information isn't very useful, in
general it lets us punt policy decisions into the tools, which I like.

> I'm not clear on why/when the "handle invalidate" event
> might occur.  Could you explain more?
> 
> I still have to raise a general objection to hypervisor
> swapping in any real world workload.  The VMware users I've
> talked to hate it and turn off page sharing because of it.

Agreed - but there should be more, and more useful, clients of the same
interface.

> While there are definitely some workloads where page
> sharing can have a huge advantage (essentially by being so
> homogeneous and "static" across many guests as to avoid
> any swapping), it is not widely used because of swapping.
> 
> I had vaguely thought you had managed to avoid the worst
> of the swapping problems but I don't recall why/how...
> and I had thought that any swapping that did exist was
> solved by the page sharing code as submitted, but
> never had a chance to dig deeper.  I gather I was
> wrong and this discussion is the next step toward making
> page sharing functional in real world corner cases?
> (I have had questions about page sharing in 4.0 and
> have said, basically, I don't know and, since we are
> not shipping a 4.0-based hypervisor yet, we will
> have to wait and see.)
> 
> Thanks,
> Dan
> 
-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)


George Dunlap

2010-Jun-24 11:09 UTC


Re: [Xen-devel] Re: mem-event interface

On 24/06/10 00:12, Dan Magenheimer wrote:
> The page sharing code was added very quickly to xen-unstable
> last year without (afaict) much review or iteration,
> so there's probably other developers that could use some
> additional background.  I appreciate that you are
> moving this phase into open discussion!

It's very easy for those of us in Citrix or connected to Citrix (i.e., 
the Cambridge or UBC computer labs) to make design decisions in off-list 
discussions, effectively presenting them to the list as nearly finished 
designs, with little opportunity for community involvement.

I think we all recognize that this is Not A Good Thing (TM), so if other
community members outside the Citrix connections notice this happening,
please feel free to remind us to be more open with discussions. :-)

  -George


Grzegorz Milos

2010-Jun-27 15:30 UTC


Re: [Xen-devel] Re: mem-event interface

Tim has already answered some of your questions, let me fill in the gaps.
> I assume you are posting this offlist discussion for
> participation and feedback.  You moved quickly from
> claiming a vague need into very specific mechanisms,
> so pardon me if I need to take a step back.  The
> page sharing code was added very quickly to xen-unstable
> last year without (afaict) much review or iteration,
> so there's probably other developers that could use some
> additional background.  I appreciate that you are
> moving this phase into open discussion!

The code got dropped into Xen quite quickly (partly) because I was
leaving Citrix, and therefore we wanted to make the code available
before I was gone. While I cannot work on it full time any more, I
want to contribute some of my time to the project, to make it more
usable, document it and smooth out the rough edges.

> I gather the 'OOM event' occurs when a guest tries to write
> to memory on a page that it thinks it owns, but the page
> is actually transparently shared.  As a result, the
> write must fail and instead some hypervisor swapping
> activity must occur, apparently driven by a userland
> process in dom0 to some swap disks that are configured
> and owned by dom0?  If this is correct, why is it
> necessary for address/sub-page/translation information
> to be included in the event... it is likely that it
> won't be this specific page that is swapped out,
> correct?

To reiterate Tim's point: the mem-event interface is supposed to be
general enough to send a bunch of different memory management events
through. OOM events are about particular domains, and not about
specific frames/pages. Similarly, sub-page access events should also
be supported.
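
As an illustration of that generality, here is a minimal sketch of one event record that can describe both a domain-wide event (OOM) and an address-specific one (sub-page access). All names here (`MemEventKind`, `MemEvent`, the field names) are hypothetical and are not taken from the Xen sources; the sketch only assumes that an event carries a kind, a domain, and an optional guest-physical byte range.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class MemEventKind(Enum):
    # Hypothetical event kinds, named after events discussed in this thread.
    PAGE_IN = "page_in"
    OOM = "oom"
    SUBPAGE_ACCESS = "subpage_access"


@dataclass(frozen=True)
class MemEvent:
    kind: MemEventKind
    domain_id: int
    # Guest-physical byte address; None when the event concerns a whole domain.
    gpa: Optional[int] = None
    # Length in bytes of the affected area; None when gpa is None.
    length: Optional[int] = None

    def is_domain_wide(self) -> bool:
        # An event without an address is about the domain, not a frame/page.
        return self.gpa is None


# An OOM event names only the domain; a sub-page event names one byte.
oom = MemEvent(MemEventKind.OOM, domain_id=3)
access = MemEvent(MemEventKind.SUBPAGE_ACCESS, domain_id=3, gpa=0xF000, length=1)
```

Keeping the address optional lets the same record format serve consumers that care about pages, sub-page areas, or neither.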
> I'm not clear on why/when the "handle invalidate" event
> might occur.  Could you explain more?
This is specific to memory sharing, it means that a particular memory
frame (represented by an opaque sharing handle) is no longer sharable.
> I still have to raise a general objection to hypervisor
> swapping in any real world workload.  The VMware users I've
> talked to hate it and turn off page sharing because of it.
> While there are definitely some workloads where page
> sharing can have a huge advantage (essentially by being so
> homogeneous and "static" across many guests as to avoid
> any swapping), it is not widely used because of swapping.
I guess this is out of scope of this particular email thread. But to
shed some light on it, the extra memory gained through sharing can be
used in several different ways. Some of them will require the
safeguards against OOM, which implies paging. In my view, the paging
functionality should be supported, and the tools should implement a
policy which provides best performance+predictability.
> I had vaguely thought you had managed to avoid the worst
> of the swapping problems but I don't recall why/how...
> and I had thought that any swapping that did exist was
> solved by the page sharing code as submitted, but
> never had a chance to dig deeper.
My approach relies on PV domains, or at least virtualisation aware
memory management. The current memory sharing code concentrates on HVM
domains though. The long term goal is definitely to optimise the
VM-hypervisor memory management as much as possible.

Thanks
Gregor


Grzegorz Milos

2010-Jun-27 15:45 UTC


Re: [Xen-devel] Re: mem-event interface

> I agree that multiple rings are a good idea here - especially if we want
> to disaggregate and have event handlers in multiple domains.
>
> Maybe the ring-registering interface could take a type and a rangeset -
> that would reduce the amount of extra chatter at the cost of some more
> overhead in Xen.
>
Well, the trouble is what units you express the ranges in: pfns
belonging to a given guest, or mfns? Either way memory sharing
would use <0 - max_{p,m}fn> rangeset most of the time. Similarly for
the pager (I believe). Bryan, could you comment on XenAccess? I guess
rangesets would be useful there the most.

I certainly agree that we will have to swallow some complexity in Xen,
to make the interface efficient. Some filters will have to live in
Xen, in order not to generate an unnecessarily large rate of no-op
events.
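
To make the filtering idea concrete, here is a rough sketch of a filter that suppresses events falling outside the ranges a consumer asked to watch. It is written in Python purely for illustration (the real filter would live in the hypervisor), the class and method names are made up, and it assumes ranges are half-open byte ranges.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open [start, end), in bytes


class RangeFilter:
    """Hypothetical hypervisor-side filter: drop events nobody watches."""

    def __init__(self) -> None:
        self._ranges: List[Range] = []
        self.delivered = 0
        self.suppressed = 0

    def watch(self, start: int, end: int) -> None:
        if start >= end:
            raise ValueError("empty or inverted range")
        self._ranges.append((start, end))

    def should_deliver(self, addr: int) -> bool:
        # Deliver only if the address falls inside some watched range.
        hit = any(start <= addr < end for start, end in self._ranges)
        if hit:
            self.delivered += 1
        else:
            self.suppressed += 1
        return hit


flt = RangeFilter()
flt.watch(0xF000, 0xF001)  # watch a single byte
```

With this in place, an access at 0xF000 would be put on the ring, while an access at 0xF001 would be counted as suppressed and never reach userspace.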

Thanks
Gregor


Patrick Colp

2010-Jun-27 17:40 UTC


Re: [Xen-devel] Re: mem-event interface

On 27 June 2010 08:45, Grzegorz Milos <grzegorz.milos@gmail.com> wrote:
>> I agree that multiple rings are a good idea here - especially if we want
>> to disaggregate and have event handlers in multiple domains.
>>
>> Maybe the ring-registering interface could take a type and a rangeset -
>> that would reduce the amount of extra chatter at the cost of some more
>> overhead in Xen.
>>
>
> Well, the trouble is what units you express the ranges in: pfns
> belonging to a given guest, or mfns? Either way memory sharing
> would use <0 - max_{p,m}fn> rangeset most of the time. Similarly for
> the pager (I believe). Bryan, could you comment on XenAccess? I guess
> rangesets would be useful there the most.
>
> I certainly agree that we will have to swallow some complexity in Xen,
> to make the interface efficient. Some filters will have to live in
> Xen, in order not to generate unnecessarily large rate of no-op
> events.

I suppose one way to handle the range is to specify it in terms
of full addresses (i.e. not pfns, so page 0xf would be specified as
0xf000). This way, we can specify any range of memory at byte
granularity (e.g. <0xf000, 0xf001> to watch the first byte of the page
with pfn 0xf). However, it might be useful to have a flag that lets you
specify if you mean pfns, mfns, or full address ranges (or something
like that). Xen should return some sort of unique identifier for each
handler so that new ranges can easily be added/removed dynamically.
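
A small sketch of what such a registration interface might look like, under stated assumptions: the names (`HandlerRegistry`, `Unit`, `add_range`, and so on) are invented for illustration, pages are assumed to be 4 KiB, ranges are half-open, and the mfn case is left out because it refers to a different address space.

```python
import itertools
from enum import Enum
from typing import Dict, List, Tuple

PAGE_SHIFT = 12  # assumption: 4 KiB pages

Range = Tuple[int, int]  # half-open [start, end), in bytes


class Unit(Enum):
    PFN = "pfn"          # range given in guest page frame numbers
    ADDRESS = "address"  # range given in full byte addresses


class HandlerRegistry:
    """Hypothetical registry: one unique id per handler, ranges per id."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._ranges: Dict[int, List[Range]] = {}

    def register(self) -> int:
        handler_id = next(self._next_id)
        self._ranges[handler_id] = []
        return handler_id

    def add_range(self, handler_id: int, start: int, end: int, unit: Unit) -> Range:
        # Normalise everything to byte addresses before storing.
        if unit is Unit.PFN:
            start, end = start << PAGE_SHIFT, end << PAGE_SHIFT
        if start >= end:
            raise ValueError("empty or inverted range")
        self._ranges[handler_id].append((start, end))
        return (start, end)

    def remove_range(self, handler_id: int, byte_range: Range) -> None:
        self._ranges[handler_id].remove(byte_range)

    def ranges(self, handler_id: int) -> List[Range]:
        return list(self._ranges[handler_id])


registry = HandlerRegistry()
handler = registry.register()
whole_page = registry.add_range(handler, 0xF, 0x10, Unit.PFN)
first_byte = registry.add_range(handler, 0xF000, 0xF001, Unit.ADDRESS)
```

Storing every range in bytes means the unit flag only matters at the interface boundary; the handler id is what later add/remove calls refer to.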


Patrick


Dan Magenheimer

2010-Jun-28 03:12 UTC


RE: [Xen-devel] Re: mem-event interface

> From: Patrick Colp [mailto:pjcolp@cs.ubc.ca]
> Sent: Sunday, June 27, 2010 11:40 AM
> To: Grzegorz Milos
> Cc: Xen-Devel (E-mail); Tim Deegan; George Dunlap; Bryan D. Payne;
> Andrew Peace; Steven Hand
> Subject: Re: [Xen-devel] Re: mem-event interface
> 
> On 27 June 2010 08:45, Grzegorz Milos <grzegorz.milos@gmail.com> wrote:
> >> I agree that multiple rings are a good idea here - especially if we want
> >> to disaggregate and have event handlers in multiple domains.
> >>
> >> Maybe the ring-registering interface could take a type and a rangeset -
> >> that would reduce the amount of extra chatter at the cost of some more
> >> overhead in Xen.
> >>
> >
> > Well, the trouble is what units you express the ranges in: pfns
> > belonging to a given guest, or mfns? Either way memory sharing
> > would use <0 - max_{p,m}fn> rangeset most of the time. Similarly for
> > the pager (I believe). Bryan, could you comment on XenAccess? I guess
> > rangesets would be useful there the most.
> >
> > I certainly agree that we will have to swallow some complexity in Xen,
> > to make the interface efficient. Some filters will have to live in
> > Xen, in order not to generate unnecessarily large rate of no-op
> > events.
> 
> I suppose one way to handle the range is to specify the range in terms
> of full address (i.e. not pfn, so page 0xf would be specified as
> 0xf000). This way, we can specify the full range of memory (e.g.
> <0xf000, 0xf001> to watch the first byte of the page with pfn 0xf).
> However, it might be useful to have a flag that lets you specify if
> you mean pfns, mfns, or full address ranges (or something of the
> like). Xen should return some sort of unique identifier for each
> handler so that new ranges can easily be added/removed dynamically.

Probably a good idea to plan for page sizes different from 4K anyway.
I wouldn't be surprised if a 2M-pagesize-only Xen exists in the
not-too-distant future.
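
One way to keep the page size out of the interface is to pass the page shift explicitly whenever a frame number is turned into a byte range. The helper below is a hypothetical sketch, not an existing Xen function; the two shift constants correspond to 4 KiB and 2 MiB pages.

```python
from typing import Tuple

PAGE_SHIFT_4K = 12  # 4 KiB pages
PAGE_SHIFT_2M = 21  # 2 MiB pages


def frame_to_byte_range(frame: int, page_shift: int) -> Tuple[int, int]:
    """Return the half-open byte range [start, end) covered by one frame."""
    start = frame << page_shift
    return (start, start + (1 << page_shift))


small = frame_to_byte_range(0xF, PAGE_SHIFT_4K)
large = frame_to_byte_range(1, PAGE_SHIFT_2M)
```

A consumer that only ever sees byte ranges would not need to change if the underlying page size did.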



Tim Deegan

2010-Jun-28 09:28 UTC


Re: [Xen-devel] Re: mem-event interface

At 16:45 +0100 on 27 Jun (1277657152), Grzegorz Milos wrote:
> Well, the trouble is what units you express the ranges in: pfns
> belonging to a given guest, or mfns? Either way memory sharing
> would use <0 - max_{p,m}fn> rangeset most of the time. Similarly for
> the pager (I believe). Bryan, could you comment on XenAccess? I guess
> rangesets would be useful there the most.

Guest-physical addresses (i.e. GFNs but at byte granularity), I
think.  The hypercall interface handles all HVM memory in GFN-space, so
I think this should be no exception.

Cheers,

Tim.
> I certainly agree that we will have to swallow some complexity in Xen,
> to make the interface efficient. Some filters will have to live in
> Xen, in order not to generate unnecessarily large rate of no-op
> events.
> 
> Thanks
> Gregor
-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

