Hi, attached is a patch to support lock profiling in the Xen hypervisor. The feature is switchable at compile time (like software performance counters) in Rules.mk. If lock profiling is not activated, code and data structures are the same as without the patch. With activated lock profiling some statistical data is added to each spinlock_t structure and locking/unlocking keeps track for each lock of: - number of successful locks - cumulated time of lock held - number of locking calls leading to wait cycles - cumulated waiting time The profile data may be obtained either via hotkey from the console or with the new tool xenlockprof. To be able to find all locks for printing out the profile data, a linked list with the locks is built up. Locks defined via DEFINE_SPINLOCK are added to this list automatically, locks in dynamically allocated data structures have to be added (and removed) explicitly to the queue to be able to print the profile data. Adding is done via spin_lock_init_prof (there are two additional calls with one or two index parameters to be able to distinguish locks e.g. per domid and vcpu-nr), removing is done via spin_lock_destroy_prof (may be called for a completely zeroed structure as well). I''m not completely satisfied with the solution for the dynamically initialized locks, but I had no better idea in the first run. Another enhancement would be to expand the profiling to rw-locks as well, but this would have required a rewrite of the lock routines using try_lock like for spinlocks. I could do this if the lock profiling is accepted. Comments welcome :-) Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 636 47950 Fujitsu Technolgy Solutions e-mail: juergen.gross@ts.fujitsu.com Otto-Hahn-Ring 6 Internet: ts.fujitsu.com D-81739 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-08 08:15 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 08/10/2009 08:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:> I''m not completely satisfied with the solution for the dynamically initialized > locks, but I had no better idea in the first run. > Another enhancement would be to expand the profiling to rw-locks as well, but > this would have required a rewrite of the lock routines using try_lock like > for spinlocks. I could do this if the lock profiling is accepted. > > Comments welcome :-)The method of chaining and initialising the info is kind of icky. Requiring users to unchain dynamic locks is just asking for this support to be perpetually broken, or be used only for static locks. How about defining new initialisers DEFINE_NAMED_SPINLOCK() and named_spinlock_init(). This would indicate you consider a lock important enough to name (and hence profile) and also categorise dynamically-allocated locks, causing their stats to be aggregated (after all, lock optimisations will have aggregate effect across all locks of that category). If lock profiling is compiled in, have a static array of lock-profile descriptors (keeps things simple - could make it a growable array, or something). On lock init, walk the array looking for that name. If found, write the entry index into a new field in the spinlock struct. If not found, allocate next entry in lock-profile array, init it, and write its index into the spinlock struct. On lock operations, if the index field in the spinlock is non-zero, update stats in the associated profile structure. Regarding races from locks aliasing to the same profile structure -- either assume that doesn''t matter much, or perhaps update fields with atomic ops. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-08 08:21 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 08/10/2009 08:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:> Comments welcome :-)There''s also the question of whether this really gets you useful extra info than can be digested from a xentrace run. I suppose at least the info is easier to understand, and not many people know how to drive xentrace. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Juergen Gross
2009-Oct-08 08:27 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
Keir Fraser wrote:> On 08/10/2009 08:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote: > >> I''m not completely satisfied with the solution for the dynamically initialized >> locks, but I had no better idea in the first run. >> Another enhancement would be to expand the profiling to rw-locks as well, but >> this would have required a rewrite of the lock routines using try_lock like >> for spinlocks. I could do this if the lock profiling is accepted. >> >> Comments welcome :-) > > The method of chaining and initialising the info is kind of icky. Requiring > users to unchain dynamic locks is just asking for this support to be > perpetually broken, or be used only for static locks. > > How about defining new initialisers DEFINE_NAMED_SPINLOCK() and > named_spinlock_init(). This would indicate you consider a lock important > enough to name (and hence profile) and also categorise dynamically-allocated > locks, causing their stats to be aggregated (after all, lock optimisations > will have aggregate effect across all locks of that category). > > If lock profiling is compiled in, have a static array of lock-profile > descriptors (keeps things simple - could make it a growable array, or > something). On lock init, walk the array looking for that name. If found, > write the entry index into a new field in the spinlock struct. If not found, > allocate next entry in lock-profile array, init it, and write its index into > the spinlock struct.Don''t you think a solution like for performance counters would be better then? The index would be in a header file and the sums could be held per-cpu to avoid races. It would even be possible to define lock arrays if summing up all data for e.g. domain specific locks is not desired.> > On lock operations, if the index field in the spinlock is non-zero, update > stats in the associated profile structure. Regarding races from locks > aliasing to the same profile structure -- either assume that doesn''t matter > much, or perhaps update fields with atomic ops.Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 636 47950 Fujitsu Technolgy Solutions e-mail: juergen.gross@ts.fujitsu.com Otto-Hahn-Ring 6 Internet: ts.fujitsu.com D-81739 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-08 08:38 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 08/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:>> If lock profiling is compiled in, have a static array of lock-profile >> descriptors (keeps things simple - could make it a growable array, or >> something). On lock init, walk the array looking for that name. If found, >> write the entry index into a new field in the spinlock struct. If not found, >> allocate next entry in lock-profile array, init it, and write its index into >> the spinlock struct. > > Don''t you think a solution like for performance counters would be better then? > The index would be in a header file and the sums could be held per-cpu to > avoid races. > It would even be possible to define lock arrays if summing up all data for > e.g. domain specific locks is not desired.It''s just a question of impact on our existing locking API. I''d like to minimise that. I''m not particularly a fan of the perfc stuff. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Juergen Gross
2009-Oct-08 08:38 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
Keir Fraser wrote:> On 08/10/2009 08:48, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote: > >> Comments welcome :-) > > There''s also the question of whether this really gets you useful extra info > than can be digested from a xentrace run. I suppose at least the info is > easier to understand, and not many people know how to drive xentrace.It helped me a lot for narrowing down my performance problem with a 8 vcpu BS2000 system (this was the main reason I made the patch). Finding such a problem in xentrace is quite a bit of work. xentrace produced over 500 MB of data in 30 seconds, while reading the xenlockprof output revealed the bottleneck in seconds. And I think the lock profiling tool is much better for evaluating optimizations in Xen regarding lock conflicts, if the differences are only a few percent. Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 636 47950 Fujitsu Technolgy Solutions e-mail: juergen.gross@ts.fujitsu.com Otto-Hahn-Ring 6 Internet: ts.fujitsu.com D-81739 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-08 08:40 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 08/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:> It would even be possible to define lock arrays if summing up all data for > e.g. domain specific locks is not desired.Allowing spinlock_init() to specify not only a name but also a ''category-specific index'' might be okay. For example, domid, in the case of a domain-specific lock. Hiding the allocation of profiling info structs in the spinlock subsystem somehow... (e.g., radix tree to make a simple growable array). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Juergen Gross
2009-Oct-08 09:04 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
Keir Fraser wrote:> On 08/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote: > >> It would even be possible to define lock arrays if summing up all data for >> e.g. domain specific locks is not desired. > > Allowing spinlock_init() to specify not only a name but also a > ''category-specific index'' might be okay. For example, domid, in the case of > a domain-specific lock. Hiding the allocation of profiling info structs in > the spinlock subsystem somehow... (e.g., radix tree to make a simple > growable array).I''ll try another patch... Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 636 47950 Fujitsu Technolgy Solutions e-mail: juergen.gross@ts.fujitsu.com Otto-Hahn-Ring 6 Internet: ts.fujitsu.com D-81739 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-08 09:17 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 08/10/2009 09:38, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:>> There''s also the question of whether this really gets you useful extra info >> than can be digested from a xentrace run. I suppose at least the info is >> easier to understand, and not many people know how to drive xentrace. > > It helped me a lot for narrowing down my performance problem with a 8 vcpu > BS2000 system (this was the main reason I made the patch). > Finding such a problem in xentrace is quite a bit of work. xentrace produced > over 500 MB of data in 30 seconds, while reading the xenlockprof output > revealed the bottleneck in seconds. > > And I think the lock profiling tool is much better for evaluating > optimizations in Xen regarding lock conflicts, if the differences are only > a few percent.Okay, well the only thing I''m dead against in a lockmeter patch is requring dynamic registration and unregistration of individual locks. Largely because I see a never-ending stream of fix this unregistration, fix that unregistration, stretching out before me, because callers will forever be getting broken. I suppose most locks are either global or per-domain, so perhaps you could just have domain create/destroy hooks into lockmeter, to alloc/free an array of per-domain locks? Would that actually be an argument for a perfc-style central declaration of all profileable locks, so you could indicate which are global and which are per-domain? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-08 09:21 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 08/10/2009 10:04, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:>>> It would even be possible to define lock arrays if summing up all data for >>> e.g. domain specific locks is not desired. >> >> Allowing spinlock_init() to specify not only a name but also a >> ''category-specific index'' might be okay. For example, domid, in the case of >> a domain-specific lock. Hiding the allocation of profiling info structs in >> the spinlock subsystem somehow... (e.g., radix tree to make a simple >> growable array). > > I''ll try another patch...Well, see my email just now. I could perhaps live with perfc-style declarations after all... Because this might make sense if locks can be broadly categorised as global or as per-domain, and it could avoid you having to make your implementation much more complex? After all, now I think about it, an ''arbitrary'' index specified to spinlock_init() is just making our lives more complicated if it will basically *always* be a domid. And knowing at a higher level which locks are actually per-domain could help your pretty printing? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Juergen Gross
2009-Oct-08 09:35 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
Keir Fraser wrote:> On 08/10/2009 10:04, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote: > >>>> It would even be possible to define lock arrays if summing up all data for >>>> e.g. domain specific locks is not desired. >>> Allowing spinlock_init() to specify not only a name but also a >>> ''category-specific index'' might be okay. For example, domid, in the case of >>> a domain-specific lock. Hiding the allocation of profiling info structs in >>> the spinlock subsystem somehow... (e.g., radix tree to make a simple >>> growable array). >> I''ll try another patch... > > Well, see my email just now. I could perhaps live with perfc-style > declarations after all... Because this might make sense if locks can be > broadly categorised as global or as per-domain, and it could avoid you > having to make your implementation much more complex? > > After all, now I think about it, an ''arbitrary'' index specified to > spinlock_init() is just making our lives more complicated if it will > basically *always* be a domid. And knowing at a higher level which locks are > actually per-domain could help your pretty printing?I thought in this direction, too. I would, however, try not to limit it to per domain locks, even if the first implementation would support only global locks and per domain locks. The perfc style declarations should only be necessary for the dynamically initialized locks, and those could live near the domain structure definition (or any other structure where they are needed). I think the statically defined locks can be handled via an own data section requiring no centrally defined index. Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 636 47950 Fujitsu Technolgy Solutions e-mail: juergen.gross@ts.fujitsu.com Otto-Hahn-Ring 6 Internet: ts.fujitsu.com D-81739 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-08 09:47 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 08/10/2009 10:35, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:>> After all, now I think about it, an ''arbitrary'' index specified to >> spinlock_init() is just making our lives more complicated if it will >> basically *always* be a domid. And knowing at a higher level which locks are >> actually per-domain could help your pretty printing? > > I thought in this direction, too. > I would, however, try not to limit it to per domain locks, even if the first > implementation would support only global locks and per domain locks. > > The perfc style declarations should only be necessary for the dynamically > initialized locks, and those could live near the domain structure definition > (or any other structure where they are needed). > > I think the statically defined locks can be handled via an own data section > requiring no centrally defined index.Okay, well this all sounds an acceptable kind of direction. Worth spinning another patch. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Juergen Gross
2009-Oct-09 08:27 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
Keir Fraser wrote:> On 08/10/2009 10:35, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote: > >>> After all, now I think about it, an ''arbitrary'' index specified to >>> spinlock_init() is just making our lives more complicated if it will >>> basically *always* be a domid. And knowing at a higher level which locks are >>> actually per-domain could help your pretty printing? >> I thought in this direction, too. >> I would, however, try not to limit it to per domain locks, even if the first >> implementation would support only global locks and per domain locks. >> >> The perfc style declarations should only be necessary for the dynamically >> initialized locks, and those could live near the domain structure definition >> (or any other structure where they are needed). >> >> I think the statically defined locks can be handled via an own data section >> requiring no centrally defined index. > > Okay, well this all sounds an acceptable kind of direction. Worth spinning > another patch.Here it is. I managed to avoid the perfc-like stuff completely. Included are 2 global locks defined for profiling and 2 locks in the domain structure. I''ve added some comments in spinlock.h how to use the profiling in structures. Tested by compilation and boot of Dom0 with and without profiling enabled. Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 636 47950 Fujitsu Technolgy Solutions e-mail: juergen.gross@ts.fujitsu.com Otto-Hahn-Ring 6 Internet: ts.fujitsu.com D-81739 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-09 09:22 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 09/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:>> Okay, well this all sounds an acceptable kind of direction. Worth spinning >> another patch. > > Here it is. > I managed to avoid the perfc-like stuff completely. > > Included are 2 global locks defined for profiling and 2 locks in the domain > structure. > > I''ve added some comments in spinlock.h how to use the profiling in structures. > > Tested by compilation and boot of Dom0 with and without profiling enabled.Not keen on exposing the lock_profile_funcs[] stuff; that and a few other aspects are ratehr clumsy looking. I would rather dynamically register/unregister lock-profile blocks; something like: /* domain creation */ lock_profile_register_struct( LOCKPROF_TYPE_PERDOMAIN, d, d->domid); /* domain destruction */ lock_profile_unregister_struct(d); Requires a ''struct lock_profile lock_profile'' or similar to be declared within the domain structure. No need to hide that definition within a macro I expect, nor to specify in advance the ''max # locks''. Where lock_profile is a struct containing the ''idx'' (domid in this case) and the head of a linked list of locks, which gets appended to whenever a new pofiled lock is declared. All locks can go back to having the entire profile struct embedded within them - I don''t care about the space wastage for unprofiled locks. This means that lock_profile is a fixed-size struct and does not need to contain the profiling data itself -- it just points at a chain of locks that extends on each spin_lock_init_prof(). lock_profile_register_struct() adds the lock_profile_struct to a linked list of such structs for this LOCKPROF_TYPE (so you keep an array of list_heads indexed by LOCKPROF_TYPE, and you can protect that array and all sub-linked-lists with a big rwlock). Something like: static DEFINE_RWLOCK lock_profiles_lock; static struct lock_profile *lock_profiles[NR_LOCKPROF_TYPES] Or somesuch. Regarding macro style, crap like starting a multi-statement macro with a semi-colon, and not terminating it with one (e.g., _spin_lock_init_prof()) is unacceptable style. The C idiom is to enclose multiple statements within a do{}while(0), or possibly the gcc idiom {()}. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
George Dunlap
2009-Oct-09 10:35 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On Thu, Oct 8, 2009 at 9:38 AM, Juergen Gross <juergen.gross@ts.fujitsu.com> wrote:> It helped me a lot for narrowing down my performance problem with a 8 vcpu > BS2000 system (this was the main reason I made the patch). > Finding such a problem in xentrace is quite a bit of work. xentrace produced > over 500 MB of data in 30 seconds, while reading the xenlockprof output > revealed the bottleneck in seconds.Minor point, but IIRC you identified the shadow lock as the problem; but the shadow lock covers a lot of code. Did this patch alone tell you what *aspect* of the shadow code was causing the problem? That''s what xentrace was designed to do. :-) I agree, however, that for a lot of uses, xentrace isn''t the right tool. It does introduce a lot of memory and disk overhead. Having this kind of lock profiling, and sample-based profiling like oprofile or gprof for Xen is the right tool for cases where a low-overhead aggregate measurement is all that''s necessary. Xentrace excels when aggregates don''t give you enough information, and you need to drill down into specific sequences of events, or see how things change over time. It''s also useful for situations where it''s inconvenient to run a test iteratively as you add aggregate information (for instance, if it''s a customer running the test or you''re looking at a problem 3-hours into a 6-hour test). Instead of having Xen make new aggregates, you can use the analysis tool to make new aggregates as you need them. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser wrote:> On 09/10/2009 09:27, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote: > >>> Okay, well this all sounds an acceptable kind of direction. Worth spinning >>> another patch. >> Here it is. >> I managed to avoid the perfc-like stuff completely. >> >> Included are 2 global locks defined for profiling and 2 locks in the domain >> structure. >> >> I''ve added some comments in spinlock.h how to use the profiling in structures. >> >> Tested by compilation and boot of Dom0 with and without profiling enabled. > > Not keen on exposing the lock_profile_funcs[] stuff; that and a few other > aspects are ratehr clumsy looking. I would rather dynamically > register/unregister lock-profile blocks; something like: > /* domain creation */ > lock_profile_register_struct( > LOCKPROF_TYPE_PERDOMAIN, d, d->domid); > /* domain destruction */ > lock_profile_unregister_struct(d); > Requires a ''struct lock_profile lock_profile'' or similar to be declared > within the domain structure. No need to hide that definition within a macro > I expect, nor to specify in advance the ''max # locks''.Round 3 then :-) I think a rw-lock is not needed, as reading is a very rare event which will be executed in parallel less then once a year. ;-) Better? Juergen -- Juergen Gross Principal Developer Operating Systems TSP ES&S SWE OS6 Telephone: +49 (0) 89 636 47950 Fujitsu Technolgy Solutions e-mail: juergen.gross@ts.fujitsu.com Otto-Hahn-Ring 6 Internet: ts.fujitsu.com D-81739 Muenchen Company details: ts.fujitsu.com/imprint.html _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Oct-13 15:49 UTC
Re: [Xen-devel] [Patch] support of lock profiling in Xen
On 13/10/2009 14:56, "Juergen Gross" <juergen.gross@ts.fujitsu.com> wrote:>> Requires a ''struct lock_profile lock_profile'' or similar to be declared >> within the domain structure. No need to hide that definition within a macro >> I expect, nor to specify in advance the ''max # locks''. > > Round 3 then :-) > > I think a rw-lock is not needed, as reading is a very rare event which will > be executed in parallel less then once a year. ;-) > > Better?Yeah, this is fine I think. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel