In "Understanding the Linux Kernel", 3rd edition, section 4.7 "Softirqs and Tasklets", it states:

"Activation and execution [of deferrable functions] are bound together: a deferrable function that has been activated by a given CPU must be executed on the same CPU. There is no self-evident reason suggesting that this rule is beneficial for system performance. Binding the deferrable function to the activating CPU could in theory make better use of the CPU hardware cache. After all, it is conceivable that the activating kernel thread accesses some data structures that will also be used by the deferrable function. However, the relevant lines could easily be no longer in the cache when the deferrable function is run because its execution can be delayed a long time. Moreover, binding a function to a CPU is always a potentially "dangerous" operation, because one CPU might end up very busy while the others are mostly idle."

From what I can tell, Xen takes the same approach: softirq_pending is keyed by CPU and is used both in activation (cpu_raise_softirq) and in execution (do_softirq).

Is there room for optimization here? I'm looking at Xen's softirq mechanism and just noticed this, that's all. I have no performance data or anything. In Xen, when would the hardware cache situation in the quote above generally have a real effect? Are there any particularly large data structures accessed by deferrable functions in Xen?

Tim

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
I think I got the subject of the email wrong: it looks like softirq_pending is keyed by real CPU, not by VCPU?

Tim

On Wed, 15 Mar 2006 13:42:09 -0600 Tim Freeman <tfreeman@mcs.anl.gov> wrote:
> I think in Xen, the same approach is taken from what I can tell
> (softirq_pending is keyed by cpu, used both in activation (cpu_raise_softirq)
> and execution (do_softirq)).
Looking more closely, there does not seem to be a ksoftirqd equivalent: in entry.S, test_all_events is jumped to often via ret_from_intr. So it looks like pending softirq requests are handled quite often, and that handling blocks until they are done. So I'm guessing the quote from the book does not apply so much (and these deferrable functions in Xen need to be quite fast).

Tim

On Wed, 15 Mar 2006 13:42:09 -0600 Tim Freeman <tfreeman@mcs.anl.gov> wrote:
> In Xen, when would the hardware cache situation in the quote above generally
> have a real effect?
On 15 Mar 2006, at 20:23, Tim Freeman wrote:
> Looking more closely there does not seem to be a ksoftirqd equivalent: in
> entry.S, test_all_events is jumped to often via ret_from_intr. So it looks
> like the pending softirq requests are handled quite often and block until
> they are done. So I'm guessing the quote from the book does not apply so
> much (and these deferrable functions in Xen need to be quite fast).

They occur fairly often, they are short in duration, and they often need to be executed on a particular CPU (that's true of the schedule and timer softirqs at least). Performance-wise, CPU affinity should win over any small resulting load imbalance.

 -- Keir
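For readers following the thread, the per-CPU binding discussed above can be sketched as a small userspace model in C. This is not Xen's code: the names model_open_softirq, model_raise_softirq, model_do_softirq and demo_handler, the array sizes, and the runs counter are all made up for illustration. The only thing taken from the thread is the idea that a pending mask is keyed by CPU, that raising sets a bit for one specific CPU, and that each CPU drains only its own mask.

```c
#include <assert.h>
#include <stdio.h>

#define NR_CPUS     4   /* assumed CPU count for this model */
#define NR_SOFTIRQS 8   /* assumed number of softirq slots */

typedef void (*softirq_handler)(unsigned int cpu);

/* One pending bitmask per CPU: bit nr set means softirq nr is pending there. */
static unsigned long pending[NR_CPUS];
static softirq_handler handlers[NR_SOFTIRQS];
/* Demo bookkeeping only: number of handler invocations per CPU. */
static unsigned int runs[NR_CPUS];

/* Register a handler for softirq number nr (hypothetical helper). */
static void model_open_softirq(unsigned int nr, softirq_handler h)
{
    assert(nr < NR_SOFTIRQS);
    handlers[nr] = h;
}

/* Activation: mark softirq nr pending on one specific CPU. */
static void model_raise_softirq(unsigned int cpu, unsigned int nr)
{
    assert(cpu < NR_CPUS && nr < NR_SOFTIRQS);
    pending[cpu] |= 1UL << nr;
}

/*
 * Execution: drain only the given CPU's mask. The outer loop repeats
 * until the mask is empty, so work raised by a handler is also run
 * before returning.
 */
static void model_do_softirq(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    while (pending[cpu] != 0) {
        unsigned long mask = pending[cpu];
        pending[cpu] = 0;
        for (unsigned int nr = 0; nr < NR_SOFTIRQS; nr++) {
            if ((mask & (1UL << nr)) && handlers[nr] != NULL) {
                handlers[nr](cpu);
                runs[cpu]++;
            }
        }
    }
}

/* Trivial handler used for demonstration. */
static void demo_handler(unsigned int cpu)
{
    printf("softirq handled on cpu %u\n", cpu);
}
```

In this model, after model_open_softirq(0, demo_handler) and model_raise_softirq(2, 0), calling model_do_softirq(1) does nothing, while model_do_softirq(2) runs the handler once and clears CPU 2's mask. A real hypervisor would need atomic bit operations and a way to notify a remote CPU; both are left out here.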