Hi James,
This doesn't sound like a lock ordering issue or an exception to the stated
lock ordering rules. In particular, there's no need to take the meta lock
before dtrace or provider; rather, the meta lock must be taken before either
of those locks if it is to be taken at all.
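
To make the rule above concrete, here is a minimal sketch in C, not code from dtrace.c. The enum lock_rank values and the order_is_valid() helper are hypothetical names invented for illustration. It treats the ordering as a ranking: a thread may skip a lock, but must never take a lower-ranked lock after a higher-ranked one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ranks for the three locks; a lower rank must be taken first. */
enum lock_rank {
    RANK_META = 0,      /* stands in for dtrace_meta_lock */
    RANK_PROVIDER = 1,  /* stands in for dtrace_provider_lock */
    RANK_DTRACE = 2     /* stands in for dtrace_lock */
};

/*
 * Return 1 if the locks in seq[] are acquired in strictly increasing rank,
 * 0 otherwise. Skipping a rank is allowed; going backwards is not.
 */
static int
order_is_valid(const enum lock_rank *seq, size_t n)
{
    size_t i;

    assert(seq != NULL || n == 0);
    for (i = 1; i < n; i++) {
        if (seq[i] <= seq[i - 1])
            return (0);
    }
    return (1);
}
```

Under this model, a path that takes only the provider lock and then dtrace_lock (as dtrace_unregister is described as doing) is valid, while taking the meta lock after the provider lock would not be.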
Is there a situation in which the lock ordering has been invalidated? Do you
have information about why thread B is being starved out of using the provider
lock? fasttrap_cleanup_pid_cb() should only be called frequently if there's
a high degree of turnover for pid and USDT probes.
Adam
On Mar 25, 2009, at 2:46 PM, James McIlree wrote:
>
> I'm looking at a hang/stall while dtrace'ing on heavily loaded
> systems.
>
> I''ve got the following scenario:
>
> Kernel Thread A is waiting to acquire the dtrace_meta_lock
> Kernel Thread B owns the dtrace_meta_lock, and is waiting on the
> dtrace_provider_lock
> Kernel Thread C owns the dtrace_provider_lock, and is executing
> normally.
>
> However, I remembered this helpful comment from dtrace.c:
> * The lock ordering between these three locks is dtrace_meta_lock before
> * dtrace_provider_lock before dtrace_lock. (In particular, there are
> * several places where dtrace_provider_lock is held by the framework as it
> * calls into the providers -- which then call back into the framework,
> * grabbing dtrace_lock.)
> Kernel Thread C is calling into dtrace_unregister from
> fasttrap_cleanup_pid_cb().
> As best I can tell, fasttrap_cleanup_pid_cb() never takes any of the
> dtrace locks. It does call dtrace_unregister at line 338, though.
>
> The dtrace_unregister function immediately takes the provider_lock
> and the dtrace_lock, without taking the meta lock.
>
> Am I seeing an exception to the rules above, or the first signs of
> a potential lock ordering issue?
>
> James M
--
Adam Leventhal, Fishworks http://blogs.sun.com/ahl