thr3ads.net - Nouveau - [Nouveau] [PATCH] mm/hmm: replace hmm_update with mmu_notifier

Michal Hocko

2019-Jul-24 18:59 UTC

[Nouveau] [PATCH] mm/hmm: replace hmm_update with mmu_notifier_range

On Wed 24-07-19 20:56:17, Michal Hocko wrote:
> On Wed 24-07-19 15:08:37, Jason Gunthorpe wrote:
> > On Wed, Jul 24, 2019 at 07:58:58PM +0200, Michal Hocko wrote:
> [...]
> > > Maybe new users have started relying on a new semantic in the meantime,
> > > back then, none of the notifier has even started any action in blocking
> > > mode on a EAGAIN bailout. Most of them simply did trylock early in the
> > > process and bailed out so there was nothing to do for the range_end
> > > callback.
> > 
> > Single notifiers are not the problem. I tried to make this clear in
> > the commit message, but lets be more explicit.
> > 
> > We have *two* notifiers registered to the mm, A and B:
> > 
> > A invalidate_range_start: (has no blocking)
> >     spin_lock()
> >     counter++
> >     spin_unlock()
> > 
> > A invalidate_range_end:
> >     spin_lock()
> >     counter--
> >     spin_unlock()
> > 
> > And this one:
> > 
> > B invalidate_range_start: (has blocking)
> >     if (!try_mutex_lock())
> >         return -EAGAIN;
> >     counter++
> >     mutex_unlock()
> > 
> > B invalidate_range_end:
> >     spin_lock()
> >     counter--
> >     spin_unlock()
> > 
> > So now the oom path does:
> > 
> > invalidate_range_start_non_blocking:
> >  for each mn:
> >    a->invalidate_range_start
> >    b->invalidate_range_start
> >    rc = EAGAIN
> > 
> > Now we SKIP A's invalidate_range_end even though A had no idea this
> > would happen has state that needs to be unwound. A is broken.
> > 
> > B survived just fine.
> > 
> > A and B *alone* work fine, combined they fail.
> 
> But that requires that they share some state, right?
> 
> > When the commit was landed you can use KVM as an example of A and RDMA
> > ODP as an example of B
> 
> Could you point me where those two share the state please? KVM seems to
> be using kvm->mmu_notifier_count but I do not know where to look for the
> RDMA...

Scratch that. ELONGDAY... I can see your point. It is the all-or-nothing
approach that doesn't really work here. Looking back at your patch, it seems
reasonable, but I am not sure what the behavior is supposed to be for
notifiers that failed.
-- 
Michal Hocko
SUSE Labs
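
The A/B failure described in the quoted message can be reproduced in a small user-space model. What follows is a minimal sketch, not kernel code: `struct notifier`, `lock_contended`, `range_start`, `range_end` and `invalidate_nonblocking_all_or_nothing` are hypothetical names, and a boolean flag stands in for the failed trylock. It assumes, as in the quoted pseudocode, that every start callback is invoked and that all end callbacks are skipped once any start returns -EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered notifier (not struct mmu_notifier). */
struct notifier {
    int counter;         /* invalidations this notifier believes are in flight */
    bool lock_contended; /* simulates a trylock that fails in non-blocking mode */
};

/* Returns 0 on success, -EAGAIN if the notifier would have to block. */
static int range_start(struct notifier *n)
{
    if (n->lock_contended)
        return -EAGAIN;  /* B's path: bail out before touching any state */
    n->counter++;        /* A's path: state that a later range_end must unwind */
    return 0;
}

static void range_end(struct notifier *n)
{
    n->counter--;
}

/* Models the "all or nothing" behaviour: one failure skips every range_end. */
static int invalidate_nonblocking_all_or_nothing(struct notifier **list, size_t len)
{
    int rc = 0;

    for (size_t i = 0; i < len; i++) {
        if (range_start(list[i]) != 0)
            rc = -EAGAIN;
    }
    if (rc)
        return rc;       /* ends are skipped even for notifiers that started */

    for (size_t i = 0; i < len; i++)
        range_end(list[i]);
    return 0;
}

/* Returns A's counter after a non-blocking invalidation that B refuses. */
static int demo_leaked_counter(void)
{
    struct notifier a = { .counter = 0, .lock_contended = false };
    struct notifier b = { .counter = 0, .lock_contended = true };
    struct notifier *list[] = { &a, &b };

    (void)invalidate_nonblocking_all_or_nothing(list, 2);
    return a.counter;
}
```

Calling `demo_leaked_counter()` in this model leaves A's counter at 1 although A and B share no state: A's start ran, B returned -EAGAIN, and A's matching end was never issued.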

Jason Gunthorpe

2019-Jul-24 19:21 UTC

On Wed, Jul 24, 2019 at 08:59:10PM +0200, Michal Hocko wrote:
> On Wed 24-07-19 20:56:17, Michal Hocko wrote:
> > On Wed 24-07-19 15:08:37, Jason Gunthorpe wrote:
> > > On Wed, Jul 24, 2019 at 07:58:58PM +0200, Michal Hocko wrote:
> > [...]
> > > > Maybe new users have started relying on a new semantic in the meantime,
> > > > back then, none of the notifier has even started any action in blocking
> > > > mode on a EAGAIN bailout. Most of them simply did trylock early in the
> > > > process and bailed out so there was nothing to do for the range_end
> > > > callback.
> > > 
> > > Single notifiers are not the problem. I tried to make this clear in
> > > the commit message, but lets be more explicit.
> > > 
> > > We have *two* notifiers registered to the mm, A and B:
> > > 
> > > A invalidate_range_start: (has no blocking)
> > >     spin_lock()
> > >     counter++
> > >     spin_unlock()
> > > 
> > > A invalidate_range_end:
> > >     spin_lock()
> > >     counter--
> > >     spin_unlock()
> > > 
> > > And this one:
> > > 
> > > B invalidate_range_start: (has blocking)
> > >     if (!try_mutex_lock())
> > >         return -EAGAIN;
> > >     counter++
> > >     mutex_unlock()
> > > 
> > > B invalidate_range_end:
> > >     spin_lock()
> > >     counter--
> > >     spin_unlock()
> > > 
> > > So now the oom path does:
> > > 
> > > invalidate_range_start_non_blocking:
> > >  for each mn:
> > >    a->invalidate_range_start
> > >    b->invalidate_range_start
> > >    rc = EAGAIN
> > > 
> > > Now we SKIP A's invalidate_range_end even though A had no idea this
> > > would happen has state that needs to be unwound. A is broken.
> > > 
> > > B survived just fine.
> > > 
> > > A and B *alone* work fine, combined they fail.
> > 
> > But that requires that they share some state, right?
> > 
> > > When the commit was landed you can use KVM as an example of A and RDMA
> > > ODP as an example of B
> > 
> > Could you point me where those two share the state please? KVM seems to
> > be using kvm->mmu_notifier_count but I do not know where to look for the
> > RDMA...
> 
> Scratch that. ELONGDAY... I can see your point. It is the all-or-nothing
> approach that doesn't really work here. Looking back at your patch, it seems
> reasonable, but I am not sure what the behavior is supposed to be for
> notifiers that failed.

Okay, good to know I'm not missing something. The idea was the failed
notifier would have to handle the mandatory _end callback.

I've reflected on it some more, and I have a scheme to be able to
'undo' that is safe against concurrent hlist_del_rcu.

If we change the register to keep the hlist sorted by address then we
can do a targeted 'undo' of past starts, terminated by an address
less-than comparison against the first failing struct mmu_notifier.

It relies on the fact that rcu is only used to remove items, the list
adds are all protected by mm locks, and the number of mmu notifiers is
very small.

This seems workable and does not need more driver review/update...

However, hmm's implementation still needs more fixing.

Thanks,
Jason
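
The address-sorted undo idea from this message can be illustrated with a single-threaded user-space model. This is only a sketch under stated assumptions: `struct notifier`, `register_sorted`, `start_nonblocking` and `demo_undo` are hypothetical names, a plain singly linked list stands in for the RCU-protected hlist, and no concurrency is modelled. The point it shows is that the undo walk stops by comparing addresses rather than by finding the failing entry in the list.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node; the kernel uses an hlist of struct mmu_notifier. */
struct notifier {
    struct notifier *next;
    int counter;          /* in-flight invalidations seen by this notifier */
    int end_calls;        /* how many times the undo/end path ran */
    bool lock_contended;  /* simulates a failed trylock in non-blocking mode */
};

/* Insert so that the list stays in ascending address order. */
static void register_sorted(struct notifier **head, struct notifier *n)
{
    struct notifier **pos = head;

    while (*pos && (uintptr_t)*pos < (uintptr_t)n)
        pos = &(*pos)->next;
    n->next = *pos;
    *pos = n;
}

/* Start each notifier in list order; on the first failure undo earlier starts. */
static int start_nonblocking(struct notifier *head)
{
    for (struct notifier *n = head; n; n = n->next) {
        if (n->lock_contended) {
            /* Every entry at a lower address was started before n failed. */
            for (struct notifier *u = head;
                 u && (uintptr_t)u < (uintptr_t)n; u = u->next) {
                u->counter--;
                u->end_calls++;
            }
            return -EAGAIN;
        }
        n->counter++;
    }
    return 0;
}

/* Registers three notifiers out of order; the middle one refuses to start. */
static int demo_undo(int end_calls[3])
{
    struct notifier pool[3] = { 0 };  /* array elements have ascending addresses */
    struct notifier *head = NULL;
    int rc;

    pool[1].lock_contended = true;
    register_sorted(&head, &pool[2]);
    register_sorted(&head, &pool[0]);
    register_sorted(&head, &pool[1]);

    rc = start_nonblocking(head);
    for (int i = 0; i < 3; i++)
        end_calls[i] = pool[i].end_calls;
    return rc;
}
```

In this model only `pool[0]` is started and then undone; `pool[2]` sorts after the failing entry and is never touched. Because the stop condition is an address comparison, the undo walk does not need the failing entry to still be linked, which is the property the message relies on for safety against concurrent removal.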

Christoph Hellwig

2019-Jul-24 19:48 UTC

On Wed, Jul 24, 2019 at 04:21:55PM -0300, Jason Gunthorpe wrote:
> If we change the register to keep the hlist sorted by address then we
> can do a targeted 'undo' of past starts, terminated by an address
> less-than comparison against the first failing struct mmu_notifier.
> 
> It relies on the fact that rcu is only used to remove items, the list
> adds are all protected by mm locks, and the number of mmu notifiers is
> very small.
> 
> This seems workable and does not need more driver review/update...
> 
> However, hmm's implementation still needs more fixing.

Can we take one step back, please?  The only reason why drivers
implement both ->invalidate_range_start and ->invalidate_range_end and
expect them to be called paired is to keep some form of counter of
active invalidation "sections".  So instead of doctoring around
undo schemes the only sane answer is to take such a counter into the
core VM code instead of having each driver struggle with it.
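
One possible reading of this suggestion, sketched as a user-space model: the core owns a single counter of active invalidation sections and unwinds it in one place, while drivers only perform their own invalidation work. All names here (`struct mm_model`, `struct driver_model`, `driver_invalidate`, `core_range_start`, `core_range_end`, `demo_counter_after_start`) are hypothetical, and nothing in the thread specifies this exact interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical core-side state: one counter per mm, owned by core code. */
struct mm_model {
    int active_invalidations;
};

/* Hypothetical driver: does its own work and keeps no pairing counter. */
struct driver_model {
    bool lock_contended;     /* simulates a trylock that would fail */
    int invalidations_done;
};

static int driver_invalidate(struct driver_model *d, bool blockable)
{
    if (d->lock_contended && !blockable)
        return -EAGAIN;
    d->invalidations_done++;
    return 0;
}

/* The core opens the section, calls the drivers, and unwinds on failure. */
static int core_range_start(struct mm_model *mm, struct driver_model **drv,
                            size_t len, bool blockable)
{
    mm->active_invalidations++;
    for (size_t i = 0; i < len; i++) {
        int rc = driver_invalidate(drv[i], blockable);

        if (rc) {
            mm->active_invalidations--;  /* the single unwind point */
            return rc;
        }
    }
    return 0;
}

static void core_range_end(struct mm_model *mm)
{
    mm->active_invalidations--;
}

/* Returns the core counter as observed right after a non-blocking start. */
static int demo_counter_after_start(bool contended)
{
    struct mm_model mm = { 0 };
    struct driver_model a = { false, 0 };
    struct driver_model b = { contended, 0 };
    struct driver_model *drv[] = { &a, &b };
    int rc, seen;

    rc = core_range_start(&mm, drv, 2, false);
    seen = mm.active_invalidations;
    if (rc == 0)
        core_range_end(&mm);             /* paired end only after a full start */
    return seen;
}
```

With the counter kept by the core, a failed non-blocking start leaves it at 0 and a successful one leaves it at 1 until the paired end, so no driver can be left with a stale count of its own.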
