Christian König
2014-Aug-04 14:45 UTC
[Nouveau] [PATCH 09/19] drm/radeon: handle lockup in delayed work, v2
On 04.08.2014 at 16:40, Maarten Lankhorst wrote:
> On 04-08-14 16:37, Christian König wrote:
>>> It's a pain to deal with gpu reset.
>> Yeah, well that's nothing new.
>>
>>> I've now tried other solutions but that would mean reverting to the old style during gpu lockup recovery, and only running the delayed work when !lockup.
>>> But this meant that the timeout was useless to add. I think the cleanest is keeping the v2 patch, because potentially any waiting code can be called during lockup recovery.
>> The lockup code itself should never call any waiting code and V2 doesn't seem to handle a couple of cases correctly either.
>>
>> How about moving the fence waiting out of the reset code?
> What cases did I miss then?
>
> I'm curious how you want to move the fence waiting out of reset, when there are so many places that could potentially wait, like radeon_ib_get can call radeon_sa_bo_new which can do a wait, or radeon_ring_alloc that can wait on radeon_fence_wait_next, etc.

The IB test itself doesn't need to be protected by the exclusive lock. Only everything between radeon_save_bios_scratch_regs and radeon_ring_restore.

Christian.
Maarten Lankhorst
2014-Aug-04 14:58 UTC
[Nouveau] [PATCH 09/19] drm/radeon: handle lockup in delayed work, v2
On 04-08-14 16:45, Christian König wrote:
> On 04.08.2014 at 16:40, Maarten Lankhorst wrote:
>> On 04-08-14 16:37, Christian König wrote:
>>>> It's a pain to deal with gpu reset.
>>> Yeah, well that's nothing new.
>>>
>>>> I've now tried other solutions but that would mean reverting to the old style during gpu lockup recovery, and only running the delayed work when !lockup.
>>>> But this meant that the timeout was useless to add. I think the cleanest is keeping the v2 patch, because potentially any waiting code can be called during lockup recovery.
>>> The lockup code itself should never call any waiting code and V2 doesn't seem to handle a couple of cases correctly either.
>>>
>>> How about moving the fence waiting out of the reset code?
>> What cases did I miss then?
>>
>> I'm curious how you want to move the fence waiting out of reset, when there are so many places that could potentially wait, like radeon_ib_get can call radeon_sa_bo_new which can do a wait, or radeon_ring_alloc that can wait on radeon_fence_wait_next, etc.
>
> The IB test itself doesn't need to be protected by the exclusive lock. Only everything between radeon_save_bios_scratch_regs and radeon_ring_restore.

I'm not sure about that, what do you want to do if the ring tests fail? Do you have to retake the exclusive lock?

~Maarten
Christian König
2014-Aug-04 15:04 UTC
[Nouveau] [PATCH 09/19] drm/radeon: handle lockup in delayed work, v2
On 04.08.2014 at 16:58, Maarten Lankhorst wrote:
> On 04-08-14 16:45, Christian König wrote:
>> On 04.08.2014 at 16:40, Maarten Lankhorst wrote:
>>> On 04-08-14 16:37, Christian König wrote:
>>>>> It's a pain to deal with gpu reset.
>>>> Yeah, well that's nothing new.
>>>>
>>>>> I've now tried other solutions but that would mean reverting to the old style during gpu lockup recovery, and only running the delayed work when !lockup.
>>>>> But this meant that the timeout was useless to add. I think the cleanest is keeping the v2 patch, because potentially any waiting code can be called during lockup recovery.
>>>> The lockup code itself should never call any waiting code and V2 doesn't seem to handle a couple of cases correctly either.
>>>>
>>>> How about moving the fence waiting out of the reset code?
>>> What cases did I miss then?
>>>
>>> I'm curious how you want to move the fence waiting out of reset, when there are so many places that could potentially wait, like radeon_ib_get can call radeon_sa_bo_new which can do a wait, or radeon_ring_alloc that can wait on radeon_fence_wait_next, etc.
>> The IB test itself doesn't need to be protected by the exclusive lock. Only everything between radeon_save_bios_scratch_regs and radeon_ring_restore.
> I'm not sure about that, what do you want to do if the ring tests fail? Do you have to retake the exclusive lock?

Just set need_reset again and return -EAGAIN, that should have mostly the same effect as what we are doing right now.

Christian.

>
> ~Maarten
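For readers following the thread, the flow suggested above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the radeon code: `struct mock_dev` and every `mock_*` function are hypothetical stand-ins, and the only names taken from the thread are `need_reset`, `-EAGAIN`, and the two functions mentioned in a comment as the bounds of the locked region.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for driver state; this is NOT struct radeon_device. */
struct mock_dev {
    bool exclusive_held; /* models the exclusive lock being held */
    bool need_reset;     /* flag named in the thread */
    bool ib_test_ok;     /* test knob: does the IB test pass? */
    int  resets_done;    /* how many hardware resets were performed */
};

static void mock_lock(struct mock_dev *d)
{
    assert(!d->exclusive_held);
    d->exclusive_held = true;
}

static void mock_unlock(struct mock_dev *d)
{
    assert(d->exclusive_held);
    d->exclusive_held = false;
}

/* Stands in for the work that must run with the exclusive lock held. */
static void mock_hw_reset(struct mock_dev *d)
{
    assert(d->exclusive_held);
    d->resets_done++;
}

/* Stands in for the IB test; it may wait, so it runs without the lock. */
static int mock_ib_test(struct mock_dev *d)
{
    assert(!d->exclusive_held);
    return d->ib_test_ok ? 0 : -EINVAL;
}

static int mock_gpu_reset(struct mock_dev *d)
{
    mock_lock(d);
    /* In the thread's terms: everything between radeon_save_bios_scratch_regs
     * and radeon_ring_restore would sit inside this locked region. */
    mock_hw_reset(d);
    d->need_reset = false;
    mock_unlock(d);

    /* The IB test runs outside the lock. On failure the lock is not retaken
     * here; the flag is set again and the caller is asked to retry. */
    if (mock_ib_test(d) != 0) {
        d->need_reset = true;
        return -EAGAIN;
    }
    return 0;
}
```

In this sketch a failed IB test leaves the device flagged for another reset and hands control back to the caller, so the function never has to retake the lock itself. Whether the real driver can be structured this way is exactly what the thread is debating.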