Ben Skeggs
2015-Sep-03 07:19 UTC
[Nouveau] [PATCH 2/3] ltc/gf100: add flush/invalidate functions
On 3 September 2015 at 17:13, Alexandre Courbot <gnurou at gmail.com> wrote:
> On Thu, Sep 3, 2015 at 4:09 PM, Ben Skeggs <skeggsb at gmail.com> wrote:
>> On 3 September 2015 at 16:42, Alexandre Courbot <acourbot at nvidia.com> wrote:
>>> Allow clients to manually flush and invalidate L2. This will be useful
>>> for Tegra systems for which we want to write instmem using the CPU.
>>>
>>> Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
>>> ---
>>>  drm/nouveau/include/nvkm/subdev/ltc.h |  1 +
>>>  drm/nouveau/nvkm/subdev/ltc/gf100.c   | 36 +++++++++++++++++++++++++++++++++++
>>>  drm/nouveau/nvkm/subdev/ltc/gk104.c   |  2 ++
>>>  drm/nouveau/nvkm/subdev/ltc/gm107.c   |  2 ++
>>>  drm/nouveau/nvkm/subdev/ltc/priv.h    |  2 ++
>>>  5 files changed, 43 insertions(+)
>>>
>>> diff --git a/drm/nouveau/include/nvkm/subdev/ltc.h b/drm/nouveau/include/nvkm/subdev/ltc.h
>>> index 5464fcf482f1..3d4dbbf9aab3 100644
>>> --- a/drm/nouveau/include/nvkm/subdev/ltc.h
>>> +++ b/drm/nouveau/include/nvkm/subdev/ltc.h
>>> @@ -35,5 +35,6 @@ void nvkm_ltc_flush(struct nvkm_ltc *);
>>>
>>>  int gf100_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **);
>>>  int gk104_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **);
>>> +int gk20a_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **);
>>>  int gm107_ltc_new(struct nvkm_device *, int, struct nvkm_ltc **);
>>>  #endif
>>> diff --git a/drm/nouveau/nvkm/subdev/ltc/gf100.c b/drm/nouveau/nvkm/subdev/ltc/gf100.c
>>> index 45ac765b753e..42ae77533865 100644
>>> --- a/drm/nouveau/nvkm/subdev/ltc/gf100.c
>>> +++ b/drm/nouveau/nvkm/subdev/ltc/gf100.c
>>> @@ -122,6 +122,40 @@ gf100_ltc_intr(struct nvkm_ltc *ltc)
>>>  	}
>>>  }
>>>
>>> +void
>>> +gf100_ltc_invalidate(struct nvkm_ltc *ltc)
>>> +{
>>> +	struct nvkm_device *device = ltc->subdev.device;
>>> +	s64 taken;
>>> +
>>> +	nvkm_wr32(device, 0x70004, 0x00000001);
>>> +	if ((taken = nvkm_msec(device, 2000,
>> I don't suppose you have access to information on more realistic
>> timeouts? I'd like to improve all the potential 2s timeout values
>> across the driver in general, to avoid things hanging for a long time
>> when the GPU craps itself :)
>
> The longest values I have ever seen are ~270ms (only during driver
> initialization, and at low clock speeds), but anyway this should never
> ever timeout. If it does, then we will have bigger issues than 2
> second timeouts. :)

Indeed it shouldn't, but when things get horribly messed up (as
happens occasionally :P) it's annoying and can potentially stall the
whole system until things are sorted out. The patch is fine as-is; I
was mostly wondering whether NVIDIA has data on the maximum time this
should ever take that we could use instead.

> Btw, wouldn't it be worth restoring macros like the old nvkm_wait()
> to avoid having to write this verbose code every time we need to wait
> on a register value?

I'm fine doing that; the reason I didn't in the first place was that
I'd planned on not having the constant 2s timeout everywhere (as
already mentioned) :)
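For readers following along outside the kernel tree, the pattern in the quoted hunk (write a trigger register, then poll it until the hardware deasserts the bit, giving up after some bound) can be sketched in plain user-space C. Everything here is a hypothetical stand-in for illustration: the simulated register file, `rd32_sim`, and `ltc_invalidate_sketch` are not the nvkm API, which does MMIO through struct nvkm_device and counts elapsed time rather than poll iterations.

```c
#include <stdint.h>
#include <assert.h>

/* Hypothetical stand-in for the GPU's register file; the real driver
 * uses nvkm_wr32()/nvkm_rd32() for MMIO access. */
static uint32_t regs[0x80000 / 4];

static void wr32(uint32_t addr, uint32_t val) { regs[addr / 4] = val; }
static uint32_t rd32(uint32_t addr) { return regs[addr / 4]; }

/* Simulate hardware that clears the trigger bit after a few reads. */
static int pending_reads;

static uint32_t rd32_sim(uint32_t addr)
{
    if (addr == 0x70004 && pending_reads > 0 && --pending_reads == 0)
        regs[addr / 4] = 0;
    return rd32(addr);
}

/* Poll-until-clear in the style of the patch: kick the L2 invalidate,
 * then spin until the hardware deasserts the bit or the bound is hit.
 * Returns the number of busy polls, or -1 on timeout. */
static long ltc_invalidate_sketch(int max_polls)
{
    long taken;

    wr32(0x70004, 0x00000001);          /* trigger L2 invalidate */
    for (taken = 0; taken < max_polls; taken++) {
        if (!(rd32_sim(0x70004) & 0x00000001))
            return taken;               /* busy polls before completion */
    }
    return -1;                          /* timed out */
}
```

A caller would typically log a warning on the negative return, which mirrors the driver's timeout path being discussed here.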
Alexandre Courbot
2015-Sep-03 08:02 UTC
[Nouveau] [PATCH 2/3] ltc/gf100: add flush/invalidate functions
On Thu, Sep 3, 2015 at 4:19 PM, Ben Skeggs <skeggsb at gmail.com> wrote:
> On 3 September 2015 at 17:13, Alexandre Courbot <gnurou at gmail.com> wrote:
>> On Thu, Sep 3, 2015 at 4:09 PM, Ben Skeggs <skeggsb at gmail.com> wrote:
>>> On 3 September 2015 at 16:42, Alexandre Courbot <acourbot at nvidia.com> wrote:
>>>> [patch quote snipped; quoted in full in the original posting upthread]
>>> I don't suppose you have access to information on more realistic
>>> timeouts? I'd like to improve all the potential 2s timeout values
>>> across the driver in general, to avoid things hanging for a long time
>>> when the GPU craps itself :)
>>
>> The longest values I have ever seen are ~270ms (only during driver
>> initialization, and at low clock speeds), but anyway this should never
>> ever timeout. If it does, then we will have bigger issues than 2
>> second timeouts. :)
> Indeed it shouldn't, but when things get horribly messed up (as
> happens occasionally :P) it's annoying and can potentially stall the
> whole system until things are sorted out. The patch is fine as-is; I
> was mostly wondering whether NVIDIA has data on the maximum time this
> should ever take that we could use instead.

In nvgpu we do 200 retries with a delay of 5us in between, so we wait
at most for 1ms. Yeah, that sounds like a better measure than 2s. :)

Btw, when I wrote 270ms, I meant 270us.

1ms would be nice in that case. Let me test with it and resend this
patch with better values.

>> Btw, wouldn't it be worth restoring macros like the old nvkm_wait()
>> to avoid having to write this verbose code every time we need to wait
>> on a register value?
> I'm fine doing that; the reason I didn't in the first place was that
> I'd planned on not having the constant 2s timeout everywhere (as
> already mentioned) :)

We can just add one extra argument that specifies how long we are
willing to wait. :P
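The nvgpu scheme described here (200 retries spaced 5 us apart, so roughly a 1 ms worst case) can be sketched as a generic retry-count-based wait. This is a user-space illustration under stated assumptions: `rd32_fn`, `wait_reg_bounded`, and the simulated register are all made-up names, and `usleep()` stands in for the kernel's delay primitives.

```c
#include <stdint.h>
#include <unistd.h>   /* usleep() stands in for the kernel's udelay() */

/* Placeholder register-read callback; a real driver does an MMIO read. */
typedef uint32_t (*rd32_fn)(uint32_t addr);

/* Retry-count-based wait: poll until (rd32(addr) & mask) == value,
 * giving up after `retries` attempts spaced `delay_us` apart.
 * 200 retries x 5us bounds the total wait at roughly 1 ms.
 * Returns the number of polls needed, or -1 on timeout. */
static int wait_reg_bounded(rd32_fn rd32, uint32_t addr, uint32_t mask,
                            uint32_t value, int retries, int delay_us)
{
    int i;

    for (i = 0; i < retries; i++) {
        if ((rd32(addr) & mask) == value)
            return i;
        usleep(delay_us);
    }
    return -1;          /* condition never met within the bound */
}

/* Simulated register that reads as "busy" for the first few polls. */
static int sim_busy_polls;
static uint32_t sim_rd32(uint32_t addr)
{
    (void)addr;
    return sim_busy_polls-- > 0 ? 0x1 : 0x0;
}
```

The appeal of counting retries rather than wall-clock time is that the worst case is explicit in the call site: `retries * delay_us` microseconds, with no hidden 2 s constant.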
Ben Skeggs
2015-Sep-03 08:05 UTC
[Nouveau] [PATCH 2/3] ltc/gf100: add flush/invalidate functions
On 3 September 2015 at 18:02, Alexandre Courbot <gnurou at gmail.com> wrote:
> On Thu, Sep 3, 2015 at 4:19 PM, Ben Skeggs <skeggsb at gmail.com> wrote:
>> On 3 September 2015 at 17:13, Alexandre Courbot <gnurou at gmail.com> wrote:
>>> On Thu, Sep 3, 2015 at 4:09 PM, Ben Skeggs <skeggsb at gmail.com> wrote:
>>>> On 3 September 2015 at 16:42, Alexandre Courbot <acourbot at nvidia.com> wrote:
>>>>> [patch quote snipped; quoted in full in the original posting upthread]
>>>> I don't suppose you have access to information on more realistic
>>>> timeouts? I'd like to improve all the potential 2s timeout values
>>>> across the driver in general, to avoid things hanging for a long time
>>>> when the GPU craps itself :)
>>>
>>> The longest values I have ever seen are ~270ms (only during driver
>>> initialization, and at low clock speeds), but anyway this should never
>>> ever timeout. If it does, then we will have bigger issues than 2
>>> second timeouts. :)
>> Indeed it shouldn't, but when things get horribly messed up (as
>> happens occasionally :P) it's annoying and can potentially stall the
>> whole system until things are sorted out. The patch is fine as-is; I
>> was mostly wondering whether NVIDIA has data on the maximum time this
>> should ever take that we could use instead.
>
> In nvgpu we do 200 retries with a delay of 5us in between, so we wait
> at most for 1ms. Yeah, that sounds like a better measure than 2s. :)
>
> Btw, when I wrote 270ms, I meant 270us.
>
> 1ms would be nice in that case. Let me test with it and resend this
> patch with better values.
>
>>> Btw, wouldn't it be worth restoring macros like the old nvkm_wait()
>>> to avoid having to write this verbose code every time we need to wait
>>> on a register value?
>> I'm fine doing that; the reason I didn't in the first place was that
>> I'd planned on not having the constant 2s timeout everywhere (as
>> already mentioned) :)
>
> We can just add one extra argument that specifies how long we are
> willing to wait. :P

Works for me :)