Brock York
2018-Jan-16 00:03 UTC
[Nouveau] [RFC 0/4] Implement full clockgating for Kepler1 and 2
Hello Paul

I have a GTX 480 (GF100 I believe) at home in an old machine. Is the rest of
the Fermi patch set available somewhere for me to test?

Thank you
Regards Brock

On 16 Jan. 2018 9:08 am, "Lyude Paul" <lyude at redhat.com> wrote:

It's here! After a lot of investigation, rewrites, and traces, I present
the patch series to implement all known levels of clockgating for
Kepler1 and Kepler2 GPUs.

Starting with Fermi GPUs (this is probably present on earlier GPUs as
well, but with a far less easy to manage interface), NVIDIA added two
clockgating levels that are handled mostly in firmware (with the
exception, of course, of the driver initially programming all of the
register values containing engine delays and that stuff):
- CG_CTRL - Main register for enabling/disabling clockgating for
  engines and hw blocks
- BLCG - "Block-level clockgating", a deeper level of clockgating
Starting with Kepler2, NVIDIA also introduced:
- SLCG - "??? clockgating", an even deeper level of clockgating

Originally this patchset was going to include work for making this work
on Fermi. However, on closer investigation it seems that Fermi has one
pretty big difference in its requirements that we don't entirely
understand yet. On Fermi the CG_CTRL register for the gr needs to be
adjusted during reclocking, while Kepler and later generations need no
such adjustments. Since this requires more research, the current plan is
to leave Fermi out of this patchset, and then just rework the code at a
later point in time to support Fermi once we understand more.

For the time being, however, we put as much code as we're sure Fermi will
be sharing into gf100 files for the future.

For the time being, this patchset supports every GPU in the Kepler1 and
Kepler2 families. The main kernel config option to enable this is
    config=NvPmEnableGating=<level>
where <level> is the deepest level of powersaving to enable. The levels
are:
    0. NOCG
    1. CG (just CG_CTRL)
    2. BLCG
    3. SLCG

Additionally, we leave a couple of small TODO comments to mark spots
where we'll eventually need to add code for the final level of power
saving, which we have yet to implement or understand: ELPG (engine-level
powergating).

Here are some very lazily done benchmarks on how much power this saves as
well. It should be noted that chances are these patches save a lot more
power than you see here, since clockgating works best when the GPU is
under load. I have a feeling this will be especially true with SLCG.

GK104 stats:
  Idle (1 4k display @ fbcon):
    NOCG:
      07: 20.53W
      0a: 44.12W
      0e: 61.23W
      0f: 62.23W
    CG:
      07: 18.95W-19.02W
      0a: 34.95W-37.48W
      0e: 52.70W-54.32W
      0f: 53.53W-55.68W
    BLCG:
      07: 18.64W-19.02W
      0a: 32.68W-34.78W
      0e: 49.14W-50.51W
      0f: 49.75W-51.73W
GK110 stats:
  Idle (1 4k display @ fbcon):
    NOCG:
      07: 25.68W
      0a: 36.16W
      0d: 64.71W
      0f: 64.99W
    CG:
      07: 25.68W
      0a: 35.58W
      0d: 61.54W-61.74W
      0f: 61.93W-62.12W
    BLCG:
      07: 25W-25.2W
      0a: 33.85W-34.04W
      0d: 59.25W-59.44W
      0f: 59.35W-59.54W
    SLCG:
      07: 25W-25.2W
      0a: 33.85W-33.95W (spikes to 34.04W briefly every now and then)
      0d: 59.15W-59.35W
      0f: 59.35W-59.54W

For the time being, this will be disabled by default in the kernel as we
wait to get more widespread testing with this patchset. If you're
willing to put up with the potential for instability, please feel free to
try this patchset and let us know how well it works for you!

Lyude Paul (4):
  drm/nouveau: Add support for basic clockgating on Kepler1
  drm/nouveau: Add support for BLCG clockgating for Kepler1
  drm/nouveau: Add BLCG clockgating for Kepler2
  drm/nouveau: Add SLCG clockgating for Kepler2

 drivers/gpu/drm/nouveau/include/nvkm/subdev/fb.h   |   1 +
 .../gpu/drm/nouveau/include/nvkm/subdev/therm.h    |  25 +++
 drivers/gpu/drm/nouveau/nvkm/engine/device/base.c  |  25 +--
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gf100.h     |   1 +
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gk104.c     | 215 ++++++++++++++++++++-
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gk104.h     |  55 ++++++
 drivers/gpu/drm/nouveau/nvkm/engine/gr/gk110.c     | 176 +++++++++++++++++
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/Kbuild      |   1 +
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c     |   6 +
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.c     |  53 +++++
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.h     |  35 ++++
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk110.c     |  77 ++++++++
 drivers/gpu/drm/nouveau/nvkm/subdev/fb/priv.h      |   3 +
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild   |   2 +
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/base.c   |  84 +++++++-
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c  |  77 ++++++++
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h  |  35 ++++
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf119.c  |   8 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c  | 136 +++++++++++++
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h  |  56 ++++++
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gt215.c  |   2 +-
 drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h   |  23 ++-
 22 files changed, 1069 insertions(+), 27 deletions(-)
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/engine/gr/gk104.h
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk104.h
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/fb/gk110.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gf100.h
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.c
 create mode 100644 drivers/gpu/drm/nouveau/nvkm/subdev/therm/gk104.h

-- 
2.14.3
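
The NvPmEnableGating levels described in the cover letter can be modelled in a few lines. This is only a rough Python sketch, not the driver's code: `GatingLevel`, `parse_gating_option` and `enabled_levels` are hypothetical names, the comma-separated option string is an assumption about how the text after `config=` is laid out, and clamping the request to what the chipset supports is likewise an assumption (SLCG is only added for Kepler2 in this series).

```python
from enum import IntEnum


class GatingLevel(IntEnum):
    """Levels listed in the cover letter; a higher value is a deeper level."""
    NOCG = 0  # clockgating disabled
    CG = 1    # just CG_CTRL
    BLCG = 2  # block-level clockgating
    SLCG = 3  # deepest level covered by this series


def parse_gating_option(config: str) -> GatingLevel:
    """Read the level from the text after 'config=', e.g. 'NvPmEnableGating=2'.

    Hypothetical helper. Assumes comma-separated key=value entries and
    falls back to NOCG, since gating is described as disabled by default.
    """
    for entry in config.split(","):
        key, sep, value = entry.partition("=")
        if sep and key.strip() == "NvPmEnableGating":
            return GatingLevel(int(value))
    return GatingLevel.NOCG


def enabled_levels(requested: GatingLevel, deepest_supported: GatingLevel) -> list:
    """Levels that end up active, assuming each level implies the shallower ones.

    Clamping to deepest_supported is an assumption made for illustration.
    """
    limit = min(requested, deepest_supported)
    return [level for level in GatingLevel if GatingLevel.NOCG < level <= limit]


if __name__ == "__main__":
    requested = parse_gating_option("NvPmEnableGating=3")
    # Pretend the chipset only goes as deep as BLCG.
    print([level.name for level in enabled_levels(requested, GatingLevel.BLCG)])
```

The point of treating the value as "deepest level" is that a single integer is enough to describe the whole configuration, so a request for 3 on hardware that stops at BLCG would simply behave like 2 under the clamping assumption above.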
Lyude Paul
2018-Jan-16 03:34 UTC
[Nouveau] [RFC 0/4] Implement full clockgating for Kepler1 and 2
Unfortunately no. In order to support reclocking on Fermi with powergating
(which is important, since you can't test clockgating very well without it),
we are going to need to do some more investigation. Unlike Kepler and later
generations, Fermi seems to require some extra maintenance of its clockgating
registers (CG_CTRL) during reclocking operations that we don't quite
understand yet.

We'll figure it out, but I'd like to get initial support for Kepler out first,
since I'm confident this series is pretty solid and it would be good to get
early feedback on it. Plus, Fermi doesn't have working reclocking in nouveau
just yet. I am planning on eventually adding Fermi clockgating support when
reclocking gets fixed!

On Tue, 2018-01-16 at 11:03 +1100, Brock York wrote:
> Hello Paul
>
> I have a GTX 480 (GF100 I believe) at home in an old machine. Is the rest of
> the Fermi patch set available somewhere for me to test?
>
> Thank you
> Regards Brock

-- 
Cheers,
Lyude Paul
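
For anyone comparing their own readings against the cover letter, the GK104 idle figures can be turned into savings relative to NOCG. The sketch below is only illustrative: the dictionary copies the numbers from the cover letter (single readings are stored as a range with equal ends), the keys 07/0a/0e/0f are the labels used there, and `savings_watts` and `savings_percent` are hypothetical helper names.

```python
# GK104 idle power draw in watts, copied from the cover letter.
# Each entry is (low, high); single readings repeat the same value.
GK104_IDLE_WATTS = {
    "NOCG": {"07": (20.53, 20.53), "0a": (44.12, 44.12),
             "0e": (61.23, 61.23), "0f": (62.23, 62.23)},
    "CG":   {"07": (18.95, 19.02), "0a": (34.95, 37.48),
             "0e": (52.70, 54.32), "0f": (53.53, 55.68)},
    "BLCG": {"07": (18.64, 19.02), "0a": (32.68, 34.78),
             "0e": (49.14, 50.51), "0f": (49.75, 51.73)},
}


def savings_watts(baseline, gated):
    """Return (smallest, largest) saving in watts between two (low, high) ranges."""
    base_low, base_high = baseline
    gated_low, gated_high = gated
    return (base_low - gated_high, base_high - gated_low)


def savings_percent(baseline, gated):
    """Return (smallest, largest) saving as a percentage of the baseline draw."""
    base_low, base_high = baseline
    gated_low, gated_high = gated
    return (100.0 * (1.0 - gated_high / base_low),
            100.0 * (1.0 - gated_low / base_high))


if __name__ == "__main__":
    for label, baseline in GK104_IDLE_WATTS["NOCG"].items():
        gated = GK104_IDLE_WATTS["BLCG"][label]
        low_w, high_w = savings_watts(baseline, gated)
        low_p, high_p = savings_percent(baseline, gated)
        print(f"{label}: {low_w:.2f}W-{high_w:.2f}W ({low_p:.1f}%-{high_p:.1f}%)")
```

At 0f, for example, BLCG comes out roughly 10.5W to 12.5W below NOCG, or about 17% to 20%. These are idle numbers only, so they say nothing about behaviour under load.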