Following a series of patches that implement memory reclocking for NVA3/5/8 with DDR2, DDR3 and GDDR3 on board. I tested these patches on 6 different graphics cards, but I expect reclocking now to work on many more. Testers can pick up these patches and test it by enabling pstate (nouveau.pstate=1). They should then be able to change clocks by writing to /sys/class/drm/card0/device/pstate. Correct values can be read from this sysfs node. Note that this API is likely to change in the future, hence it's hidden behind this kernel parameter. What works: - Reclocking on many DDR2, DDR3 and GDDR3 cards, both up and down - Doing so "invisibly", eg no black screens What still needs fixing: - GDDR5. There's non-reclocking related issues with GDDR5 cards in this family. Sorry, Michael.. - Link training is only done for a single partition, because I haven't seen cards that require training on multiple partitions. If you have one, contact me. - Stability. Sometimes reclocking (esp. in a tight loop) fails and the machine hangs. Reboot and you're fine. Just up'ing the clocks before gaming and bringing them back afterwards should not cause serious issues and once the clocks are changed it's as stable as it was before. - DDR3 cards that require training will give a black screen on the first reclock. When stability improves, we might hide this black screen in the boot process, but not until we're absolutely certain that this can not lead to hangs or crashes on boot. - Some DDR2 cards give corruption in the highest performance level, which dis- appears when clocking back. I suspect this is caused by insufficient bandwidth between the memory and the graphics processor, but I don't have an answer as to why bandwidth is insufficient. - Automatic frequency scaling. Not before the other issues have been resolved. Please review, and happy testing! Roy
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/Makefile | 1 + drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c | 117 +++++++++++++++++++++++++ drivers/gpu/drm/nouveau/core/subdev/fb/priv.h | 1 + 3 files changed, 119 insertions(+) create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile index 12c24c8..d5f5aa9 100644 --- a/drivers/gpu/drm/nouveau/Makefile +++ b/drivers/gpu/drm/nouveau/Makefile @@ -129,6 +129,7 @@ nouveau-y += core/subdev/fb/ramgk20a.o nouveau-y += core/subdev/fb/ramgm107.o nouveau-y += core/subdev/fb/sddr2.o nouveau-y += core/subdev/fb/sddr3.o +nouveau-y += core/subdev/fb/gddr3.o nouveau-y += core/subdev/fb/gddr5.o nouveau-y += core/subdev/fuse/base.o nouveau-y += core/subdev/fuse/g80.o diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c new file mode 100644 index 0000000..d85a25d --- /dev/null +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c @@ -0,0 +1,117 @@ +/* + * Copyright 2013 Red Hat Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: Ben Skeggs <bskeggs at redhat.com> + * Roy Spliet <rspliet at eclipso.eu> + */ + +#include <subdev/bios.h> +#include "priv.h" + +struct ramxlat { + int id; + u8 enc; +}; + +static inline int +ramxlat(const struct ramxlat *xlat, int id) +{ + while (xlat->id >= 0) { + if (xlat->id == id) + return xlat->enc; + xlat++; + } + return -EINVAL; +} + +static const struct ramxlat +ramgddr3_cl_lo[] = { + { 7, 7 }, { 8, 0 }, { 9, 1 }, { 10, 2 }, { 11, 3 }, + /* the below are mentioned in some, but not all, gddr3 docs */ + { 12, 4 }, { 13, 5 }, { 14, 6 }, + /* XXX: Per Samsung docs, are these used? They overlap with Qimonda */ + /* { 4, 4 }, { 5, 5 }, { 6, 6 }, { 12, 8 }, { 13, 9 }, { 14, 10 }, + * { 15, 11 }, */ + { -1 } +}; + +static const struct ramxlat +ramgddr3_cl_hi[] = { + { 10, 2 }, { 11, 3 }, { 12, 4 }, { 13, 5 }, { 14, 6 }, { 15, 7 }, + { 16, 0 }, { 17, 1 }, + { -1 } +}; + +static const struct ramxlat +ramgddr3_wr_lo[] = { + { 5, 2 }, { 7, 4 }, { 8, 5 }, { 9, 6 }, { 10, 7 }, + { 11, 0 }, + /* the below are mentioned in some, but not all, gddr3 docs */ + { 4, 1 }, { 6, 3 }, { 12, 1 }, { 13 , 2 }, + { -1 } +}; + +int +nouveau_gddr3_calc(struct nouveau_ram *ram) +{ + int CL, WR, CWL, DLL = 0, ODT = 0, hi; + + switch (ram->next->bios.timing_ver) { + case 0x10: + CWL = ram->next->bios.timing_10_CWL; + CL = ram->next->bios.timing_10_CL; + WR = ram->next->bios.timing_10_WR; + DLL = !ram->next->bios.ramcfg_10_DLLoff; + ODT = ram->next->bios.timing_10_ODT; + break; + case 0x20: + CWL = (ram->next->bios.timing[1] & 0x00000f80) >> 7; + CL = (ram->next->bios.timing[1] & 0x0000001f) >> 0; + WR = (ram->next->bios.timing[2] & 0x007f0000) >> 16; + /* XXX: Get these values from the VBIOS instead */ + DLL = !(ram->mr[1] & 0x1); + ODT = (ram->mr[1] & 0x004) >> 2 | + (ram->mr[1] & 0x040) >> 5 | + (ram->mr[1] & 0x200) >> 7; + break; + default: + return -ENOSYS; + } + + hi = ram->mr[2] & 0x1; + CL = ramxlat(hi ? ramgddr3_cl_hi : ramgddr3_cl_lo, CL); + WR = ramxlat(ramgddr3_wr_lo, WR); + if (CL < 0 || CWL < 1 || CWL > 7 || WR < 0) + return -EINVAL; + + ram->mr[0] &= ~0xf74; + ram->mr[0] |= (CWL & 0x07) << 9; + ram->mr[0] |= (CL & 0x07) << 4; + ram->mr[0] |= (CL & 0x08) >> 1; + + ram->mr[1] &= ~0x3fc; + ram->mr[1] |= (ODT & 0x03) << 2; + ram->mr[1] |= (ODT & 0x03) << 8; + ram->mr[1] |= (WR & 0x03) << 4; + ram->mr[1] |= (WR & 0x04) << 5; + ram->mr[1] |= !DLL << 6; + return 0; +} diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h index 60322e9..283863f 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h @@ -37,6 +37,7 @@ extern struct nouveau_oclass gm107_ram_oclass; int nouveau_sddr2_calc(struct nouveau_ram *ram); int nouveau_sddr3_calc(struct nouveau_ram *ram); +int nouveau_gddr3_calc(struct nouveau_ram *ram); int nouveau_gddr5_calc(struct nouveau_ram *ram, bool nuts); #define nouveau_fb_create(p,e,c,d) \ -- 1.9.3
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/include/subdev/pwr.h | 2 + drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h | 16 ++ drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 318 +++++++++++++++++++-- .../gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc | 111 +++++++ drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h | 5 + drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c | 35 ++- 6 files changed, 458 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h b/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h index bf3d1f6..f2427bf 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h @@ -48,6 +48,8 @@ void nouveau_memx_wait(struct nouveau_memx *, u32 addr, u32 mask, u32 data, u32 nsec); void nouveau_memx_nsec(struct nouveau_memx *, u32 nsec); void nouveau_memx_wait_vblank(struct nouveau_memx *); +void nouveau_memx_train(struct nouveau_memx *); +int nouveau_memx_train_result(struct nouveau_pwr *, u32 *, int); void nouveau_memx_block(struct nouveau_memx *); void nouveau_memx_unblock(struct nouveau_memx *); diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h b/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h index d1fbbe4..0ac7256 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h @@ -141,6 +141,20 @@ ramfuc_wait_vblank(struct ramfuc *ram) } static inline void +ramfuc_train(struct ramfuc *ram) +{ + nouveau_memx_train(ram->memx); +} + +static inline int +ramfuc_train_result(struct nouveau_fb *pfb, u32 *result, u32 rsize) +{ + struct nouveau_pwr *ppwr = nouveau_pwr(pfb); + + return nouveau_memx_train_result(ppwr, result, rsize); +} + +static inline void ramfuc_block(struct ramfuc *ram) { nouveau_memx_block(ram->memx); @@ -162,6 +176,8 @@ ramfuc_unblock(struct ramfuc *ram) #define ram_wait(s,r,m,d,n) ramfuc_wait(&(s)->base, (r), (m), (d), (n)) #define ram_nsec(s,n) ramfuc_nsec(&(s)->base, (n)) #define ram_wait_vblank(s) ramfuc_wait_vblank(&(s)->base) +#define ram_train(s) ramfuc_train(&(s)->base) +#define ram_train_result(s,r,l) ramfuc_train_result((s), (r), (l)) #define ram_block(s) ramfuc_block(&(s)->base) #define ram_unblock(s) ramfuc_unblock(&(s)->base) diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 3601dec..719a09b 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -20,17 +20,23 @@ * OTHER DEALINGS IN THE SOFTWARE. * * Authors: Ben Skeggs + * Roy Spliet <rspliet at eclipso.eu> */ #include <subdev/bios.h> #include <subdev/bios/bit.h> #include <subdev/bios/pll.h> #include <subdev/bios/rammap.h> +#include <subdev/bios/M0205.h> #include <subdev/bios/timing.h> #include <subdev/clock/nva3.h> #include <subdev/clock/pll.h> +#include <subdev/timer.h> + +#include <engine/fifo.h> + #include <core/option.h> #include "ramfuc.h" @@ -39,11 +45,14 @@ struct nva3_ramfuc { struct ramfuc base; + struct ramfuc_reg r_0x001610; + struct ramfuc_reg r_0x001700; struct ramfuc_reg r_0x004000; struct ramfuc_reg r_0x004004; struct ramfuc_reg r_0x004018; struct ramfuc_reg r_0x004128; struct ramfuc_reg r_0x004168; + struct ramfuc_reg r_0x100080; struct ramfuc_reg r_0x100200; struct ramfuc_reg r_0x100210; struct ramfuc_reg r_0x100220[9]; @@ -56,6 +65,7 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x100714; struct ramfuc_reg r_0x100718; struct ramfuc_reg r_0x10071c; + struct ramfuc_reg r_0x100720; struct ramfuc_reg r_0x100760; struct ramfuc_reg r_0x1007a0; struct ramfuc_reg r_0x1007e0; @@ -63,15 +73,276 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x1110e0; struct ramfuc_reg r_0x111100; struct ramfuc_reg r_0x111104; + struct ramfuc_reg r_0x1111e0; + struct ramfuc_reg r_0x111400; struct ramfuc_reg r_0x611200; struct ramfuc_reg r_mr[4]; }; +struct nva3_ltrain { + enum { + NVA3_TRAIN_UNKNOWN, + NVA3_TRAIN_UNSUPPORTED, + NVA3_TRAIN_ONCE, + NVA3_TRAIN_EXEC, + NVA3_TRAIN_DONE + } state; + u32 r_100720; + u32 r_1111e0; + u32 r_111400; + struct nouveau_mem *mem; +}; + struct nva3_ram { struct nouveau_ram base; struct nva3_ramfuc fuc; + struct nva3_ltrain ltrain; }; +void +nva3_link_train_calc(u32 *vals, struct nva3_ltrain *train) +{ + int i, lo, hi; + u8 median[8], bins[4] = {0, 0, 0, 0}, bin = 0, qty = 0; + + for (i = 0; i < 8; i++) { + for (lo = 0; lo < 0x40; lo++) { + if (!(vals[lo] & 0x80000000)) + continue; + if (vals[lo] & (0x101 << i)) + break; + } + + if (lo == 0x40) + return; + + for (hi = lo + 1; hi < 0x40; hi++) { + if (!(vals[lo] & 0x80000000)) + continue; + if (!(vals[hi] & (0x101 << i))) { + hi--; + break; + } + } + + median[i] = ((hi - lo) >> 1) + lo; + bins[(median[i] & 0xf0) >> 4]++; + median[i] += 0x30; + } + + /* Find the best value for 0x1111e0 */ + for (i = 0; i < 4; i++) { + if (bins[i] > qty) { + bin = i + 3; + qty = bins[i]; + } + } + + train->r_100720 = 0; + for (i = 0; i < 8; i++) { + median[i] = max(median[i], (u8) (bin << 4)); + median[i] = min(median[i], (u8) ((bin << 4) | 0xf)); + + train->r_100720 |= ((median[i] & 0x0f) << (i << 2)); + } + + train->r_1111e0 = 0x02000000 | (bin * 0x101); + train->r_111400 = 0x0; +} + +/* + * Link training for (at least) DDR3 + */ +int +nva3_link_train(struct nouveau_fb *pfb) +{ + struct nouveau_bios *bios = nouveau_bios(pfb); + struct nva3_ram *ram = (void *)pfb->ram; + struct nouveau_clock *clk = nouveau_clock(pfb); + struct nva3_ltrain *train = &ram->ltrain; + struct nouveau_device *device = nv_device(pfb); + struct nva3_ramfuc *fuc = &ram->fuc; + u32 *result, r1700; + int ret, i; + struct nvbios_M0205T M0205T = { 0 }; + u8 ver, hdr, cnt, len, snr, ssz; + unsigned int clk_current; + unsigned long flags; + unsigned long *f = &flags; + + if (nouveau_boolopt(device->cfgopt, "NvMemExec", true) != true) + return -ENOSYS; + + /* XXX: Multiple partitions? */ + result = kmalloc(64 * sizeof(u32), GFP_KERNEL); + if (!result) + return -ENOMEM; + + train->state = NVA3_TRAIN_EXEC; + + /* Clock speeds for training and back */ + nvbios_M0205Tp(bios, &ver, &hdr, &cnt, &len, &snr, &ssz, &M0205T); + if (M0205T.freq == 0) + return -ENOENT; + + clk_current = clk->read(clk, nv_clk_src_mem); + + ret = nva3_clock_pre(clk, f); + if (ret) + goto out; + + /* First: clock up/down */ + ret = ram->base.calc(pfb, (u32) M0205T.freq * 1000); + if (ret) + goto out; + + /* Do this *after* calc, eliminates write in script */ + nv_wr32(pfb, 0x111400, 0x00000000); + /* XXX: Magic writes that improve train reliability? */ + nv_mask(pfb, 0x100674, 0x0000ffff, 0x00000000); + nv_mask(pfb, 0x1005e4, 0x0000ffff, 0x00000000); + nv_mask(pfb, 0x100b0c, 0x000000ff, 0x00000000); + nv_wr32(pfb, 0x100c04, 0x00000400); + + /* Now the training script */ + r1700 = ram_rd32(fuc, 0x001700); + + ram_mask(fuc, 0x100200, 0x00000800, 0x00000000); + ram_wr32(fuc, 0x611200, 0x3300); + ram_wait_vblank(fuc); + ram_wait(fuc, 0x611200, 0x00000003, 0x00000000, 500000); + ram_mask(fuc, 0x001610, 0x00000083, 0x00000003); + ram_mask(fuc, 0x100080, 0x00000020, 0x00000000); + ram_mask(fuc, 0x10f804, 0x80000000, 0x00000000); + ram_wr32(fuc, 0x001700, 0x00000000); + + ram_train(fuc); + + /* Reset */ + ram_mask(fuc, 0x10f804, 0x80000000, 0x80000000); + ram_wr32(fuc, 0x10053c, 0x0); + ram_wr32(fuc, 0x100720, train->r_100720); + ram_wr32(fuc, 0x1111e0, train->r_1111e0); + ram_wr32(fuc, 0x111400, train->r_111400); + ram_nuke(fuc, 0x100080); + ram_mask(fuc, 0x100080, 0x00000020, 0x00000020); + ram_nsec(fuc, 1000); + + ram_wr32(fuc, 0x001700, r1700); + ram_mask(fuc, 0x001610, 0x00000083, 0x00000080); + ram_wr32(fuc, 0x611200, 0x3330); + ram_mask(fuc, 0x100200, 0x00000800, 0x00000800); + + ram_exec(fuc, true); + + ram->base.calc(pfb, clk_current); + ram_exec(fuc, true); + + /* Post-processing, avoids flicker */ + nv_mask(pfb, 0x616308, 0x10, 0x10); + nv_mask(pfb, 0x616b08, 0x10, 0x10); + + nva3_clock_post(clk, f); + + ram_train_result(pfb, result, 64); + for (i = 0; i < 64; i++) + nv_debug(pfb, "Train: %08x", result[i]); + nva3_link_train_calc(result, train); + + nv_debug(pfb, "Train: %08x %08x %08x", train->r_100720, + train->r_1111e0, train->r_111400); + + kfree(result); + + train->state = NVA3_TRAIN_DONE; + + return ret; + +out: + if(ret == -EBUSY) + f = NULL; + + train->state = NVA3_TRAIN_UNSUPPORTED; + + nva3_clock_post(clk, f); + return ret; +} + +int +nva3_link_train_init(struct nouveau_fb *pfb) +{ + static const u32 pattern[16] = { + 0xaaaaaaaa, 0xcccccccc, 0xdddddddd, 0xeeeeeeee, + 0x00000000, 0x11111111, 0x44444444, 0xdddddddd, + 0x33333333, 0x55555555, 0x77777777, 0x66666666, + 0x99999999, 0x88888888, 0xeeeeeeee, 0xbbbbbbbb, + }; + struct nouveau_bios *bios = nouveau_bios(pfb); + struct nva3_ram *ram = (void *)pfb->ram; + struct nva3_ltrain *train = &ram->ltrain; + struct nouveau_mem *mem; + struct nvbios_M0205E M0205E; + u8 ver, hdr, cnt, len; + u32 r001700; + int ret, i = 0; + + train->state = NVA3_TRAIN_UNSUPPORTED; + + /* We support type "5" + * XXX: training pattern table appears to be unused for this routine */ + if (!nvbios_M0205Ep(bios, i, &ver, &hdr, &cnt, &len, &M0205E)) + return -ENOENT; + + if (M0205E.type != 5) + return 0; + + train->state = NVA3_TRAIN_ONCE; + + ret = pfb->ram->get(pfb, 0x8000, 0x10000, 0, 0x800, &ram->ltrain.mem); + if (ret) + return ret; + + mem = ram->ltrain.mem; + + nv_wr32(pfb, 0x100538, 0x10000000 | (mem->offset >> 16)); + nv_wr32(pfb, 0x1005a8, 0x0000ffff); + nv_mask(pfb, 0x10f800, 0x00000001, 0x00000001); + + for (i = 0; i < 0x30; i++) { + nv_wr32(pfb, 0x10f8c0, (i << 8) | i); + nv_wr32(pfb, 0x10f900, pattern[i % 16]); + } + + for (i = 0; i < 0x30; i++) { + nv_wr32(pfb, 0x10f8e0, (i << 8) | i); + nv_wr32(pfb, 0x10f920, pattern[i % 16]); + } + + /* And upload the pattern */ + r001700 = nv_rd32(pfb, 0x1700); + nv_wr32(pfb, 0x1700, mem->offset >> 16); + for (i = 0; i < 16; i++) + nv_wr32(pfb, 0x700000 + (i << 2), pattern[i]); + for (i = 0; i < 16; i++) + nv_wr32(pfb, 0x700100 + (i << 2), pattern[i]); + nv_wr32(pfb, 0x1700, r001700); + + train->r_100720 = nv_rd32(pfb, 0x100720); + train->r_1111e0 = nv_rd32(pfb, 0x1111e0); + train->r_111400 = nv_rd32(pfb, 0x111400); + + return 0; +} + +void +nva3_link_train_fini(struct nouveau_fb *pfb) +{ + struct nva3_ram *ram = (void *)pfb->ram; + + if (ram->ltrain.mem) + pfb->ram->put(pfb, &ram->ltrain.mem); +} + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { @@ -90,6 +361,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) next->freq = freq; ram->base.next = next; + if (ram->ltrain.state == NVA3_TRAIN_ONCE) + nva3_link_train(pfb); + /* lookup memory config data relevant to the target frequency */ i = 0; while ((data = nvbios_rammapEp(bios, i++, &ver, &hdr, &cnt, &len, @@ -330,38 +604,24 @@ nva3_ram_init(struct nouveau_object *object) { struct nouveau_fb *pfb = (void *)object->parent; struct nva3_ram *ram = (void *)object; - int ret, i; + int ret; ret = nouveau_ram_init(&ram->base); if (ret) return ret; - /* prepare for ddr link training, and load training patterns */ - switch (ram->base.type) { - case NV_MEM_TYPE_DDR3: { - if (nv_device(pfb)->chipset == 0xa8) { - static const u32 pattern[16] = { - 0xaaaaaaaa, 0xcccccccc, 0xdddddddd, 0xeeeeeeee, - 0x00000000, 0x11111111, 0x44444444, 0xdddddddd, - 0x33333333, 0x55555555, 0x77777777, 0x66666666, - 0x99999999, 0x88888888, 0xeeeeeeee, 0xbbbbbbbb, - }; - - nv_wr32(pfb, 0x100538, 0x10001ff6); /*XXX*/ - nv_wr32(pfb, 0x1005a8, 0x0000ffff); - nv_mask(pfb, 0x10f800, 0x00000001, 0x00000001); - for (i = 0; i < 0x30; i++) { - nv_wr32(pfb, 0x10f8c0, (i << 8) | i); - nv_wr32(pfb, 0x10f8e0, (i << 8) | i); - nv_wr32(pfb, 0x10f900, pattern[i % 16]); - nv_wr32(pfb, 0x10f920, pattern[i % 16]); - } - } - } - break; - default: - break; - } + nva3_link_train_init(pfb); + + return 0; +} + +static int +nva3_ram_fini(struct nouveau_object *object, bool suspend) +{ + struct nouveau_fb *pfb = (void *)object->parent; + + if (!suspend) + nva3_link_train_fini(pfb); return 0; } @@ -390,6 +650,8 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, return 0; } + ram->fuc.r_0x001610 = ramfuc_reg(0x001610); + ram->fuc.r_0x001700 = ramfuc_reg(0x001700); ram->fuc.r_0x004000 = ramfuc_reg(0x004000); ram->fuc.r_0x004004 = ramfuc_reg(0x004004); ram->fuc.r_0x004018 = ramfuc_reg(0x004018); @@ -438,6 +700,6 @@ nva3_ram_oclass = { .ctor = nva3_ram_ctor, .dtor = _nouveau_ram_dtor, .init = nva3_ram_init, - .fini = _nouveau_ram_fini, + .fini = nva3_ram_fini, }, }; diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc index e89789a..664417c 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc @@ -50,6 +50,7 @@ handler(WR32 , 0x0000, 0x0002, #memx_func_wr32) handler(WAIT , 0x0004, 0x0000, #memx_func_wait) handler(DELAY , 0x0001, 0x0000, #memx_func_delay) handler(VBLANK, 0x0001, 0x0000, #memx_func_wait_vblank) +handler(TRAIN , 0x0000, 0x0000, #memx_func_train) memx_func_tail: .equ #memx_func_size #memx_func_next - #memx_func_head @@ -63,6 +64,10 @@ memx_ts_end: memx_data_head: .skip 0x0800 memx_data_tail: + +memx_train_head: +.skip 0x0100 +memx_train_tail: #endif /****************************************************************************** @@ -260,6 +265,101 @@ memx_func_delay: // description // // $r15 - current (memx) +// $r4 - packet length +// $r3 - opcode desciption +// $r0 - zero +memx_func_train: +#if NVKM_PPWR_CHIPSET == GT215 +// $r5 - outer loop counter +// $r6 - inner loop counter +// $r7 - entry counter (#memx_train_head + $r7) + movw $r5 0x3 + movw $r7 0x0 + +// Read random memory to wake up... things + imm32($r9, 0x700000) + nv_rd32($r8,$r9) + movw $r14 0x2710 + call(nsec) + + memx_func_train_loop_outer: + mulu $r8 $r5 0x101 + sethi $r8 0x02000000 + imm32($r9, 0x1111e0) + nv_wr32($r9, $r8) + push $r5 + + movw $r6 0x0 + memx_func_train_loop_inner: + movw $r8 0x1111 + mulu $r9 $r6 $r8 + shl b32 $r8 $r9 0x10 + or $r8 $r9 + imm32($r9, 0x100720) + nv_wr32($r9, $r8) + + imm32($r9, 0x100080) + nv_rd32($r8, $r9) + or $r8 $r8 0x20 + nv_wr32($r9, $r8) + + imm32($r9, 0x10053c) + imm32($r8, 0x80003002) + nv_wr32($r9, $r8) + + imm32($r14, 0x100560) + imm32($r13, 0x80000000) + add b32 $r12 $r13 0 + imm32($r11, 0x001e8480) + call(wait) + + // $r5 - inner inner loop counter + // $r9 - result + movw $r5 0 + imm32($r9, 0x8300ffff) + memx_func_train_loop_4x: + imm32($r10, 0x100080) + nv_rd32($r8, $r10) + imm32($r11, 0xffffffdf) + and $r8 $r11 + nv_wr32($r10, $r8) + + imm32($r10, 0x10053c) + imm32($r8, 0x80003002) + nv_wr32($r10, $r8) + + imm32($r14, 0x100560) + imm32($r13, 0x80000000) + mov b32 $r12 $r13 + imm32($r11, 0x00002710) + call(wait) + + nv_rd32($r13, $r14) + and $r9 $r9 $r13 + + add b32 $r5 1 + cmp b16 $r5 0x4 + bra l #memx_func_train_loop_4x + + add b32 $r10 $r7 #memx_train_head + st b32 D[$r10 + 0] $r9 + add b32 $r6 1 + add b32 $r7 4 + + cmp b16 $r6 0x10 + bra l #memx_func_train_loop_inner + + pop $r5 + add b32 $r5 1 + cmp b16 $r5 7 + bra l #memx_func_train_loop_outer + +#endif + ret + +// description +// +// $r15 - current (memx) // $r14 - sender process name // $r13 - message (exec) // $r12 - head of script @@ -307,8 +407,19 @@ memx_exec: // $r11 - data1 // $r0 - zero memx_info: + cmp b16 $r12 0x1 + bra e #memx_info_train + + memx_info_data: mov $r12 #memx_data_head mov $r11 #memx_data_tail - #memx_data_head + bra #memx_info_send + + memx_info_train: + mov $r12 #memx_train_head + mov $r11 #memx_train_tail - #memx_train_head + + memx_info_send: call(send) ret diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h index 522e307..c8b06cb 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h @@ -18,6 +18,10 @@ #define MEMX_MSG_INFO 0 #define MEMX_MSG_EXEC 1 +/* MEMX: info types */ +#define MEMX_INFO_DATA 0 +#define MEMX_INFO_TRAIN 1 + /* MEMX: script opcode definitions */ #define MEMX_ENTER 1 #define MEMX_LEAVE 2 @@ -25,6 +29,7 @@ #define MEMX_WAIT 4 #define MEMX_DELAY 5 #define MEMX_VBLANK 6 +#define MEMX_TRAIN 7 /* I2C_: message identifiers */ #define I2C__MSG_RD08 0 diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c b/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c index 65eaa25..f6ce39d 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c @@ -47,7 +47,8 @@ nouveau_memx_init(struct nouveau_pwr *ppwr, struct nouveau_memx **pmemx) u32 reply[2]; int ret; - ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, 0, 0); + ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, + MEMX_INFO_DATA, 0); if (ret) return ret; @@ -152,6 +153,38 @@ nouveau_memx_wait_vblank(struct nouveau_memx *memx) } void +nouveau_memx_train(struct nouveau_memx *memx) +{ + nv_debug(memx->ppwr, " MEM TRAIN\n"); + memx_cmd(memx, MEMX_TRAIN, 0, NULL); +} + +int +nouveau_memx_train_result(struct nouveau_pwr *ppwr, u32 *res, int rsize) +{ + u32 reply[2], base, size, i; + int ret; + + ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, + MEMX_INFO_TRAIN, 0); + if (ret) + return ret; + + base = reply[0]; + size = reply[1] >> 2; + if (size > rsize) + return -ENOMEM; + + /* read the packet */ + nv_wr32(ppwr, 0x10a1c0, 0x02000000 | base); + + for (i = 0; i < size; i++) + res[i] = nv_rd32(ppwr, 0x10a1c4); + + return 0; +} + +void nouveau_memx_block(struct nouveau_memx *memx) { nv_debug(memx->ppwr, " HOST BLOCKED\n"); -- 1.9.3
Roy Spliet
2014-Sep-29 17:27 UTC
[Nouveau] [PATCH 3/7] fb/ramnva3: Ressurect timing calculation code
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drm/nouveau/core/include/subdev/bios/ramcfg.h | 20 ++++++ drivers/gpu/drm/nouveau/core/subdev/bios/timing.c | 42 +++++++++-- drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 81 +++++++++++++++++++--- 3 files changed, 129 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h index a685bbd..0b4bdc2 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h @@ -95,9 +95,29 @@ struct nvbios_ramcfg { union { struct { unsigned timing_10_WR:8; + unsigned timing_10_WTR:8; unsigned timing_10_CL:8; + unsigned timing_10_RC:8; + /*empty: 4 */ + unsigned timing_10_RFC:8; /* Byte 5 */ + /*empty: 6 */ + unsigned timing_10_RAS:8; /* Byte 7 */ + /*empty: 8 */ + unsigned timing_10_RP:8; /* Byte 9 */ + unsigned timing_10_RCDRD:8; + unsigned timing_10_RCDWR:8; + unsigned timing_10_RRD:8; + unsigned timing_10_13:8; unsigned timing_10_ODT:3; + /* empty: 15 */ + unsigned timing_10_16:8; + /* empty: 17 */ + unsigned timing_10_18:8; unsigned timing_10_CWL:8; + unsigned timing_10_20:8; + unsigned timing_10_21:8; + /* empty: 22, 23 */ + unsigned timing_10_24:8; }; struct { unsigned timing_20_2e_03:2; diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/timing.c b/drivers/gpu/drm/nouveau/core/subdev/bios/timing.c index 46d955e..8521eca 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/bios/timing.c +++ b/drivers/gpu/drm/nouveau/core/subdev/bios/timing.c @@ -93,10 +93,44 @@ nvbios_timingEp(struct nouveau_bios *bios, int idx, p->timing_hdr = *hdr; switch (!!data * *ver) { case 0x10: - p->timing_10_WR = nv_ro08(bios, data + 0x00); - p->timing_10_CL = nv_ro08(bios, data + 0x02); - p->timing_10_ODT = nv_ro08(bios, data + 0x0e) & 0x07; - p->timing_10_CWL = nv_ro08(bios, data + 0x13); + p->timing_10_WR = nv_ro08(bios, data + 0x00); + p->timing_10_WTR = nv_ro08(bios, data + 0x01); + p->timing_10_CL = nv_ro08(bios, data + 0x02); + p->timing_10_RC = nv_ro08(bios, data + 0x03); + p->timing_10_RFC = nv_ro08(bios, data + 0x05); + p->timing_10_RAS = nv_ro08(bios, data + 0x07); + p->timing_10_RP = nv_ro08(bios, data + 0x09); + p->timing_10_RCDRD = nv_ro08(bios, data + 0x0a); + p->timing_10_RCDWR = nv_ro08(bios, data + 0x0b); + p->timing_10_RRD = nv_ro08(bios, data + 0x0c); + p->timing_10_13 = nv_ro08(bios, data + 0x0d); + p->timing_10_ODT = nv_ro08(bios, data + 0x0e) & 0x07; + + p->timing_10_24 = 0xff; + p->timing_10_21 = 0; + p->timing_10_20 = 0; + p->timing_10_CWL = 0; + p->timing_10_18 = 0; + p->timing_10_16 = 0; + + switch (min_t(u8, *hdr, 25)) { + case 25: + p->timing_10_24 = nv_ro08(bios, data + 0x18); + case 24: + case 23: + case 22: + p->timing_10_21 = nv_ro08(bios, data + 0x15); + case 21: + p->timing_10_20 = nv_ro08(bios, data + 0x14); + case 20: + p->timing_10_CWL = nv_ro08(bios, data + 0x13); + case 19: + p->timing_10_18 = nv_ro08(bios, data + 0x12); + case 18: + case 17: + p->timing_10_16 = nv_ro08(bios, data + 0x10); + } + break; case 0x20: p->timing[0] = nv_ro32(bios, data + 0x00); diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 719a09b..a4fe7a4 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -343,6 +343,66 @@ nva3_link_train_fini(struct nouveau_fb *pfb) pfb->ram->put(pfb, &ram->ltrain.mem); } +/* + * RAM reclocking + */ +#define T(t) cfg->timing_10_##t +static int +nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) +{ + struct nva3_ram *ram = (void *)pfb->ram; + struct nvbios_ramcfg *cfg = &ram->base.target.bios; + int tUNK_base; + u32 cur3, cur7, cur8; + + cur3 = nv_rd32(pfb, 0x10022c); + cur7 = nv_rd32(pfb, 0x10023c); + cur8 = nv_rd32(pfb, 0x100240); + + if (T(CWL) == 0) + T(CWL) = ((nv_rd32(pfb, 0x100228) & 0x0f000000) >> 24) + 1; + + tUNK_base = ((cur7 & 0x00ff0000) >> 16) - + (cur3 & 0x000000ff) - 1; + + timing[0] = (T(RP) << 24 | T(RAS) << 16 | T(RFC) << 8 | T(RC)); + timing[1] = (T(WR) + 1 + T(CWL)) << 24 | + max_t(u8,T(18), 1) << 16 | + (T(WTR) + 1 + T(CWL)) << 8 | + (5 + T(CL) - T(CWL)); + timing[2] = (T(CWL) - 1) << 24 | + (T(RRD) << 16) | + (T(RCDWR) << 8) | + T(RCDRD); + timing[3] = (cur3 & 0x00ff0000) | + (0x30 + T(CL)) << 24 | + (0xb + T(CL)) << 8 | + (T(CL) - 1); + timing[4] = T(20) << 24 | + T(21) << 16 | + T(13) << 8 | + T(13); + timing[5] = T(RFC) << 24 | + max_t(u8,T(RCDRD), T(RCDWR)) << 16 | + (T(CWL) + 6) << 8 | + T(RP); + timing[6] = (0x5a + T(CL)) << 16 | + (6 - T(CL) + T(CWL)) << 8 | + (0x50 + T(CL) - T(CWL)); + timing[7] = (cur7 & 0xff000000) | + ((tUNK_base + T(CL)) << 16) | + 0x202; + timing[8] = cur8 & 0xffffff00; + + nv_debug(pfb, "Entry: 220: %08x %08x %08x %08x\n", + timing[0], timing[1], timing[2], timing[3]); + nv_debug(pfb, " 230: %08x %08x %08x %08x\n", + timing[4], timing[5], timing[6], timing[7]); + nv_debug(pfb, " 240: %08x\n", timing[8]); + return 0; +} +#undef T + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { @@ -519,16 +579,17 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_mask(fuc, mr[0], 0x00000000, 0x00000000); ram_nsec(fuc, 1000); - ram_mask(fuc, 0x100220[3], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[1], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[6], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[7], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[2], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[4], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[5], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[0], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[8], 0x00000000, 0x00000000); - + ram_wr32(fuc, 0x100220[3], timing[3]); + ram_wr32(fuc, 0x100220[1], timing[1]); + ram_wr32(fuc, 0x100220[6], timing[6]); + ram_wr32(fuc, 0x100220[7], timing[7]); + ram_wr32(fuc, 0x100220[2], timing[2]); + ram_wr32(fuc, 0x100220[4], timing[4]); + ram_wr32(fuc, 0x100220[5], timing[5]); + ram_wr32(fuc, 0x100220[0], timing[0]); + ram_wr32(fuc, 0x100220[8], timing[8]); + + /* Misc */ ram_mask(fuc, 0x100200, 0x00001000, !next->bios.ramcfg_10_02_08 << 12); unk714 = ram_rd32(fuc, 0x100714) & ~0xf0000010; -- 1.9.3
Roy Spliet
2014-Sep-29 17:27 UTC
[Nouveau] [PATCH 4/7] fb/ramnva3: Reclocking script for DDR3
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drm/nouveau/core/include/subdev/bios/ramcfg.h | 3 +- drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c | 3 +- drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 299 +++++++++++++++------ drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c | 2 +- drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c | 2 +- 5 files changed, 230 insertions(+), 79 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h index 0b4bdc2..4a0e0ce 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h @@ -43,8 +43,9 @@ struct nvbios_ramcfg { unsigned ramcfg_10_02_08:1; unsigned ramcfg_10_02_10:1; unsigned ramcfg_10_02_20:1; - unsigned ramcfg_10_02_40:1; + unsigned ramcfg_10_DLLoff:1; unsigned ramcfg_10_03_0f:4; + unsigned ramcfg_10_04_01:1; unsigned ramcfg_10_05:8; unsigned ramcfg_10_06:8; unsigned ramcfg_10_07:8; diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c b/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c index 585e693..c568522 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c +++ b/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c @@ -162,8 +162,9 @@ nvbios_rammapSp(struct nouveau_bios *bios, u32 data, p->ramcfg_10_02_08 = (nv_ro08(bios, data + 0x02) & 0x08) >> 3; p->ramcfg_10_02_10 = (nv_ro08(bios, data + 0x02) & 0x10) >> 4; p->ramcfg_10_02_20 = (nv_ro08(bios, data + 0x02) & 0x20) >> 5; - p->ramcfg_10_02_40 = (nv_ro08(bios, data + 0x02) & 0x40) >> 6; + p->ramcfg_10_DLLoff = (nv_ro08(bios, data + 0x02) & 0x40) >> 6; p->ramcfg_10_03_0f = (nv_ro08(bios, data + 0x03) & 0x0f) >> 0; + p->ramcfg_10_04_01 = (nv_ro08(bios, data + 0x04) & 0x01) >> 0; p->ramcfg_10_05 = (nv_ro08(bios, data + 0x05) & 0xff) >> 0; p->ramcfg_10_06 = (nv_ro08(bios, data + 0x06) & 0xff) >> 0; p->ramcfg_10_07 = (nv_ro08(bios, data + 0x07) & 0xff) >> 0; diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index a4fe7a4..7cff2d4 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -47,6 +47,7 @@ struct nva3_ramfuc { struct ramfuc base; struct ramfuc_reg r_0x001610; struct ramfuc_reg r_0x001700; + struct ramfuc_reg r_0x002504; struct ramfuc_reg r_0x004000; struct ramfuc_reg r_0x004004; struct ramfuc_reg r_0x004018; @@ -56,12 +57,14 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x100200; struct ramfuc_reg r_0x100210; struct ramfuc_reg r_0x100220[9]; + struct ramfuc_reg r_0x100264; struct ramfuc_reg r_0x1002d0; struct ramfuc_reg r_0x1002d4; struct ramfuc_reg r_0x1002dc; struct ramfuc_reg r_0x10053c; struct ramfuc_reg r_0x1005a0; struct ramfuc_reg r_0x1005a4; + struct ramfuc_reg r_0x100700; struct ramfuc_reg r_0x100714; struct ramfuc_reg r_0x100718; struct ramfuc_reg r_0x10071c; @@ -69,6 +72,7 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x100760; struct ramfuc_reg r_0x1007a0; struct ramfuc_reg r_0x1007e0; + struct ramfuc_reg r_0x100da0; struct ramfuc_reg r_0x10f804; struct ramfuc_reg r_0x1110e0; struct ramfuc_reg r_0x111100; @@ -403,19 +407,53 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) } #undef T +static void +nouveau_sddr3_dll_reset(struct nva3_ramfuc *fuc) +{ + ram_mask(fuc, mr[0], 0x100, 0x100); + ram_nsec(fuc, 1000); + ram_mask(fuc, mr[0], 0x100, 0x000); + ram_nsec(fuc, 1000); +} + +static void +nouveau_sddr3_dll_disable(struct nva3_ramfuc *fuc, u32 *mr) +{ + u32 mr1_old = ram_rd32(fuc, mr[1]); + + if (!(mr1_old & 0x1)) { + ram_wr32(fuc, 0x1002d4, 0x00000001); + ram_wr32(fuc, mr[1], mr[1]); + ram_nsec(fuc, 1000); + } +} + +static void +nva3_ram_lock_pll(struct nva3_ramfuc *fuc, struct nva3_clock_info *mclk) +{ + ram_wr32(fuc, 0x004004, mclk->pll); + ram_mask(fuc, 0x004000, 0x00000001, 0x00000001); + ram_mask(fuc, 0x004000, 0x00000010, 0x00000000); + ram_wait(fuc, 0x004000, 0x00020000, 0x00020000, 64000); + ram_mask(fuc, 0x004000, 0x00000010, 0x00000010); +} + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { struct nouveau_bios *bios = nouveau_bios(pfb); struct nva3_ram *ram = (void *)pfb->ram; struct nva3_ramfuc *fuc = &ram->fuc; + struct nva3_ltrain *train = &ram->ltrain; struct nva3_clock_info mclk; struct nouveau_ram_data *next; u8 ver, hdr, cnt, len, strap; u32 data; - u32 r004018, r100760, ctrl; + u32 r004018, r100760, r100da0, r111100, ctrl; u32 unk714, unk718, unk71c; int ret, i; + u32 timing[9]; + bool pll2pll; next = &ram->base.target; next->freq = freq; @@ -426,14 +464,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) /* lookup memory config data relevant to the target frequency */ i = 0; - while ((data = nvbios_rammapEp(bios, i++, &ver, &hdr, &cnt, &len, - &next->bios))) { - if (freq / 1000 >= next->bios.rammap_min && - freq / 1000 <= next->bios.rammap_max) - break; - } - - if (!data || ver != 0x10 || hdr < 0x0e) { + data = nvbios_rammapEm(bios, freq / 1000, &ver, &hdr, &cnt, &len, + &next->bios); + if (!data || ver != 0x10 || hdr < 0x05) { nv_error(pfb, "invalid/missing rammap entry\n"); return -EINVAL; } @@ -447,7 +480,7 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) data = nvbios_rammapSp(bios, data, ver, hdr, cnt, len, strap, &ver, &hdr, &next->bios); - if (!data || ver != 0x10 || hdr < 0x0e) { + if (!data || ver != 0x10 || hdr < 0x09) { nv_error(pfb, "invalid/missing ramcfg entry\n"); return -EINVAL; } @@ -457,7 +490,7 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) data = nvbios_timingEp(bios, next->bios.ramcfg_timing, &ver, &hdr, &cnt, &len, &next->bios); - if (!data || ver != 0x10 || hdr < 0x19) { + if (!data || ver != 0x10 || hdr < 0x17) { nv_error(pfb, "invalid/missing timing entry\n"); return -EINVAL; } @@ -469,53 +502,81 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) return ret; } + nva3_ram_timing_calc(pfb, timing); + ret = ram_init(fuc, pfb); if (ret) return ret; + /* Determine ram-specific MR values */ + ram->base.mr[0] = ram_rd32(fuc, mr[0]); + ram->base.mr[1] = ram_rd32(fuc, mr[1]); + ram->base.mr[2] = ram_rd32(fuc, mr[2]); + + switch (ram->base.type) { + case NV_MEM_TYPE_DDR3: + ret = nouveau_sddr3_calc(&ram->base); + break; + default: + ret = -ENOSYS; + break; + } + + if (ret) + return ret; + /* XXX: where the fuck does 750MHz come from? */ if (freq <= 750000) { r004018 = 0x10000000; r100760 = 0x22222222; + r100da0 = 0x00000010; } else { r004018 = 0x00000000; r100760 = 0x00000000; + r100da0 = 0x00000000; } + if (!next->bios.ramcfg_10_DLLoff) + r004018 |= 0x00004000; + + /* pll2pll requires to switch to a safe clock first */ ctrl = ram_rd32(fuc, 0x004000); - if (ctrl & 0x00000008) { - if (mclk.pll) { - ram_mask(fuc, 0x004128, 0x00000101, 0x00000101); - ram_wr32(fuc, 0x004004, mclk.pll); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000001)); - ram_wr32(fuc, 0x004000, (ctrl &= 0xffffffef)); - ram_wait(fuc, 0x004000, 0x00020000, 0x00020000, 64000); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000010)); - ram_wr32(fuc, 0x004018, 0x00005000 | r004018); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000004)); - } - } else { - u32 ssel = 0x00000101; - if (mclk.clk) - ssel |= mclk.clk; - else - ssel |= 0x00080000; /* 324MHz, shouldn't matter... */ - ram_mask(fuc, 0x004168, 0x003f3141, ctrl); - } + pll2pll = (!(ctrl & 0x00000008)) && mclk.pll; + /* Pre, NVIDIA does this outside the script */ if (next->bios.ramcfg_10_02_10) { ram_mask(fuc, 0x111104, 0x00000600, 0x00000000); } else { ram_mask(fuc, 0x111100, 0x40000000, 0x40000000); ram_mask(fuc, 0x111104, 0x00000180, 0x00000000); } + /* Always disable this bit during reclock */ + ram_mask(fuc, 0x100200, 0x00000800, 0x00000000); + + /* If switching from non-pll to pll, lock before disabling FB */ + if (mclk.pll && !pll2pll) { + ram_mask(fuc, 0x004128, 0x003f3141, mclk.clk | 0x00000101); + nva3_ram_lock_pll(fuc, &mclk); + } + + /* Start with disabling some CRTCs and PFIFO? */ + ram_wait_vblank(fuc); + ram_wr32(fuc, 0x611200, 0x3300); + ram_mask(fuc, 0x002504, 0x1, 0x1); + ram_nsec(fuc, 10000); + ram_wait(fuc, 0x002504, 0x10, 0x10, 20000); /* XXX: or longer? */ + ram_block(fuc); + ram_nsec(fuc, 2000); + - if (!next->bios.rammap_10_04_02) - ram_mask(fuc, 0x100200, 0x00000800, 0x00000000); - ram_wr32(fuc, 0x611200, 0x00003300); if (!next->bios.ramcfg_10_02_10) - ram_wr32(fuc, 0x111100, 0x4c020000); /*XXX*/ + ram_mask(fuc, 0x111100, 0x04020000, 0x04020000); /*XXX*/ + + /* If we're disabling the DLL, do it now */ + if (next->bios.ramcfg_10_DLLoff) + nouveau_sddr3_dll_disable(fuc, ram->base.mr); + /* Brace RAM for impact */ ram_wr32(fuc, 0x1002d4, 0x00000001); ram_wr32(fuc, 0x1002d0, 0x00000001); ram_wr32(fuc, 0x1002d0, 0x00000001); @@ -523,24 +584,38 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_wr32(fuc, 0x1002dc, 0x00000001); ram_nsec(fuc, 2000); - ctrl = ram_rd32(fuc, 0x004000); - if (!(ctrl & 0x00000008) && mclk.pll) { - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000008)); + if (nv_device(pfb)->chipset == 0xa3 && freq <= 500000) + ram_mask(fuc, 0x100700, 0x00000006, 0x00000006); + + /* Fiddle with clocks */ + /* There's 4 scenario's + * pll->pll: first switch to a 324MHz clock, set up new PLL, switch + * clk->pll: Set up new PLL, switch + * pll->clk: Set up clock, switch + * clk->clk: Overwrite ctrl and other bits, switch */ + + /* Switch to regular clock - 324MHz */ + if (pll2pll) { + ram_mask(fuc, 0x004000, 0x00000004, 0x00000004); + ram_mask(fuc, 0x004168, 0x003f3141, 0x00083101); + ram_mask(fuc, 0x004000, 0x00000008, 0x00000008); ram_mask(fuc, 0x1110e0, 0x00088000, 0x00088000); ram_wr32(fuc, 0x004018, 0x00001000); - ram_wr32(fuc, 0x004000, (ctrl &= ~0x00000001)); - ram_wr32(fuc, 0x004004, mclk.pll); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000001)); - udelay(64); - ram_wr32(fuc, 0x004018, 0x00005000 | r004018); - udelay(20); - } else - if (!mclk.pll) { - ram_mask(fuc, 0x004168, 0x003f3040, mclk.clk); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000008)); + nva3_ram_lock_pll(fuc, &mclk); + } + + if (mclk.pll) { + ram_mask(fuc, 0x004000, 0x00000105, 0x00000105); + ram_wr32(fuc, 0x004018, 0x00001000 | r004018); + ram_wr32(fuc, 0x100da0, r100da0); + } else { + ram_mask(fuc, 0x004168, 0x003f3141, mclk.clk | 0x00000101); + ram_mask(fuc, 0x004000, 0x00000108, 0x00000008); ram_mask(fuc, 0x1110e0, 0x00088000, 0x00088000); - ram_wr32(fuc, 0x004018, 0x0000d000 | r004018); + ram_wr32(fuc, 0x004018, 0x00009000 | r004018); + ram_wr32(fuc, 0x100da0, r100da0); } + ram_nsec(fuc, 20000); if (next->bios.rammap_10_04_08) { ram_wr32(fuc, 0x1005a0, next->bios.ramcfg_10_06 << 16 | @@ -554,6 +629,12 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) 0x80000000); ram_mask(fuc, 0x10053c, 0x00001000, 0x00000000); } else { + if (train->state == NVA3_TRAIN_DONE) { + ram_wr32(fuc, 0x100080, 0x1020); + ram_mask(fuc, 0x111400, 0xffffffff, train->r_111400); + ram_mask(fuc, 0x1111e0, 0xffffffff, train->r_1111e0); + ram_mask(fuc, 0x100720, 0xffffffff, train->r_100720); + } ram_mask(fuc, 0x10053c, 0x00001000, 0x00001000); ram_mask(fuc, 0x10f804, 0x80000000, 0x00000000); ram_mask(fuc, 0x100760, 0x22222222, r100760); @@ -561,22 +642,27 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_mask(fuc, 0x1007e0, 0x22222222, r100760); } + if (nv_device(pfb)->chipset == 0xa3 && freq > 500000) { + ram_mask(fuc, 0x100700, 0x00000006, 0x00000000); + } + + /* Final switch */ if (mclk.pll) { ram_mask(fuc, 0x1110e0, 0x00088000, 0x00011000); - ram_wr32(fuc, 0x004000, (ctrl &= ~0x00000008)); + ram_mask(fuc, 0x004000, 0x00000008, 0x00000000); } - /*XXX: LEAVE */ ram_wr32(fuc, 0x1002dc, 0x00000000); ram_wr32(fuc, 0x1002d4, 0x00000001); ram_wr32(fuc, 0x100210, 0x80000000); - ram_nsec(fuc, 1000); - ram_nsec(fuc, 1000); + ram_nsec(fuc, 2000); - ram_mask(fuc, mr[2], 0x00000000, 0x00000000); + /* Set RAM MR parameters and timings */ + ram_wr32(fuc, mr[2], ram->base.mr[2]); + ram_nsec(fuc, 1000); + ram_wr32(fuc, mr[1], ram->base.mr[1]); ram_nsec(fuc, 1000); - ram_nuke(fuc, mr[0]); - ram_mask(fuc, mr[0], 0x00000000, 0x00000000); + ram_wr32(fuc, mr[0], ram->base.mr[0]); ram_nsec(fuc, 1000); ram_wr32(fuc, 0x100220[3], timing[3]); @@ -592,35 +678,75 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) /* Misc */ ram_mask(fuc, 0x100200, 0x00001000, !next->bios.ramcfg_10_02_08 << 12); - unk714 = ram_rd32(fuc, 0x100714) & ~0xf0000010; - unk718 = ram_rd32(fuc, 0x100718) & ~0x00000100; - unk71c = ram_rd32(fuc, 0x10071c) & ~0x00000100; + /* XXX: A lot of "chipset"/"ram type" specific stuff...? */ + unk714 = ram_rd32(fuc, 0x100714) & ~0xf0000130; + unk718 = ram_rd32(fuc, 0x100718) & ~0x00000100; + unk71c = ram_rd32(fuc, 0x10071c) & ~0x00000100; + r111100 = ram_rd32(fuc, 0x111100) & ~0x3a800000; + + if (next->bios.ramcfg_10_02_04) { + switch (ram->base.type) { + case NV_MEM_TYPE_DDR3: + if (nv_device(pfb)->chipset != 0xa8) + r111100 |= 0x00000004; + /* no break */ + default: + break; + } + } else { + switch (ram->base.type) { + case NV_MEM_TYPE_DDR3: + if (nv_device(pfb)->chipset == 0xa8) { + r111100 |= 0x08000000; + } else { + r111100 &= ~0x00000004; + r111100 |= 0x12800000; + } + unk714 |= 0x00000010; + break; + default: + break; + } + } + + unk714 |= (next->bios.ramcfg_10_04_01) << 8; + if (next->bios.ramcfg_10_02_20) unk714 |= 0xf0000000; - if (!next->bios.ramcfg_10_02_04) - unk714 |= 0x00000010; - ram_wr32(fuc, 0x100714, unk714); - + if (next->bios.ramcfg_10_02_02) + unk718 |= 0x00000100; if (next->bios.ramcfg_10_02_01) unk71c |= 0x00000100; - ram_wr32(fuc, 0x10071c, unk71c); + if (next->bios.timing_10_24 != 0xff) { + unk718 &= ~0xf0000000; + unk718 |= next->bios.timing_10_24 << 28; + } + if (next->bios.ramcfg_10_02_10) + r111100 &= ~0x04020000; - if (next->bios.ramcfg_10_02_02) - unk718 |= 0x00000100; - ram_wr32(fuc, 0x100718, unk718); + ram_mask(fuc, 0x100714, 0xffffffff, unk714); + ram_mask(fuc, 0x10071c, 0xffffffff, unk71c); + ram_mask(fuc, 0x100718, 0xffffffff, unk718); + ram_mask(fuc, 0x111100, 0xffffffff, r111100); - if (next->bios.ramcfg_10_02_10) - ram_wr32(fuc, 0x111100, 0x48000000); /*XXX*/ + /* Reset DLL */ + if (!next->bios.ramcfg_10_DLLoff) + nouveau_sddr3_dll_reset(fuc); - ram_mask(fuc, mr[0], 0x100, 0x100); - ram_nsec(fuc, 1000); - ram_mask(fuc, mr[0], 0x100, 0x000); - ram_nsec(fuc, 1000); + ram_nsec(fuc, 14000); + ram_wr32(fuc, 0x100264, 0x1); ram_nsec(fuc, 2000); - ram_nsec(fuc, 12000); - ram_wr32(fuc, 0x611200, 0x00003330); + ram_nuke(fuc, 0x100700); + ram_mask(fuc, 0x100700, 0x01000000, 0x01000000); + ram_mask(fuc, 0x100700, 0x01000000, 0x00000000); + + /* Re-enable FB */ + ram_unblock(fuc); + ram_wr32(fuc, 0x611200, 0x3330); + + /* Post fiddlings */ if (next->bios.rammap_10_04_02) ram_mask(fuc, 0x100200, 0x00000800, 0x00000800); if (next->bios.ramcfg_10_02_10) { @@ -648,7 +774,22 @@ nva3_ram_prog(struct nouveau_fb *pfb) struct nouveau_device *device = nv_device(pfb); struct nva3_ram *ram = (void *)pfb->ram; struct nva3_ramfuc *fuc = &ram->fuc; - ram_exec(fuc, nouveau_boolopt(device->cfgopt, "NvMemExec", true)); + bool exec = nouveau_boolopt(device->cfgopt, "NvMemExec", true); + + if (exec) { + nv_mask(pfb, 0x001534, 0x2, 0x2); + + ram_exec(fuc, true); + + /* Post-processing, avoids flicker */ + nv_mask(pfb, 0x002504, 0x1, 0x0); + nv_mask(pfb, 0x001534, 0x2, 0x0); + + nv_mask(pfb, 0x616308, 0x10, 0x10); + nv_mask(pfb, 0x616b08, 0x10, 0x10); + } else { + ram_exec(fuc, false); + } return 0; } @@ -713,31 +854,39 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, ram->fuc.r_0x001610 = ramfuc_reg(0x001610); ram->fuc.r_0x001700 = ramfuc_reg(0x001700); + ram->fuc.r_0x002504 = ramfuc_reg(0x002504); ram->fuc.r_0x004000 = ramfuc_reg(0x004000); ram->fuc.r_0x004004 = ramfuc_reg(0x004004); ram->fuc.r_0x004018 = ramfuc_reg(0x004018); ram->fuc.r_0x004128 = ramfuc_reg(0x004128); ram->fuc.r_0x004168 = ramfuc_reg(0x004168); + ram->fuc.r_0x100080 = ramfuc_reg(0x100080); ram->fuc.r_0x100200 = ramfuc_reg(0x100200); ram->fuc.r_0x100210 = ramfuc_reg(0x100210); for (i = 0; i < 9; i++) ram->fuc.r_0x100220[i] = ramfuc_reg(0x100220 + (i * 4)); + ram->fuc.r_0x100264 = ramfuc_reg(0x100264); ram->fuc.r_0x1002d0 = ramfuc_reg(0x1002d0); ram->fuc.r_0x1002d4 = ramfuc_reg(0x1002d4); ram->fuc.r_0x1002dc = ramfuc_reg(0x1002dc); ram->fuc.r_0x10053c = ramfuc_reg(0x10053c); ram->fuc.r_0x1005a0 = ramfuc_reg(0x1005a0); ram->fuc.r_0x1005a4 = ramfuc_reg(0x1005a4); + ram->fuc.r_0x100700 = ramfuc_reg(0x100700); ram->fuc.r_0x100714 = ramfuc_reg(0x100714); ram->fuc.r_0x100718 = ramfuc_reg(0x100718); ram->fuc.r_0x10071c = ramfuc_reg(0x10071c); + ram->fuc.r_0x100720 = ramfuc_reg(0x100720); ram->fuc.r_0x100760 = ramfuc_stride(0x100760, 4, ram->base.part_mask); ram->fuc.r_0x1007a0 = ramfuc_stride(0x1007a0, 4, ram->base.part_mask); ram->fuc.r_0x1007e0 = ramfuc_stride(0x1007e0, 4, ram->base.part_mask); + ram->fuc.r_0x100da0 = ramfuc_stride(0x100da0, 4, ram->base.part_mask); ram->fuc.r_0x10f804 = ramfuc_reg(0x10f804); ram->fuc.r_0x1110e0 = ramfuc_stride(0x1110e0, 4, ram->base.part_mask); ram->fuc.r_0x111100 = ramfuc_reg(0x111100); ram->fuc.r_0x111104 = ramfuc_reg(0x111104); + ram->fuc.r_0x1111e0 = ramfuc_reg(0x1111e0); + ram->fuc.r_0x111400 = ramfuc_reg(0x111400); ram->fuc.r_0x611200 = ramfuc_reg(0x611200); if (ram->base.ranks > 1) { diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c b/drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c index bb1eb8f..252575f 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c @@ -66,7 +66,7 @@ nouveau_sddr2_calc(struct nouveau_ram *ram) case 0x10: CL = ram->next->bios.timing_10_CL; WR = ram->next->bios.timing_10_WR; - DLL = !ram->next->bios.ramcfg_10_02_40; + DLL = !ram->next->bios.ramcfg_10_DLLoff; ODT = ram->next->bios.timing_10_ODT & 3; break; case 0x20: diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c index 83949b1..a2dca48 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c @@ -80,7 +80,7 @@ nouveau_sddr3_calc(struct nouveau_ram *ram) CWL = ram->next->bios.timing_10_CWL; CL = ram->next->bios.timing_10_CL; WR = ram->next->bios.timing_10_WR; - DLL = !ram->next->bios.ramcfg_10_02_40; + DLL = !ram->next->bios.ramcfg_10_DLLoff; ODT = ram->next->bios.timing_10_ODT; break; case 0x20: -- 1.9.3
Roy Spliet
2014-Sep-29 17:27 UTC
[Nouveau] [PATCH 5/7] fb/ramnva3: Reclocking script for DDR2
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 57 +++++++++++++++++------- 1 file changed, 42 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 7cff2d4..0e1697a 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -356,7 +356,7 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) { struct nva3_ram *ram = (void *)pfb->ram; struct nvbios_ramcfg *cfg = &ram->base.target.bios; - int tUNK_base; + int tUNK_base, tUNK_40_0, prevCL; u32 cur3, cur7, cur8; cur3 = nv_rd32(pfb, 0x10022c); @@ -364,10 +364,11 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) cur8 = nv_rd32(pfb, 0x100240); if (T(CWL) == 0) - T(CWL) = ((nv_rd32(pfb, 0x100228) & 0x0f000000) >> 24) + 1; + /* Observed on DDR2 */ + T(CWL) = T(CL) - 1; - tUNK_base = ((cur7 & 0x00ff0000) >> 16) - - (cur3 & 0x000000ff) - 1; + prevCL = (cur3 & 0x000000ff) + 1; + tUNK_base = ((cur7 & 0x00ff0000) >> 16) - prevCL; timing[0] = (T(RP) << 24 | T(RAS) << 16 | T(RFC) << 8 | T(RC)); timing[1] = (T(WR) + 1 + T(CWL)) << 24 | @@ -398,6 +399,16 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) 0x202; timing[8] = cur8 & 0xffffff00; + switch (ram->base.type) { + case NV_MEM_TYPE_DDR2: + tUNK_40_0 = prevCL - (cur8 & 0xff); + if (tUNK_40_0 > 0) + timing[8] |= T(CL); + break; + default: + break; + } + nv_debug(pfb, "Entry: 220: %08x %08x %08x %08x\n", timing[0], timing[1], timing[2], timing[3]); nv_debug(pfb, " 230: %08x %08x %08x %08x\n", @@ -408,7 +419,7 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) #undef T static void -nouveau_sddr3_dll_reset(struct nva3_ramfuc *fuc) +nouveau_sddr2_dll_reset(struct nva3_ramfuc *fuc) { ram_mask(fuc, mr[0], 0x100, 0x100); ram_nsec(fuc, 1000); @@ -514,6 +525,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram->base.mr[2] = ram_rd32(fuc, mr[2]); switch (ram->base.type) { + case NV_MEM_TYPE_DDR2: + ret = nouveau_sddr2_calc(&ram->base); + break; case NV_MEM_TYPE_DDR3: ret = nouveau_sddr3_calc(&ram->base); break; @@ -573,8 +587,11 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_mask(fuc, 0x111100, 0x04020000, 0x04020000); /*XXX*/ /* If we're disabling the DLL, do it now */ - if (next->bios.ramcfg_10_DLLoff) + switch (next->bios.ramcfg_10_DLLoff * ram->base.type) { + case NV_MEM_TYPE_DDR3: nouveau_sddr3_dll_disable(fuc, ram->base.mr); + break; + } /* Brace RAM for impact */ ram_wr32(fuc, 0x1002d4, 0x00000001); @@ -658,12 +675,12 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_nsec(fuc, 2000); /* Set RAM MR parameters and timings */ - ram_wr32(fuc, mr[2], ram->base.mr[2]); - ram_nsec(fuc, 1000); - ram_wr32(fuc, mr[1], ram->base.mr[1]); - ram_nsec(fuc, 1000); - ram_wr32(fuc, mr[0], ram->base.mr[0]); - ram_nsec(fuc, 1000); + for (i = 2; i >= 0; i--) { + if (ram_rd32(fuc, mr[i]) != ram->base.mr[i]) { + ram_wr32(fuc, mr[i], ram->base.mr[i]); + ram_nsec(fuc, 1000); + } + } ram_wr32(fuc, 0x100220[3], timing[3]); ram_wr32(fuc, 0x100220[1], timing[1]); @@ -690,11 +707,18 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) if (nv_device(pfb)->chipset != 0xa8) r111100 |= 0x00000004; /* no break */ + case NV_MEM_TYPE_DDR2: + r111100 |= 0x08000000; + break; default: break; } } else { switch (ram->base.type) { + case NV_MEM_TYPE_DDR2: + r111100 |= 0x1a800000; + unk714 |= 0x00000010; + break; case NV_MEM_TYPE_DDR3: if (nv_device(pfb)->chipset == 0xa8) { r111100 |= 0x08000000; @@ -731,12 +755,14 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) /* Reset DLL */ if (!next->bios.ramcfg_10_DLLoff) - nouveau_sddr3_dll_reset(fuc); + nouveau_sddr2_dll_reset(fuc); ram_nsec(fuc, 14000); - ram_wr32(fuc, 0x100264, 0x1); - ram_nsec(fuc, 2000); + if (ram->base.type == NV_MEM_TYPE_DDR3) { + ram_wr32(fuc, 0x100264, 0x1); + ram_nsec(fuc, 2000); + } ram_nuke(fuc, 0x100700); ram_mask(fuc, 0x100700, 0x01000000, 0x01000000); @@ -842,6 +868,7 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, return ret; switch (ram->base.type) { + case NV_MEM_TYPE_DDR2: case NV_MEM_TYPE_DDR3: ram->base.calc = nva3_ram_calc; ram->base.prog = nva3_ram_prog; -- 1.9.3
Roy Spliet
2014-Sep-29 17:27 UTC
[Nouveau] [PATCH 6/7] fb/ramnva3: Reclocking script for GDDR3
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 100 +++++++++++++++++++++-- drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c | 2 +- 2 files changed, 92 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 0e1697a..3b38a53 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -33,6 +33,8 @@ #include <subdev/clock/nva3.h> #include <subdev/clock/pll.h> +#include <subdev/gpio.h> + #include <subdev/timer.h> #include <engine/fifo.h> @@ -43,6 +45,9 @@ #include "nv50.h" +/* XXX: Remove when memx gains GPIO support */ +extern int nv50_gpio_location(int line, u32 *reg, u32 *shift); + struct nva3_ramfuc { struct ramfuc base; struct ramfuc_reg r_0x001610; @@ -81,6 +86,7 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x111400; struct ramfuc_reg r_0x611200; struct ramfuc_reg r_mr[4]; + struct ramfuc_reg r_gpioFBVREF; }; struct nva3_ltrain { @@ -357,15 +363,22 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) struct nva3_ram *ram = (void *)pfb->ram; struct nvbios_ramcfg *cfg = &ram->base.target.bios; int tUNK_base, tUNK_40_0, prevCL; - u32 cur3, cur7, cur8; + u32 cur2, cur3, cur7, cur8; + cur2 = nv_rd32(pfb, 0x100228); cur3 = nv_rd32(pfb, 0x10022c); cur7 = nv_rd32(pfb, 0x10023c); cur8 = nv_rd32(pfb, 0x100240); - if (T(CWL) == 0) - /* Observed on DDR2 */ + + switch ((!T(CWL)) * ram->base.type) { + case NV_MEM_TYPE_DDR2: T(CWL) = T(CL) - 1; + break; + case NV_MEM_TYPE_GDDR3: + T(CWL) = ((cur2 & 0xff000000) >> 24) + 1; + break; + } prevCL = (cur3 & 0x000000ff) + 1; tUNK_base = ((cur7 & 0x00ff0000) >> 16) - prevCL; @@ -389,10 +402,10 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) T(13); timing[5] = T(RFC) << 24 | max_t(u8,T(RCDRD), T(RCDWR)) << 16 | - (T(CWL) + 6) << 8 | + max_t(u8, (T(CWL) + 6), (T(CL) + 2)) << 8 | T(RP); timing[6] = (0x5a + T(CL)) << 16 | - (6 - T(CL) + T(CWL)) << 8 | + max_t(u8, 1, (6 - T(CL) + T(CWL))) << 8 | (0x50 + T(CL) - T(CWL)); timing[7] = (cur7 & 0xff000000) | ((tUNK_base + T(CL)) << 16) | @@ -401,6 +414,7 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) switch (ram->base.type) { case NV_MEM_TYPE_DDR2: + case NV_MEM_TYPE_GDDR3: tUNK_40_0 = prevCL - (cur8 & 0xff); if (tUNK_40_0 > 0) timing[8] |= T(CL); @@ -440,6 +454,17 @@ nouveau_sddr3_dll_disable(struct nva3_ramfuc *fuc, u32 *mr) } static void +nouveau_gddr3_dll_disable(struct nva3_ramfuc *fuc, u32 *mr) +{ + u32 mr1_old = ram_rd32(fuc, mr[1]); + + if (!(mr1_old & 0x40)) { + ram_wr32(fuc, mr[1], mr[1]); + ram_nsec(fuc, 1000); + } +} + +static void nva3_ram_lock_pll(struct nva3_ramfuc *fuc, struct nva3_clock_info *mclk) { ram_wr32(fuc, 0x004004, mclk->pll); @@ -449,6 +474,29 @@ nva3_ram_lock_pll(struct nva3_ramfuc *fuc, struct nva3_clock_info *mclk) ram_mask(fuc, 0x004000, 0x00000010, 0x00000010); } +static void +nva3_ram_fbvref(struct nva3_ramfuc *fuc, u32 val) +{ + struct nouveau_gpio *gpio = nouveau_gpio(fuc->base.pfb); + struct dcb_gpio_func func; + u32 reg, sh, gpio_val; + int ret; + + if (gpio->get(gpio, 0, 0x2e, DCB_GPIO_UNUSED) != val) { + ret = gpio->find(gpio, 0, 0x2e, DCB_GPIO_UNUSED, &func); + if (ret) + return; + + nv50_gpio_location(func.line, ®, &sh); + gpio_val = ram_rd32(fuc, gpioFBVREF); + if (gpio_val & (8 << sh)) + val = !val; + + ram_mask(fuc, gpioFBVREF, (0x3 << sh), ((val | 0x2) << sh)); + ram_nsec(fuc, 20000); + } +} + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { @@ -531,6 +579,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) case NV_MEM_TYPE_DDR3: ret = nouveau_sddr3_calc(&ram->base); break; + case NV_MEM_TYPE_GDDR3: + ret = nouveau_gddr3_calc(&ram->base); + break; default: ret = -ENOSYS; break; @@ -582,17 +633,26 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_block(fuc); ram_nsec(fuc, 2000); - - if (!next->bios.ramcfg_10_02_10) - ram_mask(fuc, 0x111100, 0x04020000, 0x04020000); /*XXX*/ + if (!next->bios.ramcfg_10_02_10) { + if (ram->base.type == NV_MEM_TYPE_GDDR3) + ram_mask(fuc, 0x111100, 0x04020000, 0x00020000); + else + ram_mask(fuc, 0x111100, 0x04020000, 0x04020000); + } /* If we're disabling the DLL, do it now */ switch (next->bios.ramcfg_10_DLLoff * ram->base.type) { case NV_MEM_TYPE_DDR3: nouveau_sddr3_dll_disable(fuc, ram->base.mr); break; + case NV_MEM_TYPE_GDDR3: + nouveau_gddr3_dll_disable(fuc, ram->base.mr); + break; } + if (fuc->r_gpioFBVREF.addr && next->bios.timing_10_ODT) + nva3_ram_fbvref(fuc, 0); + /* Brace RAM for impact */ ram_wr32(fuc, 0x1002d4, 0x00000001); ram_wr32(fuc, 0x1002d0, 0x00000001); @@ -728,6 +788,10 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) } unk714 |= 0x00000010; break; + case NV_MEM_TYPE_GDDR3: + r111100 |= 0x30000000; + unk714 |= 0x00000020; + break; default: break; } @@ -753,11 +817,18 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_mask(fuc, 0x100718, 0xffffffff, unk718); ram_mask(fuc, 0x111100, 0xffffffff, r111100); + if (fuc->r_gpioFBVREF.addr && !next->bios.timing_10_ODT) + nva3_ram_fbvref(fuc, 1); + /* Reset DLL */ if (!next->bios.ramcfg_10_DLLoff) nouveau_sddr2_dll_reset(fuc); - ram_nsec(fuc, 14000); + if (ram->base.type == NV_MEM_TYPE_GDDR3) { + ram_nsec(fuc, 31000); + } else { + ram_nsec(fuc, 14000); + } if (ram->base.type == NV_MEM_TYPE_DDR3) { ram_wr32(fuc, 0x100264, 0x1); @@ -859,8 +930,12 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, struct nouveau_oclass *oclass, void *data, u32 datasize, struct nouveau_object **pobject) { + struct nouveau_fb *pfb = nouveau_fb(parent); + struct nouveau_gpio *gpio = nouveau_gpio(pfb); + struct dcb_gpio_func func; struct nva3_ram *ram; int ret, i; + u32 reg, shift; ret = nv50_ram_create(parent, engine, oclass, &ram); *pobject = nv_object(ram); @@ -870,6 +945,7 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, switch (ram->base.type) { case NV_MEM_TYPE_DDR2: case NV_MEM_TYPE_DDR3: + case NV_MEM_TYPE_GDDR3: ram->base.calc = nva3_ram_calc; ram->base.prog = nva3_ram_prog; ram->base.tidy = nva3_ram_tidy; @@ -928,6 +1004,12 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, ram->fuc.r_mr[3] = ramfuc_reg(0x1002e4); } + ret = gpio->find(gpio, 0, 0x2e, DCB_GPIO_UNUSED, &func); + if (ret == 0) { + nv50_gpio_location(func.line, ®, &shift); + ram->fuc.r_gpioFBVREF = ramfuc_reg(reg); + } + return 0; } diff --git a/drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c b/drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c index 1864fa9..2e30d5a 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c +++ b/drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c @@ -54,7 +54,7 @@ nv50_gpio_reset(struct nouveau_gpio *gpio, u8 match) } } -static int +int nv50_gpio_location(int line, u32 *reg, u32 *shift) { const u32 nv50_gpio_reg[4] = { 0xe104, 0xe108, 0xe280, 0xe284 }; -- 1.9.3
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c b/drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c index 094551d..07ad012 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c @@ -510,7 +510,7 @@ nva3_clock_ctor(struct nouveau_object *parent, struct nouveau_object *engine, int ret; ret = nouveau_clock_create(parent, engine, oclass, nva3_domain, NULL, 0, - false, &priv); + true, &priv); *pobject = nv_object(priv); if (ret) return ret; -- 1.9.3
V2: fix whitespace errors in memx.fuc Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/include/subdev/pwr.h | 2 + drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h | 16 ++ drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 318 +++++++++++++++++++-- .../gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc | 111 +++++++ drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h | 5 + drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c | 35 ++- 6 files changed, 458 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h b/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h index bf3d1f6..f2427bf 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h @@ -48,6 +48,8 @@ void nouveau_memx_wait(struct nouveau_memx *, u32 addr, u32 mask, u32 data, u32 nsec); void nouveau_memx_nsec(struct nouveau_memx *, u32 nsec); void nouveau_memx_wait_vblank(struct nouveau_memx *); +void nouveau_memx_train(struct nouveau_memx *); +int nouveau_memx_train_result(struct nouveau_pwr *, u32 *, int); void nouveau_memx_block(struct nouveau_memx *); void nouveau_memx_unblock(struct nouveau_memx *); diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h b/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h index d1fbbe4..0ac7256 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h @@ -141,6 +141,20 @@ ramfuc_wait_vblank(struct ramfuc *ram) } static inline void +ramfuc_train(struct ramfuc *ram) +{ + nouveau_memx_train(ram->memx); +} + +static inline int +ramfuc_train_result(struct nouveau_fb *pfb, u32 *result, u32 rsize) +{ + struct nouveau_pwr *ppwr = nouveau_pwr(pfb); + + return nouveau_memx_train_result(ppwr, result, rsize); +} + +static inline void ramfuc_block(struct ramfuc *ram) { nouveau_memx_block(ram->memx); @@ -162,6 +176,8 @@ ramfuc_unblock(struct ramfuc *ram) #define ram_wait(s,r,m,d,n) ramfuc_wait(&(s)->base, (r), (m), (d), (n)) #define ram_nsec(s,n) ramfuc_nsec(&(s)->base, (n)) #define ram_wait_vblank(s) ramfuc_wait_vblank(&(s)->base) +#define ram_train(s) ramfuc_train(&(s)->base) +#define ram_train_result(s,r,l) ramfuc_train_result((s), (r), (l)) #define ram_block(s) ramfuc_block(&(s)->base) #define ram_unblock(s) ramfuc_unblock(&(s)->base) diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 3601dec..719a09b 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -20,17 +20,23 @@ * OTHER DEALINGS IN THE SOFTWARE. * * Authors: Ben Skeggs + * Roy Spliet <rspliet at eclipso.eu> */ #include <subdev/bios.h> #include <subdev/bios/bit.h> #include <subdev/bios/pll.h> #include <subdev/bios/rammap.h> +#include <subdev/bios/M0205.h> #include <subdev/bios/timing.h> #include <subdev/clock/nva3.h> #include <subdev/clock/pll.h> +#include <subdev/timer.h> + +#include <engine/fifo.h> + #include <core/option.h> #include "ramfuc.h" @@ -39,11 +45,14 @@ struct nva3_ramfuc { struct ramfuc base; + struct ramfuc_reg r_0x001610; + struct ramfuc_reg r_0x001700; struct ramfuc_reg r_0x004000; struct ramfuc_reg r_0x004004; struct ramfuc_reg r_0x004018; struct ramfuc_reg r_0x004128; struct ramfuc_reg r_0x004168; + struct ramfuc_reg r_0x100080; struct ramfuc_reg r_0x100200; struct ramfuc_reg r_0x100210; struct ramfuc_reg r_0x100220[9]; @@ -56,6 +65,7 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x100714; struct ramfuc_reg r_0x100718; struct ramfuc_reg r_0x10071c; + struct ramfuc_reg r_0x100720; struct ramfuc_reg r_0x100760; struct ramfuc_reg r_0x1007a0; struct ramfuc_reg r_0x1007e0; @@ -63,15 +73,276 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x1110e0; struct ramfuc_reg r_0x111100; struct ramfuc_reg r_0x111104; + struct ramfuc_reg r_0x1111e0; + struct ramfuc_reg r_0x111400; struct ramfuc_reg r_0x611200; struct ramfuc_reg r_mr[4]; }; +struct nva3_ltrain { + enum { + NVA3_TRAIN_UNKNOWN, + NVA3_TRAIN_UNSUPPORTED, + NVA3_TRAIN_ONCE, + NVA3_TRAIN_EXEC, + NVA3_TRAIN_DONE + } state; + u32 r_100720; + u32 r_1111e0; + u32 r_111400; + struct nouveau_mem *mem; +}; + struct nva3_ram { struct nouveau_ram base; struct nva3_ramfuc fuc; + struct nva3_ltrain ltrain; }; +void +nva3_link_train_calc(u32 *vals, struct nva3_ltrain *train) +{ + int i, lo, hi; + u8 median[8], bins[4] = {0, 0, 0, 0}, bin = 0, qty = 0; + + for (i = 0; i < 8; i++) { + for (lo = 0; lo < 0x40; lo++) { + if (!(vals[lo] & 0x80000000)) + continue; + if (vals[lo] & (0x101 << i)) + break; + } + + if (lo == 0x40) + return; + + for (hi = lo + 1; hi < 0x40; hi++) { + if (!(vals[lo] & 0x80000000)) + continue; + if (!(vals[hi] & (0x101 << i))) { + hi--; + break; + } + } + + median[i] = ((hi - lo) >> 1) + lo; + bins[(median[i] & 0xf0) >> 4]++; + median[i] += 0x30; + } + + /* Find the best value for 0x1111e0 */ + for (i = 0; i < 4; i++) { + if (bins[i] > qty) { + bin = i + 3; + qty = bins[i]; + } + } + + train->r_100720 = 0; + for (i = 0; i < 8; i++) { + median[i] = max(median[i], (u8) (bin << 4)); + median[i] = min(median[i], (u8) ((bin << 4) | 0xf)); + + train->r_100720 |= ((median[i] & 0x0f) << (i << 2)); + } + + train->r_1111e0 = 0x02000000 | (bin * 0x101); + train->r_111400 = 0x0; +} + +/* + * Link training for (at least) DDR3 + */ +int +nva3_link_train(struct nouveau_fb *pfb) +{ + struct nouveau_bios *bios = nouveau_bios(pfb); + struct nva3_ram *ram = (void *)pfb->ram; + struct nouveau_clock *clk = nouveau_clock(pfb); + struct nva3_ltrain *train = &ram->ltrain; + struct nouveau_device *device = nv_device(pfb); + struct nva3_ramfuc *fuc = &ram->fuc; + u32 *result, r1700; + int ret, i; + struct nvbios_M0205T M0205T = { 0 }; + u8 ver, hdr, cnt, len, snr, ssz; + unsigned int clk_current; + unsigned long flags; + unsigned long *f = &flags; + + if (nouveau_boolopt(device->cfgopt, "NvMemExec", true) != true) + return -ENOSYS; + + /* XXX: Multiple partitions? */ + result = kmalloc(64 * sizeof(u32), GFP_KERNEL); + if (!result) + return -ENOMEM; + + train->state = NVA3_TRAIN_EXEC; + + /* Clock speeds for training and back */ + nvbios_M0205Tp(bios, &ver, &hdr, &cnt, &len, &snr, &ssz, &M0205T); + if (M0205T.freq == 0) + return -ENOENT; + + clk_current = clk->read(clk, nv_clk_src_mem); + + ret = nva3_clock_pre(clk, f); + if (ret) + goto out; + + /* First: clock up/down */ + ret = ram->base.calc(pfb, (u32) M0205T.freq * 1000); + if (ret) + goto out; + + /* Do this *after* calc, eliminates write in script */ + nv_wr32(pfb, 0x111400, 0x00000000); + /* XXX: Magic writes that improve train reliability? */ + nv_mask(pfb, 0x100674, 0x0000ffff, 0x00000000); + nv_mask(pfb, 0x1005e4, 0x0000ffff, 0x00000000); + nv_mask(pfb, 0x100b0c, 0x000000ff, 0x00000000); + nv_wr32(pfb, 0x100c04, 0x00000400); + + /* Now the training script */ + r1700 = ram_rd32(fuc, 0x001700); + + ram_mask(fuc, 0x100200, 0x00000800, 0x00000000); + ram_wr32(fuc, 0x611200, 0x3300); + ram_wait_vblank(fuc); + ram_wait(fuc, 0x611200, 0x00000003, 0x00000000, 500000); + ram_mask(fuc, 0x001610, 0x00000083, 0x00000003); + ram_mask(fuc, 0x100080, 0x00000020, 0x00000000); + ram_mask(fuc, 0x10f804, 0x80000000, 0x00000000); + ram_wr32(fuc, 0x001700, 0x00000000); + + ram_train(fuc); + + /* Reset */ + ram_mask(fuc, 0x10f804, 0x80000000, 0x80000000); + ram_wr32(fuc, 0x10053c, 0x0); + ram_wr32(fuc, 0x100720, train->r_100720); + ram_wr32(fuc, 0x1111e0, train->r_1111e0); + ram_wr32(fuc, 0x111400, train->r_111400); + ram_nuke(fuc, 0x100080); + ram_mask(fuc, 0x100080, 0x00000020, 0x00000020); + ram_nsec(fuc, 1000); + + ram_wr32(fuc, 0x001700, r1700); + ram_mask(fuc, 0x001610, 0x00000083, 0x00000080); + ram_wr32(fuc, 0x611200, 0x3330); + ram_mask(fuc, 0x100200, 0x00000800, 0x00000800); + + ram_exec(fuc, true); + + ram->base.calc(pfb, clk_current); + ram_exec(fuc, true); + + /* Post-processing, avoids flicker */ + nv_mask(pfb, 0x616308, 0x10, 0x10); + nv_mask(pfb, 0x616b08, 0x10, 0x10); + + nva3_clock_post(clk, f); + + ram_train_result(pfb, result, 64); + for (i = 0; i < 64; i++) + nv_debug(pfb, "Train: %08x", result[i]); + nva3_link_train_calc(result, train); + + nv_debug(pfb, "Train: %08x %08x %08x", train->r_100720, + train->r_1111e0, train->r_111400); + + kfree(result); + + train->state = NVA3_TRAIN_DONE; + + return ret; + +out: + if(ret == -EBUSY) + f = NULL; + + train->state = NVA3_TRAIN_UNSUPPORTED; + + nva3_clock_post(clk, f); + return ret; +} + +int +nva3_link_train_init(struct nouveau_fb *pfb) +{ + static const u32 pattern[16] = { + 0xaaaaaaaa, 0xcccccccc, 0xdddddddd, 0xeeeeeeee, + 0x00000000, 0x11111111, 0x44444444, 0xdddddddd, + 0x33333333, 0x55555555, 0x77777777, 0x66666666, + 0x99999999, 0x88888888, 0xeeeeeeee, 0xbbbbbbbb, + }; + struct nouveau_bios *bios = nouveau_bios(pfb); + struct nva3_ram *ram = (void *)pfb->ram; + struct nva3_ltrain *train = &ram->ltrain; + struct nouveau_mem *mem; + struct nvbios_M0205E M0205E; + u8 ver, hdr, cnt, len; + u32 r001700; + int ret, i = 0; + + train->state = NVA3_TRAIN_UNSUPPORTED; + + /* We support type "5" + * XXX: training pattern table appears to be unused for this routine */ + if (!nvbios_M0205Ep(bios, i, &ver, &hdr, &cnt, &len, &M0205E)) + return -ENOENT; + + if (M0205E.type != 5) + return 0; + + train->state = NVA3_TRAIN_ONCE; + + ret = pfb->ram->get(pfb, 0x8000, 0x10000, 0, 0x800, &ram->ltrain.mem); + if (ret) + return ret; + + mem = ram->ltrain.mem; + + nv_wr32(pfb, 0x100538, 0x10000000 | (mem->offset >> 16)); + nv_wr32(pfb, 0x1005a8, 0x0000ffff); + nv_mask(pfb, 0x10f800, 0x00000001, 0x00000001); + + for (i = 0; i < 0x30; i++) { + nv_wr32(pfb, 0x10f8c0, (i << 8) | i); + nv_wr32(pfb, 0x10f900, pattern[i % 16]); + } + + for (i = 0; i < 0x30; i++) { + nv_wr32(pfb, 0x10f8e0, (i << 8) | i); + nv_wr32(pfb, 0x10f920, pattern[i % 16]); + } + + /* And upload the pattern */ + r001700 = nv_rd32(pfb, 0x1700); + nv_wr32(pfb, 0x1700, mem->offset >> 16); + for (i = 0; i < 16; i++) + nv_wr32(pfb, 0x700000 + (i << 2), pattern[i]); + for (i = 0; i < 16; i++) + nv_wr32(pfb, 0x700100 + (i << 2), pattern[i]); + nv_wr32(pfb, 0x1700, r001700); + + train->r_100720 = nv_rd32(pfb, 0x100720); + train->r_1111e0 = nv_rd32(pfb, 0x1111e0); + train->r_111400 = nv_rd32(pfb, 0x111400); + + return 0; +} + +void +nva3_link_train_fini(struct nouveau_fb *pfb) +{ + struct nva3_ram *ram = (void *)pfb->ram; + + if (ram->ltrain.mem) + pfb->ram->put(pfb, &ram->ltrain.mem); +} + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { @@ -90,6 +361,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) next->freq = freq; ram->base.next = next; + if (ram->ltrain.state == NVA3_TRAIN_ONCE) + nva3_link_train(pfb); + /* lookup memory config data relevant to the target frequency */ i = 0; while ((data = nvbios_rammapEp(bios, i++, &ver, &hdr, &cnt, &len, @@ -330,38 +604,24 @@ nva3_ram_init(struct nouveau_object *object) { struct nouveau_fb *pfb = (void *)object->parent; struct nva3_ram *ram = (void *)object; - int ret, i; + int ret; ret = nouveau_ram_init(&ram->base); if (ret) return ret; - /* prepare for ddr link training, and load training patterns */ - switch (ram->base.type) { - case NV_MEM_TYPE_DDR3: { - if (nv_device(pfb)->chipset == 0xa8) { - static const u32 pattern[16] = { - 0xaaaaaaaa, 0xcccccccc, 0xdddddddd, 0xeeeeeeee, - 0x00000000, 0x11111111, 0x44444444, 0xdddddddd, - 0x33333333, 0x55555555, 0x77777777, 0x66666666, - 0x99999999, 0x88888888, 0xeeeeeeee, 0xbbbbbbbb, - }; - - nv_wr32(pfb, 0x100538, 0x10001ff6); /*XXX*/ - nv_wr32(pfb, 0x1005a8, 0x0000ffff); - nv_mask(pfb, 0x10f800, 0x00000001, 0x00000001); - for (i = 0; i < 0x30; i++) { - nv_wr32(pfb, 0x10f8c0, (i << 8) | i); - nv_wr32(pfb, 0x10f8e0, (i << 8) | i); - nv_wr32(pfb, 0x10f900, pattern[i % 16]); - nv_wr32(pfb, 0x10f920, pattern[i % 16]); - } - } - } - break; - default: - break; - } + nva3_link_train_init(pfb); + + return 0; +} + +static int +nva3_ram_fini(struct nouveau_object *object, bool suspend) +{ + struct nouveau_fb *pfb = (void *)object->parent; + + if (!suspend) + nva3_link_train_fini(pfb); return 0; } @@ -390,6 +650,8 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, return 0; } + ram->fuc.r_0x001610 = ramfuc_reg(0x001610); + ram->fuc.r_0x001700 = ramfuc_reg(0x001700); ram->fuc.r_0x004000 = ramfuc_reg(0x004000); ram->fuc.r_0x004004 = ramfuc_reg(0x004004); ram->fuc.r_0x004018 = ramfuc_reg(0x004018); @@ -438,6 +700,6 @@ nva3_ram_oclass = { .ctor = nva3_ram_ctor, .dtor = _nouveau_ram_dtor, .init = nva3_ram_init, - .fini = _nouveau_ram_fini, + .fini = nva3_ram_fini, }, }; diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc index e89789a..4d629dc 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc @@ -50,6 +50,7 @@ handler(WR32 , 0x0000, 0x0002, #memx_func_wr32) handler(WAIT , 0x0004, 0x0000, #memx_func_wait) handler(DELAY , 0x0001, 0x0000, #memx_func_delay) handler(VBLANK, 0x0001, 0x0000, #memx_func_wait_vblank) +handler(TRAIN , 0x0000, 0x0000, #memx_func_train) memx_func_tail: .equ #memx_func_size #memx_func_next - #memx_func_head @@ -63,6 +64,10 @@ memx_ts_end: memx_data_head: .skip 0x0800 memx_data_tail: + +memx_train_head: +.skip 0x0100 +memx_train_tail: #endif /****************************************************************************** @@ -260,6 +265,101 @@ memx_func_delay: // description // // $r15 - current (memx) +// $r4 - packet length +// $r3 - opcode desciption +// $r0 - zero +memx_func_train: +#if NVKM_PPWR_CHIPSET == GT215 +// $r5 - outer loop counter +// $r6 - inner loop counter +// $r7 - entry counter (#memx_train_head + $r7) + movw $r5 0x3 + movw $r7 0x0 + +// Read random memory to wake up... things + imm32($r9, 0x700000) + nv_rd32($r8,$r9) + movw $r14 0x2710 + call(nsec) + + memx_func_train_loop_outer: + mulu $r8 $r5 0x101 + sethi $r8 0x02000000 + imm32($r9, 0x1111e0) + nv_wr32($r9, $r8) + push $r5 + + movw $r6 0x0 + memx_func_train_loop_inner: + movw $r8 0x1111 + mulu $r9 $r6 $r8 + shl b32 $r8 $r9 0x10 + or $r8 $r9 + imm32($r9, 0x100720) + nv_wr32($r9, $r8) + + imm32($r9, 0x100080) + nv_rd32($r8, $r9) + or $r8 $r8 0x20 + nv_wr32($r9, $r8) + + imm32($r9, 0x10053c) + imm32($r8, 0x80003002) + nv_wr32($r9, $r8) + + imm32($r14, 0x100560) + imm32($r13, 0x80000000) + add b32 $r12 $r13 0 + imm32($r11, 0x001e8480) + call(wait) + + // $r5 - inner inner loop counter + // $r9 - result + movw $r5 0 + imm32($r9, 0x8300ffff) + memx_func_train_loop_4x: + imm32($r10, 0x100080) + nv_rd32($r8, $r10) + imm32($r11, 0xffffffdf) + and $r8 $r11 + nv_wr32($r10, $r8) + + imm32($r10, 0x10053c) + imm32($r8, 0x80003002) + nv_wr32($r10, $r8) + + imm32($r14, 0x100560) + imm32($r13, 0x80000000) + mov b32 $r12 $r13 + imm32($r11, 0x00002710) + call(wait) + + nv_rd32($r13, $r14) + and $r9 $r9 $r13 + + add b32 $r5 1 + cmp b16 $r5 0x4 + bra l #memx_func_train_loop_4x + + add b32 $r10 $r7 #memx_train_head + st b32 D[$r10 + 0] $r9 + add b32 $r6 1 + add b32 $r7 4 + + cmp b16 $r6 0x10 + bra l #memx_func_train_loop_inner + + pop $r5 + add b32 $r5 1 + cmp b16 $r5 7 + bra l #memx_func_train_loop_outer + +#endif + ret + +// description +// +// $r15 - current (memx) // $r14 - sender process name // $r13 - message (exec) // $r12 - head of script @@ -307,8 +407,19 @@ memx_exec: // $r11 - data1 // $r0 - zero memx_info: + cmp b16 $r12 0x1 + bra e #memx_info_train + + memx_info_data: mov $r12 #memx_data_head mov $r11 #memx_data_tail - #memx_data_head + bra #memx_info_send + + memx_info_train: + mov $r12 #memx_train_head + mov $r11 #memx_train_tail - #memx_train_head + + memx_info_send: call(send) ret diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h index 522e307..c8b06cb 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h @@ -18,6 +18,10 @@ #define MEMX_MSG_INFO 0 #define MEMX_MSG_EXEC 1 +/* MEMX: info types */ +#define MEMX_INFO_DATA 0 +#define MEMX_INFO_TRAIN 1 + /* MEMX: script opcode definitions */ #define MEMX_ENTER 1 #define MEMX_LEAVE 2 @@ -25,6 +29,7 @@ #define MEMX_WAIT 4 #define MEMX_DELAY 5 #define MEMX_VBLANK 6 +#define MEMX_TRAIN 7 /* I2C_: message identifiers */ #define I2C__MSG_RD08 0 diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c b/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c index 65eaa25..f6ce39d 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c @@ -47,7 +47,8 @@ nouveau_memx_init(struct nouveau_pwr *ppwr, struct nouveau_memx **pmemx) u32 reply[2]; int ret; - ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, 0, 0); + ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, + MEMX_INFO_DATA, 0); if (ret) return ret; @@ -152,6 +153,38 @@ nouveau_memx_wait_vblank(struct nouveau_memx *memx) } void +nouveau_memx_train(struct nouveau_memx *memx) +{ + nv_debug(memx->ppwr, " MEM TRAIN\n"); + memx_cmd(memx, MEMX_TRAIN, 0, NULL); +} + +int +nouveau_memx_train_result(struct nouveau_pwr *ppwr, u32 *res, int rsize) +{ + u32 reply[2], base, size, i; + int ret; + + ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, + MEMX_INFO_TRAIN, 0); + if (ret) + return ret; + + base = reply[0]; + size = reply[1] >> 2; + if (size > rsize) + return -ENOMEM; + + /* read the packet */ + nv_wr32(ppwr, 0x10a1c0, 0x02000000 | base); + + for (i = 0; i < size; i++) + res[i] = nv_rd32(ppwr, 0x10a1c4); + + return 0; +} + +void nouveau_memx_block(struct nouveau_memx *memx) { nv_debug(memx->ppwr, " HOST BLOCKED\n"); -- 1.9.3
Roy Spliet
2014-Oct-02 16:01 UTC
[Nouveau] RESEND: Implement reclocking for DDR2, DDR3, GDDR3
This resend fixes intermediate compilation issues, thus unbreaking git bisect. In addition fixed a little thinko in nouveau_memx_wait, that transformed waits into either sleeps or nops. I didn't bother labelling patches v2, I hope you don't mind! :-) Roy
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h | 2 +- drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c | 2 +- drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c | 2 +- drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h index a685bbd..ae3f17d 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h @@ -43,7 +43,7 @@ struct nvbios_ramcfg { unsigned ramcfg_10_02_08:1; unsigned ramcfg_10_02_10:1; unsigned ramcfg_10_02_20:1; - unsigned ramcfg_10_02_40:1; + unsigned ramcfg_10_DLLoff:1; unsigned ramcfg_10_03_0f:4; unsigned ramcfg_10_05:8; unsigned ramcfg_10_06:8; diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c b/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c index 585e693..24dc9b3 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c +++ b/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c @@ -162,7 +162,7 @@ nvbios_rammapSp(struct nouveau_bios *bios, u32 data, p->ramcfg_10_02_08 = (nv_ro08(bios, data + 0x02) & 0x08) >> 3; p->ramcfg_10_02_10 = (nv_ro08(bios, data + 0x02) & 0x10) >> 4; p->ramcfg_10_02_20 = (nv_ro08(bios, data + 0x02) & 0x20) >> 5; - p->ramcfg_10_02_40 = (nv_ro08(bios, data + 0x02) & 0x40) >> 6; + p->ramcfg_10_DLLoff = (nv_ro08(bios, data + 0x02) & 0x40) >> 6; p->ramcfg_10_03_0f = (nv_ro08(bios, data + 0x03) & 0x0f) >> 0; p->ramcfg_10_05 = (nv_ro08(bios, data + 0x05) & 0xff) >> 0; p->ramcfg_10_06 = (nv_ro08(bios, data + 0x06) & 0xff) >> 0; diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c b/drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c index bb1eb8f..252575f 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/sddr2.c @@ -66,7 +66,7 @@ nouveau_sddr2_calc(struct nouveau_ram *ram) case 0x10: CL = ram->next->bios.timing_10_CL; WR = ram->next->bios.timing_10_WR; - DLL = !ram->next->bios.ramcfg_10_02_40; + DLL = !ram->next->bios.ramcfg_10_DLLoff; ODT = ram->next->bios.timing_10_ODT & 3; break; case 0x20: diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c index 83949b1..a2dca48 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/sddr3.c @@ -80,7 +80,7 @@ nouveau_sddr3_calc(struct nouveau_ram *ram) CWL = ram->next->bios.timing_10_CWL; CL = ram->next->bios.timing_10_CL; WR = ram->next->bios.timing_10_WR; - DLL = !ram->next->bios.ramcfg_10_02_40; + DLL = !ram->next->bios.ramcfg_10_DLLoff; ODT = ram->next->bios.timing_10_ODT; break; case 0x20: -- 1.9.3
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/Makefile | 1 + drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c | 117 +++++++++++++++++++++++++ drivers/gpu/drm/nouveau/core/subdev/fb/priv.h | 1 + 3 files changed, 119 insertions(+) create mode 100644 drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c diff --git a/drivers/gpu/drm/nouveau/Makefile b/drivers/gpu/drm/nouveau/Makefile index 12c24c8..d5f5aa9 100644 --- a/drivers/gpu/drm/nouveau/Makefile +++ b/drivers/gpu/drm/nouveau/Makefile @@ -129,6 +129,7 @@ nouveau-y += core/subdev/fb/ramgk20a.o nouveau-y += core/subdev/fb/ramgm107.o nouveau-y += core/subdev/fb/sddr2.o nouveau-y += core/subdev/fb/sddr3.o +nouveau-y += core/subdev/fb/gddr3.o nouveau-y += core/subdev/fb/gddr5.o nouveau-y += core/subdev/fuse/base.o nouveau-y += core/subdev/fuse/g80.o diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c new file mode 100644 index 0000000..d85a25d --- /dev/null +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/gddr3.c @@ -0,0 +1,117 @@ +/* + * Copyright 2013 Red Hat Inc. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + * + * Authors: Ben Skeggs <bskeggs at redhat.com> + * Roy Spliet <rspliet at eclipso.eu> + */ + +#include <subdev/bios.h> +#include "priv.h" + +struct ramxlat { + int id; + u8 enc; +}; + +static inline int +ramxlat(const struct ramxlat *xlat, int id) +{ + while (xlat->id >= 0) { + if (xlat->id == id) + return xlat->enc; + xlat++; + } + return -EINVAL; +} + +static const struct ramxlat +ramgddr3_cl_lo[] = { + { 7, 7 }, { 8, 0 }, { 9, 1 }, { 10, 2 }, { 11, 3 }, + /* the below are mentioned in some, but not all, gddr3 docs */ + { 12, 4 }, { 13, 5 }, { 14, 6 }, + /* XXX: Per Samsung docs, are these used? They overlap with Qimonda */ + /* { 4, 4 }, { 5, 5 }, { 6, 6 }, { 12, 8 }, { 13, 9 }, { 14, 10 }, + * { 15, 11 }, */ + { -1 } +}; + +static const struct ramxlat +ramgddr3_cl_hi[] = { + { 10, 2 }, { 11, 3 }, { 12, 4 }, { 13, 5 }, { 14, 6 }, { 15, 7 }, + { 16, 0 }, { 17, 1 }, + { -1 } +}; + +static const struct ramxlat +ramgddr3_wr_lo[] = { + { 5, 2 }, { 7, 4 }, { 8, 5 }, { 9, 6 }, { 10, 7 }, + { 11, 0 }, + /* the below are mentioned in some, but not all, gddr3 docs */ + { 4, 1 }, { 6, 3 }, { 12, 1 }, { 13 , 2 }, + { -1 } +}; + +int +nouveau_gddr3_calc(struct nouveau_ram *ram) +{ + int CL, WR, CWL, DLL = 0, ODT = 0, hi; + + switch (ram->next->bios.timing_ver) { + case 0x10: + CWL = ram->next->bios.timing_10_CWL; + CL = ram->next->bios.timing_10_CL; + WR = ram->next->bios.timing_10_WR; + DLL = !ram->next->bios.ramcfg_10_DLLoff; + ODT = ram->next->bios.timing_10_ODT; + break; + case 0x20: + CWL = (ram->next->bios.timing[1] & 0x00000f80) >> 7; + CL = (ram->next->bios.timing[1] & 0x0000001f) >> 0; + WR = (ram->next->bios.timing[2] & 0x007f0000) >> 16; + /* XXX: Get these values from the VBIOS instead */ + DLL = !(ram->mr[1] & 0x1); + ODT = (ram->mr[1] & 0x004) >> 2 | + (ram->mr[1] & 0x040) >> 5 | + (ram->mr[1] & 0x200) >> 7; + break; + default: + return -ENOSYS; + } + + hi = ram->mr[2] & 0x1; + CL = ramxlat(hi ? ramgddr3_cl_hi : ramgddr3_cl_lo, CL); + WR = ramxlat(ramgddr3_wr_lo, WR); + if (CL < 0 || CWL < 1 || CWL > 7 || WR < 0) + return -EINVAL; + + ram->mr[0] &= ~0xf74; + ram->mr[0] |= (CWL & 0x07) << 9; + ram->mr[0] |= (CL & 0x07) << 4; + ram->mr[0] |= (CL & 0x08) >> 1; + + ram->mr[1] &= ~0x3fc; + ram->mr[1] |= (ODT & 0x03) << 2; + ram->mr[1] |= (ODT & 0x03) << 8; + ram->mr[1] |= (WR & 0x03) << 4; + ram->mr[1] |= (WR & 0x04) << 5; + ram->mr[1] |= !DLL << 6; + return 0; +} diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h index 60322e9..283863f 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/priv.h @@ -37,6 +37,7 @@ extern struct nouveau_oclass gm107_ram_oclass; int nouveau_sddr2_calc(struct nouveau_ram *ram); int nouveau_sddr3_calc(struct nouveau_ram *ram); +int nouveau_gddr3_calc(struct nouveau_ram *ram); int nouveau_gddr5_calc(struct nouveau_ram *ram, bool nuts); #define nouveau_fb_create(p,e,c,d) \ -- 1.9.3
V2: fix whitespace errors in memx.fuc Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/include/subdev/pwr.h | 2 + drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h | 16 + drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 322 +++++++++++++++++++-- .../gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc | 111 +++++++ drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h | 5 + drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c | 35 ++- 6 files changed, 462 insertions(+), 29 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h b/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h index bf3d1f6..f2427bf 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/pwr.h @@ -48,6 +48,8 @@ void nouveau_memx_wait(struct nouveau_memx *, u32 addr, u32 mask, u32 data, u32 nsec); void nouveau_memx_nsec(struct nouveau_memx *, u32 nsec); void nouveau_memx_wait_vblank(struct nouveau_memx *); +void nouveau_memx_train(struct nouveau_memx *); +int nouveau_memx_train_result(struct nouveau_pwr *, u32 *, int); void nouveau_memx_block(struct nouveau_memx *); void nouveau_memx_unblock(struct nouveau_memx *); diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h b/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h index d1fbbe4..0ac7256 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramfuc.h @@ -141,6 +141,20 @@ ramfuc_wait_vblank(struct ramfuc *ram) } static inline void +ramfuc_train(struct ramfuc *ram) +{ + nouveau_memx_train(ram->memx); +} + +static inline int +ramfuc_train_result(struct nouveau_fb *pfb, u32 *result, u32 rsize) +{ + struct nouveau_pwr *ppwr = nouveau_pwr(pfb); + + return nouveau_memx_train_result(ppwr, result, rsize); +} + +static inline void ramfuc_block(struct ramfuc *ram) { nouveau_memx_block(ram->memx); @@ -162,6 +176,8 @@ ramfuc_unblock(struct ramfuc *ram) #define ram_wait(s,r,m,d,n) ramfuc_wait(&(s)->base, (r), (m), (d), (n)) #define ram_nsec(s,n) ramfuc_nsec(&(s)->base, (n)) #define ram_wait_vblank(s) ramfuc_wait_vblank(&(s)->base) +#define ram_train(s) ramfuc_train(&(s)->base) +#define ram_train_result(s,r,l) ramfuc_train_result((s), (r), (l)) #define ram_block(s) ramfuc_block(&(s)->base) #define ram_unblock(s) ramfuc_unblock(&(s)->base) diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 3601dec..45e8a91 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -20,17 +20,23 @@ * OTHER DEALINGS IN THE SOFTWARE. * * Authors: Ben Skeggs + * Roy Spliet <rspliet at eclipso.eu> */ #include <subdev/bios.h> #include <subdev/bios/bit.h> #include <subdev/bios/pll.h> #include <subdev/bios/rammap.h> +#include <subdev/bios/M0205.h> #include <subdev/bios/timing.h> #include <subdev/clock/nva3.h> #include <subdev/clock/pll.h> +#include <subdev/timer.h> + +#include <engine/fifo.h> + #include <core/option.h> #include "ramfuc.h" @@ -39,11 +45,14 @@ struct nva3_ramfuc { struct ramfuc base; + struct ramfuc_reg r_0x001610; + struct ramfuc_reg r_0x001700; struct ramfuc_reg r_0x004000; struct ramfuc_reg r_0x004004; struct ramfuc_reg r_0x004018; struct ramfuc_reg r_0x004128; struct ramfuc_reg r_0x004168; + struct ramfuc_reg r_0x100080; struct ramfuc_reg r_0x100200; struct ramfuc_reg r_0x100210; struct ramfuc_reg r_0x100220[9]; @@ -56,6 +65,7 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x100714; struct ramfuc_reg r_0x100718; struct ramfuc_reg r_0x10071c; + struct ramfuc_reg r_0x100720; struct ramfuc_reg r_0x100760; struct ramfuc_reg r_0x1007a0; struct ramfuc_reg r_0x1007e0; @@ -63,15 +73,276 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x1110e0; struct ramfuc_reg r_0x111100; struct ramfuc_reg r_0x111104; + struct ramfuc_reg r_0x1111e0; + struct ramfuc_reg r_0x111400; struct ramfuc_reg r_0x611200; struct ramfuc_reg r_mr[4]; }; +struct nva3_ltrain { + enum { + NVA3_TRAIN_UNKNOWN, + NVA3_TRAIN_UNSUPPORTED, + NVA3_TRAIN_ONCE, + NVA3_TRAIN_EXEC, + NVA3_TRAIN_DONE + } state; + u32 r_100720; + u32 r_1111e0; + u32 r_111400; + struct nouveau_mem *mem; +}; + struct nva3_ram { struct nouveau_ram base; struct nva3_ramfuc fuc; + struct nva3_ltrain ltrain; }; +void +nva3_link_train_calc(u32 *vals, struct nva3_ltrain *train) +{ + int i, lo, hi; + u8 median[8], bins[4] = {0, 0, 0, 0}, bin = 0, qty = 0; + + for (i = 0; i < 8; i++) { + for (lo = 0; lo < 0x40; lo++) { + if (!(vals[lo] & 0x80000000)) + continue; + if (vals[lo] & (0x101 << i)) + break; + } + + if (lo == 0x40) + return; + + for (hi = lo + 1; hi < 0x40; hi++) { + if (!(vals[lo] & 0x80000000)) + continue; + if (!(vals[hi] & (0x101 << i))) { + hi--; + break; + } + } + + median[i] = ((hi - lo) >> 1) + lo; + bins[(median[i] & 0xf0) >> 4]++; + median[i] += 0x30; + } + + /* Find the best value for 0x1111e0 */ + for (i = 0; i < 4; i++) { + if (bins[i] > qty) { + bin = i + 3; + qty = bins[i]; + } + } + + train->r_100720 = 0; + for (i = 0; i < 8; i++) { + median[i] = max(median[i], (u8) (bin << 4)); + median[i] = min(median[i], (u8) ((bin << 4) | 0xf)); + + train->r_100720 |= ((median[i] & 0x0f) << (i << 2)); + } + + train->r_1111e0 = 0x02000000 | (bin * 0x101); + train->r_111400 = 0x0; +} + +/* + * Link training for (at least) DDR3 + */ +int +nva3_link_train(struct nouveau_fb *pfb) +{ + struct nouveau_bios *bios = nouveau_bios(pfb); + struct nva3_ram *ram = (void *)pfb->ram; + struct nouveau_clock *clk = nouveau_clock(pfb); + struct nva3_ltrain *train = &ram->ltrain; + struct nouveau_device *device = nv_device(pfb); + struct nva3_ramfuc *fuc = &ram->fuc; + u32 *result, r1700; + int ret, i; + struct nvbios_M0205T M0205T = { 0 }; + u8 ver, hdr, cnt, len, snr, ssz; + unsigned int clk_current; + unsigned long flags; + unsigned long *f = &flags; + + if (nouveau_boolopt(device->cfgopt, "NvMemExec", true) != true) + return -ENOSYS; + + /* XXX: Multiple partitions? */ + result = kmalloc(64 * sizeof(u32), GFP_KERNEL); + if (!result) + return -ENOMEM; + + train->state = NVA3_TRAIN_EXEC; + + /* Clock speeds for training and back */ + nvbios_M0205Tp(bios, &ver, &hdr, &cnt, &len, &snr, &ssz, &M0205T); + if (M0205T.freq == 0) + return -ENOENT; + + clk_current = clk->read(clk, nv_clk_src_mem); + + ret = nva3_clock_pre(clk, f); + if (ret) + goto out; + + /* First: clock up/down */ + ret = ram->base.calc(pfb, (u32) M0205T.freq * 1000); + if (ret) + goto out; + + /* Do this *after* calc, eliminates write in script */ + nv_wr32(pfb, 0x111400, 0x00000000); + /* XXX: Magic writes that improve train reliability? */ + nv_mask(pfb, 0x100674, 0x0000ffff, 0x00000000); + nv_mask(pfb, 0x1005e4, 0x0000ffff, 0x00000000); + nv_mask(pfb, 0x100b0c, 0x000000ff, 0x00000000); + nv_wr32(pfb, 0x100c04, 0x00000400); + + /* Now the training script */ + r1700 = ram_rd32(fuc, 0x001700); + + ram_mask(fuc, 0x100200, 0x00000800, 0x00000000); + ram_wr32(fuc, 0x611200, 0x3300); + ram_wait_vblank(fuc); + ram_wait(fuc, 0x611200, 0x00000003, 0x00000000, 500000); + ram_mask(fuc, 0x001610, 0x00000083, 0x00000003); + ram_mask(fuc, 0x100080, 0x00000020, 0x00000000); + ram_mask(fuc, 0x10f804, 0x80000000, 0x00000000); + ram_wr32(fuc, 0x001700, 0x00000000); + + ram_train(fuc); + + /* Reset */ + ram_mask(fuc, 0x10f804, 0x80000000, 0x80000000); + ram_wr32(fuc, 0x10053c, 0x0); + ram_wr32(fuc, 0x100720, train->r_100720); + ram_wr32(fuc, 0x1111e0, train->r_1111e0); + ram_wr32(fuc, 0x111400, train->r_111400); + ram_nuke(fuc, 0x100080); + ram_mask(fuc, 0x100080, 0x00000020, 0x00000020); + ram_nsec(fuc, 1000); + + ram_wr32(fuc, 0x001700, r1700); + ram_mask(fuc, 0x001610, 0x00000083, 0x00000080); + ram_wr32(fuc, 0x611200, 0x3330); + ram_mask(fuc, 0x100200, 0x00000800, 0x00000800); + + ram_exec(fuc, true); + + ram->base.calc(pfb, clk_current); + ram_exec(fuc, true); + + /* Post-processing, avoids flicker */ + nv_mask(pfb, 0x616308, 0x10, 0x10); + nv_mask(pfb, 0x616b08, 0x10, 0x10); + + nva3_clock_post(clk, f); + + ram_train_result(pfb, result, 64); + for (i = 0; i < 64; i++) + nv_debug(pfb, "Train: %08x", result[i]); + nva3_link_train_calc(result, train); + + nv_debug(pfb, "Train: %08x %08x %08x", train->r_100720, + train->r_1111e0, train->r_111400); + + kfree(result); + + train->state = NVA3_TRAIN_DONE; + + return ret; + +out: + if(ret == -EBUSY) + f = NULL; + + train->state = NVA3_TRAIN_UNSUPPORTED; + + nva3_clock_post(clk, f); + return ret; +} + +int +nva3_link_train_init(struct nouveau_fb *pfb) +{ + static const u32 pattern[16] = { + 0xaaaaaaaa, 0xcccccccc, 0xdddddddd, 0xeeeeeeee, + 0x00000000, 0x11111111, 0x44444444, 0xdddddddd, + 0x33333333, 0x55555555, 0x77777777, 0x66666666, + 0x99999999, 0x88888888, 0xeeeeeeee, 0xbbbbbbbb, + }; + struct nouveau_bios *bios = nouveau_bios(pfb); + struct nva3_ram *ram = (void *)pfb->ram; + struct nva3_ltrain *train = &ram->ltrain; + struct nouveau_mem *mem; + struct nvbios_M0205E M0205E; + u8 ver, hdr, cnt, len; + u32 r001700; + int ret, i = 0; + + train->state = NVA3_TRAIN_UNSUPPORTED; + + /* We support type "5" + * XXX: training pattern table appears to be unused for this routine */ + if (!nvbios_M0205Ep(bios, i, &ver, &hdr, &cnt, &len, &M0205E)) + return -ENOENT; + + if (M0205E.type != 5) + return 0; + + train->state = NVA3_TRAIN_ONCE; + + ret = pfb->ram->get(pfb, 0x8000, 0x10000, 0, 0x800, &ram->ltrain.mem); + if (ret) + return ret; + + mem = ram->ltrain.mem; + + nv_wr32(pfb, 0x100538, 0x10000000 | (mem->offset >> 16)); + nv_wr32(pfb, 0x1005a8, 0x0000ffff); + nv_mask(pfb, 0x10f800, 0x00000001, 0x00000001); + + for (i = 0; i < 0x30; i++) { + nv_wr32(pfb, 0x10f8c0, (i << 8) | i); + nv_wr32(pfb, 0x10f900, pattern[i % 16]); + } + + for (i = 0; i < 0x30; i++) { + nv_wr32(pfb, 0x10f8e0, (i << 8) | i); + nv_wr32(pfb, 0x10f920, pattern[i % 16]); + } + + /* And upload the pattern */ + r001700 = nv_rd32(pfb, 0x1700); + nv_wr32(pfb, 0x1700, mem->offset >> 16); + for (i = 0; i < 16; i++) + nv_wr32(pfb, 0x700000 + (i << 2), pattern[i]); + for (i = 0; i < 16; i++) + nv_wr32(pfb, 0x700100 + (i << 2), pattern[i]); + nv_wr32(pfb, 0x1700, r001700); + + train->r_100720 = nv_rd32(pfb, 0x100720); + train->r_1111e0 = nv_rd32(pfb, 0x1111e0); + train->r_111400 = nv_rd32(pfb, 0x111400); + + return 0; +} + +void +nva3_link_train_fini(struct nouveau_fb *pfb) +{ + struct nva3_ram *ram = (void *)pfb->ram; + + if (ram->ltrain.mem) + pfb->ram->put(pfb, &ram->ltrain.mem); +} + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { @@ -90,6 +361,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) next->freq = freq; ram->base.next = next; + if (ram->ltrain.state == NVA3_TRAIN_ONCE) + nva3_link_train(pfb); + /* lookup memory config data relevant to the target frequency */ i = 0; while ((data = nvbios_rammapEp(bios, i++, &ver, &hdr, &cnt, &len, @@ -330,38 +604,24 @@ nva3_ram_init(struct nouveau_object *object) { struct nouveau_fb *pfb = (void *)object->parent; struct nva3_ram *ram = (void *)object; - int ret, i; + int ret; ret = nouveau_ram_init(&ram->base); if (ret) return ret; - /* prepare for ddr link training, and load training patterns */ - switch (ram->base.type) { - case NV_MEM_TYPE_DDR3: { - if (nv_device(pfb)->chipset == 0xa8) { - static const u32 pattern[16] = { - 0xaaaaaaaa, 0xcccccccc, 0xdddddddd, 0xeeeeeeee, - 0x00000000, 0x11111111, 0x44444444, 0xdddddddd, - 0x33333333, 0x55555555, 0x77777777, 0x66666666, - 0x99999999, 0x88888888, 0xeeeeeeee, 0xbbbbbbbb, - }; - - nv_wr32(pfb, 0x100538, 0x10001ff6); /*XXX*/ - nv_wr32(pfb, 0x1005a8, 0x0000ffff); - nv_mask(pfb, 0x10f800, 0x00000001, 0x00000001); - for (i = 0; i < 0x30; i++) { - nv_wr32(pfb, 0x10f8c0, (i << 8) | i); - nv_wr32(pfb, 0x10f8e0, (i << 8) | i); - nv_wr32(pfb, 0x10f900, pattern[i % 16]); - nv_wr32(pfb, 0x10f920, pattern[i % 16]); - } - } - } - break; - default: - break; - } + nva3_link_train_init(pfb); + + return 0; +} + +static int +nva3_ram_fini(struct nouveau_object *object, bool suspend) +{ + struct nouveau_fb *pfb = (void *)object->parent; + + if (!suspend) + nva3_link_train_fini(pfb); return 0; } @@ -390,11 +650,14 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, return 0; } + ram->fuc.r_0x001610 = ramfuc_reg(0x001610); + ram->fuc.r_0x001700 = ramfuc_reg(0x001700); ram->fuc.r_0x004000 = ramfuc_reg(0x004000); ram->fuc.r_0x004004 = ramfuc_reg(0x004004); ram->fuc.r_0x004018 = ramfuc_reg(0x004018); ram->fuc.r_0x004128 = ramfuc_reg(0x004128); ram->fuc.r_0x004168 = ramfuc_reg(0x004168); + ram->fuc.r_0x100080 = ramfuc_reg(0x100080); ram->fuc.r_0x100200 = ramfuc_reg(0x100200); ram->fuc.r_0x100210 = ramfuc_reg(0x100210); for (i = 0; i < 9; i++) @@ -408,6 +671,7 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, ram->fuc.r_0x100714 = ramfuc_reg(0x100714); ram->fuc.r_0x100718 = ramfuc_reg(0x100718); ram->fuc.r_0x10071c = ramfuc_reg(0x10071c); + ram->fuc.r_0x100720 = ramfuc_reg(0x100720); ram->fuc.r_0x100760 = ramfuc_stride(0x100760, 4, ram->base.part_mask); ram->fuc.r_0x1007a0 = ramfuc_stride(0x1007a0, 4, ram->base.part_mask); ram->fuc.r_0x1007e0 = ramfuc_stride(0x1007e0, 4, ram->base.part_mask); @@ -415,6 +679,8 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, ram->fuc.r_0x1110e0 = ramfuc_stride(0x1110e0, 4, ram->base.part_mask); ram->fuc.r_0x111100 = ramfuc_reg(0x111100); ram->fuc.r_0x111104 = ramfuc_reg(0x111104); + ram->fuc.r_0x1111e0 = ramfuc_reg(0x1111e0); + ram->fuc.r_0x111400 = ramfuc_reg(0x111400); ram->fuc.r_0x611200 = ramfuc_reg(0x611200); if (ram->base.ranks > 1) { @@ -438,6 +704,6 @@ nva3_ram_oclass = { .ctor = nva3_ram_ctor, .dtor = _nouveau_ram_dtor, .init = nva3_ram_init, - .fini = _nouveau_ram_fini, + .fini = nva3_ram_fini, }, }; diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc index e89789a..ec03f9a 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/memx.fuc @@ -50,6 +50,7 @@ handler(WR32 , 0x0000, 0x0002, #memx_func_wr32) handler(WAIT , 0x0004, 0x0000, #memx_func_wait) handler(DELAY , 0x0001, 0x0000, #memx_func_delay) handler(VBLANK, 0x0001, 0x0000, #memx_func_wait_vblank) +handler(TRAIN , 0x0000, 0x0000, #memx_func_train) memx_func_tail: .equ #memx_func_size #memx_func_next - #memx_func_head @@ -63,6 +64,10 @@ memx_ts_end: memx_data_head: .skip 0x0800 memx_data_tail: + +memx_train_head: +.skip 0x0100 +memx_train_tail: #endif /****************************************************************************** @@ -260,6 +265,101 @@ memx_func_delay: // description // // $r15 - current (memx) +// $r4 - packet length +// $r3 - opcode desciption +// $r0 - zero +memx_func_train: +#if NVKM_PPWR_CHIPSET == GT215 +// $r5 - outer loop counter +// $r6 - inner loop counter +// $r7 - entry counter (#memx_train_head + $r7) + movw $r5 0x3 + movw $r7 0x0 + +// Read random memory to wake up... things + imm32($r9, 0x700000) + nv_rd32($r8,$r9) + movw $r14 0x2710 + call(nsec) + + memx_func_train_loop_outer: + mulu $r8 $r5 0x101 + sethi $r8 0x02000000 + imm32($r9, 0x1111e0) + nv_wr32($r9, $r8) + push $r5 + + movw $r6 0x0 + memx_func_train_loop_inner: + movw $r8 0x1111 + mulu $r9 $r6 $r8 + shl b32 $r8 $r9 0x10 + or $r8 $r9 + imm32($r9, 0x100720) + nv_wr32($r9, $r8) + + imm32($r9, 0x100080) + nv_rd32($r8, $r9) + or $r8 $r8 0x20 + nv_wr32($r9, $r8) + + imm32($r9, 0x10053c) + imm32($r8, 0x80003002) + nv_wr32($r9, $r8) + + imm32($r14, 0x100560) + imm32($r13, 0x80000000) + add b32 $r12 $r13 0 + imm32($r11, 0x001e8480) + call(wait) + + // $r5 - inner inner loop counter + // $r9 - result + movw $r5 0 + imm32($r9, 0x8300ffff) + memx_func_train_loop_4x: + imm32($r10, 0x100080) + nv_rd32($r8, $r10) + imm32($r11, 0xffffffdf) + and $r8 $r11 + nv_wr32($r10, $r8) + + imm32($r10, 0x10053c) + imm32($r8, 0x80003002) + nv_wr32($r10, $r8) + + imm32($r14, 0x100560) + imm32($r13, 0x80000000) + mov b32 $r12 $r13 + imm32($r11, 0x00002710) + call(wait) + + nv_rd32($r13, $r14) + and $r9 $r9 $r13 + + add b32 $r5 1 + cmp b16 $r5 0x4 + bra l #memx_func_train_loop_4x + + add b32 $r10 $r7 #memx_train_head + st b32 D[$r10 + 0] $r9 + add b32 $r6 1 + add b32 $r7 4 + + cmp b16 $r6 0x10 + bra l #memx_func_train_loop_inner + + pop $r5 + add b32 $r5 1 + cmp b16 $r5 7 + bra l #memx_func_train_loop_outer + +#endif + ret + +// description +// +// $r15 - current (memx) // $r14 - sender process name // $r13 - message (exec) // $r12 - head of script @@ -307,8 +407,19 @@ memx_exec: // $r11 - data1 // $r0 - zero memx_info: + cmp b16 $r12 0x1 + bra e #memx_info_train + + memx_info_data: mov $r12 #memx_data_head mov $r11 #memx_data_tail - #memx_data_head + bra #memx_info_send + + memx_info_train: + mov $r12 #memx_train_head + mov $r11 #memx_train_tail - #memx_train_head + + memx_info_send: call(send) ret diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h index 522e307..c8b06cb 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/fuc/os.h @@ -18,6 +18,10 @@ #define MEMX_MSG_INFO 0 #define MEMX_MSG_EXEC 1 +/* MEMX: info types */ +#define MEMX_INFO_DATA 0 +#define MEMX_INFO_TRAIN 1 + /* MEMX: script opcode definitions */ #define MEMX_ENTER 1 #define MEMX_LEAVE 2 @@ -25,6 +29,7 @@ #define MEMX_WAIT 4 #define MEMX_DELAY 5 #define MEMX_VBLANK 6 +#define MEMX_TRAIN 7 /* I2C_: message identifiers */ #define I2C__MSG_RD08 0 diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c b/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c index 65eaa25..f6ce39d 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c @@ -47,7 +47,8 @@ nouveau_memx_init(struct nouveau_pwr *ppwr, struct nouveau_memx **pmemx) u32 reply[2]; int ret; - ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, 0, 0); + ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, + MEMX_INFO_DATA, 0); if (ret) return ret; @@ -152,6 +153,38 @@ nouveau_memx_wait_vblank(struct nouveau_memx *memx) } void +nouveau_memx_train(struct nouveau_memx *memx) +{ + nv_debug(memx->ppwr, " MEM TRAIN\n"); + memx_cmd(memx, MEMX_TRAIN, 0, NULL); +} + +int +nouveau_memx_train_result(struct nouveau_pwr *ppwr, u32 *res, int rsize) +{ + u32 reply[2], base, size, i; + int ret; + + ret = ppwr->message(ppwr, reply, PROC_MEMX, MEMX_MSG_INFO, + MEMX_INFO_TRAIN, 0); + if (ret) + return ret; + + base = reply[0]; + size = reply[1] >> 2; + if (size > rsize) + return -ENOMEM; + + /* read the packet */ + nv_wr32(ppwr, 0x10a1c0, 0x02000000 | base); + + for (i = 0; i < size; i++) + res[i] = nv_rd32(ppwr, 0x10a1c4); + + return 0; +} + +void nouveau_memx_block(struct nouveau_memx *memx) { nv_debug(memx->ppwr, " HOST BLOCKED\n"); -- 1.9.3
Roy Spliet
2014-Oct-02 16:01 UTC
[Nouveau] [PATCH 4/9] fb/ramnva3: Ressurect timing calculation code
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drm/nouveau/core/include/subdev/bios/ramcfg.h | 20 ++++++ drivers/gpu/drm/nouveau/core/subdev/bios/timing.c | 42 +++++++++-- drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 84 +++++++++++++++++++--- 3 files changed, 132 insertions(+), 14 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h index ae3f17d..8ead274 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h @@ -95,9 +95,29 @@ struct nvbios_ramcfg { union { struct { unsigned timing_10_WR:8; + unsigned timing_10_WTR:8; unsigned timing_10_CL:8; + unsigned timing_10_RC:8; + /*empty: 4 */ + unsigned timing_10_RFC:8; /* Byte 5 */ + /*empty: 6 */ + unsigned timing_10_RAS:8; /* Byte 7 */ + /*empty: 8 */ + unsigned timing_10_RP:8; /* Byte 9 */ + unsigned timing_10_RCDRD:8; + unsigned timing_10_RCDWR:8; + unsigned timing_10_RRD:8; + unsigned timing_10_13:8; unsigned timing_10_ODT:3; + /* empty: 15 */ + unsigned timing_10_16:8; + /* empty: 17 */ + unsigned timing_10_18:8; unsigned timing_10_CWL:8; + unsigned timing_10_20:8; + unsigned timing_10_21:8; + /* empty: 22, 23 */ + unsigned timing_10_24:8; }; struct { unsigned timing_20_2e_03:2; diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/timing.c b/drivers/gpu/drm/nouveau/core/subdev/bios/timing.c index 46d955e..8521eca 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/bios/timing.c +++ b/drivers/gpu/drm/nouveau/core/subdev/bios/timing.c @@ -93,10 +93,44 @@ nvbios_timingEp(struct nouveau_bios *bios, int idx, p->timing_hdr = *hdr; switch (!!data * *ver) { case 0x10: - p->timing_10_WR = nv_ro08(bios, data + 0x00); - p->timing_10_CL = nv_ro08(bios, data + 0x02); - p->timing_10_ODT = nv_ro08(bios, data + 0x0e) & 0x07; - p->timing_10_CWL = nv_ro08(bios, data + 0x13); + p->timing_10_WR = nv_ro08(bios, data + 0x00); + p->timing_10_WTR = nv_ro08(bios, data + 0x01); + p->timing_10_CL = nv_ro08(bios, data + 0x02); + p->timing_10_RC = nv_ro08(bios, data + 0x03); + p->timing_10_RFC = nv_ro08(bios, data + 0x05); + p->timing_10_RAS = nv_ro08(bios, data + 0x07); + p->timing_10_RP = nv_ro08(bios, data + 0x09); + p->timing_10_RCDRD = nv_ro08(bios, data + 0x0a); + p->timing_10_RCDWR = nv_ro08(bios, data + 0x0b); + p->timing_10_RRD = nv_ro08(bios, data + 0x0c); + p->timing_10_13 = nv_ro08(bios, data + 0x0d); + p->timing_10_ODT = nv_ro08(bios, data + 0x0e) & 0x07; + + p->timing_10_24 = 0xff; + p->timing_10_21 = 0; + p->timing_10_20 = 0; + p->timing_10_CWL = 0; + p->timing_10_18 = 0; + p->timing_10_16 = 0; + + switch (min_t(u8, *hdr, 25)) { + case 25: + p->timing_10_24 = nv_ro08(bios, data + 0x18); + case 24: + case 23: + case 22: + p->timing_10_21 = nv_ro08(bios, data + 0x15); + case 21: + p->timing_10_20 = nv_ro08(bios, data + 0x14); + case 20: + p->timing_10_CWL = nv_ro08(bios, data + 0x13); + case 19: + p->timing_10_18 = nv_ro08(bios, data + 0x12); + case 18: + case 17: + p->timing_10_16 = nv_ro08(bios, data + 0x10); + } + break; case 0x20: p->timing[0] = nv_ro32(bios, data + 0x00); diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 45e8a91..07dfbba 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -343,6 +343,66 @@ nva3_link_train_fini(struct nouveau_fb *pfb) pfb->ram->put(pfb, &ram->ltrain.mem); } +/* + * RAM reclocking + */ +#define T(t) cfg->timing_10_##t +static int +nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) +{ + struct nva3_ram *ram = (void *)pfb->ram; + struct nvbios_ramcfg *cfg = &ram->base.target.bios; + int tUNK_base; + u32 cur3, cur7, cur8; + + cur3 = nv_rd32(pfb, 0x10022c); + cur7 = nv_rd32(pfb, 0x10023c); + cur8 = nv_rd32(pfb, 0x100240); + + if (T(CWL) == 0) + T(CWL) = ((nv_rd32(pfb, 0x100228) & 0x0f000000) >> 24) + 1; + + tUNK_base = ((cur7 & 0x00ff0000) >> 16) - + (cur3 & 0x000000ff) - 1; + + timing[0] = (T(RP) << 24 | T(RAS) << 16 | T(RFC) << 8 | T(RC)); + timing[1] = (T(WR) + 1 + T(CWL)) << 24 | + max_t(u8,T(18), 1) << 16 | + (T(WTR) + 1 + T(CWL)) << 8 | + (5 + T(CL) - T(CWL)); + timing[2] = (T(CWL) - 1) << 24 | + (T(RRD) << 16) | + (T(RCDWR) << 8) | + T(RCDRD); + timing[3] = (cur3 & 0x00ff0000) | + (0x30 + T(CL)) << 24 | + (0xb + T(CL)) << 8 | + (T(CL) - 1); + timing[4] = T(20) << 24 | + T(21) << 16 | + T(13) << 8 | + T(13); + timing[5] = T(RFC) << 24 | + max_t(u8,T(RCDRD), T(RCDWR)) << 16 | + (T(CWL) + 6) << 8 | + T(RP); + timing[6] = (0x5a + T(CL)) << 16 | + (6 - T(CL) + T(CWL)) << 8 | + (0x50 + T(CL) - T(CWL)); + timing[7] = (cur7 & 0xff000000) | + ((tUNK_base + T(CL)) << 16) | + 0x202; + timing[8] = cur8 & 0xffffff00; + + nv_debug(pfb, "Entry: 220: %08x %08x %08x %08x\n", + timing[0], timing[1], timing[2], timing[3]); + nv_debug(pfb, " 230: %08x %08x %08x %08x\n", + timing[4], timing[5], timing[6], timing[7]); + nv_debug(pfb, " 240: %08x\n", timing[8]); + return 0; +} +#undef T + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { @@ -356,6 +416,7 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) u32 r004018, r100760, ctrl; u32 unk714, unk718, unk71c; int ret, i; + u32 timing[9]; next = &ram->base.target; next->freq = freq; @@ -409,6 +470,8 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) return ret; } + nva3_ram_timing_calc(pfb, timing); + ret = ram_init(fuc, pfb); if (ret) return ret; @@ -519,16 +582,17 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_mask(fuc, mr[0], 0x00000000, 0x00000000); ram_nsec(fuc, 1000); - ram_mask(fuc, 0x100220[3], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[1], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[6], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[7], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[2], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[4], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[5], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[0], 0x00000000, 0x00000000); - ram_mask(fuc, 0x100220[8], 0x00000000, 0x00000000); - + ram_wr32(fuc, 0x100220[3], timing[3]); + ram_wr32(fuc, 0x100220[1], timing[1]); + ram_wr32(fuc, 0x100220[6], timing[6]); + ram_wr32(fuc, 0x100220[7], timing[7]); + ram_wr32(fuc, 0x100220[2], timing[2]); + ram_wr32(fuc, 0x100220[4], timing[4]); + ram_wr32(fuc, 0x100220[5], timing[5]); + ram_wr32(fuc, 0x100220[0], timing[0]); + ram_wr32(fuc, 0x100220[8], timing[8]); + + /* Misc */ ram_mask(fuc, 0x100200, 0x00001000, !next->bios.ramcfg_10_02_08 << 12); unk714 = ram_rd32(fuc, 0x100714) & ~0xf0000010; -- 1.9.3
Roy Spliet
2014-Oct-02 16:01 UTC
[Nouveau] [PATCH 5/9] fb/ramnva3: Reclocking script for DDR3
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- .../drm/nouveau/core/include/subdev/bios/ramcfg.h | 1 + drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c | 1 + drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 292 +++++++++++++++------ 3 files changed, 219 insertions(+), 75 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h index 8ead274..4a0e0ce 100644 --- a/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h +++ b/drivers/gpu/drm/nouveau/core/include/subdev/bios/ramcfg.h @@ -45,6 +45,7 @@ struct nvbios_ramcfg { unsigned ramcfg_10_02_20:1; unsigned ramcfg_10_DLLoff:1; unsigned ramcfg_10_03_0f:4; + unsigned ramcfg_10_04_01:1; unsigned ramcfg_10_05:8; unsigned ramcfg_10_06:8; unsigned ramcfg_10_07:8; diff --git a/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c b/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c index 24dc9b3..c568522 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c +++ b/drivers/gpu/drm/nouveau/core/subdev/bios/rammap.c @@ -164,6 +164,7 @@ nvbios_rammapSp(struct nouveau_bios *bios, u32 data, p->ramcfg_10_02_20 = (nv_ro08(bios, data + 0x02) & 0x20) >> 5; p->ramcfg_10_DLLoff = (nv_ro08(bios, data + 0x02) & 0x40) >> 6; p->ramcfg_10_03_0f = (nv_ro08(bios, data + 0x03) & 0x0f) >> 0; + p->ramcfg_10_04_01 = (nv_ro08(bios, data + 0x04) & 0x01) >> 0; p->ramcfg_10_05 = (nv_ro08(bios, data + 0x05) & 0xff) >> 0; p->ramcfg_10_06 = (nv_ro08(bios, data + 0x06) & 0xff) >> 0; p->ramcfg_10_07 = (nv_ro08(bios, data + 0x07) & 0xff) >> 0; diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 07dfbba..7cff2d4 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -47,6 +47,7 @@ struct nva3_ramfuc { struct ramfuc base; struct ramfuc_reg r_0x001610; struct ramfuc_reg r_0x001700; + struct ramfuc_reg r_0x002504; struct ramfuc_reg r_0x004000; struct ramfuc_reg r_0x004004; struct ramfuc_reg r_0x004018; @@ -56,12 +57,14 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x100200; struct ramfuc_reg r_0x100210; struct ramfuc_reg r_0x100220[9]; + struct ramfuc_reg r_0x100264; struct ramfuc_reg r_0x1002d0; struct ramfuc_reg r_0x1002d4; struct ramfuc_reg r_0x1002dc; struct ramfuc_reg r_0x10053c; struct ramfuc_reg r_0x1005a0; struct ramfuc_reg r_0x1005a4; + struct ramfuc_reg r_0x100700; struct ramfuc_reg r_0x100714; struct ramfuc_reg r_0x100718; struct ramfuc_reg r_0x10071c; @@ -69,6 +72,7 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x100760; struct ramfuc_reg r_0x1007a0; struct ramfuc_reg r_0x1007e0; + struct ramfuc_reg r_0x100da0; struct ramfuc_reg r_0x10f804; struct ramfuc_reg r_0x1110e0; struct ramfuc_reg r_0x111100; @@ -403,20 +407,53 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) } #undef T +static void +nouveau_sddr3_dll_reset(struct nva3_ramfuc *fuc) +{ + ram_mask(fuc, mr[0], 0x100, 0x100); + ram_nsec(fuc, 1000); + ram_mask(fuc, mr[0], 0x100, 0x000); + ram_nsec(fuc, 1000); +} + +static void +nouveau_sddr3_dll_disable(struct nva3_ramfuc *fuc, u32 *mr) +{ + u32 mr1_old = ram_rd32(fuc, mr[1]); + + if (!(mr1_old & 0x1)) { + ram_wr32(fuc, 0x1002d4, 0x00000001); + ram_wr32(fuc, mr[1], mr[1]); + ram_nsec(fuc, 1000); + } +} + +static void +nva3_ram_lock_pll(struct nva3_ramfuc *fuc, struct nva3_clock_info *mclk) +{ + ram_wr32(fuc, 0x004004, mclk->pll); + ram_mask(fuc, 0x004000, 0x00000001, 0x00000001); + ram_mask(fuc, 0x004000, 0x00000010, 0x00000000); + ram_wait(fuc, 0x004000, 0x00020000, 0x00020000, 64000); + ram_mask(fuc, 0x004000, 0x00000010, 0x00000010); +} + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { struct nouveau_bios *bios = nouveau_bios(pfb); struct nva3_ram *ram = (void *)pfb->ram; struct nva3_ramfuc *fuc = &ram->fuc; + struct nva3_ltrain *train = &ram->ltrain; struct nva3_clock_info mclk; struct nouveau_ram_data *next; u8 ver, hdr, cnt, len, strap; u32 data; - u32 r004018, r100760, ctrl; + u32 r004018, r100760, r100da0, r111100, ctrl; u32 unk714, unk718, unk71c; int ret, i; u32 timing[9]; + bool pll2pll; next = &ram->base.target; next->freq = freq; @@ -427,14 +464,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) /* lookup memory config data relevant to the target frequency */ i = 0; - while ((data = nvbios_rammapEp(bios, i++, &ver, &hdr, &cnt, &len, - &next->bios))) { - if (freq / 1000 >= next->bios.rammap_min && - freq / 1000 <= next->bios.rammap_max) - break; - } - - if (!data || ver != 0x10 || hdr < 0x0e) { + data = nvbios_rammapEm(bios, freq / 1000, &ver, &hdr, &cnt, &len, + &next->bios); + if (!data || ver != 0x10 || hdr < 0x05) { nv_error(pfb, "invalid/missing rammap entry\n"); return -EINVAL; } @@ -448,7 +480,7 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) data = nvbios_rammapSp(bios, data, ver, hdr, cnt, len, strap, &ver, &hdr, &next->bios); - if (!data || ver != 0x10 || hdr < 0x0e) { + if (!data || ver != 0x10 || hdr < 0x09) { nv_error(pfb, "invalid/missing ramcfg entry\n"); return -EINVAL; } @@ -458,7 +490,7 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) data = nvbios_timingEp(bios, next->bios.ramcfg_timing, &ver, &hdr, &cnt, &len, &next->bios); - if (!data || ver != 0x10 || hdr < 0x19) { + if (!data || ver != 0x10 || hdr < 0x17) { nv_error(pfb, "invalid/missing timing entry\n"); return -EINVAL; } @@ -476,49 +508,75 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) if (ret) return ret; + /* Determine ram-specific MR values */ + ram->base.mr[0] = ram_rd32(fuc, mr[0]); + ram->base.mr[1] = ram_rd32(fuc, mr[1]); + ram->base.mr[2] = ram_rd32(fuc, mr[2]); + + switch (ram->base.type) { + case NV_MEM_TYPE_DDR3: + ret = nouveau_sddr3_calc(&ram->base); + break; + default: + ret = -ENOSYS; + break; + } + + if (ret) + return ret; + /* XXX: where the fuck does 750MHz come from? */ if (freq <= 750000) { r004018 = 0x10000000; r100760 = 0x22222222; + r100da0 = 0x00000010; } else { r004018 = 0x00000000; r100760 = 0x00000000; + r100da0 = 0x00000000; } + if (!next->bios.ramcfg_10_DLLoff) + r004018 |= 0x00004000; + + /* pll2pll requires to switch to a safe clock first */ ctrl = ram_rd32(fuc, 0x004000); - if (ctrl & 0x00000008) { - if (mclk.pll) { - ram_mask(fuc, 0x004128, 0x00000101, 0x00000101); - ram_wr32(fuc, 0x004004, mclk.pll); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000001)); - ram_wr32(fuc, 0x004000, (ctrl &= 0xffffffef)); - ram_wait(fuc, 0x004000, 0x00020000, 0x00020000, 64000); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000010)); - ram_wr32(fuc, 0x004018, 0x00005000 | r004018); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000004)); - } - } else { - u32 ssel = 0x00000101; - if (mclk.clk) - ssel |= mclk.clk; - else - ssel |= 0x00080000; /* 324MHz, shouldn't matter... */ - ram_mask(fuc, 0x004168, 0x003f3141, ctrl); - } + pll2pll = (!(ctrl & 0x00000008)) && mclk.pll; + /* Pre, NVIDIA does this outside the script */ if (next->bios.ramcfg_10_02_10) { ram_mask(fuc, 0x111104, 0x00000600, 0x00000000); } else { ram_mask(fuc, 0x111100, 0x40000000, 0x40000000); ram_mask(fuc, 0x111104, 0x00000180, 0x00000000); } + /* Always disable this bit during reclock */ + ram_mask(fuc, 0x100200, 0x00000800, 0x00000000); + + /* If switching from non-pll to pll, lock before disabling FB */ + if (mclk.pll && !pll2pll) { + ram_mask(fuc, 0x004128, 0x003f3141, mclk.clk | 0x00000101); + nva3_ram_lock_pll(fuc, &mclk); + } + + /* Start with disabling some CRTCs and PFIFO? */ + ram_wait_vblank(fuc); + ram_wr32(fuc, 0x611200, 0x3300); + ram_mask(fuc, 0x002504, 0x1, 0x1); + ram_nsec(fuc, 10000); + ram_wait(fuc, 0x002504, 0x10, 0x10, 20000); /* XXX: or longer? */ + ram_block(fuc); + ram_nsec(fuc, 2000); + - if (!next->bios.rammap_10_04_02) - ram_mask(fuc, 0x100200, 0x00000800, 0x00000000); - ram_wr32(fuc, 0x611200, 0x00003300); if (!next->bios.ramcfg_10_02_10) - ram_wr32(fuc, 0x111100, 0x4c020000); /*XXX*/ + ram_mask(fuc, 0x111100, 0x04020000, 0x04020000); /*XXX*/ + /* If we're disabling the DLL, do it now */ + if (next->bios.ramcfg_10_DLLoff) + nouveau_sddr3_dll_disable(fuc, ram->base.mr); + + /* Brace RAM for impact */ ram_wr32(fuc, 0x1002d4, 0x00000001); ram_wr32(fuc, 0x1002d0, 0x00000001); ram_wr32(fuc, 0x1002d0, 0x00000001); @@ -526,24 +584,38 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_wr32(fuc, 0x1002dc, 0x00000001); ram_nsec(fuc, 2000); - ctrl = ram_rd32(fuc, 0x004000); - if (!(ctrl & 0x00000008) && mclk.pll) { - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000008)); + if (nv_device(pfb)->chipset == 0xa3 && freq <= 500000) + ram_mask(fuc, 0x100700, 0x00000006, 0x00000006); + + /* Fiddle with clocks */ + /* There's 4 scenario's + * pll->pll: first switch to a 324MHz clock, set up new PLL, switch + * clk->pll: Set up new PLL, switch + * pll->clk: Set up clock, switch + * clk->clk: Overwrite ctrl and other bits, switch */ + + /* Switch to regular clock - 324MHz */ + if (pll2pll) { + ram_mask(fuc, 0x004000, 0x00000004, 0x00000004); + ram_mask(fuc, 0x004168, 0x003f3141, 0x00083101); + ram_mask(fuc, 0x004000, 0x00000008, 0x00000008); ram_mask(fuc, 0x1110e0, 0x00088000, 0x00088000); ram_wr32(fuc, 0x004018, 0x00001000); - ram_wr32(fuc, 0x004000, (ctrl &= ~0x00000001)); - ram_wr32(fuc, 0x004004, mclk.pll); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000001)); - udelay(64); - ram_wr32(fuc, 0x004018, 0x00005000 | r004018); - udelay(20); - } else - if (!mclk.pll) { - ram_mask(fuc, 0x004168, 0x003f3040, mclk.clk); - ram_wr32(fuc, 0x004000, (ctrl |= 0x00000008)); + nva3_ram_lock_pll(fuc, &mclk); + } + + if (mclk.pll) { + ram_mask(fuc, 0x004000, 0x00000105, 0x00000105); + ram_wr32(fuc, 0x004018, 0x00001000 | r004018); + ram_wr32(fuc, 0x100da0, r100da0); + } else { + ram_mask(fuc, 0x004168, 0x003f3141, mclk.clk | 0x00000101); + ram_mask(fuc, 0x004000, 0x00000108, 0x00000008); ram_mask(fuc, 0x1110e0, 0x00088000, 0x00088000); - ram_wr32(fuc, 0x004018, 0x0000d000 | r004018); + ram_wr32(fuc, 0x004018, 0x00009000 | r004018); + ram_wr32(fuc, 0x100da0, r100da0); } + ram_nsec(fuc, 20000); if (next->bios.rammap_10_04_08) { ram_wr32(fuc, 0x1005a0, next->bios.ramcfg_10_06 << 16 | @@ -557,6 +629,12 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) 0x80000000); ram_mask(fuc, 0x10053c, 0x00001000, 0x00000000); } else { + if (train->state == NVA3_TRAIN_DONE) { + ram_wr32(fuc, 0x100080, 0x1020); + ram_mask(fuc, 0x111400, 0xffffffff, train->r_111400); + ram_mask(fuc, 0x1111e0, 0xffffffff, train->r_1111e0); + ram_mask(fuc, 0x100720, 0xffffffff, train->r_100720); + } ram_mask(fuc, 0x10053c, 0x00001000, 0x00001000); ram_mask(fuc, 0x10f804, 0x80000000, 0x00000000); ram_mask(fuc, 0x100760, 0x22222222, r100760); @@ -564,22 +642,27 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_mask(fuc, 0x1007e0, 0x22222222, r100760); } + if (nv_device(pfb)->chipset == 0xa3 && freq > 500000) { + ram_mask(fuc, 0x100700, 0x00000006, 0x00000000); + } + + /* Final switch */ if (mclk.pll) { ram_mask(fuc, 0x1110e0, 0x00088000, 0x00011000); - ram_wr32(fuc, 0x004000, (ctrl &= ~0x00000008)); + ram_mask(fuc, 0x004000, 0x00000008, 0x00000000); } - /*XXX: LEAVE */ ram_wr32(fuc, 0x1002dc, 0x00000000); ram_wr32(fuc, 0x1002d4, 0x00000001); ram_wr32(fuc, 0x100210, 0x80000000); - ram_nsec(fuc, 1000); - ram_nsec(fuc, 1000); + ram_nsec(fuc, 2000); - ram_mask(fuc, mr[2], 0x00000000, 0x00000000); + /* Set RAM MR parameters and timings */ + ram_wr32(fuc, mr[2], ram->base.mr[2]); + ram_nsec(fuc, 1000); + ram_wr32(fuc, mr[1], ram->base.mr[1]); ram_nsec(fuc, 1000); - ram_nuke(fuc, mr[0]); - ram_mask(fuc, mr[0], 0x00000000, 0x00000000); + ram_wr32(fuc, mr[0], ram->base.mr[0]); ram_nsec(fuc, 1000); ram_wr32(fuc, 0x100220[3], timing[3]); @@ -595,35 +678,75 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) /* Misc */ ram_mask(fuc, 0x100200, 0x00001000, !next->bios.ramcfg_10_02_08 << 12); - unk714 = ram_rd32(fuc, 0x100714) & ~0xf0000010; - unk718 = ram_rd32(fuc, 0x100718) & ~0x00000100; - unk71c = ram_rd32(fuc, 0x10071c) & ~0x00000100; + /* XXX: A lot of "chipset"/"ram type" specific stuff...? */ + unk714 = ram_rd32(fuc, 0x100714) & ~0xf0000130; + unk718 = ram_rd32(fuc, 0x100718) & ~0x00000100; + unk71c = ram_rd32(fuc, 0x10071c) & ~0x00000100; + r111100 = ram_rd32(fuc, 0x111100) & ~0x3a800000; + + if (next->bios.ramcfg_10_02_04) { + switch (ram->base.type) { + case NV_MEM_TYPE_DDR3: + if (nv_device(pfb)->chipset != 0xa8) + r111100 |= 0x00000004; + /* no break */ + default: + break; + } + } else { + switch (ram->base.type) { + case NV_MEM_TYPE_DDR3: + if (nv_device(pfb)->chipset == 0xa8) { + r111100 |= 0x08000000; + } else { + r111100 &= ~0x00000004; + r111100 |= 0x12800000; + } + unk714 |= 0x00000010; + break; + default: + break; + } + } + + unk714 |= (next->bios.ramcfg_10_04_01) << 8; + if (next->bios.ramcfg_10_02_20) unk714 |= 0xf0000000; - if (!next->bios.ramcfg_10_02_04) - unk714 |= 0x00000010; - ram_wr32(fuc, 0x100714, unk714); - + if (next->bios.ramcfg_10_02_02) + unk718 |= 0x00000100; if (next->bios.ramcfg_10_02_01) unk71c |= 0x00000100; - ram_wr32(fuc, 0x10071c, unk71c); + if (next->bios.timing_10_24 != 0xff) { + unk718 &= ~0xf0000000; + unk718 |= next->bios.timing_10_24 << 28; + } + if (next->bios.ramcfg_10_02_10) + r111100 &= ~0x04020000; - if (next->bios.ramcfg_10_02_02) - unk718 |= 0x00000100; - ram_wr32(fuc, 0x100718, unk718); + ram_mask(fuc, 0x100714, 0xffffffff, unk714); + ram_mask(fuc, 0x10071c, 0xffffffff, unk71c); + ram_mask(fuc, 0x100718, 0xffffffff, unk718); + ram_mask(fuc, 0x111100, 0xffffffff, r111100); - if (next->bios.ramcfg_10_02_10) - ram_wr32(fuc, 0x111100, 0x48000000); /*XXX*/ + /* Reset DLL */ + if (!next->bios.ramcfg_10_DLLoff) + nouveau_sddr3_dll_reset(fuc); - ram_mask(fuc, mr[0], 0x100, 0x100); - ram_nsec(fuc, 1000); - ram_mask(fuc, mr[0], 0x100, 0x000); - ram_nsec(fuc, 1000); + ram_nsec(fuc, 14000); + ram_wr32(fuc, 0x100264, 0x1); ram_nsec(fuc, 2000); - ram_nsec(fuc, 12000); - ram_wr32(fuc, 0x611200, 0x00003330); + ram_nuke(fuc, 0x100700); + ram_mask(fuc, 0x100700, 0x01000000, 0x01000000); + ram_mask(fuc, 0x100700, 0x01000000, 0x00000000); + + /* Re-enable FB */ + ram_unblock(fuc); + ram_wr32(fuc, 0x611200, 0x3330); + + /* Post fiddlings */ if (next->bios.rammap_10_04_02) ram_mask(fuc, 0x100200, 0x00000800, 0x00000800); if (next->bios.ramcfg_10_02_10) { @@ -651,7 +774,22 @@ nva3_ram_prog(struct nouveau_fb *pfb) struct nouveau_device *device = nv_device(pfb); struct nva3_ram *ram = (void *)pfb->ram; struct nva3_ramfuc *fuc = &ram->fuc; - ram_exec(fuc, nouveau_boolopt(device->cfgopt, "NvMemExec", true)); + bool exec = nouveau_boolopt(device->cfgopt, "NvMemExec", true); + + if (exec) { + nv_mask(pfb, 0x001534, 0x2, 0x2); + + ram_exec(fuc, true); + + /* Post-processing, avoids flicker */ + nv_mask(pfb, 0x002504, 0x1, 0x0); + nv_mask(pfb, 0x001534, 0x2, 0x0); + + nv_mask(pfb, 0x616308, 0x10, 0x10); + nv_mask(pfb, 0x616b08, 0x10, 0x10); + } else { + ram_exec(fuc, false); + } return 0; } @@ -716,6 +854,7 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, ram->fuc.r_0x001610 = ramfuc_reg(0x001610); ram->fuc.r_0x001700 = ramfuc_reg(0x001700); + ram->fuc.r_0x002504 = ramfuc_reg(0x002504); ram->fuc.r_0x004000 = ramfuc_reg(0x004000); ram->fuc.r_0x004004 = ramfuc_reg(0x004004); ram->fuc.r_0x004018 = ramfuc_reg(0x004018); @@ -726,12 +865,14 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, ram->fuc.r_0x100210 = ramfuc_reg(0x100210); for (i = 0; i < 9; i++) ram->fuc.r_0x100220[i] = ramfuc_reg(0x100220 + (i * 4)); + ram->fuc.r_0x100264 = ramfuc_reg(0x100264); ram->fuc.r_0x1002d0 = ramfuc_reg(0x1002d0); ram->fuc.r_0x1002d4 = ramfuc_reg(0x1002d4); ram->fuc.r_0x1002dc = ramfuc_reg(0x1002dc); ram->fuc.r_0x10053c = ramfuc_reg(0x10053c); ram->fuc.r_0x1005a0 = ramfuc_reg(0x1005a0); ram->fuc.r_0x1005a4 = ramfuc_reg(0x1005a4); + ram->fuc.r_0x100700 = ramfuc_reg(0x100700); ram->fuc.r_0x100714 = ramfuc_reg(0x100714); ram->fuc.r_0x100718 = ramfuc_reg(0x100718); ram->fuc.r_0x10071c = ramfuc_reg(0x10071c); @@ -739,6 +880,7 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, ram->fuc.r_0x100760 = ramfuc_stride(0x100760, 4, ram->base.part_mask); ram->fuc.r_0x1007a0 = ramfuc_stride(0x1007a0, 4, ram->base.part_mask); ram->fuc.r_0x1007e0 = ramfuc_stride(0x1007e0, 4, ram->base.part_mask); + ram->fuc.r_0x100da0 = ramfuc_stride(0x100da0, 4, ram->base.part_mask); ram->fuc.r_0x10f804 = ramfuc_reg(0x10f804); ram->fuc.r_0x1110e0 = ramfuc_stride(0x1110e0, 4, ram->base.part_mask); ram->fuc.r_0x111100 = ramfuc_reg(0x111100); -- 1.9.3
Roy Spliet
2014-Oct-02 16:01 UTC
[Nouveau] [PATCH 6/9] fb/ramnva3: Reclocking script for DDR2
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 57 +++++++++++++++++------- 1 file changed, 42 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 7cff2d4..0e1697a 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -356,7 +356,7 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) { struct nva3_ram *ram = (void *)pfb->ram; struct nvbios_ramcfg *cfg = &ram->base.target.bios; - int tUNK_base; + int tUNK_base, tUNK_40_0, prevCL; u32 cur3, cur7, cur8; cur3 = nv_rd32(pfb, 0x10022c); @@ -364,10 +364,11 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) cur8 = nv_rd32(pfb, 0x100240); if (T(CWL) == 0) - T(CWL) = ((nv_rd32(pfb, 0x100228) & 0x0f000000) >> 24) + 1; + /* Observed on DDR2 */ + T(CWL) = T(CL) - 1; - tUNK_base = ((cur7 & 0x00ff0000) >> 16) - - (cur3 & 0x000000ff) - 1; + prevCL = (cur3 & 0x000000ff) + 1; + tUNK_base = ((cur7 & 0x00ff0000) >> 16) - prevCL; timing[0] = (T(RP) << 24 | T(RAS) << 16 | T(RFC) << 8 | T(RC)); timing[1] = (T(WR) + 1 + T(CWL)) << 24 | @@ -398,6 +399,16 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) 0x202; timing[8] = cur8 & 0xffffff00; + switch (ram->base.type) { + case NV_MEM_TYPE_DDR2: + tUNK_40_0 = prevCL - (cur8 & 0xff); + if (tUNK_40_0 > 0) + timing[8] |= T(CL); + break; + default: + break; + } + nv_debug(pfb, "Entry: 220: %08x %08x %08x %08x\n", timing[0], timing[1], timing[2], timing[3]); nv_debug(pfb, " 230: %08x %08x %08x %08x\n", @@ -408,7 +419,7 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) #undef T static void -nouveau_sddr3_dll_reset(struct nva3_ramfuc *fuc) +nouveau_sddr2_dll_reset(struct nva3_ramfuc *fuc) { ram_mask(fuc, mr[0], 0x100, 0x100); ram_nsec(fuc, 1000); @@ -514,6 +525,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram->base.mr[2] = ram_rd32(fuc, mr[2]); switch (ram->base.type) { + case NV_MEM_TYPE_DDR2: + ret = nouveau_sddr2_calc(&ram->base); + break; case NV_MEM_TYPE_DDR3: ret = nouveau_sddr3_calc(&ram->base); break; @@ -573,8 +587,11 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_mask(fuc, 0x111100, 0x04020000, 0x04020000); /*XXX*/ /* If we're disabling the DLL, do it now */ - if (next->bios.ramcfg_10_DLLoff) + switch (next->bios.ramcfg_10_DLLoff * ram->base.type) { + case NV_MEM_TYPE_DDR3: nouveau_sddr3_dll_disable(fuc, ram->base.mr); + break; + } /* Brace RAM for impact */ ram_wr32(fuc, 0x1002d4, 0x00000001); @@ -658,12 +675,12 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_nsec(fuc, 2000); /* Set RAM MR parameters and timings */ - ram_wr32(fuc, mr[2], ram->base.mr[2]); - ram_nsec(fuc, 1000); - ram_wr32(fuc, mr[1], ram->base.mr[1]); - ram_nsec(fuc, 1000); - ram_wr32(fuc, mr[0], ram->base.mr[0]); - ram_nsec(fuc, 1000); + for (i = 2; i >= 0; i--) { + if (ram_rd32(fuc, mr[i]) != ram->base.mr[i]) { + ram_wr32(fuc, mr[i], ram->base.mr[i]); + ram_nsec(fuc, 1000); + } + } ram_wr32(fuc, 0x100220[3], timing[3]); ram_wr32(fuc, 0x100220[1], timing[1]); @@ -690,11 +707,18 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) if (nv_device(pfb)->chipset != 0xa8) r111100 |= 0x00000004; /* no break */ + case NV_MEM_TYPE_DDR2: + r111100 |= 0x08000000; + break; default: break; } } else { switch (ram->base.type) { + case NV_MEM_TYPE_DDR2: + r111100 |= 0x1a800000; + unk714 |= 0x00000010; + break; case NV_MEM_TYPE_DDR3: if (nv_device(pfb)->chipset == 0xa8) { r111100 |= 0x08000000; @@ -731,12 +755,14 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) /* Reset DLL */ if (!next->bios.ramcfg_10_DLLoff) - nouveau_sddr3_dll_reset(fuc); + nouveau_sddr2_dll_reset(fuc); ram_nsec(fuc, 14000); - ram_wr32(fuc, 0x100264, 0x1); - ram_nsec(fuc, 2000); + if (ram->base.type == NV_MEM_TYPE_DDR3) { + ram_wr32(fuc, 0x100264, 0x1); + ram_nsec(fuc, 2000); + } ram_nuke(fuc, 0x100700); ram_mask(fuc, 0x100700, 0x01000000, 0x01000000); @@ -842,6 +868,7 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, return ret; switch (ram->base.type) { + case NV_MEM_TYPE_DDR2: case NV_MEM_TYPE_DDR3: ram->base.calc = nva3_ram_calc; ram->base.prog = nva3_ram_prog; -- 1.9.3
Roy Spliet
2014-Oct-02 16:01 UTC
[Nouveau] [PATCH 7/9] fb/ramnva3: Reclocking script for GDDR3
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c | 100 +++++++++++++++++++++-- drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c | 2 +- 2 files changed, 92 insertions(+), 10 deletions(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c index 0e1697a..3b38a53 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c @@ -33,6 +33,8 @@ #include <subdev/clock/nva3.h> #include <subdev/clock/pll.h> +#include <subdev/gpio.h> + #include <subdev/timer.h> #include <engine/fifo.h> @@ -43,6 +45,9 @@ #include "nv50.h" +/* XXX: Remove when memx gains GPIO support */ +extern int nv50_gpio_location(int line, u32 *reg, u32 *shift); + struct nva3_ramfuc { struct ramfuc base; struct ramfuc_reg r_0x001610; @@ -81,6 +86,7 @@ struct nva3_ramfuc { struct ramfuc_reg r_0x111400; struct ramfuc_reg r_0x611200; struct ramfuc_reg r_mr[4]; + struct ramfuc_reg r_gpioFBVREF; }; struct nva3_ltrain { @@ -357,15 +363,22 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) struct nva3_ram *ram = (void *)pfb->ram; struct nvbios_ramcfg *cfg = &ram->base.target.bios; int tUNK_base, tUNK_40_0, prevCL; - u32 cur3, cur7, cur8; + u32 cur2, cur3, cur7, cur8; + cur2 = nv_rd32(pfb, 0x100228); cur3 = nv_rd32(pfb, 0x10022c); cur7 = nv_rd32(pfb, 0x10023c); cur8 = nv_rd32(pfb, 0x100240); - if (T(CWL) == 0) - /* Observed on DDR2 */ + + switch ((!T(CWL)) * ram->base.type) { + case NV_MEM_TYPE_DDR2: T(CWL) = T(CL) - 1; + break; + case NV_MEM_TYPE_GDDR3: + T(CWL) = ((cur2 & 0xff000000) >> 24) + 1; + break; + } prevCL = (cur3 & 0x000000ff) + 1; tUNK_base = ((cur7 & 0x00ff0000) >> 16) - prevCL; @@ -389,10 +402,10 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) T(13); timing[5] = T(RFC) << 24 | max_t(u8,T(RCDRD), T(RCDWR)) << 16 | - (T(CWL) + 6) << 8 | + max_t(u8, (T(CWL) + 6), (T(CL) + 2)) << 8 | T(RP); timing[6] = (0x5a + T(CL)) << 16 | - (6 - T(CL) + T(CWL)) << 8 | + max_t(u8, 1, (6 - T(CL) + T(CWL))) << 8 | (0x50 + T(CL) - T(CWL)); timing[7] = (cur7 & 0xff000000) | ((tUNK_base + T(CL)) << 16) | @@ -401,6 +414,7 @@ nva3_ram_timing_calc(struct nouveau_fb *pfb, u32 *timing) switch (ram->base.type) { case NV_MEM_TYPE_DDR2: + case NV_MEM_TYPE_GDDR3: tUNK_40_0 = prevCL - (cur8 & 0xff); if (tUNK_40_0 > 0) timing[8] |= T(CL); @@ -440,6 +454,17 @@ nouveau_sddr3_dll_disable(struct nva3_ramfuc *fuc, u32 *mr) } static void +nouveau_gddr3_dll_disable(struct nva3_ramfuc *fuc, u32 *mr) +{ + u32 mr1_old = ram_rd32(fuc, mr[1]); + + if (!(mr1_old & 0x40)) { + ram_wr32(fuc, mr[1], mr[1]); + ram_nsec(fuc, 1000); + } +} + +static void nva3_ram_lock_pll(struct nva3_ramfuc *fuc, struct nva3_clock_info *mclk) { ram_wr32(fuc, 0x004004, mclk->pll); @@ -449,6 +474,29 @@ nva3_ram_lock_pll(struct nva3_ramfuc *fuc, struct nva3_clock_info *mclk) ram_mask(fuc, 0x004000, 0x00000010, 0x00000010); } +static void +nva3_ram_fbvref(struct nva3_ramfuc *fuc, u32 val) +{ + struct nouveau_gpio *gpio = nouveau_gpio(fuc->base.pfb); + struct dcb_gpio_func func; + u32 reg, sh, gpio_val; + int ret; + + if (gpio->get(gpio, 0, 0x2e, DCB_GPIO_UNUSED) != val) { + ret = gpio->find(gpio, 0, 0x2e, DCB_GPIO_UNUSED, &func); + if (ret) + return; + + nv50_gpio_location(func.line, ®, &sh); + gpio_val = ram_rd32(fuc, gpioFBVREF); + if (gpio_val & (8 << sh)) + val = !val; + + ram_mask(fuc, gpioFBVREF, (0x3 << sh), ((val | 0x2) << sh)); + ram_nsec(fuc, 20000); + } +} + static int nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) { @@ -531,6 +579,9 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) case NV_MEM_TYPE_DDR3: ret = nouveau_sddr3_calc(&ram->base); break; + case NV_MEM_TYPE_GDDR3: + ret = nouveau_gddr3_calc(&ram->base); + break; default: ret = -ENOSYS; break; @@ -582,17 +633,26 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_block(fuc); ram_nsec(fuc, 2000); - - if (!next->bios.ramcfg_10_02_10) - ram_mask(fuc, 0x111100, 0x04020000, 0x04020000); /*XXX*/ + if (!next->bios.ramcfg_10_02_10) { + if (ram->base.type == NV_MEM_TYPE_GDDR3) + ram_mask(fuc, 0x111100, 0x04020000, 0x00020000); + else + ram_mask(fuc, 0x111100, 0x04020000, 0x04020000); + } /* If we're disabling the DLL, do it now */ switch (next->bios.ramcfg_10_DLLoff * ram->base.type) { case NV_MEM_TYPE_DDR3: nouveau_sddr3_dll_disable(fuc, ram->base.mr); break; + case NV_MEM_TYPE_GDDR3: + nouveau_gddr3_dll_disable(fuc, ram->base.mr); + break; } + if (fuc->r_gpioFBVREF.addr && next->bios.timing_10_ODT) + nva3_ram_fbvref(fuc, 0); + /* Brace RAM for impact */ ram_wr32(fuc, 0x1002d4, 0x00000001); ram_wr32(fuc, 0x1002d0, 0x00000001); @@ -728,6 +788,10 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) } unk714 |= 0x00000010; break; + case NV_MEM_TYPE_GDDR3: + r111100 |= 0x30000000; + unk714 |= 0x00000020; + break; default: break; } @@ -753,11 +817,18 @@ nva3_ram_calc(struct nouveau_fb *pfb, u32 freq) ram_mask(fuc, 0x100718, 0xffffffff, unk718); ram_mask(fuc, 0x111100, 0xffffffff, r111100); + if (fuc->r_gpioFBVREF.addr && !next->bios.timing_10_ODT) + nva3_ram_fbvref(fuc, 1); + /* Reset DLL */ if (!next->bios.ramcfg_10_DLLoff) nouveau_sddr2_dll_reset(fuc); - ram_nsec(fuc, 14000); + if (ram->base.type == NV_MEM_TYPE_GDDR3) { + ram_nsec(fuc, 31000); + } else { + ram_nsec(fuc, 14000); + } if (ram->base.type == NV_MEM_TYPE_DDR3) { ram_wr32(fuc, 0x100264, 0x1); @@ -859,8 +930,12 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, struct nouveau_oclass *oclass, void *data, u32 datasize, struct nouveau_object **pobject) { + struct nouveau_fb *pfb = nouveau_fb(parent); + struct nouveau_gpio *gpio = nouveau_gpio(pfb); + struct dcb_gpio_func func; struct nva3_ram *ram; int ret, i; + u32 reg, shift; ret = nv50_ram_create(parent, engine, oclass, &ram); *pobject = nv_object(ram); @@ -870,6 +945,7 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, switch (ram->base.type) { case NV_MEM_TYPE_DDR2: case NV_MEM_TYPE_DDR3: + case NV_MEM_TYPE_GDDR3: ram->base.calc = nva3_ram_calc; ram->base.prog = nva3_ram_prog; ram->base.tidy = nva3_ram_tidy; @@ -928,6 +1004,12 @@ nva3_ram_ctor(struct nouveau_object *parent, struct nouveau_object *engine, ram->fuc.r_mr[3] = ramfuc_reg(0x1002e4); } + ret = gpio->find(gpio, 0, 0x2e, DCB_GPIO_UNUSED, &func); + if (ret == 0) { + nv50_gpio_location(func.line, ®, &shift); + ram->fuc.r_gpioFBVREF = ramfuc_reg(reg); + } + return 0; } diff --git a/drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c b/drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c index 1864fa9..2e30d5a 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c +++ b/drivers/gpu/drm/nouveau/core/subdev/gpio/nv50.c @@ -54,7 +54,7 @@ nv50_gpio_reset(struct nouveau_gpio *gpio, u8 match) } } -static int +int nv50_gpio_location(int line, u32 *reg, u32 *shift) { const u32 nv50_gpio_reg[4] = { 0xe104, 0xe108, 0xe280, 0xe284 }; -- 1.9.3
Signed-off-by: Roy Spliet <rspliet at eclipso.eu> --- drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c b/drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c index 094551d..07ad012 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c +++ b/drivers/gpu/drm/nouveau/core/subdev/clock/nva3.c @@ -510,7 +510,7 @@ nva3_clock_ctor(struct nouveau_object *parent, struct nouveau_object *engine, int ret; ret = nouveau_clock_create(parent, engine, oclass, nva3_domain, NULL, 0, - false, &priv); + true, &priv); *pobject = nv_object(priv); if (ret) return ret; -- 1.9.3
Roy Spliet
2014-Oct-02 16:01 UTC
[Nouveau] [PATCH 9/9] pwr/fuc: Fix thinko in nouveau_memx_wait()
--- drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c b/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c index f6ce39d..7a9299d 100644 --- a/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c +++ b/drivers/gpu/drm/nouveau/core/subdev/pwr/memx.c @@ -107,7 +107,7 @@ nouveau_memx_wait(struct nouveau_memx *memx, { nv_debug(memx->ppwr, "R[%06x] & 0x%08x == 0x%08x, %d us\n", addr, mask, data, nsec); - memx_cmd(memx, MEMX_WAIT, 4, (u32[]){ addr, ~mask, data, nsec }); + memx_cmd(memx, MEMX_WAIT, 4, (u32[]){ addr, mask, data, nsec }); memx_out(memx); /* fuc can't handle multiple */ } -- 1.9.3