Danilo Krummrich
2024-Aug-03 00:00 UTC
[PATCH 2/2] [v7] drm/nouveau: expose GSP-RM logging buffers via debugfs
On Fri, Aug 02, 2024 at 02:05:17PM -0500, Timur Tabi wrote: I was about to apply this patch, but noticed a bug while smoke testing things.> +/** > + * r535_gsp_libos_debugfs_init - create logging debugfs entries > + * @gsp: gsp pointer > + * > + * Create the debugfs entries. This exposes the log buffers to userspace so > + * that an external tool can parse it. > + * > + * The 'logpmu' contains exception dumps from the PMU. It is written via an > + * RPC sent from GSP-RM and must be only 4KB. We create it here because it's > + * only useful if there is a debugfs entry to expose it. If we get the PMU > + * logging RPC and there is no debugfs entry, the RPC is just ignored. > + * > + * The blob_init, blob_rm, and blob_pmu objects can't be transient > + * because debugfs_create_blob doesn't copy them. > + * > + * NOTE: OpenRM loads the logging elf image and prints the log messages > + * in real-time. We may add that capability in the future, but that > + * requires loading an ELF images that are not distributed with the driver, > + * and adding the parsing code to Nouveau. > + * > + * Ideally, this should be part of nouveau_debugfs_init(), but that function > + * is called too late. We really want to create these debugfs entries before > + * r535_gsp_booter_load() is called, so that if GSP-RM fails to initialize, > + * there could still be a log to capture. > + */ > +static void > +r535_gsp_libos_debugfs_init(struct nvkm_gsp *gsp) > +{ > + struct dentry *dir, *dir_init; > + struct dentry *dir_intr = NULL, *dir_rm = NULL, *dir_pmu = NULL; > + struct device *dev = gsp->subdev.device->dev; > + > + /* Each GPU has a subdir based on its device name, so find it */ > + struct drm_device *drm_dev = dev_get_drvdata(dev);In drm-misc-next dev_get_drvdata() returns a struct nouveau_drm. Interestingly, things do not blow up however. Instead, I noticed that your debugfs entries are created in the debugfs root, instead of "dri/<PCI_ID>/". I think we shouldn't try to use the DRI debugfs directory anyway to avoid DRM layer dependencies in nvkm. Let's create our own nouveau one, like you did in earlier versions of this patch. Please also make sure to test your patches on a recent drm-misc-next branch.> + > + dir = drm_dev->debugfs_root; > + > + gsp->blob_init.data = gsp->loginit.data; > + gsp->blob_init.size = gsp->loginit.size; > + gsp->blob_intr.data = gsp->logintr.data; > + gsp->blob_intr.size = gsp->logintr.size; > + gsp->blob_rm.data = gsp->logrm.data; > + gsp->blob_rm.size = gsp->logrm.size; > + > + dir_init = debugfs_create_blob("loginit", 0444, dir, &gsp->blob_init); > + if (IS_ERR(dir_init)) { > + nvkm_error(&gsp->subdev, "failed to create loginit debugfs entry\n"); > + goto error; > + } > + > + dir_intr = debugfs_create_blob("logintr", 0444, dir, &gsp->blob_intr); > + if (IS_ERR(dir_intr)) { > + nvkm_error(&gsp->subdev, "failed to create logintr debugfs entry\n"); > + goto error; > + } > + > + dir_rm = debugfs_create_blob("logrm", 0444, dir, &gsp->blob_rm); > + if (IS_ERR(dir_rm)) { > + nvkm_error(&gsp->subdev, "failed to create logrm debugfs entry\n"); > + goto error; > + } > + > + /* > + * Since the PMU buffer is copied from an RPC, it doesn't need to be > + * a DMA buffer. > + */ > + gsp->blob_pmu.size = GSP_PAGE_SIZE; > + gsp->blob_pmu.data = kzalloc(gsp->blob_pmu.size, GFP_KERNEL); > + if (!gsp->blob_pmu.data) > + goto error; > + > + dir_pmu = debugfs_create_blob("logpmu", 0444, dir, &gsp->blob_pmu); > + if (IS_ERR(dir_pmu)) { > + nvkm_error(&gsp->subdev, "failed to create logpmu debugfs entry\n"); > + kfree(gsp->blob_pmu.data); > + goto error; > + } > + > + i_size_write(d_inode(dir_init), gsp->blob_init.size); > + i_size_write(d_inode(dir_intr), gsp->blob_intr.size); > + i_size_write(d_inode(dir_rm), gsp->blob_rm.size); > + i_size_write(d_inode(dir_pmu), gsp->blob_pmu.size); > + > + r535_gsp_msg_ntfy_add(gsp, NV_VGPU_MSG_EVENT_UCODE_LIBOS_PRINT, > + r535_gsp_msg_libos_print, gsp); > + > + nvkm_debug(&gsp->subdev, "created debugfs GSP-RM logging entries\n"); > + > + if (keep_gsp_logging) { > + nvkm_info(&gsp->subdev, > + "logging buffers will be retained on failure\n"); > + } > + > + return; > + > +error: > + debugfs_remove(dir_init); > + debugfs_remove(dir_intr); > + debugfs_remove(dir_rm); > +} > + > +#endif >}
Timur Tabi
2024-Aug-03 03:23 UTC
[PATCH 2/2] [v7] drm/nouveau: expose GSP-RM logging buffers via debugfs
On Sat, 2024-08-03 at 02:00 +0200, Danilo Krummrich wrote:> > +???????/* Each GPU has a subdir based on its device name, so find it */ > > +???????struct drm_device *drm_dev = dev_get_drvdata(dev); > > In drm-misc-next dev_get_drvdata() returns a struct nouveau_drm. > Interestingly, > things do not blow up however. > > Instead, I noticed that your debugfs entries are created in the debugfs > root, > instead of "dri/<PCI_ID>/".Woah.> I think we shouldn't try to use the DRI debugfs directory anyway to avoid > DRM > layer dependencies in nvkm. Let's create our own nouveau one, like you did > in > earlier versions of this patch.So what path should I use? So what it's supposed to do is create a sort of "shadow" dri/<PCI_ID>/ path that overlays the existing one. So when the module deletes the original dri/<PCI_ID>/, the one I create replaces it automatically. I think it's a neat feature. The path to the debugfs entries stays the same whether or not they're migrated.> Please also make sure to test your patches on a recent drm-misc-next branch.Will do.
Apparently Analagous Threads
- [PATCH 2/2] [v6] drm/nouveau: expose GSP-RM logging buffers via debugfs
- [PATCH 2/2] [v8] drm/nouveau: expose GSP-RM logging buffers via debugfs
- [PATCH 2/2] [v6] drm/nouveau: expose GSP-RM logging buffers via debugfs
- [PATCH 2/2] [v5] drm/nouveau: expose GSP-RM logging buffers via debugfs
- [PATCH 2/2] [v6] drm/nouveau: expose GSP-RM logging buffers via debugfs