David Herrmann
2013-Oct-02 22:19 UTC
[Nouveau] Resource map sanity check fails after GRUB "keeps" the gfx mode
Hi Pavel

On Wed, Oct 2, 2013 at 11:46 PM, Pavel Roskin <proski at gnu.org> wrote:
> On Wed, 2 Oct 2013 16:47:29 +0200
> David Herrmann <dh.herrmann at gmail.com> wrote:
>
>> Thanks for your investigations. I finally sent my patch and put you on
>> CC. Sorry for the delay, but I have mostly caught up with all emails
>> now.
>>
>> This really just suppresses the warning and nothing else.
>
> Thank you for your detailed explanation in the patch.
>
>> It does not
>> fix any lockup/crash/...
>
> I know. I was surprised that my patch fixed something else, but now we
> know that it didn't. It's better when patches don't have unexpected
> effects even if the effects are positive :)

Indeed. And I appreciate your help a lot.

>> And your PCI-BAR adjustment doesn't change
>> anything either, sorry.
>
> I simply tried another approach to pacify the resource checker.
>
> However, there is some difference. nvidiafb cannot access the
> resources if IORESOURCE_BUSY is used.

Are you sure this is related to IORESOURCE_BUSY? Or is it related to
CONFIG_X86_SYSFB?

Thanks
David

> It's not that it matters for me. nvidiafb fails on my hardware anyway
> and I don't need it when I already have simplefb and nouveau. But I
> think we should have a closer look why it fails and what other drivers
> may fail.
>
> --
> Regards,
> Pavel Roskin
Pavel Roskin
2013-Oct-03 22:10 UTC
[Nouveau] Resource map sanity check fails after GRUB "keeps" the gfx mode
Hi David,

On Thu, 3 Oct 2013 00:19:56 +0200
David Herrmann <dh.herrmann at gmail.com> wrote:

> >> And your PCI-BAR adjustment doesn't change
> >> anything either, sorry.
> >
> > I simply tried another approach to pacify the resource checker.
> >
> > However, there is some difference. nvidiafb cannot access the
> > resources if IORESOURCE_BUSY is used.
>
> Are you sure this is related to IORESOURCE_BUSY? Or is it related to
> CONFIG_X86_SYSFB?

CONFIG_X86_SYSFB is always defined. I doubt an x86 kernel would
compile without it. create_simplefb() is used in
arch/x86/kernel/sysfb.c, which is compiled unconditionally, and that
function is defined in arch/x86/kernel/sysfb_simplefb.c, which is only
compiled if CONFIG_X86_SYSFB is defined.

I tried four combinations: with and without IORESOURCE_BUSY and with
and without the PCI resource adjustment. The only combination in which
nvidiafb probes the hardware is when IORESOURCE_BUSY is not used and
the BOOTFB resource is adjusted to match the PCI BAR.

It means that your patch by itself won't prevent nvidiafb from getting
the resource on my hardware (ThinkPad W530). However, if the BOOTFB
resource matches the PCI BAR for the video card, adding IORESOURCE_BUSY
might prevent some framebuffer drivers from accessing the resource.

This complexity doesn't seem right. I think specific drivers should
trump generic ones and DRI drivers should trump non-DRI. It shouldn't
matter whether the BOOTFB area from screen_info coincides with the PCI
BAR or occupies a part of it.

--
Regards,
Pavel Roskin
David Herrmann
2013-Oct-03 23:08 UTC
[Nouveau] Resource map sanity check fails after GRUB "keeps" the gfx mode
Hi Pavel

On Fri, Oct 4, 2013 at 12:10 AM, Pavel Roskin <proski at gnu.org> wrote:
> Hi David,
>
> On Thu, 3 Oct 2013 00:19:56 +0200
> David Herrmann <dh.herrmann at gmail.com> wrote:
>
>> >> And your PCI-BAR adjustment doesn't change
>> >> anything either, sorry.
>> >
>> > I simply tried another approach to pacify the resource checker.
>> >
>> > However, there is some difference. nvidiafb cannot access the
>> > resources if IORESOURCE_BUSY is used.
>>
>> Are you sure this is related to IORESOURCE_BUSY? Or is it related to
>> CONFIG_X86_SYSFB?
>
> CONFIG_X86_SYSFB is always defined. I doubt an x86 kernel would
> compile without it. create_simplefb() is used in
> arch/x86/kernel/sysfb.c that is compiled unconditionally and that
> function is defined in arch/x86/kernel/sysfb_simplefb.c that is only
> compiled if CONFIG_X86_SYSFB is defined.

You can set CONFIG_X86_SYSFB=n and everything works fine. It's the
default and is what pre-3.12 kernels always did.

> I tried four combinations: with and without IORESOURCE_BUSY and with
> and without the PCI resource adjustment. The only combination when
> nvidiafb probes the hardware is when IORESOURCE_BUSY is not used and
> the BOOTFB resource is adjusted to match the PCI BAR.

A dmesg log would be nice, but I assume nvidiafb fails because it
cannot map its BAR regions?

> It means that your patch by itself won't prevent nvidiafb from getting
> the resource on my hardware (ThinkPad W530). However, if the BOOTFB
> resource matches the PCI BAR for the video card, adding IORESOURCE_BUSY
> might prevent some framebuffer drivers from accessing the resource.

Meh! I now understand the problem:

The resource.c resource-management allows creating sub-regions of
existing regions. However, a sub-region must always be a real child of
its parent; it cannot span multiple parents. Therefore, if we create
the simplefb region before the PCI BAR is mapped, we need your patches
to bump the simplefb region to at least the size of the respective PCI
region. Otherwise, nvidiafb tries to allocate a sub-region that spans
wider than the simplefb region and thus fails.

On the other hand, sub-mappings of BUSY regions are _never_ allowed. A
BUSY region gives exclusive access to the holder of the region. But
dropping BUSY from the simplefb region is wrong. We have to mark the
system-framebuffer as BUSY, otherwise we might end up with a corrupted
framebuffer after loading other real hw drivers.

In other words: The fact that we used to not reserve
platform-framebuffer regions before 3.12 trips us now because it is
actually _wrong_ to load real hw drivers like nvidiafb while the
platform-framebuffer is still available. So the failure we get now just
tells us that nvidiafb and friends do horrible things.

TL;DR To fix this, we want real hardware drivers to remove
platform-framebuffer devices and release their resources before
acquiring them again. I recommend CONFIG_X86_SYSFB=n for anyone seeing
these issues. For 3.13 I will try to fix the framebuffer-handover.

Fortunately, no real DRM drivers actually request PCI regions (why
would they? PCI probing already guarantees exclusive access) and the
platform-FB drivers have already been converted. So this bug can only
be triggered with legacy hw-fbdev drivers (a simple search for
pci_request_regions in ./drivers/video/ shows them).

> This complexity doesn't seem right. I think specific drivers should
> trump generic ones and DRI drivers should trump non-DRI. It shouldn't
> matter whether the BOOTFB area from screen_info coincides with the PCI
> BAR or occupies a part of it.

I will try to write a patch as part of the SimpleDRM series which
allows removing platform-framebuffer devices. We simply do this during
framebuffer probing and we should be fine.

Thanks
David
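
The resource-tree rules described in this message (a request may nest inside a non-BUSY region only if it fits entirely within it, a BUSY region never accepts children, and a handover means releasing the boot framebuffer region first) can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct res`, `request()`, `release()`, `probe_whole_bar()`, `RES_BUSY` and all addresses are hypothetical stand-ins, and the model ignores the PCI BAR's own entry in the tree and the ordering question raised above.

```c
#include <assert.h>
#include <stddef.h>

#define RES_BUSY  0x1u            /* hypothetical stand-in for IORESOURCE_BUSY */
#define BAR_START 0xe0000000UL    /* hypothetical PCI BAR, 256 MiB */
#define BAR_END   0xefffffffUL
#define FB_END    0xe07fffffUL    /* hypothetical 8 MiB boot framebuffer */

struct res {
    const char *name;
    unsigned long start, end;     /* inclusive range */
    unsigned flags;
    struct res *child, *sibling;
};

/* Return a direct child of parent that overlaps [start, end], or NULL. */
static struct res *find_overlap(struct res *parent,
                                unsigned long start, unsigned long end)
{
    for (struct res *c = parent->child; c; c = c->sibling)
        if (start <= c->end && end >= c->start)
            return c;
    return NULL;
}

/* Try to place r below parent. Returns 0 on success, -1 on conflict. */
static int request(struct res *parent, struct res *r)
{
    assert(parent && r);
    for (;;) {
        /* A region must fit inside exactly one parent. */
        if (r->start < parent->start || r->end > parent->end)
            return -1;
        struct res *c = find_overlap(parent, r->start, r->end);
        if (!c) {                 /* free slot: link r in as a child */
            r->sibling = parent->child;
            parent->child = r;
            return 0;
        }
        if (c->flags & RES_BUSY)  /* BUSY regions accept no children */
            return -1;
        parent = c;               /* non-BUSY: try to nest inside it */
    }
}

/* Unlink r from parent's direct children (no-op if it is not there). */
static void release(struct res *parent, struct res *r)
{
    for (struct res **p = &parent->child; *p; p = &(*p)->sibling)
        if (*p == r) {
            *p = r->sibling;
            r->sibling = NULL;
            return;
        }
}

/*
 * Model a legacy fbdev driver requesting its whole BAR after the boot
 * framebuffer region was registered. With handover set, the boot
 * framebuffer region is released before the driver's request.
 */
int probe_whole_bar(int bootfb_enlarged, int bootfb_busy, int handover)
{
    struct res iomem  = { "iomem", 0x0UL, 0xffffffffUL, 0, NULL, NULL };
    struct res bootfb = { "BOOTFB", BAR_START,
                          bootfb_enlarged ? BAR_END : FB_END,
                          bootfb_busy ? RES_BUSY : 0u, NULL, NULL };
    struct res drv    = { "nvidiafb", BAR_START, BAR_END, RES_BUSY,
                          NULL, NULL };

    if (request(&iomem, &bootfb) != 0)
        return -1;
    if (handover)
        release(&iomem, &bootfb);
    return request(&iomem, &drv);
}
```

Calling `probe_whole_bar(enlarged, busy, 0)` for the four combinations of the first two arguments succeeds only for the enlarged, non-BUSY case, which matches the four-way test reported earlier in the thread. Passing `handover = 1` makes the request succeed in every case, which is the behaviour the proposed framebuffer handover aims for.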