Ilia Mirkin
2015-Aug-27 18:19 UTC
[Nouveau] [Mesa-dev] gallium state tracker calls calloc for 0 sizes arrays ?
On Thu, Aug 27, 2015 at 1:59 PM, Alex Deucher <alexdeucher at gmail.com> wrote:
> On Thu, Aug 27, 2015 at 1:55 PM, Hans de Goede <hdegoede at redhat.com> wrote:
>> Hi,
>>
>> On 27-08-15 15:46, Marek Olšák wrote:
>>>
>>> On Thu, Aug 27, 2015 at 3:09 PM, Hans de Goede <hdegoede at redhat.com>
>>> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> While debugging: https://bugzilla.redhat.com/show_bug.cgi?id=1008089
>>>>
>>>> I made an apitrace recording of a single slide transition
>>>> animation, and since I suspected memory corruption replayed
>>>> it using ElectricFence + glretrace, this finds a 0 sized array
>>>> allocation at src/mesa/state_tracker/st_glsl_to_tgsi.cpp: 5565:
>>>>
>>>>    if (proginfo->Parameters) {
>>>>       t->constants = (struct ureg_src *)
>>>>          calloc(proginfo->Parameters->NumParameters,
>>>>                 sizeof(t->constants[0]));
>>>>
>>>> And if I protect the code against that one, another one at 5618:
>>>>
>>>>    t->immediates = (struct ureg_src *)
>>>>       calloc(program->num_immediates, sizeof(struct ureg_src));
>>>>
>>>> With the regular glibc malloc these both succeed as it actually
>>>> returns a valid memory address (posix says it may also return NULL)
>>>>
>>>> I believe that the fragment program in question comes from:
>>>>
>>>> src/mesa/main/state.c update_program() and then from the
>>>>
>>>>    else if (ctx->FragmentProgram._MaintainTexEnvProgram) {
>>>>       /* Use fragment program generated from fixed-function state */
>>>>
>>>>    }
>>>>
>>>> block.
>>>>
>>>> Interestingly enough if I allow malloc(0) to proceed from ElectricFence,
>>>> then the glretrace runs fine, and even renders correctly, whereas
>>>> running the same gl command stream from libreoffice impress leads
>>>> to misrendering on nv3c.
>>>>
>>>> So 2 questions:
>>>>
>>>> 1) Is it normal / expected for st_translate_program() to get called
>>>> with an empty but not NULL proginfo->Parameters resp. num_immediates == 0
>>>> ?
>>>>
>>>> If not where would I begin to look for finding the culprit of this ?
>>>
>>>
>>> Yes, it's normal.
>>
>>
>> OK, thanks for the clear answer on this.
>>
>>>> 2) Since the glretrace does work outside of libreoffice impress, I think
>>>> it may have something to do with the visual chosen by libreoffice
>>>> impress,
>>>> is there an easy way to find out what visual lo is choosing?
>>>
>>>
>>> No, it's not because of the visual. It seems to me that libreoffice
>>> changed the behavior of malloc and calloc.
>>
>>
>> I'm pretty sure that this is not libreoffice changing malloc / calloc,
>> it links normally to libc, and the same slide transition works fine
>> with an nv84 card which also has a gallium based mesa driver.
>>
>> I really believe this is due to libreoffice doing something opengl
>> related differently than glretrace, be it the visual or something else
>> back buffer related ...
>>
>
> Does libreoffice use llvm? I have vague recollections of there being
> issues with llvm and libreoffice in the past because radeonsi uses
> llvm as well.

FWIW the nv30 gallium driver will only use llvm as part of 'draw' when
falling back to the swtnl path. This should be extremely rare. But
easy enough to build mesa with --disable-gallium-llvm to double-check
(or what was the env var? DRAW_USE_LLVM=0 or something along those
lines).

  -ilia
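On the zero-sized calloc() calls quoted above: the C standard leaves a zero-sized request implementation-defined, so the allocator may return either NULL or a unique pointer that must not be dereferenced. A minimal sketch of a caller that does not depend on either behavior follows. The names `struct src_reg` and `alloc_src_array` are hypothetical stand-ins for illustration only; this is not the Mesa code and not a proposed patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct ureg_src; only the size matters here. */
struct src_reg {
    unsigned index;
};

/*
 * Allocate a zeroed array of `count` elements into *out.
 * Returns 0 on success and -1 on allocation failure.
 * A count of 0 is treated as success with *out == NULL, so the caller
 * never has to interpret what calloc(0, size) returned.
 */
static int alloc_src_array(struct src_reg **out, size_t count)
{
    assert(out != NULL);

    *out = NULL;
    if (count == 0)
        return 0;               /* nothing to allocate, not an error */

    *out = calloc(count, sizeof **out);
    return *out ? 0 : -1;       /* NULL here really is out-of-memory */
}
```

With this shape, a NULL array only signals an error when the requested count was non-zero, which is the distinction a debugging allocator such as ElectricFence otherwise blurs.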
Hans de Goede
2015-Aug-28 08:54 UTC
[Nouveau] nv3x libreoffice impress opengl animations not working
Hi,

On 27-08-15 20:19, Ilia Mirkin wrote:
> On Thu, Aug 27, 2015 at 1:59 PM, Alex Deucher <alexdeucher at gmail.com> wrote:

<snip>

>>>>> 2) Since the glretrace does work outside of libreoffice impress, I think
>>>>> it may have something to do with the visual chosen by libreoffice
>>>>> impress,
>>>>> is there an easy way to find out what visual lo is choosing?
>>>>
>>>>
>>>> No, it's not because of the visual. It seems to me that libreoffice
>>>> changed the behavior of malloc and calloc.
>>>
>>>
>>> I'm pretty sure that this is not libreoffice changing malloc / calloc,
>>> it links normally to libc, and the same slide transition works fine
>>> with an nv84 card which also has a gallium based mesa driver.
>>>
>>> I really believe this is due to libreoffice doing something opengl
>>> related differently than glretrace, be it the visual or something else
>>> back buffer related ...
>>>
>>
>> Does libreoffice use llvm? I have vague recollections of there being
>> issues with llvm and libreoffice in the past because radeonsi uses
>> llvm as well.
>
> FWIW the nv30 gallium driver will only use llvm as part of 'draw' when
> falling back to the swtnl path. This should be extremely rare. But
> easy enough to build mesa with --disable-gallium-llvm to double-check
> (or what was the env var? DRAW_USE_LLVM=0 or something along those
> lines).

I've tried building with --disable-gallium-llvm, this does not help,
this is not really surprising since on Fedora both libreoffice and
mesa use the system llvm, so there should be no problems with them
expecting different llvm versions.

I've done some further debugging, adding some debug printf-s to the
texture creation paths for nv3x. This bit is interesting, glretrace
does:

nv30_miptree_from_handle 1350x863 uniform_pitch 6144 usage 0 flags 0
nv30_miptree_create 1350x863 uniform_pitch 5440 usage 0 flags 0 bind 1 target 2

So it gets a texture from a handle, which I believe is the child-window
in which the animation will be shown, and then creates another texture
with the same dimensions to serve as back buffer I presume.

ooimpress however does this:

nv30_miptree_from_handle 1350x863 uniform_pitch 6144 usage 0 flags 0
nv30_miptree_create 2700x1726 uniform_pitch 10816 usage 0 flags 0 bind a target 2
nv30_miptree_create 2700x1726 uniform_pitch 10816 usage 0 flags 0 bind 1 target 2

Notice how it is creating 2 (back?) buffers and they are twice the size of
the "sheet" area of impress to which the animation gets rendered.

I believe this is a clue to the root cause of the problem, but after this
I'm sorta stuck. Anyone got any hints on how to debug this further / where
to look ?

Thanks & Regards,

Hans
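As a sanity check on the log above: the uniform_pitch values printed for the created miptrees match width times 4 bytes per pixel, rounded up to a multiple of 64 bytes. That rule is only inferred from the numbers in this thread (both the 4 bytes per pixel and the 64-byte alignment are assumptions, not taken from the nv30 source), and the helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round `value` up to the next multiple of `alignment` (a power of two). */
static uint32_t align_up(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/*
 * Hypothetical pitch rule inferred from the logged numbers:
 * bytes per row = width * bytes_per_pixel, aligned up to 64 bytes.
 */
static uint32_t guess_pitch(uint32_t width, uint32_t bytes_per_pixel)
{
    return align_up(width * bytes_per_pixel, 64);
}
```

Under these assumptions guess_pitch(1350, 4) gives 5440 and guess_pitch(2700, 4) gives 10816, matching both nv30_miptree_create lines. The 6144 pitch of the buffer imported through nv30_miptree_from_handle does not follow this rule (6144 is 1536 * 4), which would be consistent with that pitch having been chosen by whoever allocated the window buffer rather than by this code path.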
Ilia Mirkin
2015-Aug-28 09:02 UTC
[Nouveau] nv3x libreoffice impress opengl animations not working
On Fri, Aug 28, 2015 at 4:54 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> Hi,
>
> On 27-08-15 20:19, Ilia Mirkin wrote:
>>
>> On Thu, Aug 27, 2015 at 1:59 PM, Alex Deucher <alexdeucher at gmail.com>
>> wrote:
>
>
> <snip>
>
>>>>>> 2) Since the glretrace does work outside of libreoffice impress, I
>>>>>> think
>>>>>> it may have something to do with the visual chosen by libreoffice
>>>>>> impress,
>>>>>> is there an easy way to find out what visual lo is choosing?
>>>>>
>>>>>
>>>>>
>>>>> No, it's not because of the visual. It seems to me that libreoffice
>>>>> changed the behavior of malloc and calloc.
>>>>
>>>>
>>>>
>>>> I'm pretty sure that this is not libreoffice changing malloc / calloc,
>>>> it links normally to libc, and the same slide transition works fine
>>>> with an nv84 card which also has a gallium based mesa driver.
>>>>
>>>> I really believe this is due to libreoffice doing something opengl
>>>> related differently than glretrace, be it the visual or something else
>>>> back buffer related ...
>>>>
>>>
>>> Does libreoffice use llvm? I have vague recollections of there being
>>> issues with llvm and libreoffice in the past because radeonsi uses
>>> llvm as well.
>>
>>
>> FWIW the nv30 gallium driver will only use llvm as part of 'draw' when
>> falling back to the swtnl path. This should be extremely rare. But
>> easy enough to build mesa with --disable-gallium-llvm to double-check
>> (or what was the env var? DRAW_USE_LLVM=0 or something along those
>> lines).
>
>
> I've tried building with --disable-gallium-llvm, this does not help,
> this is not really surprising since on Fedora both libreoffice and
> mesa use the system llvm, so there should be no problems with them
> expecting different llvm versions.
>
> I've done some further debugging adding some debug printf-s to the
> texture creation paths for nv3x, this bit is interesting, glretrace
> does:
>
> nv30_miptree_from_handle 1350x863 uniform_pitch 6144 usage 0 flags 0
> nv30_miptree_create 1350x863 uniform_pitch 5440 usage 0 flags 0 bind 1 target 2
>
> So it gets a texture from a handle, which I believe is the child-window
> in which the animation will be shown, and then create another texture
> with the same dimensions to serve as back buffer I presume.
>
> ooimpress however does this:
>
> nv30_miptree_from_handle 1350x863 uniform_pitch 6144 usage 0 flags 0
> nv30_miptree_create 2700x1726 uniform_pitch 10816 usage 0 flags 0 bind a target 2
> nv30_miptree_create 2700x1726 uniform_pitch 10816 usage 0 flags 0 bind 1 target 2
>
> Notice how it is creating 2 (back?) buffers and they are twice the size of
> the "sheet" area of impress to which the animation gets rendered.

bind a = rt/sampler view, bind 1 = depth/stencil. However nv3x doesn't
do NPOT textures... so those sizes are a bit odd. Perhaps there's some
logic that attempts to round up to the nearest POT size, but instead
multiplies width and height by 2?

>
> I believe this is a clue to the root cause of the problem, but after this
> I'm sorta stuck. Anyone got any hints on how to debug this further / where
> to look ?
>
> Thanks & Regards,
>
> Hans
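To make the comparison in the reply above concrete: rounding 1350x863 up to the next power of two would give 2048x1024, whereas the logged 2700x1726 is exactly twice each dimension. A small sketch of the two calculations follows; `next_pot` is a hypothetical helper written for this illustration and is not taken from Mesa or libreoffice.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= v, for 1 <= v <= 2^31. */
static uint32_t next_pot(uint32_t v)
{
    assert(v != 0 && v <= 0x80000000u);

    uint32_t p = 1;
    while (p < v)
        p <<= 1;
    return p;
}

/* Plain doubling, which is what the logged sizes actually correspond to. */
static uint32_t doubled(uint32_t v)
{
    return v * 2;
}
```

For the sizes in the log, next_pot(1350) is 2048 and next_pot(863) is 1024, while doubled(1350) is 2700 and doubled(863) is 1726. The created miptrees therefore match plain doubling of the window size, not a correct round-up to a power of two.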
Marek Olšák
2015-Aug-28 11:01 UTC
[Nouveau] nv3x libreoffice impress opengl animations not working
Your first question was about shader translation, but now you're
talking about texture allocations, which are completely unrelated.
Like I said, visuals and textures have NOTHING to do with shader
compilations.

Marek

On Fri, Aug 28, 2015 at 10:54 AM, Hans de Goede <hdegoede at redhat.com> wrote:
> Hi,
>
> On 27-08-15 20:19, Ilia Mirkin wrote:
>>
>> On Thu, Aug 27, 2015 at 1:59 PM, Alex Deucher <alexdeucher at gmail.com>
>> wrote:
>
>
> <snip>
>
>>>>>> 2) Since the glretrace does work outside of libreoffice impress, I
>>>>>> think
>>>>>> it may have something to do with the visual chosen by libreoffice
>>>>>> impress,
>>>>>> is there an easy way to find out what visual lo is choosing?
>>>>>
>>>>>
>>>>>
>>>>> No, it's not because of the visual. It seems to me that libreoffice
>>>>> changed the behavior of malloc and calloc.
>>>>
>>>>
>>>>
>>>> I'm pretty sure that this is not libreoffice changing malloc / calloc,
>>>> it links normally to libc, and the same slide transition works fine
>>>> with an nv84 card which also has a gallium based mesa driver.
>>>>
>>>> I really believe this is due to libreoffice doing something opengl
>>>> related differently then glretrace, be it the visual or something else
>>>> back buffer related ...
>>>>
>>>
>>> Does libreoffice use llvm? I have vague recollections of there being
>>> issues with llvm and libreoffice in the past because radeonsi uses
>>> llvm as well.
>>
>>
>> FWIW the nv30 gallium driver will only use llvm as part of 'draw' when
>> falling back to the swtnl path. This should be extremely rare. But
>> easy enough to build mesa with --disable-gallium-llvm to double-check
>> (or what was the env var? DRAW_USE_LLVM=0 or something along those
>> lines).
>
>
> I've tried building with --disable-gallium-llvm, this does not help,
> this is not really surprising since on Fedora both libreoffice and
> mesa use the system llvm, so there should be no problems with them
> expecting different llvm versions.
>
> I've done some further debugging adding some debug printf-s to the
> texture creation paths for nv3x, this bit is interesting, glretrace
> does:
>
> nv30_miptree_from_handle 1350x863 uniform_pitch 6144 usage 0 flags 0
> nv30_miptree_create 1350x863 uniform_pitch 5440 usage 0 flags 0 bind 1 target 2
>
> So it gets a texture from a handle, which I believe is the child-window
> in which the animation will be shown, and then create another texture
> with the same dimensions to serve as back buffer I presume.
>
> ooimpress however does this:
>
> nv30_miptree_from_handle 1350x863 uniform_pitch 6144 usage 0 flags 0
> nv30_miptree_create 2700x1726 uniform_pitch 10816 usage 0 flags 0 bind a target 2
> nv30_miptree_create 2700x1726 uniform_pitch 10816 usage 0 flags 0 bind 1 target 2
>
> Notice how it is creating 2 (back?) buffers and they are twice the size of
> the "sheet" area of impress to which the animation gets rendered.
>
> I believe this is a clue to the root cause of the problem, but after this
> I'm sorta stuck. Anyone got any hints on how to debug this further / where
> to look ?
>
> Thanks & Regards,
>
> Hans