Hans de Goede
2015-Aug-27 17:55 UTC
[Nouveau] [Mesa-dev] gallium state tracker calls calloc for 0 sizes arrays ?
Hi,

On 27-08-15 15:46, Marek Olšák wrote:
> On Thu, Aug 27, 2015 at 3:09 PM, Hans de Goede <hdegoede at redhat.com> wrote:
>> Hi All,
>>
>> While debugging: https://bugzilla.redhat.com/show_bug.cgi?id=1008089
>>
>> I made an apitrace recording of a single slide transition
>> animation, and since I suspected memory corruption I replayed
>> it using ElectricFence + glretrace. This finds a 0 sized array
>> allocation at src/mesa/state_tracker/st_glsl_to_tgsi.cpp: 5565:
>>
>>    if (proginfo->Parameters) {
>>       t->constants = (struct ureg_src *)
>>          calloc(proginfo->Parameters->NumParameters,
>>                 sizeof(t->constants[0]));
>>
>> And if I protect the code against that one, another one at 5618:
>>
>>    t->immediates = (struct ureg_src *)
>>       calloc(program->num_immediates, sizeof(struct ureg_src));
>>
>> With the regular glibc malloc these both succeed as it actually
>> returns a valid memory address (POSIX says it may also return NULL)
>>
>> I believe that the fragment program in question comes from:
>>
>> src/mesa/main/state.c update_program() and then from the
>>
>>    else if (ctx->FragmentProgram._MaintainTexEnvProgram) {
>>       /* Use fragment program generated from fixed-function state */
>>
>>    }
>>
>> block.
>>
>> Interestingly enough if I allow malloc(0) to proceed from ElectricFence,
>> then the glretrace runs fine, and even renders correctly, whereas
>> running the same gl command stream from libreoffice impress leads
>> to misrendering on nv3c.
>>
>> So 2 questions:
>>
>> 1) Is it normal / expected for st_translate_program() to get called
>> with an empty but not NULL proginfo->Parameters resp. num_immediates == 0 ?
>>
>> If not where would I begin to look for finding the culprit of this ?
>
> Yes, it's normal.

OK, thanks for the clear answer on this.

>> 2) Since the glretrace does work outside of libreoffice impress, I think
>> it may have something to do with the visual chosen by libreoffice impress,
>> is there an easy way to find out what visual lo is choosing?
>
> No, it's not because of the visual. It seems to me that libreoffice
> changed the behavior of malloc and calloc.

I'm pretty sure that this is not libreoffice changing malloc / calloc,
it links normally to libc, and the same slide transition works fine
with an nv84 card which also has a gallium based mesa driver.

I really believe this is due to libreoffice doing something opengl
related differently than glretrace, be it the visual or something else
back buffer related ...

Regards,

Hans
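The zero-size case discussed above can be illustrated with a small C sketch. It is not Mesa code: `struct elem` and `alloc_array` are hypothetical names standing in for `struct ureg_src` and the allocation sites in st_glsl_to_tgsi.cpp. The sketch assumes the caller is happy to represent an empty array as a NULL pointer and to report failure only when a non-zero request cannot be satisfied, which sidesteps the question of whether calloc(0, size) returns NULL or a unique pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the element type named in the thread
 * (struct ureg_src); the real layout does not matter here. */
struct elem {
    unsigned value;
};

/*
 * Allocate a zero-initialized array of `count` elements.
 *
 * For count == 0 the allocator is not called at all: *out is set to NULL
 * and the call reports success, so the result does not depend on what a
 * particular calloc implementation returns for a zero-size request.
 * Returns false only when a non-zero request fails.
 */
static bool alloc_array(size_t count, struct elem **out)
{
    assert(out != NULL);

    if (count == 0) {
        *out = NULL;      /* empty array: nothing to allocate */
        return true;
    }

    *out = calloc(count, sizeof(**out));
    return *out != NULL;  /* NULL here means a genuine failure */
}
```

A caller would check the boolean result rather than the pointer, so a count of 0 is never mistaken for an out-of-memory condition. Since free(NULL) is a no-op, cleanup needs no special case for the empty array.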
Alex Deucher
2015-Aug-27 17:59 UTC
[Nouveau] [Mesa-dev] gallium state tracker calls calloc for 0 sizes arrays ?
On Thu, Aug 27, 2015 at 1:55 PM, Hans de Goede <hdegoede at redhat.com> wrote:
> Hi,
>
> On 27-08-15 15:46, Marek Olšák wrote:
>> On Thu, Aug 27, 2015 at 3:09 PM, Hans de Goede <hdegoede at redhat.com>
>> wrote:
>>> Hi All,
>>>
>>> While debugging: https://bugzilla.redhat.com/show_bug.cgi?id=1008089
>>>
>>> I made an apitrace recording of a single slide transition
>>> animation, and since I suspected memory corruption I replayed
>>> it using ElectricFence + glretrace. This finds a 0 sized array
>>> allocation at src/mesa/state_tracker/st_glsl_to_tgsi.cpp: 5565:
>>>
>>>    if (proginfo->Parameters) {
>>>       t->constants = (struct ureg_src *)
>>>          calloc(proginfo->Parameters->NumParameters,
>>>                 sizeof(t->constants[0]));
>>>
>>> And if I protect the code against that one, another one at 5618:
>>>
>>>    t->immediates = (struct ureg_src *)
>>>       calloc(program->num_immediates, sizeof(struct ureg_src));
>>>
>>> With the regular glibc malloc these both succeed as it actually
>>> returns a valid memory address (POSIX says it may also return NULL)
>>>
>>> I believe that the fragment program in question comes from:
>>>
>>> src/mesa/main/state.c update_program() and then from the
>>>
>>>    else if (ctx->FragmentProgram._MaintainTexEnvProgram) {
>>>       /* Use fragment program generated from fixed-function state */
>>>
>>>    }
>>>
>>> block.
>>>
>>> Interestingly enough if I allow malloc(0) to proceed from ElectricFence,
>>> then the glretrace runs fine, and even renders correctly, whereas
>>> running the same gl command stream from libreoffice impress leads
>>> to misrendering on nv3c.
>>>
>>> So 2 questions:
>>>
>>> 1) Is it normal / expected for st_translate_program() to get called
>>> with an empty but not NULL proginfo->Parameters resp. num_immediates == 0
>>> ?
>>>
>>> If not where would I begin to look for finding the culprit of this ?
>>
>> Yes, it's normal.
>
> OK, thanks for the clear answer on this.
>
>>> 2) Since the glretrace does work outside of libreoffice impress, I think
>>> it may have something to do with the visual chosen by libreoffice
>>> impress,
>>> is there an easy way to find out what visual lo is choosing?
>>
>> No, it's not because of the visual. It seems to me that libreoffice
>> changed the behavior of malloc and calloc.
>
> I'm pretty sure that this is not libreoffice changing malloc / calloc,
> it links normally to libc, and the same slide transition works fine
> with an nv84 card which also has a gallium based mesa driver.
>
> I really believe this is due to libreoffice doing something opengl
> related differently than glretrace, be it the visual or something else
> back buffer related ...
>

Does libreoffice use llvm? I have vague recollections of there being
issues with llvm and libreoffice in the past because radeonsi uses
llvm as well.

Alex
Ilia Mirkin
2015-Aug-27 18:19 UTC
[Nouveau] [Mesa-dev] gallium state tracker calls calloc for 0 sizes arrays ?
On Thu, Aug 27, 2015 at 1:59 PM, Alex Deucher <alexdeucher at gmail.com> wrote:
> On Thu, Aug 27, 2015 at 1:55 PM, Hans de Goede <hdegoede at redhat.com> wrote:
>> Hi,
>>
>> On 27-08-15 15:46, Marek Olšák wrote:
>>> On Thu, Aug 27, 2015 at 3:09 PM, Hans de Goede <hdegoede at redhat.com>
>>> wrote:
>>>> Hi All,
>>>>
>>>> While debugging: https://bugzilla.redhat.com/show_bug.cgi?id=1008089
>>>>
>>>> I made an apitrace recording of a single slide transition
>>>> animation, and since I suspected memory corruption I replayed
>>>> it using ElectricFence + glretrace. This finds a 0 sized array
>>>> allocation at src/mesa/state_tracker/st_glsl_to_tgsi.cpp: 5565:
>>>>
>>>>    if (proginfo->Parameters) {
>>>>       t->constants = (struct ureg_src *)
>>>>          calloc(proginfo->Parameters->NumParameters,
>>>>                 sizeof(t->constants[0]));
>>>>
>>>> And if I protect the code against that one, another one at 5618:
>>>>
>>>>    t->immediates = (struct ureg_src *)
>>>>       calloc(program->num_immediates, sizeof(struct ureg_src));
>>>>
>>>> With the regular glibc malloc these both succeed as it actually
>>>> returns a valid memory address (POSIX says it may also return NULL)
>>>>
>>>> I believe that the fragment program in question comes from:
>>>>
>>>> src/mesa/main/state.c update_program() and then from the
>>>>
>>>>    else if (ctx->FragmentProgram._MaintainTexEnvProgram) {
>>>>       /* Use fragment program generated from fixed-function state */
>>>>
>>>>    }
>>>>
>>>> block.
>>>>
>>>> Interestingly enough if I allow malloc(0) to proceed from ElectricFence,
>>>> then the glretrace runs fine, and even renders correctly, whereas
>>>> running the same gl command stream from libreoffice impress leads
>>>> to misrendering on nv3c.
>>>>
>>>> So 2 questions:
>>>>
>>>> 1) Is it normal / expected for st_translate_program() to get called
>>>> with an empty but not NULL proginfo->Parameters resp. num_immediates == 0
>>>> ?
>>>>
>>>> If not where would I begin to look for finding the culprit of this ?
>>>
>>> Yes, it's normal.
>>
>> OK, thanks for the clear answer on this.
>>
>>>> 2) Since the glretrace does work outside of libreoffice impress, I think
>>>> it may have something to do with the visual chosen by libreoffice
>>>> impress,
>>>> is there an easy way to find out what visual lo is choosing?
>>>
>>> No, it's not because of the visual. It seems to me that libreoffice
>>> changed the behavior of malloc and calloc.
>>
>> I'm pretty sure that this is not libreoffice changing malloc / calloc,
>> it links normally to libc, and the same slide transition works fine
>> with an nv84 card which also has a gallium based mesa driver.
>>
>> I really believe this is due to libreoffice doing something opengl
>> related differently than glretrace, be it the visual or something else
>> back buffer related ...
>>
>
> Does libreoffice use llvm? I have vague recollections of there being
> issues with llvm and libreoffice in the past because radeonsi uses
> llvm as well.

FWIW the nv30 gallium driver will only use llvm as part of 'draw' when
falling back to the swtnl path. This should be extremely rare. But it is
easy enough to build mesa with --disable-gallium-llvm to double-check
(or what was the env var? DRAW_USE_LLVM=0 or something along those lines).

  -ilia