thr3ads.net - llvm dev - [LLVMdev] Module::getOrInsertFunction determinism [Jun 2014]


Kuchta, Tomasz

2014-Jun-03 09:50 UTC

[LLVMdev] Module::getOrInsertFunction determinism

Hi Philip,

I would like to ask a follow-up question about code generation.
Is it expected that if we take the same bit code modules and link them
together in the same order (programmatically), but on different machines
(assuming the same version of LLVM and roughly the same OS), the output
may differ in the order of function definitions inside the module?

Thank you very much in advance for your help,

Best regards
Tomasz Kuchta

On 29/05/2014 22:22, "Kuchta, Tomasz" <t.kuchta12 at imperial.ac.uk> wrote:
>Hi Philip,
>
>Thank you very much for the reply. I need to add one important detail:
>currently I'm working on an outdated version of LLVM, so I will first
>need to make sure that the problem also appears on the current one.
>
>The use case that I have in mind is to be able to uniquely identify an
>instruction in the bit code.
>For example, let's say we have an "add" instruction in some function and
>I want to know that this is the same instruction in two separate runs.
>If the function is put in a different place as a result of inserting
>functions and linking in the second run, how would I identify that
>instruction? It seems to me that I cannot rely on the offset of the
>instruction within the binary, and I cannot rely on the instruction count.
>The problem in my case was that for the same source code base, I was
>getting different resulting LLVM IR when doing the same sequence of
>getOrInsertFunction calls and linking with other bit code modules on
>each run.
>
>Thank you,
>Tomek
>
>On 29/05/2014 19:34, "Philip Reames" <listmail at philipreames.com> wrote:
>
>>
>>On 05/29/2014 11:06 AM, Tim Northover wrote:
>>> Hi Tomek,
>>>
>>>> I've got a question about Module::getOrInsertFunction().
>>>> I got an impression that it is not deterministic where exactly in the
>>>> bit code module the new function will be inserted.
>>> Looking at the code (not exhaustively), it seems a new function will
>>> always be added to the end of a module.
>>>
>>> Documenting that probably wouldn't be a terrible idea, but it
>>> shouldn't affect anything except the human-readability of LLVM IR. Are
>>> you trying to do something where it is actually causing problems?
>>>
>>> Cheers.
>>>
>>> Tim.
>>I would argue in favor of leaving this explicitly undocumented.  I don't
>>see the use case in knowing where in a module it got added, and it
>>restricts future implementations in ways we can't predict.  Documenting
>>that it must be deterministic is fine.  Documenting where it decides to
>>place it is not.
>>
>>Tomek, could you spell out why you need the position?  Maybe there's
>>another option here.
>>
>>Philip
>>_______________________________________________
>>LLVM Developers mailing list
>>LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>

Philip Reames

2014-Jun-03 16:15 UTC


[LLVMdev] Module::getOrInsertFunction determinism

On 06/03/2014 02:50 AM, Kuchta, Tomasz wrote:
> Hi Philip,
>
> I would like to ask a follow-up question about code generation.
> Is it expected that if we take the same bit code modules and link them
> together in the same order (programmatically), but on different machines
> (assuming the same version of LLVM and roughly the same OS), the output
> may differ in the order of function definitions inside the module?

This falls into the category of "not really surprising". There's enough
variation between machines that it's not uncommon for an otherwise
mostly-deterministic process to behave slightly differently. I wouldn't
call the change in order "expected", but it also doesn't sound like an
obvious bug.

Also, from what you said previously, you're working with a very old
version of LLVM. Most of my own experience is with newer versions, so it
may be that my expectations are badly off. I'd really recommend you upgrade.
You'll get much more applicable advice. :)

>
> Thank you very much in advance for help,
>
> Best regards
> Tomasz Kuchta
>
> On 29/05/2014 22:22, "Kuchta, Tomasz" <t.kuchta12 at imperial.ac.uk> wrote:
>
>> Hi Philip,
>>
>> Thank you very much for the reply. I need to add one important detail:
>> currently I'm working on an outdated version of LLVM, so I will first
>> need to make sure that the problem also appears on the current one.
>>
>> The use case that I have in mind is to be able to uniquely identify an
>> instruction in the bit code.
>> For example, let's say we have an "add" instruction in some function and
>> I want to know that this is the same instruction in two separate runs.
>> If the function is put in a different place as a result of inserting
>> functions and linking in the second run, how would I identify that
>> instruction? It seems to me that I cannot rely on the offset of the
>> instruction within the binary, and I cannot rely on the instruction count.

Have you considered trying either metadata or using an intrinsic
function as a marker for the interesting instruction? Depending on your
use case, either might work. (FYI, I believe metadata has changed
radically since earlier versions.)

>> The problem in my case was that for the same source code base, I was
>> getting different resulting LLVM IR when doing the same sequence of
>> getOrInsertFunction calls and linking with other bit code modules on
>> each run.
>>
>> Thank you,
>> Tomek
>>

Kuchta, Tomasz

2014-Jun-04 09:32 UTC


[LLVMdev] Module::getOrInsertFunction determinism

Hi Philip,

Thank you very much for your comments.
I think I've discovered the root cause. The problem was in linking bit code
archive files with the module.
At some point, a std::set<Module*> is used and iterated over. A set of
pointers is ordered by address, so its iteration order depends on where the
modules happen to be allocated. I believe this is why, for example, it worked
consistently with ASLR turned off and produced non-deterministic output
otherwise. I changed that code to use a vector instead, and now it seems to
be working as expected. I wouldn't be surprised if this is already fixed or
changed in some newer version of LLVM, and I agree I should upgrade :)
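
For anyone hitting the same symptom, here is a minimal sketch of the effect described above. It is not the actual LLVM archive-linking code: Module is a plain stand-in struct, and namesViaSet and namesViaVector are hypothetical helper names made up for this illustration. The first helper iterates a std::set<Module*>, so its output follows pointer values; the second keeps the link order in a vector and uses the set only for membership tests.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Stand-in for llvm::Module (assumption: only a name is needed here).
struct Module {
    std::string name;
};

// Visits modules in the order std::set<Module*> yields them. The set is
// keyed on the pointer value, so this order follows memory addresses and
// can change from run to run when the address layout changes (e.g. ASLR).
std::vector<std::string> namesViaSet(const std::vector<Module*>& linkOrder) {
    std::set<Module*> modules(linkOrder.begin(), linkOrder.end());
    std::vector<std::string> out;
    for (Module* m : modules)
        out.push_back(m->name);
    return out;
}

// Visits modules in the order they were given, skipping duplicates. The
// set is only consulted for "already seen?" checks and is never iterated,
// so pointer values cannot influence the result.
std::vector<std::string> namesViaVector(const std::vector<Module*>& linkOrder) {
    std::set<Module*> seen;
    std::vector<std::string> out;
    for (Module* m : linkOrder)
        if (seen.insert(m).second)
            out.push_back(m->name);
    return out;
}
```

Given the same list of modules, namesViaVector returns the same sequence on every run, while the sequence from namesViaSet depends on where the modules were allocated. Sorting by a stable key such as the module name would be another way to get a reproducible order.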

Thank you very much for your help,
Best regards
Tomek

