thr3ads.net - llvm dev - [llvm-dev] bitcode versioning [Dec 2015]

Martin J. O'Riordan via llvm-dev

2015-Dec-11 14:13 UTC

[llvm-dev] bitcode versioning

Hi Mehdi and my apologies for the delay in responding - the day job got in the
way :-)

Our target is still out-of-tree so my reasons for extending the IR would be
eliminated if we were a proper part of LLVM, which I would like to do when the
time is right for us.

My extensions are quite simple really, and I expect that they will be wanted in
the TRUNK sometime anyway.

At the moment I only have one remaining change which is to add 'v16f16'
to the set of IR types.  Previously I had several other FP16 vector types added,
but over the past few iterations of LLVM my changes have been gradually made
redundant because others have added them formally to the source.  I expect that
'v16f16' will go this way too allowing me to have an unaltered IR.

But the problem I have faced in making these changes is that my LLVM cannot
accept the BC produced by another version (and vice versa), not even the
official version, because the placement of the types in the enumeration is very
particular, and inserting one changes the indices of all the subsequent values.

I had often thought it would be helpful if the BC (and LL for that matter) had a
version resource of some kind that would let me see that the incoming IR was
produced by the official, unchanged LLVM. I could then place a translation in
the loader that remaps the indices to the ones expected by my back-end.
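A minimal sketch of the index problem and the remapping idea, in Python for brevity. The two type lists are hypothetical and much shorter than the real LLVM enumeration; only the mechanism (an inserted entry shifts every later index, and a lookup table undoes the shift) is the point.

```python
# Hypothetical type enumerations; the names and orderings are illustrative
# only, not the real LLVM value-type list.
UPSTREAM_TYPES = ["v4f16", "v8f16", "v2f32", "v4f32"]
# The same list with 'v16f16' inserted mid-list, as an out-of-tree fork might do.
EXTENDED_TYPES = ["v4f16", "v8f16", "v16f16", "v2f32", "v4f32"]


def build_remap(source_types, target_types):
    """Map each index in source_types to the index of the same name in target_types."""
    target_index = {name: i for i, name in enumerate(target_types)}
    missing = [name for name in source_types if name not in target_index]
    if missing:
        # The producer used a type the consumer does not know: diagnose, don't crash.
        raise ValueError(f"types unknown to the target enumeration: {missing}")
    return [target_index[name] for name in source_types]


# remap[i] is the extended-fork index for upstream index i.
remap = build_remap(UPSTREAM_TYPES, EXTENDED_TYPES)
print(remap)  # [0, 1, 3, 4]
```

Such a table can only be chosen safely if the loader knows which enumeration the producer used, which is where the version information comes in.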

When you proposed the addition of a version resource, I was thinking that rather
than each target adding parsing code for it, it would be better and more
transparent for it to appear as a "Version Resource Object" that I
could query for simple things like:

  o  Get the major number
  o  Get the minor number
  o  Get the patch number
  o  Is it extended? and if "yes":
     -  Get the vendor ID (could be a string)
     -  Get the vendor specific extension number

And this is really what I mean by an API - essentially a simple object
representing the version information.  For IR production/emission, there would
need to be a 'setter' interface too.
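To make the shape of such an object concrete, here is a small sketch in Python. The class name `VersionResource` and its fields are hypothetical and simply mirror the list above; nothing like this exists in LLVM under that name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionResource:
    """Hypothetical version object mirroring the queries listed above."""
    major: int
    minor: int
    patch: int
    vendor_id: Optional[str] = None          # set only for extended dialects
    vendor_extension: Optional[int] = None   # vendor-specific extension number

    @property
    def is_extended(self) -> bool:
        # A vendor ID marks the IR as a non-upstream dialect.
        return self.vendor_id is not None


# Reader side: query the fields.
upstream = VersionResource(3, 8, 0)
# Writer side: the dataclass fields are plain attributes, so they double as setters.
extended = VersionResource(3, 8, 0, vendor_id="ExampleVendor", vendor_extension=2)
```

"ExampleVendor" and the extension number 2 are placeholders, not values used by any real toolchain.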

This would allow me to make my extensions, yet be in a position to more robustly
accept BC or LL from other sources.  In particular I should be able to remap IR
coming from a well-known point-release of LLVM, and also be able to detect,
diagnose and reject input from sources I don't recognise (at the moment it
just causes a crash).

From my experience of developing an out-of-tree LLVM backend, I am painfully
aware of the downsides of not being "in-tree". While I expect that I will
eventually be able to contribute our work, I am also aware that other
out-of-tree developers will run into similar problems in the future, and a
formal version resource would greatly help.

Thanks,

	MartinO - Movidius Ltd.

-----Original Message-----
From: mehdi.amini at apple.com [mailto:mehdi.amini at apple.com] 
Sent: 03 December 2015 19:51
To: Martin J. O'Riordan <martin.oriordan at movidius.com>
Cc: Manuel Rigger <rigger.manuel at gmail.com>; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] bitcode versioning

What kind of API would you expect? The Bitcode Reader exposes an API to get the
information in this block. It is up to the client to interpret it.

Our internal use case is to parse the version string and identify bitcode
generated by an Apple released LLVM. If the version is “from the future” the
bitcode can be rejected (we’ll do it during LTO).

— 
Mehdi

> On Dec 3, 2015, at 11:48 AM, Martin J. O'Riordan <martin.oriordan at movidius.com> wrote:
> 
> Is there going to be a formal interface/API for this version-block
> information?  I have had to "extend" the IR and bitcode representations
> several times to address absences/limitations in the handling of various
> vector types, in particular FP16 vector types; and it would be really useful
> if I had a "standard" way of doing this, and identifying that my dialect was
> different.
> 
> Thanks,
> 
> 	MartinO - Movidius
> 
> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Mehdi Amini via llvm-dev
> Sent: 03 December 2015 15:45
> To: Manuel Rigger <rigger.manuel at gmail.com>
> Cc: llvm-dev at lists.llvm.org
> Subject: Re: [llvm-dev] bitcode versioning
> 
> 
>> On Dec 3, 2015, at 4:10 AM, Manuel Rigger via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> 
>> Hi all,
>> 
>> I am implementing a LLVM IR interpreter and have the following problem: I
>> want to support execution of bitcode files targeted towards different LLVM
>> versions. For example, a user of the interpreter should be able to compile
>> a C file with the latest version of Clang, a Fortran file with Dragonegg
>> (targeting LLVM 3.3), and a Haskell file with GHC (targeting LLVM 3.5), and
>> then just feed it to my interpreter without additional arguments.
>> 
>> Currently, my parser expects textual representation for a specific LLVM
>> version. I could provide different parsers or parser configurations that
>> support different bitcode versions, but there is no notion of a version
>> field in the textual representation that I could use to determine which
>> parser to use. Anyway, for the long term it is not a good idea to rely on
>> the textual format due to the missing backward compatibility guarantees.
>> 
>> Hence, I want to replace the textual format parser with a parser for
>> bitcode, which would also be able to parse the files of my example. But how
>> should I treat bitcode files of major upcoming releases, e.g., of LLVM 4.1?
>> I found a version ID in the bitcode wrapper format, but the documentation
>> states that the ID is currently always 0. Is there a policy that specifies
>> when the ID will be updated? Without having such a policy in place, I would
>> just postpone the problem I currently have with the textual format parser.
> 
> The wrapper format is Darwin specific AFAIK. However, starting with 3.8
> there will be another version block in the bitcode, which contains a string
> identifying the producer and an integer that will be bumped when needed
> (whatever it means).
> Look for lib/Bitcode/Reader/BitcodeReader.cpp:llvm::getBitcodeProducerString()
> as a starting point.
> 
> — 
> Mehdi
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
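For reference, the wrapper header that Manuel mentions can be inspected with a few lines of Python. This is a sketch that assumes the layout given in the LLVM bitcode format documentation (five 32-bit little-endian fields: magic 0x0B17C0DE, version, offset, size, CPU type); the function name and the returned dictionary are illustrative.

```python
import struct

WRAPPER_MAGIC = 0x0B17C0DE  # magic number of the bitcode wrapper header


def read_wrapper_header(data):
    """Return the wrapper header fields, or None if data has no wrapper."""
    if len(data) < 20:
        return None
    # Five unsigned 32-bit little-endian fields at the start of the file.
    magic, version, offset, size, cpu_type = struct.unpack_from("<5I", data, 0)
    if magic != WRAPPER_MAGIC:
        return None
    return {"version": version, "offset": offset, "size": size, "cpu_type": cpu_type}


# A synthetic header with no payload: version 0, bitcode starting at byte 20.
sample = struct.pack("<5I", WRAPPER_MAGIC, 0, 20, 0, 0)
header = read_wrapper_header(sample)
```

Because the version field has so far always been 0, it does not help in telling LLVM releases apart, which is the point made in the question above.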

Mehdi Amini via llvm-dev

2015-Dec-12 02:13 UTC

[llvm-dev] bitcode versioning

> On Dec 11, 2015, at 6:13 AM, Martin J. O'Riordan <martin.oriordan at movidius.com> wrote:
> 
> Hi Mehdi and my apologies for the delay in responding - the day job got in
> the way :-)
> 
> Our target is still out-of-tree so my reasons for extending the IR would be
> eliminated if we were a proper part of LLVM, which I would like to do when
> the time is right for us.
> 
> My extensions are quite simple really, and I expect that they will be
> wanted in the TRUNK sometime anyway.
> 
> At the moment I only have one remaining change which is to add 'v16f16' to
> the set of IR types.  Previously I had several other FP16 vector types
> added, but over the past few iterations of LLVM my changes have been
> gradually made redundant because others have added them formally to the
> source.  I expect that 'v16f16' will go this way too allowing me to have an
> unaltered IR.
> 
> But the problem I have faced with making the changes, is that my LLVM
> cannot accept the BC produced by another version (and vice versa), not even
> the official version, because the placement of the types in the enumeration
> is very particular and changes the indices for all the subsequent values.
> 
> I had often thought it would be helpful if the BC (and LL for that matter)
> had a version resource of some kind, that would allow me to see that the
> incoming IR was produced by the official unchanged LLVM, and then I could
> have placed a translation in the loader that would remap the indices to the
> ones expected by my back-end.
> 
> When you proposed the addition of a version resource, I was thinking that
> rather than each target adding parsing code for it, it would be better and
> more transparent for it to appear as a "Version Resource Object" that I
> could query for simple things like:
> 
>  o  Get the major number
>  o  Get the minor number
>  o  Get the patch number

This would force a specific model for the version, which we didn’t want.

>  o  Is it extended? and if "yes":
>     -  Get the vendor ID (could be a string)
>     -  Get the vendor specific extension number
> 
> And this is really what I mean by an API - essentially a simple object
> representing the version information.  For IR production/emission, there
> would need to be a 'setter' interface too.

This is what we do, but using the string only.
The “setter” is compile time (LLVM_VERSION probably); we patch the bitcode
writer internally.

> This would allow me to make my extensions, yet be in a position to more
> robustly accept BC or LL from other sources.  In particular I should be
> able to remap IR coming from a well-known point-release of LLVM, and also
> be able to detect, diagnose and reject input from sources I don't recognise
> (at the moment it just causes a crash).

The string content is predictable: it will begin with “LLVM3.8.0” or
“LLVM3.9.0”, etc. So you should be able to do exactly what you want.
The bitcode produced by our binaries has a very different string, and we use
this information to identify the producer as well.
What’s missing?
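
As an illustration of the client-side check described here, a short Python sketch follows. It assumes only what is stated above, namely that an upstream producer string begins with "LLVM" followed by a dotted three-part version; the function names, the return strings and the default `newest_supported` value are hypothetical.

```python
import re

# Upstream producer strings are assumed to start with e.g. "LLVM3.8.0".
_PRODUCER_RE = re.compile(r"LLVM(\d+)\.(\d+)\.(\d+)")


def parse_producer(producer):
    """Return (major, minor, patch) for an upstream-style string, else None."""
    m = _PRODUCER_RE.match(producer)
    if m is None:
        return None
    return tuple(int(g) for g in m.groups())


def check_producer(producer, newest_supported=(3, 8, 0)):
    """Classify a producer string as ok, too new, or unrecognised."""
    version = parse_producer(producer)
    if version is None:
        return "unknown producer"    # diagnose and reject instead of crashing
    if version > newest_supported:
        return "from the future"     # newer than this reader understands
    return "ok"


print(check_producer("LLVM3.8.0"))       # ok
print(check_producer("LLVM3.9.0"))       # from the future
print(check_producer("SomeVendor-1.0"))  # unknown producer
```

A vendor toolchain with its own string format would add a second pattern next to `_PRODUCER_RE` rather than reuse this one.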

— 
Mehdi



Martin J. O'Riordan via llvm-dev

2015-Dec-12 15:24 UTC

[llvm-dev] bitcode versioning

Actually, I wasn't requesting additional functionality of a version
resource, rather I was asking "if you were also providing an API to
it".  I guess the answer is that the proposal does not include a
programmatic abstraction or API.  This is not a problem, I can parse arbitrary
strings easily enough.

> >  o  Get the major number
> >  o  Get the minor number
> >  o  Get the patch number
>
> This would force a specific model for the version, which we didn’t want.

These are just examples for illustrating my response, not intended as specific
API requests - an API for the actual implemented version resource would of
course provide its own notion of the content of the resource and would naturally
derive from that implementation.

Thanks for clarifying this.

	MartinO

