thr3ads.net - llvm dev - [llvm-dev] Question about METADATA

If this information is useful, please help other people find it:
Share via:

Teresa Johnson via llvm-dev

2015-Oct-20 23:02 UTC

[llvm-dev] Question about METADATA_BLOCKs in bitcode

I have a question about module-level METADATA_BLOCKs in the bitcode.
There are currently two blocks with the METADATA_BLOCK id at module
scope. The first has the module-level metadata values (consisting of
some combination of METADATA_* record codes except for METADATA_KIND).
The second consists only of METADATA_KIND records. The latter is used
only in the METADATA_ATTACHMENT block within function blocks (for
metadata attached to instructions).

For ThinLTO we want to delay the parsing of module level metadata
until all functions have been imported from that module (there is some
bookkeeping used to suture it up when we read it during a post-pass).
However, I do need the METADATA_KIND records since that is used when
parsing the function body being imported. In my ThinLTO prototype I
was simply skipping the first METADATA_BLOCK and always parsing the
second, since I knew from inspecting the BitcodeWriter that the second
was the one with the METADATA_KIND records.

When working on the implementation of this handling for the upstream
patches, I am reassessing how to do this perhaps more robustly. A few
options:

1) Should the two module-level blocks use different block ids since
they hold different types of records? I.e. a METADATA_BLOCK_ID
(holding the actual metadata) and a new METADATA_KIND_BLOCK_ID (for
the metadata kind records). That way it is more obvious what the two
separate blocks are for and it is also much more trivial to identify
and skip just the former.

2) Should I simply continue to assume that we will have two
METADATA_BLOCKs at module level and that the kinds will always be in
the latter?

3) Another option is to peek inside the METADATA_BLOCK to check the
first record's type, and if it isn't METADATA_KIND, skip the rest of
the block. Looks like EnterSubBlock already optionally returns the
size of the block in words, so I could use that to compute the bitcode
offset to jump to before reading the first record, in case it is the
wrong METADATA_BLOCK and I want to skip it.

Thoughts?

Thanks,
Teresa

-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413

Duncan P. N. Exon Smith via llvm-dev

2015-Oct-20 23:51 UTC

head link

[llvm-dev] Question about METADATA_BLOCKs in bitcode

> On 2015-Oct-20, at 16:02, Teresa Johnson <tejohnson at google.com>
wrote:
> 
> I have a question about module-level METADATA_BLOCKs in the bitcode.
> There are currently two blocks with the METADATA_BLOCK id at module
> scope. The first has the module-level metadata values (consisting of
> some combination of METADATA_* record codes except for METADATA_KIND).
> The second consists only of METADATA_KIND records. The latter is used
> only in the METADATA_ATTACHMENT block within function blocks (for
> metadata attached to instructions).
> 
> For ThinLTO we want to delay the parsing of module level metadata
> until all functions have been imported from that module (there is some
> bookkeeping used to suture it up when we read it during a post-pass).
> However, I do need the METADATA_KIND records since that is used when
> parsing the function body being imported. In my ThinLTO prototype I
> was simply skipping the first METADATA_BLOCK and always parsing the
> second, since I knew from inspecting the BitcodeWriter that the second
> was the one with the METADATA_KIND records.
> 
> When working on the implementation of this handling for the upstream
> patches, I am reassessing how to do this perhaps more robustly. A few
> options:
> 
> 1) Should the two module-level blocks use different block ids since
> they hold different types of records? I.e. a METADATA_BLOCK_ID
> (holding the actual metadata) and a new METADATA_KIND_BLOCK_ID (for
> the metadata kind records). That way it is more obvious what the two
> separate blocks are for and it is also much more trivial to identify
> and skip just the former.
This SGTM.  The other options sound fragile (not generally correct).

Note that the reader will still have to understand the old schema.
> 2) Should I simply continue to assume that we will have two
> METADATA_BLOCKs at module level and that the kinds will always be in
> the latter?
> 
> 3) Another option is to peek inside the METADATA_BLOCK to check the
> first record's type, and if it isn't METADATA_KIND, skip the rest
of
> the block. Looks like EnterSubBlock already optionally returns the
> size of the block in words, so I could use that to compute the bitcode
> offset to jump to before reading the first record, in case it is the
> wrong METADATA_BLOCK and I want to skip it.
> 
> Thoughts?
> 
> Thanks,
> Teresa
> 
> -- 
> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413

Teresa Johnson via llvm-dev

2015-Oct-21 01:18 UTC

head link

[llvm-dev] Question about METADATA_BLOCKs in bitcode

On Tue, Oct 20, 2015 at 4:51 PM, Duncan P. N. Exon Smith
<dexonsmith at apple.com> wrote:>
>> On 2015-Oct-20, at 16:02, Teresa Johnson <tejohnson at
google.com> wrote:
>>
>> I have a question about module-level METADATA_BLOCKs in the bitcode.
>> There are currently two blocks with the METADATA_BLOCK id at module
>> scope. The first has the module-level metadata values (consisting of
>> some combination of METADATA_* record codes except for METADATA_KIND).
>> The second consists only of METADATA_KIND records. The latter is used
>> only in the METADATA_ATTACHMENT block within function blocks (for
>> metadata attached to instructions).
>>
>> For ThinLTO we want to delay the parsing of module level metadata
>> until all functions have been imported from that module (there is some
>> bookkeeping used to suture it up when we read it during a post-pass).
>> However, I do need the METADATA_KIND records since that is used when
>> parsing the function body being imported. In my ThinLTO prototype I
>> was simply skipping the first METADATA_BLOCK and always parsing the
>> second, since I knew from inspecting the BitcodeWriter that the second
>> was the one with the METADATA_KIND records.
>>
>> When working on the implementation of this handling for the upstream
>> patches, I am reassessing how to do this perhaps more robustly. A few
>> options:
>>
>> 1) Should the two module-level blocks use different block ids since
>> they hold different types of records? I.e. a METADATA_BLOCK_ID
>> (holding the actual metadata) and a new METADATA_KIND_BLOCK_ID (for
>> the metadata kind records). That way it is more obvious what the two
>> separate blocks are for and it is also much more trivial to identify
>> and skip just the former.
>
> This SGTM.  The other options sound fragile (not generally correct).
Ok good, I think this is cleaner too.
>
> Note that the reader will still have to understand the old schema.
I will keep support for reading the old format (ThinLTO won't support
it, but it needs recent bitcode anyway).

Thanks,
Teresa
>
>> 2) Should I simply continue to assume that we will have two
>> METADATA_BLOCKs at module level and that the kinds will always be in
>> the latter?
>>
>> 3) Another option is to peek inside the METADATA_BLOCK to check the
>> first record's type, and if it isn't METADATA_KIND, skip the
rest of
>> the block. Looks like EnterSubBlock already optionally returns the
>> size of the block in words, so I could use that to compute the bitcode
>> offset to jump to before reading the first record, in case it is the
>> wrong METADATA_BLOCK and I want to skip it.
>>
>> Thoughts?
>>
>> Thanks,
>> Teresa
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson at google.com |
408-460-2413
>


-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413

llvm dev - Oct 2015 - Question about METADATA_BLOCKs in bitcode

[llvm-dev] Question about METADATA_BLOCKs in bitcode

[llvm-dev] Question about METADATA_BLOCKs in bitcode

[llvm-dev] Question about METADATA_BLOCKs in bitcode