thr3ads.net - llvm dev - [llvm-dev] Representations of IR in the output of opt [May 2019]

Sébastien Michelland via llvm-dev

2019-May-27 20:13 UTC

[llvm-dev] Representations of IR in the output of opt

Hi Mehdi,

Thank you for mentioning this tool; I was looking for something like 
this. By default the analyzer produces identical output on both files, 
but a complete -dump shows that the storage order of the symbol table is 
different.

This would explain why text files are not affected: the symbols are used 
directly in text form so there is no need for this table.
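
For readers who want to repeat this comparison, here is a minimal sketch. It relies only on the `llvm-bcanalyzer -dump` invocation mentioned above; the helper name `dump_diff` and the `BCANALYZER` override variable are hypothetical, introduced here for illustration.

```shell
#!/bin/sh
# Sketch: diff the full llvm-bcanalyzer dumps of two bitcode files.
# dump_diff and BCANALYZER are illustrative names, not part of LLVM.
dump_diff() {
    analyzer="${BCANALYZER:-llvm-bcanalyzer}"
    tmp1=$(mktemp) || return 2
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 2; }
    # -dump prints every block and record instead of only the summary.
    # Reading from stdin keeps both invocations identical apart from
    # the input bytes.
    "$analyzer" -dump < "$1" > "$tmp1"
    "$analyzer" -dump < "$2" > "$tmp2"
    status=0
    diff "$tmp1" "$tmp2" || status=$?
    rm -f "$tmp1" "$tmp2"
    return "$status"
}
```

Calling `dump_diff stress-1.bc stress-2.bc` should then print only the records that differ between the two files and return a non-zero status when there are any.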

I suppose that settles the question of where the difference lies. Out of 
curiosity, I'd like to know if there is a way to order the table in a 
canonical form. I found -preserve-bc-uselistorder, which seems more 
relevant than -preserve-ll-uselistorder (and seems to correspond because 
the table lists all uses of each symbol), but no luck yet.

At least now I'm sure that there is no semantic difference between the 
programs so it's a great help. :D

Thanks,
Sébastien Michelland

On 5/27/19 1:48 PM, Mehdi AMINI wrote:
> Hi,
> 
> I would try running llvm-bcanalyzer on these bc files; this may help 
> to understand where the discrepancy is coming from.
> 
> Best,
> 
> -- 
> Mehdi
> 
> 
> On Mon, May 27, 2019 at 10:42 AM Sébastien Michelland via llvm-dev 
> <llvm-dev at lists.llvm.org> wrote:
> 
>     Hi Eli,
> 
>     Unfortunately the differences remain; I do not observe a significant
>     change in the output besides the fact that it's random.
> 
>     I noticed that running opt without options on the random file changes
>     the order of references in the predecessors of basic blocks (sample
>     below). Further invocations of opt are idempotent.
> 
>     I don't know if this information is stored in the bitcode file as
>     well.
> 
>     < ; preds = %CF, %CF80, %CF78
>     > ; preds = %CF80, %CF, %CF78
> 
>     FWIW, the conflicting section of the bitcode file is likely not a
>     permutation because the byte patterns don't match (some of the byte
>     values of stress-1.bc are not present in stress-2.bc).
> 
>     Thanks for your help :)
>     Sébastien Michelland
> 
>     On 5/24/19 5:32 PM, Eli Friedman wrote:
>      > Are you passing -preserve-ll-uselistorder when you create the .ll
>      > files?  It's off by default because the output tends to be sort of
>      > unreadable, but it could explain some of the differences you're
>      > seeing.
>      >
>      > -Eli
>      >
>      >> -----Original Message-----
>      >> From: llvm-dev <llvm-dev-bounces at lists.llvm.org> On Behalf Of
>      >> Sébastien Michelland via llvm-dev
>      >> Sent: Friday, May 24, 2019 12:53 PM
>      >> To: llvm-dev at lists.llvm.org
>      >> Subject: [EXT] [llvm-dev] Representations of IR in the output of opt
>      >>
>      >> Hi LLVM,
>      >>
>      >> I'm currently setting up some tools to investigate the influence
>      >> of the order of optimization passes on the performance of compiled
>      >> programs - nothing exceptional here.
>      >>
>      >> I noticed something inconvenient with opt, namely that splitting
>      >> a call does not always give the same output:
>      >>
>      >> % llvm-stress > stress.ll
>      >> % opt -dse -verify -dce stress.ll -o stress-1.bc
>      >> % opt -dse stress.ll | opt -dce -o stress-2.bc
>      >> % diff stress-{1,2}.bc
>      >> Binary files stress-1.bc and stress-2.bc differ
>      >>
>      >> The difference seems meaningful; it's ~180 bytes out of ~1400
>      >> bytes of output in my random case. I can't decode it, however,
>      >> because disassembling the bitcode produces identical text files,
>      >> even with annotations. (!)
>      >>
>      >> I made sure that the sequence for [-dse -verify -dce] is the
>      >> concatenation of the individual sequences; this falls in place
>      >> naturally because -dce has no dependencies. The verifier pass helps
>      >> make two function pass managers, just in case.
>      >>
>      >> Now if I do the same thing but staying in text format, I get the
>      >> same IR (up to module name):
>      >>
>      >> % opt -S -dse -verify -dce stress.ll -o stress-1.ll
>      >> % opt -S -dse stress.ll | opt -S -dce -o stress-2.ll
>      >> % diff -y --suppress-common-lines stress-{1,2}.ll
>      >> ; ModuleID = 'stress.ll'     |       ; ModuleID = '<stdin>'
>      >>
>      >> Is there a specific behavior of opt that could explain this
>      >> situation? What kind of difference could there be in the bitcode
>      >> files that is lost in translation to text format?
>      >>
>      >> Cheers,
>      >> Sébastien Michelland
>      >>
>      >> _______________________________________________
>      >> LLVM Developers mailing list
>      >> llvm-dev at lists.llvm.org
>      >> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>

Mehdi AMINI via llvm-dev

2019-May-27 20:15 UTC

On Mon, May 27, 2019 at 1:12 PM Sébastien Michelland <sebastien.michelland at ens-lyon.fr> wrote:
> Hi Mehdi,
>
> Thank you for mentioning this tool, I was looking for something like
> this. By default the analyzer produces identical output on both files,
> but a complete -dump shows that the storage order of the symbol table is
> different.
>
Thanks for the update!
It may be desirable to sort the table before writing the bitcode out,
adding Peter to the thread for his opinion.

-- 
Mehdi



Sébastien Michelland via llvm-dev

2019-May-30 20:41 UTC

Hello again,

> It may be desirable to sort the table before writing the bitcode out, 
> adding Peter to the thread for his opinion.

Thanks for this!

Now it seems I was too optimistic about this result. I have instrumented 
the test suite to check it on a larger set of files and quickly 
discovered that it fails for longer optimization sequences.

In particular, the default -O3 set in which I'm interested is not 
reproduced easily. I'm attaching a script that demonstrates this.

It contains the extracted -O3 set in two groups, and checks that [opt 
-debug-pass=Arguments] reports the same sequences when called with -O3 
and the individual arguments. If a file name is provided, it also checks 
that the outputs are the same (or in our case, different).
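
The first of these checks might look roughly like the sketch below. It is not the attached script: `pass_args`, `same_sequence` and the `OPT` override variable are hypothetical names introduced for illustration, and it assumes an opt build in which -debug-pass=Arguments prints `Pass Arguments:` lines on stderr.

```shell
#!/bin/sh
# Sketch only, not the attached not-associative.sh.
# pass_args and same_sequence are hypothetical helpers; OPT lets the
# opt binary be overridden.

# Print the "Pass Arguments:" lines that opt reports for the given flags.
pass_args() {
    "${OPT:-opt}" -disable-output -debug-pass=Arguments "$@" 2>&1 |
        grep '^Pass Arguments:'
}

# Succeed when two flag lists make opt report the same pass sequences.
# $1 is the input file; $2 and $3 are space-separated flag lists.
same_sequence() {
    # Unquoted on purpose: each flag list must split into separate words.
    first=$(pass_args $2 "$1")
    second=$(pass_args $3 "$1")
    [ "$first" = "$second" ]
}
```

With the extracted list stored in a variable (called `PASSES` here purely as an example), the check would be `same_sequence input.bc "-O3" "$PASSES"`.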

Many real files fail to pass this test, for instance bilateral_grid.bc:

<https://github.com/llvm/llvm-test-suite/blob/master/Bitcode/Benchmarks/Halide/bilateral_grid/bilateral_grid.bc>

The diffs are very large even in text mode, and include lots of code.

I'm puzzled again. Any clue on the behavior of opt is very welcome. :)

Cheers,
Sébastien Michelland
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not-associative.sh
Type: application/x-shellscript
Size: 4156 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20190530/67f4fbdf/attachment.bin>
