thr3ads.net - llvm dev - [llvm-dev] Representations of IR in the output of opt [May 2019]

Sébastien Michelland via llvm-dev

2019-May-24 19:52 UTC

[llvm-dev] Representations of IR in the output of opt

Hi LLVM,

I'm currently setting up some tools to investigate the influence of the 
order of optimization passes on the performance of compiled programs 
-nothing exceptional here.

I noticed something inconvenient with opt, namely that splitting a call 
does not always give the same output:

% llvm-stress > stress.ll
% opt -dse -verify -dce stress.ll -o stress-1.bc
% opt -dse stress.ll | opt -dce -o stress-2.bc
% diff stress-{1,2}.bc
Binary files stress-1.bc and stress-2.bc differ

The difference seems meaningful; it's ~180 bytes out of ~1400 bytes of 
output in my random case. I can't decode it however, because 
disassembling the bytecode produces identical text files, even with 
annotations. (!)

I made sure that the sequence for [-dse -verify -dce] is the 
concatenation of the individual sequences; this falls in place naturally 
because -dce has no dependencies. The verifier pass helps make two 
function pass managers, just in case.

Now if I do the same thing but staying in text format, I get the same IR 
(up to module name):

% opt -S -dse -verify -dce stress.ll -o stress-1.ll
% opt -S -dse stress.ll | opt -S -dce -o stress-2.ll
% diff -y --suppress-common-lines stress-{1,2}.ll
; ModuleID = 'stress.ll'	|	; ModuleID = '<stdin>'

Is there a specific behavior of opt that could explain this situation?
What kind of difference could there be in the bytecode files that is
lost in translation to text format?

Cheers,
Sébastien Michelland
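
To see where two .bc files diverge before reaching for a bitcode-aware tool, one option is to list the byte ranges that differ. The following is a minimal sketch, not something from the thread: `differing_ranges` is a hypothetical helper name, and the file names are simply the ones used in the commands above. Because bitcode is a bitstream with variable-width fields, the byte offsets only approximate where the encoded records differ.

```python
import sys


def differing_ranges(a: bytes, b: bytes):
    """Return half-open (start, end) ranges where a and b differ byte-wise.

    Offsets past the end of the shorter input count as differing.
    """
    n = max(len(a), len(b))
    ranges = []
    start = None  # start offset of the current run of differing bytes
    for i in range(n):
        same = i < len(a) and i < len(b) and a[i] == b[i]
        if not same and start is None:
            start = i
        elif same and start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, n))
    return ranges


if __name__ == "__main__" and len(sys.argv) == 3:
    # Example: python diffranges.py stress-1.bc stress-2.bc
    with open(sys.argv[1], "rb") as f1, open(sys.argv[2], "rb") as f2:
        for lo, hi in differing_ranges(f1.read(), f2.read()):
            print(f"bytes {lo}..{hi - 1} differ ({hi - lo} bytes)")
```

A single contiguous range suggests one block of the file is encoded differently; many small scattered ranges would point to something else.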

Eli Friedman via llvm-dev

2019-May-24 21:32 UTC

Are you passing -preserve-ll-uselistorder when you create the .ll files? 
It's off by default because the output tends to be sort of unreadable, but
it could explain some of the differences you're seeing.

-Eli

Sébastien Michelland via llvm-dev

2019-May-27 12:46 UTC

Hi Eli,

Unfortunately the differences remain. I do not observe a significant
change in the output, besides the fact that it's random.

I noticed that running opt without options on the random file changes 
the order of references in the predecessors of basic blocks (sample 
below). Further invocations of opt are idempotent.

I don't know if this information is stored in the bytecode file as well.

< ; preds = %CF, %CF80, %CF78
> ; preds = %CF80, %CF, %CF78
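
The `; preds = ...` annotations are comments in the textual IR, so a reordering there does not change what the .ll file parses to. To confirm that two .ll files differ only in predecessor order, one could sort the names in each such comment before diffing. This is a rough sketch under that assumption; `normalize_preds` and `normalize_module` are hypothetical helper names.

```python
import re

# Matches the trailing "; preds = %a, %b, ..." comment on a basic-block label line.
PREDS_RE = re.compile(r";\s*preds\s*=\s*(.*)$")


def normalize_preds(line: str) -> str:
    """Sort the predecessor names in a '; preds = ...' comment.

    Expects a single line without its trailing newline; lines that carry
    no preds comment are returned unchanged.
    """
    m = PREDS_RE.search(line)
    if not m:
        return line
    names = sorted(p.strip() for p in m.group(1).split(","))
    return line[:m.start()] + "; preds = " + ", ".join(names)


def normalize_module(text: str) -> str:
    """Apply normalize_preds to every line of a textual IR module."""
    return "\n".join(normalize_preds(l) for l in text.splitlines())
```

If the normalized texts compare equal, predecessor order was the only textual difference.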

FWIW, the conflicting section of the bytecode file is likely not a
permutation because the byte patterns don't match (some of the byte
values of stress-1.bc are not present in stress-2.bc).

Thanks for your help :)
Sébastien Michelland
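
The permutation check described in this message can be sketched as a comparison of byte counts over the two conflicting sections. This is only an illustration with hypothetical helper names. It is also only suggestive: bitcode fields are not byte-aligned, so records written in a different order need not show up as a byte-level permutation.

```python
from collections import Counter


def is_byte_permutation(a: bytes, b: bytes) -> bool:
    """True if b contains exactly the same bytes as a, in any order."""
    return Counter(a) == Counter(b)


def bytes_only_in_first(a: bytes, b: bytes):
    """Sorted byte values that occur in a but never in b."""
    return sorted(set(a) - set(b))
```

A non-empty result from `bytes_only_in_first` on the two differing sections corresponds to the observation above that some byte values of stress-1.bc are absent from stress-2.bc.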

