Duncan P. N. Exon Smith
2015-Apr-09 19:37 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
> On 2015-Apr-09, at 11:06, David Blaikie <dblaikie at gmail.com> wrote:
>
> Late to the party because I figured other people would chime in, but I'll have a go...
>
> On Tue, Mar 31, 2015 at 7:10 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> A while back I finished up some work [1] that Chad started to preserve
> use-list-order in bitcode [2], hidden behind an "experimental" option
> called `-preserve-bc-use-list-order`.  I then added a similar
> `-preserve-ll-use-list-order` option for LLVM assembly [3].
>
> [1]: http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074604.html
> [2]: https://llvm.org/bugs/show_bug.cgi?id=5680
> [3]: https://llvm.org/bugs/show_bug.cgi?id=20515
>
> I'd like to move both of these options out of "experimental" mode, and
> turn `-preserve-bc-use-list-order` on by default.  I've attached a patch
> that does this.
>
> Why?
> ====
>
> Use-list order affects the output of some LLVM passes.  A typical
> example is a pass that walks through basic block predecessors.  The
> use-list order is deterministic, but not preserved when serializing the
> LLVM IR to bitcode or assembly.
>
> In rare (but frustrating) cases, this makes it difficult to reproduce a
> crash or miscompile from an intermediate result.
>
> For example, consider running an LTO build and serializing between the
> LTO optimization pipeline and the LTO codegen pipeline.  On SPEC,
> serializing to/from bitcode will change the output executable in 33
> benchmarks.  If you use `-preserve-bc-use-list-order`, all executables
> match.
>
> Why do you need those to match? What's the penalty/problem when these don't match?

You need these to match so that running a pass using `opt` or `llc` on
a dumped bitcode file gets you the same result as running the pass
directly.  This sort of thing (dumping out temporary results and
continuing from there) is useful when triaging bugs.  The problem is
that the bug may not reproduce at all when starting from the dumped
bitcode.

> Reproducibility for tests/bugs/etc seems important, but if it's any
> worse/better with a particular use list, that's a bug we should fix,
> right? Forcing a specific use list isn't going to fix those bugs, just
> make sure we make the same decision (good or bad) every time.

I'm not sure there's consensus that this is a bug.  It's not optimal,
but it's not clear to me that it's invalid to depend on use-list order.
In some cases, it may be a reasonable heuristic that improves compile
time (I'm not arguing for that, I'm just not convinced otherwise).

> What does it cost?
> ==================
>
> Manman collected a bunch of numbers, with `-preserve-bc-use-list-order`
> and `-preserve-ll-use-list-order` both turned on:
>
> - Time increase in LTO build time: negligible.
> - Filesize increase in bitcode files output by `clang -flto`:
>   - clang: 6.8556% (sum went from 310412640 to 331693296 bytes).
> - Filesize increase in merged bitcode files (output of
>   `ld -save-temps` when using `clang -flto` on Darwin):
>   - SPEC: 5.24% to 23.4% increase in file size.
>   - clang and related tools: 0.83% to 9.19% (details in
>     filesize_clang.txt, attached).
> - Time increase of running `llvm-dis` on merged bitcode files:
>   - 6.4% to 15% on SPEC benchmarks with running time >= 1 second.
>   - 7.9% to 14.5% on clang with running time >= 1 second (details in
>     dis_clang.txt, attached).
> - Time increase of running `llvm-as` on merged bitcode files:
>   - 3.5% to 39% on SPEC benchmarks with running time >= 1 second.
>   - 14.7% to 24.5% on clang with running time >= 1 second (details in
>     as_clang.txt, attached).
>
> These seem like pretty big costs to pay (bitcode size is going to be
> particularly important to Google - big projects under LTO, limits on the
> total size of the inputs to the link step, etc). To the point above, it's
> not clear why we'd want to pay that cost. (I'm partly playing devil's
> advocate here - I realize reproducibility is really important for a bunch
> of reasons, though this particular reproducibility is a marginal one
> (compared to "run the compiler twice on the same input and get two
> different answers"), but it seems like we've generally treated these
> issues as bugs and fixed the optimizations to be use-list-order
> independent in the past, no?)

(FWIW, there's some ancient discussion in PR5680 about this.)

I don't have a strong opinion on whether depending on use-list order
should be considered a bug.  However, it *is* a bug not to be able to
round-trip to bitcode and get the same end result.

While it may be possible to remove the compiler's dependency on use-list
order, no one has signed up to do the work, it's not clear what the
compile time cost would be, and there isn't consensus that it's the
right end goal.

In the meantime, this fixes the bug.  If/when that hypothetical work is
done and validated, we can turn this off by default if it makes sense to.

In terms of LTO bitcode size: serializing use-list order isn't actually
necessary/useful for the "normal" outputs of `clang -flto`.  It's
important for `clang -emit-llvm`, `clang [-flto] -save-temps`, and
`<gold/libLTO> -save-temps`, but serialization between "compile" and
"link" is a deterministic and reproducible step.  A possible
optimization would be to disable the option when writing `clang`'s
(final) output file in `clang -flto`.  Thoughts?

Alternatively, some portions of the aforementioned hypothetical work
might be low-hanging fruit.  E.g., it might not be hard to validate that
the use-list order of `ConstantInt`s doesn't affect output (and if it
does, it would probably be easy to fix); at the same time, I imagine
they account for a disproportionate percentage of the bitcode bloat
(they have large use-lists that change frequently).
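As a concrete illustration of the "pass that walks through basic block
predecessors" example above, here is a minimal sketch (not code from the
thread or the attached patch) using the standard pred_begin()/pred_end()
iterators from llvm/IR/CFG.h:

    #include "llvm/IR/BasicBlock.h"
    #include "llvm/IR/CFG.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    // Visit BB's predecessors in the order a pass would see them.  Each
    // predecessor reaches BB through a terminator that *uses* BB, so
    // pred_begin()/pred_end() walk BB's use-list: the visit order below is
    // exactly the use-list order, which a bitcode round trip can permute
    // unless that order is explicitly preserved.
    static void printPredecessorOrder(BasicBlock &BB) {
      errs() << BB.getName() << " preds:";
      for (pred_iterator PI = pred_begin(&BB), PE = pred_end(&BB); PI != PE; ++PI)
        errs() << " " << (*PI)->getName();
      errs() << "\n";
    }

Any heuristic that keys off "the first predecessor I see" in such a loop can
therefore make a different choice before and after serialization, which is the
reproducibility problem being described.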
David Blaikie
2015-Apr-10 16:12 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
On Thu, Apr 9, 2015 at 12:37 PM, Duncan P. N. Exon Smith
<dexonsmith at apple.com> wrote:

> [...]
>
> While it may be possible to remove the compiler's dependency on use-list
> order, no one has signed up to do the work, it's not clear what the
> compile time cost would be, and there isn't consensus that it's the
> right end goal.

I'd say the current solution, while you've signed up for/done the work, has a
pretty clear and significant cost, and I'm not sure about consensus (I figured
other people might chime in, which is why I didn't bother until now - Chandler
mentioned he'd tried and hadn't found many supporters of his dissenting
opinion, so I figured I'd have a go).

> In the meantime, this fixes the bug.  If/when that hypothetical work is
> done and validated, we can turn this off by default if it makes sense to.
>
> In terms of LTO bitcode size: serializing use-list order isn't actually
> necessary/useful for the "normal" outputs of `clang -flto`.  It's
> important for `clang -emit-llvm`, `clang [-flto] -save-temps`, and
> `<gold/libLTO> -save-temps`, but serialization between "compile" and
> "link" is a deterministic and reproducible step.  A possible
> optimization would be to disable the option when writing `clang`'s
> (final) output file in `clang -flto`.  Thoughts?

Sounds plausible - I'd be inclined to go a step further and make this opt-in
for tools/actions rather than opt-out for `clang -flto`.

I'm assuming the -emit-llvm and -save-temps behavior is important for use as
debugging tools - or is there some other reason you're relying on those
features not to introduce variation? If it's a matter of "LLVM debugging tools
should enable use-list-order preservation to make the lives of LLVM developers
easier", then I think it makes sense for us to opt in all our debugging tools
to do this but leave the default behavior alone, so that people using LLVM as
a library don't pay this extra cost in their production pipelines (but, like
us, can enable it with a flag when needed).

- David
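To make the opt-in split David describes concrete, here is a hedged sketch.
It is written against the newer llvm/Bitcode/BitcodeWriter.h interface, where
the writer takes a ShouldPreserveUseListOrder flag; at the time of this thread
the behavior was controlled by the `-preserve-bc-use-list-order` cl::opt, so
the header and parameter here are assumptions about a later API, not the code
under discussion.

    // Sketch only: illustrates "production output stays cheap, debugging
    // tools opt in", not the API that existed in April 2015.
    #include "llvm/Bitcode/BitcodeWriter.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace llvm;

    // Production pipeline: keep the default (no use-list-order blocks,
    // smaller bitcode, faster writes).
    void writeForProduction(const Module &M, raw_ostream &OS) {
      WriteBitcodeToFile(M, OS);
    }

    // Debugging/triage output (-save-temps, -emit-llvm style dumps): opt
    // in so that reloading the dump reproduces the exact use-list order.
    void writeForTriage(const Module &M, raw_ostream &OS) {
      WriteBitcodeToFile(M, OS, /*ShouldPreserveUseListOrder=*/true);
    }

The point of the split is that library users writing "production" bitcode
never pay for the use-list-order blocks unless they explicitly ask for them.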
Chandler Carruth
2015-Apr-10 19:49 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
FWIW, I completely agree with everything David has said here.  Just so my
silence isn't confusing.

On Fri, Apr 10, 2015 at 9:16 AM David Blaikie <dblaikie at gmail.com> wrote:

> [...]
Duncan P. N. Exon Smith
2015-Apr-14 16:06 UTC
[LLVMdev] [RFC] Setting preserve-bc-use-list-order=true by default
> On 2015-Apr-10, at 09:12, David Blaikie <dblaikie at gmail.com> wrote:
>
> [...]
>
> Sounds plausible - I'd be inclined to go a step further and make this
> opt-in for tools/actions rather than opt-out for clang -flto.
>
> I'm assuming -emit-llvm and -save-temps behavior is important to use as
> debugging tools -

Yup, that's the main reason.

> or is there some other reason you're relying on those features not to
> introduce variation? If it's a matter of "LLVM debugging tools should
> enable use list order preservation to make the lives of LLVM developers
> easier" then I think it makes sense for us to opt in all our debugging
> tools to do this but leave the default behavior to not incur this extra
> cost - so that people using LLVM as a library don't pay this extra cost
> in their production pipelines (but, like us, can enable it with a flag
> when needed).

Okay, I'm convinced.  I'll make it that way over the next couple of days.

> - David
>
> > Alternatively, some portions of the aforementioned hypothetical work
> > might be low-hanging fruit.  E.g., it might not be hard to validate
> > that the use-list order of `ConstantInt`s doesn't affect output (and
> > if it does, it would probably be easy to fix); at the same time, I
> > imagine they account for a disproportionate percentage of the bitcode
> > bloat (they have large use-lists that change frequently).
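A rough sketch of the kind of validation being suggested for `ConstantInt`
use-lists (not code from the thread): perturb every ConstantInt's use-list,
rerun the pipeline, and diff the output.  It assumes Value::reverseUseList(),
the helper the in-tree verify-uselistorder tool relies on; if that member is
unavailable in a given LLVM revision, treat it as a placeholder.  The sketch
also only reaches constants used directly as instruction operands, so it is a
cheap signal rather than exhaustive coverage.

    #include "llvm/IR/Constants.h"
    #include "llvm/IR/Module.h"
    #include "llvm/Support/Casting.h"
    #include <set>

    using namespace llvm;

    // Reverse the use-list of every ConstantInt referenced by an
    // instruction operand.  If a subsequent codegen run produces a
    // different object file than the unperturbed module, some pass is
    // depending on ConstantInt use-list order.
    static void perturbConstantIntUseLists(Module &M) {
      std::set<ConstantInt *> Seen;
      for (Function &F : M)
        for (BasicBlock &BB : F)
          for (Instruction &I : BB)
            for (Value *Op : I.operands())
              if (auto *CI = dyn_cast<ConstantInt>(Op))
                if (Seen.insert(CI).second)
                  CI->reverseUseList(); // permute the order users are visited
    }

Running the pipeline once on the original module and once on a perturbed copy,
then diffing the outputs, would give a first answer on whether ConstantInt
use-list order needs to be serialized at all.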