thr3ads.net - llvm dev - [llvm-dev] Hash of a module [Dec 2015]

If this information is useful, please help other people find it:
Share via:

Russell Wallace via llvm-dev

2015-Dec-20 12:43 UTC

[llvm-dev] Hash of a module

I want to run a bunch of optimizations, iteratively, that is keep running
until things stop changing (to make sure all optimization opportunities are
taken). As far as I know, there is no way to copy a module or compare
modules by value, so it occurs to me that a practical solution might be to
take the hash code of the module and see if that changes.

A problem is that hash algorithms are designed to work on streams of bytes,
not compound objects.

First attempt at a solution: iterate through all instructions in all
functions and hash the instruction kinds. I can think of some possible
changes that would fail to be captured by that.

Is there any already known solution?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151220/33d8506b/attachment.html>

Manuel Jacob via llvm-dev

2015-Dec-20 14:29 UTC

head link

[llvm-dev] Hash of a module

Hi Russell,

please take a look at the file lib/Transforms/IPO/MergeFunctions.cpp.  
It implements comparison and hashing of functions.  You can probably 
reuse some of it's code and extend the approach to modules.  
Alternatively,  you could try to use code from tools/llvm-diff/ 
(DifferenceEngine.cpp in particular).  They both implement similar 
functionality, one might be better suited for your particular use case.

There is code to create a copy of a module in 
lib/Transforms/Utils/CloneModule.cpp.

-Manuel

On 2015-12-20 13:43, Russell Wallace via llvm-dev wrote:> I want to run a bunch of optimizations, iteratively, that is keep 
> running
> until things stop changing (to make sure all optimization opportunities 
> are
> taken). As far as I know, there is no way to copy a module or compare
> modules by value, so it occurs to me that a practical solution might be 
> to
> take the hash code of the module and see if that changes.
> 
> A problem is that hash algorithms are designed to work on streams of 
> bytes,
> not compound objects.
> 
> First attempt at a solution: iterate through all instructions in all
> functions and hash the instruction kinds. I can think of some possible
> changes that would fail to be captured by that.
> 
> Is there any already known solution?
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Russell Wallace via llvm-dev

2015-Dec-21 00:05 UTC

head link

[llvm-dev] Hash of a module

Thanks! The code to create a copy of a module is just what I needed; having
done that, I've written my own code to compare modules for approximate
equality, since I realized for my purposes, a fast, coarse-grained
comparison suffices (since coarse-grained changes are the most likely to
create further optimization opportunities).

On Sun, Dec 20, 2015 at 2:29 PM, Manuel Jacob <me at manueljacob.de>
wrote:
> Hi Russell,
>
> please take a look at the file lib/Transforms/IPO/MergeFunctions.cpp.  It
> implements comparison and hashing of functions.  You can probably reuse
> some of it's code and extend the approach to modules.  Alternatively, 
you
> could try to use code from tools/llvm-diff/ (DifferenceEngine.cpp in
> particular).  They both implement similar functionality, one might be
> better suited for your particular use case.
>
> There is code to create a copy of a module in
> lib/Transforms/Utils/CloneModule.cpp.
>
> -Manuel
>
>
> On 2015-12-20 13:43, Russell Wallace via llvm-dev wrote:
>
>> I want to run a bunch of optimizations, iteratively, that is keep
running
>> until things stop changing (to make sure all optimization opportunities
>> are
>> taken). As far as I know, there is no way to copy a module or compare
>> modules by value, so it occurs to me that a practical solution might be
to
>> take the hash code of the module and see if that changes.
>>
>> A problem is that hash algorithms are designed to work on streams of
>> bytes,
>> not compound objects.
>>
>> First attempt at a solution: iterate through all instructions in all
>> functions and hash the instruction kinds. I can think of some possible
>> changes that would fail to be captured by that.
>>
>> Is there any already known solution?
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151221/08610dba/attachment.html>

Artur Pilipenko via llvm-dev

2015-Dec-21 12:36 UTC

head link

[llvm-dev] Hash of a module

Are you going to run some of the existing passes? Why can’t you just use the
returned change-made value from the passes?

Artur
> On 20 Dec 2015, at 15:43, Russell Wallace via llvm-dev <llvm-dev at
lists.llvm.org> wrote:
> 
> I want to run a bunch of optimizations, iteratively, that is keep running
until things stop changing (to make sure all optimization opportunities are
taken). As far as I know, there is no way to copy a module or compare modules by
value, so it occurs to me that a practical solution might be to take the hash
code of the module and see if that changes.
> 
> A problem is that hash algorithms are designed to work on streams of bytes,
not compound objects.
> 
> First attempt at a solution: iterate through all instructions in all
functions and hash the instruction kinds. I can think of some possible changes
that would fail to be captured by that.
> 
> Is there any already known solution?
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

Russell Wallace via llvm-dev

2015-Dec-21 16:59 UTC

head link

[llvm-dev] Hash of a module

Yes, I'm running all the existing passes that I know how to run. I
didn't
know they returned change-made. Thanks!

On Mon, Dec 21, 2015 at 12:36 PM, Artur Pilipenko <
apilipenko at azulsystems.com> wrote:
> Are you going to run some of the existing passes? Why can’t you just use
> the returned change-made value from the passes?
>
> Artur
>
> > On 20 Dec 2015, at 15:43, Russell Wallace via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> > I want to run a bunch of optimizations, iteratively, that is keep
> running until things stop changing (to make sure all optimization
> opportunities are taken). As far as I know, there is no way to copy a
> module or compare modules by value, so it occurs to me that a practical
> solution might be to take the hash code of the module and see if that
> changes.
> >
> > A problem is that hash algorithms are designed to work on streams of
> bytes, not compound objects.
> >
> > First attempt at a solution: iterate through all instructions in all
> functions and hash the instruction kinds. I can think of some possible
> changes that would fail to be captured by that.
> >
> > Is there any already known solution?
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151221/7e5e3ff0/attachment.html>

Kevin Modzelewski via llvm-dev

2015-Dec-23 06:24 UTC

head link

[llvm-dev] Hash of a module

I've tried this kind of thing before -- one thing to be aware of is that
some code will not converge on a single fixed point under repeated
optimizations, and instead will cycle through a couple different versions.
So regardless of the way you have of determining if changes were made, the
cycle case might need to be taken into account.  I handled it by saying
"stop if the number of IR instructions increases" which ended up being
a
decent heuristic for my use case (starting from something very unoptimized
so there were a lot of optimizations available).  This was probably a bug I
was running into and it might have been fixed (I think this was using llvm
3.3), so not sure if this will end up affecting you.

I also ran into another case where the "changes made" flag was set
despite
no changes in the IR.  My sense is that this feature is not used very often
so these kind of things can sneak in.  Anyway, here's the patch, not sure
if it is still relevant.
https://github.com/dropbox/pyston/blob/master/llvm_patches/0003-Update-TailCallElim-to-avoid-making-redundant-change.patch

kmod

On Sun, Dec 20, 2015 at 4:43 AM, Russell Wallace via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> I want to run a bunch of optimizations, iteratively, that is keep running
> until things stop changing (to make sure all optimization opportunities are
> taken). As far as I know, there is no way to copy a module or compare
> modules by value, so it occurs to me that a practical solution might be to
> take the hash code of the module and see if that changes.
>
> A problem is that hash algorithms are designed to work on streams of
> bytes, not compound objects.
>
> First attempt at a solution: iterate through all instructions in all
> functions and hash the instruction kinds. I can think of some possible
> changes that would fail to be captured by that.
>
> Is there any already known solution?
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20151222/cdb0fc12/attachment.html>

llvm dev - Dec 2015 - Hash of a module

[llvm-dev] Hash of a module

[llvm-dev] Hash of a module

[llvm-dev] Hash of a module

[llvm-dev] Hash of a module

[llvm-dev] Hash of a module

[llvm-dev] Hash of a module