Dear All,
Could someone please explain in more detail how to implement an xxxHazardRecognizer?
An example would be appreciated.
Thanks,
Huy
-----Original Message-----
From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of via
llvm-dev
Sent: Saturday, July 16, 2016 5:16 AM
To: llvm-dev at lists.llvm.org
Subject: llvm-dev Digest, Vol 145, Issue 76
Today's Topics:
1. Re: RFC: Strong GC References in LLVM (Sanjoy Das via llvm-dev)
2. Re: AArch64 testsuite buildbots timeout (Bill Seurer via llvm-dev)
3. Re: RFC: Strong GC References in LLVM (Daniel Berlin via llvm-dev)
4. Re: clone function (Pierre Gagelin via llvm-dev)
5. Re: AArch64 testsuite buildbots timeout
(Renato Golin via llvm-dev)
6. Re: RFC: Strong GC References in LLVM (Daniel Berlin via llvm-dev)
7. Re: RFC: To add __attribute__((regmask("preserve/clobbered
list here"))) in clang (Joerg Sonnenberger via llvm-dev)
8. Re: AArch64 testsuite buildbots timeout (Bill Seurer via llvm-dev)
9. Re: RFC: Strong GC References in LLVM (Sanjoy Das via llvm-dev)
10. Re: RFC: Strong GC References in LLVM (Daniel Berlin via llvm-dev)
11. Re: RFC: Strong GC References in LLVM (Sanjoy Das via llvm-dev)
12. Re: RFC: To add __attribute__((regmask("preserve/clobbered
list here"))) in clang (Mehdi Amini via llvm-dev)
13. Re: RFC: Strong GC References in LLVM (Daniel Berlin via llvm-dev)
14. Re: RFC: Strong GC References in LLVM (Daniel Berlin via llvm-dev)
----------------------------------------------------------------------
Message: 1
Date: Fri, 15 Jul 2016 12:21:09 -0700
From: Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org>
To: Daniel Berlin <dberlin at dberlin.org>
Cc: Oscar Blumberg <oscar.blumberg at normalesup.org>, llvm-dev
<llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Strong GC References in LLVM
Message-ID: <578937A5.2090808 at playingwithpointers.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi Daniel,
Daniel Berlin wrote:
> As a starting point, LLVM will conservatively not speculate such
> loads and stores; and will leave open the potential to upstream
> logic that will have a more precise sense of when these loads and
> stores are safe to speculate.
>
>
> I think you need to define what you mean by control dependence here. If
> you mean speculation, you should say speculation :)
Apologies for being non-specific -- this is really just "don't
speculate".
> As you describe below, it is not enough to simply not speculate them.
I'm not sure where I said that?
> You also are saying you don't want to change the conditions on which
> they execute.
> That is very different from speculation.
If I implied that somehow then I (or the example) was wrong. :)
We can't speculate these instructions (without special knowledge of the GC
and the Java type system), and that's it.
> FWIW: This raises one of the same issues we have now with may-throw,
> which is that, if all you have is a flag on the instruction, now you
> have to look at every instruction in every block to know whether a
> *CFG* transform is correct.
>
> That means any pass that wants to just touch the CFG can't do so
> without also looking at the instruction stream. It will also make a
> bunch of things currently O(N), O(N^2) (see the sets of patches
> fixing may-throw places, and extrapolate to more places).
As I said, I'm only proposing a "don't speculate" flag, so
this does not (?) apply.
However, I didn't quite understand your point about may-throw -- how is
may-throw different from a generic side-effect (volatile store, syscall etc.)?
All of those can't be hoisted or sunk -- we have to make sure that they
execute in semantically the same conditions that they did in the original
program.
> This is theoretically fixable in each pass by taking a single pass over
> the instruction stream and marking which blocks have these instructions,
> etc, and then using that info.
>
> But we shouldn't have to do that in each pass, especially if they
> just want to make CFG manipulations.
>
> The TL;DR I would really like to see this also made a BB level flag
> that says whether the block contains instructions with unknown control
> dependences (or really, 1 bb level flag for each attribute you
> introduce).
>
> I don't think this is terribly difficult, and at the very least, will
> keep us from having to look at every instruction in the block in the
> common case that there is nothing in the block to worry about ;)
>
> Also note that these CFG transforms will also now need post-dominance
> info in a bunch of cases to sanely determine if they are changing the
> control dependence structure.
>
> Let me ask another question:
>
> Given
>
> %x = malloc() ;; known thread local
> if (cond_0) {
> store GCREF %val to %x
> }
> if (cond_1) {
> store i64 %val to %x
> }
> Assume i can prove cond0 and cond1 the same.
>
>
> I change this to:
> %x = malloc() ;; known thread local
> if (cond_0) {
> store i64 %val to %x
> store GCREF %val to %x
> }
>
> Is this okay to happen?
Yes. The only restriction is you can't issue a GCREF load or store that
wouldn't have been issued in the original program (even if the location is
thread local, in case of stores).
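To make the rule concrete, here is a minimal sketch of the check, assuming a toy model rather than real LLVM IR. The names MemOp and respectsNoSpeculation are hypothetical, and guards are assumed to be canonicalized so that conditions proven equal (like cond_0 and cond_1 above) share one id.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy model, not LLVM API. Guard 0 means "executes unconditionally";
// any other value names a canonicalized branch condition.
struct MemOp {
  int id;        // identifies the same access before and after a transform
  bool isGCRef;  // true for a GC-reference load or store
  int guard;     // condition under which the access executes
};

// Returns true if no GCREF access executes under a condition it was not
// already executing under in the original program.
bool respectsNoSpeculation(const std::vector<MemOp> &original,
                           const std::vector<MemOp> &transformed) {
  std::map<int, int> originalGuard;
  for (const MemOp &op : original) {
    assert(op.guard >= 0 && "guards are non-negative ids");
    originalGuard[op.id] = op.guard;
  }

  for (const MemOp &op : transformed) {
    if (!op.isGCRef)
      continue;  // non-GCREF accesses are not restricted by this rule
    auto it = originalGuard.find(op.id);
    if (it == originalGuard.end())
      return false;  // a brand-new GCREF access was introduced
    bool wasUnconditional = it->second == 0;
    if (!wasUnconditional && op.guard != it->second)
      return false;  // conservatively treat any guard change as speculation
  }
  return true;
}
```

Under this model, merging the two guarded stores into one block keeps the GCREF store under the same guard and passes, while hoisting it out of the `if` changes its guard to 0 and is rejected.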
-- Sanjoy
> Note that i did not actually change the control dependence of it by the
> normal definition of control dependence.
>
> But you can end up "hoisting" or "sinking" instructions by moving every
> other instruction :)
>
> Is that okay, or are they really barriers as well (like may throw), in
> which case, they probably need some real representation in control flow
> if you want to make most algorithms O(N) (you can get away with the bb
> level flags if you are willing to make some algorithms N^2 in cases
> where these things exist).
------------------------------
Message: 2
Date: Fri, 15 Jul 2016 14:36:38 -0500
From: Bill Seurer via llvm-dev <llvm-dev at lists.llvm.org>
To: Diana Picus <diana.picus at linaro.org>, llvm-dev
<llvm-dev at lists.llvm.org>, cfe-dev <cfe-dev at lists.llvm.org>
Subject: Re: [llvm-dev] AArch64 testsuite buildbots timeout
Message-ID: <57893B46.9000801 at linux.vnet.ibm.com>
Content-Type: text/plain; charset=utf-8; format=flowed
On 07/14/16 08:50, Diana Picus via llvm-dev wrote:
> Some of our AArch64 bots have started timing out while compiling
> SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.
>
> On clang-cmake-aarch64-quick the failure first appears between r275337
> and r275351, and on clang-cmake-aarch64-full it appears after r275352,
> so there isn't a clear culprit for this. I suspect we have been
> slowly approaching that threshold for a while.
>
> Both Renato and I are currently travelling, so we can't investigate
> this until Monday. I can't really think of any temporary solution
> other than bumping the timeout threshold, but IIUC that would affect
> all the bots.
I have a couple of powerpc bots that on occasion time out as well. It
doesn't happen very often and I have been trying to tweak things to avoid it
but it is still an ongoing issue.
At least a few of the factories have the ability to override the timeout setting
on a per-slave basis. See the SanitizerBuildFactory and LLVMCMakeBuildFactory
factories for instance. Perhaps it is time to add this to all (or at least
more) of the factories.
--
-Bill Seurer
------------------------------
Message: 3
Date: Fri, 15 Jul 2016 12:36:57 -0700
From: Daniel Berlin via llvm-dev <llvm-dev at lists.llvm.org>
To: Sanjoy Das <sanjoy at playingwithpointers.com>
Cc: Oscar Blumberg <oscar.blumberg at normalesup.org>, llvm-dev
<llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Strong GC References in LLVM
Message-ID:
<CAF4BwTXYW5htMiuz5o9nSUm=rCZFiSsBWSwXf_wVyWY5YQPqvg at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
On Fri, Jul 15, 2016 at 12:21 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> Hi Daniel,
>
> Daniel Berlin wrote:
> > As a starting point, LLVM will conservatively not speculate such
> > loads and stores; and will leave open the potential to upstream
> > logic that will have a more precise sense of when these loads and
> > stores are safe to speculate.
> >
> >
> > I think you need to define what you mean by control dependence here.
> > If you mean speculation, you should say speculation :)
>
> Apologies for being non-specific -- this is really just "don't
> speculate".
>
> > As you describe below, it is not enough to simply not speculate them.
>
> I'm not sure where I said that?
>
>
> > You also are saying you don't want to change the conditions on which
> > they execute.
> > That is very different from speculation.
>
> If I implied that somehow then I (or the example) was wrong. :)
>
:)
>
> We can't speculate these instructions (without special knowledge of
> the GC and the Java type system), and that's it.
Okay.
>
>
> > FWIW: This raises one of the same issues we have now with may-throw,
> > which is that, if all you have is a flag on the instruction, now you
> > have to look at every instruction in every block to know whether a
> > *CFG* transform is correct.
> >
> > That means any pass that wants to just touch the CFG can't do so
> > without also looking at the instruction stream. It will also make a
> > bunch of things currently O(N), O(N^2) (see the sets of patches
> > fixing may-throw places, and extrapolate to more places).
>
> As I said, I'm only proposing a "don't speculate" flag, so this does
> not (?) apply.
>
As long as it applies only to the instructions, and they do not act as
"barriers" to hoisting/sinking, then yes, it should not apply.
(In theory it still means things have to look at instructions, but they had to
look at them anyway at that point :P)
>
> However, I didn't quite understand your point about may-throw -- how
> is may-throw different from a generic side-effect (volatile store,
> syscall etc.)? All of those can't be hoisted or sunk -- we have to
> make sure that they execute in semantically the same conditions that
> they did in the original program.
>
may-throw is, AFAIK, worse. May-throw calls act as barriers to sinking
*other things*. You cannot sink a store past a may-throw, or hoist a load
above them. You can't optimize stores across them either.
See:
[PATCH] D21007: DSE: Don't remove stores made live by a call which unwinds.
for the latter, and
[llvm] r270828 - [MergedLoadStoreMotion] Don't transform across may-throw
calls
for the former.
"It is unsafe to hoist a load before a function call which may throw, the
throw might prevent a pointer dereference.
Likewise, it is unsafe to sink a store after a call which may throw.
The caller might be able to observe the difference."
This then leads to the problem i mentioned - because the may-throwness is not
expressed at the bb level (or in the CFG, by having the call end the block, or
at the least, a fake abnormal CFG edge), everything has to go checking every
instruction along the entire path they want to hoist, whereas hoisting is
normally just a simple dataflow problem with BB level properties :)
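
As a rough illustration of the BB-level idea, here is a minimal sketch over a toy model (Inst, Block, computeMayThrowFlags and canHoistAcross are hypothetical names, not LLVM API). It scans the instruction stream once and then answers hoisting queries with one lookup per block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model, not LLVM API.
struct Inst { bool mayThrow; };
struct Block { std::vector<Inst> insts; };

// One linear scan over the instruction stream; result[i] is true when
// block i contains at least one may-throw instruction.
std::vector<bool> computeMayThrowFlags(const std::vector<Block> &blocks) {
  std::vector<bool> flags(blocks.size(), false);
  for (std::size_t i = 0; i < blocks.size(); ++i) {
    for (const Inst &inst : blocks[i].insts) {
      if (inst.mayThrow) {
        flags[i] = true;
        break;
      }
    }
  }
  return flags;
}

// A load may be hoisted across the whole blocks listed in `path` only if
// none of them is flagged. Cost is one lookup per block, not per instruction.
bool canHoistAcross(const std::vector<bool> &flags,
                    const std::vector<std::size_t> &path) {
  for (std::size_t b : path) {
    assert(b < flags.size() && "path refers to an unknown block");
    if (flags[b])
      return false;
  }
  return true;
}
```

The flags are only as good as their maintenance: any pass that inserts or removes a may-throw instruction would have to update them, which is the cost this sketch leaves out.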
------------------------------
Message: 4
Date: Fri, 15 Jul 2016 20:42:10 +0100
From: Pierre Gagelin via llvm-dev <llvm-dev at lists.llvm.org>
To: Mehdi Amini <mehdi.amini at apple.com>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] clone function
Message-ID:
<CALAmmgS5oux78iQrNMWG515X0MNZ1EBGT3V36RUPY+hwmrj6vA at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi again,
I finally managed to find the solution looking at CloneFunctionInto
implementation.
> Not sure what you mean by "you can't clone the function if there are
> arguments", CloneFunction is supposed to handle this.
>
>
I meant that you can't directly splice the function body from a function
with arguments (at least if those arguments are used in the function body).
That's what I tried first as a "hand-made" solution. Then I had to do the
argument mapping myself, and I thought this should probably exist somewhere
in the LLVM API...
>
> The instructions using these arguments will have Values referring to
> another function body which triggers errors like:
>
> Referring to an argument in another function!
> store i8* %ptr, i8** %ptr.addr, align 8
>
> for a function taking i8* %ptr as argument.
>
>
> If you end up with this by *only* calling CloneFunction, that’s a bug.
> It is supposed to handle this perfectly.
>
Nope, that happens when the arguments weren't remapped. I didn't manage to
use CloneFunction because I didn't get how to build the ValueToValueMapTy
argument (given by "typedef ValueMap<const Value*, WeakVH>
llvm::ValueToValueMapTy").
In the end, when I looked at the implementation, they used it like a
std::map<Value*, Value*>... I'm still confused on that point. Well, at
least it does make sense to remap the arguments with a map like that. And that
was my initial question, by the way.
> But I think CloneFunctionInto allows you to do this kind of argument
> expansion
>
>
> Yes, CloneFunction is setting up the call for CloneFunctionInto, if
> you look at the implementation you should be able to copy it, and with
> some adaptation call CloneFunctionInto.
>
>
> (because that's the only thing after all: adding an argument to the
> function).
>
>
> Note: If you just want to *add* an argument and don’t need to preserve
> the old function, cloning it is an expensive solution…
>
>
>
Well, I think I don't need to keep the original function around, but how
should I do that? It doesn't look trivial, because every CallInst to
the original function has to be modified too to match the new signature. Or
should I do it with optional arguments? I never tried those.
But anyway I guess the answer to my original question was just "you can put
some std::map<Value*, Value*> for the VMap argument" ^^.
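
For anyone hitting the same error, here is a minimal sketch of why the map is needed. It is a toy model with values identified by name, not the LLVM API. ToyInst, ToyFunction and cloneBodyInto are hypothetical names, and the real CloneFunctionInto does considerably more.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model, not LLVM API: values are identified by name.
struct ToyInst {
  std::string opcode;
  std::vector<std::string> operands;
};

struct ToyFunction {
  std::vector<std::string> args;
  std::vector<ToyInst> body;
};

// Copies src.body into dst, rewriting every operand found in vmap.
// Operands absent from vmap (constants, locals) are kept unchanged.
void cloneBodyInto(ToyFunction &dst, const ToyFunction &src,
                   const std::map<std::string, std::string> &vmap) {
  for (const std::string &arg : src.args)
    assert(vmap.count(arg) && "every source argument needs a mapping");
  for (ToyInst inst : src.body) {  // work on a copy of each instruction
    for (std::string &operand : inst.operands) {
      auto it = vmap.find(operand);
      if (it != vmap.end())
        operand = it->second;  // now refers to the new function's argument
    }
    dst.body.push_back(inst);
  }
}
```

Without the rewrite step, the copied store would still name the old function's `ptr`, which is the toy equivalent of "Referring to an argument in another function!".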
Thanks for your time, I hope it's the last time I need to ask for help!
------------------------------
Message: 5
Date: Fri, 15 Jul 2016 20:50:17 +0100
From: Renato Golin via llvm-dev <llvm-dev at lists.llvm.org>
To: Bill Seurer <seurer at linux.vnet.ibm.com>
Cc: LLVM Dev <llvm-dev at lists.llvm.org>, cfe-dev
<cfe-dev at lists.llvm.org>
Subject: Re: [llvm-dev] AArch64 testsuite buildbots timeout
Message-ID:
<CAMSE1keB8Nyu+XGSkynRAprnw4GRiqM9waw+aHv2mu=ZfowQQA at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
On 15 Jul 2016 8:36 p.m., "Bill Seurer via llvm-dev" <llvm-dev at lists.llvm.org> wrote:
>
> On 07/14/16 08:50, Diana Picus via llvm-dev wrote:
>>
>> Some of our AArch64 bots have started timing out while compiling
>> SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.
>>
>> On clang-cmake-aarch64-quick the failure first appears between
>> r275337 and r275351, and on clang-cmake-aarch64-full it appears after
>> r275352, so there isn't a clear culprit for this. I suspect we have
>> been slowly approaching that threshold for a while.
>>
>> Both Renato and I are currently travelling, so we can't investigate
>> this until Monday. I can't really think of any temporary solution
>> other than bumping the timeout threshold, but IIUC that would affect
>> all the bots.
>
>
> I have a couple of powerpc bots that on occasion time out as well. It
> doesn't happen very often and I have been trying to tweak things to
> avoid it but it is still an ongoing issue.
>
> At least a few of the factories have the ability to override the
> timeout setting on a per-slave basis. See the SanitizerBuildFactory and
> LLVMCMakeBuildFactory factories for instance. Perhaps it is time to add
> this to all (or at least more) of the factories.
It'd also be interesting to know why we're slower on that range. I
wouldn't like to make it easier for people to tune the timeout, or we'll
end up regressing on compile time too much without noticing.
Cheers,
Renato
------------------------------
Message: 6
Date: Fri, 15 Jul 2016 13:25:34 -0700
From: Daniel Berlin via llvm-dev <llvm-dev at lists.llvm.org>
To: Sanjoy Das <sanjoy at playingwithpointers.com>
Cc: Oscar Blumberg <oscar.blumberg at normalesup.org>, llvm-dev
<llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Strong GC References in LLVM
Message-ID:
<CAF4BwTU81zp5LU+g-_rWS9vOkHjCPEqMKht9K5jGEJE2dot4HQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
> This then leads to the problem i mentioned - because the may-throwness
> is not expressed at the bb level (or in the CFG, by having the call end
> the block, or at the least, a fake abnormal CFG edge), everything has
> to go checking every instruction along the entire path they want to
> hoist, whereas hoisting is normally just a simple dataflow problem
> with BB level properties :)
And to be clear, I'm just being colloquial about "expressed at the bb
level". An analysis that every pass used, and kept up to date whenever it
inserted throwing calls or made calls nounwind, would have the same effect. I
happen to be more used to these things being flags on basic blocks, but whatever
works.
------------------------------
Message: 7
Date: Fri, 15 Jul 2016 22:48:38 +0200
From: Joerg Sonnenberger via llvm-dev <llvm-dev at lists.llvm.org>
To: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] RFC: To add
__attribute__((regmask("preserve/clobbered list here"))) in clang
Message-ID: <20160715204838.GA32714 at britannica.bec.de>
Content-Type: text/plain; charset=us-ascii
On Fri, Jul 15, 2016 at 11:57:35PM +0530, vivek pandya via llvm-dev wrote:
> So for IPRA we have a situation where a function is calling a function
> which is written in assembly and it is not defined in current module.
> IPRA's scope is limited to a module so for such externally defined
> function it uses default calling convention but here as the function
> is written in assembly user can provide exact register usage details.
> So we decided to mark declaration of such function with
> __attribute__((regmask("clobbered list here"))) so LLVM can construct
> regmask out of it and use it with IPRA to improve register allocation.
This situation is actually far more common and not restricted to assembly at
all. There are a number of functions already that have special ABIs with a much
larger set of preserved registers. A typical example is __tls_get_addr in many
ABIs. At the moment, we need hacks in the target specific part of LLVM for
handling this. Related (limited) approaches for this are the preserve_most and
preserve_all calling conventions.
As mentioned in the IRC discussion, there are two important issues to be
considered here from my perspective.
(1) I really dislike an attribute providing a clobber list. Whether a given
register is clobbered or not is an implementation detail of a specific version
and can easily change. It is also something difficult to reason about. The
invariant that should be put into the ABI contract is the inverse -- what
registers a function is going to preserve. That is even more important when
looking at long-term ABI stability. New registers are introduced every so often.
That shouldn't change the meaning of a declaration.
The main reason for using a clobber list seems to be a concern about verbosity.
I think that can be mostly avoided by allowing the use of register classes in
the specifier, e.g. all-fp for the i387 registers, all-sse2 for the SSE2
register set, all-avx for the AVX registers etc.
At the same time, I consider a certain verbosity to be useful, since ultimately,
implementation and interface definition need to be carefully compared.
(2) Should the attribute extend or replace the normal preserved registers?
Randomly clobbering registers is going to create all kinds of fun issues with
the backend assumptions. We already have such fun with inline assembler.
Extend-only semantics are much easier to support. They can also be combined with
a special CC with minimal default preservation and well defined meanings e.g.
for arguments passed in registers.
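
To illustrate both points, here is a minimal sketch that models register sets as plain sets of names. RegSet, preservedFromPreserveList and preservedFromClobberList are hypothetical names, and nothing here reflects how clang or LLVM actually build a regmask.

```cpp
#include <cassert>
#include <set>
#include <string>

using RegSet = std::set<std::string>;

// Preserve-list reading with extend-only semantics: the declaration can
// only add to what the calling convention already preserves.
RegSet preservedFromPreserveList(const RegSet &ccPreserved,
                                 const RegSet &declared) {
  RegSet result = ccPreserved;
  result.insert(declared.begin(), declared.end());
  return result;
}

// Clobber-list reading: everything the target has, minus the clobbers.
RegSet preservedFromClobberList(const RegSet &allRegs,
                                const RegSet &clobbered) {
  for (const std::string &reg : clobbered)
    assert(allRegs.count(reg) && "clobber list names an unknown register");
  RegSet result;
  for (const std::string &reg : allRegs)
    if (!clobbered.count(reg))
      result.insert(reg);
  return result;
}
```

With a clobber list, adding a register to the target's set silently makes the same declaration claim that the new register is preserved. With a preserve list the result does not depend on the target's full register set at all.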
Joerg
------------------------------
Message: 8
Date: Fri, 15 Jul 2016 16:03:47 -0500
From: Bill Seurer via llvm-dev <llvm-dev at lists.llvm.org>
To: Renato Golin <renato.golin at linaro.org>
Cc: LLVM Dev <llvm-dev at lists.llvm.org>, cfe-dev
<cfe-dev at lists.llvm.org>
Subject: Re: [llvm-dev] AArch64 testsuite buildbots timeout
Message-ID: <57894FB3.5040309 at linux.vnet.ibm.com>
Content-Type: text/plain; charset=utf-8; format=flowed
On 07/15/16 14:50, Renato Golin wrote:
> On 15 Jul 2016 8:36 p.m., "Bill Seurer via llvm-dev"
> <llvm-dev at lists.llvm.org> wrote:
> >
> > On 07/14/16 08:50, Diana Picus via llvm-dev wrote:
> >>
> >> Some of our AArch64 bots have started timing out while compiling
> >> SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.
> >>
> >> On clang-cmake-aarch64-quick the failure first appears between
> >> r275337 and r275351, and on clang-cmake-aarch64-full it appears
> >> after r275352, so there isn't a clear culprit for this. I suspect
> >> we have been slowly approaching that threshold for a while.
> >>
> >> Both Renato and I are currently travelling, so we can't investigate
> >> this until Monday. I can't really think of any temporary solution
> >> other than bumping the timeout threshold, but IIUC that would
> >> affect all the bots.
> >
> > I have a couple of powerpc bots that on occasion time out as well.
> > It doesn't happen very often and I have been trying to tweak things
> > to avoid it but it is still an ongoing issue.
> >
> > At least a few of the factories have the ability to override the
> > timeout setting on a per-slave basis. See the SanitizerBuildFactory
> > and LLVMCMakeBuildFactory factories for instance. Perhaps it is time
> > to add this to all (or at least more) of the factories.
>
> It'd also be interesting to know why we're slower on that range. I
> wouldn't like to make it easier for people to tune the timeout, or
> we'll end up regressing on compile time too much without noticing.
In the past I looked into the timeouts a bit and in many cases the time to run
the particular test varied a lot from well under the specified limit to
somewhere over it. When I ran them singly they never failed so I just
attributed it to interactions between all the other stuff that was running
slowing things down.
--
-Bill Seurer
------------------------------
Message: 9
Date: Fri, 15 Jul 2016 14:30:20 -0700
From: Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org>
To: Daniel Berlin <dberlin at dberlin.org>
Cc: Oscar Blumberg <oscar.blumberg at normalesup.org>, llvm-dev
<llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Strong GC References in LLVM
Message-ID: <578955EC.8090601 at playingwithpointers.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi Daniel,
Daniel Berlin wrote:
> However, I didn't quite understand your point about may-throw -- how
> is may-throw different from a generic side-effect (volatile store,
> syscall etc.)? All of those can't be hoisted or sunk -- we have to
> make sure that they execute in semantically the same conditions that
> they did in the original program.
>
> may-throw is, AFAIK, worse. They act as barriers to sinking *other
> things*. You cannot sink a store past a may-throw, or hoist a load above
> them. You can't optimize stores across them either:
Don't we have the same problems for "exit(0)" and
"while(true) { *volatile_ptr = 42; }" too? Both of these are
optimization barriers while still being "nounwind" (i.e. could be
legitimately contained in a nounwind function); though not in exactly the same
way as a may-throw call (e.g. you can DSE across exit(0) and you can sink
non-atomic loads past "while(true) {...}").
-- Sanjoy
> See:
> [PATCH] D21007: DSE: Don't remove stores made live by a call which
> unwinds.
> for the latter
>
> [llvm] r270828 - [MergedLoadStoreMotion] Don't transform across
> may-throw calls
> for the former.
>
> "It is unsafe to hoist a load before a function call which may throw,
> the throw might prevent a pointer dereference.
>
> Likewise, it is unsafe to sink a store after a call which may throw.
> The caller might be able to observe the difference."
>
> This then leads to the problem i mentioned - because the may-throwness
> is not expressed at the bb level (or in the CFG, by having the call end
> the block, or at the least, a fake abnormal CFG edge), everything has to
> go checking every instruction along the entire path they want to hoist,
> whereas hoisting is normally just a simple dataflow problem with BB
> level properties :)
------------------------------
Message: 10
Date: Fri, 15 Jul 2016 14:36:13 -0700
From: Daniel Berlin via llvm-dev <llvm-dev at lists.llvm.org>
To: Sanjoy Das <sanjoy at playingwithpointers.com>
Cc: Oscar Blumberg <oscar.blumberg at normalesup.org>, llvm-dev
<llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Strong GC References in LLVM
Message-ID:
<CAF4BwTUVux7hAs-p6yNhi_Bo5op9rZDhfAFtfWrTahxvwnevkQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
On Fri, Jul 15, 2016 at 2:30 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> Hi Daniel,
>
> Daniel Berlin wrote:
> > However, I didn't quite understand your point about may-throw -- how
> > is may-throw different from a generic side-effect (volatile store,
> > syscall etc.)? All of those can't be hoisted or sunk -- we have to
> > make sure that they execute in semantically the same conditions that
> > they did in the original program.
> >
> > may-throw is, AFAIK, worse. They act as barriers to sinking *other
> > things*. You cannot sink a store past a may-throw, or hoist a load
> > above them. You can't optimize stores across them either:
>
> Don't we have the same problems for "exit(0)"
This is a noreturn call, so yes, it has another hidden control
flow-side-effect of a slightly different kind. GCC models it as an extra
fake edge from the BB containing a noreturn call to the exit block of the
function, so that nothing sinks below it by accident.
I do not believe we do anything special here, so yes, it also has the same
general issue as may-throw.
> and "while(true) {
> *volatile_ptr = 42; }" too?
I can move non-volatile stores past volatile stores :)
Or did you mean something else?
> Both of these are optimization barriers
> while still being "nounwind" (i.e. could be legitimately contained in
> a nounwind function); though not in exactly the same way as a
> may-throw call (e.g. you can DSE across exit(0) and you can sink
> non-atomic loads past "while(true) {...}").
>
I do not claim there are not other instances. Noreturn is, in fact, a good
example. But I would also bet they are just as buggy as may-throw was for
the same reason, and they would cause the same N^2ness.
Essentially, anything that produces hidden control flow (instead of
just depending on hidden control flow) will have this issue.
These are also things that any flag/analysis should be able to flag.
------------------------------
Message: 11
Date: Fri, 15 Jul 2016 14:44:02 -0700
From: Sanjoy Das via llvm-dev <llvm-dev at lists.llvm.org>
To: Daniel Berlin <dberlin at dberlin.org>
Cc: Oscar Blumberg <oscar.blumberg at normalesup.org>, llvm-dev
<llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Strong GC References in LLVM
Message-ID: <57895922.1030409 at playingwithpointers.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Hi Daniel,
Daniel Berlin wrote:
> Don't we have the same problems for "exit(0)"
>
>
> This is a noreturn call, so yes, it has another hidden control
> flow-side-effect of a slightly different kind. GCC models it as an extra
> fake edge from the BB containing a noreturn call to the exit block of
> the function, so that nothing sinks below it by accident.
Just to be clear, it'd have to keep that sort of edge for all call
sites, unless it can prove that the call target does not call exit?
> I do not believe we do anything special here, so yes, it also has the
> same general issue as may-throw.
>
> and "while(true) {
> *volatile_ptr = 42; }" too?
>
>
> I can move non-volatile stores past volatile stores :)
I meant:
// ptr_a and ptr_b are NoAlias, ptr_a holds 0 to begin with.
ThreadA:
  while(true) { store volatile i32 42, i32* %ptr_b }
  store atomic i32 42, i32* %ptr_a
ThreadB:
  %val = load atomic i32, i32* %ptr_a
  assert(%val is not 42) // The store is "guarded" by an inf loop
We can't reorder the store to ptr_a to before the infinite loop. The
volatile store is there to make the infinite loop well defined.
> Or did you mean something else?
>
> Both of these are optimization barriers
> while still being "nounwind" (i.e. could be legitimately contained in
> a nounwind function); though not in exactly the same way as a
> may-throw call (e.g. you can DSE across exit(0) and you can sink
> non-atomic loads past "while(true) {...}").
>
>
> I do not claim there are not other instances. Noreturn is, in fact, a
> good example. But I would also bet they are just as buggy as may-throw
> was for the same reason, and they would cause the same N^2ness.
Yes.
> Essentially, anything that produces hidden control flow (instead of
> just depending on hidden control flow) will have this issue.
> These are also things that any flag/analysis should be able to flag.
Yup.
-- Sanjoy
------------------------------
Message: 12
Date: Fri, 15 Jul 2016 14:54:26 -0700
From: Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org>
To: Joerg Sonnenberger <joerg at bec.de>
Cc: llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] RFC: To add
__attribute__((regmask("preserve/clobbered list here"))) in clang
Message-ID: <FAD0B5D8-3296-48AF-A5AC-779605FDB169 at apple.com>
Content-Type: text/plain; charset=utf-8
> On Jul 15, 2016, at 1:48 PM, Joerg Sonnenberger via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> On Fri, Jul 15, 2016 at 11:57:35PM +0530, vivek pandya via llvm-dev wrote:
>> So for IPRA we have a situation where a function is calling a function
>> which is written in assembly and it is not defined in current module.
>> IPRA's scope is limited to a module so for such externally defined
>> function it uses default calling convention but here as the function is
>> written in assembly user can provide exact register usage details. So we
>> decided to mark declaration of such function with
>> __attribute__((regmask("clobbered list here"))) so LLVM can construct
>> regmask out of it and use it with IPRA to improve register allocation.
>
> This situation is actually far more common and not restricted to
> assembly at all. There are a number of functions already that have
> special ABIs with a much larger set of preserved registers. A typical
> example is __tls_get_addr in many ABIs. At the moment, we need hacks in the
> target specific part of LLVM for handling this. Related (limited)
> approaches for this are the preserve_most and preserve_all calling
> conventions.
>
> As mentioned in the IRC discussion, there are two important issues to be
> considered here from my perspective.
>
> (1) I really dislike an attribute providing a clobber list. Whether a
> given register is clobbered or not is an implementation detail of a
> specific version and can easily change. It is also something difficult
> to reason about. The invariant that should be put into the ABI contract
> is the inverse -- what registers a function is going to preserve. That
> is even more important when looking at long-term ABI stability. New
> registers are introduced every so often. That shouldn't change the
> meaning of a declaration.
Interestingly, your last point is the reason why I'd think a clobber list
could be more appropriate for some cases: if I have a hand-written assembly
function that clobbers some registers, the fact that the client code
enables AVX2 won’t make my routine clobber the newly available registers.
Maybe a syntax with +/- could be used to express things like “all vector
registers but these”.
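To make the +/- idea concrete, here is a rough sketch of how such a specifier could be expanded into a register set. Everything in it is an assumption for illustration only: the space-separated "+name"/"-name" token syntax, the parse_register_spec helper, and the tiny class table (the "all-sse2" name is borrowed from Joerg's mail, expanded here to xmm0..xmm15). None of this is an existing clang or LLVM interface.

```python
# Hypothetical register-class table; a real target would supply its own.
REGISTER_CLASSES = {
    "all-sse2": {f"xmm{i}" for i in range(16)},
}


def parse_register_spec(spec):
    """Expand a '+name -name ...' specifier into a set of register names.

    Tokens are applied left to right: '+' adds a register or a whole
    class, '-' removes one.
    """
    regs = set()
    for token in spec.split():
        sign, name = token[0], token[1:]
        if sign not in ("+", "-") or not name:
            raise ValueError(f"expected +name or -name, got {token!r}")
        # A name that is not a known class is treated as a single register.
        members = REGISTER_CLASSES.get(name, {name})
        if sign == "+":
            regs |= members
        else:
            regs -= members
    return regs


# "All SSE2 registers but xmm6 and xmm7".
clobbered = parse_register_spec("+all-sse2 -xmm6 -xmm7")
```

Because tokens are applied in order, "+all-sse2 -xmm6" and "-xmm6 +all-sse2" mean different things in this sketch; a real design would have to decide whether order should matter.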
> The main reason for using a clobber list seems to be a concern about
> verbosity. I think that can be mostly avoided by allowing the use of
> register classes in the specifier, e.g. all-fp for i387 registers,
> all-sse2 for the SSE2 register set, all-avx for the AVX registers, etc.
> At the same time, I consider a certain verbosity to be useful, since
> ultimately, implementation and interface definition need to be carefully
> compared.
>
> (2) Should the attribute extend or replace the normal preserved
> registers? Randomly clobbering registers is going to create all kinds of
> fun issues with the backend assumptions. We already have such fun with
> inline assembler. Extend-only semantics are much easier to support. It can
> also be combined with a special CC with minimal default preservation and
> well defined meanings e.g. for arguments passed in registers.
Agree.
Overall I’m unsure how much applicability this attribute feature will have in
practice though, or if it is worth the complexity to support it.
--
Mehdi
------------------------------
Message: 13
Date: Fri, 15 Jul 2016 14:54:42 -0700
From: Daniel Berlin via llvm-dev <llvm-dev at lists.llvm.org>
To: Sanjoy Das <sanjoy at playingwithpointers.com>
Cc: Oscar Blumberg <oscar.blumberg at normalesup.org>, llvm-dev
<llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Strong GC References in LLVM
Message-ID:
<CAF4BwTXr6Wp5G+L-+ccLUcxq3=Xs5QUJzTBmgp2_=hn_tMy6Jg at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
On Fri, Jul 15, 2016 at 2:44 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote:
> Hi Daniel,
>
> Daniel Berlin wrote:
> > Don't we have the same problems for "exit(0)"
> >
> >
> > This is a noreturn call, so yes, it has another hidden control
> > flow-side-effect of a slightly different kind. GCC models it as an extra
> > fake edge from the BB containing a noreturn call to the exit block of
> > the function, so that nothing sinks below it by accident.
>
> Just to be clear, it'd have to keep that sort of edge for all call
> sites, unless it can prove that the call target does not call exit?
Yes.
/* Add fake edges to the function exit for any non constant and non
noreturn calls (or noreturn calls with EH/abnormal edges),
volatile inline assembly in the bitmap of blocks specified by BLOCKS
or to the whole CFG if BLOCKS is zero.
...
The goal is to expose cases in which entering a basic block does
not imply that all subsequent instructions must be executed. */
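As a rough illustration of the fake-edge idea described in that comment (not GCC's actual implementation), the sketch below takes a CFG as a successor map plus a set of blocks that contain a call which may not return, and adds one extra edge from each such block to a single exit node. The EXIT name, the add_fake_exit_edges helper and the block names are all made up for the example; deciding which blocks need an edge is assumed to have been done elsewhere.

```python
EXIT = "EXIT"  # hypothetical name for the single, empty exit node


def add_fake_exit_edges(succs, may_not_fall_through):
    """Return a copy of the CFG with a fake edge to EXIT from each flagged block.

    succs maps a block name to the list of its successor blocks.
    may_not_fall_through is the set of blocks containing a call (or similar)
    after which the rest of the block might never execute.
    """
    result = {block: list(targets) for block, targets in succs.items()}
    result.setdefault(EXIT, [])
    for block in may_not_fall_through:
        if EXIT not in result[block]:
            result[block].append(EXIT)
    return result


# "body" contains a call that might call exit(); "ret" really returns.
cfg = {"entry": ["body"], "body": ["ret"], "ret": [EXIT], EXIT: []}
augmented = add_fake_exit_edges(cfg, {"body"})
```

The input map is left untouched, which matches the remark further down that fake edges can be added and removed on a per-optimization basis.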
> // ptr_a and ptr_b are NoAlias, ptr_a holds 0 to begin with.
>
> ThreadA:
> while(true) { store volatile i32 42, i32* %ptr_b }
> store atomic i32 42, i32* %ptr_a
>
> ThreadB:
> %val = load atomic i32, i32* %ptr_a
> assert(%val is not 42) // The store is "guarded" by an inf loop
>
>
> We can't reorder the store to ptr_a to before the infinite loop. The
> volatile store is there to make the infinite loop well defined.
These do not have hidden control flow. It is actually well defined; it just
literally involves other instructions :)
Note that gcc will optionally connect the infinite loop itself to the exit
block with a fake edge if you want
(you can add/remove fake edges on a per-opt basis).
>
>
> > Or did you mean something else?
> >
> > Both of these are optimization barriers
> > while still being "nounwind" (i.e. could be legitimately contained in
> > a nounwind function); though not in exactly the same way as a
> > may-throw call (e.g. you can DSE across exit(0) and you can sink
> > non-atomic loads past "while(true) {...}").
> >
> >
> > I do not claim there are not other instances. Noreturn is, in fact, a
> > good example. But i would also bet they are just as buggy as may-throw
> > was for the same reason, and they would cause the same N^2ness.
>
> Yes.
>
> > Essentially, anything that produces hidden control flow (instead of
> > just depending on hidden control flow) will have this issue.
> > They are also things that any flag/analysis should be able to flag.
>
> Yup.
>
> -- Sanjoy
>
------------------------------
Message: 14
Date: Fri, 15 Jul 2016 15:20:32 -0700
From: Daniel Berlin via llvm-dev <llvm-dev at lists.llvm.org>
To: Sanjoy Das <sanjoy at playingwithpointers.com>
Cc: Oscar Blumberg <oscar.blumberg at normalesup.org>, llvm-dev
<llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] RFC: Strong GC References in LLVM
Message-ID:
<CAF4BwTVZddS7-ZNhXqeJcmpX=uj2gCt9Ykn=OihZ_0Ma4m0GPg at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
On Fri, Jul 15, 2016 at 2:54 PM, Daniel Berlin <dberlin at dberlin.org>
wrote:
>
>
> On Fri, Jul 15, 2016 at 2:44 PM, Sanjoy Das <
> sanjoy at playingwithpointers.com> wrote:
>
>> Hi Daniel,
>>
>> Daniel Berlin wrote:
>> > Don't we have the same problems for "exit(0)"
>> >
>> >
>> > This is a noreturn call, so yes, it has another hidden control
>> > flow-side-effect of a slightly different kind. GCC models it as an extra
>> > fake edge from the BB containing a noreturn call to the exit block of
>> > the function, so that nothing sinks below it by accident.
>>
>> Just to be clear, it'd have to keep that sort of edge for all call
>> sites, unless it can prove that the call target does not call exit?
>
>
> Yes.
>
> /* Add fake edges to the function exit for any non constant and non
> noreturn calls (or noreturn calls with EH/abnormal edges),
> volatile inline assembly in the bitmap of blocks specified by BLOCKS
> or to the whole CFG if BLOCKS is zero.
> ...
>
> The goal is to expose cases in which entering a basic block does
> not imply that all subsequent instructions must be executed. */
>
>
Note that this is also necessary to make post-dominance correct (we
already do it in most cases, but i think there are still bugs open about
correctness).
For dominance, the dominance relationship for exit blocks that a noreturn
block reaches also changes, though i don't honestly remember if it's
material or not, and i'm a bit lazy to think about it. But here's an example:
IE given
A (may-throw)
|
v
B
|
v
C (exit)
Here, we have A dominates B dominates C
So the dominator tree is
A
|
v
B
|
v
C
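Now, if you add an edge from A to C, you have:
A dominates B
B no longer dominates C. In this three-block graph A still dominates C
(every path to C goes through A), so C's idom moves up to A; in a larger
CFG where the fake edges come from several blocks, C's idom can end up
somewhere further above.
IE
  A
 / \
B   C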
In GCC, there is a single exit block, and it is always empty (returns are
connected to the exit block).
Thus, the above will not prevent an optimization that should otherwise
happen.
Because LLVM's exit blocks contain real code, it may
You can actually get even worse (ie really wrong) situations if the "exit"
blocks somehow have successors, but thankfully we don't have a case where
that happens that i know of :)
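The effect of the extra edge on dominance can be checked with a few lines of code. The sketch below is a plain iterative set-based dominator computation (not the algorithm LLVM or GCC uses), run on the three-block graph from the example with and without the A -> C edge. It assumes A is the entry block and that every block is reachable; the dominators helper and the variable names are made up for illustration.

```python
def dominators(succs, entry):
    """Return {block: set of blocks that dominate it} by iterating to a fixed point.

    succs maps every block to the list of its successors; all blocks are
    assumed to be reachable from entry.
    """
    nodes = set(succs)
    preds = {n: set() for n in nodes}
    for src, dsts in succs.items():
        for dst in dsts:
            preds[dst].add(src)

    # Start from "everything dominates everything" and shrink.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            incoming = [dom[p] for p in preds[n]]
            new = ({n} | set.intersection(*incoming)) if incoming else {n}
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


chain = {"A": ["B"], "B": ["C"], "C": []}
with_fake_edge = {"A": ["B", "C"], "B": ["C"], "C": []}

dom_chain = dominators(chain, "A")
dom_fake = dominators(with_fake_edge, "A")
print(sorted(dom_chain["C"]))  # ['A', 'B', 'C']
print(sorted(dom_fake["C"]))   # ['A', 'C']
```

In this small graph the extra edge removes B from C's dominators while A, being the only way into the graph, still dominates C. Anything that relied on "B dominates C" is no longer valid once the edge is modelled.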
------------------------------
End of llvm-dev Digest, Vol 145, Issue 76
*****************************************