On 2008-10-22, at 19:24, Mike Stump wrote:
> On Oct 22, 2008, at 3:28 PM, Paul Biggar wrote:
>
>> As part of our PHP compiler (phpcompiler.org), it would be great to
>> be able to annotate our generated C code with, for example, (var !=
>> NULL), or (var->type == STRING), and have that information passed
>> around (esp interprocedurally at link-time) by the LLVM optimizers.
>
> For some odd reason I was thinking this was going to be done with an
> assert (or a special no code generating assert). Just gen up assert
> (i != 0); and put it in anytime it is true. The optimizers in the
> fullness of time would then recognize and propagate this information.

Can't you implement __builtin_assume(cond) to codegen to something like:

      %cond = i1 ...
      br i1 %cond, label %always, label %never
    never:
      unreachable
    always:

Then in assert.h:

    #ifdef NDEBUG
    #  define assert(cond) __builtin_assume((cond))
    #else
    #  define assert(cond) ...
    #endif

-- Gordon
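A minimal C sketch of the assert.h idea above, assuming a GCC- or Clang-compatible compiler. ASSUME and string_length are hypothetical names chosen for illustration, and __builtin_unreachable() is used here in place of the proposed __builtin_assume. Whether a given optimizer actually exploits the hint is up to the compiler.

```c
#include <assert.h>
#include <stddef.h>

/* ASSUME is a hypothetical macro name, not something from this thread.
 * In release builds (NDEBUG) on GCC/Clang the false case is marked
 * unreachable; everywhere else it falls back to a plain assert. */
#if defined(NDEBUG) && (defined(__GNUC__) || defined(__clang__))
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#  define ASSUME(cond) assert(cond)
#endif

/* Counts the bytes before the terminating NUL.
 * The caller promises that s is not NULL. */
size_t string_length(const char *s)
{
    ASSUME(s != NULL);
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

In a debug build the condition is still checked at run time, so a wrong assumption shows up as a failed assert rather than as undefined behavior.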
> Can't you implement __builtin_assume(cond) to codegen to something like:
>
>       %cond = i1 ...
>       br i1 %cond, label %always, label %never
>     never:
>       unreachable
>     always:

The code generators will remove the branch to %never. I already tried
this :) What would work is to define an llvm.abort intrinsic, and do:

      %cond = i1 ...
      br i1 %cond, label %always, label %never
    never:
      call void @llvm.abort()
      unreachable

At codegen time @llvm.abort() can be lowered to nothing at all. I'm not
saying that this is my favorite solution, but it is simple.

Ciao,

Duncan.
On Oct 22, 2008, at 10:45 PM, Duncan Sands wrote:
>> Can't you implement __builtin_assume(cond) to codegen to something
>> like:
>>
>>       %cond = i1 ...
>>       br i1 %cond, label %always, label %never
>>     never:
>>       unreachable
>>     always:
>
> The code generators will remove the branch to %never.
> I already tried this :) What would work is to define
> an llvm.abort intrinsic, and do:
>
>       %cond = i1 ...
>       br i1 %cond, label %always, label %never
>     never:
>       call void @llvm.abort()
>       unreachable
>
> At codegen time @llvm.abort() can be lowered to
> nothing at all. I'm not saying that this is my
> favorite solution, but it is simple.

How is this different than just branching to unreachable? Branching to
unreachable says that "this condition is true or else the program has
undefined behavior". This means that the condition must be true :)

-Chris
Hi all,

I've been thinking about this issue as well, since I'm working with an
architecture that can do hardware-based loops, but only when the loop count
is more than some minimal value. To properly use this, we need some way for
the code to specify that a loop variable has a minimum value.

> Can't you implement __builtin_assume(cond) to codegen to something like:
>
>       %cond = i1 ...
>       br i1 %cond, label %always, label %never
>     never:
>       unreachable
>     always:

I've looked into an approach like this as well, but it doesn't quite work.
My approach wasn't to use an unreachable, but simply a return (IIRC).
However, the problems I saw were as follows:

 * If the optimizers are smart enough to actually verify that your
   assertion holds (i.e., %cond is provably true), the branch is removed
   altogether (or at least %cond is simplified and replaced by true).

   Now, you might say that if the optimizers are smart enough to prove the
   condition true, they shouldn't need the assertion in the first place.
   However, it usually requires quite some reasoning to prove the
   condition, which is often done in incremental steps (sometimes even by
   different passes). Moreover, not all passes do the same complicated
   reasoning to prove some property, even though they might benefit from
   the conclusions. We don't want all passes to do this either, since
   drawing the same conclusions over and over again is pointless.

   Also, the code might want to make some assertions which are not provably
   true (e.g., preconditions on external functions can't be proven until
   after linking), but which can lead to significant optimizations.

 * If the optimizers are not smart enough to verify the assertion, modeling
   such an assertion with a conditional branch instruction will leave the
   branch instruction in place, and you will end up with a branch
   instruction in the resulting code.

The above shows that using normal branches for marking assumptions is not a
very usable strategy.
The example above tries to mark the branch as a special branch by putting in
an unreachable instruction. While this should prevent the branch from showing
up in the resulting code, as pointed out, the branch gets eliminated way too
early. One could think of making a new "codegen-unreachable" instruction,
which is counted as unreachable only just before or at codegen. This would
help the branch to stay alive longer, so the optimization passes can use the
info it encodes.

However, this still allows the branch to be removed when the condition can be
proven somewhere along the way. So, to really be able to encode this data,
one could think of having an "assume" intrinsic, i.e.:

    %cond = i1 ...
    call void @llvm.assume( i1 %cond )

Optimization passes won't just delete this, but we could teach codegen that
this intrinsic is not generated into any code. However, this still doesn't
completely solve the problem indicated at the first point above. If %cond is
provably true, we will end up with

    call void @llvm.assume( i1 true )

which is quite pointless.

I can see two ways of fixing this.

 * Don't use normal IR in the encoding of assumptions, i.e.:

       call void @llvm.assume( [ 7 x i8 ] "p != 0" )

   (Not sure if this is proper IR for encoding a string, but well..)

   The main downside here is that you are limited in which assumptions you
   can encode, which have to be defined quite clearly. On the other hand,
   when encoding using IR, you can encode almost everything, but
   optimizations will still be able to understand only a limited amount of
   it.

   Another downside here is that it is harder to keep assumptions correct
   when the code is transformed.

 * Mark the instructions used for assumptions explicitly, so they won't get
   modified, i.e.:

       %cond = immutable icmp ne i32 %p, 0
       call void @llvm.assume( i1 %cond )

   This probably has lots of other problems (such as preventing other
   transformations from taking place and needing updates to almost every
   optimization pass we have), but seems like it could work.
Are there any other suggestions for solutions? I don't quite like either one
of the two I proposed, but can't think of any others just now.

Gr.

Matthijs
Hi,

> * Mark the instructions used for assumptions explicitly, so they won't get
>   modified, i.e.:
>
>       %cond = icmp ne i32 %p, 0
>       call void @llvm.assume( i1 %cond )

gcc uses the equivalent of:

    %p2 = call @llvm.assume(i32 %p, "ne", i32 0)

with %p2 being used for %p wherever this assumption is valid (for example in
the "true" branch of an "if (%p != 0) {...}").

Ciao,

Duncan.
Hi,

I'm using LLVM for JIT compilation of shaders for my ray tracer
(http://www.indigorenderer.com). My primary development target is 32 bit and
64 bit Windows. JIT compilation of shaders is working great for x86 code, but
for x64 code LLVM doesn't really work, due to ABI incompatibilities in the
form of calling convention errors with x64 Windows, I think.

Anyway, my questions are as follows:

Is x64 JIT on Windows supposed to work currently?
If not, is x64 JIT on Windows an LLVM development goal?
And if so, is there a time-line or roadmap for achieving such a goal?

Thanks,
Nicholas Chapman
On Oct 22, 2008, at 5:36 PM, Gordon Henriksen wrote:
> Can't you implement __builtin_assume(cond) to codegen to something
> like:
>
>       %cond = i1 ...
>       br i1 %cond, label %always, label %never
>     never:
>       unreachable
>     always:

The thing I don't like about this is that it puts conditional branches all
over the place, which break up basic blocks, and that screws with
optimization passes. The idea would be to use a form that doesn't screw with
optimization passes as hard. Now, I'm just a front end person, so if the
backend people like it, who am I to say otherwise...
On Oct 23, 2008, at 3:40 AM, Matthijs Kooijman wrote:
> So, to really be able to encode this data, one could think of having an
> "assume" intrinsic, i.e.:
>
>     %cond = i1 ...
>     call void @llvm.assume( i1 %cond )

I like this the best.

> Optimization passes won't just delete this, but we could teach codegen
> that this intrinsic is not generated into any code. However, this still
> doesn't completely solve the problem indicated at the first point above.
> If %cond is provably true, we will end up with
>
>     call void @llvm.assume( i1 true )

No, trivially the optimizer can be taught to not do this, if we don't want
it to. The optimizer can see that this is @llvm.assume by checking the
spelling (code). It all comes down to how much memory you want to burn to
remember things that at one time you knew about the code, and how likely it
is that knowing them will be useful. Put another way, it depends on whether
there are any downstream consumers of the information.

If we do:

    assert (p!=0);
    (base*)p;

when base is a virtual base, this must generate code to preserve 0 values:

    if (p != 0)
      p += *(int*)(((char*)p)+n)

This does the conversion if p is non-zero, and if it is 0, it leaves it
alone. If for any reason we can figure out that p is not zero, we can then
eliminate the assert, as it were, and remove the conditional branch (nice
win). If there are no more downstream consumers of the information, the
information itself can die at this optimization pass, and save the memory.
If this isn't the last consumer of the information, we can leave the assert
untouched and remove only the conditional. For example, after -O4 inlining,
there might be another virtual base conversion then exposed, and we figure
out that:

    assert (p!=0)
    p += *(int*)(((char*)p)+n)
    if (p != 0)
      p += *(int*)(((char*)p)+n1)

should be optimized as:

    assert (p!=0)
    p += *(int*)(((char*)p)+n)
    p += *(int*)(((char*)p)+n1)

as we can know that the first p += vbase adjustment cannot wrap back to 0.
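The null-preserving adjustment described above can be sketched in plain C. This is only an illustration: read_offset, adjust_nullable and adjust_nonnull are hypothetical names, and the offset slot stored n bytes past p is an assumed layout standing in for a real virtual base offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reads the int offset stored n bytes past p. The layout is hypothetical;
 * memcpy avoids alignment and aliasing problems with a direct cast. */
static int read_offset(const char *p, size_t n)
{
    int off;
    memcpy(&off, p + n, sizeof off);
    return off;
}

/* Null-preserving conversion: a NULL input must stay NULL, so the
 * check cannot be dropped unless p is known to be non-null. */
char *adjust_nullable(char *p, size_t n)
{
    if (p != NULL)
        p += read_offset(p, n);
    return p;
}

/* Same conversion when the caller guarantees p != NULL: the assert
 * records the fact and no conditional branch is needed. */
char *adjust_nonnull(char *p, size_t n)
{
    assert(p != NULL);
    p += read_offset(p, n);
    return p;
}
```

The second function is the shape the optimizer is hoped to reach from the first once the non-null fact is available to it.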
On Thu, Oct 23, 2008 at 10:36 AM, Mike Stump <mrs at apple.com> wrote:
> On Oct 22, 2008, at 5:36 PM, Gordon Henriksen wrote:
>>       %cond = i1 ...
>>       br i1 %cond, label %always, label %never
>>     never:
>>       unreachable
>>     always:
> The thing I don't like about this, is that this has conditional
> branches all over the place, which break up basic blocks

I've been playing around with something similar to this, except the never
block actually can be reached but calls a new intrinsic @llvm.abort()
followed by an unreachable. In my case, I'm converting branches into asserts
and moving the branch for the assert around to create bigger basic blocks.

If I make sure to use the same "never" block with the abort() call,
-simplifycfg will combine the assert conditions by ANDing them together. A
later -instcombine pass can get rid of redundant checks. Additionally,
because I've got plain branches in the IR, -predsimplify automatically
handles code simplification without modifications.

So in regards to a generic assert/assume instruction, I would imagine
something like |assert i1 %cond, label %notTrue|. Assume would then provide
an unreachable "never" block as the label, whereas I could provide an
"abort" block as the label. (The reason why it wouldn't be an intrinsic is
that intrinsics can't take labels.)

Ed
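A small C sketch of what combining the assert conditions amounts to at the source level. The function names are hypothetical, and abort() from the C library stands in for the proposed @llvm.abort() intrinsic; the second function is written by hand to show the merged shape, not produced by any pass.

```c
#include <stddef.h>
#include <stdlib.h>

/* Two separate checks that branch to one shared failure block. */
int sum_separate(const int *a, const int *b)
{
    if (a == NULL)
        goto never;
    if (b == NULL)
        goto never;
    return *a + *b;
never:
    abort();   /* stands in for the proposed @llvm.abort() */
}

/* The merged form: a single branch on the AND of both conditions. */
int sum_merged(const int *a, const int *b)
{
    if (!(a != NULL && b != NULL))
        abort();
    return *a + *b;
}
```

Both functions behave the same for every input; the merged one simply has one conditional branch instead of two, which is what leaves a larger basic block behind.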