Why should SelectionDAGBuilder generate an explicit bitcast for a no-op
bitcast? By definition, no bits are changed; so if the EVTs match, there
is nothing to do. The fundamental problem is how address spaces are
handled, and specifically how they are converted, in LLVM IR. Address
space casts are currently implemented with bitcasts (in general). While
this works out for the LLVM IR type system, it does not match the semantics
of an address space cast for some targets. For PTX, an address space cast
may involve changing bits in the address. Therefore, a bitcast is not a
valid way to perform an address space cast. We introduced target
intrinsics to perform address space casts for this purpose. But I feel
that this problem is likely not specific to PTX. A target that uses
different pointer sizes for different address spaces would be hit by this
issue even more, since a bitcast would not even be valid IR.
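
As a rough illustration of the check Michele describes in the quoted text below, here is a toy model in C++. It is not LLVM code: `ValueType`, `PointerInfo`, and both functions are hypothetical names made up for this sketch, and the only assumption carried over from the thread is that the value type seen by the back end does not record the address space.

```cpp
// Toy model only: none of these names exist in LLVM.
enum class ValueType { i32, i64 };

struct PointerInfo {
    ValueType vt;        // what the back end's value type would capture
    unsigned addrSpace;  // address space, which the value type does not record
};

// Mirrors the behaviour described in the thread: a cast node is emitted
// only when the source and destination value types differ.
constexpr bool emitsNodeEvtOnly(PointerInfo src, PointerInfo dst) {
    return src.vt != dst.vt;
}

// Hypothetical alternative: also treat an address space change as significant.
constexpr bool emitsNodeWithAddrSpace(PointerInfo src, PointerInfo dst) {
    return src.vt != dst.vt || src.addrSpace != dst.addrSpace;
}

// A 64-bit pointer in address space 256 cast to address space 0:
// same value type, so the first check drops the cast entirely.
static_assert(!emitsNodeEvtOnly(PointerInfo{ValueType::i64, 256},
                                PointerInfo{ValueType::i64, 0}),
              "cast is silently dropped");
static_assert(emitsNodeWithAddrSpace(PointerInfo{ValueType::i64, 256},
                                     PointerInfo{ValueType::i64, 0}),
              "cast survives when the address space is compared");
```

The second function is only there to show where the information gets lost. Whether the right fix is a check like this or a separate cast operation is exactly the open question.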

On Thu, Aug 8, 2013 at 8:03 AM, Michele Scandale <michele.scandale at gmail.com> wrote:
> On 08/08/2013 11:04 AM, David Chisnall wrote:
>
>> What happens when I link together two IR modules from different front
>> ends that have different language-specific address spaces?
>>
>
> I agree with Micah: if, while linking two IR modules, there are
> inconsistencies (e.g. in module1 2 -> 1 and in module2 2 -> 3), then the
> modules are incompatible and the link process should fail.
>
>
>> I would be very hesitant about using address spaces until we've fixed
>> their semantics to disallow bitcasts between different address spaces and
>> require an explicit address space cast. To illustrate the problem,
>> consider the following trivial example:
>>
>> typedef __attribute__((address_space(256))) int* gsptr;
>>
>> int *toglobal(gsptr foo)
>> {
>>     return (int*)foo;
>> }
>>
>> int load(int *foo)
>> {
>>     return *foo;
>> }
>>
>> int loadgs(gsptr foo)
>> {
>>     return *foo;
>> }
>>
>> int loadgs2(gsptr foo)
>> {
>>     return *toglobal(foo);
>> }
>>
>> When we compile this to LLVM IR with clang (disabling asynchronous unwind
>> tables for clarity), at -O2 we get this:
>>
>> define i32* @toglobal(i32 addrspace(256)* %foo) nounwind readnone ssp {
>>   %1 = bitcast i32 addrspace(256)* %foo to i32*
>>   ret i32* %1
>> }
>>
>> define i32 @load(i32* nocapture %foo) nounwind readonly ssp {
>>   %1 = load i32* %foo, align 4, !tbaa !0
>>   ret i32 %1
>> }
>>
>> define i32 @loadgs(i32 addrspace(256)* nocapture %foo) nounwind readonly ssp {
>>   %1 = load i32 addrspace(256)* %foo, align 4, !tbaa !0
>>   ret i32 %1
>> }
>>
>> define i32 @loadgs2(i32 addrspace(256)* nocapture %foo) nounwind readonly ssp {
>>   %1 = bitcast i32 addrspace(256)* %foo to i32*
>>   %2 = load i32* %1, align 4, !tbaa !0
>>   ret i32 %2
>> }
>>
>> Note that in loadgs2, the call to toglobal has been inlined and so the
>> back end will just see a bitcast, which SelectionDAG treats as a no-op.
>> The assembly we get from this is:
>>
>> _toglobal:                              ## @toglobal
>> ## BB#0:
>>         pushq   %rbp
>>         movq    %rsp, %rbp
>>         movq    %rdi, %rax
>>         popq    %rbp
>>         ret
>> _load:                                  ## @load
>> ## BB#0:
>>         pushq   %rbp
>>         movq    %rsp, %rbp
>>         movl    (%rdi), %eax
>>         popq    %rbp
>>         ret
>>
>>         .globl  _loadgs
>>         .align  4, 0x90
>> _loadgs:                                ## @loadgs
>> ## BB#0:
>>         pushq   %rbp
>>         movq    %rsp, %rbp
>>         movl    %gs:(%rdi), %eax
>>         popq    %rbp
>>         ret
>>
>>         .globl  _loadgs2
>>         .align  4, 0x90
>> _loadgs2:                               ## @loadgs2
>> ## BB#0:
>>         pushq   %rbp
>>         movq    %rsp, %rbp
>>         movl    (%rdi), %eax
>>         popq    %rbp
>>         ret
>>
>> loadgs() has been compiled correctly. It uses the parameter as a
>> gs-relative address and performs the load. The assembly for load() and
>> loadgs2(), however, is identical: both are treating the parameter as a
>> linear (not gs-relative) address. The cast has been lost. This is even
>> simpler when you look at toglobal(), which has just become a noop. The
>> correct code for this should be (I believe):
>>
>> _toglobal:                              ## @toglobal
>> ## BB#0:
>>         pushq   %rbp
>>         movq    %rsp, %rbp
>>         lea     %gs:(%rdi), %rax
>>         popq    %rbp
>>         ret
>>
>> In the inlined version, the lea and movl should be combined into a single
>> gs-relative movl.
>>
>> Until we can generate correct code from IR containing address spaces,
>> discussion of how to optimise this IR seems premature.
>>
>
> I've done a quick test: the problem is that the BITCAST node is not
> generated during the SelectionDAG building. If you look in
> SelectionDAGBuilder::visitBitCast, you will see that the node is
> generated only if the operand value of the bitcast operation and the result
> value have different EVTs: the address space information is not handled in
> EVT, and so pointers in different address spaces are mapped to the same EVT,
> which implies a missing BITCAST node.
>
> Maybe rethinking the way address spaces are handled at the interface
> between middle-end and backend would also allow fixing these kinds of
> problems. BTW, I think this specific problem can be used for a bug report
> :-).
>
> Thanks.
>
> -Michele
>
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>
--
Thanks,
Justin Holewinski