Hi,

SelectionDAGBuilder doesn't know how to lower a memcpy or memset if one of the pointer operands has an address space >= 256. This is understandable, since libc's memcpy / memset don't work for these address spaces. However, both Clang (when copying a struct) and some optimization passes (LoopIdiomRecognize, MemCpyOpt) can emit memcpy / memset for these address spaces. This triggers an assert in SelectionDAGBuilder. The optimization passes could be modified to give up when they encounter an address space >= 256, but I think Clang would need some new code that emits a struct copy member-by-member. I think it's better to extend the code generator so that it can emit code for these cases. What do you think?

The problem is also described here: http://llvm.org/bugs/show_bug.cgi?id=18549

-Manuel
I have some patches that automatically expand all memcpy and similar if the operands are not in AS 0. I think this is probably not quite the right approach though, and we should be asking the back end for the function that does a memcpy / memset / whatever in a non-0 address space, and expand automatically if it doesn't provide one.

In an ideal world, I'd rather have the memcpy / memset lowering moved entirely out of SelectionDAG and into a FunctionPass, where it would be much easier to debug. I'd also want to do the same for lowering of unaligned loads / stores, so by the time you get to the back end every load and store is something that can map trivially to a single instruction (assuming an adequate addressing mode exists).

David

On 11 Mar 2014, at 22:23, Manuel Jacob <me at manueljacob.de> wrote:
> [...]
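The approach described in this reply (use a routine supplied by the back end if there is one, otherwise expand the copy into explicit loads and stores) can be sketched as a toy model. This is plain Python emitting pseudo-instructions, not LLVM code: the function name `lower_memcpy`, the `backend_funcs` table, and the pseudo-instruction syntax are all assumptions made for illustration and do not correspond to any existing LLVM hook.

```python
from typing import Dict, List, Tuple


def lower_memcpy(dst_as: int, src_as: int, length: int,
                 backend_funcs: Dict[Tuple[int, int], str]) -> List[str]:
    """Return pseudo-instructions that copy `length` bytes.

    `backend_funcs` is a hypothetical table filled in by the back end,
    keyed by (destination address space, source address space).
    """
    func = backend_funcs.get((dst_as, src_as))
    if func is not None:
        # The back end provides a routine for this pair of address spaces.
        return [f"call {func}(dst, src, {length})"]

    # No routine available: expand into a byte-wise load/store loop.
    # The bounds check comes first so that a zero-length copy does nothing.
    return [
        "i = 0",
        "loop:",
        f"  if i >= {length} goto done",
        f"  tmp = load i8, addrspace({src_as}) src[i]",
        f"  store i8 tmp, addrspace({dst_as}) dst[i]",
        "  i = i + 1",
        "  goto loop",
        "done:",
    ]


if __name__ == "__main__":
    # Only AS 0 -> AS 0 has a library routine in this made-up table.
    table = {(0, 0): "memcpy"}
    print("\n".join(lower_memcpy(0, 0, 8, table)))
    print("\n".join(lower_memcpy(256, 0, 8, table)))
```

In this model the second call falls through to the loop because nothing is registered for a destination in address space 256. A real expansion would presumably copy wider units than single bytes where alignment allows.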
So a set of target-specific legalisation passes? Like the ones from the recent "PNaCl's IR simplification passes" discussion?

On Wed, Mar 12, 2014 at 7:18 PM, David Chisnall <David.Chisnall at cl.cam.ac.uk> wrote:
> [...]
Hi David,

sorry for sending you this mail twice; I forgot to send it to the list the first time.

On 2014-03-12 09:48, David Chisnall wrote:
> I have some patches that automatically expand all memcpy and similar
> if the operands are not in AS 0. I think this is probably not quite
> the right approach though, and we should be asking the back end for
> the function that does a memcpy / memset / whatever in a non-0 address
> space, and expand automatically if it doesn't provide one.

Can you share these patches? They would be a temporary solution for the reporter of the bug I linked in the original post.

> In an ideal world, I'd rather have the memcpy / memset lowering moved
> entirely out of SelectionDAG and into a FunctionPass, where it would
> be much easier to debug. I'd also want to do the same for lowering of
> unaligned loads / stores, so by the time you get to the back end every
> load and store is something that can map trivially to a single
> instruction (assuming an adequate addressing mode exists).

While I agree that the memcpy lowering could be done as an IR pass because it involves loops, I don't think the same should be done for the lowering of unaligned loads / stores. But that's mostly unrelated to this thread and should be discussed separately.

There are still some advantages to lowering memcpy / memset in SelectionDAGBuilder. The infrastructure (e.g. target hooks for determining the right register class for memory operations) is already there. I don't know how hard it is to generate loops in SelectionDAGBuilder, though.

-Manuel
Hi,

this thread is now almost a year old, but the problem still exists.

On 2014-03-12 09:48, David Chisnall wrote:
> I have some patches that automatically expand all memcpy and similar
> if the operands are not in AS 0. I think this is probably not quite
> the right approach though, and we should be asking the back end for
> the function that does a memcpy / memset / whatever in a non-0 address
> space, and expand automatically if it doesn't provide one.

Is it the back end that should be asked for a memcpy or memset function? Since these functions are provided by the runtime, I think it should be the user who configures this.

My suggestion for how to fix the problem is:

1) Provide a way for the user to specify which functions to call for memcpy / memset with address spaces >= 256.
2) Modify / add target hooks to enable better code generation for small constant-sized memcpy / memset (as is already possible with address spaces < 256).

-Manuel
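The two-step suggestion in this message can be illustrated with a small decision function. Again this is a toy model in plain Python rather than LLVM code: `plan_memset`, the `user_funcs` mapping, the `INLINE_LIMIT` threshold, and the function name `my_memset_as256` are hypothetical names chosen for the example, and where the user configuration would actually live (command-line option, attribute, metadata) is left open.

```python
from typing import Dict, List, Optional

# Assumed cut-off in bytes; in the proposal a target hook would decide this.
INLINE_LIMIT = 16


def plan_memset(addr_space: int, length: Optional[int],
                user_funcs: Dict[int, str]) -> List[str]:
    """Choose a lowering for a memset into `addr_space`.

    `length` is None when the size is not a compile-time constant.
    `user_funcs` maps an address space to the user-specified routine name.
    """
    if length is not None and length <= INLINE_LIMIT:
        # Step 2: small constant size, so emit straight-line stores.
        return [f"store i8 value, addrspace({addr_space}) dst[{i}]"
                for i in range(length)]

    # Step 1: otherwise call the routine the user configured, if any.
    func = user_funcs.get(addr_space)
    if func is None:
        raise LookupError(
            f"no memset routine configured for address space {addr_space}")
    size = "n" if length is None else str(length)
    return [f"call {func}(dst, value, {size})"]


if __name__ == "__main__":
    funcs = {256: "my_memset_as256"}
    print(plan_memset(256, 4, funcs))     # four inline stores
    print(plan_memset(256, None, funcs))  # call to the configured routine
```

The `LookupError` branch marks the case that currently ends in an assert: a large or variable-sized operation with no routine available. Combining this with a loop expansion as a last resort would remove that failure mode entirely.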