similar to: [LLVMdev] [global-isel] Proposal for a global instruction selector

2016 Jan 11
2
[GlobalISel] A Proposal for global instruction selection
Hi Daniel, Thanks for the pointers, I wasn’t aware of the second thread you’ve mentioned. I may be wrong but I think LLVM-IR optimizations really treat bitcasts as no-op casts, in the sense that no instructions are required. Is there anyone that could chime in on that? However, it seems SelectionDAG sticks to the load/store semantic: "BITCAST - This operator converts between integer,
2013 Aug 09
0
[LLVMdev] [global-isel] Random comments on Proposal for a global instruction selector
On 9 Aug 2013, at 00:18, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: > I am hoping that this proposal will generate a lot of feedback, and there are many different topics to discuss. When replying to this email, please change the subject header to something more specific, but keep the [global-isel] tag. Subject changed, but I'm not sure if it helps... Overall, I really like this
2016 Jan 13
2
[GlobalISel] A Proposal for global instruction selection
Hi James, I am also confused! > On Jan 12, 2016, at 4:11 PM, Philip Reames <listmail at philipreames.com> wrote: > > I think after reading your link I'm actually more confused. This might just be a wording problem, but let me ask a couple of clarifying questions. > > 1) After compiling the code sequence below (from that page), does the in memory bit pattern differ?
2016 Jan 12
2
[GlobalISel] A Proposal for global instruction selection
What happens when you cascade bitcast? Are these sequences all equivalent at the IR level (i.e. do they reference the same byte from the original i128)? i128 => <16 x i8> => GEP 0 i128 => <2 x i64> => GEP 0 => <8 x i8> => GEP 0 i128 => <2 x i64> => GEP 0 => <2 x i32> => GEP 0 => <4 x i8> => GEP 0 —
2016 Jan 12
4
[GlobalISel] A Proposal for global instruction selection
Hi, > I found this thinking quite difficult to explain. Does it make sense? It might help to link to the documentation on why bitcasts are weird on big-endian NEON: http://llvm.org/docs/BigEndianNEON.html#bitconverts Cheers, James On Tue, 12 Jan 2016 at 13:23 Daniel Sanders via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > > > I haven't found much time to
2016 Jan 07
2
[GlobalISel] A Proposal for global instruction selection
Hi Daniel, I had a quick look at the language reference for bitcast and I have a different reading than what you were pointing out. Indeed, my take away is: "It is always a no-op cast because no bits change with this conversion." In other words, deleting all bitcast instructions should be fine. My understanding of the quote you’ve highlighted is that it tells C programmers that this
2013 Aug 09
2
[LLVMdev] [RFC] Poor code generation for paired load
Hi, I am investigating a poor code generation on x86-64 involving a 64-bits structure with two 32-bits fields (in the attached examples float, but similar behavior is exposed with i32, and we can probably generalize that to smaller types too). The root cause of the problem is in SROA, although I am not sure we should fix something there. That is why I need your advices. ** Problem ** 64-bits
2013 Aug 10
0
[LLVMdev] [RFC] Poor code generation for paired load
On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi, > > I am investigating a poor code generation on x86-64 involving a 64-bits > structure with two 32-bits fields (in the attached examples float, but > similar behavior is exposed with i32, and we can probably generalize that to > smaller types too). > The root cause of the problem is
2013 Aug 12
2
[LLVMdev] [RFC] Poor code generation for paired load
Hi Eli, Thanks for the feedbacks. On Aug 9, 2013, at 8:00 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> wrote: >> Hi, >> >> I am investigating a poor code generation on x86-64 involving a 64-bits >> structure with two 32-bits fields (in the attached examples float, but
2019 Sep 27
4
Dealing with boolean values in GlobalISel
Hi, I’ve been thinking about what strategy to use for boolean values in GlobalISel. There are a few semantic and mechanical issues I’ve encountered. For background, on AMDGPU, there are two kinds of bool/s1 values. Contextually, a real boolean value will either be a 1-bit scalar condition (in a non-allocatable physical condition register, which will need to be copied to an allocatable class
2013 Aug 12
0
[LLVMdev] [RFC] Poor code generation for paired load
On Mon, Aug 12, 2013 at 9:59 AM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi Eli, > > Thanks for the feedbacks. > > On Aug 9, 2013, at 8:00 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > > On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> > wrote: > > Hi, > > I am investigating a poor code generation on
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
Thanks so much for your feedback Simon. I am not sure that what I am proposing here is at odds with what you're referring to (here and in the PR you linked). The key difference AFAICT is that the pattern I am referring to is probably more aptly described as "reducing scalarization" than as "vectorization". The reason I say that is that the inputs are vectors and the output
2020 Jan 11
2
[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors
Absolutely. We do it for scalars, so it would likely be a matter of just extending it. But that is one example. The issue of extracting elements, performing an operation on each element individually and then rebuilding the vector is likely more prevalent than that. At least I think that is the case, but I'll do some analysis to see if it is so or not. On Sat, Jan 11, 2020 at 6:15 PM Craig
2017 Jul 07
2
Error in v64i32 type in x86 backend
Thank You. On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com> wrote: > Yes, that error is from instruction selection. I think your legalization > changes worked fine. > > ~Craig > > On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> also i further run the following command;
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote: > So to write this in a more condensed form, you have: > > %v0 = ... > %v1 = and %v0, 255 > %v2 = and %v1, 31 > use %v1 > use %v2 > > and transform this to > %v0 = ... > %v1 = and %v0, 255 > %v2 = and %v0, 31 > ... > > This is a classical problem with instruction
2009 Feb 02
2
[LLVMdev] Adding legal integer sizes to TargetData
Now that 2.5 is about to branch, I'd like to bring up one of Scott's favorite topics: certain optimizers widen or narrow arithmetic, without regard for whether the type is legal for the target. In his specific case, instcombine is turning an i32 multiply into an i64 multiply in order to eliminate a cast. This does simplify/reduce the number of IR operations, but an i64 multiply
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
Hi, We recently found a testcase showing that simplifications in instcombine sometimes change the instruction without reducing the instruction cost, but causing problems in TwoAddressInstruction pass. And it looks like the problem is generic and other simplification may have the same issue. I want to get some ideas about what is the best way to fix such kind of problem. The testcase:
2017 Jul 07
2
Error in v64i32 type in x86 backend
also i further run the following command; llc -debug filer-knl_o3.ll and its output is attached here. by looking at the output can we say that legalization runs fine and the error is due to instruction selection/ pattern matching which is not yet implemented? so do i need to worry and try to correct it at this stage or should i move forward to implement instruction selection/ pattern matching?
2017 May 16
4
Which pass should be propagating memory copies
Consider the following IR example: define void @simple([4 x double] *%ptr, i64 %idx) { %stack = alloca [4 x double] %ptri8 = bitcast [4 x double] *%ptr to i8* %stacki8 = bitcast [4 x double] *%stack to i8* call void @llvm.memcpy.p0i8.p0i8.i32(i8 *%stacki8, i8 *%ptri8, i32 32, i32 0, i1 0) %dataptr = getelementptr inbounds [4 x double], [4 x double] *%ptr, i32 0, i64 %idx
2017 Aug 02
2
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
On Wed, Aug 2, 2017 at 4:07 PM Matthias Braun via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Aug 2, 2017, at 4:00 PM, Wei Mi <wmi at google.com> wrote: > > On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote: > > So to write this in a more condensed form, you have: > > %v0 = ... > %v1 = and %v0, 255 > %v2 = and %v1, 31