thr3ads.net - similar to: "[LLVMdev] [global-isel] Proposal for a global instruction selector"

Displaying 20 results from an estimated 70000 matches similar to: "[LLVMdev] [global-isel] Proposal for a global instruction selector"

[GlobalISel] A Proposal for global instruction selection

2016 Jan 11

[GlobalISel] A Proposal for global instruction selection

Hi Daniel, Thanks for the pointers, I wasn’t aware of the second thread you’ve mentioned. I may be wrong but I think LLVM-IR optimizations really treat bistcasts as no-op casts, in the sense of no instructions are required. Is there anyone that could chime in on that? However, it seems SelectionDAG sticks to the load/store semantic: "BITCAST - This operator converts between integer,

[LLVMdev] [global-isel] Random comments on Proposal for a global instruction selector

2013 Aug 09

[LLVMdev] [global-isel] Random comments on Proposal for a global instruction selector

On 9 Aug 2013, at 00:18, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote: > I am hoping that this proposal will generate a lot of feedback, and there are many different topics to discuss. When replying to this email, please change the subject header to something more specific, but keep the [global-isel] tag. Subject changed, but I'm not sure if helps... Overall, I really like this

[GlobalISel] A Proposal for global instruction selection

2016 Jan 13

[GlobalISel] A Proposal for global instruction selection

Hi James, I am also confused! > On Jan 12, 2016, at 4:11 PM, Philip Reames <listmail at philipreames.com> wrote: > > I think after reading your link I'm actually more confused. This might just be a wording problem, but let me ask a couple of clarifying questions. > > 1) After compiling the code sequence below (from that page), does the in memory bit pattern differ?

[GlobalISel] A Proposal for global instruction selection

2016 Jan 12

[GlobalISel] A Proposal for global instruction selection

What happens when you cascade bitcast? Are these sequences all equivalent at the IR level (i.e. do they reference the same byte from the original i128)? i128 => <16 x i8> => GEP 0 i128 => <2 x i64> => GEP 0 => <8 x i8> => GEP 0 i128 => <2 x i64> => GEP 0 => <2 x i32> => GEP 0 => <4 x i8> => GEP 0 —

[GlobalISel] A Proposal for global instruction selection

2016 Jan 12

[GlobalISel] A Proposal for global instruction selection

Hi, > I found this thinking quite difficult to explain. Does it make sense? It might help to link to the documentation on why bitcasts are weird on big-endian NEON: http://llvm.org/docs/BigEndianNEON.html#bitconverts Cheers, James On Tue, 12 Jan 2016 at 13:23 Daniel Sanders via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > > > I haven't found much time to

[GlobalISel] A Proposal for global instruction selection

2016 Jan 07

[GlobalISel] A Proposal for global instruction selection

Hi Daniel, I had a quick look at the language reference for bitcast and I have a different reading than what you were pointing out. Indeed, my take away is: "It is always a no-op cast because no bits change with this conversion." In other words, deleting all bitcast instructions should be fine. My understanding of the quote you’ve highlighted is that it tells C programmers that this

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 09

[LLVMdev] [RFC] Poor code generation for paired load

Hi, I am investigating a poor code generation on x86-64 involving a 64-bits structure with two 32-bits fields (in the attached examples float, but similar behavior is exposed with i32, and we can probably generalize that to smaller types too). The root cause of the problem is in SROA, although I am not sure we should fix something there. That is why I need your advices. ** Problem ** 64-bits

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 10

[LLVMdev] [RFC] Poor code generation for paired load

On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi, > > I am investigating a poor code generation on x86-64 involving a 64-bits > structure with two 32-bits fields (in the attached examples float, but > similar behavior is exposed with i32, and we can probably generalize that to > smaller types too). > The root cause of the problem is

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 12

[LLVMdev] [RFC] Poor code generation for paired load

Hi Eli, Thanks for the feedbacks. On Aug 9, 2013, at 8:00 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> wrote: >> Hi, >> >> I am investigating a poor code generation on x86-64 involving a 64-bits >> structure with two 32-bits fields (in the attached examples float, but

Dealing with boolean values in GlobalISel

2019 Sep 27

Dealing with boolean values in GlobalISel

Hi, I’ve been thinking about what the strategy to use for boolean values in GlobalISel. There are a few semantic and mechanical issues I’ve encountered. For background, on AMDGPU, there are two kinds of bool/s1 values. Contextually, a real boolean value will either be a 1-bit scalar condition (in a non-allocatable physical condition register, which will need to be copied to an allocatable class

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 12

[LLVMdev] [RFC] Poor code generation for paired load

On Mon, Aug 12, 2013 at 9:59 AM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi Eli, > > Thanks for the feedbacks. > > On Aug 9, 2013, at 8:00 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > > On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> > wrote: > > Hi, > > I am investigating a poor code generation on

[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors

2020 Jan 11

[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors

Thanks so much for your feedback Simon. I am not sure that what I am proposing here is at odds with what you're referring to (here and in the PR you linked). The key difference AFAICT is that the pattern I am referring to is probably more aptly described as "reducing scalarization" than as "vectorization". The reason I say that is that the inputs are vectors and the output

[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors

2020 Jan 11

[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors

Absolutely. We do it for scalars, so it would likely be a matter of just extending it. But that is one example. The issue of extracting elements, performing an operation on each element individually and then rebuilding the vector is likely more prevalent than that. At least I think that is the case, but I'll do some analysis to see if it is so or not. On Sat, Jan 11, 2020 at 6:15 PM Craig

Error in v64i32 type in x86 backend

2017 Jul 07

Error in v64i32 type in x86 backend

Thank You. On Fri, Jul 7, 2017 at 10:03 AM, Craig Topper <craig.topper at gmail.com> wrote: > Yes, that error is from instruction selection. I think your legalization > changes worked fine. > > ~Craig > > On Thu, Jul 6, 2017 at 8:21 PM, hameeza ahmed via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> also i further run the following command;

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

2017 Aug 02

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote: > So to write this in a more condensed form, you have: > > %v0 = ... > %v1 = and %v0, 255 > %v2 = and %v1, 31 > use %v1 > use %v2 > > and transform this to > %v0 = ... > %v1 = and %v0, 255 > %v2 = and %v0, 31 > ... > > This is a classical problem with instruction

[LLVMdev] Adding legal integer sizes to TargetData

2009 Feb 02

[LLVMdev] Adding legal integer sizes to TargetData

Now that 2.5 is about to branch, I'd like to bring up one of Scott's favorite topics: certain optimizers widen or narrow arithmetic, without regard for whether the type is legal for the target. In his specific case, instcombine is turning an i32 multiply into an i64 multiply in order to eliminate a cast. This does simplify/reduce the number of IR operations, but an i64 multiply

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

2017 Aug 02

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

Hi, We recently found a testcase showing that simplifications in instcombine sometimes change the instruction without reducing the instruction cost, but causing problems in TwoAddressInstruction pass. And it looks like the problem is generic and other simplification may have the same issue. I want to get some ideas about what is the best way to fix such kind of problem. The testcase:

Error in v64i32 type in x86 backend

2017 Jul 07

Error in v64i32 type in x86 backend

also i further run the following command; llc -debug filer-knl_o3.ll and its output is attached here. by looking at the output can we say that legalization runs fine and the error is due to instruction selection/ pattern matching which is not yet implemented? so do i need to worry and try to correct it at this stage or should i move forward to implement instruction selection/ pattern matching?

Which pass should be propagating memory copies

2017 May 16

Which pass should be propagating memory copies

Consider the following IR example: define void @simple([4 x double] *%ptr, i64 %idx) { %stack = alloca [4 x double] %ptri8 = bitcast [4 x double] *%ptr to i8* %stacki8 = bitcast [4 x double] *%stack to i8* call void @llvm.memcpy.p0i8.p0i8.i32(i8 *%stacki8, i8 *%ptri8, i32 32, i32 0, i1 0) %dataptr = getelementptr inbounds [4 x double], [4 x double] *%ptr, i32 0, i64 %idx

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

2017 Aug 02

[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass

On Wed, Aug 2, 2017 at 4:07 PM Matthias Braun via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On Aug 2, 2017, at 4:00 PM, Wei Mi <wmi at google.com> wrote: > > On Wed, Aug 2, 2017 at 3:36 PM, Matthias Braun <mbraun at apple.com> wrote: > > So to write this in a more condensed form, you have: > > %v0 = ... > %v1 = and %v0, 255 > %v2 = and %v1, 31

similar to: [LLVMdev] [global-isel] Proposal for a global instruction selector