thr3ads.net - similar to: "[LLVMdev] 8-bit DIV IR irregularities"

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] 8-bit DIV IR irregularities"

2012 Jun 28

[LLVMdev] 8-bit DIV IR irregularities

On Wed, Jun 27, 2012 at 5:22 PM, Nowicki, Tyler <tyler.nowicki at intel.com> wrote:
> I understand, but this sounds like legalization. Does every architecture trigger an overflow exception, as opposed to setting a bit? Perhaps it makes more sense to do this in the backends that trigger an overflow exception?

The IR instruction has undefined behavior on overflow. This has nothing to do

2012 Jun 28

[LLVMdev] 8-bit DIV IR irregularities

I understand, but this sounds like legalization. Does every architecture trigger an overflow exception, as opposed to setting a bit? Perhaps it makes more sense to do this in the backends that trigger an overflow exception? I'm working on a modification for DIV right now in the x86 backend for Intel Atom that will improve performance; however, because the *actual* operation has been replaced

2012 Jun 27

[LLVMdev] 8-bit DIV IR irregularities

On Wed, Jun 27, 2012 at 4:02 PM, Nowicki, Tyler <tyler.nowicki at intel.com> wrote:
> Hi,
>
> I noticed that when dividing with signed 8-bit values the IR uses a 32-bit signed divide, however, when unsigned 8-bit values are used the IR uses an 8-bit unsigned divide. Why not use an 8-bit signed divide when using 8-bit signed values?

"sdiv i8 -128,
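For readers following the question, here is a minimal C sketch of the one case where an 8-bit signed divide cannot represent its result. It assumes only standard C integer promotion; the function names `divide_promoted` and `quotient_fits_i8` are hypothetical and are not taken from the thread or from LLVM.

```c
#include <stdint.h>

/* What C does for `a / b` on 8-bit signed operands: both are promoted
 * to int first, so the division itself is carried out in int. */
int divide_promoted(int8_t a, int8_t b) {
    return a / b;
}

/* Returns 1 when the exact quotient fits in 8 signed bits, i.e. when a
 * narrow signed divide would not overflow. Division by zero is rejected. */
int quotient_fits_i8(int8_t a, int8_t b) {
    if (b == 0) {
        return 0;
    }
    int q = divide_promoted(a, b);
    return q >= INT8_MIN && q <= INT8_MAX;
}
```

In this sketch, `divide_promoted(-128, -1)` yields 128 as an `int`, which is outside the 8-bit signed range, so `quotient_fits_i8(-128, -1)` returns 0. Every other nonzero divisor gives a quotient that fits.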

2018 Aug 06

Lowering ISD::TRUNCATE

I'm working on defining the instructions and implementing the lowering code for a Z80 backend. For now, the backend supports only the native CPU-supported datatypes, which are 8 and 16 bits wide (i.e. no 32 bit long, float, ... yet). So far, a lot of the simple stuff like immediate loads and return values is very straightforward, but now I got stuck with ISD::TRUNCATE, as in:

2012 Dec 10

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

Hello all, I wanted to get some feedback on this patch for ScalarEvolution. It addresses a performance problem I am seeing for a simple benchmark. Starting with this C code:

    signed char foo(void)
    {
      const int count = 8000;
      signed char result = 0;
      int j;

      for (j = 0; j < count; ++j) {
        result += (result_t)(3);
      }

      return result;
    }

I
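The loop in this excerpt is an 8-bit recurrence, so its final value can also be written in closed form. The sketch below illustrates that idea only; it does not reproduce the patch. It assumes `result_t` (not defined in the excerpt) is an 8-bit type, uses `uint8_t` so that wraparound is well defined in C, and uses the hypothetical names `accumulate_by_loop` and `accumulate_closed_form`.

```c
#include <stdint.h>

/* Runs the loop literally, keeping the accumulator in 8 bits.
 * uint8_t is an assumption of this sketch: it makes the wraparound
 * well defined, which is not guaranteed for signed char. */
uint8_t accumulate_by_loop(int count) {
    uint8_t result = 0;
    for (int j = 0; j < count; ++j) {
        result = (uint8_t)(result + 3);
    }
    return result;
}

/* Closed form of the same recurrence for count >= 0:
 * 3 * count reduced modulo 2^8. */
uint8_t accumulate_closed_form(int count) {
    return (uint8_t)(3u * (unsigned)count);
}
```

For `count = 8000` both functions return 192, since 3 * 8000 = 24000 and 24000 mod 256 = 192. Read as a two's-complement signed 8-bit value, that bit pattern is -64.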

2016 Aug 04

Remove zext-unfolding from InstCombine

Hi Sanjay,

> On 02.08.2016 at 21:39, Sanjay Patel <spatel at rotateright.com> wrote:
>
> Hi Matthias -
>
> Sorry for the delayed reply. I think you're on the right path with D22864.

No problem, thank you for your answer!

> If I'm understanding it correctly, my foo() example and zext_or_icmp_icmp() will be equivalent after your patch is added to InstCombine.

2015 Sep 30

InstCombine wrongful (?) optimization on BinOp with SameOperands

Hi all, I have been looking at the way LLVM optimizes code before forwarding it to the backend I develop for my company and while building

    define i32 @test_extract_subreg_func(i32 %x, i32 %y) #0 {
    entry:
      %conv = zext i32 %x to i64
      %conv1 = zext i32 %y to i64
      %mul = mul nuw i64 %conv1, %conv
      %shr = lshr i64 %mul, 32
      %xor = xor i64 %shr, %mul
      %conv2 = trunc i64 %xor to i32
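As a reading aid, the visible IR can be rendered in C roughly as follows. This is a sketch under assumptions: the excerpt is cut off before the `ret`, so returning `%conv2` is a guess, and the parameters are taken as unsigned because the IR zero-extends them.

```c
#include <stdint.h>

/* Mirrors the visible IR line by line:
 *   %conv, %conv1 : zero-extend the 32-bit inputs to 64 bits
 *   %mul          : 64-bit product (cannot wrap for 32-bit inputs)
 *   %shr          : high 32 bits of the product
 *   %xor          : high half xor the full product
 *   %conv2        : truncate to the low 32 bits
 * Returning %conv2 is an assumption; the excerpt ends before the ret. */
uint32_t test_extract_subreg_func(uint32_t x, uint32_t y) {
    uint64_t conv = (uint64_t)x;
    uint64_t conv1 = (uint64_t)y;
    uint64_t mul = conv1 * conv;
    uint64_t shr = mul >> 32;
    uint64_t xored = shr ^ mul;
    return (uint32_t)xored;
}
```

Under that assumption the function returns the high 32 bits of the product xor-ed with its low 32 bits.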

2012 Dec 17

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

On Mon, Dec 10, 2012 at 2:13 PM, Matthew Curtis <mcurtis at codeaurora.org> wrote:
> Hello all,
>
> I wanted to get some feedback on this patch for ScalarEvolution.
>
> It addresses a performance problem I am seeing for a simple benchmark.
>
> Starting with this C code:
>
>     signed char foo(void)
>     {
>       const int count = 8000;
>       signed char

2012 Dec 18

[LLVMdev] [PATCH] Teaching ScalarEvolution to handle IV=add(zext(trunc(IV)), Step)

Dan, Thanks for the response ...

On 12/17/2012 1:53 PM, Dan Gohman wrote:
> On Mon, Dec 10, 2012 at 2:13 PM, Matthew Curtis <mcurtis at codeaurora.org> wrote:
>> Hello all,
>>
>> I wanted to get some feedback on this patch for ScalarEvolution.
>>
>> It addresses a performance problem I am seeing for a simple benchmark.
>>
>> Starting with this C

2016 Jul 27

Remove zext-unfolding from InstCombine

Hi Sanjay, thank you very much for your answer. I understand that in your examples it is desirable that `foo` and `goo` are canonicalized to the same IR, i.e., something like `@goo`. However, I still have a few open questions, so please correct me in case I'm thinking in the wrong direction.

> On 21.07.2016 at 18:51, Sanjay Patel <spatel at rotateright.com> wrote:
>
> I've

2011 Jul 11

Finding Confidence Intervals

This is a very basic question, so please bear with me. I've been learning about AB Testing, which is largely used in internet marketing to examine the effectiveness of certain aspects of ads, websites, etc. Here are a couple of links for people who want to know more about AB Testing:

2016 May 31

Signed Division and InstCombine

I was looking through the InstCombine pass, and I was wondering why signed division is not considered a valid operation to combine in the canEvaluateTruncated function. This means, given the following code:

    %conv = sext i16 %0 to i32
    %conv1 = sext i16 %1 to i32
    %div = sdiv i32 %conv, %conv1
    %conv2 = trunc i32 %div to i16

* Assume %0 and %1 are registers created from simple 16-bit loads. We

2008 May 02

Speedups with Ra and jit

The topic of Ra and jit has come up on this list recently (see http://www.milbo.users.sonic.net/ra/index.html) so I thought people might be interested in this little demo. For it I used my machine, a 3-year-old laptop with 2 GB of memory running Windows XP, and the good old convolution example, the same one used on the web page (though the code on the web page has a slight glitch in it). This

2020 Jan 11

[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors

Thanks so much for your feedback Simon. I am not sure that what I am proposing here is at odds with what you're referring to (here and in the PR you linked). The key difference AFAICT is that the pattern I am referring to is probably more aptly described as "reducing scalarization" than as "vectorization". The reason I say that is that the inputs are vectors and the output

2020 Jan 11

[RFC][SDAG] Convert build_vector of ops on extractelts into ops on input vectors

Absolutely. We do it for scalars, so it would likely be a matter of just extending it. But that is one example. The issue of extracting elements, performing an operation on each element individually and then rebuilding the vector is likely more prevalent than that. At least I think that is the case, but I'll do some analysis to see if it is so or not. On Sat, Jan 11, 2020 at 6:15 PM Craig

2013 Feb 14

[LLVMdev] LiveIntervals analysis problem

Hello everyone, I need your help, please. To reproduce my problem I created a simple pass for backends (TestPass.cpp in the attached files). I call that pass from the Mips backend in this way (MipsTargetMachine.cpp):

    bool MipsPassConfig::addPreRegAlloc() {
      addPass(createTestPass());
      return false;
    }

The problem occurs when I try to compile the file ldtoa.ll (in the attached files). Compiling

2011 Feb 07

[LLVMdev] Promoting i16 load to i32

Hi, I'm working on an LLVM backend for an architecture which does not natively support half-word loads. I'm having trouble getting LLVM to promote i16 to i32 loads for me - should I expect LLVM to be able to do this, or do I have to write a custom lowerer? This post (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-February/019929.html) gave me the impression that it should be possible,

1998 Jul 30

If WAN is broken I have a problem with local connect

We have two subnets with two samba servers (div0 - 150.10.10.1 and div8 - 150.18.18.1). If I break the connection between the subnets, clients belonging to div8 cannot connect locally. All clients have their WINS server set to 150.10.10.1 (div0). If the subnets are connected, all is OK (connecting, browsing). We are using samba 1.9.18p7. smb.conf in div0 is

    [global]
    workgroup = LANGROUP
    domain logons =

2013 Apr 04

[LLVMdev] Packed instructions generaetd by LoopVectorize?

Thanks, that did it! Are there any plans to enable the loop vectorizer by default?

From: Nadav Rotem [mailto:nrotem at apple.com]
Sent: Wednesday, April 03, 2013 13:33 PM
To: Nowicki, Tyler
Cc: LLVM Developers Mailing List
Subject: Re: Packed instructions generaetd by LoopVectorize?

Hi Tyler,

Try adding -ffast-math. We can only vectorize reduction variables if it is safe to reorder floating

2013 Feb 21

[LLVMdev] Generate scalar SSE instructions instead of packed instructions

On Thu, Feb 21, 2013 at 12:14 PM, Nadav Rotem <nrotem at apple.com> wrote:
> You can change the input LLVM-IR.
>
> On Feb 21, 2013, at 7:16 AM, "Nowicki, Tyler" <tyler.nowicki at intel.com> wrote:
>
> Hi,
>
> I am interested in evaluating the performance of packed vs scalar double-precision floating point instructions on
