thr3ads.net - search: "klontz"

Displaying 20 results from an estimated 44 matches for "klontz".

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

...nt == 0) + Alignment = 1; unsigned AddressSpace = Ptr->getType()->getPointerAddressSpace(); unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy); unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF; Should fix this. On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > Nadav, > > I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorizat...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

...getPrefTypeAlignment or getABITypeAlignment I would have thought the latter. On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- >> From: "Arnold Schwaighofer" <aschwaighofer at apple.com> >> To: "Joshua Klontz" <josh.klontz at gmail.com> >> Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> >> Sent: Friday, November 15, 2013 4:05:53 PM >> Subject: Re: [LLVMdev] Limit loop vectorizer to SSE >> >> >> Something like: >> >> index 6db7f68..685...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

----- Original Message ----- > From: "Arnold Schwaighofer" <aschwaighofer at apple.com> > To: "Joshua Klontz" <josh.klontz at gmail.com> > Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> > Sent: Friday, November 15, 2013 4:05:53 PM > Subject: Re: [LLVMdev] Limit loop vectorizer to SSE > > > Something like: > > index 6db7f68..68564cb 100644 > --- a/lib/Trans...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! > > I think that's a fair question, and it...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

...Josh [1] http://pastebin.com/kc95WtGG [2] http://pastebin.com/VY3ZLVJK On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> wrote: > > On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> > wrote: > > On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > >> Agreed, is there a pass that will insert a runtime alignment check? Also, >> what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() >> so I don't have to hard code 32? Thanks! >> > > I think t...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

A fix for this is in r194876. Thanks for reporting this! On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > Nadav, > > I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorizat...

2015 Mar 19

[LLVMdev] [LV] possible `vector.memcheck` regression when using `llvm.loop` and `llvm.mem.parallel_loop_access`

...y is vectorized, though %33 and %46 are deemed MayAlias despite their exclusive use in loads and stores marked with `llvm.mem.parallel_loop_access`. Many Thanks, Josh On Thu, Mar 19, 2015 at 12:55 PM, Adam Nemet <anemet at apple.com> wrote: > > > On Mar 19, 2015, at 9:43 AM, Josh Klontz <josh.klontz at gmail.com> wrote: > > > > It seems that at some point in the not-so-distant-past that the loop > vectorizer gained the ability to vectorize loops without explicit > `llvm.loop` & `llvm.mem.parallel_loop_access` metadata. While that's > awesome, the...

2013 Nov 16

[LLVMdev] Limit loop vectorizer to SSE

...32 byte aligned accesses. In which case I would align the payload to 32 byte in order to save a little in the preamble. Frank On 15/11/13 18:15, Arnold Schwaighofer wrote: > A fix for this is in r194876. > > Thanks for reporting this! > > > On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > >> Nadav, >> >> I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prio...

2013 May 10

[LLVMdev] Simple Loop Vectorize Question

...oes not have a triple, so the target machine and > TargetTransformInfo have no way of knowing if you are running on a machine > with vector registers. Try adding the '-mcpu=XXXX' to opt and see what > happens. > > Thanks, > Nadav > > On May 9, 2013, at 1:42 PM, Josh Klontz <josh.klontz at gmail.com> wrote: > > Hi! I am trying to get the loop vectorizer to work on a simple example > (http://pastebin.com/tGhpc4y0) that doubles every element in a vector. > > I've found that 'opt -loop-vectorize -force-vector-width=4 -S -debug > double.ll&...

2013 May 10

[LLVMdev] Simple Loop Vectorize Question

Hi Josh, This line works for me: opt file.ll -loop-vectorize -S -o - -mtriple=x86_64 -mcpu=corei7-avx -debug You need to specify the triple on the command line if it is not inside the module. Thanks, Nadav On May 9, 2013, at 5:53 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > Nadav, > > Please forgive my ignorance, but 'opt -mcpu=corei7 -loop-vectorize -S -debug double.ll' doesn't appear to make a difference. In fact it seems to be ignored as garbage values for -mcpu don't raise an error. Am I overlook...

2013 Nov 16

[LLVMdev] Limit loop vectorizer to SSE

...load > to 32 byte in order to save a little in the preamble. > > Frank > > > > On 15/11/13 18:15, Arnold Schwaighofer wrote: >> A fix for this is in r194876. >> >> Thanks for reporting this! >> >> >> On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: >> >>> Nadav, >>> >>> I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with...

2016 Jun 10

Early CSE clobbering llvm.assume

Maybe. It may not fix it directly because you never use %1 or %2 again. I haven't looked to see how good the lookup is. On Fri, Jun 10, 2016, 3:45 PM Josh Klontz <josh.klontz at gmail.com> wrote: > Thanks Daniel, with that knowledge I think I can at least work around the > issue in my frontend. > > Ignoring GVN for a second though, and just looking at Early CSE, it seems > to me that at least in this pass that there is the potential fo...

2013 Nov 19

[LLVMdev] Limit loop vectorizer to SSE

On 16/11/2013 7:58 AM, Nadav Rotem wrote: > > On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > >> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: >> >> Agreed, is there a pass that will insert a runtime alignment >> check? Also, what's the easiest way to get at >> TargetTransformInfo::getRegisterBitWidth() so I do...

2013 Nov 13

[LLVMdev] Limit loop vectorizer to SSE

On 12 November 2013 21:10, Josh Klontz <josh.klontz at gmail.com> wrote: > Porting my project from JIT to MCJIT did not fix the code generation bug > Frank is also experiencing. However, Renato's "-avx" suggestion did resolve > the issue for me. Hopefully we can get some traction on this bug, happy to >...

2014 Aug 09

[LLVMdev] Heuristic for choosing between MCJIT and Interpreter

I'm facing a situation where I have generated IR that only needs to be executed once. I've noticed for simple IR it's faster to run the interpreter on it, but for complex IR it's much better to JIT compile and execute it. I'm seeking suggestions for a good heuristic to decide which approach to take for any given IR. I'm leaning in favor of deciding based on the...

2014 Aug 12

[LLVMdev] Heuristic for choosing between MCJIT and Interpreter

On 08/11/2014 04:21 AM, David Chisnall wrote: > Hi Josh, > > On 9 Aug 2014, at 21:33, Josh Klontz <josh.klontz at gmail.com> wrote: > >> I'm facing a situation where I have generated IR that only needs to be executed once. I've noticed for simple IR it's faster to run the interpreter on it, but for complex IR it's much better to JIT compile and execute it. I'm...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, > what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() > so I don't have to hard code 32? Thanks! > I think that's a fair question, an...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! -Josh On Fri, Nov 15, 2013 at 3:20 PM, Frank Winter <fwinter at jlab.org> wrote: > Hmm.. I don't quite understand. How can a module validator > catch this, when it's the...
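
For context on the question in this message: a runtime alignment check boils down to testing the low bits of an address before taking an aligned code path. The following is a minimal C++ sketch under stated assumptions, not the check any LLVM pass emits. `isAlignedAddress` and `isAligned` are hypothetical helper names introduced here for illustration, and the alignment is assumed to be a power of two.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an LLVM API): true when `addr` is a multiple of
// `alignment`. Assumes `alignment` is a nonzero power of two, so the test
// reduces to masking the low bits of the address.
inline bool isAlignedAddress(std::uintptr_t addr, std::size_t alignment) {
  return (addr & (alignment - 1)) == 0;
}

// Convenience wrapper for pointers; converts the pointer to an integer
// address and applies the same mask test.
inline bool isAligned(const void *ptr, std::size_t alignment) {
  return isAlignedAddress(reinterpret_cast<std::uintptr_t>(ptr), alignment);
}
```

A caller could guard a 32-byte aligned vector access with `isAligned(ptr, 32)` and fall back to unaligned or scalar accesses when it returns false.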

2015 Mar 19

[LLVMdev] [LV] possible `vector.memcheck` regression when using `llvm.loop` and `llvm.mem.parallel_loop_access`

It seems that at some point in the not-so-distant-past that the loop vectorizer gained the ability to vectorize loops without explicit `llvm.loop` & `llvm.mem.parallel_loop_access` metadata. While that's awesome, there seems to be a regression in that `llvm.mem.parallel_loop_access` metadata doesn't make it into the alias analysis, and therefore a `vector.memcheck` basic block is...
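
For context: a `vector.memcheck` block holds runtime checks that the memory ranges touched by the loop do not overlap, with the scalar loop used as the fallback when they might. The C++ sketch below illustrates that idea only. It is not the code the loop vectorizer generates; `rangesOverlap`, `LoopPath`, and `selectLoopPath` are hypothetical names, and address arithmetic overflow is ignored for brevity.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when the byte ranges [a, a + n) and [b, b + n)
// share at least one byte. Assumes a + n and b + n do not overflow.
inline bool rangesOverlap(std::uintptr_t a, std::uintptr_t b, std::size_t n) {
  return a < b + n && b < a + n;
}

// Which version of the loop a runtime check would select.
enum class LoopPath { Scalar, Vector };

// Pick the vector loop only when the destination and source ranges are
// disjoint; otherwise stay on the scalar loop.
inline LoopPath selectLoopPath(std::uintptr_t dst, std::uintptr_t src,
                               std::size_t bytes) {
  return rangesOverlap(dst, src, bytes) ? LoopPath::Scalar : LoopPath::Vector;
}
```

When metadata already guarantees that iterations are independent, a check of this kind is redundant, which is the overhead the message is concerned about.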

2013 Dec 03

[LLVMdev] MCJIT + Windows = Incompatible object format

...d of your target triple > to get MCJIT to generate ELF object in memory on Windows. This should work > with 32- or 64-bit targets. > > > > -Andy > > > > > > *From:* llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] *On > Behalf Of *Joshua Klontz > *Sent:* Monday, December 02, 2013 1:18 PM > *To:* Dev > *Subject:* [LLVMdev] MCJIT + Windows = Incompatible object format > > > > Is the MCJIT infrastructure supported on Windows? I'm getting an "LLVM > ERROR: Incompatible object format!" when running my proj...
