Displaying 20 results from an estimated 44 matches for "klontz".

2013 Nov 15
4
[LLVMdev] Limit loop vectorizer to SSE
...nt == 0) + Alignment = 1; unsigned AddressSpace = Ptr->getType()->getPointerAddressSpace(); unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy); unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF; Should fix this. On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > Nadav, > > I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorizat...
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
...getPrefTypeAlignment or getABITypeAlignment I would have thought the latter. On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- >> From: "Arnold Schwaighofer" <aschwaighofer at apple.com> >> To: "Joshua Klontz" <josh.klontz at gmail.com> >> Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> >> Sent: Friday, November 15, 2013 4:05:53 PM >> Subject: Re: [LLVMdev] Limit loop vectorizer to SSE >> >> >> Something like: >> >> index 6db7f68..685...
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
----- Original Message ----- > From: "Arnold Schwaighofer" <aschwaighofer at apple.com> > To: "Joshua Klontz" <josh.klontz at gmail.com> > Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> > Sent: Friday, November 15, 2013 4:05:53 PM > Subject: Re: [LLVMdev] Limit loop vectorizer to SSE > > > Something like: > > index 6db7f68..68564cb 100644 > --- a/lib/Trans...
2013 Nov 15
6
[LLVMdev] Limit loop vectorizer to SSE
On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! > > I think that's a fair question, and it...
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
...Josh [1] http://pastebin.com/kc95WtGG [2] http://pastebin.com/VY3ZLVJK On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> wrote: > > On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> > wrote: > > On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > >> Agreed, is there a pass that will insert a runtime alignment check? Also, >> what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() >> so I don't have to hard code 32? Thanks! >> > > I think t...
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
A fix for this is in r194876. Thanks for reporting this! On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > Nadav, > > I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorizat...
2015 Mar 19
2
[LLVMdev] [LV] possible `vector.memcheck` regression when using `llvm.loop` and `llvm.mem.parallel_loop_access`
...y is vectorized, though %33 and %46 are deemed MayAlias despite their exclusive use in loads and stores marked with `llvm.mem.parallel_loop_access`. Many Thanks, Josh On Thu, Mar 19, 2015 at 12:55 PM, Adam Nemet <anemet at apple.com> wrote: > > > On Mar 19, 2015, at 9:43 AM, Josh Klontz <josh.klontz at gmail.com> wrote: > > > > It seems that at some point in the not-so-distant-past that the loop > vectorizer gained the ability to vectorize loops without explicit > `llvm.loop` & `llvm.mem.parallel_loop_access` metadata. While that's > awesome, the...
2013 Nov 16
0
[LLVMdev] Limit loop vectorizer to SSE
...32 byte aligned accesses. In which case I would align the payload to 32 byte in order to save a little in the preamble. Frank On 15/11/13 18:15, Arnold Schwaighofer wrote: > A fix for this is in r194876. > > Thanks for reporting this! > > > On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > >> Nadav, >> >> I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prio...
2013 May 10
2
[LLVMdev] Simple Loop Vectorize Question
...oes not have a triple, so the target machine and > TargetTransformInfo have no way of knowing if you are running on a machine > with vector registers. Try adding the '-mcpu=XXXX' to opt and see what > happens. > > Thanks, > Nadav > > On May 9, 2013, at 1:42 PM, Josh Klontz <josh.klontz at gmail.com> wrote: > > Hi! I am trying to get the loop vectorizer to work on a simple example > (http://pastebin.com/tGhpc4y0) that doubles every element in a vector. > > I've found that 'opt -loop-vectorize -force-vector-width=4 -S -debug > double.ll&...
2013 May 10
0
[LLVMdev] Simple Loop Vectorize Question
Hi Josh, This line works for me: opt file.ll -loop-vectorize -S -o - -mtriple=x86_64 -mcpu=corei7-avx -debug You need to specify the triple on the command line if it is not inside the module. Thanks, Nadav On May 9, 2013, at 5:53 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > Nadav, > > Please forgive my ignorance, but 'opt -mcpu=corei7 -loop-vectorize -S -debug double.ll' doesn't appear to make a difference. In fact it seems to be ignored as garbage values for -mcpu don't raise an error. Am I overlook...
2013 Nov 16
1
[LLVMdev] Limit loop vectorizer to SSE
...load > to 32 byte in order to save a little in the preamble. > > Frank > > > > On 15/11/13 18:15, Arnold Schwaighofer wrote: >> A fix for this is in r194876. >> >> Thanks for reporting this! >> >> >> On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: >> >>> Nadav, >>> >>> I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with...
2016 Jun 10
3
Early CSE clobbering llvm.assume
Maybe. It may not fix it directly because you never use %1 or %2 again. I haven't looked to see how good the lookup is. On Fri, Jun 10, 2016, 3:45 PM Josh Klontz <josh.klontz at gmail.com> wrote: > Thanks Daniel, with that knowledge I think I can at least work around the > issue in my frontend. > > Ignoring GVN for a second though, and just looking at Early CSE, it seems > to me that at least in this pass that there is the potential fo...
2013 Nov 19
0
[LLVMdev] Limit loop vectorizer to SSE
On 16/11/2013 7:58 AM, Nadav Rotem wrote: > > On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > >> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: >> >> Agreed, is there a pass that will insert a runtime alignment >> check? Also, what's the easiest way to get at >> TargetTransformInfo::getRegisterBitWidth() so I do...
2013 Nov 13
3
[LLVMdev] Limit loop vectorizer to SSE
On 12 November 2013 21:10, Josh Klontz <josh.klontz at gmail.com> wrote: > Porting my project from JIT to MCJIT did not fix the code generation bug > Frank is also experiencing. However, Renato's "-avx" suggestion did resolve > the issue for me. Hopefully we can get some traction on this bug, happy to >...
2014 Aug 09
3
[LLVMdev] Heuristic for choosing between MCJIT and Interpreter
I'm facing a situation where I have generated IR that only needs to be executed once. I've noticed for simple IR it's faster to run the interpreter on it, but for complex IR it's much better to JIT compile and execute it. I'm seeking suggestions for a good heuristic to decide which approach to take for any given IR. I'm leaning in favor of deciding based on the
2014 Aug 12
2
[LLVMdev] Heuristic for choosing between MCJIT and Interpreter
On 08/11/2014 04:21 AM, David Chisnall wrote: > Hi Josh, > > On 9 Aug 2014, at 21:33, Josh Klontz <josh.klontz at gmail.com> wrote: > >> I'm facing a situation where I have generated IR that only needs to be executed once. I've noticed for simple IR it's faster to run the interpreter on it, but for complex IR it's much better to JIT compile and execute it. I'm...
2013 Nov 15
0
[LLVMdev] Limit loop vectorizer to SSE
On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, > what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() > so I don't have to hard code 32? Thanks! > I think that's a fair question, an...
2013 Nov 15
2
[LLVMdev] Limit loop vectorizer to SSE
Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! -Josh On Fri, Nov 15, 2013 at 3:20 PM, Frank Winter <fwinter at jlab.org> wrote: > Hmm.. I don't quite understand. How can a module validator > catch this, when it's the
2015 Mar 19
2
[LLVMdev] [LV] possible `vector.memcheck` regression when using `llvm.loop` and `llvm.mem.parallel_loop_access`
It seems that at some point in the not-so-distant-past that the loop vectorizer gained the ability to vectorize loops without explicit `llvm.loop` & `llvm.mem.parallel_loop_access` metadata. While that's awesome, there seems to be a regression in that `llvm.mem.parallel_loop_access` metadata doesn't make it into the alias analysis, and therefore a `vector.memcheck` basic block is
2013 Dec 03
1
[LLVMdev] MCJIT + Windows = Incompatible object format
...d of your target triple > to get MCJIT to generate ELF object in memory on Windows. This should work > with 32- or 64-bit targets. > > > > -Andy > > > > > > *From:* llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] *On > Behalf Of *Joshua Klontz > *Sent:* Monday, December 02, 2013 1:18 PM > *To:* Dev > *Subject:* [LLVMdev] MCJIT + Windows = Incompatible object format > > > > Is the MCJIT infrastructure supported on Windows? I'm getting an "LLVM > ERROR: Incompatible object format!" when running my proj...