thr3ads.net - search: "jlab"

Displaying 20 results from an estimated 132 matches for "jlab".

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

On 15 November 2013 20:05, Frank Winter <fwinter at jlab.org> wrote: > Good catch! That was the problem in my case too. I totally > overlooked the alignment requirement for AVX. I wonder if the validation mechanism shouldn't have caught it earlier... Do you guys run validate on the modules before JIT-ing? --renato

2013 Nov 06

[LLVMdev] loop vectorizer

...both solutions will be needed, I guess. Frank On 05/11/13 22:12, Andrew Trick wrote: > > On Oct 30, 2013, at 11:21 PM, Renato Golin <renato.golin at linaro.org > <mailto:renato.golin at linaro.org>> wrote: > >> On 30 October 2013 18:40, Frank Winter <fwinter at jlab.org >> <mailto:fwinter at jlab.org>> wrote: >> >> const std::uint64_t ir0 = (i+0)%4; // not working >> >> >> I thought this would be the case when I saw the original expression. >> Maybe we need to teach modular arithmetic to SCEV? >...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

Hmm.. I don't quite understand. How can a module validator catch this, when it's the pointers, i.e. the payload, you pass as function arguments that need to be aligned.. ?! Frank On 15/11/13 15:16, Renato Golin wrote: > On 15 November 2013 20:05, Frank Winter <fwinter at jlab.org > <mailto:fwinter at jlab.org>> wrote: > > Good catch! That was the problem in my case too. I totally > overlooked the alignment requirement for AVX. > > > I wonder if the validation mechanism shouldn't have caught it > earlier... Do you guys run v...
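
The point made in this message, that alignment is a property of the pointers passed at run time and not of the module, can be illustrated with a small caller-side check. This is only a sketch: the helper names `is_aligned` and `check_avx_buffers` are hypothetical and do not come from the thread, and the 32-byte figure assumes aligned 256-bit (YMM) loads and stores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true when the address held in p is a multiple of `alignment` bytes.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical guard to run before handing buffers to JIT-compiled code
// that may use aligned 256-bit loads and stores. A module verifier cannot
// perform this check, because the addresses only exist at call time.
void check_avx_buffers(const float* a, const float* b, const float* c) {
    constexpr std::size_t kAvxAlignment = 32;  // assumed: bytes in one YMM register
    assert(is_aligned(a, kAvxAlignment));
    assert(is_aligned(b, kAvxAlignment));
    assert(is_aligned(c, kAvxAlignment));
}
```

A buffer declared as `alignas(32) float buf[16];` passes the check, while `buf + 1` does not, since it is offset by only 4 bytes.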

2015 Jul 01

[LLVMdev] SLP vectorizer on AVX feature

On 1 July 2015 at 21:22, Frank Winter <fwinter at jlab.org> wrote: > there were two follow-up emails. I only got one... weird... > The issue is solved. The SLP vectorizer has > a magic number built into the code which determines the max. vector length > to search for. That was set to 128 bits. Increasing it to 256 bits solved > the...

2013 Nov 06

[LLVMdev] loop vectorizer

On 06/11/13 08:54, Arnold wrote: > > > Sent from my iPhone > > On Nov 5, 2013, at 7:39 PM, Frank Winter <fwinter at jlab.org > <mailto:fwinter at jlab.org>> wrote: > >> Good that you bring this up. I still have no solution to this >> vectorization problem. >> >> However, I can rewrite the code and insert a second loop which >> eliminates the 'urem' and 'div&...

2013 Oct 31

[LLVMdev] loop vectorizer

On 30 October 2013 18:40, Frank Winter <fwinter at jlab.org> wrote: > const std::uint64_t ir0 = (i+0)%4; // not working > I thought this would be the case when I saw the original expression. Maybe we need to teach modular arithmetic to SCEV? --renato

2013 Nov 06

[LLVMdev] loop vectorizer

Sent from my iPhone > On Nov 5, 2013, at 7:39 PM, Frank Winter <fwinter at jlab.org> wrote: > > Good that you bring this up. I still have no solution to this vectorization problem. > > However, I can rewrite the code and insert a second loop which eliminates the 'urem' and 'div' instructions in the index calculations. In this case, the inner lo...
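
The rewrite described in this message can be sketched roughly as follows. The original function is truncated in the archive, so everything here is an assumption made for illustration: the names `add_flat`, `add_nested` and `check_rewrite`, the inner width of 4, and the element-wise addition.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Flat loop: the index is rebuilt from a quotient and a remainder,
// the pattern the thread reports as blocking vectorization.
void add_flat(std::uint64_t n, float* __restrict__ c,
              const float* __restrict__ a, const float* __restrict__ b) {
    for (std::uint64_t i = 0; i < n * 4; ++i) {
        const std::uint64_t outer = i / 4;  // division in the index calculation
        const std::uint64_t inner = i % 4;  // remainder in the index calculation
        const std::uint64_t idx = outer * 4 + inner;
        c[idx] = a[idx] + b[idx];
    }
}

// Nested loop: a second loop with a fixed trip count of 4 replaces the
// division and remainder, so the inner index is a plain induction variable.
void add_nested(std::uint64_t n, float* __restrict__ c,
                const float* __restrict__ a, const float* __restrict__ b) {
    for (std::uint64_t outer = 0; outer < n; ++outer) {
        for (std::uint64_t inner = 0; inner < 4; ++inner) {
            const std::uint64_t idx = outer * 4 + inner;
            c[idx] = a[idx] + b[idx];
        }
    }
}

// Checks that both variants write the same values for n outer iterations.
void check_rewrite(std::uint64_t n) {
    std::vector<float> a(n * 4), b(n * 4), flat(n * 4), nested(n * 4);
    for (std::size_t i = 0; i < a.size(); ++i) {
        a[i] = static_cast<float>(i);
        b[i] = 2.0f * static_cast<float>(i);
    }
    add_flat(n, flat.data(), a.data(), b.data());
    add_nested(n, nested.data(), a.data(), b.data());
    assert(flat == nested);
}
```

Both variants compute the same result; whether the nested form is actually vectorized depends on the compiler version and target, which this sketch does not verify.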

2013 Oct 31

[LLVMdev] loop vectorizer misses opportunity, exploit

...ou just write a small loop like this: for (i=0; i<4; i++) C[i] = A[i] + B[i] ?? Either the unroller will unroll it and the SLP-vectorizer will vectorize the unrolled iterations, or the loop-vectorizer would catch it. Thanks, Nadav On Oct 31, 2013, at 8:01 AM, Frank Winter <fwinter at jlab.org> wrote: > A quite small but yet complete example function which all vectorization passes fail to optimize: > > #include <cstdint> > #include <iostream> > > void bar(std::uint64_t start, std::uint64_t end, float * __restrict__ c, float * __restrict__ a, float...

2013 Oct 31

[LLVMdev] loop vectorizer misses opportunity, exploit

...for (i=0; i<4; i++) > C[i] = A[i] + B[i] ?? Either the unroller will unroll it and the > SLP-vectorizer will vectorize the unrolled iterations, or the > loop-vectorizer would catch it. > > Thanks, > Nadav > > On Oct 31, 2013, at 8:01 AM, Frank Winter <fwinter at jlab.org > <mailto:fwinter at jlab.org>> wrote: > >> A quite small but yet complete example function which all >> vectorization passes fail to optimize: >> >> #include <cstdint> >> #include <iostream> >> >> void bar(std::uint64_t sta...

2013 Nov 12

[LLVMdev] Limit loop vectorizer to SSE

On 12 November 2013 15:14, Frank Winter <fwinter at jlab.org> wrote: > I am asking because the option 'force-vector-width' is too restrictive. > I would like to leave open the possibility to use vector width 2. I was about to say that, and you saved us both one cycle. ;) What you could do is to force an architecture that doesn't...

2013 Nov 12

[LLVMdev] Limit loop vectorizer to SSE

On 12 November 2013 15:53, Frank Winter <fwinter at jlab.org> wrote: > .. forcing the vector size to 4 does not prevent using AVX. > Sure. That's more for tests than anything else. So, there are ways of disabling stuff in Clang, for instance "-mattr=-avx" or "-target-feature -avx", but I'm not sure how you're d...

2008 Feb 29

links causing dovecot endless search through user's homedirs

...many links in their homedirs. Some even have circular links, such as: /home/username/foo/foo1/foo2/foo3/foo4 -> /home/username/foo/foo1 All these links are causing some dovecot imap process to endlessly search. An example of a strace of a running imap process shows: stat64("/home/xxx//jlab/jlab/dev/tail/interp/radcor/results/fake/e94010/3RDTRY/interp/radcor/results/fake/e94010/3RDTRY/interp/radcor/results/fake/e94010/3RDTRY/interp/radcor/results/fake/e94010/3RDTRY/interp/radcor/results/fake/e94010/3RDTRY/interp/radcor/results/fake/e94010/3RDTRY/interp/radcor/results/fake/e94010/3RD...

2013 Oct 31

[LLVMdev] loop vectorizer misses opportunity, exploit

...l.ll All optimization passes miss the opportunity. It seems the SCEV AA pass doesn't understand modulo arithmetic. How can the SCEV AA pass be extended to handle this type of arithmetic? Frank On 31/10/13 02:21, Renato Golin wrote: > On 30 October 2013 18:40, Frank Winter <fwinter at jlab.org > <mailto:fwinter at jlab.org>> wrote: > > const std::uint64_t ir0 = (i+0)%4; // not working > > > I thought this would be the case when I saw the original expression. > Maybe we need to teach modular arithmetic to SCEV? > > --renato ...

2014 Aug 07

[LLVMdev] MCJIT generates MOVAPS on unaligned address

...would fix your issue though, because that would mean we return the wrong alignment not none. > > If the call below returns 0 then something has gone wrong in setting up the data layout in your compilation pipeline. > > >> On Aug 7, 2014, at 2:18 PM, Frank Winter <fwinter at jlab.org> wrote: >> >> It's not reproducible with 'opt'. I call the SLP pass from my application and only then the wrong IR gets generated. >> >> On the attached module I call via the function pass manager: >> >> 1) TargetLibraryInfo with the target...

2014 Aug 07

[LLVMdev] MCJIT generates MOVAPS on unaligned address

...This produces the wrong IR. On the other hand running the attached module through 'opt -slp-vectorizer' results in no code changes. What could I be missing here? Frank On 08/07/2014 04:29 PM, Arnold Schwaighofer wrote: >> On Aug 7, 2014, at 12:42 PM, Frank Winter <fwinter at jlab.org> wrote: >> >> MCJIT when lowering to x86-64 generates a MOVAPS (Move Aligned Packed Single-Precision Floating-Point Values) on a non-aligned memory address: >> >> movaps 88(%rdx), %xmm0 >> >> where %rdx comes in as a function argument with only natu...

2015 Jul 01

[LLVMdev] SLP vectorizer on AVX feature

Frank, It sounds like the SLP vectorizer thinks that it is more profitable to use 128bit wide operations (because 256bit operations are double pumped on Sandybridge). Did you see a different result on Haswell? Thanks, Nadav > On Jul 1, 2015, at 11:06 AM, Frank Winter <fwinter at jlab.org> wrote: > > I realized that the function parameters had no alignment attributes on them. However, even adding an alignment suitable for aligned loads on YMM, i.e. 32 bytes, didn't convince the vectorizer to use [8 x float]. > > define void @main(i64 %lo, i64 %hi, float* noa...

2014 Aug 08

[LLVMdev] How to broaden the SLP vectorizer's search

...hm. You can try to increase this threshold to 128 and see if it helps. I also agree with Renato and Chad that adding a flag to tell the SLP-vectorizer to put more effort (compile time) into the problem is a good idea. Thanks, Nadav > On Aug 8, 2014, at 8:27 AM, Frank Winter <fwinter at jlab.org> wrote: > > I changed the max. recursion depth to 36, and then tried 1000 (from the original value of 12) and it did not improve SLP's optimization capabilities on my input function. For example, the attached function is (by design) perfectly vectorizable into 4-packed single prec...

2013 Nov 12

[LLVMdev] Limit loop vectorizer to SSE

...s loop! LV: Found trip count: 4 LV: The Widest type: 64 bits. LV: The Widest register is: 256 bits. LV: Using user VF 4. Looks like I have to disable AVX somehow. (Which is sad on its own.) Frank On 12/11/13 10:34, Renato Golin wrote: > On 12 November 2013 15:14, Frank Winter <fwinter at jlab.org > <mailto:fwinter at jlab.org>> wrote: > > I am asking because the option 'force-vector-width' is too > restrictive. > I would like to leave open the possibility to use vector width 2. > > > I was about to say that, and you saved us both one...

2013 Nov 12

[LLVMdev] Limit loop vectorizer to SSE

On 12/11/13 11:01, Renato Golin wrote: > On 12 November 2013 15:53, Frank Winter <fwinter at jlab.org > <mailto:fwinter at jlab.org>> wrote: > > .. forcing the vector size to 4 does not prevent using AVX. > > > Sure. That's more for tests than anything else. > > So, there are ways of disabling stuff in Clang, for instance > "-mattr=-avx" or...

2013 Nov 12

[LLVMdev] Limit loop vectorizer to SSE

On 12 November 2013 16:05, Frank Winter <fwinter at jlab.org> wrote: > engineBuilder.setMCPU(llvm::sys::getHostCPUName()); > Try: engineBuilder.setMAttrs("-avx"); --renato