thr3ads.net - search: "znver2"

2019 Sep 27

2

Question on target-features

Hi, In the "target-features" list in LLVM IR, there are "+feature" and "-feature" entries. My question is: is "-feature" equivalent to not specifying a feature at all? For example: attributes #0 = { "target-cpu"="znver2" "target-features"="+avx -avx2" } Is that equivalent to omitting avx2 from the list? attributes #0 = { "target-cpu"="znver2" "target-features"="+avx" } -- Regards, DTharun
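The question can be made concrete with a small model. The following is only a sketch, not LLVM code: it assumes the usual behaviour that a target CPU supplies a default feature set and that "+name" / "-name" entries are applied on top of it. The `CPU_DEFAULTS` table and `resolve_features` are hypothetical names with illustrative contents, and the feature string is written comma-separated (`+avx,-avx2`), which is the form LLVM feature strings normally take. Under those assumptions, "-avx2" and "avx2 omitted" differ whenever the CPU enables avx2 by default.

```python
# Hypothetical per-CPU default feature sets (illustrative, NOT LLVM's real tables).
CPU_DEFAULTS = {
    "znver2": {"sse2", "avx", "avx2"},
    "generic": {"sse2"},
}


def resolve_features(cpu, feature_string):
    """Return the set of enabled features for a CPU plus a '+x,-y' string.

    Simplified model: start from the CPU's defaults, then apply each entry
    in order. Implied-feature handling is deliberately left out.
    """
    enabled = set(CPU_DEFAULTS.get(cpu, set()))
    for token in feature_string.split(","):
        token = token.strip()
        if not token:
            continue
        sign, name = token[0], token[1:]
        if sign == "+":
            enabled.add(name)      # explicitly turn the feature on
        elif sign == "-":
            enabled.discard(name)  # explicitly turn it off, even if the CPU implies it
        else:
            raise ValueError(f"feature entry must start with + or -: {token!r}")
    return enabled


with_minus = resolve_features("znver2", "+avx,-avx2")
without = resolve_features("znver2", "+avx")
print(sorted(with_minus))
print(sorted(without))
```

In this model the two attribute sets only coincide for a CPU whose defaults do not already include avx2 (the hypothetical "generic" entry here).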

Question on target-features

2019 Sep 27

3

Question on target-features

...[llvm-dev] Question on target-features Hi, In the "target-features" list in LLVM IR, there are "+feature" and "-feature" entries. My question is: is "-feature" equivalent to not specifying a feature at all? For example: attributes #0 = { "target-cpu"="znver2" "target-features"="+avx -avx2" } Is that equivalent to omitting avx2 from the list? attributes #0 = { "target-cpu"="znver2" "target-features"="+avx" } -- Regards, DTharun

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

2020 Jul 16

2

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

Tried a bunch of them there (x86-64, haswell, znver2) and they all defaulted to 4-wide - haswell additionally caused some extra loop unrolling but still with 8-wide pows. Cheers, -Neil. On Thu, Jul 16, 2020 at 2:39 PM Roman Lebedev <lebedev.ri at gmail.com> wrote: > Did you specify the target CPU the code should be optimized for? > For...

Generating object files more efficiently

2019 Mar 23

2

Generating object files more efficiently

...aswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake, cannonlake, icelake-client, icelake-server, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2, x86-64 ________________________________ From: Doerfert, Johannes <jdoerfert at anl.gov> Sent: Saturday, March 23, 2019 1:15 PM To: J S Cc: via llvm-dev Subject: Re: [llvm-dev] Generating object files more efficiently I would have guessed: object: clang -c foo.c -o foo.o -march=XYZ...

Generating object files more efficiently

2019 Mar 23

4

Generating object files more efficiently

...aswell, core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake, cannonlake, icelake-client, icelake-server, knl, knm, k8, athlon64, athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2, x86-64 ________________________________ From: Doerfert, Johannes <jdoerfert at anl.gov> Sent: Saturday, March 23, 2019 1:15 PM To: J S Cc: via llvm-dev Subject: Re: [llvm-dev] Generating object files more efficiently I would have guessed: object: clang -c foo.c -o foo.o -march=XYZ...

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

2020 Jul 16

4

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

...e are > doing those libcalls, then it's not clear to me how anything else in the > loop matters for performance. > > On Thu, Jul 16, 2020 at 10:20 AM Neil Henning via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > >> Tried a bunch of them there (x86-64, haswell, znver2) and they all >> defaulted to 4-wide - haswell additionally caused some extra loop unrolling >> but still with 8-wide pows. >> >> Cheers, >> -Neil. >> >> On Thu, Jul 16, 2020 at 2:39 PM Roman Lebedev <lebedev.ri at gmail.com> >> wrote: >>...

Generating object files more efficiently

2019 Mar 23

2

Generating object files more efficiently

...vx2, broadwell, skylake, skylake-avx512, skx, cascadelake, > cannonlake, icelake-client, icelake-server, knl, knm, k8, athlon64, > athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10, > barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, > znver2, > x86-64 > > > ------------------------------ > *From:* Doerfert, Johannes <jdoerfert at anl.gov> > *Sent:* Saturday, March 23, 2019 1:15 PM > *To:* J S > *Cc:* via llvm-dev > *Subject:* Re: [llvm-dev] Generating object files more efficiently > > I woul...

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

2020 Jul 16

2

LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target

Hey list, I've recently done the first test run of bumping our Burst compiler from LLVM 10 -> 11 now that the branch has been cut, and have noticed an apparent loop vectorization codegen regression for X86 with AVX or AVX2 enabled. The following IR example is vectorized 4 wide with LLVM 11 and trunk, whereas LLVM 10 vectorized it 8 wide (correct, as per what we want), matching the

Generating object files more efficiently

2019 Mar 23

2

Generating object files more efficiently

Currently I compile my C code in 2 steps in order to generate .o files: first "clang -emit-llvm -c foo.c -o foo.bc", then "llc -march=XYZ foo.bc -filetype=obj". Is there a way to generate either .o or .elf files in just 1 command? Thanks.
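For comparison, here is a minimal sketch that builds both command lines as argument lists. The two-step form is the one quoted in the question; the one-step form is the command Johannes Doerfert guesses in his reply in this thread (`clang -c foo.c -o foo.o -march=XYZ`), not something verified here. The helper names `two_step_commands` and `one_step_command` are hypothetical, and `XYZ` stays a placeholder exactly as in the thread.

```python
def two_step_commands(source, march):
    """Argument lists for the two-step flow from the question:
    C source -> LLVM bitcode with clang, then bitcode -> object file with llc."""
    stem = source.rsplit(".", 1)[0]
    bitcode = stem + ".bc"
    return [
        ["clang", "-emit-llvm", "-c", source, "-o", bitcode],
        ["llc", f"-march={march}", bitcode, "-filetype=obj"],
    ]


def one_step_command(source, march):
    """Argument list for the single clang invocation suggested in the reply."""
    stem = source.rsplit(".", 1)[0]
    return ["clang", "-c", source, "-o", stem + ".o", f"-march={march}"]


# "XYZ" is the placeholder used in the thread, not a real target name.
for argv in two_step_commands("foo.c", "XYZ"):
    print(" ".join(argv))
print(" ".join(one_step_command("foo.c", "XYZ")))
```

Each list could be handed to `subprocess.run` unchanged once `XYZ` is replaced with a value the installed tools accept.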
