Displaying 9 results from an estimated 9 matches for "znver2".
2019 Sep 27
2
Question on target-features
Hi,
In the "target-features" list in LLVM IR, there are "+feature" and
"-feature" entries. My question is: is "-feature" equivalent to not
specifying the feature at all?
For example:
attributes #0 = { "target-cpu"="znver2" "target-features"="+avx -avx2" }
Is that equivalent to omitting avx2 from the list?
attributes #0 = { "target-cpu"="znver2" "target-features"="+avx" }
--
Regards,
DTharun
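[Editorial sketch, not part of the original message.] The difference the question asks about can be illustrated with a small model in Python. It assumes, as I understand the usual behaviour, that "target-cpu" first enables the features the CPU implies and that the "target-features" entries are then applied on top, in order. The CPU table below is a hypothetical, heavily abbreviated stand-in (real CPUs imply many more features), resolve_features is an illustrative name rather than an LLVM API, and feature dependencies are not modelled. Feature strings in IR are normally comma-separated; the sketch accepts spaces too, to match the example above.

```python
# Simplified model only; this is NOT how LLVM is implemented.
# Hypothetical, abbreviated table of features implied by each CPU name.
CPU_DEFAULT_FEATURES = {
    "znver2": {"sse2", "avx", "avx2"},
    "x86-64": {"sse2"},
}


def resolve_features(cpu, feature_string):
    """Start from the CPU's implied features, then apply +/- entries in order."""
    enabled = set(CPU_DEFAULT_FEATURES.get(cpu, set()))
    # Accept both "+avx,-avx2" and "+avx -avx2".
    for entry in feature_string.replace(",", " ").split():
        sign, name = entry[0], entry[1:]
        if sign == "+":
            enabled.add(name)
        elif sign == "-":
            enabled.discard(name)
        else:
            raise ValueError(f"entry must start with '+' or '-': {entry!r}")
    return enabled


with_minus = resolve_features("znver2", "+avx -avx2")
without_entry = resolve_features("znver2", "+avx")

print(sorted(with_minus))     # ['avx', 'sse2']
print(sorted(without_entry))  # ['avx', 'avx2', 'sse2']
```

Under these assumptions the two attribute sets are not equivalent for znver2: omitting avx2 leaves it enabled through the CPU default, while "-avx2" switches it off. For a CPU that does not imply avx2 in the first place, the two forms would resolve to the same set in this model.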
2019 Sep 27
3
Question on target-features
...[llvm-dev] Question on target-features
Hi,
In the "target-features" list in LLVM IR, there are "+feature" and "-feature" entries. My question is: is "-feature" equivalent to not specifying the feature at all?
For example:
attributes #0 = { "target-cpu"="znver2" "target-features"="+avx -avx2" }
Is that equivalent to omitting avx2 from the list?
attributes #0 = { "target-cpu"="znver2" "target-features"="+avx" }
--
Regards,
DTharun
2020 Jul 16
2
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
Tried a bunch of them there (x86-64, haswell, znver2) and they all
defaulted to 4-wide - haswell additionally caused some extra loop unrolling
but still with 8-wide pows.
Cheers,
-Neil.
On Thu, Jul 16, 2020 at 2:39 PM Roman Lebedev <lebedev.ri at gmail.com> wrote:
> Did you specify the target CPU the code should be optimized for?
> For...
2019 Mar 23
2
Generating object files more efficiently
...aswell,
core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake,
cannonlake, icelake-client, icelake-server, knl, knm, k8, athlon64,
athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10,
barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2,
x86-64
________________________________
From: Doerfert, Johannes <jdoerfert at anl.gov>
Sent: Saturday, March 23, 2019 1:15 PM
To: J S
Cc: via llvm-dev
Subject: Re: [llvm-dev] Generating object files more efficiently
I would have guessed:
object:
clang -c foo.c -o foo.o -march=XYZ...
2019 Mar 23
4
Generating object files more efficiently
...aswell,
core-avx2, broadwell, skylake, skylake-avx512, skx, cascadelake,
cannonlake, icelake-client, icelake-server, knl, knm, k8, athlon64,
athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10,
barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1, znver2,
x86-64
________________________________
From: Doerfert, Johannes <jdoerfert at anl.gov>
Sent: Saturday, March 23, 2019 1:15 PM
To: J S
Cc: via llvm-dev
Subject: Re: [llvm-dev] Generating object files more efficiently
I would have guessed:
object:
clang -c foo.c -o foo.o -march=XYZ...
2020 Jul 16
4
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
...e are
> doing those libcalls, then it's not clear to me how anything else in the
> loop matters for performance.
>
> On Thu, Jul 16, 2020 at 10:20 AM Neil Henning via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Tried a bunch of them there (x86-64, haswell, znver2) and they all
>> defaulted to 4-wide - haswell additionally caused some extra loop unrolling
>> but still with 8-wide pows.
>>
>> Cheers,
>> -Neil.
>>
>> On Thu, Jul 16, 2020 at 2:39 PM Roman Lebedev <lebedev.ri at gmail.com>
>> wrote:
>>...
2019 Mar 23
2
Generating object files more efficiently
...vx2, broadwell, skylake, skylake-avx512, skx, cascadelake,
> cannonlake, icelake-client, icelake-server, knl, knm, k8, athlon64,
> athlon-fx, opteron, k8-sse3, athlon64-sse3, opteron-sse3, amdfam10,
> barcelona, btver1, btver2, bdver1, bdver2, bdver3, bdver4, znver1,
> znver2,
> x86-64
>
>
> ------------------------------
> *From:* Doerfert, Johannes <jdoerfert at anl.gov>
> *Sent:* Saturday, March 23, 2019 1:15 PM
> *To:* J S
> *Cc:* via llvm-dev
> *Subject:* Re: [llvm-dev] Generating object files more efficiently
>
> I woul...
2020 Jul 16
2
LLVM 11 and trunk selecting 4 wide instead of 8 wide loop vectorization for AVX-enabled target
Hey list,
I've recently done the first test run of bumping our Burst compiler from
LLVM 10 -> 11 now that the branch has been cut, and have noticed an
apparent loop vectorization codegen regression for X86 with AVX or AVX2
enabled. The following IR example is vectorized to 4 wide with LLVM 11 and
trunk whereas in LLVM 10 it (correctly as per what we want) vectorized it 8
wide matching the
2019 Mar 23
2
Generating object files more efficiently
Currently I compile my C code in two steps to generate .o files:
clang -emit-llvm -c foo.c -o foo.bc
llc -march=XYZ foo.bc -filetype=obj
Is there a way to generate either .o or .elf files in just 1 command?
Thanks.