thr3ads.net - similar to: "AVX2 / 3DNow."

2015 Jul 24

[LLVMdev] SIMD for sdiv <2 x i64>

> On 24.07.2015, at 08:06, zhi chen <zchenhn at gmail.com> wrote: > > It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there is no alternative AVX/2 instructions for int64? The same thing also happens to zext <2 x i32> -> <2 x

2016 Apr 12

X86 TRUNCATE cost for AVX & AVX2 mode

<Copied Cong> Thanks Elena. Mostly I was interested in why such a high cost of 30 is kept for TRUNCATE v16i32 to v16i8 in SSE41. Looking at the code, it appears that TRUNCATE v16i32 to v16i8 in SSE41 is very expensive compared with SSE2. I feel this number should be the same as, or close to, the cost given for the same operation in SSE2ConversionTbl. The patch below from Cong Hou reduces the cost for the same operation in SSE2

2015 May 04

[LLVMdev] AVX2 Cost Table in X86TargetTransformInfo

Hi all, I have a query regarding the cost table for AVX2 in TargetTransformInfo. The table consists of entries for shift and div operations only. There are no entries for ADD, SUB and MUL in the AVX2 cost table. Those entries are present in the cost table for AVX. The reason for the query is: when my subtarget feature is AVX2, in SLP vectorization, while calculating the scalar cost of ADD, it doesn't see

2015 May 04

[LLVMdev] AVX2 Cost Table in X86TargetTransformInfo

Thanks Nadav for the info. It clears up my query :) Yes, it's an integer ADD, and since AVX2 supports 256-bit integer arithmetic, its cost is lower than on AVX1. One query though: shouldn't the cost of integer ADD/SUB/MUL (which would be 1) then be explicitly specified in the AVX2 cost table? Right now this entry is missing and the cost of these operations is taken from BaseTTI (which is

2017 Jan 27

Re: LibVirt query CPU Model support and restore operation

Hello, thanks for the comments. I have now tried the force options for the CPU flags that were not supported. The command with the not fully supported CPU model now gets executed, but I am surprised to see that the guest CPU model has not changed and is still the same as the host CPU model (Sandy Bridge). Why don't I see the model as Haswell now? Could you please comment? Command used: virt-install

2015 Jul 24

[LLVMdev] SIMD for sdiv <2 x i64>

It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there is no alternative AVX/2 instructions for int64? The same thing also happens to zext <2 x i32> -> <2 x i64> and trunc <2 x i64> -> <2 x i32>. Any ideas to optimize these

2016 Jan 18

Lets do a 1.3.2 release

Dave Yeo wrote: > The nature of the error implies AVX2 support that is missing but I'm not > much up on assembly, > > Best to be safe so updated patch attached. > I've also opened a ticket, http://trac.netlabs.org/rpm/ticket/165#ticket > Dave I cannot find information what version of binutils supports AVX/AVX2/FMA instructions, but IIRC OS/2 doesn't support AVX

2020 Jul 13

New x86-64 micro-architecture levels

On Fri, Jul 10, 2020 at 11:45 PM H.J. Lu via Gcc <gcc at gcc.gnu.org> wrote: > > On Fri, Jul 10, 2020 at 10:30 AM Florian Weimer <fweimer at redhat.com> wrote: > > > > Most Linux distributions still compile against the original x86-64 > > baseline that was based on the AMD K8 (minus the 3DNow! parts, for Intel > > EM64T compatibility). > > > >

2013 Dec 12

[LLVMdev] AVX code gen

It probably does not pick the right processor architecture. You could try “clang -mavx” or “clang -march=corei7-avx” for Ivy Bridge and “clang -march=core-avx2” or “clang -mavx2” for Haswell. $ clang -march=core-avx2 -O3 -S -o - test.c .section __TEXT,__text,regular,pure_instructions .globl _f .align 4, 0x90 _f: ## @f

2017 Jan 28

libvirt does not show same CPU Model as /proc/cpuinfo for CPU Model info.

Hi, I created a new thread. Environment: bare-metal server + CentOS with QEMU/KVM + libvirt for virtualization. Guest instantiated with virt-install with a forced CPU model, as below: virt-install --virt-type kvm --name compute-0 --cpu Haswell,+fma,+movbe,+fsgsbase,+bmi1,+hle,+avx2,+smep,+bmi2,+erms,+invpcid,+rtm --ram=61440 --vcpus=20 --os-type=linux --os-variant=generic After guest installation

2020 Sep 14

Re: [ovirt-users] Re: Testing ovirt 4.4.1 Nested KVM on Skylake-client (core i5) does not work

On Mon, Sep 14, 2020 at 8:42 AM Yedidyah Bar David <didi@redhat.com> wrote: > > On Mon, Sep 14, 2020 at 12:28 AM wodel youchi <wodel.youchi@gmail.com> wrote: > > > > Hi, > > > > Thanks for the help, I think I found the solution using this link : https://www.berrange.com/posts/2018/06/29/cpu-model-configuration-for-qemu-kvm-on-x86-hosts/ > > > >

2020 Jul 10

New x86-64 micro-architecture levels

Most Linux distributions still compile against the original x86-64 baseline that was based on the AMD K8 (minus the 3DNow! parts, for Intel EM64T compatibility). There has been an attempt to use the existing AT_PLATFORM-based loading mechanism in the glibc dynamic linker to enable a selection of optimized libraries. But the general selection mechanism in glibc is problematic: hwcaps

2014 Dec 15

[LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets

AFAIK, there is no additional penalty for AMD processors. From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth Sent: Monday, December 15, 2014 3:57 AM To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets FWIW, this makes sense to me. I'd be interested to hear from

2016 Feb 26

how to force llvm generate gather intrinsic

If I'm understanding correctly, you're saying that vgather* is slow on all of Excavator, Haswell, Broadwell, and Skylake (client). Therefore, we will not generate it for any of those machines. Even if that's true, we should not define "gatherIsSlow()" as "hasAVX2() && !hasAVX512()". It could break for some hypothetical future processor that manages to
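
The objection in this snippet can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Cpu` class, its fields and the "future" processor are made up for illustration and are not LLVM's actual subtarget API. It contrasts inferring gather speed from ISA feature bits with recording it as an explicit per-processor property.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """Hypothetical CPU description; none of these names come from LLVM."""
    name: str
    has_avx2: bool
    has_avx512: bool
    slow_gather: bool  # explicit per-processor tuning property


def gather_is_slow_inferred(cpu: Cpu) -> bool:
    # The definition the post argues against: infer speed from ISA features.
    return cpu.has_avx2 and not cpu.has_avx512


def gather_is_slow_explicit(cpu: Cpu) -> bool:
    # Alternative: consult a property that is set per processor model.
    return cpu.slow_gather


# Haswell is marked slow here only because the post takes that as its premise.
haswell = Cpu("haswell", has_avx2=True, has_avx512=False, slow_gather=True)
# Made-up future part: AVX2 without AVX-512, but with a fast gather.
future = Cpu("hypothetical-future", has_avx2=True, has_avx512=False,
             slow_gather=False)

for cpu in (haswell, future):
    print(cpu.name, gather_is_slow_inferred(cpu), gather_is_slow_explicit(cpu))
```

Under these assumptions the two predicates agree for Haswell and disagree for the made-up future processor, which is the breakage the post warns about.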

2016 Apr 11

X86 TRUNCATE cost for AVX & AVX2 mode

Hi, I was going through X86TTIImpl::getCastInstrCost and have a question about the cost calculation for the TRUNCATE instruction in AVX mode. Neither AVX2ConversionTbl nor AVXConversionTbl defines a cost for TRUNCATE v16i32 to v16i8, so the lookup falls back to SSE41ConversionTbl, where it finds a cost of 30 for this operation. A cost of 30 for this operation looks very high. Wondering why
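
The fallback described in this snippet can be sketched in Python as a layered table lookup. This is an illustrative model only, not LLVM's implementation: the table variables, the `cast_cost` helper and the string keys are assumptions, and the only value taken from the post is the SSE4.1 cost of 30.

```python
from typing import Dict, Optional, Set, Tuple

Key = Tuple[str, str, str]  # (operation, source type, destination type)

# Empty AVX2/AVX tables mirror the post's statement that neither defines
# a cost for this cast; the SSE4.1 cost of 30 is the figure the post quotes.
AVX2_TBL: Dict[Key, int] = {}
AVX_TBL: Dict[Key, int] = {}
SSE41_TBL: Dict[Key, int] = {("TRUNCATE", "v16i32", "v16i8"): 30}

# Tables are consulted from the most specific feature level downwards.
TABLES = [("AVX2", AVX2_TBL), ("AVX", AVX_TBL), ("SSE41", SSE41_TBL)]


def cast_cost(op: str, src: str, dst: str,
              enabled: Set[str]) -> Optional[Tuple[str, int]]:
    """Return (level, cost) from the first enabled table with an entry."""
    for level, table in TABLES:
        if level in enabled and (op, src, dst) in table:
            return level, table[(op, src, dst)]
    return None  # no table has an entry; a generic cost model would apply


print(cast_cost("TRUNCATE", "v16i32", "v16i8", {"AVX2", "AVX", "SSE41"}))
```

With all three levels enabled, the lookup skips the two empty tables and returns the SSE4.1 entry, which is how a target with AVX2 can end up with a cost tuned for an older instruction set.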

2013 Sep 12

[LLVMdev] [PATCH] Detect Haswell subarchitecture (i.e. using -march=native)

> That's far more worrying to me than not being able to detect Haswell. > I can't reproduce the problem here at the moment: both debug and > release builds give identical assembly for Host.cpp. OK. I know the reason you cannot reproduce it: before posting the patch I decided to check for AVX before checking for AVX2, so as not to query cpuid for AVX2 when we don't have AVX1 anyway.

2016 Feb 26

how to force llvm generate gather intrinsic

That makes great sense. It would be great if we had a profitability mode to decide whether gathers are worth using. It would also be good to have a compiler option that lets users make LLVM generate the gather instructions regardless of whether they are faster or slower. Best, Zhi On Fri, Feb 26, 2016 at 12:49 PM, Sanjay Patel <spatel at rotateright.com> wrote: > If I'm understanding

2016 Jan 18

Lets do a 1.3.2 release

Dave Yeo wrote: > Seems that the default binutils on OS/2 is too old to support AVX2, > attached patch works around this. Not the best solution as best would be > configure tests, but simple. Are you sure that these binutils support AVX and FMA? (Currently libFLAC doesn't contain AVX and FMA instructions). If they aren't supported then it's better to include them too into

2015 Jul 24

[LLVMdev] SIMD for sdiv <2 x i64>

On 07/24/2015 03:42 AM, Benjamin Kramer wrote: >> On 24.07.2015, at 08:06, zhi chen <zchenhn at gmail.com> wrote: >> >> It seems that it's hard to vectorize int64 in LLVM. For example, LLVM 3.4 generates very complicated code for the following IR. I am running on a Haswell processor. Is it because there is no alternative AVX/2 instructions for int64? The same thing

2016 Jan 19

Lets do a 1.3.2 release

Dave Yeo wrote: >> I cannot find information what version of binutils supports AVX/AVX2/FMA >> instructions, but IIRC OS/2 doesn't support AVX instructions anyway, >> so it doesn't matter much. > > Surprisingly, I've yet to have a report of an AVX related crash or trap > (used in FFmpeg and projects based on it, Mozilla, probably others). > As I
