thr3ads.net - search: "x86_mmx"

2011 Oct 26

2

[LLVMdev] Lowering to MMX

...of data in and out of the lower half of an MMX registers). Take for example the following LLVM IR: define internal void @unpack(i8*, i8*) { %3 = bitcast i8* %1 to i32* %4 = load i32* %3, align 1 %5 = insertelement <2 x i32> undef, i32 %4, i32 0 %6 = bitcast <2 x i32> %5 to x86_mmx %7 = call x86_mmx @llvm.x86.mmx.punpcklbw(x86_mmx %6, x86_mmx %6) %8 = bitcast i8* %0 to x86_mmx* store x86_mmx %7, x86_mmx* %8, align 1 ret void } declare x86_mmx @llvm.x86.mmx.punpcklbw(x86_mmx, x86_mmx) nounwind readnone Which gives me the following assembly code: push ebp...

2011 Oct 26

0

[LLVMdev] Lowering to MMX

...the lower half of an MMX registers). Take for example the following LLVM IR: > > define internal void @unpack(i8*, i8*) { > %3 = bitcast i8* %1 to i32* > %4 = load i32* %3, align 1 > %5 = insertelement <2 x i32> undef, i32 %4, i32 0 > %6 = bitcast <2 x i32> %5 to x86_mmx > %7 = call x86_mmx @llvm.x86.mmx.punpcklbw(x86_mmx %6, x86_mmx %6) > %8 = bitcast i8* %0 to x86_mmx* > store x86_mmx %7, x86_mmx* %8, align 1 > ret void > } > declare x86_mmx @llvm.x86.mmx.punpcklbw(x86_mmx, x86_mmx) nounwind readnone > > Which gives me the following a...

2015 Jan 28

2

[LLVMdev] RFC: generation of PSAD instruction

...i8* %pix1, i64 8 ..... The test case is a perfect example for the generation of Sum of Absolute Difference (SAD) instruction which is present in most of the targets. The proposed generated IR is : ( directly x86 intrinsic is called just to demonstrate the difference ) ..... %6 = bitcast i8* %5 to x86_mmx* %wide.load8.1 = load x86_mmx* %6, align 1 %7 = getelementptr inbounds i8* %pix2, i64 8 %8 = bitcast i8* %7 to x86_mmx* %wide.load79.1 = load x86_mmx* %8, align 1 %9 = call x86_mmx @llvm.x86.mmx.psad.bw(x86_mmx %wide.load8.1, x86_mmx %wide.load79.1) %10 = bitcast x86_mmx %9 to i64 %11...

2015 Jan 28

5

[LLVMdev] RFC: generation of PSAD instruction

...of Sum of >> Absolute Difference (SAD) instruction which is present in most of >> the targets. The proposed generated IR is : ( directly x86 intrinsic >> is called just to demonstrate the difference ) >> >> >> >> ….. >> >> %6 = bitcast i8* %5 to x86_mmx* >> >> %wide.load8.1 = load x86_mmx* %6, align 1 >> >> %7 = getelementptr inbounds i8* %pix2, i64 8 >> >> %8 = bitcast i8* %7 to x86_mmx* >> >> %wide.load79.1 = load x86_mmx* %8, align 1 >> >> %9 = call x86_mmx @llvm.x86.mmx.psad.bw(x86_mm...

2011 Oct 25

0

[LLVMdev] Lowering to MMX

On Oct 20, 2011, at 8:42 AM, Nicolas Capens wrote: > Hi all, > > I'm working on a graphics project which uses LLVM for dynamic code > generation, and I noticed a major performance regression when upgrading > from LLVM 2.8 to 3.0-rc1 (LLVM 2.9 didn't support Win64 so I skipped it > entirely). > > I found out that the performance regression is due to removing

2011 Oct 20

4

[LLVMdev] Lowering to MMX

Hi all, I'm working on a graphics project which uses LLVM for dynamic code generation, and I noticed a major performance regression when upgrading from LLVM 2.8 to 3.0-rc1 (LLVM 2.9 didn't support Win64 so I skipped it entirely). I found out that the performance regression is due to removing support for lowering 64-bit vector operations to MMX, and using SSE2 instead. My code uses a

2019 Jul 02

3

RFC: Complex in LLVM

"Finkel, Hal J." <hfinkel at anl.gov> writes: > I think that it's really important that we're specific about the goals > here. Exactly what kinds of optimizations are we aiming to (more-easily) > enable? There certainly exists hardware with instructions that help > vectorize complex multiplication, for example, and having a builtin > complex type would

2019 May 24

2

[RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

...to the > type record. If the field is not present the type will default to a fixed-length > vector type, preserving backwards compatibility. > > Alternatives Considered > ----------------------- > > We did consider one main alternative -- a dedicated target type, like the > x86_mmx type. > > A dedicated target type would either need to extend all existing passes that > work with vectors to recognize the new type, or to duplicate all that code > in order to get reasonable code generation and autovectorization. > > This hasn't been done for the x86_mmx typ...

2018 Jul 30

5

[RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

...o the > type record. If the field is not present the type will default to a fixed-length > vector type, preserving backwards compatibility. > > Alternatives Considered > ----------------------- > > We did consider one main alternative -- a dedicated target type, like the > x86_mmx type. > > A dedicated target type would either need to extend all existing passes that > work with vectors to recognize the new type, or to duplicate all that code > in order to get reasonable code generation and autovectorization. > > This hasn't been done for the x86_mmx t...

2018 Jun 05

14

[RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

...bitcode, a new boolean field is added to the type record. If the field is not present the type will default to a fixed-length vector type, preserving backwards compatibility. Alternatives Considered ----------------------- We did consider one main alternative -- a dedicated target type, like the x86_mmx type. A dedicated target type would either need to extend all existing passes that work with vectors to recognize the new type, or to duplicate all that code in order to get reasonable code generation and autovectorization. This hasn't been done for the x86_mmx type, and so it is only capable...

2019 May 24

2

[EXT] Re: [RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

...bitcode, a new boolean field is added to the type record. If the field is not present the type will default to a fixed-length vector type, preserving backwards compatibility. Alternatives Considered ----------------------- We did consider one main alternative -- a dedicated target type, like the x86_mmx type. A dedicated target type would either need to extend all existing passes that work with vectors to recognize the new type, or to duplicate all that code in order to get reasonable code generation and autovectorization. This hasn't been done for the x86_mmx type, and so it is only capable...

2019 May 27

2

[EXT] Re: [RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

...bitcode, a new boolean field is added to the type record. If the field is not present the type will default to a fixed-length vector type, preserving backwards compatibility. Alternatives Considered ----------------------- We did consider one main alternative -- a dedicated target type, like the x86_mmx type. A dedicated target type would either need to extend all existing passes that work with vectors to recognize the new type, or to duplicate all that code in order to get reasonable code generation and autovectorization. This hasn't been done for the x86_mmx type, and so it is only capable...

2019 Jun 03

2

[EXT] Re: [RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

...to the > type record. If the field is not present the type will default to a fixed-length > vector type, preserving backwards compatibility. > > Alternatives Considered > ----------------------- > > We did consider one main alternative -- a dedicated target type, like the > x86_mmx type. > > A dedicated target type would either need to extend all existing passes that > work with vectors to recognize the new type, or to duplicate all that code > in order to get reasonable code generation and autovectorization. > > This hasn't been done for the x86_mmx typ...

2018 Jul 30

7

[RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

...to a fixed-length > > vector type, preserving backwards compatibility. > > > > Alternatives Considered > > ----------------------- > > > > We did consider one main alternative -- a dedicated target type, > like the > > x86_mmx type. > > > > A dedicated target type would either need to extend all existing > passes that > > work with vectors to recognize the new type, or to duplicate all > that code > > in order to get reasonable code generation and autovectorization. ...

2019 Mar 08

2

Scalable Vector Types in IR - Next Steps?

Hi folks, We seem to be converging on how the representation of scalable vectors will be implemented in IR, and we also have support for such vectors in the AArch64 back-end. We're also fresh out of the release process and have a good number of months to hash out potential problems until next release. What are the next steps to get this merged into trunk? Given this is a major change to IR,

2018 Jul 02

3

[RFC][SVE] Supporting SIMD instruction sets with variable vector lengths

...to the > type record. If the field is not present the type will default to a fixed-length > vector type, preserving backwards compatibility. > > Alternatives Considered > ----------------------- > > We did consider one main alternative -- a dedicated target type, like the > x86_mmx type. > > A dedicated target type would either need to extend all existing passes that > work with vectors to recognize the new type, or to duplicate all that code > in order to get reasonable code generation and autovectorization. > > This hasn't been done for the x86_mmx typ...

2017 Jul 06

2

[RFC][SVE] Supporting Scalable Vector Architectures in LLVM IR (take 2)

...es, as an extension to VectorType they can be manipulated and passed around like normal vectors, load/stored directly, phis, put in llvm structs etc. Address computation generates expressions in terms vscale and it seems to work well. > > I think that a target-specific type (e.g. like we have X86_mmx) is the only reasonable alternative. A subclass of VectorType is just another implementation approach of your design above. This is assuming that scalable vectors are really first class types. > > The pros and cons of a separate type is that it avoids you having to touch everything that tou...

2011 May 26

0

[LLVMdev] x86 SSE4.2 CRC32 intrinsics renamed

...llvm.x86.sse42.crc32.64.64"; > + } > + } > + if (NewFnName) { > + F->setName(NewFnName); > + NewFn = F; > + return true; > + } > + } > + > // This fixes all MMX shift intrinsic instructions to take a > // x86_mmx instead of a v1i64, v2i32, v4i16, or v8i8. > if (Name.compare(5, 8, "x86.mmx.", 8) == 0) { > > Modified: llvm/trunk/test/CodeGen/X86/sse42.ll > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/sse42.ll?rev=132163&r1=132162&r2=132163&view=di...

2000 Nov 15

8

Optimisations

Looking through the archives I have seen talk of making CPU specific optimisations for Vorbis, a la MMX/3DNow!/SSE. The feeling I gather is to wait until something is working well in C before committing to any kind of specific optimisation. What if oft used and needed DSP functions were identified and standardised DSP functionality be written for Vorbis? This would separate the basically

2017 Jun 01

4

[RFC][SVE] Supporting Scalable Vector Architectures in LLVM IR (take 2)

Hi, Here's the updated RFC for representing scalable vector types and associated constants in IR. I added a section to address questions that came up on the recent patch review. -Graham =================================================== Supporting Scalable Vector Architectures in LLVM IR =================================================== ========== Background ========== *ARMv8-A

search for: x86_mmx