thr3ads.net - search: "vpmaskmovd"

how to force llvm generate gather intrinsic

2016 Jan 23

2

how to force llvm generate gather intrinsic

...re ops for an AVX target. If we can fix scatter/gather similarly, that would be great. Can we legalize the same set of masked load/store operations for AVX1 as AVX2? If I'm understanding them correctly, the AVX1 FP instructions (vmaskmovps/pd) can be used in place of the AVX2 int instructions (vpmaskmovd/q), just with domain crossing penalties thrown in. I think we do this for other missing integer ops for an AVX1 target either in x86 lowering or in the tablegen patterns. Elena - I'm not too familiar with the vectorizers or scatter/gather, but I'll certainly take a look at D15690. Thanks f...
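The substitution described in this message can be sketched in C intrinsics. This is only an illustration under assumptions, not the LLVM lowering itself: the function names (`masked_load_epi32`, `masked_load_epi32_avx1`, `masked_load_epi32_scalar`) are hypothetical, the vector is fixed at eight 32-bit lanes, and the compiler is assumed to be GCC or Clang. A masked load only moves bits, so the AVX1 float instruction (`vmaskmovps`, reached here through `_mm256_maskload_ps`) can load integer lanes; the casts reinterpret bits and do not convert values.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>

/* AVX1 path (hypothetical helper): load eight int32 lanes through the float
   masked-load instruction. A lane is loaded when the sign bit of its mask
   element is set; masked-off lanes read as zero and are not accessed. */
__attribute__((target("avx")))
static void masked_load_epi32_avx1(const int32_t *src, const int32_t *mask,
                                   int32_t *dst)
{
    __m256i m = _mm256_loadu_si256((const __m256i *)mask);
    __m256  v = _mm256_maskload_ps((const float *)src, m); /* vmaskmovps */
    _mm256_storeu_si256((__m256i *)dst, _mm256_castps_si256(v));
}
#endif

/* Scalar reference with the same lane semantics. */
static void masked_load_epi32_scalar(const int32_t *src, const int32_t *mask,
                                     int32_t *dst)
{
    for (int i = 0; i < 8; ++i)
        dst[i] = (mask[i] < 0) ? src[i] : 0;
}

/* Dispatch: use the AVX1 path when the CPU reports AVX, else fall back. */
void masked_load_epi32(const int32_t *src, const int32_t *mask, int32_t *dst)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("avx")) {
        masked_load_epi32_avx1(src, mask, dst);
        return;
    }
#endif
    masked_load_epi32_scalar(src, mask, dst);
}
```

On an AVX2 target the integer form (`vpmaskmovd`) would avoid the domain-crossing penalty the message mentions; the float form trades that penalty for availability on AVX1.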

how to force llvm generate gather intrinsic

2016 Feb 24

0

how to force llvm generate gather intrinsic

> ...we can fix scatter/gather similarly, that would be great.
>
> Can we legalize the same set of masked load/store operations for AVX1 as AVX2? If I'm understanding them correctly, the AVX1 FP instructions (vmaskmovps/pd) can be used in place of the AVX2 int instructions (vpmaskmovd/q), just with domain crossing penalties thrown in. I think we do this for other missing integer ops for an AVX1 target either in x86 lowering or in the tablegen patterns.
>
> Elena - I'm not too familiar with the vectorizers or scatter/gather, but I'll certainly take a...

how to force llvm generate gather intrinsic

2016 Feb 25

2

how to force llvm generate gather intrinsic

>> ...r similarly, that would be great.
>>
>> Can we legalize the same set of masked load/store operations for AVX1 as AVX2? If I'm understanding them correctly, the AVX1 FP instructions (vmaskmovps/pd) can be used in place of the AVX2 int instructions (vpmaskmovd/q), just with domain crossing penalties thrown in. I think we do this for other missing integer ops for an AVX1 target either in x86 lowering or in the tablegen patterns.
>>
>> Elena - I'm not too familiar with the vectorizers or scatter/gather, but I'...

how to force llvm generate gather intrinsic

2016 Jan 23

2

how to force llvm generate gather intrinsic

Thanks, Sanjay, for highlighting this. A few days back I also faced a similar problem while generating a masked store in AVX1 mode, and found it is only supported under AVX2; otherwise we scalarize it.

> 1) I did not switch-on masked_load/store to AVX1, I can do this.

Yes, Elena, this should be supported for FP types in AVX1 mode (for INT types, I doubt x86 has a masked_load/store instruction in AVX1 mode).

[LLVMdev] Adding masked vector load and store intrinsics

2014 Oct 24

20

[LLVMdev] Adding masked vector load and store intrinsics

Hi,

We would like to add support for masked vector loads and stores by introducing new target-independent intrinsics. The loop vectorizer will then be enhanced to optimize loops containing conditional memory accesses by generating these intrinsics for existing targets such as AVX2 and AVX-512. The vectorizer will first ask the target about availability of masked vector loads and stores. The SLP
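To make "loops containing conditional memory accesses" concrete, here is a small sketch in portable C. It is an assumption-laden illustration, not the proposed intrinsics or the vectorizer's output: `VF`, `masked_load`, `masked_store`, and both `cond_update_*` functions are hypothetical names, and the "vector" operations are emulated lane by lane. The first function is the kind of loop the proposal targets; the second shows the same loop processed in blocks, where a per-lane mask decides which lanes read and write memory.

```c
#include <stddef.h>
#include <stdint.h>

#define VF 8 /* assumed vector width in lanes */

/* Original scalar loop: the load and store happen only when the test passes. */
void cond_update_scalar(int32_t *a, const int32_t *b,
                        const int32_t *trigger, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (trigger[i] > 0)
            a[i] = b[i] + 1;
}

/* Emulated masked load: masked-off lanes take the pass-through value and
   never touch memory. */
static void masked_load(int32_t *out, const int32_t *ptr,
                        const uint8_t *mask, const int32_t *passthru)
{
    for (int l = 0; l < VF; ++l)
        out[l] = mask[l] ? ptr[l] : passthru[l];
}

/* Emulated masked store: masked-off lanes leave memory unchanged. */
static void masked_store(int32_t *ptr, const int32_t *val,
                         const uint8_t *mask)
{
    for (int l = 0; l < VF; ++l)
        if (mask[l])
            ptr[l] = val[l];
}

/* Blocked form: compute a mask per block, then use masked load and store. */
void cond_update_masked(int32_t *a, const int32_t *b,
                        const int32_t *trigger, size_t n)
{
    size_t i = 0;
    for (; i + VF <= n; i += VF) {
        uint8_t mask[VF];
        int32_t vb[VF];
        int32_t zero[VF] = {0};
        for (int l = 0; l < VF; ++l)
            mask[l] = trigger[i + l] > 0;
        masked_load(vb, b + i, mask, zero);
        for (int l = 0; l < VF; ++l)
            vb[l] += 1;
        masked_store(a + i, vb, mask);
    }
    for (; i < n; ++i) /* scalar remainder */
        if (trigger[i] > 0)
            a[i] = b[i] + 1;
}
```

The per-lane loops inside the helpers are roughly what scalarization looks like when a target has no native masked instruction; with hardware support each helper would map to a single instruction.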
