Displaying 10 results from an estimated 10 matches for "_mm_".
2009 Jan 31
0
[LLVMdev] Optimized code analysis problems
...y to extract the names
> through any of the passes.
> Where could I potentially insert a hack so that any function call to
> intrinsic functions or library functions can be retrieved?
> Could you gimme any ideas for the start?
Basically, there is no mapping from the llvm.* names to the _mm_*
names; the transformation is lossy.
You have a couple options here: one is to manipulate the source to let
you see the _mm_ names, and the other is to catch the _mm_ names
before the inliner runs.
Manipulating the source isn't actually very hard, although it's a
non-trivial amount of wor...
2009 Jan 31
2
[LLVMdev] Optimized code analysis problems
...pun
On Fri, Jan 30, 2009 at 10:39 PM, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Fri, Jan 30, 2009 at 7:10 PM, Nipun Arora <nipun2512 at gmail.com> wrote:
> > Essentially I would like to extract the control flow graph representation
> > with function names (e.g. _mm_cvtsi32_si128) instead of the functions being
> > replaced by 'llvm.*'
> > Is there any way to extract these names directly as function calls?
>
> The names disappear in an unrecoverable way once the first inlining
> pass runs to take care of always_inline. You migh...
2009 Feb 01
1
[LLVMdev] Optimized code analysis problems
...> through any of the passes.
> > Where could I potentially insert a hack so that any function call to
> > intrinsic functions or library functions can be retrieved?
> > Could you gimme any ideas for the start?
>
> Basically, there is no mapping from the llvm.* names to the _mm_*
> names; the transformation is lossy.
>
> You have a couple options here: one is to manipulate the source to let
> you see the _mm_ names, and the other is to catch the _mm_ names
> before the inliner runs.
>
> Manipulating the source isn't actually very hard, although it's...
2016 Jan 23
3
how to force llvm generate gather intrinsic
...> bit vectors with AVX2 (not just AVX512).
>
> I looked at this for the first time today, so I may be missing something...
>
> So for the moment, the answer to your question is 'no'; there's no generic
> way to produce these instructions. You should be able to use the _mm_*
> intrinsics in C though.
>
>
>
>
> On Fri, Jan 22, 2016 at 5:00 PM, zhi chen via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> I used clang -O3 -c -emit-llvm on the following code to generate bitcode,
>> say a.bc. I read the .l...
2016 Jan 23
2
how to force llvm generate gather intrinsic
Hi,
I used clang -O3 -c -emit-llvm on the following code to generate bitcode,
say a.bc. I read the .ll file and didn't see any gather intrinsic. Also, I
used opt -O3 -mcpu=core-avx2/-mcpu=skx, but there is still no gather
intrinsic generated.
int foo(int A[800], int B[800], int C[800]) {
  for (int i = 0; i < 800; i++) {
    A[B[i]] = i + 5;
  }
  for (int i = 0; i < 800;
2016 Jan 14
2
RFC: non-temporal fencing in LLVM IR
...mfence). Other languages have private stacks where this isn't
> > an issue, and where the stack top can reasonably be assumed to be in
> > cache.
>
> > How will this affect non-user-mode code (i.e. kernel code)?
>
> > Kernel code still has to ask for _mm_mfence if it wants mfence: C11
> > and C++11 barriers aren't specified as a specific instruction.
>
> > Is it safe to access top-of-stack?
>
> > AFAIK yes, and the ABI-specified red zone has our back (or front if
> > the stack grows up ☻).
>
>...
2016 Jan 23
2
how to force llvm generate gather intrinsic
...) should be legal for 128/256 bit vectors with AVX2 (not just AVX512).
I looked at this for the first time today, so I may be missing something...
So for the moment, the answer to your question is 'no'; there's no generic way to produce these instructions. You should be able to use the _mm_* intrinsics in C though.
On Fri, Jan 22, 2016 at 5:00 PM, zhi chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Hi,
I used clang -O3 -c -emit-llvm on the following code to generate bitcode, say a.bc. I read the .ll file and didn't see any gat...
2016 Jan 13
4
RFC: non-temporal fencing in LLVM IR
...r a locked
top-of-stack idempotent operation is better than mfence). Other languages
have private stacks where this isn't an issue, and where the stack top can
reasonably be assumed to be in cache.
*How will this affect non-user-mode code (i.e. kernel code)?*
Kernel code still has to ask for _mm_mfence if it wants mfence: C11 and
C++11 barriers aren't specified as a specific instruction.
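To illustrate the distinction drawn in that answer, here is a minimal sketch in C. It is not taken from the RFC: the helper names full_fence_explicit, full_fence_portable and store_then_fence are hypothetical, and the __SSE2__ guard is an assumption about a GCC/Clang-style toolchain. The first helper asks for the mfence instruction by name through _mm_mfence; the second asks only for a sequentially consistent fence and leaves the instruction choice to the compiler.

```c
#include <stdatomic.h>
#if defined(__SSE2__)
#include <emmintrin.h>  /* declares _mm_mfence on x86 targets with SSE2 */
#endif

/* Hypothetical helper: explicitly request the mfence instruction. */
static inline void full_fence_explicit(void) {
#if defined(__SSE2__)
    _mm_mfence();  /* x86 only: names the instruction directly */
#else
    /* Assumption: on non-x86 targets fall back to the portable fence. */
    atomic_thread_fence(memory_order_seq_cst);
#endif
}

/* Hypothetical helper: C11 fence; the standard does not say which
   instruction (if any) implements it. */
static inline void full_fence_portable(void) {
    atomic_thread_fence(memory_order_seq_cst);
}

/* Tiny usage example: store a value, fence, then read it back. */
static inline int store_then_fence(int value) {
    static int slot;
    slot = value;
    full_fence_explicit();
    return slot;
}
```

Code that depends on mfence specifically would use the first form; code that only needs the ordering guarantee can use the second and accept whatever lowering the compiler picks.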
*Is it safe to access top-of-stack?*
AFAIK yes, and the ABI-specified red zone has our back (or front if the
stack grows up ☻).
*What about non-x86 architectures?*
Architectures such as ARMv8 supp...
2019 Dec 21
13
[PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
This patchset converts the intel iommu driver to the dma-iommu api.
While converting the driver I exposed a bug in the intel i915 driver which causes a huge amount of artifacts on the screen of my laptop. You can see a picture of it here:
https://github.com/pippy360/kernelPatches/blob/master/IMG_20191219_225922.jpg
This issue is most likely in the i915 driver and is most likely caused by the