Displaying 20 results from an estimated 500 matches similar to: "RFC phantom memory intrinsic"
2017 Sep 13
2
RFC phantom memory intrinsic
Hi Michael,
>Interesting approach but how do you handle more complex offsets, e.g., when the pointer is part of an aggregate? Only one offset does not seem enough to handle generic cases.
Yes, correct: this slightly modified example does not work.
#include <x86intrin.h>
__m256d vsht_d4_fold(const double* ptr, unsigned long long i) {
__m256d foo = (__m256d){ ptr[i], ptr[i+1],
2017 Sep 13
2
RFC phantom memory intrinsic
Hi Michael,
>I have a case where InstCombine removes a store and your approach would be
>valuable for me if the entire access to an aggregate could be restored.
Yes, no problem and we could add the aggregate pointer to this new
intrinsic and in my particular case I should ignore it, but I am
looking now at "speculation_marker" metadata and I am still not sure
how to implement it
2017 Sep 26
0
RFC phantom memory intrinsic
On 09/26/2017 08:31 AM, Dinar Temirbulatov wrote:
> Hi Hal,
>> Are you primarily concerned with being able to widen loads later in the pipeline? Could we attach metadata to the remaining loads indicating that it would be legal to widen them?
> No, I don't have any concerns about the intrinsic way of implementation,
> and the intrinsic way looks safer to me since we somehow detach our
2017 Sep 26
2
RFC phantom memory intrinsic
Hi Hal,
>Are you primarily concerned with being able to widen loads later in the pipeline? Could we attach metadata to the remaining loads indicating that it would be legal to widen them?
No, I don't have any concerns about the intrinsic way of implementation,
and the intrinsic way looks safer to me since we somehow detach our
information about memory from that actual load instruction. I updated
2017 Sep 26
0
RFC phantom memory intrinsic
On 09/13/2017 04:46 PM, Dinar Temirbulatov via llvm-dev wrote:
> Hi Michael,
>> I have a case where InstCombine removes a store and your approach would be
>> valuable for me if the entire access to an aggregate could be restored.
> Yes, no problem and we could add the aggregate pointer to this new
> intrinsic and in my particular case I should ignore it, but I am
> looking
2017 Jul 20
2
[RFC] dereferenceable metadata
Indeed. But the problem here is that Dinar is trying to keep information
after a load/store is removed by instcombine
For example:
v4sf v = {p[0], p[1], p[2], p[3]};
v4sf v2 = shuffle(v, 0, 0, 2, 2);
Some pass comes in and removes the p[3] and p[1].
Now you have smaller code, but lost the ability to use a vector load for
all those values + shuffle. The code got scalarized because we lost the
2017 Jul 18
2
[RFC] dereferenceable metadata
Hi,
While working on PR21780, I used "dereferenceable_or_null" metadata,
and I now realize that it is not correct for my solution to use this
metadata type, since it might point to an address that is not
dereferenceable but null. I think that we need another new metadata
type, something like "dereferenceable", with which we could annotate
any load (not just pointer type like
2017 Jun 14
2
Default FPENV state
Hi,
We are interested in expanding some vector operations directly in
IR form as constants (https://reviews.llvm.org/D33406),
for example: _mm256_cmp_ps("any input", "any input", _CMP_TRUE_UQ)
should produce a -1, -1, -1, ... vector, but for some values, for example
"1.00 -nan", this operation triggers an exception if FPU exceptions
are enabled. Here is the question:
2015 Apr 16
3
[LLVMdev] double* to <2 x double>*
Does anyone know how to instrument double* to <2 x double>*, e.g., 2.2
--> <2.2, 2.2>?
For example, I want to change the following IR code
%arrayidx1 = getelementptr inbounds [100 x double]* @main.B, i32 0, i32 %i.021
%1 = load double* %arrayidx1, align 4, !tbaa !0
to:
%arrayidx1 = getelementptr inbounds [100 x double]* @main.B, i32 0, i32 %i.021
%1 = bitcast double* %arrayidx1
2014 Oct 17
2
[LLVMdev] opt -O2 leads to incorrect operation (possibly a bug in the DSE)
Hi all,
Consider the following example:
define void @fn(i8* %buf) #0 {
entry:
%arrayidx = getelementptr i8* %buf, i64 18
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arrayidx, i8* %buf, i64 18, i32 1, i1 false)
%arrayidx1 = getelementptr i8* %buf, i64 18
store i8 1, i8* %arrayidx1, align 1
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %buf, i8* %arrayidx, i64 18, i32 1, i1 false)
2018 Feb 01
1
Intrinsic pattern matching
Hello,
I have a problem with pattern matching on intrinsics.
I have the following code in IntrinsicsX86.td:
```
let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.".
def int_x86_mpx_bndmk:
Intrinsic<[llvm_x86bnd_ty], [llvm_ptr_ty, llvm_i64_ty], []>;
}
```
And the following instruction, which is generated when @llvm.x86.mpx.bndmk is
used in code:
2017 May 05
2
load instruction to gather intrinsics
Hi All,
Can I change a vector load to a gather intrinsic? If so, how can I do it? For
example, I want to change the following IR code
%1 = load <2 x i64>* %arrayidx1, align 8
to
%1 = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %arrayidx1,
i32 8, <2 x i1> <i1 true, i1 true>, <2 x i64> undef)
Basically, I am not sure how to get two consecutive
2024 Aug 22
3
High RAM consumption
Hello,
LDAP Samba processes consume a lot of memory. Each LDAP process loads the
entire ldb database into memory and does not release it over time. For
example, for a 4GB ldb database and 4 LDAP processes, Samba will use up
more than 16 GB of RAM. If only 16GB of RAM is installed on the server,
then Samba will respond to requests with long delays. Some requests will not
be answered (by time-out).
2016 Mar 16
3
RFC: A change in InstCombine canonical form
=== PROBLEM === (See this bug https://llvm.org/bugs/show_bug.cgi?id=26445)
IR contains code for loading a float from float * and storing it to a float
* address. After canonicalization of load in InstCombine [1], new bitcasts
are added to the IR (see bottom of the email for code samples). This
prevents select speculation in SROA from working. Also, after SROA we have
bitcasts from int32 to float.
2019 Aug 08
2
Suboptimal code generated by clang+llc in quite a common scenario (?)
I found something that I do not quite understand when compiling a common piece of code using the -Os flag.
I found it while testing my own backend, but then I dug deeper and found that at least x86 is affected as well. This is the code in question:
char pp[3];
char *scscx = pp;
int tst( char i, char j, char k )
{
scscx[0] = i;
scscx[1] = j;
scscx[2] = k;
return 0;
}
The above gets
2016 Mar 16
2
RFC: A change in InstCombine canonical form
On Wed, Mar 16, 2016 at 8:34 AM, Mehdi Amini via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Hi,
>
> How does it interact with the "typeless pointers" work?
>
Right - the goal of the typeless pointer work is to fix all these bugs
related to "didn't look through bitcasts" in optimizations. Sometimes
that's going to mean more work (because the code
2013 Nov 11
2
[LLVMdev] What's the Alias Analysis does clang use ?
Hi, LLVM community:
I found that basicaa does not seem to report must-not-alias for __restrict__
arguments in C/C++. It only compares two pointers and the underlying objects they
point to. I wonder how clang does alias analysis
for the C/C++ keyword restrict.
Let's assume we compile the following code:
$cat myalias.cc
float foo(float * __restrict__ v0, float * __restrict__ v1, float *
__restrict__ v2, float *
2016 Mar 16
3
RFC: A change in InstCombine canonical form
On Wed, Mar 16, 2016 at 11:00 AM, Ehsan Amiri <ehsanamiri at gmail.com> wrote:
> David,
>
> Could you give us an update on the status of typeless pointer work? How
> much work is left and when do you think it might be ready?
>
It's a bit of an onion peel, really - since it will eventually involve
generalizing/fixing every optimization that's currently leaning on typed
2020 Oct 05
2
Question about using IRBuilder::CreateIntrinsic for a variadic intrinsic
I have a variadic intrinsic that is defined as something like this:
def int_foobar : Intrinsic<[llvm_anyint_ty],
[llvm_vararg_ty],
[IntrNoMem, IntrSpeculatable]>;
When I construct a call to the above intrinsic with IRBuilder::CreateIntrinsic, it mangles the intrinsic name with the return type (i64) and the actual argument
2009 Mar 09
2
[LLVMdev] Intrinsic & address space
I would like to use an intrinsic with a different address space.
I defined an intrinsic (used to represent a specific instruction of my target) with a pointer in its arguments, but when calling this intrinsic, if the pointer is not in the generic address space (i.e., AddrSpace 0), an error occurs ("bad signature").
How can I specify the address space in the intrinsic definition?
Thank you.