thr3ads.net - similar to: "RFC phantom memory intrinsic"

Displaying 20 results from an estimated 500 matches similar to: "RFC phantom memory intrinsic"

2017 Sep 13

RFC phantom memory intrinsic

Hi Michael, > Interesting approach, but how do you handle more complex offsets, e.g., when the pointer is part of an aggregate? A single offset does not seem enough to handle generic cases. Yes, correct: this slightly modified example does not work. #include <x86intrin.h> __m256d vsht_d4_fold(const double* ptr, unsigned long long i) { __m256d foo = (__m256d){ ptr[i], ptr[i+1],

2017 Sep 13

RFC phantom memory intrinsic

Hi Michael, >I have a case where InstCombine removes a store and your approach would be >valuable for me if the entire access to an aggregate could be restored. Yes, no problem and we could add the aggregate pointer to this new intrinsic and in my particular case I should ignore it, but I am looking now at "speculation_marker" metadata and I am still not sure how to implement it

2017 Sep 26

RFC phantom memory intrinsic

On 09/26/2017 08:31 AM, Dinar Temirbulatov wrote: > Hi Hal, >> Are you primarily concerned with being able to widen loads later in the pipeline? Could we attach metadata to the remaining loads indicating that it would be legal to widen them? > no, I don't have any concerns about the intrinsic way of implementation, > and the intrinsic way looks safer to me since we somehow detach our

2017 Sep 26

RFC phantom memory intrinsic

Hi Hal, > Are you primarily concerned with being able to widen loads later in the pipeline? Could we attach metadata to the remaining loads indicating that it would be legal to widen them? no, I don't have any concerns about the intrinsic way of implementation, and the intrinsic way looks safer to me since we somehow detach our information about memory from that actual load instruction. I updated

2017 Sep 26

RFC phantom memory intrinsic

On 09/13/2017 04:46 PM, Dinar Temirbulatov via llvm-dev wrote: > Hi Michael, >> I have a case where InstCombine removes a store and your approach would be >> valuable for me if the entire access to an aggregate could be restored. > Yes, no problem and we could add the aggregate pointer to this new > intrinsic and in my particular case I should ignore it, but I am > looking

2017 Jul 20

[RFC] dereferenceable metadata

Indeed. But the problem here is that Dinar is trying to keep information after a load/store is removed by instcombine. For example: v4sf v = {p[0], p[1], p[2], p[3]}; v4sf v2 = shuffle(v, 0, 0, 2, 2); Some pass comes in and removes the p[3] and p[1]. Now you have smaller code, but lost the ability to use a vector load for all those values + shuffle. The code got scalarized because we lost the
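The effect described in this snippet can be sketched in plain C. This is only an illustration under stated assumptions: the helper names (load_then_shuffle, scalarized, check_equivalent) are hypothetical and do not come from the thread, and ordinary arrays stand in for the v4sf vector type. It shows why the loads of p[1] and p[3] look dead to an optimizer, and what information disappears with them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the thread): materialize the full
   four-element vector, then apply the lane selection (0, 0, 2, 2). */
static void load_then_shuffle(const float *p, float out[4]) {
    const float v[4] = { p[0], p[1], p[2], p[3] };
    static const size_t lanes[4] = { 0, 0, 2, 2 };
    for (size_t i = 0; i < 4; ++i)
        out[i] = v[lanes[i]];
}

/* The same result once the unused loads of p[1] and p[3] are dropped.
   Nothing in this form records that p[1] and p[3] were readable, so a
   later pass cannot prove from it that one 4-wide load is safe. */
static void scalarized(const float *p, float out[4]) {
    out[0] = p[0];
    out[1] = p[0];
    out[2] = p[2];
    out[3] = p[2];
}

/* Self-check: both forms agree on every lane. */
static void check_equivalent(const float *p) {
    float a[4], b[4];
    load_then_shuffle(p, a);
    scalarized(p, b);
    for (size_t i = 0; i < 4; ++i)
        assert(a[i] == b[i]);
}
```

Both functions compute the same values; the difference is only in which memory accesses remain visible, which appears to be the information the thread wants to preserve.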

2017 Jul 18

[RFC] dereferenceable metadata

Hi, While working on PR21780, I used "dereferenceable_or_null" metadata and I realized now that it is not correct for my solution to use this metadata type, since it might point to an address that is not dereferenceable but null. I think that we need another new metadata type, something like "dereferenceable", with which we could annotate any load (not just pointer type like

2017 Jun 14

Default FPENV state

Hi, We are interested in expanding some vector operations directly in the IR form as constants (https://reviews.llvm.org/D33406), for example: _mm256_cmp_ps("any input", "any input", _CMP_TRUE_UQ) should produce a -1, -1, -1, ... vector, but for some values, for example "1.00 -nan", this operation triggers an exception if FPU exceptions are enabled. Here is the question:

2015 Apr 16

[LLVMdev] double* to <2 x double>*

Does anyone know how to instrument double* to <2 x double>*, e.g., 2.2 --> <2.2, 2.2>? For example, I want to change the following IR code %arrayidx1 = getelementptr inbounds [100 x double]* @main.B, i32 0, i32 %i.021 %1 = load double* %arrayidx1, align 4, !tbaa !0 to: %arrayidx1 = getelementptr inbounds [100 x double]* @main.B, i32 0, i32 %i.021 %1 = bitcast double* %arrayidx1
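The "2.2 --> <2.2, 2.2>" transformation asked about here is a broadcast (splat) of one loaded scalar into both lanes. A minimal C sketch follows, assuming the GCC/Clang vector_size extension; the type name v2f64 and the helper splat_load are hypothetical and not taken from the thread. Note that simply casting the pointer and loading through it would read two adjacent doubles from memory, which is a different operation from a splat.

```c
/* GCC/Clang vector extension (an assumption of this sketch):
   two doubles packed in one 16-byte vector. */
typedef double v2f64 __attribute__((vector_size(16)));

/* Hypothetical helper: perform the original scalar load once,
   then place the loaded value in both lanes. */
static v2f64 splat_load(const double *p) {
    double x = *p;        /* the scalar load from the original code */
    v2f64 v = { x, x };   /* <x, x> */
    return v;
}
```

With an array such as {2.2, 7.0}, splat_load on the first element yields <2.2, 2.2>, whereas a reinterpreting vector load of the same address would yield <2.2, 7.0>.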

2014 Oct 17

[LLVMdev] opt -O2 leads to incorrect operation (possibly a bug in the DSE)

Hi all, Consider the following example: define void @fn(i8* %buf) #0 { entry: %arrayidx = getelementptr i8* %buf, i64 18 tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arrayidx, i8* %buf, i64 18, i32 1, i1 false) %arrayidx1 = getelementptr i8* %buf, i64 18 store i8 1, i8* %arrayidx1, align 1 tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %buf, i8* %arrayidx, i64 18, i32 1, i1 false)

2018 Feb 01

Intrinsic pattern matching

Hello, I have a problem with pattern matching on intrinsics. I have the following code in IntrinsicsX86.td: ``` let TargetPrefix = "x86" in { // All intrinsics start with "llvm.x86.". def int_x86_mpx_bndmk: Intrinsic<[llvm_x86bnd_ty], [llvm_ptr_ty, llvm_i64_ty], []>; } ``` And the following instruction that is generated when @llvm.x86.mpx.bndmk is used in code:

2017 May 05

load instruction to gather intrinsics

Hi All, Can I change a vector load to a gather intrinsic? If so, how can I do it? For example, I want to change the following IR code %1 = load <2 x i64>* %arrayidx1, align 8 to %1 = call <2 x i64> @llvm.masked.gather.v2i64(<2 x i64*> %arrayidx1, i32 8, <2 x i1> <i1 true, i1 true>, <2 x i64> undef) Basically, I am not sure how to get two consecutive

2024 Aug 22

High RAM consumption

Hello, LDAP Samba processes consume a lot of memory. Each LDAP process loads the entire ldb database into memory and does not release it over time. For example, for a 4GB ldb database and 4 LDAP processes, Samba will use up more than 16 GB of RAM. If only 16GB of RAM is installed on the server, then Samba will respond to requests with long delays. Some requests will not be answered (by time-out).

2016 Mar 16

RFC: A change in InstCombine canonical form

=== PROBLEM === (See this bug https://llvm.org/bugs/show_bug.cgi?id=26445) IR contains code for loading a float from float * and storing it to a float * address. After canonicalization of the load in InstCombine [1], new bitcasts are added to the IR (see bottom of the email for code samples). This prevents select speculation in SROA from working. Also after SROA we have bitcasts from int32 to float.

2019 Aug 08

Suboptimal code generated by clang+llc in quite a common scenario (?)

I found something that I do not quite understand when compiling a common piece of code using the -Os flag. I found it while testing my own backend, but then I dug deeper and found that at least x86 is affected as well. This is the referred code: char pp[3]; char *scscx = pp; int tst( char i, char j, char k ) { scscx[0] = i; scscx[1] = j; scscx[2] = k; return 0; } The above gets

2016 Mar 16

RFC: A change in InstCombine canonical form

On Wed, Mar 16, 2016 at 8:34 AM, Mehdi Amini via llvm-dev < llvm-dev at lists.llvm.org> wrote: > Hi, > > How does it interact with the "typeless pointers" work? > Right - the goal of the typeless pointer work is to fix all these bugs related to "didn't look through bitcasts" in optimizations. Sometimes that's going to mean more work (because the code

2013 Nov 11

[LLVMdev] What's the Alias Analysis does clang use ?

Hi, LLVM community: I found that basicaa does not seem to report must-not-alias for __restrict__ arguments in C/C++. It only compares two pointers and the underlying objects they point to. I wonder how clang does alias analysis for the C/C++ keyword restrict. Let's assume we compile the following code: $cat myalias.cc float foo(float * __restrict__ v0, float * __restrict__ v1, float * __restrict__ v2, float *

2016 Mar 16

RFC: A change in InstCombine canonical form

On Wed, Mar 16, 2016 at 11:00 AM, Ehsan Amiri <ehsanamiri at gmail.com> wrote: > David, > > Could you give us an update on the status of typeless pointer work? How > much work is left and when you think it might be ready? > It's a bit of an onion peel, really - since it will eventually involve generalizing/fixing every optimization that's currently leaning on typed

2020 Oct 05

Question about using IRBuilder::CreateIntrinsic for a variadic intrinsic

I have a variadic intrinsic that is defined as something like this: def int_foobar : Intrinsic<[llvm_anyint_ty], [llvm_vararg_ty], [IntrNoMem, IntrSpeculatable]>; When I construct a call to the above intrinsic with IRBuilder::CreateIntrinsic, it mangles the intrinsic name with the return type (i64) and the actual argument

2009 Mar 09

[LLVMdev] Intrinsic & address space

I would like to use an intrinsic with a different address space. I defined an intrinsic (used to represent a specific instruction of my target) with a pointer in its arguments, but when calling this intrinsic, if the pointer is not in the generic address space (i.e., AddrSpace 0), an error occurs ("bad signature"). How can I specify the address space in the intrinsic definition? Thank you.