Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] Missed optimization on array initialization"
2012 Feb 25
0
[LLVMdev] Missed optimization on array initialization
On Feb 25, 2012, at 3:17 AM, Carlo Alberto Ferraris wrote:
> Prompted by a SO post (http://stackoverflow.com/questions/9441882/compiler-instruction-reordering-optimizations-in-c-and-what-inhibits-them/9442363) I checked and found that LLVM yields the same (seemingly) suboptimal code as MSVC.
> Consider the following, simplified, C snippet:
> extern void bar(int*);
>
> void
2012 Feb 25
1
[LLVMdev] Missed optimization on array initialization
On Feb 25, 2012, at 10:32 AM, Chris Lattner <clattner at apple.com> wrote:
>
> On Feb 25, 2012, at 3:17 AM, Carlo Alberto Ferraris wrote:
>
>> Prompted by a SO post (http://stackoverflow.com/questions/9441882/compiler-instruction-reordering-optimizations-in-c-and-what-inhibits-them/9442363) I checked and found that LLVM yields the same (seemingly) suboptimal code as MSVC.
2014 Apr 22
2
[LLVMdev] where is F7 opcode for TEST instruction on X86?
hi,
at the moment, TEST instruction is defined with 0xf7 opcode, as
demonstrated below.
$ echo "0xf7 0xc0 0x00 0x00 0x00 0x22"|./Release+Asserts/bin/llvm-mc
-disassemble -arch=x86
.section __TEXT,__text,regular,pure_instructions
testl $570425344, %eax ## imm = 0x22000000
however, i cannot find anywhere this F7 opcode is defined in
2013 Nov 16
1
[LLVMdev] Limit loop vectorizer to SSE
The vectorizer will now emit
= load <8 x i32>, align #TargetAlignmentOfScalari32
where before it would emit
= load <8 x i32>
(which has the semantics of “= load <8 x i32>, align 0”, which means the address is aligned to the target ABI alignment; see http://llvm.org/docs/LangRef.html#load-instruction).
When the backend generates code for the former it will emit an unaligned move:
2014 Oct 17
2
[LLVMdev] opt -O2 leads to incorrect operation (possibly a bug in the DSE)
Hi all,
Consider the following example:
define void @fn(i8* %buf) #0 {
entry:
%arrayidx = getelementptr i8* %buf, i64 18
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %arrayidx, i8* %buf, i64
18, i32 1, i1 false)
%arrayidx1 = getelementptr i8* %buf, i64 18
store i8 1, i8* %arrayidx1, align 1
tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* %buf, i8* %arrayidx, i64
18, i32 1, i1 false)
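
A minimal C sketch of what the three visible operations in the excerpt above should compute, assuming nothing beyond what is shown (the rest of @fn is cut off in this listing). The helper names `store_survives` and `check_store_survives` and the 36-byte buffer are illustrative assumptions, not part of the original post. The two 18-byte regions do not overlap, so both copies are well defined, and the byte written between them must be visible at buf[0] after the second copy. If the store were eliminated as dead, that would not hold.

```c
#include <assert.h>
#include <string.h>

/* Mirrors the visible part of @fn: buf must point to at least 36 bytes. */
void fn(unsigned char *buf) {
    unsigned char *arrayidx = buf + 18;
    memcpy(arrayidx, buf, 18);   /* copy bytes [0,18) to [18,36) */
    arrayidx[0] = 1;             /* store i8 1 to buf[18] */
    memcpy(buf, arrayidx, 18);   /* copy bytes [18,36) back to [0,18) */
}

/* Returns 1 if the store between the copies is observable afterwards. */
int store_survives(void) {
    unsigned char buf[36] = {0};
    buf[0] = 7;                  /* arbitrary initial value, overwritten below */
    fn(buf);
    return buf[0] == 1 && buf[18] == 1;
}

/* Aborts if the intermediate store was lost. */
void check_store_survives(void) {
    assert(store_survives());
}
```

Calling `check_store_survives()` from a test driver, built with and without optimization, is one way to compare the observable behavior of the two builds.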
2017 Mar 31
2
CHECK-LABEL or CHECK?
Hi All,
I came across a FileCheck failure that I don't understand. The
example code is below:
void test1() {
... code ...
// CHECK-LABEL: @test1
// CHECK: void @llvm.memcpy.p0i8.p0i8.i32 - (1)
}
void dummy() { // make (1) match
... code ...
// CHECK-LABEL: @dummy
}
void test2() {
... code ...
// CHECK-LABEL: @test2
//
2013 May 21
4
[LLVMdev] malloc / free & memcpy optimisations.
The front end I'm building for an existing interpreted language is
unfortunately producing output similar to this far too often;
define void @foo(i8* nocapture %dest, i8* nocapture %src, i32 %len)
nounwind {
%1 = tail call noalias i8* @malloc(i32 %len) nounwind
tail call void @llvm.memcpy.p0i8.p0i8.i32(i8* %1, i8* %src, i32 %len, i32
1, i1 false)
tail call void
2020 Sep 30
2
lifetime_start/end
Hello,
What do the intrinsics "@llvm.lifetime.start/@llvm.lifetime.end" really do? As far as
I know, they define the live ranges of variables. In the following
code section, they seem redundant. However, when I remove them, the
behavior of the code becomes non-deterministic. The live ranges of the
variables defined by them are never used in the code.
Thanks,
---------------
%37 = bitcast
2017 May 16
4
Which pass should be propagating memory copies
Consider the following IR example:
define void @simple([4 x double] *%ptr, i64 %idx) {
%stack = alloca [4 x double]
%ptri8 = bitcast [4 x double] *%ptr to i8*
%stacki8 = bitcast [4 x double] *%stack to i8*
call void @llvm.memcpy.p0i8.p0i8.i32(i8 *%stacki8, i8 *%ptri8, i32 32,
i32 0, i1 0)
%dataptr = getelementptr inbounds [4 x double], [4 x double] *%ptr, i32
0, i64 %idx
2012 May 22
4
[LLVMdev] How to get llvm bitcode executed
Hi All,
I have a program that uses C++ STL a lot. To have the source code for
STL functions, I undefined "_GLIBCXX_EXTERN_TEMPLATE" in
c++config.h. In spite of this, after compilation (via clang) and
linking (via llvm-ld), the resulting bitcode contains a few declared
functions (with no definitions).
My question is: In the scenario where some function definitions are
missing in a llvm
2018 Mar 22
2
new @llvm.memcpy and @llvm.memset API in trunk - how to use alignment?
The new @llvm.memcpy API does not have an alignment parameter. Instead the
docs say to use the align <n> attribute. How is this supposed to work with
different alignments?
For example, I have one memcpy with align 4, align 4, and another with
align 1, align 1.
; Function Attrs: argmemonly nounwind
declare void @llvm.memcpy.p0i8.p0i8.i64(i8* nocapture writeonly align 4,
i8* nocapture
2018 Nov 23
2
is this a bug in an optimization pass?
The frontend code is a pretty simple for loop that counts from i = 0;
i != 10; i += 1
It gets optimized into an endless loop.
export fn entry() void {
var array: [10]Bar = undefined;
var x = for (array) |elem, i| {
if (i == 1) break elem;
} else bar2();
}
Here's the generated IR:
; ModuleID = 'test'
source_filename = "test"
target datalayout =
2018 Jan 19
2
Change memcpy/memmove/memset to have dest and source alignment attributes
> On Jan 18, 2018, at 7:45 AM, Daniel Neilson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>
> Hi all,
> This change has been reviewed, and appears to be ready to land (review available here if anyone still wants to chime in: https://reviews.llvm.org/D41675). The process that we’re going to use for landing this will take a few
2012 Jul 26
1
[LLVMdev] llvm.memset.p0i8.* intrinsics
Hi,
are the llvm.memset.p0i8.i32 and llvm.memset.p0i8.i64 intrinsics meant
to be used for 32-bit and 64-bit architectures, respectively, depending
on the module's target datalayout? Or can I use either of them? If so,
why are there two functions?
Sorry if it's obvious, but the documentation wasn't that clear.
Thanks and ciao,
Mario
2013 May 21
0
[LLVMdev] malloc / free & memcpy optimisations.
> could you allocate the memory on the stack instead (alloca instruction)?
This is mainly for string or binary blob handling, using the stack isn't a
great idea for size reasons.
I'm experimenting with simple code examples for now, and I picked a
simple one for this email, but I'm certain things will get much more
complicated once I implement more features of the language.
On Tue,
2012 Aug 22
4
[LLVMdev] PROPOSAL: IR representation of detailed struct assignment information
Hello,
Currently LLVM expects front-ends to lower struct assignments into either
individual scalar loads and stores, or calls to @llvm.memcpy. For structs
with lots of fields, it can take a lot of scalar loads and stores, so
@llvm.memcpy is used instead. Unfortunately, using @llvm.memcpy does not
permit full TBAA information to be preserved. Also, it unnecessarily copies
any padding bytes between
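
A small C sketch of the two lowerings the excerpt above describes, under stated assumptions: `struct S` and the function names are hypothetical and chosen only to make the padding visible. The field-by-field version touches only the named members, while the memcpy version copies every byte of the object, including any padding the ABI places between `tag` and `value`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct: most ABIs insert padding after 'tag' to align 'value'. */
struct S {
    char tag;
    double value;
};

/* Lowering 1: one scalar load and store per field; padding is not copied. */
void assign_fieldwise(struct S *dst, const struct S *src) {
    dst->tag = src->tag;
    dst->value = src->value;
}

/* Lowering 2: a single block copy of the whole object, padding included. */
void assign_memcpy(struct S *dst, const struct S *src) {
    memcpy(dst, src, sizeof *dst);
}

/* Bytes that the block copy moves but that belong to no field. */
size_t padding_bytes(void) {
    return sizeof(struct S) - (sizeof(char) + sizeof(double));
}
```

Both functions leave `dst` with the same field values. They differ only in whether the padding bytes are written, and the size of that padding is implementation-defined.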
2018 Jan 02
5
Change memcpy/memmove/memset to have dest and source alignment attributes
Good day all,
I’ve spent a few days resurrecting the circa-2015 work on removing the explicit alignment argument (4th arg) from the @llvm.memcpy/memmove/memset intrinsics in favour of using the alignment attribute on the pointer args of calls to the intrinsic. This work was first proposed back in August 2015 by Lang Hames:
http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html (item
2018 Jan 19
0
Change memcpy/memmove/memset to have dest and source alignment attributes
On Jan 18, 2018, at 10:48 PM, Chris Lattner <clattner at nondot.org> wrote:
On Jan 18, 2018, at 7:45 AM, Daniel Neilson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Hi all,
This change has been reviewed, and appears to be ready to land (review available here if anyone still wants to chime in:
2018 Nov 29
2
AliasAnalysis does not look through a memcpy
Hi,
I'm trying to get AA results for two pointers, but it seems that AA
cannot look through a memcpy. For example:
define dso_local spir_func void @fun() {
entry:
; Store an address of `var'
%var = alloca i32, align 4
store i32 42, i32* %var, align 4
%var.addr = alloca i32*, align 8
store i32* %var, i32** %var.addr, align 8
; Memcpy
2019 Aug 07
2
Dead store elimination in the backend for -ftrivial-auto-var-init
There are two problems:
1. Padding after the union and the call to q(): without LTO we can't remove that
store.
2. The shortcut I have, which ignores all instructions q(). This assumes
that the memset to acpar.match, acpar.matchinfo is also useful, which is not true. I
should be able to improve this case.
On Thu, Aug 1, 2019 at 11:29 PM Vitaly Buka <vitalybuka at google.com> wrote:
> On a first