thr3ads.net - similar to: "array fill idioms"

2016 Nov 10

array fill idioms

Yes, I know this works peachy keen for char arrays. What I'm looking at (which is hard to express in C) is something like void foo () { int bar[20] = { 42, 42, ..., 42 }; } I don't want to do a memcpy of the 20-element constant array, and memset doesn't work here. I want an intrinsic that copies the scalar int constant 42 to each element of the int array. bagel On 11/10/2016 03:30

2016 Nov 10

array fill idioms

Back in the day, we called this a BLT (block transfer, pronounced 'blit') for the PDP-10 instruction of that name. You can do this by assigning the first value, then doing an overlapping memcpy to fill in the rest. There's probably something clever you can do with vector instructions too, in many cases. --paulr From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Ryan
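For reference, a minimal sketch of the "seed one element, then copy" idea in portable C. Standard C leaves memcpy undefined when source and destination overlap, so this variant (an assumption of this sketch, not something from the thread) doubles the filled prefix with non-overlapping copies instead. The name fill_int is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill dst[0..n) with value. Hypothetical helper, not an LLVM intrinsic.
   Seeds the first element, then repeatedly copies the already-filled
   prefix onto the region right after it. Source and destination never
   overlap, so plain memcpy is well defined here. */
void fill_int(int *dst, size_t n, int value)
{
    assert(dst != NULL || n == 0);
    if (n == 0)
        return;

    dst[0] = value;                 /* seed element */
    size_t filled = 1;              /* number of elements already set */

    while (filled < n) {
        size_t chunk = filled;      /* try to double the filled prefix */
        if (chunk > n - filled)
            chunk = n - filled;     /* clamp the final copy */
        memcpy(dst + filled, dst, chunk * sizeof *dst);
        filled += chunk;
    }
}
```

For the array in the original question, the call would be fill_int(bar, 20, 42). A plain for loop is the simpler alternative, and optimizing compilers can often vectorize it.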

2010 May 05

[LLVMdev] Why llvm function name is different with . and ..

declare i8 @llvm.atomic.load.max.i8.p0i8( i8* <ptr>, i8 <delta> ) declare i16 @llvm.atomic.load.max.i16.p0i16( i16* <ptr>, i16 <delta> ) declare i32 @llvm.atomic.load.max.i32.p0i32( i32* <ptr>, i32 <delta> ) declare i64 @llvm.atomic.load.max.i64.p0i64( i64* <ptr>, i64 <delta> ) declare i8 @llvm.atomic.load.min.i8.p0i8( i8* <ptr>, i8

2013 Nov 24

[LLVMdev] wrong code generation for memcpy function in SROA optimization pass

The SROA optimization pass performs some optimizations and transforms for memcpy calls, such as ld/st operations. When someone writes code with size > sizeof(dest) in memcpy(*dest,*src,size), wrong code is very likely to be generated. For example, consider this test case: int main() { char ch; short sh = 0x1234; memcpy(&ch,&sh,2); printf("ch=0x%02x\n",ch); } At

2012 Feb 25

[LLVMdev] Missed optimization on array initialization

Prompted by a SO post (http://stackoverflow.com/questions/9441882/compiler-instruction-reordering-optimizations-in-c-and-what-inhibits-them/9442363) I checked and found that LLVM yields the same (seemingly) suboptimal code as MSVC. Consider the following, simplified, C snippet: extern void bar(int*); void foo(int a) { int ar[100] = {a}; if (a) return; bar(ar); }
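The snippet above is cut off before the generated code is shown, so the following is only a sketch of what the initializer means in standard C: int ar[100] = {a}; sets ar[0] to a and zero-initializes the remaining 99 elements. The helper name check_brace_init and the constant AR_LEN are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { AR_LEN = 100 };  /* same length as the array in the quoted snippet */

/* Hypothetical helper: checks the standard C rule that elements without an
   explicit initializer in a brace-enclosed list are zero-initialized. */
void check_brace_init(int a)
{
    int ar[AR_LEN] = {a};           /* ar[0] = a, ar[1..99] = 0 */

    assert(ar[0] == a);
    for (size_t i = 1; i < AR_LEN; ++i)
        assert(ar[i] == 0);
}
```

Calling check_brace_init(42) or check_brace_init(0) should pass silently on a conforming compiler.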

2019 May 13

How to change CLang struct alignment behaviour?

Hi Joan, On Mon, 13 May 2019 at 18:01, Joan Lluch <joan.lluch at icloud.com> wrote: > After looking at it a bit further, I think this is a Clang thing. Clang issues “align 2” if the struct has at least one int (2 bytes), but also if the entire struct size is a multiple of 2, for example a struct with 4 char members. In these cases the LLVM backend correctly creates word sized load/stores

2017 Oct 01

load with alignment of 1 crashes from being unaligned

Below is attached a full IR module that can reproduce this issue, but the part to notice is this: %Foo96Bits = type <{ i24, i24, i24, i24 }> define internal fastcc i16 @main.0.1() unnamed_addr #2 !dbg !113 { Entry: %value = alloca %Foo96Bits, align 1 %b = alloca i24, align 4 %0 = bitcast %Foo96Bits* %value to i8*, !dbg !129 call void @llvm.memcpy.p0i8.p0i8.i64(i8* %0, i8* bitcast

2015 Nov 03

Vectorizing structure reads, writes, etc on X86-64 AVX

----- Original Message ----- > From: "Sanjay Patel via llvm-dev" <llvm-dev at lists.llvm.org> > To: "Jay McCarthy" <jay.mccarthy at gmail.com> > Cc: "llvm-dev" <llvm-dev at lists.llvm.org> > Sent: Tuesday, November 3, 2015 12:30:51 PM > Subject: Re: [llvm-dev] Vectorizing structure reads, writes, etc on X86-64 AVX > > If the

2009 Aug 04

[LLVMdev] memory-to-memory instructions

It appears impossible to match memory-to-memory instructions (except for MOV). The MSP430 supports these. The following test case tries to match the pattern that generates "add.w &foo,&bar": ; RUN: llvm-as < %s | llc -march=msp430 -O3 target datalayout = "e-p:16:8:8-i8:8:8-i16:8:8-i32:8:8" target triple = "msp430-generic-generic" @foo = common global

2018 Jun 27

can debug info for coroutines be improved?

I'm going to show the same function, first normally, and then as a coroutine, and show how gdb can see the variable when it's a normal function, but not when it's a coroutine. I'd like to understand if this can be improved. I'm trying to debug a real world problem, but the lack of debug info on variables in coroutines is making it difficult. Should I file a bug? Is this a

2015 Nov 04

Vectorizing structure reads, writes, etc on X86-64 AVX

Hi Jay - I see the slow, small accesses using an older clang [Apple LLVM version 7.0.0 (clang-700.1.76)], but this looks fixed on trunk. I made a change that comes into play if you don't specify a particular CPU: http://llvm.org/viewvc/llvm-project?view=revision&revision=245950 $ ./clang -O1 -mavx copy.c -S -o - ... movslq %edi, %rax movq _spr_dynamic@GOTPCREL(%rip),

2019 Jun 04

llvm-ir: TBAA and struct copies

Hi, I have a question about the current definition of TBAA (See [1]). In the LLVM-IR code that we produce, we generate load/stores of struct types. (See [2] and [3] for a godbolt example showing the issue) For following c-alike code: struct S { int dummy; short e, f; } x,y; struct S* p = &x; int foobar() { x.f=42; *p=y; //**** struct copy return x.f; } We produce:

2015 Oct 30

Vectorizing structure reads, writes, etc on X86-64 AVX

I am a first time poster, so I apologize if this is an obvious question or out of scope for LLVM. I am an LLVM user. I don't really know anything about hacking on LLVM, but I do know a bit about compilation generally. I am on x86-64 and I am interested in structure reads, writes, and constants being optimized to use vector registers when the alignment and sizes are right. I have created a

2009 Jul 24

[LLVMdev] llvm-as regression

The following causes an assertion in recent svn pulls, but not in 2.5. The assertion: llvm-as: /home/bgl/work/llvm-work/include/llvm/ADT/SmallVector.h:125: T& llvm::SmallVectorImpl<T>::operator[](unsigned int) [with T = llvm::Constant*]: Assertion `Begin + idx < End' failed. The .ll code: target datalayout =

2019 Jun 05

llvm-ir: TBAA and struct copies

Hi Ivan, The code that we have is indeed different from what the 'standard llvm' expects. Let me explain: in our version we came into this situation in two steps: 1) I added support for 'special types' that map directly to types supported by hardware. These types are represented by a struct containing a single iXXX member, providing the necessary bits of the type, and at the

2015 Nov 03

Vectorizing structure reads, writes, etc on X86-64 AVX

Thank you for your reply. FWIW, I wrote the .ll by hand after taking the C program, using clang to emit the llvm and seeing the memcpy. The memcpy version that clang generates gets compiled into assembly that uses the large sequence of movs and does not use the vector hardware at all. When I started debugging, I took that clang produced .ll and started to write it different ways trying to get

2014 Sep 11

[LLVMdev] Is shortening a load a bug?

When the IR specifies a 32 bit load can it be changed to a narrower load? What if the load is from memory (e.g. a peripheral) that only supports 32-bit access? Consider the following IR: ---- target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:32" target triple = "thumbv7m-unknown-unknown" @f = external global i32 define zeroext i8 @bar() nounwind { L.0:

2020 May 21

Updated llc does not compile my .ll files any more [addrspace on AVR problem?]

Hi, I’ve come back and updated my llvm toolset with modern code (my branch was about 1-2 years old) and now the llvm IR files produced by my front end no longer compile with llc. Here is a sample of llvm ir produced by my front end (it’s a standard version 3.1 build of swift from the swift.org open source website). ; ModuleID = 'main.ll' source_filename = "main.ll" target

2018 Sep 10

Byte-wide stores aren't coalesced if interspersed with other stores

Hi, I have, in postgres, a piece of IR that, after inlining and constant propagation, boils (when cooked on really high heat) down to (also attached for your convenience): source_filename = "pg" target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-pc-linux-gnu" define void @evalexpr_0_0(i8* align 8 noalias, i32* align 8 noalias) {

2012 Jan 23

[LLVMdev] Possible bug in the dragonegg

Hi Duncan, >> #include<stdio.h> >> #include<string.h> >> >> int main(int argc, char** argv){ >> >> char a[8] = "aaaaaaa"; >> char b[8] = "bbbbbbb"; >> >> char *c = (char*) malloc(sizeof(char)*(strlen(a)+strlen(b)+1)); >> memcpy(c, a, strlen(a)); >> memcpy(c + strlen(a), b, strlen(b) + 1); >>