search for: vec2

Displaying 20 results from an estimated 41 matches for "vec2".

2008 Dec 30
2
[LLVMdev] Folding vector instructions
...'select' instruction, and create 'insert-element' instruction. <code> llvm::Value * Instructions::min(llvm::Value *in1, llvm::Value *in2) { std::vector<llvm::Value*> vec1 = extractVector(in1); // generate LLVM extract element std::vector<llvm::Value*> vec2 = extractVector(in2); Value *xcmp = m_builder.CreateFCmpOLT(vec1[0], vec2[0], name("xcmp")); Value *selx = m_builder.CreateSelect(xcmp, vec1[0], vec2[0], name("selx")); Value *ycmp = m_builder.CreateFCmpOLT(vec1[1], vec2[1], n...
2015 May 04
2
[LLVMdev] Incorrect code generated for arm64
Hi all, I’ve narrowed down a problem in my code to the following test case: - - - - typedef struct {float v[2];} vec2; typedef struct {float v[3];} vec3; vec2 getVec2(); vec3 getVec3() { vec2 myVec = getVec2(); vec3 res; res.v[0] = myVec.v[0]; res.v[1] = myVec.v[1]; res.v[2] = 1; return res; } - - - - Compiling this with any level of optimization for arm64 gives incorrect code, unless my test case...
2013 Jun 18
1
transform 3 numeric vectors empty of 0/1
Dear all, Without a loop, I would like to transform 3 numeric vectors consisting of 0/1 values, all of the same length: Vec1 : transform 1 to A and 0 to "" Vec2 : transform 1 to B and 0 to "" Vec3 : transform 1 to C and 0 to "" to obtain a single vector Vec that is the paste of the 3 vectors (e.g. ABC, BC, AC, AB, ...). Any ideas? Thank you for your help -- Michel ARNAUD
2015 May 04
2
[LLVMdev] Incorrect code generated for arm64
...my codebase that return small (length 2 or 3) Vector types - generally in headers so they can be inlined and optimized well - but I’d definitely like to understand the root cause of this one so I can be on the lookout for any other similar failures. Simon > > typedef struct {float v0, v1;} vec2; > typedef struct {float v0, v1, v2;} vec3; > > vec2 getVec2(); > > vec3 getVec3() > { > vec2 myVec = getVec2(); > > vec3 res; > res.v0 = myVec.v0; > res.v1 = myVec.v1; > res.v2 = 1; > return res; > } > > .section __TEXT,__text,regula...
2012 Feb 15
2
[LLVMdev] ASM appears to be incorrect from llc
Hi, I'm trying to compile an intermediate representation file to ASM (intel style), and I believe that the resultant ASM is invalid. The IR is: ; ModuleID = 'test.u' %vec2 = type { float, float } @t = global %vec2 zeroinitializer @x = global i32 0 define i32 @main__i__v() nounwind { locals: %0 = load float* getelementptr inbounds (%vec2* @t, i32 0, i32 0) %1 = fptosi float %0 to i64 %2 = trunc i64 %1 to i32 store i32 %2, i32* @x ret i32 0 } Now, I know...
2011 Jul 01
2
[LLVMdev] (no subject)
I'm trying to debug a problem with our custom backend with using a tiered register allocation setup. Just a little background. My target uses vec4 32bit registers and I want to have three levels of registers setup. Each vec4 register can have two sub-regs of size vec2 32bit, and each sub-reg, has its own two sub-regs of 32bit each. So it looks like this, xyzw -> {xy, zw} -> {x, y, z, w}. Now the problem I am having is that for some reason, the linearscan allocator is running out of registers to allocate. This makes no sense to me as I have 1024 vec4 regi...
2007 Aug 28
1
Age-Length key with kimura algorith
An embedded text with an unspecified character set was scrubbed... Name: not available URL: https://stat.ethz.ch/pipermail/r-help/attachments/20070828/6641b572/attachment.pl
2012 Feb 16
0
[LLVMdev] ASM appears to be incorrect from llc
...Wed, Feb 15, 2012 at 3:36 PM, Matthew Huck <matthew.huck at gmail.com> wrote: > Hi, >   I'm trying to compile an intermediate representation file to ASM (intel > style), and I believe that the resultant ASM is invalid. The IR is: > > ; ModuleID = 'test.u' > > %vec2 = type { float, float } > @t = global %vec2 zeroinitializer > @x = global i32 0 > > define i32 @main__i__v() nounwind { > locals: >   %0 = load float* getelementptr inbounds (%vec2* @t, i32 0, i32 0) >   %1 = fptosi float %0 to i64 >   %2 = trunc i64 %1 to i32 >   store i...
2008 Dec 30
2
[LLVMdev] [Mesa3d-dev] Folding vector instructions
...'insert-element' > instruction. > > <code> > llvm::Value * Instructions::min(llvm::Value *in1, llvm::Value *in2) > { > std::vector<llvm::Value*> vec1 = extractVector(in1); // generate LLVM > extract element > std::vector<llvm::Value*> vec2 = extractVector(in2); > > Value *xcmp = m_builder.CreateFCmpOLT(vec1[0], vec2[0], name("xcmp")); > Value *selx = m_builder.CreateSelect(xcmp, vec1[0], vec2[0], > name("selx")); > > Value *ycmp = m_builder.Cre...
2008 Jun 27
0
[LLVMdev] Vector instructions
...sufficiently generalized shufflevector would remove the need for > insertelement and extractelement to exist completely. You should look into how this works with clang. Clang allows you to do things like this, for example: typedef __attribute__(( ext_vector_type(4) )) float float4; float2 vec2, vec2_2; float4 vec4, vec4_2; float f; void test2() { vec2 = vec4.xy; // shorten f = vec2.x; // extract elt vec4 = vec4.yyyy; // splat vec4.zw = vec2; // insert } etc. It also offers operators to extract all the even or odd elements...
2011 Jul 01
0
[LLVMdev] (no subject)
...h wrote: > I'm trying to debug a problem with our custom backend with using a tiered register allocation setup. > > Just a little background. My target uses vec4 32bit registers and I want to have three levels of registers setup. > Each vec4 register can have two sub-regs of size vec2 32bit, and each sub-reg, has its own two sub-regs of 32bit each. > So it looks like this, xyzw -> {xy, zw} -> {x, y, z, w}. > > Now the problem I am having is that for some reason, the linearscan allocator is running out of registers to allocate. > > This makes no sense to...
1999 Dec 16
2
R question
I have the following question, which is elementary but which I am unable to answer. In a for(i=10) loop, I am trying to represent the 10 1-dimensional vectors l1, l2,... l10 by some expression that will run through these values, i.e. suppose I want to add l1 + ... + l10; I could write x <- 0 for(i in 1:10){ x <- x+ l(i)} This should return x as the sum of the 10 li's for i from 1 to 10
2011 Jul 01
1
[LLVMdev] (no subject)
...16 PM, Villmow, Micah wrote: I'm trying to debug a problem with our custom backend with using a tiered register allocation setup. Just a little background. My target uses vec4 32bit registers and I want to have three levels of registers setup. Each vec4 register can have two sub-regs of size vec2 32bit, and each sub-reg, has its own two sub-regs of 32bit each. So it looks like this, xyzw -> {xy, zw} -> {x, y, z, w}. Now the problem I am having is that for some reason, the linearscan allocator is running out of registers to allocate. This makes no sense to me as I have 1024 vec4 regi...
2007 Sep 27
3
[LLVMdev] Vector swizzling and write masks code generation
...to make sure that the code generator is actually capable of generating instructions with exactly those semantics. Right now vector operations utilizing swizzling and write masks in LLVM IR have to be expressed with a series of load/extractelement/insertelement/store constructs. For example, vec2 = vec4.xy would end up being: %tmp = load <4 x float>* @vec4 %tmp1 = extractelement <4 x float> %tmp, i32 0 %tmp2 = insertelement <2 x float> undef, float %tmp1, i32 0 %tmp3 = extractelement <4 x float> %tmp, i32 1 %tmp4 = insertelement <2 x float> %tmp2, float %tmp3...
2008 Jun 27
2
[LLVMdev] Vector instructions
Hi Dan, Thanks for your comments. I've responded inline below. On 26-Jun-08, at 6:49 PM, Dan Gohman wrote: > On Jun 26, 2008, at 1:56 PM, Stefanus Du Toit wrote: >> >> === >> 1. Shufflevector only accepts vectors of the same type >> >> I would propose to change the syntax from: >> >>> <result> = shufflevector <n x <ty>>
2000 Jun 17
2
R 1.1.0 for Windows
Windows binaries at CRAN (bin/windows/Windows-NT/base) have been updated to R-1.1.0. See below for a list of Windows-specific changes. We thank all the people who checked over pre-test versions. guido masarotto (for the R-core team) Windows-specific changes to R ============================= There is now a GUI preferences editor on the Edit menu in Rgui. A data entry
2000 Jun 17
2
R 1.1.0 for Windows
Windows binaries at CRAN (bin/windows/Windows-NT/base) have been updated to R-1.1.0. See below for a list of Windows-specific changes. We thank all the people who checked over pre-test versions. guido masarotto (for the R-core team) Windows-specific changes to R ============================= There is now a GUI preferences editor on the Edit menu in Rgui. A data entry
2012 Apr 05
0
[LLVMdev] Difference between 2.9 and 3.0 in intel ASM printer
Hi, I'm almost there with my Yasm printer, however, I've stumbled upon this. Using this: ; ModuleID = 'data.u' %window = type { %visobj, %vec2, %vec2, %vec2, %string, %color, i32, i32, %string, %string, %string, i1, i1, i1, i1, i1, i1, i32, i8* } %visobj = type { %object, i1, i1, i1, i1, %color, %vec4, %vec4, %vec4, %vec4, i32, %mat4, %mat4, %mat4, %mat4, %material*, %effect*, i32, i32, i32, float, i8, %visobj*, %vec3, %vec3, %vec3, %vec3...
2016 Mar 11
3
masked-load endpoints optimization
...'re already doing an illegal optimization: define <4 x i32> @load_bonus_bytes(i32* %addr1, <4 x i32> %v) { %ld1 = load i32, i32* %addr1 %addr2 = getelementptr i32, i32* %addr1, i64 3 %ld2 = load i32, i32* %addr2 %vec1 = insertelement <4 x i32> undef, i32 %ld1, i32 0 %vec2 = insertelement <4 x i32> %vec1, i32 %ld2, i32 3 ret <4 x i32> %vec2 } $ ./llc -o - loadcombine.ll ... movups (%rdi), %xmm0 retq On Thu, Mar 10, 2016 at 10:22 PM, Nema, Ashutosh <Ashutosh.Nema at amd.com> wrote: > This looks interesting, the main motivati...
2012 Feb 16
3
[LLVMdev] ASM appears to be incorrect from llc
...ck <matthew.huck at gmail.com> > wrote: > > Hi, > > I'm trying to compile an intermediate representation file to ASM (intel > > style), and I believe that the resultant ASM is invalid. The IR is: > > > > ; ModuleID = 'test.u' > > > > %vec2 = type { float, float } > > @t = global %vec2 zeroinitializer > > @x = global i32 0 > > > > define i32 @main__i__v() nounwind { > > locals: > > %0 = load float* getelementptr inbounds (%vec2* @t, i32 0, i32 0) > > %1 = fptosi float %0 to i64 > >...