thr3ads.net - similar to: "[LLVMdev] Failure to optimize vector select"

Displaying 20 results from an estimated 4000 matches similar to: "[LLVMdev] Failure to optimize vector select"

[LLVMdev] Failure to optimize vector select

2013 Aug 20

Have you tried running SLP vectorizer pass (-vectorize-slp)? Eugene On Mon, Aug 19, 2013 at 9:04 PM, Matt Arsenault <arsenm2 at gmail.com> wrote: > Hi, > > I've found a case I would expect would optimize easily, but it doesn't. A > simple implementation of vector select: > > float4 simple_select(float4 a, float4 b, int4 c) > { > float4 result; >

[LLVMdev] Failure to optimize vector select

2013 Aug 20

On Aug 19, 2013, at 18:47 , Eugene Toder <eltoder at gmail.com> wrote: > Have you tried running SLP vectorizer pass (-vectorize-slp)? Yes. That was the first thing I tried, and it didn't do anything. I was looking at the vectorizer, but then I saw some things that made me wonder if it was even supposed to do this

[LLVMdev] Failure to optimize vector select

2013 Aug 20

Hi Matt, This code maintains a vector of float4 and it inserts and extracts values from this vector. The ’select’ operations are already vectorized. Maybe a sequence of inst-combines (or DAG-combines) can help. If you re-write this code using scalars then the slp-vectorizer, with some tweaks, will be able to catch it. Thanks, Nadav On Aug 20, 2013, at 1:14 PM, Matt Arsenault <arsenm2 at

[LLVMdev] Failure to optimize vector select

2013 Aug 20

Can you send the IR of the function? On Aug 20, 2013, at 8:36 AM, Matt Arsenault <arsenm2 at gmail.com> wrote: > > On Aug 19, 2013, at 18:47 , Eugene Toder <eltoder at gmail.com> wrote: > >> Have you tried running SLP vectorizer pass (-vectorize-slp)? > Yes. That was the first thing I tried, and it didn't do anything. I was looking at the vectorizer, but then

[LLVMdev] Failure to optimize vector select

2013 Aug 20

On Aug 20, 2013, at 14:49 , Nadav Rotem <nrotem at apple.com> wrote: > Hi Matt, > > This code maintains a vector of float4 and it inserts and extracts values from this vector. The ’select’ operations are already vectorized. Maybe a sequence of inst-combines (or DAG-combines) can help. If you re-write this code using scalars then the slp-vectorizer, with some tweaks, will be able

[LLVMdev] Failure to optimize vector select

2013 Aug 20

On Aug 20, 2013, at 10:22 , Nadav Rotem <nrotem at apple.com> wrote: > Can you send the IR of the function? Attached is the -O0 and -O3 IR (attachment: vselect_optimized.ll, 1545 bytes)

[LLVMdev] How to vectorize a vector type cast?

2012 Feb 28

Since Clang does not seem to allow type casts, such as uchar4 to float4, between vector types, it seems it is necessary to write them as element by element conversions, such as typedef float float4 __attribute__((ext_vector_type(4))); typedef unsigned char uchar4 __attribute__((ext_vector_type(4))); float4 to_float4(uchar4 in) { float4 out = {in.x, in.y, in.z, in.w}; return out; } Running

Static assert fails when compiler for i386

2019 Oct 17

Hi Devs, Consider the test case below. $cat test.cpp #include <vector> #include <type_traits> typedef int _int4 __attribute__((vector_size(16))); typedef union{ int data[4]; struct {int x, y, z, w;}; _int4 vec; } int4; typedef int4 int3; int main() { static_assert(std::alignment_of<int4>::value <= alignof(max_align_t), "over aligned!"); } $clang++ -m32 error:

[LLVMdev] Generalizing shuffle vector

2008 Sep 30

Hi, The current definition of shuffle vector is <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <n x i32> <mask> ; yields <n x <ty>> The first two operands of a 'shufflevector' instruction are vectors with types that match each other and types that match the result of the instruction. The third
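As a rough illustration of the definition quoted above (not taken from the thread), the mask can be read as a list of indices into the concatenation of v1 and v2, so index 0 picks the first element of v1 and index n picks the first element of v2. The Python sketch below models that with plain lists. The function name `shufflevector` and the use of `None` for an undef mask element are assumptions made for illustration only.

```python
def shufflevector(v1, v2, mask):
    """Model the quoted definition: both inputs and the mask have n elements.

    Each mask entry indexes into v1 followed by v2 (0 .. 2n-1).
    A mask entry of None stands in for an undef element (an assumption
    of this sketch) and produces None in the result.
    """
    n = len(v1)
    if len(v2) != n:
        raise ValueError("v1 and v2 must have the same number of elements")
    if len(mask) != n:
        raise ValueError("mask must have the same number of elements as the inputs")

    combined = list(v1) + list(v2)
    result = []
    for index in mask:
        if index is None:
            result.append(None)  # undef lane
        elif 0 <= index < 2 * n:
            result.append(combined[index])
        else:
            raise ValueError("mask index out of range: %r" % (index,))
    return result


# Interleave lanes from the two inputs: lanes 0 and 2 from v1, lanes 1 and 3 from v2.
blended = shufflevector([1, 2, 3, 4], [5, 6, 7, 8], [0, 5, 2, 7])
```

The length check on the mask mirrors the `<n x i32>` mask type in the quoted definition; a version that accepts a mask of a different length would return a result with as many elements as the mask.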

[LLVMdev] About JIT by LLVM 2.9 or later

2011 Nov 02

Hello guys, Thanks for your help when you are busy. I am working on an open source project. It supports a shader language and I want a JIT feature, so LLVM is used. But now I find that the ABI and calling convention do not work with MSVC. For example, I have the following code: struct float4 { float x, y, z, w; }; struct float4x4 { float4 x, y, z, w; }; float4 fetch_vs( float4x4* mat

[LLVMdev] Functions: sret and readnone

2009 Oct 05

Hi all, I'm currently building a DSL for a computer graphics project that is not unlike NVIDIA's Cg. I have an intrinsic with the following signature float4 sample(texture tex, float2 coords); that is translated to this LLVM IR code: declare void @"sample"(%float4* noalias nocapture sret, %texture, %float2) nounwind readnone The type float4 is basically an array of four

[LLVMdev] Win64 Calling Convention problem

2009 Dec 03

Hi! I have discovered a problem with LLVM's interpretation of the Win64 calling convention w.r.t. passing of aggregates as arguments. The following code is part of my host application that is compiled with Visual Studio 2005 in 64-bit debug mode. noise4 expects a structure of four floats as its first and only argument, which is - in accordance with the specs of the Win64 calling convention -

RODBC sqlSave problem.

2003 Apr 02

Dear list, Being new to the postgres database, ODBC and the RODBC interface, I am somewhat confused by some of the problems I am experiencing trying to connect R to the database. What I am trying is basically the example part of the help file for the sqlSave function: > library(RODBC) > odbcConnect("theodor") -> channel > data(USArrests) > sqlSave(channel,

[LLVMdev] Functions: sret and readnone

2009 Nov 05

It's been a while and I finally had the time to look into this. What I did was to build a custom AliasAnalysis pass, as Chris suggested, that returns AliasAnalysis::Mod for values passed to the sample function in the sret spot, and NoModRef for all other values. I'm also returning AliasAnalysis::AccessesArguments in the pass' getModRefBehavior methods. However, I haven't been

[LLVMdev] Making GEP into vector illegal?

2008 Oct 14

In Joe programmer language (i.e. C ;) ), are we basically talking about disallowing: float4 a; float* ptr_z = &a.z; ? Won't programmers just resort to: float4 a; float* ptr_z = (float*)(&a) + 3; ? On Oct 14, 2008, at 3:55 PM, Mon Ping Wang wrote: > Hi, > > Something like a sequential type makes sense especially in light of > what Duncan is pointing out. I agree

How to handle INT8 data

2017 Jan 20

Right, they are identifiers. Storing them as strings has drawbacks: - huge to store in memory - slow to process - huge to index (e.g. by data.table column indexes) Why not store them as numeric? Thanks, On 20 Jan 2017 at 18:16, William Dunlap wrote: > If these are identifiers, store them as strings. If not, what sort of > calculations do you plan on doing with them? > Bill

How to handle INT8 data

2017 Jan 20

Hello R users, I have to deal with int8 data in R. AFAIK R only handles int4 with the `as.integer` function [1]. I wonder: 1. what is the best approach to handle int8? `as.character`? `as.numeric`? 2. is there any plan to handle int8 in the future? As you might know, int4 is too small to deal with the earth's population right now. Thanks for your ideas, int8 eg: human_id

[LLVMdev] Making GEP into vector illegal?

2008 Oct 14

On Tue, Oct 14, 2008 at 1:34 PM, Daniel M Gessel <gessel at apple.com> wrote: > In Joe programmer language (i.e. C ;) ), are we basically talking > about disallowing: > > float4 a; > float* ptr_z = &a.z; > > ? That's my reading as well; the argument for not allowing it is just to make optimization easier. We don't allow addressing individual bits either,

[LLVMdev] Generalizing shuffle vector

2008 Sep 30

Hi Mon Ping, Generalizing shufflevector would be great. I have an additional suggestion below. On 29-Sep-08, at 11:11 PM, Mon Ping Wang wrote: > I am proposing to extend the shuffle vector definition to be > <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <m x i32> > <mask> ; yields <m x <ty>> > > The

[LLVMdev] Vector instructions

2008 Jun 27

On Jun 27, 2008, at 8:02 AM, Stefanus Du Toit wrote: >>>> <result> = shufflevector <a x <ty>> <v1>, <b x <ty>> <v2>, <d x >>>> i32> >>>> <mask> ; yields <d x <ty>> >>> >>> With the requirement that the entries in the (still constant) mask >>> are >>> within
