thr3ads.net - similar to: "1.5 on os x?"

Displaying 20 results from an estimated 500 matches similar to: "1.5 on os x?"

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 03

On Mon, 3 Jun 2019 at 20:00, Francesco Petrogalli via cfe-dev <cfe-dev at lists.llvm.org> wrote: > Hi All, > > The original intent of this thread is to "Expose user provided vector function for auto-vectorization." > > I originally proposed to use OpenMP `declare variant` for the sake of using something that is defined by a standard. The RFC itself is not

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 11

Dear all, I have re-written the proposal for interfacing user provided vector functions, originally posted in both llvm-dev and cfe-dev mailing list: "[RFC] Expose user provided vector function for auto-vectorization." The proposal looks quite different from the original submission, therefore I took the liberty to start a new thread. The original thread generated some good

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 24

I have an RFC for first-class complex types in LLVM IR pending for some internal review. I hope to post it soon. That should help address this problem. Then the vector function signature generation could stay in LLVM, if I'm understanding the issue correctly. -David Francesco Petrogalli via llvm-dev <llvm-dev at lists.llvm.org> writes: > Hi all - I am

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 17

I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews. Sorry for the delay and thanks for working on this. From: Simon Moll <moll at cs.uni-saarland.de> Sent: Monday, June 17, 2019 10:02:58 AM To: Francesco Petrogalli; LLVM

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 21

>In all cases, the IR type of the parameters in `foo` is i64, therefore it is not possible to distinguish what C type generated the signature of `foo`. Ouch. >I don’t know if this is going to be a problem for other architectures I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI. >Therefore, I would

[RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 07

Hi All, [I'm only subscribed to digest, so the reply doesn't look great, sorry about that] > The second component is a tool that other parts of LLVM (for example, the loop vectorizer) can use to query the availability of the vector function, the SVFS I have described in the original post of the RFC, which is based on interpreting the `vector-variant` attribute. > The final

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 24

@Xinmin, Saito: If Clang/the frontend generates the version there is no problem, or is there? The frontend knows about the original source type and its ABI-specific lowering already. @Francesco, we should even consider putting the generating capabilities outside of the OpenMP code generation (in the future). That could allow easier reuse by other frontends.

[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.

2019 Jun 03

Hi All, The original intent of this thread is to "Expose user provided vector function for auto-vectorization." I originally proposed to use OpenMP `declare variant` for the sake of using something that is defined by a standard. The RFC itself is not about fully implementing the `declare variant` directive. In fact, given the amount of complication it is bringing, I would like to move the

[PATCH] Combined checkFTB and capDirection into one checkOrientation function.

2007 Jun 12

--- include/cube.h | 18 +++------ plugins/cube.c | 120 +++++++++++++++++-------------------------------------- 2 files changed, 43 insertions(+), 95 deletions(-) diff --git a/include/cube.h b/include/cube.h index 0a87626..293bad1 100644 --- a/include/cube.h +++ b/include/cube.h @@ -87,16 +87,11 @@ typedef void (*CubePaintInsideProc) (CompScreen *s, CompOutput *output,

[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86

2008 Jun 17

Hi Nate! I don't see how that would work. Select doesn't work per element. Say we're trying to vectorize the following C++ code: if(v[0] < 0) v[0] += 1.0f; if(v[1] < 0) v[1] += 1.0f; if(v[2] < 0) v[2] += 1.0f; if(v[3] < 0) v[3] += 1.0f; With SSE assembly this would be as simple as: movaps xmm1, xmm0 // v in xmm0 cmpltps xmm1, zero // zero =

[PATCH v2 1/3] nvc0: implement multiple viewports/scissors, enable ARB_viewport_array

2014 Jun 15

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de> --- src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 7 +- src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 2 +- src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 20 ++-- src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 3 + src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 27 +++--

[PATCH 1/3] nvc0: implement multiple viewports/scissors, enable ARB_viewport_array

2014 Jun 14

[PATCH] nv50, nvc0: adjust blit_3d handling of ms output textures

2014 Mar 06

This fixes some unwanted scaling when the output is multisampled. Also increases nvc0 maximum supported texture size to be able to work with a 32k texture. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> Cc: "10.0 10.1" <mesa-stable at lists.freedesktop.org> --- Ran the EXT_framebuffer_multisample tests, they improve a lot. The remaining failures are probably due to

[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW

2012 Jul 06

Hi, I've noticed that LLVM tends to generate suboptimal code and spill an excessive amount of registers in large functions, such as in those that are automatically generated by FFTW. LLVM generates good code for a function that computes an 8-point complex FFT, but from 16-point upwards, icc or gcc generates much better code. Here is an example of a sequence of instructions from a 32-point

[PATCH 2/8] nv50: fix viewport transform

2009 Jul 28

The translation also needs to be inverted, and in bypass mode the state tracker incorrectly assumes that Y = 0 = TOP, so we need inversion there too; NDC clipping has to be deactivated explicitly. --- src/gallium/drivers/nv50/nv50_state_validate.c | 31 +++++++++++++++-------- 1 files changed, 20 insertions(+), 11 deletions(-) diff --git a/src/gallium/drivers/nv50/nv50_state_validate.c

[RFC PATCH] nv50: adjust blit_3d logic

2014 Mar 06

--- So... this fixes a whole bunch of EXT_framebuffer_multisample tests, and the ones that still fail appear to do so due to some resolve error, rather than some "this is the wrong image" type errors. Perhaps it needs a 2d-style "move coordinates over a sub-texel" logic. But I'm unclear what these vertices are, I arrived at this through trial-and-error.

[LLVMdev] Plea for help

2004 May 04

On Tue, 4 May 2004, Finn S Andersen wrote: > Chris wrote in a followup: > > > Can you send the output of 'llc -o - foo.bc -debug -print-machineinstrs'? > > > Attached as "linscan". (But added the "-regalloc=linearscan" to provoke > the error). Yes, that's exactly what I meant... thanks for reading my mind! :) It looks like this is where

[PATCH 2/3] nv50: fix viewport transform

2009 Jul 12

We need to invert the viewport translate/scale parameters when the state tracker thinks we have Y_0_TOP. In these cases, we have to do bypass mode by setting an identity viewport transform for x, z and inversion for y, or e.g. clear_with_quad won't work correctly. Clipping for xy in NDC space needs to be disabled then. --- src/gallium/drivers/nv50/nv50_context.h | 1 +

RFC: Interface user provided vector functions with the vectorizer.

2019 Jun 24

>Thank you everybody for their input, and for your patience. This is proving harder than expected! :) Thank you for doing the hard part of the work. Hideki -----Original Message----- From: Francesco Petrogalli [mailto:Francesco.Petrogalli at arm.com] Sent: Monday, June 24, 2019 11:26 AM To: Saito, Hideki <hideki.saito at intel.com> Cc: Doerfert, Johannes <jdoerfert at anl.gov>;

[RFC 4/9] tgsi: populate precise

2017 Jun 11

Only implemented for glsl->tgsi. Other converters just set precise to 0. Signed-off-by: Karol Herbst <karolherbst at gmail.com> --- src/gallium/auxiliary/tgsi/tgsi_build.c | 3 +++ src/gallium/auxiliary/tgsi/tgsi_ureg.c | 14 +++++++--- src/gallium/auxiliary/tgsi/tgsi_ureg.h | 20 +++++++++++--- src/gallium/auxiliary/util/u_simple_shaders.c | 2 +-
