Displaying 20 results from an estimated 500 matches similar to: "1.5 on os x?"
2019 Jun 03
2
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
On Mon, 3 Jun 2019 at 20:00, Francesco Petrogalli via cfe-dev <
cfe-dev at lists.llvm.org> wrote:
> Hi All,
>
> The original intent of this thread is to "Expose user provided vector
> function for auto-vectorization."
>
> I originally proposed to use OpenMP `declare variant` for the sake of
> using something that is defined by a standard. The RFC itself is not
2019 Jun 11
2
RFC: Interface user provided vector functions with the vectorizer.
Dear all,
I have rewritten the proposal for interfacing user provided vector
functions, originally posted on both the llvm-dev and cfe-dev mailing
lists:
"[RFC] Expose user provided vector function for auto-vectorization."
The proposal looks quite different from the original submission,
therefore I took the liberty to start a new thread.
The original thread generated some good
2019 Jun 24
2
RFC: Interface user provided vector functions with the vectorizer.
I have an RFC for first-class complex types in LLVM IR pending for some
internal review. I hope to post it soon. That should help address this
problem. Then the vector function signature generation could stay in
LLVM, if I'm understanding the issue correctly.
-David
Francesco Petrogalli via llvm-dev <llvm-dev at lists.llvm.org> writes:
> Hi all - I am
2019 Jun 17
3
RFC: Interface user provided vector functions with the vectorizer.
I agree with Simon. This looks good conceptually. I have minor implementation comments but that can wait till the code reviews.
Sorry for the delay and thanks for working on this.
________________________________
From: Simon Moll <moll at cs.uni-saarland.de>
Sent: Monday, June 17, 2019 10:02:58 AM
To: Francesco Petrogalli; LLVM
2019 Jun 21
2
RFC: Interface user provided vector functions with the vectorizer.
>In all cases, the IR type of the parameters in `foo` is i64, therefore it is not possible to distinguish what C type generated the signature of `foo`.
Ouch.
>I don’t know if this is going to be a problem for other architectures
I haven't checked what IA-32/Intel64 should do for type 2, but I fully agree that this needs to be done properly according to the ABI.
>Therefore, I would
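To illustrate the ambiguity quoted above, here is a minimal C++ sketch. It assumes a hypothetical 8-byte aggregate (`PairOfInts`, not taken from the thread) and an ABI under which the frontend passes such an aggregate as a single 64-bit integer. Clang typically does this for a struct of two 32-bit ints on x86-64 System V, but the exact lowering is target dependent. The helpers only mimic that coercion at the source level: once both a plain 64-bit integer and the aggregate travel as the same 64 bits, the integer type alone no longer tells you which C type produced it.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical 8-byte aggregate used only for illustration.
struct PairOfInts {
    std::int32_t lo;
    std::int32_t hi;
};

// Both types occupy exactly 64 bits, which is what makes the coercion possible.
static_assert(sizeof(PairOfInts) == sizeof(std::int64_t),
              "PairOfInts must be exactly 8 bytes for this sketch");

// Mimic an ABI coercion: reinterpret the aggregate's bytes as one 64-bit integer.
std::int64_t as_i64(PairOfInts p) {
    std::int64_t bits;
    std::memcpy(&bits, &p, sizeof bits);
    return bits;
}

// Inverse of as_i64: recover the aggregate from the 64-bit integer.
PairOfInts from_i64(std::int64_t bits) {
    PairOfInts p;
    std::memcpy(&p, &bits, sizeof p);
    return p;
}
```

A consumer that sees only the 64-bit value cannot tell whether it started life as a `std::int64_t` or as a `PairOfInts`, which is why the source-level type has to be consulted where it is still known.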
2019 Jun 07
2
[RFC] Expose user provided vector function for auto-vectorization.
Hi All,
[I'm only subscribed to digest, so the reply doesn't look great, sorry about that]
> The second component is a tool that other parts of LLVM (for example, the loop vectorizer) can use to query the availability of the vector function, the SVFS I have described in the original post of the RFC, which is based on interpreting the `vector-variant` attribute.
> The final
2019 Jun 24
4
RFC: Interface user provided vector functions with the vectorizer.
@Xinmin, Saito: If Clang/the frontend generates the version there is no problem, or is there? The frontend knows about the original source type and its ABI-specific lowering already.
@Francesco, we should even consider putting the generating capabilities outside of the OpenMP code generation (in the future). That could allow easier reuse by other frontends.
2019 Jun 03
6
[cfe-dev] [RFC] Expose user provided vector function for auto-vectorization.
Hi All,
The original intent of this thread is to "Expose user provided vector function for auto-vectorization."
I originally proposed to use OpenMP `declare variant` for the sake of using something that is defined by a standard. The RFC itself is not about fully implementing the `declare variant` directive. In fact, given the amount of complication it is bringing, I would like to move the
2007 Jun 12
0
[PATCH] Combined checkFTB and capDirection into one checkOrientation function.
---
include/cube.h | 18 +++------
plugins/cube.c | 120 +++++++++++++++++--------------------------------------
2 files changed, 43 insertions(+), 95 deletions(-)
diff --git a/include/cube.h b/include/cube.h
index 0a87626..293bad1 100644
--- a/include/cube.h
+++ b/include/cube.h
@@ -87,16 +87,11 @@ typedef void (*CubePaintInsideProc) (CompScreen *s,
CompOutput *output,
2008 Jun 17
2
[LLVMdev] VFCmp failing when unordered or UnsafeFPMath on x86
Hi Nate!
I don't see how that would work. Select doesn't work per element.
Say we're trying to vectorize the following C++ code:
if(v[0] < 0) v[0] += 1.0f;
if(v[1] < 0) v[1] += 1.0f;
if(v[2] < 0) v[2] += 1.0f;
if(v[3] < 0) v[3] += 1.0f;
With SSE assembly this would be as simple as:
movaps xmm1, xmm0 // v in xmm0
cmpltps xmm1, zero // zero =
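The compare-and-mask idiom that the truncated SSE sequence above starts can be sketched in portable C++. This is an illustration under stated assumptions, not the code from the thread: the function names are hypothetical, and the loop models each SIMD lane with scalar bit operations rather than using real SSE instructions. The comparison yields an all-ones or all-zeros mask per lane, the mask is ANDed with the bit pattern of 1.0f, and the result (1.0f or +0.0f) is added unconditionally.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Per-lane comparison mask: all ones where x < 0, all zeros otherwise.
// This mirrors what a packed less-than compare produces for one lane.
std::uint32_t lane_mask_lt_zero(float x) {
    return x < 0.0f ? 0xFFFFFFFFu : 0u;
}

// Branch-free equivalent of: if (v[i] < 0) v[i] += 1.0f; for each lane.
std::array<float, 4> add_one_where_negative(std::array<float, 4> v) {
    const float one = 1.0f;
    std::uint32_t one_bits;
    std::memcpy(&one_bits, &one, sizeof one_bits);

    for (std::size_t i = 0; i < v.size(); ++i) {
        // AND the mask with the bits of 1.0f: yields 1.0f or +0.0f.
        std::uint32_t masked = lane_mask_lt_zero(v[i]) & one_bits;
        float addend;
        std::memcpy(&addend, &masked, sizeof addend);
        v[i] += addend;  // unconditional add, as a packed add would do
    }
    return v;
}
```

For example, `add_one_where_negative({-2.0f, 3.0f, -0.5f, 0.0f})` leaves the non-negative lanes unchanged and adds 1.0f to the two negative ones. Because the mask is applied per lane, no single whole-vector condition is needed.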
2014 Jun 15
0
[PATCH v2 1/3] nvc0: implement multiple viewports/scissors, enable ARB_viewport_array
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 7 +-
src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 2 +-
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 20 ++--
src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 3 +
src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 27 +++--
2014 Jun 14
0
[PATCH 1/3] nvc0: implement multiple viewports/scissors, enable ARB_viewport_array
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann at mni.thm.de>
---
src/gallium/drivers/nouveau/nvc0/nvc0_context.h | 7 +-
src/gallium/drivers/nouveau/nvc0/nvc0_program.c | 2 +-
src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 20 ++--
src/gallium/drivers/nouveau/nvc0/nvc0_screen.h | 3 +
src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 27 ++++-
2014 Mar 06
0
[PATCH] nv50, nvc0: adjust blit_3d handling of ms output textures
This fixes some unwanted scaling when the output is multisampled. Also
increases nvc0 maximum supported texture size to be able to work with a
32k texture.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
Cc: "10.0 10.1" <mesa-stable at lists.freedesktop.org>
---
Ran the EXT_framebuffer_multisample tests, they improve a lot. The remaining
failures are probably due to
2012 Jul 06
2
[LLVMdev] Excessive register spilling in large automatically generated functions, such as is found in FFTW
Hi,
I've noticed that LLVM tends to generate suboptimal code and spill an
excessive number of registers in large functions, such as in those
that are automatically generated by FFTW.
LLVM generates good code for a function that computes an 8-point
complex FFT, but from 16-point upwards, icc or gcc generates much
better code. Here is an example of a sequence of instructions from a
32-point
2009 Jul 28
0
[PATCH 2/8] nv50: fix viewport transform
The translation also needs to be inverted, and in bypass mode
the state tracker incorrectly assumes that Y = 0 = TOP, so we
need inversion there too; NDC clipping has to be deactivated
explicitly.
---
src/gallium/drivers/nv50/nv50_state_validate.c | 31 +++++++++++++++--------
1 files changed, 20 insertions(+), 11 deletions(-)
diff --git a/src/gallium/drivers/nv50/nv50_state_validate.c
2014 Mar 06
0
[RFC PATCH] nv50: adjust blit_3d logic
---
So... this fixes a whole bunch of EXT_framebuffer_multisample tests, and the
ones that still fail appear to do so due to some resolve error, rather than
some "this is the wrong image" type errors. Perhaps it needs a 2d-style "move
coordinates over a sub-texel" logic. But I'm unclear what these vertices are,
I arrived at this through trial-and-error.
2004 May 04
0
[LLVMdev] Plea for help
On Tue, 4 May 2004, Finn S Andersen wrote:
> Chris wrote in a followup:
>
> > Can you send the output of 'llc -o - foo.bc -debug -print-machineinstrs'?
>
>
> Attached as "linscan". (But added the "-regalloc=linearscan" to provoke
> the error).
Yes, that's exactly what I meant... thanks for reading my mind! :)
It looks like this is where
2009 Jul 12
0
[PATCH 2/3] nv50: fix viewport transform
We need to invert the viewport translate/scale parameters
when the state tracker thinks we have Y_0_TOP.
In these cases, we have to do bypass mode by setting an
identity viewport transform for x, z and inversion for y,
or e.g. clear_with_quad won't work correctly.
Clipping for xy in NDC space needs to be disabled then.
---
src/gallium/drivers/nv50/nv50_context.h | 1 +
2019 Jun 24
2
RFC: Interface user provided vector functions with the vectorizer.
>Thank you everybody for their input, and for your patience. This is proving harder than expected! :)
Thank you for doing the hard part of the work.
Hideki
-----Original Message-----
From: Francesco Petrogalli [mailto:Francesco.Petrogalli at arm.com]
Sent: Monday, June 24, 2019 11:26 AM
To: Saito, Hideki <hideki.saito at intel.com>
Cc: Doerfert, Johannes <jdoerfert at anl.gov>;
2017 Jun 11
0
[RFC 4/9] tgsi: populate precise
Only implemented for glsl->tgsi. Other converters just set precise to 0.
Signed-off-by: Karol Herbst <karolherbst at gmail.com>
---
src/gallium/auxiliary/tgsi/tgsi_build.c | 3 +++
src/gallium/auxiliary/tgsi/tgsi_ureg.c | 14 +++++++---
src/gallium/auxiliary/tgsi/tgsi_ureg.h | 20 +++++++++++---
src/gallium/auxiliary/util/u_simple_shaders.c | 2 +-