thr3ads.net - search: "float16"

Displaying 20 results from an estimated 20 matches for "float16".

2009 Feb 05

[LLVMdev] 16 bit floats

...add f16 support? probably also code generation (can't give specifics, no real expert on the LLVM codebase). this would be because, even if the core typesystem knows of the type, the codegen might not know how to emit operations on that type. now, of note: in my project (not LLVM based), float16 had not been supported directly (since it is not known to the CPU), rather, some loader and saver thunks were used which converted to/from float32 (this used as the 'internal' representation of the type). in most cases, I would think this would be faster than directly operating on the float...
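
A minimal sketch of the loader/saver thunk approach described in the snippet above, not the poster's actual code. The names `load_f16`, `store_f16`, and `add_f16` are hypothetical, and Python's 64-bit `float` stands in for the float32 internal representation mentioned in the post. The half-precision `e` format of the standard `struct` module does the conversion.

```python
import struct


def load_f16(bits: int) -> float:
    """Loader thunk (hypothetical name): widen 16 raw IEEE 754 binary16 bits."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def store_f16(value: float) -> int:
    """Saver thunk (hypothetical name): round a wide float back to 16 raw bits."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def add_f16(a_bits: int, b_bits: int) -> int:
    # The arithmetic itself runs in the wider internal representation;
    # only the load and the store touch the 16-bit encoding.
    return store_f16(load_f16(a_bits) + load_f16(b_bits))
```

With this layout only the thunks need to know the 16-bit encoding; every other operation reuses the existing wide-float code path.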

2009 Feb 05

[LLVMdev] 16 bit floats

I need to support 16 bit floats for some operations. Outside of datatypes.td and the constants class, is there anything else I will need to modify to add f16 support? Thanks, Micah Villmow Systems Engineer Advanced Technology & Performance Advanced Micro Devices Inc. S1-609 One AMD Place Sunnyvale, CA. 94085 P: 408-749-3966

2009 Feb 05

[LLVMdev] 16 bit floats

...also code generation (can't give specifics, no real expert > on the LLVM codebase). > this would be because, even if the core typesystem knows of the > type, the codegen might not know how to emit operations on that type. > > now, of note: > in my project (not LLVM based), float16 had not been supported > directly (since it is not known to the CPU), rather, some loader and > saver thunks were used which converted to/from float32 (this used as > the 'internal' representation of the type). in most cases, I would > think this would be faster than dir...

2012 Apr 11

[LLVMdev] float16/half float support situation? (and a problem)

OpenCL defines half data type, and it seems clang accepts this and generates code for it. The backend support for operations with fp16 seems to be missing and it works (or should work?) by converting these to fp32 for the actual calculations? But I'm having problems with this. first I just tried to use fp16 data type, without any support in backend. This was expected to fail. I got

2009 Feb 05

[LLVMdev] 16 bit floats

...f16 support? probably also code generation (can't give specifics, no real expert on the LLVM codebase). this would be because, even if the core typesystem knows of the type, the codegen might not know how to emit operations on that type. now, of note: in my project (not LLVM based), float16 had not been supported directly (since it is not known to the CPU), rather, some loader and saver thunks were used which converted to/from float32 (this used as the 'internal' representation of the type). in most cases, I would think this would be faster than directly operating on the float...

2019 Jul 12

[cfe-dev] ARM float16 intrinsic test

...t checkout llvmorg-8.0.0 -b llvm8.0 cmake -G "Unix Makefiles" ../llvm-project/llvm -DCMAKE_BUILD_TYPE=Debug -DLLVM_ENABLE_PROJECTS="clang;lld" -DLLVM_TARGETS_TO_BUILD="X86;NVPTX;AMDGPU;ARM;AArch64" [arm.cpp] #define vst4_lane_f16(__p0, __p1, __p2) __extension__ ({ \ float16x4x4_t __s1 = __p1; \ __builtin_neon_vst4_lane_v(__p0, __s1.val[0], __s1.val[1], __s1.val[2], __s1.val[3], __p2, 8); \ }) typedef __fp16 float16_t; typedef __attribute__((neon_vector_type(4))) float16_t float16x4_t; typedef struct float16x4x4_t { float16x4_t val[4]; } float16x4x4_t; void test_vs...

2019 Jul 12

[cfe-dev] ARM float16 intrinsic test

...torize-loops -vectorize-slp -o - -x c++ arm.cpp -faddrsig 1. <eof> parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module 'arm.cpp'. 4. Running pass 'ARM Instruction Selection' on function '@_Z18test_vst4_lane_f16PDh13float16x4x4_t' #0 0x000000000444190d llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/nancy/rpp_llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:495:0 #1 0x00000000044419a0 PrintStackTraceSignalHandler(void*) /home/nancy/rpp_llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:559:0 #2 0x0...

2020 May 05

RFC: [GlobalISel] propagating int/float type information

I don’t think bfloat should be handled this way. What Amara is suggesting is an optimization, i.e., if we drop the information we are still correct. With bfloat, if we do an operation on float16 instead of bfloat16 this is a correctness problem. So that means that either we need to have new opcodes for bfloat or we need to carry around the floating point type in MIR. I think it would be more manageable to have the floating point type long term. That said, it also depends on what we decide...
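
To illustrate why confusing the two formats is a correctness problem rather than a missed optimization, here is a small sketch that decodes the same 16 bits both ways. The helper names `decode_float16` and `decode_bfloat16` are hypothetical; the sketch relies only on the standard layouts (float16 is IEEE 754 binary16, bfloat16 is the upper 16 bits of a binary32 value).

```python
import struct


def decode_float16(bits: int) -> float:
    """Interpret 16 bits as IEEE 754 binary16 (1 sign, 5 exponent, 10 mantissa)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def decode_bfloat16(bits: int) -> float:
    """Interpret 16 bits as bfloat16 (1 sign, 8 exponent, 7 mantissa)."""
    # bfloat16 is the top half of a binary32 value, so pad with 16 zero bits.
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]


bits = 0x3F80
as_half = decode_float16(bits)      # 1.875
as_bfloat = decode_bfloat16(bits)   # 1.0
```

The same register contents therefore denote different numbers depending on the format, so an operation selected for the wrong type produces a wrong result.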

2008 Sep 30

[LLVMdev] Generalizing shuffle vector

On Mon, Sep 29, 2008 at 8:11 PM, Mon Ping Wang <wangmp at apple.com> wrote: > The problem with generating insert and extracts is that we can generate poor > code > %tmp16 = extractelement <4 x float> %f4b, i32 0 > %f8a = insertelement <8 x float> %f8a, float %tmp16, i32 0 > %tmp18 = extractelement <4 x float> %f4b, i32 1 > %f8c

2011 Oct 20

[LLVMdev] Re : ANN: libclc (OpenCL C library implementation)

...LLVM intrinsics, and is built around pure C macros. Clover uses a slightly more complex system, involving a Python script "compiling" a set of built-ins into four files. For example, this declaration (REPL is a macro that does a simple for()) : ---- def vecf : float2 float3 float4 float8 float16 native $type acospi $vecf : x:$type REPL($vecdim) result[i] = std::acos(x[i]) / M_PI; end ---- Is compiled to these fragments, one for each vector type (float2, float3, etc) : ---- // In stdlib_def.h : what the OpenCL C kernel sees float2 OVERLOAD acospi(float2 x); // In stdlib_impl...

2020 May 06

RFC: [GlobalISel] propagating int/float type information

...Arsenault > Subject: Re: [llvm-dev] RFC: [GlobalISel] propagating int/float type information > > I don’t think bfloat should be handled this way. What Amara is suggesting is an optimization, i.e., if we drop the information we are still correct. > With bfloat, if we do an operation on float16 instead of bfloat16 this is a correctness problem. > > So that means that either we need to have new opcodes for bfloat or we need to carry around the floating point type in MIR. I think it would be more manageable to have the floating point type long term. > That said, it also depends on...

2012 Nov 21

[LLVMdev] LLVM Archive Format Extension Proposal

On Nov 21, 2012, at 8:55 AM, Relph, Richard wrote: > AMD would like to add new functionality to ranlib (and later ar and nm) and to the bits of LLVM Core that read (and later write) archives. > Herewith a terse summary of the change, which we want to improve support of OpenCL for multiple GPUs in a single run-time. > > Conceptually, a serialized archive is really 2 pieces: a few

2008 Sep 30

[LLVMdev] Generalizing shuffle vector

Hi, The current definition of shuffle vector is <result> = shufflevector <n x <ty>> <v1>, <n x <ty>> <v2>, <n x i32> <mask> ; yields <n x <ty>> The first two operands of a 'shufflevector' instruction are vectors with types that match each other and types that match the result of the instruction. The third

2011 Oct 20

[LLVMdev] ANN: libclc (OpenCL C library implementation)

Hi Carlos, On 10/20/11 9:54 AM, Carlos Sánchez de La Lama wrote: >> The project started as a use-case for our "Whole-Function Vectorization" >> library, which allows to transform a function to compute the same as W >> executions of the original code by using SIMD instructions (W = 4 for >> SSE/AltiVec, 8 for AVX). > > Quite interesting. We were planning to

2012 Nov 21

[LLVMdev] LLVM Archive Format Extension Proposal

AMD would like to add new functionality to ranlib (and later ar and nm) and to the bits of LLVM Core that read (and later write) archives. Herewith a terse summary of the change, which we want to improve support of OpenCL for multiple GPUs in a single run-time. Conceptually, a serialized archive is really 2 pieces: a few header members and a set of normal file members. There are no constraints on

2019 Aug 16

Wine release 4.14

...nri Verbeet (49): wined3d: Pass a wined3d_context_gl structure to context_set_current(). wined3d: Return a wined3d_context_gl structure from context_get_current(). wined3d: Use d3d_info to determine BGRA vertex support in context_update_stream_info(). wined3d: Get rid of the float16 fallback in context_update_stream_info(). wined3d: Use d3d_info to determine whether shader outputs need interpolation qualifiers. wined3d: Store sRGB read control support in struct wined3d_d3d_info. wined3d: Store sRGB write control support in struct wined3d_d3d_info. wined...

2014 Oct 03

[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)

...lit generated_tests/cl/builtin/math/builtin-float-cos-1.0.generated.c) I have been seeing random failures (incorrect results) for some time and tried to investigate. the weird part is that the failures are not 100% reproducible, sometimes the tests pass, or partly pass (it's usually float8 and float16 subtests that fail). Failure is always the same "Expecting -0.925879 (0xbf6d0668) with tolerance 0.000000 (2 ulps), but got nan (0x7fc00000)" although the position may vary. even if the same value was computed earlier in the results array The first patch of this series does not change th...

2020 May 01

RFC: [GlobalISel] propagating int/float type information

Hi, GlobalISel currently drops all type information relating to the integer/FP distinction during the IR translation pass, as the LLT types only represent whether a value is a scalar/vector/pointer and its size/shape. To compensate, later passes use the FP operations on those values to guess what kind of value is being stored within that virtual register. This means that i32/float loads get

2011 Apr 15

Wine release 1.3.18

...a. d3dcompiler: Fix HeapAlloc/HeapFree for type members in the reflection parser. d3dx9: Make some functions inline. d3dx9: Parse effect pass and technique. Stefan Dösinger (3): wined3d: Don't drop VBOs for full buffer reloading without conversion. wined3d: Remove FLOAT16 vertex attribute conversion support. wined3d: Only acquire a context in buffer::PreLoad if we have to. Stefan Leichter (1): scarddlg: New dll stub. Thomas Mullaly (6): include: Updated INTERNETFEATURELIST enum and flags. urlmon/tests: Added tests for CoInternetIsFeatureEna...

2009 Aug 07

Wine release 1.1.27

...nstants dirty. wined3d: ARB clipplane init needs the helper constant. wined3d: Only use WINE_normalized_texrect if ARB_texture_np2 is supported. wined3d: Preload the correct texture location. wined3d: Enable WINED3DFMT_R16G16B16A16_UNORM. wined3d: Not all cards support float16 filtering. ddraw: d3d7 does not support two sided stencil. wined3d: Watch out about higher constants when clamping ps 1.x consts. d3d: Filter R8G8B8 in d3d8 and d3d9. wined3d: Filter WINED3DSTENCILCAPS_TWOSIDED in d3d8. wined3d: Dirtify the correct state. Stefan Leich...
