Displaying 20 results from an estimated 1576 matches for "laning".
2017 Dec 20
2
[PATCH] gm107/ir: use lane 0 for manual textureGrad handling
This is parallel to the pre-SM50 change which does this. Adjusts the
shuffles / quadops to make the values correct relative to lane 0, and
then splat the results to all lanes for the final move into the target
register.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
Entirely untested beyond compilation. Should check
bin/tex-miplevel-selection textureGrad Cube
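For readers unfamiliar with quad-level derivatives, here is a minimal host-side sketch of the idea in the patch description: make the neighbouring lanes' values available relative to lane 0 of the 2x2 quad, form the derivative there, and splat the result to every lane. The lane layout and helper names are illustrative assumptions, not the actual gm107 shuffle/quadop sequence.

// Host-side model only: lane 1 is assumed to be lane 0's horizontal
// neighbour and lane 2 its vertical neighbour within the 2x2 quad.
#include <array>
#include <cstdio>

struct Quad { std::array<float, 4> lane; }; // lanes 0..3 of one quad

static Quad ddx_splat(const Quad &v) {
    float d = v.lane[1] - v.lane[0];  // derivative relative to lane 0
    return {{d, d, d, d}};            // splat the result to all lanes
}

static Quad ddy_splat(const Quad &v) {
    float d = v.lane[2] - v.lane[0];
    return {{d, d, d, d}};
}

int main() {
    Quad coord = {{1.0f, 1.5f, 2.0f, 2.5f}};
    Quad dx = ddx_splat(coord), dy = ddy_splat(coord);
    std::printf("ddx=%.2f ddy=%.2f\n", dx.lane[0], dy.lane[0]);
}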
2017 Dec 20
0
[PATCH] gm107/ir: use lane 0 for manual textureGrad handling
On Tue, Dec 19, 2017 at 11:41 PM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:
> This is parallel to the pre-SM50 change which does this. Adjusts the
> shuffles / quadops to make the values correct relative to lane 0, and
> then splat the results to all lanes for the final move into the target
> register.
>
> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
>
2017 Sep 19
1
Describing subreg load for vectors without using vector_insert
Hi,
We are using a vector_insert in our target, to describe an instruction
performing a lane-load of a vector register as:
(set $dstReg, (vector_insert $dstReg, (load $addr), imm:$lane))
However, this means that the dstReg is also marked as used in the
instruction, which we do not want. We can do a direct lane-load to a part
of the vector register without disturbing the rest, and hence would
2020 Nov 06
2
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
On 11/6/20 12:39 PM, Sjoerd Meijer wrote:
Hello Simon,
Thanks for your replies, very useful. And yes, thanks for the example and making the target differences clear:
; Some examples:
; RISC-V V & VE(*):
; %mask = (splat i1 1)
; %evl = min(256, %n - %i)
; MVE/SVE :
; %mask = get.active.lane.mask(%i, %n)
; %evl = call @llvm.vscale()
; AVX:
; %mask = icmp (%i + (seq
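As a rough model of what these two styles compute per strip-mined iteration, here is a small scalar sketch (VF and the helper names are illustrative assumptions, not vectorizer code): an MVE/SVE-style active-lane mask, where lane j is live iff %i + j < %n, next to a RISC-V V / VE-style explicit vector length %evl = min(VF, %n - %i).

#include <cstdio>

constexpr int VF = 8; // assumed vector factor, standing in for the 256 above

// get.active.lane.mask(i, n): lane j is active iff i + j < n.
static void active_lane_mask(long i, long n, bool mask[VF]) {
    for (int j = 0; j < VF; ++j)
        mask[j] = (i + j) < n;
}

// %evl = min(VF, n - i): how many leading lanes the instruction processes.
static long evl(long i, long n) {
    long rem = n - i;
    return rem < VF ? rem : VF;
}

int main() {
    long n = 13;
    for (long i = 0; i < n; i += VF) {
        bool m[VF];
        active_lane_mask(i, n, m);
        std::printf("i=%2ld evl=%ld mask=", i, evl(i, n));
        for (int j = 0; j < VF; ++j) std::printf("%d", m[j]);
        std::printf("\n");
    }
}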
2013 Jun 19
1
[LLVMdev] Register coalescer and reg_sequence (virtual super-regs)
Was it the subreg lane masks / mapping that was added to address the missed
coalescing? This solution is nice, but I don't think it'll work for me. I
have 8-element vector registers that can be grouped into virtual super regs
for bulk save/restore, and as soon as I have more than 4 in a tuple, the
unsigned int used to hold the lane masks overflows and switches over to the
"bit 31 set
2014 Apr 16
16
[Bug 77529] New: NVS 510 DP-3 output doesn't work
https://bugs.freedesktop.org/show_bug.cgi?id=77529
Priority: medium
Bug ID: 77529
Assignee: nouveau at lists.freedesktop.org
Summary: NVS 510 DP-3 output doesn't work
QA Contact: xorg-team at lists.x.org
Severity: normal
Classification: Unclassified
OS: All
Reporter: tex at sergio.spb.ru
2012 Oct 18
13
[PATCH 00/10] extract dp helper functions
Hi all,
I've frustrated myself the last few days yelling at our link training code.
Comparing the i915 code to radeon and nouveau, I've noticed the lack of a nice
set of dp helper functions, so I've started to extract a few.
There's lots more that we can do I think (link configuration selection, the i2c
over aux retry stuff which diverges already between i915 and radeon, maybe
2017 Oct 17
3
[RFC] Adding Intrinsics for Masked Vector Integer Division and Remainder
Introduction
==========
We would like to add support for masked vector signed/unsigned integer division and remainder in the LLVM IR by introducing new target-independent intrinsics.
This follows similar work which was done already for masked vector loads and stores - http://lists.llvm.org/pipermail/llvm-dev/2014-October/078059.html.
Another relevant reference is the masked scatter/gather
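A scalar model of the semantics such intrinsics would need (illustrative only; the intrinsic names and signatures proposed in the RFC are not reproduced here): masked-off lanes must not execute the division, so a zero divisor in an inactive lane can neither trap nor introduce undefined behavior, and inactive lanes take a pass-through value.

#include <array>
#include <cstddef>
#include <cstdio>

template <std::size_t N>
std::array<int, N> masked_sdiv(const std::array<int, N> &a,
                               const std::array<int, N> &b,
                               const std::array<bool, N> &mask,
                               const std::array<int, N> &passthru) {
    std::array<int, N> r{};
    for (std::size_t i = 0; i < N; ++i)
        r[i] = mask[i] ? a[i] / b[i]   // only active lanes divide
                       : passthru[i];  // inactive lanes never touch b[i]
    return r;
}

int main() {
    std::array<int, 4> a{8, 9, 10, 11}, b{2, 0, 5, 0}, pass{-1, -1, -1, -1};
    std::array<bool, 4> m{true, false, true, false};
    auto r = masked_sdiv(a, b, m, pass);
    std::printf("%d %d %d %d\n", r[0], r[1], r[2], r[3]); // 4 -1 2 -1
}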
2020 Nov 06
4
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
On 11/6/20 8:49 AM, Roger Ferrer Ibáñez wrote:
Hi Sjoerd,
Trying to remember how everything fits together here, but could get.active.lane.mask not create the %mask of the VP intrinsics? Or in other words, in the vectoriser, who's producing the %mask and %evl that are consumed by the VP intrinsics?
I'm not sure what the best way would be here. I'm thinking of the Loop Vectorizer. I imagine
2013 May 16
1
[LLVMdev] Combining physical registers
On May 16, 2013, at 8:13 AM, Krzysztof Parzyszek <kparzysz at codeaurora.org> wrote:
> The function TII::canCombineSubRegIndices has been gone for a while now, and I was wondering if there is a target-independent way of determining if a certain set of physical registers "adds up" to a larger register. For example, on X86, AL and AH together form AX. On Hexagon, R0 and R1 are
2013 Aug 01
32
[Bug 67628] New: [BISECTED] Monitor on Display port shows distortions
https://bugs.freedesktop.org/show_bug.cgi?id=67628
Priority: medium
Bug ID: 67628
Assignee: nouveau at lists.freedesktop.org
Summary: [BISECTED] Monitor on Display port shows distortions
QA Contact: xorg-team at lists.x.org
Severity: major
Classification: Unclassified
OS: Linux (All)
Reporter:
2020 Nov 09
0
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
; RISC-V V & VE(*):
; %mask = get.active.lane.mask(%i, %i)
; %evl = min(256, %n - %i)
; MVE/SVE/AVX :
; %mask = get.active.lane.mask(%i, %n)
; %evl = call @llvm.vscale()
For VE, we want to do as much predication as possible through %evl and as little as possible with %mask. This has performance implications on VE and RISC-V - VE does not generate a mask from %evl but %evl is
2016 Oct 03
5
Is this undefined behavior optimization legal?
Hi,
I've found a test case where SelectionDAG is doing an undefined behavior
optimization, and I need help determining whether or not this is legal.
Here is the example IR:
define void @test(<4 x i8> addrspace(1)* %out, float %a) {
%uint8 = fptoui float %a to i8
%vec = insertelement <4 x i8> <i8 0, i8 0, i8 0, i8 0>, i8 %uint8, i32 0
store <4 x i8> %vec, <4
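The scalar analogue of the questionable part of this IR is an out-of-range float-to-integer conversion: it is undefined in C++ just as fptoui yields an undefined (poison) result in LLVM when the value does not fit the destination type. A small sketch of the in-range versus out-of-range cases:

#include <cstdio>

int main() {
    float in_range = 200.0f;
    unsigned char ok = static_cast<unsigned char>(in_range); // well defined: 200
    std::printf("%u\n", static_cast<unsigned>(ok));

    // float out_of_range = 1000.0f;
    // static_cast<unsigned char>(out_of_range); // UB in C++; poison for fptoui
}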
2013 May 16
2
[LLVMdev] Combining physical registers
The function TII::canCombineSubRegIndices has been gone for a while now,
and I was wondering if there is a target-independent way of determining
if a certain set of physical registers "adds up" to a larger register.
For example, on X86, AL and AH together form AX. On Hexagon, R0 and R1
are D0.
The context here is an attempt to coalesce multiple loads/stores into
fewer loads/stores
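To make the question concrete, here is a deliberately non-LLVM sketch of the query (the register table is hand-written and hypothetical; a real implementation would consult the target's sub-register information instead): given a set of physical registers, find a wider register made up of exactly those parts.

#include <cstdio>
#include <map>
#include <set>
#include <string>

// super-register -> the sub-registers that cover it (hypothetical table)
static const std::map<std::string, std::set<std::string>> kSuperRegs = {
    {"AX", {"AL", "AH"}},   // X86
    {"D0", {"R0", "R1"}},   // Hexagon
};

static std::string combineToSuperReg(const std::set<std::string> &regs) {
    for (const auto &[super, parts] : kSuperRegs)
        if (parts == regs)
            return super;
    return "(none)"; // no single register adds up to exactly this set
}

int main() {
    std::printf("{AL,AH} -> %s\n", combineToSuperReg({"AL", "AH"}).c_str());
    std::printf("{R0,R2} -> %s\n", combineToSuperReg({"R0", "R2"}).c_str());
}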
2017 Jun 12
4
Implementing cross-thread reduction in the AMDGPU backend
Hi all,
I've been looking into how to implement the more advanced Shader Model
6 reduction operations in radv (and obviously most of the work would
be useful for radeonsi too). They're explained in the spec for
GL_AMD_shader_ballot at
https://www.khronos.org/registry/OpenGL/extensions/AMD/AMD_shader_ballot.txt,
but I'll summarize them here. There are two types of operations:
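As background for the kind of operation being discussed, here is a host-side sketch of a cross-lane add reduction (illustrative assumptions only: a 64-lane wavefront modelled as an array, with array indexing standing in for the GPU's lane-permute instructions).

#include <array>
#include <cstdio>

constexpr int kWaveSize = 64; // assumed wavefront width

static float waveReduceAdd(std::array<float, kWaveSize> v) {
    // Each round combines a lane with the lane `offset` away, halving the
    // active width until lane 0 holds the full sum.
    for (int offset = kWaveSize / 2; offset > 0; offset /= 2)
        for (int lane = 0; lane < offset; ++lane)
            v[lane] += v[lane + offset];   // models a shuffle-down + add
    return v[0];                           // would be broadcast to all lanes
}

int main() {
    std::array<float, kWaveSize> v{};
    for (int i = 0; i < kWaveSize; ++i) v[i] = 1.0f;
    std::printf("sum = %.1f\n", waveReduceAdd(v)); // 64.0
}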
2020 Nov 06
0
Loop-vectorizer prototype for the EPI Project based on the RISC-V Vector Extension (Scalable vectors)
Hello Simon,
Thanks for your replies, very useful. And yes, thanks for the example and making the target differences clear:
; Some examples:
; RISC-V V & VE(*):
; %mask = (splat i1 1)
; %evl = min(256, %n - %i)
; MVE/SVE :
; %mask = get.active.lane.mask(%i, %n)
; %evl = call @llvm.vscale()
; AVX:
; %mask = icmp (%i + (seq <8 x i32> 0,1,2,...)), %n,
; %evl
2018 Jan 28
1
semPLS package will not load seems to be failing on loading package lattice
Hi R Help Team
I recently updated my R installation to R 3.4.3 and to a later version of RStudio, and I found that the package semPLS will not load even though it is installed; it seems to be failing on loading the lattice package
Getting the following error message:
library(semPLS)
Loading required package: lattice
Error: package or namespace load failed for 'lattice':
.onLoad failed in
2009 Oct 08
1
xyplot#strips like ggplot?
Dear all,
I want to split the strips in xyplot and push them into the margins ...
Tried to find this in common documentation (such as Deepayan's book) on
lattice ... but so far without success ...
Here is the situation:
xyplot(Speed~Count|Lane*Day,...)
where Speed and Count are numeric, Lane and Day are factors.
By default, this makes a double strip on top of each graph. I can change
2017 Jun 12
2
Implementing cross-thread reduction in the AMDGPU backend
On 06/12/2017 07:15 PM, Tom Stellard via llvm-dev wrote:
> cc some people who have worked on this.
>
> On 06/12/2017 05:58 PM, Connor Abbott via llvm-dev wrote:
>> Hi all,
>>
>> I've been looking into how to implement the more advanced Shader Model
>> 6 reduction operations in radv (and obviously most of the work would
>> be useful for radeonsi too).
2016 Sep 18
4
Addressing TableGen's error "Ran out of lanemask bits" in order to use more than 32 subregisters per register
Hello.
I've managed to patch the various files in the back end related to
lane masks - now I have a 1024-bit lane mask.
But now I get the following error when running make llc:
<<error:unhandled vector type width in intrinsic!>>
This error comes from this file
https://github.com/llvm-mirror/llvm/blob/master/utils/TableGen/IntrinsicEmitter.cpp,
comes from the