search for: vperm

Displaying 20 results from an estimated 20 matches for "vperm".

Did you mean: eperm
2004 Sep 10
1
altivec lpc_restore_signal
...b v18,-1 vsro v18,v18,v0 ; v18: mask vector li r31,0x8 lvsl v0,0,r31 vsldoi v0,v0,v0,12 li r31,0xc lvsl v1,0,r31 vspltisb v2,0 vspltisb v3,-1 vmrglw v2,v2,v3 vsel v0,v1,v0,v2 ; v0: reversal permutation vector add r10,r5,r6 lvsl v17,0,r5 ; v17: coefficient alignment permutation vector vperm v17,v17,v17,v0 ; v17: reversal coefficient alignment permutation vector mr r11,r8 lvsl v16,0,r11 ; v16: history alignment permutation vector lvx v0,0,r5 addi r5,r5,16 lvx v1,0,r5 vperm v0,v0,v1,v17 lvx v8,0,r11 addi r11,r11,-16 lvx v9,0,r11 vperm v8,v9,v8,v16 cmplw cr0,r5,r10 bc 12,0,...
2004 Oct 06
3
flac-1.1.1 completely broken on linux/ppc and on macosx if built with the standard toolchain (not xcode)
Sadly the latest optimization broke completely everything. The asm code isn't gas compliant. the libFLAC linker script has a typo, disabling the asm optimization and/or altivec won't let a correct build anyway. Instant fixes for the asm stuff: sed -i -e"s:;:\#:" on the lpc_asm.s to load address instead of addis+ori you could use lis and la and PLEASE use the @l(register)
2020 Feb 07
2
[RFC] Extending shufflevector for vscale vectors (SVE etc.)
> -----Original Message----- > From: Chris Lattner <clattner at nondot.org> > Sent: Wednesday, February 5, 2020 4:02 PM > To: Eli Friedman <efriedma at quicinc.com> > Cc: llvm-dev <llvm-dev at lists.llvm.org> > Subject: [EXT] Re: [llvm-dev] [RFC] Extending shufflevector for vscale vectors > (SVE etc.) > > On Jan 29, 2020, at 4:48 PM, Eli Friedman via
2008 Jul 10
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...D::FCTIDZ"; - case PPCISD::FCTIWZ: return "PPCISD::FCTIWZ"; - case PPCISD::STFIWX: return "PPCISD::STFIWX"; - case PPCISD::VMADDFP: return "PPCISD::VMADDFP"; - case PPCISD::VNMSUBFP: return "PPCISD::VNMSUBFP"; - case PPCISD::VPERM: return "PPCISD::VPERM"; - case PPCISD::Hi: return "PPCISD::Hi"; - case PPCISD::Lo: return "PPCISD::Lo"; - case PPCISD::DYNALLOC: return "PPCISD::DYNALLOC"; - case PPCISD::GlobalBaseReg: return "PPCISD::GlobalBaseRe...
2008 Jul 08
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
PPCTargetLowering::EmitInstrWithCustomInserter has a reference to the current MachineFunction for other purposes. Can you use MachineFunction::getRegInfo instead? Dan On Jul 8, 2008, at 1:56 PM, Gary Benson wrote: > Would it be acceptable to change MachineInstr::getRegInfo from private > to public so I can use it from > PPCTargetLowering::EmitInstrWithCustomInserter? > >
2008 Jul 11
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...D::FCTIDZ"; - case PPCISD::FCTIWZ: return "PPCISD::FCTIWZ"; - case PPCISD::STFIWX: return "PPCISD::STFIWX"; - case PPCISD::VMADDFP: return "PPCISD::VMADDFP"; - case PPCISD::VNMSUBFP: return "PPCISD::VNMSUBFP"; - case PPCISD::VPERM: return "PPCISD::VPERM"; - case PPCISD::Hi: return "PPCISD::Hi"; - case PPCISD::Lo: return "PPCISD::Lo"; - case PPCISD::DYNALLOC: return "PPCISD::DYNALLOC"; - case PPCISD::GlobalBaseReg: return "PPCISD::GlobalBaseRe...
2008 Jul 11
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Hi Gary, This does not patch cleanly for me (PPCISelLowering.cpp). Can you prepare a updated patch? Thanks, Evan On Jul 10, 2008, at 11:45 AM, Gary Benson wrote: > Cool, that worked. New patch attached... > > Cheers, > Gary > > Evan Cheng wrote: >> Just cast both values to const TargetRegisterClass*. >> >> Evan >> >> On Jul 10, 2008, at 7:36
2008 Jul 10
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Just cast both values to const TargetRegisterClass*. Evan On Jul 10, 2008, at 7:36 AM, Gary Benson wrote: > Evan Cheng wrote: >> How about? >> >> const TargetRegisterClass *RC = is64Bit ? &PPC:GPRCRegClass : >> &PPC:G8RCRegClass; >> unsigned TmpReg = RegInfo.createVirtualRegister(RC); > > I tried something like that yesterday: > > const
2008 Jul 10
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Evan Cheng wrote: > How about? > > const TargetRegisterClass *RC = is64Bit ? &PPC:GPRCRegClass : > &PPC:G8RCRegClass; > unsigned TmpReg = RegInfo.createVirtualRegister(RC); I tried something like that yesterday: const TargetRegisterClass *RC = is64bit ? &PPC::GPRCRegClass : &PPC::G8RCRegClass; but I kept getting this error no matter how I arranged it:
2008 Jun 30
0
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
You need to insert new basic blocks and update CFG to accomplish this. There is a hackish way to do this right now. Add a pseudo instruction to represent this operation and mark it usesCustomDAGSchedInserter. This means the intrinsic is mapped to a single (pseudo) node. But it is then expanded into instructions that can span multiple basic blocks. See
2008 Jul 09
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...D::FCTIDZ"; - case PPCISD::FCTIWZ: return "PPCISD::FCTIWZ"; - case PPCISD::STFIWX: return "PPCISD::STFIWX"; - case PPCISD::VMADDFP: return "PPCISD::VMADDFP"; - case PPCISD::VNMSUBFP: return "PPCISD::VNMSUBFP"; - case PPCISD::VPERM: return "PPCISD::VPERM"; - case PPCISD::Hi: return "PPCISD::Hi"; - case PPCISD::Lo: return "PPCISD::Lo"; - case PPCISD::DYNALLOC: return "PPCISD::DYNALLOC"; - case PPCISD::GlobalBaseReg: return "PPCISD::GlobalBaseRe...
2008 Jul 08
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Would it be acceptable to change MachineInstr::getRegInfo from private to public so I can use it from PPCTargetLowering::EmitInstrWithCustomInserter? Cheers, Gary Evan Cheng wrote: > Look for createVirtualRegister. These are examples in > PPCISelLowering.cpp. > > Evan > On Jul 8, 2008, at 8:24 AM, Gary Benson wrote: > > > Hi Evan, > > > > Evan Cheng wrote:
2008 Jun 30
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
Chris Lattner wrote: > On Jun 27, 2008, at 8:27 AM, Gary Benson wrote: > > def CMP_UNRESw : Pseudo<(outs), (ins GPRC:$rA, GPRC:$rB, i32imm: > > $label), > > "cmpw $rA, $rB\n\tbne- La${label}_exit", > > [(PPCcmp_unres GPRC:$rA, GPRC:$rB, imm: > > $label)]>; > > } > > > > ...and
2008 Jul 02
2
[LLVMdev] Implementing llvm.atomic.cmp.swap.i32 on PowerPC
...D::FCTIDZ"; - case PPCISD::FCTIWZ: return "PPCISD::FCTIWZ"; - case PPCISD::STFIWX: return "PPCISD::STFIWX"; - case PPCISD::VMADDFP: return "PPCISD::VMADDFP"; - case PPCISD::VNMSUBFP: return "PPCISD::VNMSUBFP"; - case PPCISD::VPERM: return "PPCISD::VPERM"; - case PPCISD::Hi: return "PPCISD::Hi"; - case PPCISD::Lo: return "PPCISD::Lo"; - case PPCISD::DYNALLOC: return "PPCISD::DYNALLOC"; - case PPCISD::GlobalBaseReg: return "PPCISD::GlobalBaseRe...
2015 Jul 30
2
[LLVMdev] Question on BlendSplat Code - LLVM Commit 72753f87f2b80d66cfd7ca7c7b6c0db6737d4b24
An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150730/121669c8/attachment.html> -------------- next part -------------- A non-text attachment was scrubbed... Name: blend-splat-test.tar.gz Type: application/octet-stream Size: 11716 bytes Desc: not available URL:
2020 Feb 08
2
[RFC] Extending shufflevector for vscale vectors (SVE etc.)
...on of scalable > shufflevectors later, we'll be able to autoupgrade the existing ones. > > It isn’t obvious to me that these need to be unified: we don’t have to a have > a single operation named “shuffle vector” that does all of the possible > element permutations. For example, vperm on PPC Altivec supports data > dependent shuffle masks, and merging that into shuffle vector seems like a > bad idea. > > An alternate design could look like three things (again, ignoring intrinsic vs > instruction): > > “Data dependent shuffle” would allow runtime shuffle mask...
2018 Mar 06
0
Heap Exhaustion during 'DAGCombiner::Run'
Martin: It sounds like you are doing is more akin to shuffle selection than fusion and therefore it's a better fit for instruction selection than DAGCombining. Try movign it to <Target>ISelDAGToDAG's Select (or potentially PreprocessISelDAG). Th -Nirav On Tue, Mar 6, 2018 at 4:05 PM Martin J. O'Riordan <MartinO at theheart.ie> wrote: > We discovered what is
2018 Mar 06
2
Heap Exhaustion during 'DAGCombiner::Run'
We discovered what is happening. SDAGCombiner essentially looks at various combinations of nodes to do with vectors, and when it can, it creates a vector shuffle. The problem is, that our vector shuffle lowering builds new trees with vector element, or vector sub-vector insert sequences. The generic DAGCombiner, reconstructs these into a new shuffle, and so the loop continues - we reduce it,
2000 Nov 15
8
Optimisations
Looking through the archives I have seen talk of making CPU specific optimisations for Vorbis, a la MMX/3DNow!/SSE. The feeling I gather is to wait until something is working well in C before committing to any kind of specific optimisation. What if oft used and needed DSP functions were identified and standardised DSP functionality be written for Vorbis? This would seperate the basically
2014 Sep 10
13
[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!
On Tue, Sep 9, 2014 at 11:39 PM, Chandler Carruth <chandlerc at google.com> wrote: > Awesome, thanks for all the information! > > See below: > > On Tue, Sep 9, 2014 at 6:13 AM, Andrea Di Biagio <andrea.dibiagio at gmail.com> > wrote: >> >> You have already mentioned how the new shuffle lowering is missing >> some features; for example, you explicitly