Displaying 20 results from an estimated 10000 matches similar to: "[LLVMdev] How to stick two instructions together?"
2011 Jul 03
0
[LLVMdev] DLX backend
So I thought I'd try to use the documentation on llvm backends to try to create a DLX backend. I think I've got most of the stuff for the .td files done but I've got some problems.
* Do I need to represent the PC in my XXXRegisterInfo.td file; the branch instruction affects it but you can directly access it ... I'm thinking not.
* In my Instruction subclasses (in
2013 Oct 03
1
[LLVMdev] Help with a Microblaze code generation problem.
Sorry if this is a duplicate: I tried to send it last night and it
didn't go through. I'm trimming some text to see if it helps.
I have a simple program that fails on the Microblaze:
int main()
{
unsigned long long x, y;
x = 100;
y = 0x8000000000000000ULL;
return !(x > y);
}
As you can see, the test case compares two unsigned long long values. To
try to track
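
For reference, on a 32-bit target a 64-bit unsigned comparison like the one in this test case has to be built from 32-bit operations. The following is only a minimal sketch of that decomposition in C, not a description of what the Microblaze backend actually emits; the helper names ugt64_by_halves and test_case_result are made up for illustration. With a correct lowering, the test program above should return 1, since 100 is not greater than 0x8000000000000000 when both are treated as unsigned.

```c
#include <stdint.h>

/* Unsigned 64-bit "greater than" built only from 32-bit pieces.
   Hypothetical helper, for illustration only. */
static int ugt64_by_halves(uint64_t x, uint64_t y)
{
    uint32_t x_hi = (uint32_t)(x >> 32), x_lo = (uint32_t)x;
    uint32_t y_hi = (uint32_t)(y >> 32), y_lo = (uint32_t)y;

    if (x_hi != y_hi)
        return x_hi > y_hi;  /* high words differ: they decide (unsigned compare) */
    return x_lo > y_lo;      /* high words equal: low words decide */
}

/* Computes the value the test case's main() returns: !(x > y). */
static int test_case_result(void)
{
    uint64_t x = 100;
    uint64_t y = 0x8000000000000000ULL;
    return !ugt64_by_halves(x, y);
}
```

If the high-word comparison were done as a signed compare, 0x80000000 would look negative and the result would flip, which is one plausible way for a test like this to fail.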
2018 Jan 23
2
Problems getting nouveau to work with either Geforce GT710 or Geforce 9800GT on ARM Cortex-A9
Hi Arnd,
Sorry for sending this email directly to you, but maybe you can help
me, or point me to where I should look.
I am having big trouble getting the nouveau driver to work with either
Geforce GT 710 or Geforce 9800 GT on an ARM armv7l Cortex-A9 with PCIe
x1/x4 with linux kernel 4.15-rc.
The Geforce GT 710 hangs the kernel boot process for a while and then
a kernel oops occurs, due to nvidiafb,
2008 Aug 08
0
[LLVMdev] Ideas for representing vector gather/scatter and masks in LLVM IR
On Aug 7, 2008, at 12:13 PM, David Greene wrote:
> On Tuesday 05 August 2008 13:27, David Greene wrote:
>
>> Neither solution eliminates the need for instcombine to be careful
>> and
>> consult masks from time to time.
>>
>> Perhaps I'm totally missing something. Concrete examples would be
>> helpful.
>
> Ok, so I took my own advice and
2019 Dec 10
3
Glue two instructions together
Hi,
for DAG-to-DAG instruction selection I’ve implemented a pattern, which
creates from one SDNode two instructions, something like:
def: Pat<(NEW_SDNODE REG:$r1),
(INST_OUT (INST_IN), REG:$r1)>;
where INST_IN doesn't accept any inputs and INST_OUT accepts two inputs -
one returned by INST_IN and REG:$r1.
Is there any possibility to ‘Glue’ two instruction created
2005 Jul 20
1
MMX IDCT for theora-exp
Hello,
I'm attaching IDCT MMX patch. I reused IDCT from theora-a3-MMXd.zip.
It should work on 64bit X86 platform too.
Here are the most used functions when playing a video of jet aircraft (gripen)
Ogg logical stream 310b2968 is Theora 720x480 29.97 fps video
Encoded frame content is 720x480 with 0x0 offset
I can play this video with about 200-300 frame drops on an Athlon XP 1700+
CPU load (with
2019 Dec 11
2
Glue two instructions together
You could hardcode a register for the pseudo instruction to use in the td file.
The register allocator will make sure not to clobber it.
let Uses = [R1], Defs = [R1] in {
  def MYINST : Pseudo<>;
}
On Wed, Dec 11, 2019 at 10:25 AM Przemyslaw Ossowski via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>
> I have one more question regarding expanding pseudo instruction.
>
>
2008 Aug 07
6
[LLVMdev] Ideas for representing vector gather/scatter and masks in LLVM IR
On Tuesday 05 August 2008 13:27, David Greene wrote:
> Neither solution eliminates the need for instcombine to be careful and
> consult masks from time to time.
>
> Perhaps I'm totally missing something. Concrete examples would be helpful.
Ok, so I took my own advice and thought about CSE and instcombine a bit.
I wrote the code by hand in a sort of pseudo-llvm language, so
2009 Feb 13
0
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
On Feb 13, 2009, at 9:47 AM, Alex wrote:
> It seems to me that LLVM's sub-register support does not fit the
> following hardware architecture.
>
> All instructions of the hardware are vector instructions. All
> registers contain
> 4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
>
> Most instructions write more than one element in this way:
>
> mul
2007 Dec 02
2
Optimised qmf_synth and iir_mem16
Hi all,
I've taken preglow's ARM versions of qmf_synth and iir_mem16 from
Rockbox's Speex codec, and tweaked them a bit further for some more
speed.
I attach them here so you can review and take on any changes you
want.
Please let me know if you have questions etc.
Thanks,
Robin
--
Robin Watts, Email: <mailto:Robin.Watts@wss.co.uk>
Warm Silence Software, WWW:
2009 Feb 13
3
[LLVMdev] Modeling GPU vector registers, again (with my implementation)
It seems to me that LLVM's sub-register support does not fit the following
hardware architecture.
All instructions of the hardware are vector instructions. All registers
contain
4 32-bit FP sub-registers. They are called r0.x, r0.y, r0.z, r0.w.
Most instructions write more than one element in this way:
mul r0.xyw, r1, r2
add r0.z, r3, r4
sub r5, r0, r1
Notice that the four elements of r0 are written
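
The write-mask behaviour in the three instructions above can be modelled in a few lines of C. This is only a sketch under assumptions that are not stated in the original message: operations are taken to be component-wise with no swizzling, and the mask selects which lanes of the destination are overwritten. The names vec4, write_masked and run_example are hypothetical.

```c
/* One vector register: c[0..3] hold the x, y, z, w sub-registers. */
typedef struct { float c[4]; } vec4;

enum { MASK_X = 1, MASK_Y = 2, MASK_Z = 4, MASK_W = 8 };

/* Overwrite only the lanes of *dst selected by mask; other lanes keep
   their old value (assumed semantics of "r0.xyw" style destinations). */
static void write_masked(vec4 *dst, vec4 src, unsigned mask)
{
    for (int i = 0; i < 4; ++i)
        if (mask & (1u << i))
            dst->c[i] = src.c[i];
}

static vec4 vmul(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < 4; ++i) r.c[i] = a.c[i] * b.c[i];
    return r;
}

static vec4 vadd(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < 4; ++i) r.c[i] = a.c[i] + b.c[i];
    return r;
}

static vec4 vsub(vec4 a, vec4 b)
{
    vec4 r;
    for (int i = 0; i < 4; ++i) r.c[i] = a.c[i] - b.c[i];
    return r;
}

/* The three-instruction sequence from the message; returns r5. */
static vec4 run_example(vec4 r0, vec4 r1, vec4 r2, vec4 r3, vec4 r4)
{
    write_masked(&r0, vmul(r1, r2), MASK_X | MASK_Y | MASK_W); /* mul r0.xyw, r1, r2 */
    write_masked(&r0, vadd(r3, r4), MASK_Z);                   /* add r0.z, r3, r4 */
    return vsub(r0, r1);                                       /* sub r5, r0, r1 */
}
```

In this model the final sub reads all four lanes of r0, three produced by the mul and one by the add, so both partial writes feed a single full-register use.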
2014 Feb 08
0
[PATCH v2] arm: Use the UAL syntax for instructions
This is required in order to build using the built-in assembler
in clang.
---
I squashed the two changes since it would break the normal gcc
build otherwise.
---
celt/arm/arm2gnu.pl | 2 ++
celt/arm/celt_pitch_xcorr_arm.s | 18 +++++++++---------
2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/celt/arm/arm2gnu.pl b/celt/arm/arm2gnu.pl
index eab42ef..5c24758 100755
---
2014 Oct 24
3
[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs
Hi,
I noticed a significant performance regression (up to 40%) on some internal
CUDA benchmarks (a reduced example presented below). The root cause of this
regression seems to be that IndVarSimplify widens induction variables, assuming
arithmetic on wider integer types is as cheap as on narrower ones.
However, this assumption is wrong at least for the NVPTX64 target.
Although the NVPTX64 target
2014 Feb 07
3
[PATCH 1/2] arm: Use the UAL syntax for ldr<cc>h instructions
This is required in order to build using the built-in assembler
in clang.
---
celt/arm/celt_pitch_xcorr_arm.s | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/celt/arm/celt_pitch_xcorr_arm.s b/celt/arm/celt_pitch_xcorr_arm.s
index 09917b1..3c4b950 100644
--- a/celt/arm/celt_pitch_xcorr_arm.s
+++ b/celt/arm/celt_pitch_xcorr_arm.s
@@ -309,7 +309,7 @@
2019 Jan 18
0
[klibc:master] arch: Remove m32r port
Commit-ID: 1e6e96615227de6ca88d096fb0ebe45bf25981c2
Gitweb: http://git.kernel.org/?p=libs/klibc/klibc.git;a=commit;h=1e6e96615227de6ca88d096fb0ebe45bf25981c2
Author: Ben Hutchings <ben at decadent.org.uk>
AuthorDate: Fri, 18 Jan 2019 00:33:51 +0000
Committer: Ben Hutchings <ben at decadent.org.uk>
CommitDate: Fri, 18 Jan 2019 03:10:14 +0000
[klibc] arch: Remove m32r port
2016 Jan 26
2
[v3,11/41] mips: reuse asm-generic/barrier.h
On Tue, Jan 26, 2016 at 12:16:09PM +0000, Will Deacon wrote:
> On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote:
> > On Mon, Jan 25, 2016 at 04:42:43PM +0000, Will Deacon wrote:
> > > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote:
> > > > PPC Overlapping Group-B sets version 4
> > > > ""
> > > > (*
2006 Jun 26
0
[klibc 26/43] m32r support for klibc
The parts of klibc specific to the m32r architecture.
Signed-off-by: H. Peter Anvin <hpa at zytor.com>
---
commit 7ba219f9bcddda38ddc9010b54fd10431292f744
tree 1cf287dfd321d6b980789f49bb0750e8a4217c22
parent 951dc85bd690c6cc5a971815665da947160cbe51
author H. Peter Anvin <hpa at zytor.com> Sun, 25 Jun 2006 16:58:27 -0700
committer H. Peter Anvin <hpa at zytor.com> Sun, 25 Jun
2016 Jan 26
0
[v3,11/41] mips: reuse asm-generic/barrier.h
On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote:
> On Mon, Jan 25, 2016 at 04:42:43PM +0000, Will Deacon wrote:
> > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote:
> > > PPC Overlapping Group-B sets version 4
> > > ""
> > > (* When the Group-B sets from two different barriers involve instructions in
> > >
2016 Jan 27
0
[v3,11/41] mips: reuse asm-generic/barrier.h
On Tue, Jan 26, 2016 at 11:58:20AM -0800, Paul E. McKenney wrote:
> On Tue, Jan 26, 2016 at 12:16:09PM +0000, Will Deacon wrote:
> > On Mon, Jan 25, 2016 at 10:03:22PM -0800, Paul E. McKenney wrote:
> > > On Mon, Jan 25, 2016 at 04:42:43PM +0000, Will Deacon wrote:
> > > > On Fri, Jan 15, 2016 at 01:58:53PM -0800, Paul E. McKenney wrote:
> > > > > PPC