thr3ads.net - similar to: "Help making 'narrow instruct microcode' Backend"

Displaying 20 results from an estimated 900 matches similar to: "Help making 'narrow instruct microcode' Backend"

Search list of elements for a specific pattern

2012 Jun 22

Hi, I have a list of mutations, called "mutList", of the form: > head(mutList) Alu 1 AluJ 2 AluJ/F(R)AM 3 AluJ/FLAM 4 AluJ/FRAM 5 AluJ/monomer 6 AluJb It contains about 500 elements and not all of them contain the sequence "Alu". I tried using this code: Alu<-mutList[which(grep("Alu",mutList)==1)] But that simply returned
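A note on why the quoted expression misbehaves: R's `grep("Alu", mutList)` returns the positions of the matching elements, so comparing that result to 1 only asks whether position 1 matched. A minimal Python sketch of the same logic follows. The sample values `L1PA2` and `MIR` and the helper names are hypothetical and used only for illustration; plain substring matching stands in for the regular-expression match, which is equivalent here because "Alu" contains no metacharacters.

```python
# Sample values modelled on the head(mutList) output quoted above;
# "L1PA2" and "MIR" are hypothetical non-matching entries.
mut_list = ["AluJ", "AluJ/F(R)AM", "AluJ/FLAM", "L1PA2", "AluJb", "MIR"]


def match_indices(pattern, items):
    """Return 1-based positions of items containing pattern (as R's grep does)."""
    return [i for i, item in enumerate(items, start=1) if pattern in item]


def select_matching(pattern, items):
    """Return the items themselves that contain pattern."""
    return [item for item in items if pattern in item]


indices = match_indices("Alu", mut_list)

# Analogue of which(grep("Alu", mutList) == 1): keep the positions within
# `indices` whose value equals 1, then index the list with those positions.
positions = [pos for pos, idx in enumerate(indices, start=1) if idx == 1]
buggy = [mut_list[pos - 1] for pos in positions]

# Intended result: every element that contains the pattern.
fixed = select_matching("Alu", mut_list)

print(buggy)
print(fixed)
```

In R itself, indexing directly with the positions, `mutList[grep("Alu", mutList)]`, or with the logical vector from `grepl("Alu", mutList)`, selects all matching elements.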

BPF tablegen+codegen question

2020 May 12

In BPF, an ADD instruction is defined as a 2 register instruction: 0x0f. add dst, src. dst += src In BPFInstrInfo.td this kind of ALU instruction is defined with: def _rr : ALU_RR<BPF_ALU64, Opc, (outs GPR:$dst), (ins GPR:$src2, GPR:$src), "$dst "#OpcodeStr#" $src", [(set

[LLVMdev] TableGen: Avoid/Ignore the "no immediates on RHS of commutative node" constraint.

2012 Jan 14

Ivan, Sorry, no, I wasn't clear enough. Both "op dst_reg,immediate,src_reg" and "op dst_reg,src_reg,immediate" are allowed in the ALU ops. For most instructions these are two different things - e.g. sub a,5,b is obviously different from sub a,b,5 - but for things like add they just define the same thing. My problem is that LLVM won't allow immediates on the LHS of

NEON FP flags

2016 Mar 25

On 25 March 2016 at 04:11, Hal Finkel <hfinkel at anl.gov> wrote: > As I understand it, the fundamental property being addressed here is: Are the semantics of scalar FP math the same as vector FP math? TTI seems like a good place to expose that information. If the semantics are indeed different, then the vectorizer would require fast-math flags in order to vectorize FP operations

Is that iozone result normal?

2008 Dec 14

A 5-node server and a 1-node client are connected by gigabit Ethernet. #] iozone -r 32k -r 512k -s 8G KB reclen write rewrite read reread read write read rewrite read fwrite frewrite fread freread 8388608 32 10559 9792 62435 62260 8388608 512 63012 63409 63409 63138 It seems 32k write/rewrite performance is very

Issue with files on glusterfs becoming unreadable.

2009 Jun 11

elbert at host1:~$ dpkg -l|grep glusterfs ii glusterfs-client 1.3.8-0pre2 GlusterFS fuse client ii glusterfs-server 1.3.8-0pre2 GlusterFS fuse server ii libglusterfs0 1.3.8-0pre2 GlusterFS libraries and translator modules I have 2 hosts set up to use AFR with

[llvm-mca] What's the difference between Rthroughput and "total cycles" in llvm-mca

2019 Jun 07

Hi Andrea, So does this definition make sense for basic blocks with more than one instruction? E.g. how should one interpret a basic block with RThroughput of 2.3? On Fri, Jun 7, 2019 at 7:39 AM Andrea Di Biagio <andrea.dibiagio at gmail.com> wrote: > Hi Tom, > > Field 'Total Cycles' from the summary view simply reports the elapsed > number of cycles for the entire

[LLVMdev] Question about per-operand machine model

2014 Feb 18

Hi Andy and all, I have a question about per-operand machine model. I am finding some relations between 'MCWriteLatencyEntry' and 'MCWriteProcResEntry'. For example, class InstTEST<..., InstrItinClass itin> : Instruction { let Itinerary = Itin; } // I assume this MI writes 2 registers. def TESTINST : InstTEST<..., II_TEST> // schedule info II_TEST:

Re: Hot swap CPU -- "build" is not a good CPU benchmark

2005 Jun 30

From: Peter Arremann <loony at loonybin.org> > Compiles aren't a great benchmark for a box since its 100% cpu and > neglects memory or disk performance but I had the numbers handy > for that :-) BTW, it is 100% ALU and a major strain on the ALU LOAD. In other words, it's not a good benchmark for even CPU. That's why the 3-issue ALU in the Nx586 on-ward blows the

[LLVMdev] TableGen: Avoid/Ignore the "no immediates on RHS of commutative node" constraint.

2012 Jan 14

Dear all, I was wondering if it is possible in TableGen to either: 1. Selectively define an instruction depending on an SDNode's properties, e.g. if the SDNode is not commutative. 2. Override/ignore the TableGen error given when a commutative node has an immediate on the LHS. My case comes from trying to define a generic ALU operation multiclass for my target, which includes a

[LLVMdev] Macro-op fusion experiment

2011 Apr 08

On Apr 8, 2011, at 9:56 AM, NAKAMURA Takumi wrote: >>> 8B C3 mov eax, ebx >>> 03 C1 add eax, ecx >>> becomes >>> 8B C3 03 C1 add eax, ebx, ecx > > In my understanding, twoaddr pass tends to emit such a sequence. Yes, it always does, and the coalescer tries very hard to eliminate the copy. > Though I

data frame manipulation

2010 Apr 16

Dear group, Here is my data.frame : df <- structure(list(DESCRIPTION = c("PRM HGH GD ALU", "PRM HGH GD ALU", "PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL", "PRIMARY NICKEL", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ", "STANDARD LEAD ",

[Bug 94637] New: system crash, no messages, GT215, ubuntu 16.04 when running glmark2 for a few minutes.

2016 Mar 20

https://bugs.freedesktop.org/show_bug.cgi?id=94637 Bug ID: 94637 Summary: system crash, no messages, GT215, ubuntu 16.04 when running glmark2 for a few minutes. Product: Mesa Version: unspecified Hardware: x86-64 (AMD64) OS: Linux (All) Status: NEW Severity: critical

[PATCH 1/2] Check before trying a solid fill

2015 May 19

Pre-nv50 has all sorts of funny requirements for non-copy alu operations, and will bail out of solid fills left and right. Account for that case and fall back to the memset. Reported-by: Andrew Randrianasulu <randrianasulu at gmail.com> Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- src/drmmode_display.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)

CentOS 4.0 -> 4.1 update failing

2005 Jun 20

I've updated CentOS 4.0 to 4.1 on several machines (some desktops, some servers). However on my laptop, update is failing with following error just after headers are downloaded: --> Running transaction check --> Processing Dependency: glibc-common = 2.3.4-2 for package: glibc --> Finished Dependency Resolution Error: Missing Dependency: glibc-common = 2.3.4-2 is needed by package

[PATCH 1/4] exa/nv10: use same clip settings as mesa driver

2014 Aug 10

The higher 0x800 was getting overwritten by the 0x7ff anyways, so it wasn't doing any good. The mesa driver just uses 0x800 for the low portion and doesn't set the 8 bit in the higher portion, so do the same thing here. Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu> --- src/nv10_exa.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/nv10_exa.c

CompRect modification

2009 Mar 01

Dennis Kasprzyk and I changed CompRect to be more intuitive and to replace XRectangle use easily. This patch changes CompRect and the whole Core. Thanks for your attention. -- Eduardo Gurgel Pinho (GELSoL-UFC) (Gentoo) Linux User #415930 http://edgurgel.wordpress.com http://alu.dc.ufc.br/~eduardo

[PATCH] Make SSE Run Time option.

2004 Aug 06

On Thu, 15 Jan 2004, Ian Ollmann wrote: > On Thu, 15 Jan 2004, Jean-Marc Valin wrote: > > > > Personally, I don't think much of PNI. The complex arithmetic stuff they > > > added sets you up for a lot of permute overhead that is inefficient -- > > > especially on a processor that is already weak on permute. In my opinion, > > > > Actually, the new

[LLVMdev] Does Mips resolve hazard in pre-ra-sched or post-ra-sched?

2013 Sep 20

Akira, Thank you for the response. I understand that the post-RA scheduler makes use of scoreboardHazardRecognizer, but I found that the Mips code is good enough by default; basically, I cannot easily eyeball any bubbles. I don't understand how that is possible without post-RA-sched. The pre-RA scheduler (e.g. SelectionDAG/ScheduleDAGRRList.cpp) has little information and can only schedule nodes in topology

[LLVMdev] Does Mips resolve hazard in pre-ra-sched or post-ra-sched?

2013 Sep 20

Hi Akira, I found that you maintain the Mips MipsSchedule.td; is that correct? In MipsSchedule.td, every InstrItinData uses only one InstrStage, and there is no bypass info. Are you sure this reflects the real R4xxx/R5xxx processors? Why does IILoad use the functional unit ALU? InstrItinData<IILoad , [InstrStage<3, [ALU]>]> For my previous question, I have new input after
