Displaying 17 results from an estimated 17 matches for "xzr".

2016 Nov 09
10
Is the correct behavior of getelementptr i192* for opt + llc -march=aarch64?
...serted or I write a bad code? % cat a.ll define void @store0_to_p4(i192* %p) { %p1 = bitcast i192* %p to i64* %p2 = getelementptr i64, i64* %p1, i64 3 %p3 = getelementptr i64, i64* %p2, i64 1 store i64 0, i64* %p3 ret void } % llc-3.8 a.ll -O3 -o - -march=aarch64 store0_to_p4: str xzr, [x0, #32] ; (A) ret % opt-3.8 -O3 a.ll -o - | llc-3.8 -O3 -o - -march=aarch64 store0_to_p4: str xzr, [x0, #40] ; (B) ret Yours, Shigeo
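A quick arithmetic check on the two offsets quoted in this snippet, as a minimal sketch: it assumes an `i64` occupies 8 bytes and that the two chained `getelementptr` steps simply add their element indices. The helper name `gep_byte_offset` is hypothetical and models only the arithmetic, not LLVM itself. Under that reading the store lands at byte 32, which agrees with output (A) rather than (B).

```python
# Byte offset reached by chained getelementptr steps over i64 elements.
# Assumption: each i64 element is 8 bytes wide.
I64_SIZE_BYTES = 8


def gep_byte_offset(indices, element_size=I64_SIZE_BYTES):
    """Add up the element indices of chained GEPs and scale to bytes."""
    return sum(indices) * element_size


# %p2 = getelementptr i64, i64* %p1, i64 3
# %p3 = getelementptr i64, i64* %p2, i64 1
offset = gep_byte_offset([3, 1])
print(f"str xzr, [x0, #{offset}]")
```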
2019 Nov 21
2
[ARM] Peephole optimization ( instructions tst + add )
...ticed that in some cases clang generates sequence of AND+TST instructions: For example: AND x3, x2, x1 TST x2, x1 I think these instructions should be merged to one: ANDS x3, x2, x1 ( because TST <Xn>, <Xm> is alias for ANDS XZR, <Xn>, <Xm> - https://static.docs.arm.com/ddi0596/a/DDI_0596_ARM_a64_instruction_set_architecture.pdf ) Is it missing optimization or there could be some negative effect from such merge? Best regards Pavel PS: Code sample (though it may be significantly reduced): (clang -t...
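The alias mentioned in this snippet can be illustrated with a small model, offered as a sketch rather than a statement about what clang should emit. The Python functions `and_`, `ands`, and `tst` are hypothetical stand-ins for the 64-bit A64 instructions: `ANDS` writes the result and sets N and Z from it while clearing C and V, and `TST` is the same operation with the result discarded (destination XZR). The model shows that an `AND` followed by a `TST` on the same operands yields the same register value and flags as one `ANDS`.

```python
MASK64 = (1 << 64) - 1


def and_(xn, xm):
    """AND Xd, Xn, Xm: bitwise AND, flags untouched."""
    return (xn & xm) & MASK64


def ands(xn, xm):
    """ANDS Xd, Xn, Xm: bitwise AND that also sets NZCV (C and V are cleared)."""
    result = (xn & xm) & MASK64
    flags = {"N": (result >> 63) & 1, "Z": int(result == 0), "C": 0, "V": 0}
    return result, flags


def tst(xn, xm):
    """TST Xn, Xm: ANDS with the result written to XZR, i.e. discarded."""
    _, flags = ands(xn, xm)
    return flags


# AND x3, x2, x1 followed by TST x2, x1, compared with a single ANDS x3, x2, x1
x1, x2 = 0x00FF, 0x0F0F
two_instructions = (and_(x2, x1), tst(x2, x1))
one_instruction = ands(x2, x1)
print(two_instructions == one_instruction)
```

The model says nothing about scheduling or about other instructions that read or write the flags between the two; those are the cases where such a merge would need care.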
2019 Nov 22
2
[ARM] Peephole optimization ( instructions tst + add )
...ticed that in some cases clang generates sequence of AND+TST instructions: For example: AND x3, x2, x1 TST x2, x1 I think these instructions should be merged to one: ANDS x3, x2, x1 ( because TST <Xn>, <Xm> is alias for ANDS XZR, <Xn>, <Xm> - https://static.docs.arm.com/ddi0596/a/DDI_0596_ARM_a64_instruction_set_architecture.pdf ) Is it missing optimization or there could be some negative effect from such merge? Best regards Pavel PS: Code sample (though it may be significantly reduced): (clang -t...
2018 Jan 04
0
Canonical way to handle zero registers?
...? I noticed that there actually aren't very many uses of GPR32z which seems strange, as I would expect most instructions could make use of the zero register. I haven't got around to rolling it out to GPR32 yet, we think it's safe to do that but there are a couple instructions where wzr/xzr aren't permitted. At the moment, it's on the instructions that lost the optimization when tablegen took over from the C++. > Thanks, > > -- Sean Silva > > > > On 21 Dec 2017, at 21:22, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at li...
2018 Jan 04
2
Canonical way to handle zero registers?
On Tue, Jan 2, 2018 at 8:28 AM, Daniel Sanders <daniel_l_sanders at apple.com> wrote: > Hi Sean, > > Just to give the GlobalISel perspective on this, Thanks for chiming in! > GlobalISel supports the declaration of a zero register in the register > class like so: > def GPR32z : RegisterOperand<GPR32> { > let GIZeroRegister = WZR; >
2018 May 10
2
[RFC] MC support for variant scheduling classes.
...MI.getOperand(2).getImm() == 0 && "invalid MOVZi operands"); return true; } break; case AArch64::ANDWri: // and Rd, Rzr, #imm return MI.getOperand(1).getReg() == AArch64::WZR; case AArch64::ANDXri: return MI.getOperand(1).getReg() == AArch64::XZR; case TargetOpcode::COPY: return MI.getOperand(1).getReg() == AArch64::WZR; } return false; } ``` That logic can be replaced by the following MCPredicate definitions: ``` def CheckMOVZ : CheckAllOf<[ CheckOpcode<[MOVZWi, MOVZXi]>, CheckNumOperands<3>, CheckImmOperan...
2020 Mar 12
4
Correct modelling of instructions with types smaller than the register class
Hi Quentin, thank you for the reply! I have a couple more questions that came up when I tried to implement this today. I hope you can help me out with this again! Am 09.03.20 um 23:31 schrieb Quentin Colombet: > I would expect that you could create a register class and register > bank for the special register. That way you have something to map to > when you do register bank select.
2016 Feb 27
0
Reserved/Unallocatable Registers
...they are still alive after being clobbered by a regmask. > 2) It is not legal to add or remove definitions of a reserved register; This implies that we cannot replace it with a different register or temporarily spill/reload it. Zero registers don’t follow that rule. AArch64 has a pass that sets XZR for unused results and this is fine. I.e., we need to add a note like you did for #4. > 3) Calls are considered uses of reserved registers. That means you cannot reorder a write to a reserved register over a call, even if there is no explicit use operand on the call > 4) The value of the res...
2017 May 09
4
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...rce/Applications/sqlite3/sqlite3 (71%). Same issue causes MultiSource/Applications/lua/lua (46%). * SingleSource/Benchmarks/Misc/flops-2 (75%): Poor lowering of fneg: * FastISel: ldur d0, [x29,#-16] fneg d0, d0 stur d0, [x29,#-16] * GlobalISel: ldur d0, [x29,#-64] orr x8, xzr, #0x8000000000000000 fmov d1, x8 fsub d0, d1, d0 fmov x8, d0 stur x8, [x29,#-64] * MultiSource/Benchmarks/Prolangs-C++/city/city (74%): a call to memcpy for copying 4 bytes is present with GlobalISel that isn't present with FastISel, in function vehicle::select_move(). Same issue causes...
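The GlobalISel sequence quoted in this snippet negates a double by subtracting it from a constant. The sketch below, with the hypothetical helper names `bits_to_double` and `fneg_via_fsub`, assumes IEEE-754 doubles and the default round-to-nearest mode. It shows that `0x8000000000000000` is the bit pattern of -0.0 and that `-0.0 - x` flips the sign of ordinary values and of both zeros, which is why the longer sequence computes the same result as a single `fneg` for those inputs. NaN inputs are not modelled here.

```python
import math
import struct


def bits_to_double(bits):
    """Reinterpret a 64-bit integer as an IEEE-754 double (the fmov d1, x8 step)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


# orr x8, xzr, #0x8000000000000000 materialises only the sign bit, i.e. -0.0
NEG_ZERO = bits_to_double(0x8000000000000000)


def fneg_via_fsub(x):
    """Model of fsub d0, d1, d0 with d1 holding -0.0."""
    return NEG_ZERO - x


for value in (1.5, -2.25, 0.0, -0.0):
    result = fneg_via_fsub(value)
    flipped = math.copysign(1.0, result) == -math.copysign(1.0, value)
    print(value, "->", result, "sign flipped:", flipped)
```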
2017 May 09
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...6%). > - SingleSource/Benchmarks/Misc/flops-2 (75%): Poor lowering of > fneg: > - FastISel: > ldur d0, [x29,#-16] > fneg d0, d0 > stur d0, [x29,#-16] > - GlobalISel: > ldur d0, [x29,#-64] > orr x8, xzr, #0x8000000000000000 > fmov d1, x8 > fsub d0, d1, d0 > fmov x8, d0 > stur x8, [x29,#-64] > - MultiSource/Benchmarks/Prolangs-C++/city/city (74%): a call to > memcpy for copying 4 bytes is present with GlobalISel that isn't presen...
2016 Feb 26
2
Reserved/Unallocatable Registers
Let's try this again after some longer offline discussions: = Reserved Registers = The primary use of reserved registers is to hold values required by runtime conventions. Typical examples are the stack pointer, frame pointer, maybe TLS base address, GOT address ... Zero registers and program counters are an odd special case for which we may be able to provide looser rules. == Rules == 1)
2016 Nov 04
2
[RFC] Supporting ARM's SVE in LLVM
...*shufflevector*(splat), *icmp*, *propff*, *test* sequence has been recognized and transformed into the `whilelo` instruction. \newpage ```nasm SimpleReduction: // BB#0: subs w9, w1, #1 b.lt .LBB0_4 // BB#1: add x9, x9, #1 mov x8, xzr whilelo p0.s, xzr, x9 mov z0.s, #0 .LBB0_2: ld1w {z1.s}, p0/z, [x0, x8, lsl #2] incw x8 add z0.s, p0/m, z0.s, z1.s whilelo p0.s, x8, x9 b.mi .LBB0_2 // BB#3: ptrue p0.s...
2017 May 10
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
...Same issue causes MultiSource/Applications/lua/lua (46%). >> SingleSource/Benchmarks/Misc/flops-2 (75%): Poor lowering of fneg: >> FastISel: >> ldur d0, [x29,#-16] >> fneg d0, d0 >> stur d0, [x29,#-16] >> GlobalISel: >> ldur d0, [x29,#-64] >> orr x8, xzr, #0x8000000000000000 >> fmov d1, x8 >> fsub d0, d1, d0 >> fmov x8, d0 >> stur x8, [x29,#-64] >> MultiSource/Benchmarks/Prolangs-C++/city/city (74%): a call to memcpy for copying 4 bytes is present with GlobalISel that isn't present with FastISel, in function vehicl...
2017 Apr 27
2
[GlobalISel][AArch64] Toward flipping the switch for O0: Please give it a try!
Hi Kristof, > On Apr 27, 2017, at 9:47 AM, Kristof Beyls <kristof.beyls at arm.com> wrote: > > Hi Quentin, > >> On 27 Apr 2017, at 00:48, Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>> wrote: >> >> Hi Kristof, >> >>> On Apr 6, 2017, at 6:53 AM, Kristof Beyls <kristof.beyls at arm.com
2013 Jan 23
132
[PATCH 00/45] initial arm v8 (64-bit) support
First off, apologies for the massive patch series... This series boots a 32-bit dom0 kernel to a command prompt on an ARMv8 (AArch64) model. The kernel is the same one as I am currently using with the 32 bit hypervisor. I haven't yet tried starting a guest or anything super advanced like that ;-). Also there is no real support for 64-bit domains at all, although in one or two places I
2013 Feb 22
48
[PATCH v3 00/46] initial arm v8 (64-bit) support
This round implements all of the review comments from V2 and all patches are now acked. Unless there are any objections I intend to apply later this morning. Ian.
2008 Jun 30
4
Rebuild of kernel 2.6.9-67.0.20.EL failure
Hello list. I'm trying to rebuild the 2.6.9-67.0.20.EL kernel, but it fails even without modifications. How did I try it? Created a (non-root) build environment (not a mock). Installed the kernel.src.rpm and did a rpmbuild -ba --target=`uname -m` kernel-2.6.spec 2> prep-err.log | tee prep-out.log The build failed at the end: Processing files: kernel-xenU-devel-2.6.9-67.0.20.EL Checking