thr3ads.net - search: "wzr"

2018 Jan 04

2

Canonical way to handle zero registers?

...t; wrote: > Hi Sean, > > Just to give the GlobalISel perspective on this, Thanks for chiming in! > GlobalISel supports the declaration of a zero register in the register > class like so: > def GPR32z : RegisterOperand<GPR32> { > let GIZeroRegister = WZR; > } > With that definition, the tablegen-erated ISel code > will try to replace 'G_CONSTANT s32 0' with WZR whenever the operand is > specified as GPR32z. > Is this method extensible to the case of other hardwired register values? Tracing throug...

[LLVMdev] ScheduleDAGInstrs computes deps using IR Values that may be invalid

2015 Feb 19

2

...X0 %X6 %X4 %W8 %X15 %X14 %W3 %W2 %X1 %X10 %X11 %X12 %X13 %X9 Predecessors according to CFG: BB#12 %X5<def> = ADDXrr %X16, %X13 * %W19<def> = LDRBBui %X5, 1; mem:LD1[%scevgep95](tbaa=<0x6e02518>) * %W3<def> = MADDWrrr %W2<kill>, %W3<kill>, %WZR * %W2<def> = SUBWrr %W3<kill>, %W19<kill> * STRBBui %W2<kill>, %X5<kill>, 1; mem:ST1[%scevgep95](tbaa=<0x6e02518>) Successors according to CFG: BB#16 BB#15: derived from LLVM BB %L20.1 Live Ins: %X16 %X17 %X18 %X7 %X0 %X6 %X4 %W8 %X15 %X14 %W...

[atomics][AArch64] Possible bug in cmpxchg lowering

2017 May 30

3

...the equivalent of the following on AArch64: _*ldxr w8, [x0]*_ cmp w8, w1 b.ne .LBB0_3 // BB#1: // %cmpxchg.trystore stlxr w8, w2, [x0] cbz w8, .LBB0_4 // BB#2: // %cmpxchg.failure mov w0, wzr ret .LBB0_3: // %cmpxchg.nostore clrex mov w0, wzr ret .LBB0_4: orr w0, wzr, #0x1 ret GCC instead generates a ldaxr for the initial load, which seems more correct to me since it is honoring the requested failure case acquire ord...

Canonical way to handle zero registers?

2018 Jan 04

0

...> Hi Sean, > > Just to give the GlobalISel perspective on this, > > Thanks for chiming in! > > GlobalISel supports the declaration of a zero register in the register class like so: > def GPR32z : RegisterOperand<GPR32> { > let GIZeroRegister = WZR; > } > With that definition, the tablegen-erated ISel code will try to replace 'G_CONSTANT s32 0' with WZR whenever the operand is specified as GPR32z. > > > Is this method extensible to the case of other hardwired register values? Tracing throug...

Canonical way to handle zero registers?

2018 Jan 02

0

Hi Sean, Just to give the GlobalISel perspective on this, GlobalISel supports the declaration of a zero register in the register class like so: def GPR32z : RegisterOperand<GPR32> { let GIZeroRegister = WZR; } With that definition, the tablegen-erated ISel code will try to replace 'G_CONSTANT s32 0' with WZR whenever the operand is specified as GPR32z. > On 21 Dec 2017, at 21:22, Sean Silva via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > I looked arou...
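
The rule described in this message (fold a constant 0 into the zero register only when the operand class opts in) can be pictured with a small toy model. This is a sketch only: `OperandClass`, `Operand` and `selectOperand` are hypothetical names invented for illustration and are not LLVM or GlobalISel APIs; the real selection code is generated by TableGen.

```cpp
#include <cstdint>
#include <string>

// Toy model only: these types are hypothetical and are not LLVM classes.
enum class OperandClass { GPR32, GPR32z };

struct Operand {
    OperandClass cls;    // operand class named by the instruction pattern
    bool isConstant;     // true when the operand is fed by a constant
    std::int64_t value;  // the constant value, meaningful only if isConstant
    std::string reg;     // register that would otherwise hold the value
};

// Pick the register name for an operand.
// A constant 0 becomes "WZR" only for the class that declares a zero register.
std::string selectOperand(const Operand &op) {
    if (op.cls == OperandClass::GPR32z && op.isConstant && op.value == 0)
        return "WZR";
    return op.reg;
}
```

In this model a constant 0 on a plain `GPR32` operand still needs a register, which mirrors the statement that the replacement happens only when the operand is specified as `GPR32z`.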

tablegen exponential behavior

2019 Sep 10

2

...p> GPR64sp:$Rn, GPR64sp:$Rm, (i64 3)), (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 2)), (add (mulB<ldop> GPR64sp:$Rn, GPR64sp:$Rm, (i64 1)), (mulBz<ldop> GPR64sp:$Rn, GPR64sp:$Rm))))), (EXTRACT_SUBREG (i64 (DOT (DUPv2i32gpr WZR), (v8i8 (LD1Onev8b GPR64sp:$Rn)), (v8i8 (LD1Onev8b GPR64sp:$Rm)))), sub_32)>, Requires<[HasDotProd]>; def : DotProductI32<SDOTv8i8, sextloadi8>; def : DotProductI32<UDOTv8i8, zextloadi8>; Then when I extended it to 8 element vector...

Canonical way to handle zero registers?

2017 Dec 22

4

I looked around the codebase and didn't see anything that obviously looked like the natural place to turn constant zero immediates into zero-registers (i.e. registers that always return zero when read). Right now we are expanding them in ISelLowering::LowerOperation but that seems too early. The specific issue I'm hitting is that we have a register that reads as -1 and so when we replace

Strange regalloc behaviour: one more available register causes much worse allocation

2018 Dec 05

2

...t the patch generating assembly with llc -mcpu=cortex-a57 everything looks fine, but with the patch we get this (which comes from the block bb.17.switchdest13): .LBB0_16: mov x29, x24 mov w24, w20 mov w20, w19 mov w19, w7 mov w7, w6 mov w6, w5 mov w5, w2 mov x2, x18 mov w18, w15 orr w15, wzr, #0x1c str w15, [x8, #8] mov w0, wzr mov w15, w18 mov x18, x2 mov w2, w5 mov w5, w6 mov w6, w7 mov w7, w19 mov w19, w20 mov w20, w24 mov x24, x29 b .LBB0_3 It looks like the orr and str have barged in and said "we're using w15!" and all the rest of the registers have meek...

[MTE] Tagging Globals

2020 Jul 15

2

...g++ -O1 --target=aarch64-linux -march=armv8.5a+memtag -fsanitize=memtag test.cpp -S -o test.s main: // @main .Lmain$local: // %bb.0: // %entry adrp x8, global_array add x8, x8, :lo12:global_array str wzr, [x8, #4] add x8, x8, w0, sxtw #2 ldr w0, [x8, #64] ret .Lfunc_end0: .size main, .Lfunc_end0-main -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200715/568346a0/att...

[RFC] MC support for variant scheduling classes.

2018 May 10

2

...).getImm() == 0) { assert(MI.getDesc().getNumOperands() == 3 && MI.getOperand(2).getImm() == 0 && "invalid MOVZi operands"); return true; } break; case AArch64::ANDWri: // and Rd, Rzr, #imm return MI.getOperand(1).getReg() == AArch64::WZR; case AArch64::ANDXri: return MI.getOperand(1).getReg() == AArch64::XZR; case TargetOpcode::COPY: return MI.getOperand(1).getReg() == AArch64::WZR; } return false; } ``` That logic can be replaced by the following MCPredicate definitions: ``` def CheckMOVZ : CheckAllOf<[ Chec...

Strange regalloc behaviour: one more available register causes much worse allocation

2018 Dec 05

3

...s (which comes from the block bb.17.switchdest13): .LBB0_16: mov x29, x24 mov w24, w20 mov w20, w19 mov w19, w7 mov w7, w6 mov w6, w5 mov w5, w2 mov x2, x18 mov w18, w15 orr w15, wzr, #0x1c str w15, [x8, #8] mov w0, wzr mov w15, w18 mov x18, x2 mov w2, w5 mov w5, w6 mov w6, w7 mov w7, w19 mov w19, w20 mov w20, w24 mov x24, x29 b .LBB0...

Aarch64: unaligned access despite -mstrict-align

2020 Jun 01

3

...at function f: // @f // %bb.0: adrp x8, g ldr x10, [x8, :lo12:g] ldr x9, [x0] ldr x8, [x10] rev x9, x9 rev x8, x8 cmp x8, x9 b.ne .LBB0_3 // %bb.1: ldr x8, [x10, #8] ldr x9, [x0, #8] rev x8, x8 rev x9, x9 cmp x8, x9 b.ne .LBB0_3 // %bb.2: mov w0, wzr ret .LBB0_3: cmp x8, x9 mov w8, #-1 cneg w0, w8, hs ret .Lfunc_end0: .size f, .Lfunc_end0-f // -- End function .ident "clang version 10.0.0-4ubuntu1 " .section ".note.GNU-stack","",@progbits .addrsig ---8<-------8&...

atomic ops are optimized with incorrect semantics .

2020 Feb 10

3

Hi All, In the "https://gcc.godbolt.org/z/yBYTrd" case, the atomic operation is converted to non-atomic ops for x86, e.g. from xchg dword ptr [100], eax to mov dword ptr [100], 1. The pass responsible for this transformation is InstCombine, i.e. InstCombiner::visitAtomicRMWInst, which converts IR like %0 = atomicrmw xchg i32* inttoptr (i64 100 to i32*), i32 1 monotonic to store
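
As a hedged illustration of the source-level pattern behind this thread: a relaxed exchange whose result is discarded and a relaxed store both leave the same value in the atomic object. The names `flag`, `setWithExchange` and `setWithStore` are made up for this sketch, and it makes no claim about what any particular compiler version emits for either function.

```cpp
#include <atomic>

// Hypothetical global used only for this illustration.
std::atomic<int> flag{0};

// Read-modify-write form: the previous value is returned but ignored here.
void setWithExchange() {
    flag.exchange(1, std::memory_order_relaxed);
}

// Plain atomic store with the same (relaxed / "monotonic") ordering.
void setWithStore() {
    flag.store(1, std::memory_order_relaxed);
}
```

The two functions differ only in that `exchange` also produces the old value. When nothing reads that value, the remaining observable effect is the write of 1.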

[MTE] Tagging Globals

2020 Jul 15

2

...g++ -O1 --target=aarch64-linux -march=armv8.5a+memtag -fsanitize=memtag test.cpp -S -o test.s main: // @main .Lmain$local: // %bb.0: // %entry adrp x8, global_array add x8, x8, :lo12:global_array str wzr, [x8, #4] add x8, x8, w0, sxtw #2 ldr w0, [x8, #64] ret .Lfunc_end0: .size main, .Lfunc_end0-main _______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> https://lists...

How vregs are assigned to operands in IR

2017 Oct 25

3

...i8* getelementptr inbounds ([18 x i8], [18 x i8]* @.str, i32 0, i32 0), i32 %2) ret i32 0 } Generated machine instructions (initial) BB#0: derived from LLVM BB %entry %vreg11<def> = MOVi32imm 6; GPR32:%vreg11 %vreg12<def> = MOVi32imm 5; GPR32:%vreg12 STRWui %WZR, <fi#0>, 0; mem:ST4[FixedStack0] STRWui %vreg12, <fi#1>, 0; mem:ST4[FixedStack1] GPR32:%vreg12 STRWui %vreg11, <fi#2>, 0; mem:ST4[FixedStack2] GPR32:%vreg11 ................................. Best Nisal

Hardware ASan Generating Unknown Instruction

2020 Jun 22

3

...add x8, x8, #1 2d4e4: a2 13 00 d1 sub x2, x29, #4 2d4e8: e9 03 08 aa mov x9, x8 2d4ec: df 64 ff 97 bl #-158852 <__hwasan_check_x2_18_short> 2d4f0: ea 03 1f 2a mov w10, wzr 2d4f4: aa c3 1f b8 stur w10, [x29, #-4] 2d4f8: a2 23 00 d1 sub x2, x29, #8 2d4fc: e9 03 08 aa mov x9, x8 2d500: da 64 ff 97 bl #-158872 <__hwasan_check_x2_18_short> 2d504: a0 83 1f b...

Hardware ASan Generating Unknown Instruction

2020 Jun 22

3

...2d4e4: a2 13 00 d1 sub x2, x29, #4 >> 2d4e8: e9 03 08 aa mov x9, x8 >> 2d4ec: df 64 ff 97 bl #-158852 >> <__hwasan_check_x2_18_short> >> 2d4f0: ea 03 1f 2a mov w10, wzr >> 2d4f4: aa c3 1f b8 stur w10, [x29, #-4] >> 2d4f8: a2 23 00 d1 sub x2, x29, #8 >> 2d4fc: e9 03 08 aa mov x9, x8 >> 2d500: da 64 ff 97 bl #-158872 >> <__hwas...

[LLVMdev] Pseudo load and store instructions for AArch64

2014 Aug 22

5

Hi Renato, > > I'm trying to add pseudo 64-bit load and store instructions for AArch64, which > > should have latencies set to "1" while being otherwise exactly the same as > > normal load and store instructions. > > Can I ask why would you need that? This is the only way I found to stop Machine Instruction Scheduler from reordering load and store

Testers needed for VoIP router solution

2007 Jul 24

1

...=UTF8&tag=jon002&index=blended&linkCode=u r2&camp=1789&creative=9325&keywords=buffalo%20whr> WHR-G54S, <http://www.amazon.com/gp/search?ie=UTF8&tag=jon002&index=blended&linkCode=u r2&camp=1789&creative=9325&keywords=buffalo%20whr> WHR-HP-G54, WZR-G54, WBR2-G54 * Asus <http://www.amazon.com/gp/search?ie=UTF8&tag=jon002&index=blended&linkCode=u r2&camp=1789&creative=9325&keywords=asus%20wl500g%20premium> WL500G Premium (no USB support) This will not work on Linksys WRT54G/GS v5-v7 or newer WRT54G/GS routers....

[LLVMdev] LICM promoting memory to scalar

2014 Sep 02

3

...p" .globl _Z3fooii .align 2 .type _Z3fooii,@function _Z3fooii: // @_Z3fooii // BB#0: // %entry cbz w0, .LBB0_5 // BB#1: // %for.body.lr.ph mov w8, wzr cmp w0, #0 // =0 cinc w9, w0, lt asr w9, w9, #1 adrp x10, globalvar .LBB0_2: // %for.body // =>This Inner Loop Header: Depth=1 cmp w8, w9 b....
