thr3ads.net - llvm dev - [llvm-dev] Strange regalloc behaviour: one more available register causes much worse allocation [Dec 2018]

If this information is useful, please help other people find it:
Share via:

John Brawn via llvm-dev

2018-Dec-05 15:46 UTC

[llvm-dev] Strange regalloc behaviour: one more available register causes much worse allocation

Preamble
--------

While working on an IR-level optimisation completely unrelated to register
allocation I happened to trigger some really strange register allocator
behaviour causing a large regression in bzip2 in spec2006. I've been trying
to fix that regression before getting the optimisation patch committed, because
I don't want to regress spec2006, but I'm basically fumbling in the dark
because
I don't yet know how or why the register allocator is making the decisions
it
does and I thought I'd send an email to see if anyone has any advice.


The problem
-----------

Attached are (zipped, as llvm-dev has a 100kb message limit):
 * bzip2_regression.ll (reduced from bzip2 in spec2006 after being compiled with
   some patches that I'm working on) which demonstrates the problem.
 * 0001-AArch64-JumpTableDest-scratch-register-isn-t-earlycl.patch which causes
   the problem.
 * without_patch_regalloc.txt, the regalloc debug log for llc -mcpu=cortex-a57
   bzip2_regression.ll without the patch applied.
 * with_patch_regalloc.txt, the same log but with the patch applied.
Note that the patch is not correct, but it happens to be a useful way of
provoking the problem.

Without the patch generating assembly with llc -mcpu=cortex-a57 everything looks
fine, but with the patch we get this (which comes from the block
bb.17.switchdest13):

.LBB0_16:
	mov	x29, x24
	mov	w24, w20
	mov	w20, w19
	mov	w19, w7
	mov	w7, w6
	mov	w6, w5
	mov	w5, w2
	mov	x2, x18
	mov	w18, w15
	orr	w15, wzr, #0x1c
	str	w15, [x8, #8]
	mov	w0, wzr
	mov	w15, w18
	mov	x18, x2
	mov	w2, w5
	mov	w5, w6
	mov	w6, w7
	mov	w7, w19
	mov	w19, w20
	mov	w20, w24
	mov	x24, x29
	b	.LBB0_3

It looks like the orr and str have barged in and said "we're using
w15!" and all
the rest of the registers have meekly moved out of the way and then moved back
again at the and. If the orr and str had used w29 instead then none of this
would have happened.

What the patch does is make one of the input operands to the JumpTableDest32
pseudo-instruction be not marked as earlyclobber, or in other words it means we
have one extra register free compared to without the patch. And you would
expect that more free registers = better register allocation, but in this case
it appears we don't.

Note: this problem can happen without the patch, but the test case is much much
larger and manifested itself as -fno-omit-frame-pointer giving a better
allocation than -fomit-frame-pointer. This patch was actually my first attempt
at fixing this (as I'd noticed that we were unnecessarily keeping an extra
register live across the JumpTableDest8).


What's going on
---------------

What this block looks like after live range splitting has happened is:

  7352B	bb.17.switchdest13:
  	; predecessors: %bb.3
  	  successors: %bb.30(0x80000000); %bb.30(100.00%)
  
  7360B	  %390:gpr32 = COPY $wzr
  7364B	  %434:gpr64 = COPY %432:gpr64
  7368B	  %429:gpr32 = COPY %427:gpr32
  7376B	  %424:gpr32 = COPY %422:gpr32
  7384B	  %419:gpr32 = COPY %417:gpr32
  7392B	  %414:gpr32 = COPY %412:gpr32
  7400B	  %409:gpr32 = COPY %407:gpr32
  7408B	  %404:gpr32 = COPY %402:gpr32
  7416B	  %399:gpr64 = COPY %397:gpr64
  7424B	  %394:gpr32 = COPY %392:gpr32
  7528B	  %253:gpr32 = MOVi32imm 28
  7536B	  STRWui %253:gpr32, %182:gpr64common, 2 :: (store 4 into %ir.106, align
8)
  7752B	  %392:gpr32 = COPY %394:gpr32
  7756B	  %397:gpr64 = COPY %399:gpr64
  7764B	  %402:gpr32 = COPY %404:gpr32
  7768B	  %407:gpr32 = COPY %409:gpr32
  7776B	  %412:gpr32 = COPY %414:gpr32
  7780B	  %417:gpr32 = COPY %419:gpr32
  7788B	  %422:gpr32 = COPY %424:gpr32
  7792B	  %427:gpr32 = COPY %429:gpr32
  7800B	  %432:gpr64 = COPY %434:gpr64
  7808B	  %373:gpr64sp = IMPLICIT_DEF
  7816B	  %374:gpr64sp = IMPLICIT_DEF
  8048B	  B %bb.30

Looking at the debug output of the register allocator, the sequence of events
which kicks things off is
 %223 assigned to w0
 %283 evicts %381 from w15
 %381 requeued for second round
 %253 assigned to w15
 %381 split for w15 in 4 bundles into %391-%395
  %391, %392, %395 are not local intervals
  %393 is the local interval for bb.11.switchdest09
  %394 is the local interval for bb.17.switchdest13
 %392 assigned to w15
 %391 evicts %376 from w18
 %394 assigned to w18
 %376 split into %396-%400
and then %396 evicts something which is split into something which evicts
something etc. until we're done.

Looking at what happens when this patch isn't applied the difference is:
 %223 cannot be assigned to w0, evicts %381 from w15
 %381 requeued for second round
 %283 assigned to w15
 %253 assigned to w15
 %381 split for w15 in 1 bundle into %391 and %392
  Neither is a local interval
 %391 evicts %380 from w2
 %392 assigned to w2

So it looks like the difference is that with the patch we happen to split %381
in a way that causes the split intervals to be allocated such that we get a pair
of copies in bb.17.switchdest13, and this causes a cascade effect where we
repeatedly do the same thing with a whole load of other registers.


Possible Solutions
------------------

So there's two ways I can think of to fix this:
 * Make %381 be split in the same way that it is without the patch, which I
   think means deciding that there's only 1 bundle for w15. Does anyone know
   where and how exactly these bundles are decided?
 * Try and change how evicted / split registers are allocated in some way.
   Things I've tried:
  * In RAGreedy::enqueue reduce the score of unspillable local intervals, and in
    RAGreedy::evictInterference put evicted registers into stage RS_Split
    immediately. This causes %381 to be split immediately instead of being
    requeued, and then makes %391 have a higher score than %253 causing it to
    be allocated before it. This works, but ends up causing an extra spill.
  * In RAGreedy::splitAroundRegion put global intervals into stage RS_Split
    immediately. This makes the chain of evictions after %396 not happen, but
    that gives us one extra spill and we still get one pair of copies in
    bb.17.switchdest13.
  * In RAGreedy::evictInterference put evicted registers into a new RS_Evicted
    stage, which is like RS_Assign but can't evict anything. This seemed to
give
    OK results but was a mess and I didn't understand what I was doing, so I
    threw it away.
  * Turn on the ConsiderLocalIntervalCost option, as it's supposed to help
with
    eviction chains like this. Unfortunately it doesn't work as it's a
non-local
    interval that's causing the eviction chain. I tried making it also
handle
    non-local intervals, but couldn't figure out how to.
  * Turn on TRI->reverseLocalAssignment(). This seemed to work, but I'm
not sure
    why and reading the description of that it may not be the correct solution
    (it's described as being an option to reduce the time the register
allocator
    takes, not to give better allocation). The benchmark results are also
    overall slightly worse.

Any ideas on what the right approach to fixing this is?

John

-------------- next part --------------
A non-text attachment was scrubbed...
Name: attachments.zip
Type: application/x-zip-compressed
Size: 52433 bytes
Desc: attachments.zip
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20181205/ddd2d0ae/attachment-0001.bin>

Nirav Davé via llvm-dev

2018-Dec-05 17:13 UTC

head link

[llvm-dev] Strange regalloc behaviour: one more available register causes much worse allocation

This has cropped up before in X86 (
https://bugs.llvm.org/show_bug.cgi?id=26810 /
https://reviews.llvm.org/rL316295), and there's at least a partial
mitigation
(I recently ran into an eviction change on X86 when trying variants of a
MachineScheduler change, but couldn't find a reproduction post the landed
patch).

I suggest you try enabling enableAdvancedRASplitCost() for ARM and seeing
if that helps.

-Nirav



On Wed, Dec 5, 2018 at 10:46 AM John Brawn via llvm-dev <
llvm-dev at lists.llvm.org> wrote:
> Preamble
> --------
>
> While working on an IR-level optimisation completely unrelated to register
> allocation I happened to trigger some really strange register allocator
> behaviour causing a large regression in bzip2 in spec2006. I've been
trying
> to fix that regression before getting the optimisation patch committed,
> because
> I don't want to regress spec2006, but I'm basically fumbling in the
dark
> because
> I don't yet know how or why the register allocator is making the
decisions
> it
> does and I thought I'd send an email to see if anyone has any advice.
>
>
> The problem
> -----------
>
> Attached are (zipped, as llvm-dev has a 100kb message limit):
>  * bzip2_regression.ll (reduced from bzip2 in spec2006 after being
> compiled with
>    some patches that I'm working on) which demonstrates the problem.
>  * 0001-AArch64-JumpTableDest-scratch-register-isn-t-earlycl.patch which
> causes
>    the problem.
>  * without_patch_regalloc.txt, the regalloc debug log for llc
> -mcpu=cortex-a57
>    bzip2_regression.ll without the patch applied.
>  * with_patch_regalloc.txt, the same log but with the patch applied.
> Note that the patch is not correct, but it happens to be a useful way of
> provoking the problem.
>
> Without the patch generating assembly with llc -mcpu=cortex-a57 everything
> looks
> fine, but with the patch we get this (which comes from the block
> bb.17.switchdest13):
>
> .LBB0_16:
>         mov     x29, x24
>         mov     w24, w20
>         mov     w20, w19
>         mov     w19, w7
>         mov     w7, w6
>         mov     w6, w5
>         mov     w5, w2
>         mov     x2, x18
>         mov     w18, w15
>         orr     w15, wzr, #0x1c
>         str     w15, [x8, #8]
>         mov     w0, wzr
>         mov     w15, w18
>         mov     x18, x2
>         mov     w2, w5
>         mov     w5, w6
>         mov     w6, w7
>         mov     w7, w19
>         mov     w19, w20
>         mov     w20, w24
>         mov     x24, x29
>         b       .LBB0_3
>
> It looks like the orr and str have barged in and said "we're using
w15!"
> and all
> the rest of the registers have meekly moved out of the way and then moved
> back
> again at the and. If the orr and str had used w29 instead then none of this
> would have happened.
>
> What the patch does is make one of the input operands to the
> JumpTableDest32
> pseudo-instruction be not marked as earlyclobber, or in other words it
> means we
> have one extra register free compared to without the patch. And you would
> expect that more free registers = better register allocation, but in this
> case
> it appears we don't.
>
> Note: this problem can happen without the patch, but the test case is much
> much
> larger and manifested itself as -fno-omit-frame-pointer giving a better
> allocation than -fomit-frame-pointer. This patch was actually my first
> attempt
> at fixing this (as I'd noticed that we were unnecessarily keeping an
extra
> register live across the JumpTableDest8).
>
>
> What's going on
> ---------------
>
> What this block looks like after live range splitting has happened is:
>
>   7352B bb.17.switchdest13:
>         ; predecessors: %bb.3
>           successors: %bb.30(0x80000000); %bb.30(100.00%)
>
>   7360B   %390:gpr32 = COPY $wzr
>   7364B   %434:gpr64 = COPY %432:gpr64
>   7368B   %429:gpr32 = COPY %427:gpr32
>   7376B   %424:gpr32 = COPY %422:gpr32
>   7384B   %419:gpr32 = COPY %417:gpr32
>   7392B   %414:gpr32 = COPY %412:gpr32
>   7400B   %409:gpr32 = COPY %407:gpr32
>   7408B   %404:gpr32 = COPY %402:gpr32
>   7416B   %399:gpr64 = COPY %397:gpr64
>   7424B   %394:gpr32 = COPY %392:gpr32
>   7528B   %253:gpr32 = MOVi32imm 28
>   7536B   STRWui %253:gpr32, %182:gpr64common, 2 :: (store 4 into %ir.106,
> align 8)
>   7752B   %392:gpr32 = COPY %394:gpr32
>   7756B   %397:gpr64 = COPY %399:gpr64
>   7764B   %402:gpr32 = COPY %404:gpr32
>   7768B   %407:gpr32 = COPY %409:gpr32
>   7776B   %412:gpr32 = COPY %414:gpr32
>   7780B   %417:gpr32 = COPY %419:gpr32
>   7788B   %422:gpr32 = COPY %424:gpr32
>   7792B   %427:gpr32 = COPY %429:gpr32
>   7800B   %432:gpr64 = COPY %434:gpr64
>   7808B   %373:gpr64sp = IMPLICIT_DEF
>   7816B   %374:gpr64sp = IMPLICIT_DEF
>   8048B   B %bb.30
>
> Looking at the debug output of the register allocator, the sequence of
> events
> which kicks things off is
>  %223 assigned to w0
>  %283 evicts %381 from w15
>  %381 requeued for second round
>  %253 assigned to w15
>  %381 split for w15 in 4 bundles into %391-%395
>   %391, %392, %395 are not local intervals
>   %393 is the local interval for bb.11.switchdest09
>   %394 is the local interval for bb.17.switchdest13
>  %392 assigned to w15
>  %391 evicts %376 from w18
>  %394 assigned to w18
>  %376 split into %396-%400
> and then %396 evicts something which is split into something which evicts
> something etc. until we're done.
>
> Looking at what happens when this patch isn't applied the difference
is:
>  %223 cannot be assigned to w0, evicts %381 from w15
>  %381 requeued for second round
>  %283 assigned to w15
>  %253 assigned to w15
>  %381 split for w15 in 1 bundle into %391 and %392
>   Neither is a local interval
>  %391 evicts %380 from w2
>  %392 assigned to w2
>
> So it looks like the difference is that with the patch we happen to split
> %381
> in a way that causes the split intervals to be allocated such that we get
> a pair
> of copies in bb.17.switchdest13, and this causes a cascade effect where we
> repeatedly do the same thing with a whole load of other registers.
>
>
> Possible Solutions
> ------------------
>
> So there's two ways I can think of to fix this:
>  * Make %381 be split in the same way that it is without the patch, which I
>    think means deciding that there's only 1 bundle for w15. Does anyone
> know
>    where and how exactly these bundles are decided?
>  * Try and change how evicted / split registers are allocated in some way.
>    Things I've tried:
>   * In RAGreedy::enqueue reduce the score of unspillable local intervals,
> and in
>     RAGreedy::evictInterference put evicted registers into stage RS_Split
>     immediately. This causes %381 to be split immediately instead of being
>     requeued, and then makes %391 have a higher score than %253 causing it
> to
>     be allocated before it. This works, but ends up causing an extra spill.
>   * In RAGreedy::splitAroundRegion put global intervals into stage RS_Split
>     immediately. This makes the chain of evictions after %396 not happen,
> but
>     that gives us one extra spill and we still get one pair of copies in
>     bb.17.switchdest13.
>   * In RAGreedy::evictInterference put evicted registers into a new
> RS_Evicted
>     stage, which is like RS_Assign but can't evict anything. This
seemed
> to give
>     OK results but was a mess and I didn't understand what I was doing,
so
> I
>     threw it away.
>   * Turn on the ConsiderLocalIntervalCost option, as it's supposed to
help
> with
>     eviction chains like this. Unfortunately it doesn't work as
it's a
> non-local
>     interval that's causing the eviction chain. I tried making it also
> handle
>     non-local intervals, but couldn't figure out how to.
>   * Turn on TRI->reverseLocalAssignment(). This seemed to work, but
I'm
> not sure
>     why and reading the description of that it may not be the correct
> solution
>     (it's described as being an option to reduce the time the register
> allocator
>     takes, not to give better allocation). The benchmark results are also
>     overall slightly worse.
>
> Any ideas on what the right approach to fixing this is?
>
> John
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20181205/f2fcce9c/attachment.html>

John Brawn via llvm-dev

2018-Dec-05 17:50 UTC

head link

[llvm-dev] Strange regalloc behaviour: one more available register causes much worse allocation

enableAdvancedRASplitCost() does the same thing as ConsiderLocalIntervalCost,
but as a
subtarget option instead of a command-line option, and as I’ve said it doesn’t
help because
it’s a non-local interval causing the eviction chain
(RAGreedy::splitCanCauseEvictionChain
only considers the local interval for a single block, and it’s unclear to me how
to make it
handle a non-local interval).

John

From: Nirav Davé [mailto:niravd at google.com]
Sent: 05 December 2018 17:14
To: John Brawn
Cc: llvm-dev; nd
Subject: Re: [llvm-dev] Strange regalloc behaviour: one more available register
causes much worse allocation

This has cropped up before in X86 (https://bugs.llvm.org/show_bug.cgi?id=26810 /
https://reviews.llvm.org/rL316295), and there's at least a partial
mitigation
(I recently ran into an eviction change on X86 when trying variants of a
MachineScheduler change, but couldn't find a reproduction post the landed
patch).

I suggest you try enabling enableAdvancedRASplitCost() for ARM and seeing if
that helps.

-Nirav



On Wed, Dec 5, 2018 at 10:46 AM John Brawn via llvm-dev <llvm-dev at
lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote:
Preamble
--------

While working on an IR-level optimisation completely unrelated to register
allocation I happened to trigger some really strange register allocator
behaviour causing a large regression in bzip2 in spec2006. I've been trying
to fix that regression before getting the optimisation patch committed, because
I don't want to regress spec2006, but I'm basically fumbling in the dark
because
I don't yet know how or why the register allocator is making the decisions
it
does and I thought I'd send an email to see if anyone has any advice.


The problem
-----------

Attached are (zipped, as llvm-dev has a 100kb message limit):
 * bzip2_regression.ll (reduced from bzip2 in spec2006 after being compiled with
   some patches that I'm working on) which demonstrates the problem.
 * 0001-AArch64-JumpTableDest-scratch-register-isn-t-earlycl.patch which causes
   the problem.
 * without_patch_regalloc.txt, the regalloc debug log for llc -mcpu=cortex-a57
   bzip2_regression.ll without the patch applied.
 * with_patch_regalloc.txt, the same log but with the patch applied.
Note that the patch is not correct, but it happens to be a useful way of
provoking the problem.

Without the patch generating assembly with llc -mcpu=cortex-a57 everything looks
fine, but with the patch we get this (which comes from the block
bb.17.switchdest13):

.LBB0_16:
        mov     x29, x24
        mov     w24, w20
        mov     w20, w19
        mov     w19, w7
        mov     w7, w6
        mov     w6, w5
        mov     w5, w2
        mov     x2, x18
        mov     w18, w15
        orr     w15, wzr, #0x1c
        str     w15, [x8, #8]
        mov     w0, wzr
        mov     w15, w18
        mov     x18, x2
        mov     w2, w5
        mov     w5, w6
        mov     w6, w7
        mov     w7, w19
        mov     w19, w20
        mov     w20, w24
        mov     x24, x29
        b       .LBB0_3

It looks like the orr and str have barged in and said "we're using
w15!" and all
the rest of the registers have meekly moved out of the way and then moved back
again at the and. If the orr and str had used w29 instead then none of this
would have happened.

What the patch does is make one of the input operands to the JumpTableDest32
pseudo-instruction be not marked as earlyclobber, or in other words it means we
have one extra register free compared to without the patch. And you would
expect that more free registers = better register allocation, but in this case
it appears we don't.

Note: this problem can happen without the patch, but the test case is much much
larger and manifested itself as -fno-omit-frame-pointer giving a better
allocation than -fomit-frame-pointer. This patch was actually my first attempt
at fixing this (as I'd noticed that we were unnecessarily keeping an extra
register live across the JumpTableDest8).


What's going on
---------------

What this block looks like after live range splitting has happened is:

  7352B bb.17.switchdest13:
        ; predecessors: %bb.3
          successors: %bb.30(0x80000000); %bb.30(100.00%)

  7360B   %390:gpr32 = COPY $wzr
  7364B   %434:gpr64 = COPY %432:gpr64
  7368B   %429:gpr32 = COPY %427:gpr32
  7376B   %424:gpr32 = COPY %422:gpr32
  7384B   %419:gpr32 = COPY %417:gpr32
  7392B   %414:gpr32 = COPY %412:gpr32
  7400B   %409:gpr32 = COPY %407:gpr32
  7408B   %404:gpr32 = COPY %402:gpr32
  7416B   %399:gpr64 = COPY %397:gpr64
  7424B   %394:gpr32 = COPY %392:gpr32
  7528B   %253:gpr32 = MOVi32imm 28
  7536B   STRWui %253:gpr32, %182:gpr64common, 2 :: (store 4 into %ir.106, align
8)
  7752B   %392:gpr32 = COPY %394:gpr32
  7756B   %397:gpr64 = COPY %399:gpr64
  7764B   %402:gpr32 = COPY %404:gpr32
  7768B   %407:gpr32 = COPY %409:gpr32
  7776B   %412:gpr32 = COPY %414:gpr32
  7780B   %417:gpr32 = COPY %419:gpr32
  7788B   %422:gpr32 = COPY %424:gpr32
  7792B   %427:gpr32 = COPY %429:gpr32
  7800B   %432:gpr64 = COPY %434:gpr64
  7808B   %373:gpr64sp = IMPLICIT_DEF
  7816B   %374:gpr64sp = IMPLICIT_DEF
  8048B   B %bb.30

Looking at the debug output of the register allocator, the sequence of events
which kicks things off is
 %223 assigned to w0
 %283 evicts %381 from w15
 %381 requeued for second round
 %253 assigned to w15
 %381 split for w15 in 4 bundles into %391-%395
  %391, %392, %395 are not local intervals
  %393 is the local interval for bb.11.switchdest09
  %394 is the local interval for bb.17.switchdest13
 %392 assigned to w15
 %391 evicts %376 from w18
 %394 assigned to w18
 %376 split into %396-%400
and then %396 evicts something which is split into something which evicts
something etc. until we're done.

Looking at what happens when this patch isn't applied the difference is:
 %223 cannot be assigned to w0, evicts %381 from w15
 %381 requeued for second round
 %283 assigned to w15
 %253 assigned to w15
 %381 split for w15 in 1 bundle into %391 and %392
  Neither is a local interval
 %391 evicts %380 from w2
 %392 assigned to w2

So it looks like the difference is that with the patch we happen to split %381
in a way that causes the split intervals to be allocated such that we get a pair
of copies in bb.17.switchdest13, and this causes a cascade effect where we
repeatedly do the same thing with a whole load of other registers.


Possible Solutions
------------------

So there's two ways I can think of to fix this:
 * Make %381 be split in the same way that it is without the patch, which I
   think means deciding that there's only 1 bundle for w15. Does anyone know
   where and how exactly these bundles are decided?
 * Try and change how evicted / split registers are allocated in some way.
   Things I've tried:
  * In RAGreedy::enqueue reduce the score of unspillable local intervals, and in
    RAGreedy::evictInterference put evicted registers into stage RS_Split
    immediately. This causes %381 to be split immediately instead of being
    requeued, and then makes %391 have a higher score than %253 causing it to
    be allocated before it. This works, but ends up causing an extra spill.
  * In RAGreedy::splitAroundRegion put global intervals into stage RS_Split
    immediately. This makes the chain of evictions after %396 not happen, but
    that gives us one extra spill and we still get one pair of copies in
    bb.17.switchdest13.
  * In RAGreedy::evictInterference put evicted registers into a new RS_Evicted
    stage, which is like RS_Assign but can't evict anything. This seemed to
give
    OK results but was a mess and I didn't understand what I was doing, so I
    threw it away.
  * Turn on the ConsiderLocalIntervalCost option, as it's supposed to help
with
    eviction chains like this. Unfortunately it doesn't work as it's a
non-local
    interval that's causing the eviction chain. I tried making it also
handle
    non-local intervals, but couldn't figure out how to.
  * Turn on TRI->reverseLocalAssignment(). This seemed to work, but I'm
not sure
    why and reading the description of that it may not be the correct solution
    (it's described as being an option to reduce the time the register
allocator
    takes, not to give better allocation). The benchmark results are also
    overall slightly worse.

Any ideas on what the right approach to fixing this is?

John

_______________________________________________
LLVM Developers mailing list
llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20181205/d7d88e46/attachment.html>

Seemingly Similar Threads

Search for more reasonably related threads

llvm dev - Dec 2018 - Strange regalloc behaviour: one more available register causes much worse allocation

[llvm-dev] Strange regalloc behaviour: one more available register causes much worse allocation

[llvm-dev] Strange regalloc behaviour: one more available register causes much worse allocation

[llvm-dev] Strange regalloc behaviour: one more available register causes much worse allocation

Seemingly Similar Threads