Hi all,
Upon further investigation I can confirm that the pass does miss some
important cases, and visits some instructions more than it needs to. A
rewrite is in the works and will be posted soonish.
Cheers,
Lang.
On Thu, Mar 13, 2014 at 3:36 PM, Lang Hames <lhames at gmail.com> wrote:
> Hi Bruno,
>
> I'm looking at a test case where we're failing to insert a vzeroupper
> between an instruction that dirties the YMM regs and a call that uses SSE
> regs. No test case yet - I'm still trying to reduce it to something sane. I
> can see where the logic in the X86VZeroUpper optimization goes off the
> rails though: The entry state for the basic block is ST_UNKNOWN, and the
> optimization contains the following logic:
>
>   if (CurState == ST_DIRTY) {
>     // Only insert the VZEROUPPER in case the entry state isn't unknown.
>     // When unknown, only compute the information within the block to have
>     // it available in the exit if possible, but don't change the block.
>     if (EntryState != ST_UNKNOWN) {
>       BuildMI(BB, I, dl, TII->get(X86::VZEROUPPER));
>       ++NumVZU;
>     }
>     // After the inserted VZEROUPPER the state becomes clean again, but
>     // other YMM may appear before other subsequent calls or even before
>     // the end of the BB.
>     CurState = ST_CLEAN;
>   }
>
> If CurState == ST_DIRTY and EntryState == ST_UNKNOWN, then some
> instruction in this basic block has dirtied the YMM regs. In that case, why
> would you want to avoid putting a vzeroupper instruction in? Is it just to
> avoid inserting duplicate vzerouppers when the block is revisited? If
> that's the case then I think the problem is actually in
> runOnMachineFunction, which contains the comment: "Each BB state depends on
> all predecessors, loop over until everything converges. (Once we converge,
> we can implicitly mark everything that is still ST_UNKNOWN as ST_CLEAN.)".
> We do iterate to convergence, but we don't mark anything as clean
> afterwards, nor do a final re-visit of the basic blocks that had previously
> had ST_UNKNOWN entry states. Is that an oversight?
>
> Cheers,
> Lang.
>
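
For illustration only, the step the quoted message says is missing (iterate
to convergence, then treat whatever is still ST_UNKNOWN as ST_CLEAN) can be
sketched on a toy CFG. This is not the X86VZeroUpper code and not the planned
rewrite. ToyBlock, Effect, mergePreds, applyEffect and resolveEntryStates are
hypothetical names, and the merge rule (dirty wins, then unknown, and a block
with no predecessors starts clean) is an assumption made for the sketch.

```cpp
#include <cstddef>
#include <vector>

// Toy model only: none of these types exist in LLVM.
enum State { ST_UNKNOWN, ST_CLEAN, ST_DIRTY };

// Hypothetical summary of what a block does to the upper YMM halves.
enum Effect { PassThrough, Dirties, Cleans };

struct ToyBlock {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  Effect Eff = PassThrough;
  State Entry = ST_UNKNOWN;
  State Exit = ST_UNKNOWN;
};

// Assumed merge rule: any dirty predecessor makes the entry dirty; otherwise
// any unknown predecessor keeps it unknown; otherwise it is clean.
// Assumption: a block with no predecessors is entered clean.
static State mergePreds(const std::vector<ToyBlock> &Blocks,
                        const ToyBlock &BB) {
  if (BB.Preds.empty())
    return ST_CLEAN;
  bool SawUnknown = false;
  for (std::size_t P : BB.Preds) {
    if (Blocks[P].Exit == ST_DIRTY)
      return ST_DIRTY;
    if (Blocks[P].Exit == ST_UNKNOWN)
      SawUnknown = true;
  }
  return SawUnknown ? ST_UNKNOWN : ST_CLEAN;
}

static State applyEffect(Effect E, State Entry) {
  switch (E) {
  case Dirties:
    return ST_DIRTY;
  case Cleans:
    return ST_CLEAN;
  case PassThrough:
    return Entry;
  }
  return Entry;
}

void resolveEntryStates(std::vector<ToyBlock> &Blocks) {
  // Step 1: iterate until no entry or exit state changes.
  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (ToyBlock &BB : Blocks) {
      State NewEntry = mergePreds(Blocks, BB);
      State NewExit = applyEffect(BB.Eff, NewEntry);
      if (NewEntry != BB.Entry || NewExit != BB.Exit) {
        BB.Entry = NewEntry;
        BB.Exit = NewExit;
        Changed = true;
      }
    }
  }
  // Step 2: anything still unknown after convergence (for example a loop
  // that nothing dirty ever reaches) is marked clean explicitly.
  for (ToyBlock &BB : Blocks) {
    if (BB.Entry == ST_UNKNOWN)
      BB.Entry = ST_CLEAN;
    if (BB.Exit == ST_UNKNOWN)
      BB.Exit = ST_CLEAN;
  }
}
```

In this sketch a loop of pass-through blocks stays ST_UNKNOWN after step 1,
because each block is waiting on the other, and only step 2 resolves it. A
real pass would then need one more visit of the blocks whose entry state was
unknown, since a block visited with an unknown entry state skipped the
insertion.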