Success

Changes

Summary

  1. [AMDGPU][MC][NFC] Split large asm tests into smaller chunks
  2. [ValueTracking] Fix isKnownNonEqual() with constexpr mul
  3. [LV] Vectorize (some) early and multiple exit loops
Commit c7ff2c0da1a66d8bae52751c2af4135e67bf3519 by dmitry.preobrazhensky
[AMDGPU][MC][NFC] Split large asm tests into smaller chunks

The following large tests have been split into smaller parts by instruction formats:

    gfx7_asm_all.s
    gfx8_asm_all.s
    gfx9_asm_all.s
    gfx10_asm_all.s

This change results in a noticeable lit testing speedup.
For example, on a debug Windows build, the split asm tests run 3.5 times faster.
The file was added llvm/test/MC/AMDGPU/gfx10_asm_vop1.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_ds.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_smem.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_mimg.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_ds.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_vop1.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_sopc.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_sopp.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_vop3_e64.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_sop1.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
The file was removed llvm/test/MC/AMDGPU/gfx9_asm_all.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_sopc.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_vopc.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_vop2.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_mubuf.s
The file was removed llvm/test/MC/AMDGPU/gfx8_asm_all.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_mtbuf.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_sopk.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_vopc.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_vintrp.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_vop2.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_sopk.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_sopk.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_mimg.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_mtbuf.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_vop3_e64.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_exp.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_smrd.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_vop1.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_smem.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_flat.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_vop3_e64.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_vopcx.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_sop1.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_exp.s
The file was removed llvm/test/MC/AMDGPU/gfx10_asm_all.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_sopp.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_mubuf.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_mubuf.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_vop1.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_flat.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_sopc.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_smem.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_vop2.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_ds.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_vintrp.s
The file was removed llvm/test/MC/AMDGPU/gfx7_asm_all.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_sop1.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_sop.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_vopc_sdwa.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_vop3.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_sop2.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_vopc.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_mubuf.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_flat.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_sop2.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_vintrp.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_vopc_e64.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_ds.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_exp.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_vop3p.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
The file was added llvm/test/MC/AMDGPU/gfx8_asm_mimg.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_sopp.s
The file was added llvm/test/MC/AMDGPU/gfx10_asm_vop2.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_flat.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_sop2.s
The file was added llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
The file was added llvm/test/MC/AMDGPU/gfx7_asm_mtbuf.s
Commit dcd21572f971ae5b5f1bf1f1abefafa0085404e1 by nikita.ppv
[ValueTracking] Fix isKnownNonEqual() with constexpr mul

Confusingly, BinaryOperator is not an Operator, whereas
OverflowingBinaryOperator is. We were implicitly assuming that
the multiply is an Instruction here.

This fixes the assertion failure reported in
https://reviews.llvm.org/D92726#2472827.
The file was modified llvm/test/Analysis/ValueTracking/known-non-equal.ll
The file was modified llvm/lib/Analysis/ValueTracking.cpp
Commit e4df6a40dad66e989a4333c11d39cf3ed9635135 by listmail
[LV] Vectorize (some) early and multiple exit loops

This patch is a major step towards supporting multiple exit loops in the vectorizer. This patch on its own extends the loop forms allowed in two ways:

    single exit loops which are not bottom tested
    multiple exit loops with a single exit block reached from all exits and no phis in the exit block (because of LCSSA, this implies that no values defined in the loop are used later)

The restrictions on multiple exit loop structures will be removed in follow-up patches; disallowing these cases for now makes the code changes smaller and more obvious. As before, we can only handle loops with entirely analyzable exits. Removing that restriction is much harder, and is not part of currently planned efforts.

The basic idea here is that we can force the last iteration to run in the scalar epilogue loop (if we have one). From the definition of SCEV's backedge taken count, we know that no earlier iteration can exit the vector body. As such, we can leave the decision on which exit is taken to the scalar code and generate a bottom tested vector loop which runs all but the last iteration.
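
As a rough illustration of this idea, the sketch below models the transformation by hand in plain C++. It is not output of the vectorizer and not code from the patch: the functions `scalar_loop` and `vector_then_epilogue`, the loop body, and the factor `VF = 4` are hypothetical names and values chosen for the example, and both arrays are assumed to hold at least `n` elements. The loop has two analyzable exits; the model runs whole chunks that lie strictly below the backedge taken count without any exit test, then hands the remaining iterations, including the exiting one, to the original scalar loop.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Original loop with two analyzable exits: the top test (exit A) and the
// test after the store (exit B). No value defined in the loop is used later.
void scalar_loop(std::vector<int>& dst, const std::vector<int>& src,
                 std::size_t n, std::size_t cap) {
    for (std::size_t i = 0; i < n; ++i) {  // exit A
        dst[i] = src[i] * 2;
        if (i == cap) break;               // exit B
    }
}

// Hand-written model of "vector body plus scalar epilogue" (illustrative only).
void vector_then_epilogue(std::vector<int>& dst, const std::vector<int>& src,
                          std::size_t n, std::size_t cap) {
    constexpr std::size_t VF = 4;  // hypothetical vectorization factor

    // Backedge taken count of the loop above: iterations [0, btc) all take
    // the backedge, so none of them can leave through exit A or exit B.
    const std::size_t btc = std::min(n, cap);

    // Only whole chunks inside [0, btc) run in the vector body, so the
    // exiting iteration (index btc) is always left to the scalar epilogue.
    const std::size_t vec_end = (btc / VF) * VF;

    std::size_t i = 0;
    if (vec_end != 0) {
        do {  // bottom tested: a single exit test at the end of the body
            for (std::size_t lane = 0; lane < VF; ++lane)  // stands in for one SIMD op
                dst[i + lane] = src[i + lane] * 2;
            i += VF;
        } while (i != vec_end);
    }

    // Scalar epilogue: the original loop, which decides which exit is taken.
    for (; i < n; ++i) {
        dst[i] = src[i] * 2;
        if (i == cap) break;
    }
}
```

Calling both functions with the same `src`, `n`, and `cap` should leave identical contents in `dst`, whichever exit the loop ends up taking.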

The existing code already had the notion of requiring one iteration in the scalar epilogue; this patch is mainly about generalizing that support slightly, making sure we don't try to use this mechanism when tail folding, and updating the code to reflect the difference between a single exit block and a unique exit block (very mechanical).

Differential Revision: https://reviews.llvm.org/D93317
The file was modified llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
The file was modified llvm/test/Transforms/LoopVectorize/loop-form.ll
The file was modified llvm/test/Transforms/LoopVectorize/loop-legality-checks.ll
The file was modified llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
The file was modified llvm/test/Transforms/LoopVectorize/control-flow.ll