Changes

Summary

  1. [X86][Costmodel] Now that `getReplicationShuffleCost()` is good, update `getInterleavedMemoryOpCostAVX512()`
  2. [AArch64][SVE] Mark fixed-type FP extending/truncating loads/stores as custom
  3. Use a deterministic order when updating the DominatorTree
  4. fix typos in comments
  5. [NFC][X86][LV][Costmodel] Add most basic test for masked interleaved load
Commit cffe3a084f87ad2ed17aeebc1075eb100182114e by lebedev.ri
[X86][Costmodel] Now that `getReplicationShuffleCost()` is good, update `getInterleavedMemoryOpCostAVX512()`

... to actually ask about an i1-elt-wide mask, since that is what will probably be used on AVX512.
This unblocks D111460.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D114316
The file was modified llvm/lib/Target/X86/X86TargetTransformInfo.cpp
The file was modified llvm/test/Analysis/CostModel/X86/interleaved-store-accesses-with-gaps.ll
Commit 61808066325ff0828bab7f016e8798b78d2e6b49 by bradley.smith
[AArch64][SVE] Mark fixed-type FP extending/truncating loads/stores as custom

This allows the generic DAG combine to fold fp_extend/fp_trunc into
loads/stores, which we can then lower into an integer extending
load/truncating store plus an FP_EXTEND/FP_ROUND.

The nuance here is that fixed-type FP_EXTEND/FP_ROUND require unpacked
types, hence lowering them introduces an unpack/zip. By allowing these
nodes to be combined with loads/stores we make it much easier to have
this unpack/zip combined into the load/store by our custom lowering.

Differential Revision: https://reviews.llvm.org/D114580
The file was modified llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
The file was modified llvm/test/CodeGen/AArch64/sve-fixed-length-fp-extend-trunc.ll
Commit 297fb66484c73d3a04e8974921e94cb00c1587c2 by bjorn.a.pettersson
Use a deterministic order when updating the DominatorTree

This solves a problem with non-deterministic output from opt due
to not performing dominator tree updates in a deterministic order.

Analysis of the problem showed that JumpThreading was using
the DomTreeUpdater via llvm::MergeBasicBlockIntoOnlyPred. When
preparing the list of updates to send to DomTreeUpdater::applyUpdates
we iterated over a SmallPtrSet, which does not give a well-defined
order of updates to perform.
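
A minimal sketch of the general technique, not the code from this commit:
to get a stable order, keep the pointer set only for membership tests and
record the elements in a separate sequence in first-seen order. The `Block`
type and `uniqueInFirstSeenOrder` are hypothetical names used for
illustration, and the standard containers stand in for LLVM's own ADTs
(LLVM's `SetVector` packages the same idea).

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a basic block; not an LLVM type.
struct Block {
  int id;
};

// Returns each distinct block once, in the order it was first seen.
// The hash set is used only to detect duplicates. The returned vector,
// not the set, decides the order in which updates would be applied, so
// the result does not depend on pointer values or hashing.
std::vector<const Block *>
uniqueInFirstSeenOrder(const std::vector<const Block *> &preds) {
  std::unordered_set<const Block *> seen;
  std::vector<const Block *> ordered;
  for (const Block *b : preds) {
    assert(b != nullptr && "predecessor list must not contain null");
    if (seen.insert(b).second) // true only on first insertion
      ordered.push_back(b);
  }
  return ordered;
}
```

Iterating over `seen` directly would visit the same elements, but in an
order tied to the pointer values, which can change from run to run.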

The added domtree-updates.ll test case is an example that would
result in non-deterministic printouts of the domtree. Semantically
those domtrees are equivalent, but it shows that when we
use the domtree iterator the order in which nodes are visited depends
on the order in which dominator tree updates are performed.
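
The effect can be illustrated with a toy tree, assuming children are stored
in the order in which edges were inserted. This is a sketch only: `ToyTree`,
`buildTree`, and `preorder` are hypothetical helpers, not LLVM's
DominatorTree API.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical toy tree: node id -> children, kept in insertion order.
using ToyTree = std::map<int, std::vector<int>>;

// Builds the tree by applying (parent, child) edge insertions in the
// order given, mimicking a sequence of tree updates.
ToyTree buildTree(const std::vector<std::pair<int, int>> &edges) {
  ToyTree tree;
  for (const auto &e : edges) {
    assert(e.first != e.second && "a node cannot be its own parent");
    tree[e.first].push_back(e.second);
  }
  return tree;
}

// Appends the nodes reachable from `node` to `out` in preorder.
void preorderImpl(const ToyTree &tree, int node, std::vector<int> &out) {
  out.push_back(node);
  auto it = tree.find(node);
  if (it == tree.end())
    return; // leaf: no recorded children
  for (int child : it->second)
    preorderImpl(tree, child, out);
}

// Returns the preorder visiting sequence starting at `root`.
std::vector<int> preorder(const ToyTree &tree, int root) {
  std::vector<int> out;
  preorderImpl(tree, root, out);
  return out;
}
```

Inserting the edges 0->1 and 0->2 in either order yields the same
parent/child relation, but the preorder walk visits 1 before 2 in one case
and 2 before 1 in the other.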

Since some passes (at least EarlyCSE) iterate over nodes in the
dominator tree in a similar fashion to the domtree printer, the
order in which transforms are applied by such passes, transitively,
also depends on the order in which dominator tree updates are
performed. Taking EarlyCSE as an example, the end result could be
different depending on the order in which the transforms are applied.

Reviewed By: nikic, kuhar

Differential Revision: https://reviews.llvm.org/D110292
The file was modified llvm/include/llvm/Support/GenericDomTree.h
The file was modified llvm/lib/Transforms/Utils/Local.cpp
The file was modified llvm/lib/CodeGen/IndirectBrExpandPass.cpp
The file was modified llvm/lib/Transforms/Utils/SimplifyCFG.cpp
The file was modified llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
The file was added llvm/test/Transforms/JumpThreading/domtree-updates.ll
Commit d96f92ff16edab72cf78811673f02371f07a5a70 by sylvestre
fix typos in comments
The file was modified llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp
The file was modified llvm/test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections.yaml
The file was modified clang-tools-extra/clang-doc/ClangDoc.h
Commit 5e96553608a1bbf688f11c76890dc543c5f89c61 by lebedev.ri
[NFC][X86][LV][Costmodel] Add most basic test for masked interleaved load
The file was added llvm/test/Analysis/CostModel/X86/masked-interleaved-store-i16.ll
The file was removed llvm/test/Analysis/CostModel/X86/interleaved-store-accesses-with-gaps.ll
The file was added llvm/test/Analysis/CostModel/X86/masked-interleaved-load-i16.ll