Changes

Summary

  1. [mlir][sparse] add support for std unary operations (details)
  2. [mlir][Tensor] Implement `reifyReturnTypeShapesPerResultDim` for `tensor.insert_slice`. (details)
  3. [PowerPC] Add PowerPC compare and multiply related builtins and instrinsics for XL compatibility (details)
  4. [NFC][MLIR][std] Clean up ArithmeticCastOps (details)
  5. [NFC][sanitizer] Rename some MemoryMapper members (details)
  6. [NFC][sanitizer] Exctract DrainHalfMax (details)
  7. [ScalarEvolution] Make isKnownNonZero handle more cases. (details)
  8. RegAlloc: Allow targets to split register allocation (details)
  9. [NFC][sanitizer] Don't store region_base_ in MemoryMapper (details)
  10. [NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch (details)
  11. sanitizer_common: optimize memory drain (details)
  12. AMDGPU: Try to fix test failure with EXPENSIVE_CHECKS (details)
  13. [NFC][sanitizer] Move MemoryMapper template parameter (details)
  14. [NFC][sanitizer] Simplify MapPackedCounterArrayBuffer (details)
  15. [AArch64][GlobalISel] Mark v2s64 -> v2p0 G_INTTOPTR as legal (details)
  16. Revert "[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer" (details)
Commit 123e8dfcf86a74eb7ba08f33681df581d1be9dbd by ajcbik
[mlir][sparse] add support for std unary operations

Adds zero-preserving unary operators from std. Also adds xor.
Performs minor refactoring to remove the "zero" node, and pushes
the irregular logic for negi (not supported in std) into one place.
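
Why zero preservation matters can be shown with a minimal sketch. This is not the MLIR implementation: the dictionary representation and the helper name `apply_unary_sparse` are assumptions made for illustration. When op(0) == 0, the implicit zeros of a sparse vector stay zero, so only the stored entries have to be visited.

```python
def apply_unary_sparse(values, op):
    """Apply op to the stored entries of a sparse vector {index: value}.

    Only valid when op(0) == 0, so that the implicit zeros stay zero
    and never need to be materialized.
    """
    if op(0) != 0:
        raise ValueError("operator is not zero-preserving")
    return {i: op(v) for i, v in values.items()}


# Hypothetical sparse vector: entries not listed are implicit zeros.
sparse = {1: -2.0, 4: 9.0}

abs_result = apply_unary_sparse(sparse, abs)
neg_result = apply_unary_sparse(sparse, lambda v: -v)
```

An operator such as `lambda v: v + 1` is rejected by this sketch, because it would turn every implicit zero into a nonzero value.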

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D105928
The file was modified mlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
The file was modified mlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
The file was modified mlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
The file was modified mlir/test/Dialect/SparseTensor/sparse_int_ops.mlir
The file was modified mlir/test/Dialect/SparseTensor/sparse_fp_ops.mlir
The file was modified mlir/unittests/Dialect/SparseTensor/MergerTest.cpp
Commit f2b5e438aa3620cd60d115cad8dcb39cc417c8a8 by ravishankarm
[mlir][Tensor] Implement `reifyReturnTypeShapesPerResultDim` for `tensor.insert_slice`.

Differential Revision: https://reviews.llvm.org/D105852
The file was modified mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
The file was modified mlir/include/mlir/Dialect/Tensor/IR/Tensor.h
The file was added mlir/test/Dialect/Tensor/resolve-shaped-type-result-dims.mlir
The file was modified utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
The file was modified mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
The file was modified mlir/lib/Dialect/Tensor/IR/CMakeLists.txt
Commit 18c19414eb70578d4c487d6f4b0f438aead71d6a by wei.huang
[PowerPC] Add PowerPC compare and multiply related builtins and instrinsics for XL compatibility

This patch is part of a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and intrinsics for compare
and multiply related operations.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D102875
The file was added llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-multiply-64bit-only.ll
The file was added clang/test/CodeGen/builtins-ppc-xlcompat-pwr9-64bit.c
The file was modified clang/include/clang/Basic/BuiltinsPPC.def
The file was added clang/test/CodeGen/builtins-ppc-xlcompat-multiply.c
The file was modified clang/lib/Basic/Targets/PPC.cpp
The file was modified llvm/lib/Target/PowerPC/PPCInstrInfo.td
The file was added clang/test/CodeGen/builtins-ppc-xlcompat-multiply-64bit-only.c
The file was added clang/test/CodeGen/builtins-ppc-xlcompat-pwr9.c
The file was added llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-multiply.ll
The file was modified clang/lib/Sema/SemaChecking.cpp
The file was modified llvm/lib/Target/PowerPC/PPCInstr64Bit.td
The file was added clang/test/CodeGen/builtins-ppc-xlcompat-pwr9-error.c
The file was modified llvm/include/llvm/IR/IntrinsicsPowerPC.td
The file was added llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-compare-64bit-only.ll
The file was added llvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-compare.ll
Commit 9955c652eafdcb5f1d16ee3db857f03ee7e5cfbc by gcmn
[NFC][MLIR][std] Clean up ArithmeticCastOps

The documentation on these was out of sync with the implementation. Also
the declaration of inputs was repeated when it is already part of the
ArithmeticCastOp definition.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105934
The file was modified mlir/include/mlir/Dialect/StandardOps/IR/Ops.td
Commit 5df99954392e3a4448e4ff43d4cf644bc06bfa92 by Vitaly Buka
[NFC][sanitizer] Rename some MemoryMapper members

Part of D105778
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit afa3fedcda98db4d47694ed596270a5396074224 by Vitaly Buka
[NFC][sanitizer] Exctract DrainHalfMax

Part of D105778
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h
Commit bb8c7a980fe487eb322d38641db9145a6b6cb1d4 by efriedma
[ScalarEvolution] Make isKnownNonZero handle more cases.

Using an unsigned range instead of signed ranges is a bit more precise.
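
A toy 8-bit illustration of the precision claim, as a sketch only. It assumes the older check amounted to "known positive or known negative" on signed bounds, which the commit message does not spell out, and it models ranges as simple min/max pairs rather than LLVM's wrapping constant ranges. All function names are hypothetical.

```python
BITS = 8


def to_signed(u):
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return u - (1 << BITS) if u >= (1 << (BITS - 1)) else u


def nonzero_via_signed(values):
    """Known nonzero if the signed bounds are strictly positive or strictly negative."""
    signed = [to_signed(v) for v in values]
    return min(signed) > 0 or max(signed) < 0


def nonzero_via_unsigned(values):
    """Known nonzero if the unsigned minimum is not zero."""
    return min(values) != 0


# Hypothetical value set: every unsigned value from 1 to 200.
values = range(1, 201)

signed_answer = nonzero_via_signed(values)      # bounds are [-128, 127], so zero is not excluded
unsigned_answer = nonzero_via_unsigned(values)  # minimum is 1, so zero is excluded
```

The set never contains zero, but once it straddles the sign boundary the signed bounds span zero, while the unsigned minimum still rules it out.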

Differential Revision: https://reviews.llvm.org/D105941
The file was modified llvm/lib/Analysis/ScalarEvolution.cpp
The file was modified llvm/test/Analysis/ScalarEvolution/trip-count9.ll
Commit eebe841a47cbbd55bdcc32da943c92d18f88a5b8 by Matthew.Arsenault
RegAlloc: Allow targets to split register allocation

AMDGPU normally spills SGPRs to VGPRs. Previously, since all register
classes are handled at the same time, this was problematic. We don't
know ahead of time how many registers will need to be reserved to
handle the spilling. If no VGPRs were left for spilling, we would have
to try to spill to memory. If the spilled SGPRs were required for exec
mask manipulation, it is highly problematic because the lanes active
at the point of spill are not necessarily the same as at the restore
point.

Avoid this problem by fully allocating SGPRs in a separate regalloc
run from VGPRs. This way we know the exact number of VGPRs needed, and
can reserve them for a second run. This fixes the most serious
issues, but it is still possible to make all VGPRs unavailable
using inline asm. Emit an error whenever an SGPR spill would
require memory.

This is implemented by giving each regalloc pass a callback which
reports if a register class should be handled or not. A few passes
need some small changes to deal with leftover virtual registers.
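
The filter-callback idea can be sketched as follows. This is an illustrative toy, not the LLVM interface: the `allocate` function, the string register-class names, and the free-register pools are all assumptions, and spilling is not modeled at all.

```python
def allocate(vregs, free_regs, should_allocate_class):
    """Assign physical registers to virtual registers accepted by the filter.

    Returns (assignment, leftover), where leftover holds the virtual
    registers whose class the callback rejected in this run.
    """
    assignment = {}
    leftover = []
    for name, reg_class in vregs:
        if not should_allocate_class(reg_class):
            leftover.append((name, reg_class))  # stays virtual for a later run
            continue
        pool = free_regs[reg_class]
        if not pool:
            raise RuntimeError(f"out of {reg_class} registers")
        assignment[name] = pool.pop(0)
    return assignment, leftover


# Hypothetical virtual registers and physical register pools.
vregs = [("%0", "sgpr"), ("%1", "vgpr"), ("%2", "sgpr")]
free_regs = {"sgpr": ["s0", "s1"], "vgpr": ["v0", "v1"]}

# First run handles only SGPRs; VGPR virtual registers are left over.
first, rest = allocate(vregs, free_regs, lambda c: c == "sgpr")
# Second run handles the leftover VGPRs.
second, rest = allocate(rest, free_regs, lambda c: c == "vgpr")
```

Because the first run finishes before the second starts, the second run sees exactly which resources the first one consumed, which is the property the commit message relies on.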

In the AMDGPU implementation, a new pass is introduced to take the
place of PrologEpilogInserter for SGPR spills emitted during the first
run.

One disadvantage is that StackSlotColoring is currently no longer
used for SGPR spills. It would need to be run again, which will
require more work.

Error if the standard -regalloc option is used. Introduce new separate
-sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be
controlled individually. PBQP is not currently supported, so this also
prevents using the unhandled allocator.
The file was modified llvm/include/llvm/CodeGen/Passes.h
The file was modified llvm/test/CodeGen/AMDGPU/agpr-csr.ll
The file was modified llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
The file was modified llvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
The file was modified llvm/lib/Target/AMDGPU/SIRegisterInfo.h
The file was modified llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
The file was modified llvm/test/CodeGen/AMDGPU/pei-build-spill.mir
The file was modified llvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
The file was modified llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
The file was modified llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
The file was modified llvm/lib/CodeGen/TargetPassConfig.cpp
The file was modified llvm/test/CodeGen/AMDGPU/spill-empty-live-interval.mir
The file was modified llvm/test/CodeGen/AMDGPU/unstructured-cfg-def-use-issue.ll
The file was modified llvm/lib/CodeGen/RegAllocBasic.cpp
The file was added llvm/include/llvm/CodeGen/RegAllocCommon.h
The file was modified llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
The file was modified llvm/lib/CodeGen/RegAllocBase.cpp
The file was modified llvm/lib/CodeGen/RegAllocFast.cpp
The file was modified llvm/test/CodeGen/AMDGPU/remat-vop.mir
The file was modified llvm/test/CodeGen/AMDGPU/spill_more_than_wavesize_csr_sgprs.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll
The file was modified llvm/test/CodeGen/AMDGPU/callee-frame-setup.ll
The file was modified llvm/test/CodeGen/AMDGPU/sibling-call.ll
The file was modified llvm/lib/CodeGen/RegAllocGreedy.cpp
The file was modified llvm/test/CodeGen/AMDGPU/vgpr-tuple-allocation.ll
The file was modified llvm/test/CodeGen/AMDGPU/alloc-aligned-tuples-gfx90a.mir
The file was modified llvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size-vgpr-limit.ll
The file was modified llvm/test/CodeGen/AMDGPU/sgpr-spill-wrong-stack-id.mir
The file was modified llvm/test/CodeGen/AMDGPU/indirect-call.ll
The file was modified llvm/lib/CodeGen/RegAllocBase.h
The file was modified llvm/test/CodeGen/AMDGPU/gfx-callable-preserved-registers.ll
The file was modified llvm/test/CodeGen/AMDGPU/stack-slot-color-sgpr-vgpr-spills.mir
The file was added llvm/test/CodeGen/AMDGPU/sgpr-regalloc-flags.ll
The file was added llvm/test/CodeGen/AMDGPU/sgpr-spill-no-vgprs.ll
The file was modified llvm/test/CodeGen/AMDGPU/virtregrewrite-undef-identity-copy.mir
The file was modified llvm/test/CodeGen/AMDGPU/alloc-aligned-tuples-gfx908.mir
The file was modified llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
The file was modified llvm/lib/CodeGen/LiveIntervals.cpp
The file was modified llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
The file was modified llvm/include/llvm/CodeGen/RegAllocRegistry.h
Commit 99aebb62fb4f2a39c7f03579facf3a1e176b245d by Vitaly Buka
[NFC][sanitizer] Don't store region_base_ in MemoryMapper

Part of D105778
The file was modified compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit 0024ec59a0f3deb206a21567ac2ebe0fc097ea9d by aeubanks
[NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch

To help with debugging non-trivial unswitching issues.

The legacy pass is left unchanged, since nobody is using it.

If a pass's string params are empty (e.g. "simple-loop-unswitch"), don't
default to the empty constructor for the pass params. We should still
let the parser take care of it in case the parser has its own defaults.
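
The point about empty parameter strings can be sketched like this. The option names, default values, and function names below are assumptions for illustration and are not taken from the commit. The idea is that an empty string is still routed through the parser, so the parser's own defaults apply instead of a separately maintained default constructor.

```python
def parse_unswitch_params(params):
    """Hypothetical parameter parser that owns its defaults."""
    # Assumed defaults for this sketch only.
    options = {"nontrivial": False, "trivial": True}
    for token in filter(None, params.split(";")):
        enable = not token.startswith("no-")
        key = token if enable else token[len("no-"):]
        if key not in options:
            raise ValueError(f"unknown parameter: {token}")
        options[key] = enable
    return options


def build_pass(params):
    """Always defer to the parser, even when params is empty."""
    return parse_unswitch_params(params)


default_config = build_pass("")
debug_config = build_pass("nontrivial;no-trivial")
```

With this shape there is a single source of truth for defaults, so the empty-string case and the explicit case cannot drift apart.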

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D105933
The file was modified llvm/lib/Passes/PassRegistry.def
The file was modified llvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
The file was modified llvm/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h
The file was modified llvm/test/Other/print-passes.ll
The file was modified llvm/lib/Passes/PassBuilder.cpp
The file was added llvm/test/Transforms/SimpleLoopUnswitch/options.ll
Commit 832ba20710ee09b00161ea72cf80c9af800fda63 by Vitaly Buka
sanitizer_common: optimize memory drain

Currently we allocate a MemoryMapper per size class.
MemoryMapper mmaps and munmaps its internal buffer.
This results in 50 mmap/munmap calls under the global
allocator mutex. Reuse MemoryMapper and the buffer
for all size classes. This radically reduces the number of
mmap/munmap calls. Smaller size classes tend to have
more objects allocated, so it's highly likely that
the buffer allocated for the first size class will
be enough for all subsequent size classes.
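
The effect of reusing one buffer can be illustrated with a toy model. The class `CountingMapper`, both drain functions, and the buffer sizes are hypothetical; the sketch only counts simulated mapping calls and does not reflect the sanitizer allocator's actual code.

```python
class CountingMapper:
    """Toy stand-in for a buffer-owning mapper that counts simulated mmap calls."""

    def __init__(self):
        self.capacity = 0
        self.mmap_calls = 0

    def get_buffer(self, size):
        # Map a new buffer only when the current one is too small.
        if size > self.capacity:
            self.mmap_calls += 1
            self.capacity = size
        return self.capacity


def drain_per_class(buffer_sizes):
    """One fresh mapper per size class: every class pays for a mapping."""
    calls = 0
    for size in buffer_sizes:
        mapper = CountingMapper()
        mapper.get_buffer(size)
        calls += mapper.mmap_calls
    return calls


def drain_shared(buffer_sizes):
    """One mapper reused across all size classes."""
    mapper = CountingMapper()
    for size in buffer_sizes:
        mapper.get_buffer(size)
    return mapper.mmap_calls


# Hypothetical buffer sizes, largest first, mirroring the observation that
# smaller size classes tend to need the largest buffer.
sizes = [4096, 2048, 1024, 512]

per_class_calls = drain_per_class(sizes)
shared_calls = drain_shared(sizes)
```

When the first buffer is the largest, the shared mapper maps once and every later size class reuses that buffer.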

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105778
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit 3191ac27e396dbd141243b8ca6cf5660c10ddf5c by Matthew.Arsenault
AMDGPU: Try to fix test failure with EXPENSIVE_CHECKS

The machine verifier is enabled by default for EXPENSIVE_CHECKS, so
the pass runs of it would pollute the output here.
The file was modified llvm/test/CodeGen/AMDGPU/sgpr-regalloc-flags.ll
Commit 7140382b17df7c33145cc6e9a2df7e84a2259444 by Vitaly Buka
[NFC][sanitizer] Move MemoryMapper template parameter
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
The file was modified compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
Commit 8725b382b0a5ea375252d966bafbace62a21e93b by Vitaly Buka
[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
The file was modified compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
Commit 5bd7cc4f42488129adb135539c64bb3933d5da4c by Jessica Paquette
[AArch64][GlobalISel] Mark v2s64 -> v2p0 G_INTTOPTR as legal

Allow

```
%x:_<2 x p0> = G_INTTOPTR %y:_<2 x s64>
```

This shows up when building clang for AArch64 with GlobalISel.

Also show that we can select it.

This should match SDAG's behaviour: https://godbolt.org/z/33oqYoaYv

Differential Revision: https://reviews.llvm.org/D105944
The file was added llvm/test/CodeGen/AArch64/GlobalISel/legalize-inttoptr.mir
The file was modified llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/select-int-ptr-casts.mir
Commit ed430023e864c3b3ff7f47d5740e5380828c26f6 by Vitaly Buka
Revert "[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer"

Does not compile.

This reverts commit 8725b382b0a5ea375252d966bafbace62a21e93b.
The file was modified compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h