Success

Changes

Summary

  1. [CostModel] move early exit for free intrinsics
  2. [AArch64][GlobalISel] Infer whether G_PHI is going to be a FPR in regbankselect
  3. [WebAssembly] Use wasm::Signature in ObjectWriter (NFC)
  4. [InstCombine] Add trunc(shr(trunc(x),c)) non-uniform vector tests
  5. [AddressSanitizer] Copy type metadata to prevent miscompilation
  6. [clangd] Rename evaluate() to evaluateHeuristics()
  7. Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar."
  8. [AArch64] reuse another map iterator. NFC
  9. [mlir] [VectorOps] changes to printing support for integers
  10. scudo: Re-order Allocator fields for improved performance. NFCI.
  11. [python][tests] Fix string comparison with "is"
  12. [CostModel] fill in arguments as part of intrinsic attribute constructor
  13. [PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types
Commit 745abbbb852e3c0006f22b7beade820ac978252c by spatel
[CostModel] move early exit for free intrinsics

This should be NFC unless some target was expecting that
some form of cttz/ctlz/memcpy is free in terms of size/latency
but not free in throughput cost.
The file was modified llvm/include/llvm/CodeGen/BasicTTIImpl.h
Commit 9d7ec46f5740d7626171c2b8198f825176991e0a by Jessica Paquette
[AArch64][GlobalISel] Infer whether G_PHI is going to be a FPR in regbankselect

Some instructions (G_LOAD, G_SELECT, G_UNMERGE_VALUES) check if their uses
will define/use FPRs (using `onlyUsesFP` and `onlyDefinesFP`).

The register bank of a use isn't necessarily known when an instruction asks for
this.

Teach `hasFPConstraints` to look at the instructions feeding into a G_PHI when
its destination bank is unknown. If any of them are FPR, assume the entire
G_PHI will also be assigned a FPR.

Since a phi can have many inputs, and those inputs can in turn be phis,
restrict the search depth to a very low number.
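
The depth-limited search described above can be modelled roughly as follows. This is a toy sketch in Python, not the code from the patch: the `Instr` class, the `has_fp_constraints` name, and the `MAX_DEPTH` value are all assumptions made for illustration (the commit message only says the limit is "a very low number").

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical depth limit; the real value is not stated in the commit message.
MAX_DEPTH = 2


@dataclass
class Instr:
    """Toy stand-in for a machine instruction (not an LLVM class)."""
    opcode: str
    bank: Optional[str] = None  # "FPR", "GPR", or None when not yet assigned
    inputs: List["Instr"] = field(default_factory=list)


def has_fp_constraints(instr: Instr, depth: int = 0) -> bool:
    # A known register bank settles the question immediately.
    if instr.bank is not None:
        return instr.bank == "FPR"
    # Only phis are searched, and only down to a small depth, because
    # phi inputs can themselves be phis (including loops).
    if instr.opcode != "G_PHI" or depth >= MAX_DEPTH:
        return False
    # If any incoming value is FPR, assume the whole phi will be FPR.
    return any(has_fp_constraints(i, depth + 1) for i in instr.inputs)


fadd = Instr("G_FADD", bank="FPR")
add = Instr("G_ADD", bank="GPR")
mixed_phi = Instr("G_PHI", inputs=[add, fadd])
print(has_fp_constraints(mixed_phi))  # True
```

The depth cap trades precision for bounded work: a chain of phis longer than the limit is conservatively treated as having no FP constraint.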

Also improve the docs for `hasFPConstraints` and friends a little.

This is a 0.3% code size improvement on CTMark/Bullet at -O3, and a 0.2% code
size improvement on CTMark/pairlocalalign at -O3.

Differential Revision: https://reviews.llvm.org/D88177
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/regbank-fp-use-def.mir
The file was modified llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.h
The file was modified llvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
Commit 4c41fb5ad70caeda7f03f0049fb1dff9934dfc53 by aheejin
[WebAssembly] Use wasm::Signature in ObjectWriter (NFC)

There are two `WasmSignature` structs, one in
include/llvm/BinaryFormat/Wasm.h and the other in
lib/MC/WasmObjectWriter.cpp. I don't know why they got separated in this
way in the first place, but it seems we can unify them to use the one in
Wasm.h for all cases.

Reviewed By: dschuff, sbc100

Differential Revision: https://reviews.llvm.org/D88428
The file was modified llvm/lib/MC/WasmObjectWriter.cpp
Commit d047bb1cf69316b39dd67ecc5af669414be9152c by llvm-dev
[InstCombine] Add trunc(shr(trunc(x),c)) non-uniform vector tests
The file was modified llvm/test/Transforms/InstCombine/trunc-shift-trunc.ll
Commit 06e68f05dafb96ea5395d2fed669fccdcd07f61f by d.c.ddcc
[AddressSanitizer] Copy type metadata to prevent miscompilation

When ASan and e.g. Dead Virtual Function Elimination are enabled, the
latter will rely on type metadata to determine if certain virtual calls can be
removed. However, ASan currently does not copy type metadata, which can cause
virtual function calls to be incorrectly removed.

Differential Revision: https://reviews.llvm.org/D88368
The file was modified llvm/lib/Transforms/Instrumentation/AddressSanitizer.cpp
The file was modified llvm/test/Instrumentation/AddressSanitizer/debug_info.ll
Commit 9b1666f3ce2b02be70f8e7f82c3ec5c81262010b by usx
[clangd] Rename evaluate() to evaluateHeuristics()

Since we have two scoring functions (heuristics and decision forest),
rename the existing evaluate() function to be more descriptive of the
heuristics being evaluated in it.

Differential Revision: https://reviews.llvm.org/D88431
The file was modified clang-tools-extra/clangd/unittests/QualityTests.cpp
The file was modified clang-tools-extra/clangd/FindSymbols.cpp
The file was modified clang-tools-extra/clangd/XRefs.cpp
The file was modified clang-tools-extra/clangd/Quality.cpp
The file was modified clang-tools-extra/clangd/CodeComplete.cpp
The file was modified clang-tools-extra/clangd/index/dex/Dex.cpp
The file was modified clang-tools-extra/clangd/Quality.h
Commit 6c8168324b5329c94fe7e8f9a1619802091b9bec by Amara Emerson
Revert "[AArch64][GlobalISel] Add selection support for <8 x s16> G_INSERT_VECTOR_ELT with GPR scalar."

This reverts commit b5e87c9ef2243ecd65e0ef87a1bf303c0c26db04 as it seems to have
broken a bot.
The file was modified llvm/lib/Target/AArch64/GISel/AArch64InstructionSelector.cpp
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/select-insert-vector-elt.mir
Commit 83dc53d30c273960c0e398b2fa7459c8ecf2b03f by jonathan_roelofs
[AArch64] reuse another map iterator. NFC
The file was modified llvm/lib/Target/AArch64/AArch64SIMDInstrOpt.cpp
Commit 54759cefdba929c89a0bde07df19d8946312a974 by ajcbik
[mlir] [VectorOps] changes to printing support for integers

(1) simplify integer printing logic by always using 64-bit print
(2) add index support (since vector<16xindex> is planned to be added)
(3) adjust naming convention print_x -> printX

Reviewed By: bkramer

Differential Revision: https://reviews.llvm.org/D88436
The file was modified mlir/lib/Conversion/VectorToLLVM/ConvertVectorToLLVM.cpp
The file was modified mlir/lib/ExecutionEngine/CRunnerUtils.cpp
The file was modified mlir/test/mlir-cpu-runner/bare_ptr_call_conv.mlir
The file was modified mlir/test/mlir-cpu-runner/unranked_memref.mlir
The file was modified mlir/include/mlir/ExecutionEngine/CRunnerUtils.h
The file was modified mlir/integration_test/Dialect/LLVMIR/CPU/test-vector-reductions-fp.mlir
The file was modified mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
The file was modified mlir/integration_test/Dialect/LLVMIR/CPU/test-vector-reductions-int.mlir
Commit e851aeb0a5084d968d6384fbc2257bbe05dcdacb by peter
scudo: Re-order Allocator fields for improved performance. NFCI.

Move smaller and frequently-accessed fields near the beginning
of the data structure in order to improve locality and reduce
the number of instructions required to form an access to those
fields. With this change I measured a ~5% performance improvement on
BM_malloc_sql_trace_default on aarch64 Android devices (Pixel 4 and
DragonBoard 845c).
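
As a rough illustration of why field order matters, the sketch below uses Python's `ctypes` to show how the byte offset of a small field changes with its position in a structure. The struct and field names are hypothetical and unrelated to the real scudo `Allocator`; a field at a small offset is more likely to be reachable with a single load whose offset fits in the instruction's immediate field.

```python
import ctypes


class ColdFirst(ctypes.Structure):
    # Hypothetical layout: a large, rarely touched table precedes the hot field.
    _fields_ = [
        ("big_table", ctypes.c_uint8 * 4096),
        ("hot_flag", ctypes.c_uint32),
    ]


class HotFirst(ctypes.Structure):
    # Same fields, with the small, frequently accessed one moved to the front.
    _fields_ = [
        ("hot_flag", ctypes.c_uint32),
        ("big_table", ctypes.c_uint8 * 4096),
    ]


print(ColdFirst.hot_flag.offset)  # 4096
print(HotFirst.hot_flag.offset)   # 0
print(ctypes.sizeof(ColdFirst) == ctypes.sizeof(HotFirst))  # True
```

The total size is unchanged; only the offsets (and therefore the locality of the hot fields) differ.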

Differential Revision: https://reviews.llvm.org/D88350
The file was modified compiler-rt/lib/scudo/standalone/combined.h
Commit 0c82fa677f24d8a9656af41ac9cc64ea4f818bc0 by chfast
[python][tests] Fix string comparison with "is"
The file was modified clang/bindings/python/tests/cindex/test_cursor_kind.py
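
For context, a minimal sketch of the class of bug this commit title refers to: `is` tests object identity, while `==` tests value equality. The helper name and the strings below are made up for illustration and are not taken from the test file.

```python
def same_name(a: str, b: str) -> bool:
    """Compare two strings by value, which is what a test normally wants."""
    return a == b


expected = "cursor"
# Built at run time, so it is usually a distinct object with the same value.
actual = "".join(["cur", "sor"])

print(same_name(expected, actual))  # True

# Identity depends on interpreter details such as string interning, so this
# result is not guaranteed and should never be used to compare values.
identical = expected is actual
```

Recent CPython versions also emit a `SyntaxWarning` when `is` is used with a string literal, which is one way such comparisons get noticed.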
Commit 33125cffda96fd5c5d2b80eebfa89fbf4f6b76a6 by spatel
[CostModel] fill in arguments as part of intrinsic attribute constructor

This appears to be an error of code duplication - instead of
one constructor variant calling another, we have N similar
but not identical versions.
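
The "one constructor variant calling another" pattern mentioned above can be sketched as follows. This is a Python analogue with hypothetical names (`CostAttrs`, `from_args`); the real class is C++ and its interface is not shown in this log.

```python
class CostAttrs:
    """Toy stand-in for an intrinsic cost-attributes object."""

    def __init__(self, intrinsic_id, return_type, arg_types, args=()):
        # Single place where every field is initialised.
        self.intrinsic_id = intrinsic_id
        self.return_type = return_type
        self.arg_types = list(arg_types)
        self.args = list(args)

    @classmethod
    def from_args(cls, intrinsic_id, return_type, args):
        # Alternate constructor: derives the argument types from
        # (value, type) pairs and delegates to __init__ instead of
        # duplicating the field assignments.
        values = [value for value, _ in args]
        types = [ty for _, ty in args]
        return cls(intrinsic_id, return_type, types, values)


attrs = CostAttrs.from_args("ctlz", "i32", [("x", "i32"), (False, "i1")])
print(attrs.arg_types)  # ['i32', 'i1']
```

Delegating keeps the variants from drifting apart, which is the kind of divergence the commit message describes.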

I think this is 'NFC' based on the current callers, but it's
hard to tell or guess the intent in all cases.
The file was modified llvm/lib/Analysis/TargetTransformInfo.cpp
Commit 0156914275be5b07155ecefe4dc2d58588265abc by baptiste.saleil
[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types

This patch legalizes the v256i1 and v512i1 types that will be used for MMA.

It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two
underlying registers. v512i1 is used for MMA accumulators. So in addition to
loading and storing the 4 associated VSX registers, we generate instructions to
prime (copy the VSX registers to the accumulator) after loading and unprime
(copy the accumulator back to the VSX registers) before storing.

This patch also adds the UACC register class that is necessary to implement the
loads and stores. This class represents accumulators in their unprimed form and
allows the distinction between primed and unprimed accumulators, to avoid invalid
copies of the VSX registers associated with primed accumulators.

Differential Revision: https://reviews.llvm.org/D84968
The file was modified clang/lib/Basic/Targets/PPC.h
The file was modified llvm/lib/Target/PowerPC/PPCISelLowering.h
The file was modified llvm/lib/Target/PowerPC/PPCISelLowering.cpp
The file was modified llvm/lib/Target/PowerPC/PPCRegisterInfo.td
The file was modified clang/test/CodeGen/target-data.c
The file was modified llvm/lib/Target/PowerPC/PPCTargetMachine.cpp
The file was modified llvm/lib/Target/PowerPC/PPCInstrInfo.cpp
The file was modified llvm/lib/Target/PowerPC/PPCInstrPrefix.td
The file was added llvm/test/CodeGen/PowerPC/mma-acc-memops.ll

Summary

  1. [HACK] Disable matrix_types tests.
  2. [RISCV] Add a toolchain file for RISC-V.
  3. [RISCV] get toolchain path from environment variable
  4. Disable the reference result for the aarch64_neon_intrinsics.c test since clang is buggy.
Commit a9efc190f2a5c0735562c014211bfce35706d5bd by Amara Emerson
[HACK] Disable matrix_types tests.
The file was modified SingleSource/UnitTests/matrix-types-spec.cpp
Commit 8141fe6213d911f469f92bfe5f8a85672f9d8a3d by Amara Emerson
[RISCV] Add a toolchain file for RISC-V.
The file was added cmake/caches/target-riscv64-linux.cmake
Commit f9af83f302b397d59361f0dd72f2b55171535901 by Amara Emerson
[RISCV] get toolchain path from environment variable
The file was modified cmake/caches/target-riscv64-linux.cmake
Commit 87d67af9d8565d068b6706c081b7ae07addcb882 by Amara Emerson
Disable the reference result for the aarch64_neon_intrinsics.c test since clang is buggy.

Benign changes in -O0 codegen are causing the known non-deterministic failure
modes because clang is doing the wrong thing w.r.t. saturating intrinsics.

Until someone goes and fixes the underlying issue, this test is only useful
as a compile-time test.

Ref: D59615 and b5e87c9ef2243ecd65e0ef87a1bf303c0c26db04 discussions.
The file was removed SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.reference_output