Success

Changes

Summary

  1. [Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL
  2. [NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes
  3. [InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block
  4. Revert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY"
  5. [HeapProf] Clang and LLVM support for heap profiling instrumentation
  6. [MLIR][GPUToSPIRV] Fix use-after-free. Found by asan.
  7. [CodeGen] Properly propagating Calling Convention information when lowering vector arguments
  8. [GISel]: Fix one more CSE Non determinism
Commit c9455d3c579292e7ae5b7559ad0302d459e69a95 by russell.gallop
[Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL

This hasn't been allowed as a build option since r309990

Remove leftover REQUIRES: global-isel

Differential Revision: https://reviews.llvm.org/D86714
The file was modified llvm/test/MachineVerifier/test_g_constant.mir
The file was modified llvm/test/MachineVerifier/test_g_jump_table.mir
The file was modified llvm/test/MachineVerifier/test_g_bitcast.mir
The file was modified llvm/test/MachineVerifier/test_copy_mismatch_types.mir
The file was modified llvm/tools/llvm-config/CMakeLists.txt
The file was modified llvm/test/MachineVerifier/test_g_sext_inreg.mir
The file was modified llvm/test/CodeGen/MIR/AArch64/generic-virtual-registers-error.mir
The file was modified llvm/test/MachineVerifier/test_g_insert.mir
The file was modified llvm/test/lit.cfg.py
The file was modified llvm/test/MachineVerifier/test_g_phi.mir
The file was modified llvm/test/MachineVerifier/test_g_ptr_add.mir
The file was modified llvm/tools/llvm-config/llvm-config.cpp
The file was modified llvm/test/MachineVerifier/test_g_fconstant.mir
The file was modified llvm/test/MachineVerifier/test_g_trunc.mir
The file was modified llvm/test/MachineVerifier/test_g_brjt.mir
The file was modified llvm/test/MachineVerifier/test_copy.mir
The file was modified llvm/test/MachineVerifier/test_g_addrspacecast.mir
The file was modified llvm/test/MachineVerifier/test_g_load.mir
The file was modified llvm/test/MachineVerifier/test_g_zextload.mir
The file was modified llvm/test/CodeGen/MIR/AArch64/register-operand-bank.mir
The file was modified llvm/test/CodeGen/MIR/X86/generic-instr-type.mir
The file was modified llvm/test/MachineVerifier/test_g_fcmp.mir
The file was modified llvm/test/MachineVerifier/test_g_extract.mir
The file was modified llvm/test/MachineVerifier/test_g_store.mir
The file was modified llvm/test/MachineVerifier/test_g_add.mir
The file was modified llvm/test/MachineVerifier/test_g_sextload.mir
The file was modified llvm/test/MachineVerifier/test_g_icmp.mir
The file was modified llvm/test/CodeGen/MIR/AArch64/generic-virtual-registers-with-regbank-error.mir
The file was modified llvm/test/MachineVerifier/test_g_ptrtoint.mir
The file was modified llvm/test/tools/llvm-config/booleans.test
The file was modified llvm/tools/llvm-config/BuildVariables.inc.in
The file was modified llvm/test/MachineVerifier/test_g_concat_vectors.mir
The file was modified llvm/test/MachineVerifier/test_g_inttoptr.mir
The file was modified llvm/test/MachineVerifier/test_g_select.mir
Commit 94d3dd8b08a1abcd2964c4a5d5f6a3a3da2fa4cf by lebedev.ri
[NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes

PHI nodes depend on the block they're in,
so we can only deal with the most basic case of same-BB PHIs.
The file was added llvm/test/Transforms/InstSimplify/phi-cse.ll
The file was added llvm/test/Transforms/EarlyCSE/phi.ll
Commit 6102310d814ad73eab60a88b21dd70874f7a056f by lebedev.ri
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block

Apparently we don't do this in EarlyCSE, InstSimplify, or (old) GVN,
but we do in NewGVN and, of all places, SimplifyCFG.

While I could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks. The same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what I wanted originally,
https://reviews.llvm.org/D86530), we get EarlyCSE support for free.

So I would think this is pretty uncontroversial.

On vanilla llvm test-suite + RawSpeed, this has the following effects:
```
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
|----------------------------------------------------|-----------|-----------|-------:|---------:|---------:|
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
```
I.e. it fires ~24k times, causes 110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.

That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG, of all places.

I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already states,
there are no guarantees on the ordering of the operands of a PHI node,
so a naive operand-by-operand comparison can report two nodes as
different when the only difference is operand order.
This matters especially because the fold lives in InstSimplify,
so we can't rely on InstCombine sorting the operands beforehand.

Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.
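
To illustrate the operand-order point, here is a toy sketch. It is not the code in this commit: `ToyPhi`, `identicalNaive` and `identicalIgnoringOrder` are hypothetical names, and a PHI is modeled as a plain list of (predecessor block id, incoming value id) pairs. It only shows why a positional comparison can miss two same-BB PHIs that differ in operand order, while a comparison on sorted copies does not.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Toy stand-in for a PHI node (hypothetical, not llvm::PHINode):
// each entry is (predecessor block id, incoming value id).
using Incoming = std::pair<int, int>;
struct ToyPhi {
  std::vector<Incoming> incoming;
};

// Naive comparison: operand lists must match position by position.
bool identicalNaive(const ToyPhi &a, const ToyPhi &b) {
  return a.incoming == b.incoming;
}

// Order-insensitive comparison for two PHIs in the same block:
// sort copies by (block, value) and compare the sorted lists.
bool identicalIgnoringOrder(ToyPhi a, ToyPhi b) {
  std::sort(a.incoming.begin(), a.incoming.end());
  std::sort(b.incoming.begin(), b.incoming.end());
  return a.incoming == b.incoming;
}
```

With `p = {(1,10), (2,20)}` and `q = {(2,20), (1,10)}`, the naive check rejects the pair and the order-insensitive one accepts it.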

As per http://llvm-compile-time-tracker.com/compare.php?from=04879086b44348cad600a0a1ccbe1f7776cc3cf9&to=82bdedb888b945df1e9f130dd3ac4dd3c96e2925&stat=instructions
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).
The file was modified llvm/test/Transforms/InstCombine/merging-multiple-stores-into-successor.ll
The file was modified llvm/test/Transforms/InstCombine/phi-aware-aggregate-reconstruction.ll
The file was modified llvm/lib/IR/Instruction.cpp
The file was modified llvm/lib/Analysis/InstructionSimplify.cpp
The file was modified llvm/test/Transforms/EarlyCSE/phi.ll
The file was modified llvm/test/Transforms/JumpThreading/loop-phi.ll
The file was modified llvm/test/Transforms/InstSimplify/phi-cse.ll
The file was modified llvm/test/CodeGen/X86/statepoint-vector.ll
The file was modified llvm/test/Transforms/InstCombine/phi-equal-incoming-pointers.ll
The file was modified llvm/test/Transforms/LoopVectorize/reduction.ll
The file was modified llvm/test/Transforms/InstCombine/select.ll
Commit a19fd1aab519ccec18654f76a01b0345880c5200 by mikhail.maltsev
Revert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY"

This reverts commit 3b71f91558ff8b569199547efe800cb501c3cf94.

The commit is breaking some build bots.
The file was modified libcxx/src/CMakeLists.txt
The file was modified libcxx/include/__threading_support
Commit 7ed8124d46f94601d5f1364becee9cee8538265e by tejohnson
[HeapProf] Clang and LLVM support for heap profiling instrumentation

See RFC for background:
http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html

Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).

This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.
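
As a rough mental model of "increment the access count in the shadow location", here is a toy sketch. It is not the pass or the runtime: `ToyShadow`, its methods, and the simple `addr / granularity` mapping are all illustrative assumptions, and the real shadow mapping and granularity are defined by the patch itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy shadow memory: one counter per `granularity`-byte chunk
// of a simulated address space. The mapping here is an assumption made
// for illustration only.
class ToyShadow {
public:
  ToyShadow(std::size_t memBytes, std::size_t granularity)
      : granularity_(granularity),
        counters_((memBytes + granularity - 1) / granularity, 0) {}

  // What an instrumented load/store conceptually does:
  // bump the counter for the chunk containing `addr`.
  void recordAccess(std::size_t addr) { ++counters_[addr / granularity_]; }

  // Read back the access count for the chunk containing `addr`.
  std::uint64_t countFor(std::size_t addr) const {
    return counters_[addr / granularity_];
  }

private:
  std::size_t granularity_;
  std::vector<std::uint64_t> counters_;
};
```

In this model, accesses to addresses 0 and 8 land in the same 64-byte chunk and share one counter, while an access to address 64 updates the next one.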

The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.

Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow-on work.

Differential Revision: https://reviews.llvm.org/D85948
The file was modified llvm/lib/Transforms/Instrumentation/CMakeLists.txt
The file was added llvm/test/Instrumentation/HeapProfiler/masked-load-store.ll
The file was added llvm/lib/Transforms/Instrumentation/HeapProfiler.cpp
The file was added llvm/test/Instrumentation/HeapProfiler/version-mismatch-check.ll
The file was modified llvm/lib/Passes/PassBuilder.cpp
The file was modified llvm/include/llvm/InitializePasses.h
The file was modified clang/lib/Frontend/CompilerInvocation.cpp
The file was modified clang/lib/Driver/ToolChains/Clang.cpp
The file was modified clang/include/clang/Driver/SanitizerArgs.h
The file was modified llvm/lib/Passes/PassRegistry.def
The file was added llvm/test/Instrumentation/HeapProfiler/instrumentation-use-callbacks.ll
The file was added llvm/test/Instrumentation/HeapProfiler/scale-granularity.ll
The file was modified clang/lib/Driver/ToolChains/CommonArgs.cpp
The file was added clang/test/Driver/fmemprof.cpp
The file was added llvm/include/llvm/Transforms/Instrumentation/HeapProfiler.h
The file was modified clang/lib/Driver/SanitizerArgs.cpp
The file was added llvm/test/Instrumentation/HeapProfiler/basic.ll
The file was modified clang/lib/CodeGen/BackendUtil.cpp
The file was modified llvm/lib/Transforms/Instrumentation/Instrumentation.cpp
The file was modified clang/include/clang/Driver/Options.td
The file was modified clang/include/clang/Basic/CodeGenOptions.def
Commit fddf543e6e01cb72ec1af48c5dd82c525bd3e47a by benny.kra
[MLIR][GPUToSPIRV] Fix use-after-free. Found by asan.
The file was modified mlir/lib/Conversion/GPUToSPIRV/ConvertGPUToSPIRV.cpp
Commit 3d943bcd223e5b97179840c2f5885fe341e51747 by lucas.prates
[CodeGen] Properly propagating Calling Convention information when lowering vector arguments

When joining the legal parts of a vector argument back into its original
value during the lowering of formal arguments in SelectionDAGBuilder, the
calling convention information was not being propagated to the handling of
each individual part. Call lowering did not have this problem, so the two
paths disagreed.

This patch fixes the issue by properly propagating the Calling
Convention details.

This fixes Bugzilla #47001.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D86715
The file was modified llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
The file was modified llvm/test/CodeGen/ARM/fp16-args.ll
The file was modified llvm/test/CodeGen/ARM/fp16-v3.ll
Commit 5c2db1655b2ac2a699d29165513c16894373e566 by aditya_nandakumar
[GISel]: Fix one more CSE Non determinism

https://reviews.llvm.org/D86676

Sometimes we can have the following code

x:gpr(s32) = G_OP

Say we build G_OP2 into the same x and then delete the previous instruction, using something like:

Register X = ...;
auto NewMIB = CSEBuilder.buildOp2(X, ... args);

Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (i.e. it doesn't consider register bank/register class along with type). Unify the profiling by refactoring and calling the common method.

This was found by turning on CSEInfo::verify at the end of each of our GISel passes, which turns inconsistent state/non-determinism in CSEing into crashes. Such crashes usually indicate missing calls to the Observer on mutations (the most common case). Here non-determinism usually means sometimes failing to CSE, and almost never means producing incorrect code.
This patch also adds this verification at the end of the combiners.
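
The idea of unifying the profiling can be sketched with a toy model. This is not the CSEInfo/CSEMIRBuilder code: `ToyInst`, `profileOf` and `ToyCSEMap` are hypothetical names. The point it illustrates is that insertion and lookup go through one shared profiling routine, and that the routine includes the register bank alongside the type, so the two paths cannot disagree.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Hypothetical description of an instruction and its result register.
struct ToyInst {
  std::string opcode;
  std::string type;     // e.g. "s32"
  std::string regBank;  // e.g. "gpr"
};

using Profile = std::tuple<std::string, std::string, std::string>;

// Single profiling routine shared by insertion and lookup.
// It keys on opcode, type, and register bank together.
Profile profileOf(const ToyInst &i) {
  return std::make_tuple(i.opcode, i.type, i.regBank);
}

class ToyCSEMap {
public:
  void insert(const ToyInst &i, int id) { map_[profileOf(i)] = id; }

  // Returns the id of an equivalent instruction, or -1 if none is known.
  int lookup(const ToyInst &i) const {
    auto it = map_.find(profileOf(i));
    return it == map_.end() ? -1 : it->second;
  }

private:
  std::map<Profile, int> map_;
};
```

In this model, an instruction inserted with bank "gpr" is found again by an identical query, and a query that differs only in register bank does not match it.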
The file was modified llvm/lib/CodeGen/GlobalISel/CSEInfo.cpp
The file was modified llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp
The file was modified llvm/lib/CodeGen/GlobalISel/Combiner.cpp
The file was modified llvm/include/llvm/CodeGen/GlobalISel/CSEInfo.h