1. [Support] On Windows, add optional support for {rpmalloc|snmalloc|mimalloc} (details)
  2. [CodeGen][AArch64] Support arm_sve_vector_bits attribute (details)
  3. [libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY (details)
  4. [libc++] Install a more recent CMake on libc++ builders (details)
  5. [Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL (details)
  6. [NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes (details)
  7. [InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block (details)
  8. Revert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY" (details)
  9. [HeapProf] Clang and LLVM support for heap profiling instrumentation (details)
  10. [MLIR][GPUToSPIRV] Fix use-after-free. Found by asan. (details)
  11. [CodeGen] Properly propagating Calling Convention information when lowering vector arguments (details)
  12. [GISel]: Fix one more CSE Non determinism (details)
  13. [Attributor] Add a phase flag to Attributor (details)
  14. [sda][nfc] clang-formatting (details)
  15. [OCaml] Remove add_constant_propagation (details)
  16. [lldb] Move triple construction out of getArchCFlags in DarwinBuilder (NFC) (details)
  17. [lldb] Make lldb-argdumper a dependency of liblldb (details)
  18. [GISel] Add new GISel combiners for G_SELECT (details)
  19. [test][Inliner] Make always-inline.ll work with NPM (details)
  20. [gn build] Manually port c9455d3 (details)
  21. [gn build] Port 7ed8124d46f (details)
Commit a6a37a2fcd2a8048a75bd0d8280497ed89d73224 by alexandre.ganea
[Support] On Windows, add optional support for {rpmalloc|snmalloc|mimalloc}

This patch optionally replaces the CRT allocator (i.e., malloc and free) with rpmalloc (mixed public domain licence/MIT licence) or snmalloc (MIT licence) or mimalloc (MIT licence). Please note that the source code for these allocators must be available outside of LLVM's tree.

To enable, use `cmake ... -DLLVM_INTEGRATED_CRT_ALLOC=D:/git/rpmalloc -DLLVM_USE_CRT_RELEASE=MT` where `D:/git/rpmalloc` has already been git clone'd from ``. The same applies to snmalloc and mimalloc.

When enabled, the allocator will be embedded (statically linked) into the LLVM tools & libraries. This currently only works with the static CRT (/MT), although using the dynamic CRT (/MD) could potentially work as well in the future.

When enabled, this changes the memory stack from:
  new/delete -> MS VC++ CRT malloc/free -> HeapAlloc -> VirtualAlloc
to:
  new/delete -> {rpmalloc|snmalloc|mimalloc} -> VirtualAlloc

The goal of this patch is to bypass the application's global heap, which is thread-safe and therefore induces locking, and instead take advantage of a modern lock-free, thread-caching allocator. On a 6-core Xeon Skylake we observe a 2.5x decrease in execution time when linking a large-scale application with LLD and ThinLTO (12 min 20 sec -> 5 min 34 sec), when all hardware threads are being used (using LLD's flag /opt:lldltojobs=all). On a dual 36-core Xeon Skylake with all hardware threads used, we observe a 24x decrease in execution time (1 h 2 min -> 2 min 38 sec) when linking a large application with LLD and ThinLTO. Clang build times also see a decrease in the range 5-10% depending on the configuration.
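
As a rough illustration of the "new/delete -> allocator" path described above, the following C++ sketch replaces the global allocation operators and forwards them to a custom allocator. It is not what the patch does (the patch swaps the CRT's malloc/free at build time through the CMake option); `custom_alloc`, `custom_free` and `live_blocks` are hypothetical names, and std::malloc merely stands in for rpmalloc, snmalloc or mimalloc.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-ins for a third-party allocator's entry points. A real
// build would forward to rpmalloc, snmalloc or mimalloc; std::malloc is used
// here only so that the sketch compiles on its own.
static std::atomic<std::size_t> g_live_blocks{0};

static void *custom_alloc(std::size_t size) {
  void *p = std::malloc(size != 0 ? size : 1);
  if (p != nullptr)
    g_live_blocks.fetch_add(1, std::memory_order_relaxed);
  return p;
}

static void custom_free(void *p) noexcept {
  if (p == nullptr)
    return;
  g_live_blocks.fetch_sub(1, std::memory_order_relaxed);
  std::free(p);
}

// Replacing the global operators routes every new/delete in the program
// through the custom allocator instead of the default path.
void *operator new(std::size_t size) {
  if (void *p = custom_alloc(size))
    return p;
  throw std::bad_alloc();
}

void operator delete(void *p) noexcept { custom_free(p); }
void operator delete(void *p, std::size_t) noexcept { custom_free(p); }

// Number of blocks currently allocated through the replaced operators.
std::size_t live_blocks() {
  return g_live_blocks.load(std::memory_order_relaxed);
}
```

The counter exists only to make the redirection observable; a real integration would drop it and rely on the allocator's own statistics, if any.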

Differential Revision:
The file was modifiedllvm/tools/remarks-shlib/CMakeLists.txt
The file was modifiedllvm/tools/llvm-shlib/CMakeLists.txt
The file was modifiedllvm/CMakeLists.txt
The file was modifiedllvm/docs/CMake.rst
The file was modifiedllvm/lib/Support/CMakeLists.txt
The file was modifiedllvm/unittests/Support/DynamicLibrary/CMakeLists.txt
Commit 42587345a3afc52c03c6e6095db773358a1b03e9 by cullen.rhodes
[CodeGen][AArch64] Support arm_sve_vector_bits attribute

This patch implements codegen for the 'arm_sve_vector_bits' type
attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1].
The purpose of this attribute is to define vector-length-specific (VLS)
versions of existing vector-length-agnostic (VLA) types.

VLSTs are represented as VectorType in the AST and fixed-length vectors
in the IR everywhere except in function args/return. Implemented in this
patch is codegen support for the following:

  * Implicit casting between VLA <-> VLS types.
  * Coercion of VLS types in function args/return.
  * Mangling of VLS types.

Casting is handled by the CK_BitCast operation, which has been extended
to support the two new vector kinds for fixed-length SVE predicate and
data vectors, where the cast is implemented through memory rather than a
bitcast which is unsupported. Implementing this as a normal bitcast
would require relaxing checks in LLVM to allow bitcasting between
scalable and fixed types. Another option was adding target-specific
intrinsics, although codegen support would need to be added for these
intrinsics. Given this, casting through memory seemed like the best
approach as it's supported today and existing optimisations may remove
unnecessary loads/stores, although there is room for improvement here.

Coercion of VLSTs in function args/return from fixed to scalable is
implemented through the AArch64 ABI in TargetInfo.

The VLA and VLS types are defined by the ACLE to map to the same
machine-level SVE vectors. VLS types are mangled in the same way as:

  __SVE_VLS<typename, unsigned>

where the first argument is the underlying variable-length type and the
second argument is the SVE vector length in bits. For example:

  // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE
  typedef svint32_t vec __attribute__((arm_sve_vector_bits(512)));
  // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE
  typedef svbool_t pred __attribute__((arm_sve_vector_bits(512)));

The latest ACLE specification (00bet5) does not contain details of this
mangling scheme; it will be specified in the next revision. The
mangling scheme is otherwise defined in the appendices to the Procedure
Call Standard for the Arm Architecture, see [2] for more information.
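
The two mangled names quoted above can be reproduced with a small helper. This is only a sketch built from those two examples: it assumes the usual Itanium-style pieces (length-prefixed name, `I...E` template arguments, `u` for a vendor builtin type, `Lj...E` for an unsigned literal), and `mangleSveVls` is a hypothetical function, not part of Clang.

```cpp
#include <cassert>
#include <string>

// Builds the mangled form of __SVE_VLS<builtinName, bits> as it appears in
// the examples above. `builtinName` is the internal name of the underlying
// scalable type (e.g. "__SVInt32_t"), `bits` the SVE vector length in bits.
std::string mangleSveVls(const std::string &builtinName, unsigned bits) {
  assert(!builtinName.empty() && bits != 0);
  const std::string tmpl = "__SVE_VLS";
  return std::to_string(tmpl.size()) + tmpl +  // 9__SVE_VLS
         "I" +                                 // start of template arguments
         "u" + std::to_string(builtinName.size()) + builtinName +  // vendor type
         "Lj" + std::to_string(bits) + "E" +   // unsigned integer literal
         "E";                                  // end of template arguments
}
```

For instance, `mangleSveVls("__SVInt32_t", 512)` yields `9__SVE_VLSIu11__SVInt32_tLj512EE`, matching the first typedef above.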


Reviewed By: efriedma

Differential Revision:
The file was addedclang/test/CodeGen/attr-arm-sve-vector-bits-call.c
The file was addedclang/test/CodeGen/attr-arm-sve-vector-bits-globals.c
The file was addedclang/test/CodeGenCXX/aarch64-sve-fixedtypeinfo.cpp
The file was modifiedclang/lib/CodeGen/CGExprScalar.cpp
The file was modifiedclang/lib/AST/ItaniumMangle.cpp
The file was addedclang/test/CodeGen/attr-arm-sve-vector-bits-bitcast.c
The file was modifiedclang/lib/CodeGen/TargetInfo.cpp
The file was addedclang/test/CodeGen/attr-arm-sve-vector-bits-codegen.c
The file was addedclang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
The file was modifiedclang/lib/CodeGen/CGCall.cpp
The file was addedclang/test/CodeGen/attr-arm-sve-vector-bits-types.c
The file was addedclang/test/CodeGenCXX/aarch64-mangle-sve-fixed-vectors.cpp
Commit 3b71f91558ff8b569199547efe800cb501c3cf94 by mikhail.maltsev
[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY

Fix compilation with -DLIBCXX_BUILD_EXTERNAL_THREAD_LIBRARY when using clang. The target 'cxx_external_threads' is now linked with 'cxx-headers'. Also fix the mismatching visibility of the `libcpp_timed_backoff_policy` function in <__threading_support>.

Reviewed By: #libc, ldionne

Differential Revision:
The file was modifiedlibcxx/include/__threading_support
The file was modifiedlibcxx/src/CMakeLists.txt
Commit 49644cd941c3bf0668b81c61055e9e9a2b58b99e by Louis Dionne
[libc++] Install a more recent CMake on libc++ builders
The file was modifiedlibcxx/utils/docker/debian9/buildbot/
Commit c9455d3c579292e7ae5b7559ad0302d459e69a95 by russell.gallop
[Test] Tidy up loose ends from LLVM_HAS_GLOBAL_ISEL

This hasn't been allowed as a build option since r309990

Remove leftover REQUIRES: global-isel

Differential Revision:
The file was modifiedllvm/test/CodeGen/MIR/AArch64/register-operand-bank.mir
The file was modifiedllvm/test/MachineVerifier/test_g_select.mir
The file was modifiedllvm/test/MachineVerifier/test_g_extract.mir
The file was modifiedllvm/test/MachineVerifier/test_g_sextload.mir
The file was modifiedllvm/test/MachineVerifier/test_g_ptrtoint.mir
The file was modifiedllvm/test/MachineVerifier/test_g_bitcast.mir
The file was modifiedllvm/test/MachineVerifier/test_g_inttoptr.mir
The file was modifiedllvm/test/MachineVerifier/test_g_fconstant.mir
The file was modifiedllvm/test/MachineVerifier/test_g_sext_inreg.mir
The file was modifiedllvm/test/MachineVerifier/test_g_phi.mir
The file was modifiedllvm/test/MachineVerifier/test_g_zextload.mir
The file was modifiedllvm/tools/llvm-config/CMakeLists.txt
The file was modifiedllvm/test/MachineVerifier/test_g_fcmp.mir
The file was modifiedllvm/test/MachineVerifier/test_g_store.mir
The file was modifiedllvm/test/MachineVerifier/test_g_brjt.mir
The file was modifiedllvm/tools/llvm-config/llvm-config.cpp
The file was modifiedllvm/test/tools/llvm-config/booleans.test
The file was modifiedllvm/test/MachineVerifier/test_g_ptr_add.mir
The file was modifiedllvm/test/MachineVerifier/test_g_icmp.mir
The file was modifiedllvm/test/MachineVerifier/test_g_jump_table.mir
The file was modifiedllvm/test/CodeGen/MIR/AArch64/generic-virtual-registers-error.mir
The file was modifiedllvm/test/MachineVerifier/test_g_concat_vectors.mir
The file was modifiedllvm/test/MachineVerifier/test_g_trunc.mir
The file was modifiedllvm/test/MachineVerifier/test_g_insert.mir
The file was modifiedllvm/test/MachineVerifier/test_g_constant.mir
The file was modifiedllvm/test/MachineVerifier/test_g_add.mir
The file was modifiedllvm/tools/llvm-config/
The file was modifiedllvm/test/MachineVerifier/test_copy_mismatch_types.mir
The file was modifiedllvm/test/CodeGen/MIR/AArch64/generic-virtual-registers-with-regbank-error.mir
The file was modifiedllvm/test/MachineVerifier/test_copy.mir
The file was modifiedllvm/test/
The file was modifiedllvm/test/CodeGen/MIR/X86/generic-instr-type.mir
The file was modifiedllvm/test/MachineVerifier/test_g_addrspacecast.mir
The file was modifiedllvm/test/MachineVerifier/test_g_load.mir
Commit 94d3dd8b08a1abcd2964c4a5d5f6a3a3da2fa4cf by lebedev.ri
[NFC][EarlyCSE][InstSimplify] Add tests for CSE of PHI nodes

PHI nodes depend on the block they're in,
so we can only deal with the most basic case of same-BB PHI's.
The file was addedllvm/test/Transforms/EarlyCSE/phi.ll
The file was addedllvm/test/Transforms/InstSimplify/phi-cse.ll
Commit 6102310d814ad73eab60a88b21dd70874f7a056f by lebedev.ri
[InstSimplify][EarlyCSE] Try to CSE PHI nodes in the same basic block

Apparently, we don't do this, neither in EarlyCSE, nor in InstSimplify,
nor in (old) GVN, but do in NewGVN and SimplifyCFG of all places..

While I could teach EarlyCSE how to hash PHI nodes,
we can't really do much (anything?) even if we find two identical
PHI nodes in different basic blocks. The same-BB case is the interesting one,
and if we teach InstSimplify about it (which is what I wanted originally), we get EarlyCSE support for free.

So I would think this is pretty uncontroversial.

On vanilla llvm test-suite + RawSpeed, this has the following effects:
| statistic name                                     | baseline  | proposed  |      Δ |        % |    \|%\| |
| instsimplify.NumPHICSE                             | 0         | 23779     |  23779 |    0.00% |    0.00% |
| asm-printer.EmittedInsts                           | 7942328   | 7942392   |     64 |    0.00% |    0.00% |
| assembler.ObjectBytes                              | 273069192 | 273084704 |  15512 |    0.01% |    0.01% |
| correlated-value-propagation.NumPhis               | 18412     | 18539     |    127 |    0.69% |    0.69% |
| early-cse.NumCSE                                   | 2183283   | 2183227   |    -56 |    0.00% |    0.00% |
| early-cse.NumSimplify                              | 550105    | 542090    |  -8015 |   -1.46% |    1.46% |
| instcombine.NumAggregateReconstructionsSimplified  | 73        | 4506      |   4433 | 6072.60% | 6072.60% |
| instcombine.NumCombined                            | 3640264   | 3664769   |  24505 |    0.67% |    0.67% |
| instcombine.NumDeadInst                            | 1778193   | 1783183   |   4990 |    0.28% |    0.28% |
| instcount.NumCallInst                              | 1758401   | 1758799   |    398 |    0.02% |    0.02% |
| instcount.NumInvokeInst                            | 59478     | 59502     |     24 |    0.04% |    0.04% |
| instcount.NumPHIInst                               | 330557    | 330533    |    -24 |   -0.01% |    0.01% |
| instcount.TotalInsts                               | 8831952   | 8832286   |    334 |    0.00% |    0.00% |
| simplifycfg.NumInvokes                             | 4300      | 4410      |    110 |    2.56% |    2.56% |
| simplifycfg.NumSimpl                               | 1019808   | 999607    | -20201 |   -1.98% |    1.98% |
I.e. it fires ~24k times, causes +110 (+2.56%) more `invoke` -> `call`
transforms, and counter-intuitively results in *more* instructions total.

That being said, the PHI count doesn't decrease that much,
and looking at some examples, it seems at least some of them
were previously getting PHI CSE'd in SimplifyCFG of all places..

I'm adjusting `Instruction::isIdenticalToWhenDefined()` at the same time.
As a comment in `InstCombinerImpl::visitPHINode()` already stated,
there are no guarantees on the ordering of the operands of a PHI node,
so if we just naively compare them, we may false-negatively say that
the nodes are not equal when the only difference is operand order,
which is especially important since the fold is in InstSimplify,
so we can't rely on InstCombine sorting them beforehand.

Fixing this for the general case is costly (geomean +0.02%),
and does not appear to catch anything in test-suite, but for
the same-BB case, it's trivial, so let's fix at least that.
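
To make the operand-order problem concrete, here is a toy C++ model of two PHI nodes in the same block. It is not LLVM's `PHINode` API and not the code from this patch; `Phi`, `naivelyIdentical` and `identicalIgnoringOrder` are hypothetical names, and sorting the operands is just one simple way to get an order-insensitive comparison.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a PHI node: the block it lives in plus a list of
// (incoming block, incoming value) pairs.
using Incoming = std::pair<std::string, std::string>;

struct Phi {
  std::string parent;
  std::vector<Incoming> incoming;
};

// Naive comparison: operand order matters, so two PHIs whose operands are
// merely permuted compare as different (a false negative).
bool naivelyIdentical(const Phi &a, const Phi &b) {
  return a.parent == b.parent && a.incoming == b.incoming;
}

// Order-insensitive comparison for PHIs in the same block: sort both operand
// lists by (block, value) and then compare them element-wise.
bool identicalIgnoringOrder(Phi a, Phi b) {
  if (a.parent != b.parent || a.incoming.size() != b.incoming.size())
    return false;
  std::sort(a.incoming.begin(), a.incoming.end());
  std::sort(b.incoming.begin(), b.incoming.end());
  return a.incoming == b.incoming;
}
```

With `x = phi [%a, bb1], [%b, bb2]` and `y = phi [%b, bb2], [%a, bb1]` in the same block, the naive check reports a difference while the order-insensitive one treats them as equal.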

As per
this appears to cause geomean +0.03% compile time increase (regression),
but geomean -0.01%..-0.04% code size decrease (improvement).
The file was modifiedllvm/test/Transforms/EarlyCSE/phi.ll
The file was modifiedllvm/test/Transforms/InstCombine/select.ll
The file was modifiedllvm/test/Transforms/JumpThreading/loop-phi.ll
The file was modifiedllvm/test/Transforms/InstSimplify/phi-cse.ll
The file was modifiedllvm/test/Transforms/InstCombine/phi-equal-incoming-pointers.ll
The file was modifiedllvm/test/CodeGen/X86/statepoint-vector.ll
The file was modifiedllvm/test/Transforms/InstCombine/phi-aware-aggregate-reconstruction.ll
The file was modifiedllvm/test/Transforms/InstCombine/merging-multiple-stores-into-successor.ll
The file was modifiedllvm/test/Transforms/LoopVectorize/reduction.ll
The file was modifiedllvm/lib/Analysis/InstructionSimplify.cpp
The file was modifiedllvm/lib/IR/Instruction.cpp
Commit a19fd1aab519ccec18654f76a01b0345880c5200 by mikhail.maltsev
Revert "[libcxx] Fix compile for BUILD_EXTERNAL_THREAD_LIBRARY"

This reverts commit 3b71f91558ff8b569199547efe800cb501c3cf94.

The commit is breaking some build bots.
The file was modifiedlibcxx/include/__threading_support
The file was modifiedlibcxx/src/CMakeLists.txt
Commit 7ed8124d46f94601d5f1364becee9cee8538265e by tejohnson
[HeapProf] Clang and LLVM support for heap profiling instrumentation

See RFC for background:

Note that the runtime changes will be sent separately (hopefully this
week, need to add some tests).

This patch includes the LLVM pass to instrument memory accesses with
either inline sequences to increment the access count in the shadow
location, or alternatively to call into the runtime. It also changes
calls to memset/memcpy/memmove to the equivalent runtime version.
The pass is modeled on the address sanitizer pass.
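
The idea of bumping an access count in a shadow location can be modeled in plain C++ as below. This is only a conceptual sketch: the real pass emits IR and uses a shadow memory mapping that is not described here, so the 64-byte granularity (`kScale`), the hash-map "shadow" and the names `ShadowCounters` and `instrumentedLoad` are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical granularity: one counter per 64-byte chunk of memory.
constexpr unsigned kScale = 6;

// Toy "shadow memory": maps a chunk index to its access count.
class ShadowCounters {
public:
  static std::uintptr_t slot(const void *addr) {
    return reinterpret_cast<std::uintptr_t>(addr) >> kScale;
  }

  // What an inline instrumentation sequence would do before each access.
  void recordAccess(const void *addr) { ++counts_[slot(addr)]; }

  std::uint64_t count(const void *addr) const {
    auto it = counts_.find(slot(addr));
    return it == counts_.end() ? 0 : it->second;
  }

private:
  std::unordered_map<std::uintptr_t, std::uint64_t> counts_;
};

// Models an instrumented load: update the shadow counter, then load.
template <typename T>
T instrumentedLoad(ShadowCounters &shadow, const T *ptr) {
  assert(ptr != nullptr);
  shadow.recordAccess(ptr);
  return *ptr;
}
```

Two loads from the same 64-byte chunk increment the same counter, while a load from the next chunk is counted separately.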

The clang changes add the driver option to invoke the new pass, and to
link with the upcoming heap profiling runtime libraries.

Currently there is no attempt to optimize the instrumentation, e.g. to
aggregate updates to the same memory allocation. That will be
implemented as follow on work.

Differential Revision:
The file was addedllvm/test/Instrumentation/HeapProfiler/version-mismatch-check.ll
The file was addedllvm/test/Instrumentation/HeapProfiler/basic.ll
The file was modifiedclang/include/clang/Driver/SanitizerArgs.h
The file was modifiedclang/lib/Frontend/CompilerInvocation.cpp
The file was modifiedclang/lib/Driver/ToolChains/CommonArgs.cpp
The file was addedllvm/test/Instrumentation/HeapProfiler/scale-granularity.ll
The file was modifiedllvm/lib/Passes/PassBuilder.cpp
The file was modifiedclang/include/clang/Driver/
The file was addedllvm/include/llvm/Transforms/Instrumentation/HeapProfiler.h
The file was addedclang/test/Driver/fmemprof.cpp
The file was modifiedllvm/lib/Transforms/Instrumentation/Instrumentation.cpp
The file was modifiedllvm/include/llvm/InitializePasses.h
The file was modifiedclang/lib/Driver/SanitizerArgs.cpp
The file was addedllvm/test/Instrumentation/HeapProfiler/masked-load-store.ll
The file was addedllvm/test/Instrumentation/HeapProfiler/instrumentation-use-callbacks.ll
The file was addedllvm/lib/Transforms/Instrumentation/HeapProfiler.cpp
The file was modifiedclang/lib/Driver/ToolChains/Clang.cpp
The file was modifiedllvm/lib/Transforms/Instrumentation/CMakeLists.txt
The file was modifiedclang/include/clang/Basic/CodeGenOptions.def
The file was modifiedclang/lib/CodeGen/BackendUtil.cpp
The file was modifiedllvm/lib/Passes/PassRegistry.def
Commit fddf543e6e01cb72ec1af48c5dd82c525bd3e47a by benny.kra
[MLIR][GPUToSPIRV] Fix use-after-free. Found by asan.
The file was modifiedmlir/lib/Conversion/GPUToSPIRV/ConvertGPUToSPIRV.cpp
Commit 3d943bcd223e5b97179840c2f5885fe341e51747 by lucas.prates
[CodeGen] Properly propagating Calling Convention information when lowering vector arguments

When joining the legal parts of a vector argument back into its original value
during the lowering of formal arguments in SelectionDAGBuilder, the calling
convention information was not being propagated when handling each
individual part. The same did not happen when lowering calls, causing a
mismatch.
This patch fixes the issue by properly propagating the Calling
Convention details.

This fixes Bugzilla #47001.

Reviewed By: arsenm

Differential Revision:
The file was modifiedllvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
The file was modifiedllvm/test/CodeGen/ARM/fp16-args.ll
The file was modifiedllvm/test/CodeGen/ARM/fp16-v3.ll
Commit 5c2db1655b2ac2a699d29165513c16894373e566 by aditya_nandakumar
[GISel]: Fix one more CSE Non determinism

Sometimes we can have the following code

x:gpr(s32) = G_OP

Say we build G_OP2 defining the same x and then delete the previous instruction, using something like

Register X = ...;
auto NewMIB = CSEBuilder.buildOp2(X, ... args);

Currently there's a mismatch in how NewMIB is profiled and inserted into the CSEMap (i.e., it doesn't consider register bank/register class along with type). Unify the profiling by refactoring and calling the common method.

This was found by turning on CSEInfo::verify at the end of each of our GISel passes, which turns inconsistent state/non-determinism in CSEing into crashes. Such crashes usually indicate missing calls to the Observer on mutations (the most common case). Here non-determinism usually means that CSE is sometimes missed, and almost never that incorrect code is produced.
This patch also adds this verification at the end of the combiners.
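
The profiling mismatch can be pictured with a toy CSE map whose key must be built the same way on insertion and on lookup. The sketch below is not the GlobalISel `CSEInfo` or `CSEMIRBuilder` API; `Profile`, `makeProfile` and `ToyCSEMap` are hypothetical, and the point is only that a single shared helper keeps both sides in agreement about register bank/class.

```cpp
#include <map>
#include <optional>
#include <string>
#include <tuple>

// Toy profile key: opcode, low-level type, and register bank or class.
using Profile = std::tuple<unsigned, std::string, std::string>;

// One shared helper builds the profile, so insertion and lookup cannot
// disagree about which fields participate in the key.
Profile makeProfile(unsigned opcode, const std::string &type,
                    const std::string &bankOrClass) {
  return Profile{opcode, type, bankOrClass};
}

class ToyCSEMap {
public:
  void insert(const Profile &key, int instrId) { map_[key] = instrId; }

  // Returns the id of an equivalent instruction, if one was recorded.
  std::optional<int> lookup(const Profile &key) const {
    auto it = map_.find(key);
    if (it == map_.end())
      return std::nullopt;
    return it->second;
  }

private:
  std::map<Profile, int> map_;
};
```

If one code path built the key without the bank (for example `makeProfile(op, "s32", "")`) while another included it, the same instruction would sometimes be found and sometimes not, which is the kind of inconsistency the verification is meant to surface.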
The file was modifiedllvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp
The file was modifiedllvm/lib/CodeGen/GlobalISel/Combiner.cpp
The file was modifiedllvm/include/llvm/CodeGen/GlobalISel/CSEInfo.h
The file was modifiedllvm/lib/CodeGen/GlobalISel/CSEInfo.cpp
Commit 7a68f0f1e00b3542405ee596d7e54c4b243933e9 by okuraofvegetable
[Attributor] Add a phase flag to Attributor

Add a new flag that indicates which stage in the process we are in.
This flag is introduced for handling behavior of `getAAFor` according to the stage. (discussed in D86635)

Reviewed By: jdoerfert

Differential Revision:
The file was modifiedllvm/lib/Transforms/IPO/Attributor.cpp
The file was modifiedllvm/include/llvm/Transforms/IPO/Attributor.h
Commit c48b06c44f260e5bf2b906c605b6ca8dff954be2 by simon.moll
[sda][nfc] clang-formatting
The file was modifiedllvm/lib/Analysis/SyncDependenceAnalysis.cpp
Commit dd04fa17d794f73a4cf85827722988ab70239f71 by aeubanks
[OCaml] Remove add_constant_propagation

The file was modifiedllvm/bindings/ocaml/transforms/scalar_opts/
Commit b981924bdda71b610c349a1d502ba83af632ae98 by Jonas Devlieghere
[lldb] Move triple construction out of getArchCFlags in DarwinBuilder (NFC)

Move the construction of the triple out of getArchCFlags in the
DarwinBuilder.
The file was modifiedlldb/packages/Python/lldbsuite/test/builders/
Commit a7e4a1773535c64dea5c1d72d6a0a3e24378eaa1 by Jonas Devlieghere
[lldb] Make lldb-argdumper a dependency of liblldb

Always make lldb-argdumper a dependency of liblldb. Currently it is only
a dependency of the python swig target because of the relative symlink
in the python resource directory. That means that the dependency won't
be there when LLDB_ENABLE_PYTHON is disabled.

Differential revision:
The file was modifiedlldb/tools/argdumper/CMakeLists.txt
Commit db464a3dbf0e8fed363a7b2b9a5b320514ca60f8 by aditya_nandakumar
[GISel] Add new GISel combiners for G_SELECT

This patch adds two new GICombinerRules for G_SELECT: one combines selects
with undef comparisons into their first selectee value, and the other combines
away selects with constant comparisons. The patch additionally adds a new
combiner test for the AArch64 target to cover these new G_SELECT combiner
rules and the existing select_same_val combiner rule.
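
The three G_SELECT rules mentioned above can be summarized in a small C++ model. It is a sketch, not the CombinerHelper code: `Cond`, `Select` and `combineSelect` are hypothetical, values are plain integer ids, and it assumes the first selectee is the value chosen when the condition is true.

```cpp
#include <optional>

// State of the select's condition as seen by the combiner.
enum class Cond { False, True, Undef, Unknown };

// Toy G_SELECT: a condition plus the ids of the two selectee values.
struct Select {
  Cond cond;
  unsigned trueVal;
  unsigned falseVal;
};

// Returns the value that can replace the select, or nothing if no rule fires.
std::optional<unsigned> combineSelect(const Select &s) {
  if (s.trueVal == s.falseVal)
    return s.trueVal;  // select_same_val: both arms are the same value
  switch (s.cond) {
  case Cond::Undef:
    return s.trueVal;  // undef condition: pick the first selectee
  case Cond::True:
    return s.trueVal;  // constant true condition
  case Cond::False:
    return s.falseVal;  // constant false condition
  case Cond::Unknown:
    break;
  }
  return std::nullopt;
}
```

A select with a non-constant, defined condition and two different arms is left untouched.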

Patch by mkitzan
The file was modifiedllvm/include/llvm/Target/GlobalISel/
The file was modifiedllvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/postlegalizercombiner-select.mir
The file was addedllvm/test/CodeGen/AArch64/GlobalISel/combine-select.mir
The file was modifiedllvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
Commit 8bdb98c781217e318f86c568139bf0b427eab7aa by aeubanks
[test][Inliner] Make always-inline.ll work with NPM

The NPM doesn't support call-site alwaysinline as described in the comments.

Also make NPM runs more similar to legacy PM runs.

Reviewed By: ychen, asbirlea

Differential Revision:
The file was modifiedllvm/test/Transforms/Inline/always-inline.ll
Commit 897839425bdb3564aec1a03ee9d2acad608ba265 by aeubanks
[gn build] Manually port c9455d3
The file was modifiedllvm/utils/gn/secondary/llvm/tools/llvm-config/
Commit b3efa65363ba9a3380b68a9a3bd4767a762b8715 by llvmgnsyncbot
[gn build] Port 7ed8124d46f
The file was modifiedllvm/utils/gn/secondary/llvm/lib/Transforms/Instrumentation/