1. [libc++] Move <__sso_allocator> out of include/ into src/. NFCI.
2. [libc++] [LIBCXX-DEBUG-FIXME] Fix an iterator-invalidation issue in string::assign.
3. [libc++] [LIBCXX-DEBUG-FIXME] Iterating a string::iterator "off the end" is UB.
4. [libc++] [LIBCXX-DEBUG-FIXME] Our `__debug_less` breaks some complexity guarantees.
5. [libc++] [LIBCXX-DEBUG-FIXME] std::advance shouldn't use ADL `>=` on the _Distance type.
6. [libc++] [LIBCXX-DEBUG-FIXME] Stop using invalid iterators to insert into sets/maps.
7. [scudo] Align objects with alignas
8. [mlir][tosa] Add tosa.depthwise lowering to existing linalg.depthwise_conv
9. [lld] Convert LLVM_CMAKE_PATH to a CMake path
10. [WebAssembly] Add SIMD const_splat intrinsics
11. [NFC][X86][Codegen] Add some tests for 64-bit shift by (32-x)
12. Preserve metadata on masked intrinsics in auto-upgrade
13. [Utils][NFC] Rename replace-function-regex in update_cc_test_checks
14. [MachineCSE][NFC]: Refactor and comment on preventing CSE for isConvergent instrs
15. [mlir] Add polynomial approximation for math::ExpM1
16. GlobalISel: Use DAG call lowering infrastructure in a more compatible way
17. X86/GlobalISel: Use generic version of splitToValueTypes
18. AMDGPU/GlobalISel: Remove unnecessary override
19. GlobalISel: Update documentation
20. [clangd] Split CC and refs limit and increase refs limit to 1000
21. [AMDGPU] Improve global SADDR selection
22. When performing template argument deduction to select a partial
23. ARM/GlobalISel: Don't store a MachineInstrBuilder reference
24. AMDGPU: Add a few more tail call tests
25. [gn build] (semi-manually) port 0b10bb7ddd3c
26. [lld-macho] Check simulator platforms to avoid issuing false positive errors.
27. [lldb] Handle missing SBStructuredData copy assignment cases
28. [gn build] (semi-manually) port 0b10bb7ddd3c more
29. [AMDGPU][GlobalISel] Widen 1 and 2 byte scalar loads
30. [Driver] Move -print-runtime-dir and -print-resource-dir tests
31. [AArch64] Fix some coding standard issues related to namespace llvm
32. [mlir][Linalg] Fix element type of results when folding reshapes.
33. AMDGPU: Fix lit test
34. Allow /STACK in #pragma comment(linker, ...)
35. Attach metadata to simplified masked loads and stores
36. [mlir][Linalg] Fix test to use new reshape op form.
37. [MCAsmInfo] Support UsesCFIForDebug for targets with no exception handling
38. [AArch64] Deleted unused AsmBackend functions
Commit 0b10bb7ddd3c92465ef12d52e88614e6b4c5ef27 by arthur.j.odwyer
[libc++] Move <__sso_allocator> out of include/ into src/. NFCI.

This allocator is not intended for libc++'s users to use;
it's strictly an implementation detail of `src/locale.cpp`.
So, move it to the `src/include/` directory.

Drive-by const-qualify its comparison operators.

For consistency with `__hidden_allocator` (defined in `src/thread.cpp`),
do *not* remove it from "libcxx/lib/libc++unexp.exp",
"libcxx/utils/symcheck-blacklists/linux_blacklist.txt", etc.

Differential Revision:
The file was modified libcxx/include/module.modulemap
The file was removed libcxx/include/__sso_allocator
The file was modified libcxx/src/locale.cpp
The file was modified libcxx/include/CMakeLists.txt
The file was modified libcxx/src/CMakeLists.txt
The file was added libcxx/src/include/sso_allocator.h
Commit db9425cb060bd076fcdcbb5a37bfd992deff2086 by arthur.j.odwyer
[libc++] [LIBCXX-DEBUG-FIXME] Fix an iterator-invalidation issue in string::assign.

This appears to be a bug in our string::assign: when assigning into
a longer string, from a shorter snippet of itself, we invalidate
iterators before doing the copy. We should invalidate them afterward.
Also drive-by improve the formatting of a function header.
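
The aliasing case described above can be sketched as follows. The helper name `assign_from_self` is hypothetical and only for illustration; it is not part of libc++ or of this patch. It assigns to a string a shorter range of its own characters, which only works if the source range is read before the old contents (and the iterators into them) are discarded.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the patch): assign to `s` a shorter
// snippet of itself, passed as iterators that point into `s`.
std::string assign_from_self(std::string s, std::size_t pos, std::size_t len) {
    assert(pos + len <= s.size());
    // Both source iterators refer to the destination string. The copy has
    // to go through them first; only afterward may they become invalid.
    s.assign(s.begin() + pos, s.begin() + pos + len);
    return s;
}
```

For example, `assign_from_self("hello world", 6, 5)` should yield `"world"`.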

Differential Revision:
The file was modified libcxx/test/std/strings/basic.string/string.modifiers/string_assign/iterator.pass.cpp
The file was modified libcxx/include/string
Commit 12dd9cdf1a8267e0c5db4f191f2598648de02619 by arthur.j.odwyer
[libc++] [LIBCXX-DEBUG-FIXME] Iterating a string::iterator "off the end" is UB.

The range of char pointers [data, data+size] is a valid closed range,
but the range [begin, end) is valid only half-open.
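
A small sketch of the distinction, using hypothetical helper names chosen for illustration: reading `data()[size()]` is fine because the pointer range includes the terminator, while the iterator loop stops at `end()` and never dereferences or steps past it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reads one element past the last character through the char pointer.
// This is valid: [data, data + size] is a closed range, and since C++11
// the element at data()[size()] is the null terminator.
char read_terminator(const std::string& s) {
    const char* p = s.data();
    return p[s.size()];
}

// Walks the half-open iterator range [begin, end). The iterator is
// compared against end() but is never dereferenced or advanced there.
std::size_t count_chars(const std::string& s) {
    std::size_t n = 0;
    for (std::string::const_iterator it = s.begin(); it != s.end(); ++it)
        ++n;
    return n;
}
```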

Differential Revision:
The file was modified libcxx/test/std/input.output/filesystems/class.path/path.nonmember/path.factory.pass.cpp
Commit 165ad89947e8ef6c08c80eb067d85b4fa9074904 by arthur.j.odwyer
[libc++] [LIBCXX-DEBUG-FIXME] Our `__debug_less` breaks some complexity guarantees.

`__debug_less` ends up running the comparator up to twice per comparison,
because whenever `(x < y)` it goes on to verify that `!(y < x)`.
This breaks the strict "Complexity" guarantees of algorithms like
`inplace_merge`, which we test in the test suite. So, just skip the
complexity assertions in debug mode.
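
To illustrate why the comparison count doubles, here is a minimal sketch. `CheckedLess` is a hypothetical stand-in written for this example; it is not libc++'s actual `__debug_less`. It wraps a counting comparator and re-runs it with swapped arguments whenever the first call returns true.

```cpp
#include <cassert>
#include <cstddef>

// Comparator that records how many times it has been invoked.
struct CountingLess {
    std::size_t* calls;
    bool operator()(int a, int b) const { ++*calls; return a < b; }
};

// Hypothetical debug wrapper: when comp(x, y) is true, it also evaluates
// comp(y, x) to check that the ordering is not symmetric.
template <class Comp>
struct CheckedLess {
    Comp comp;
    template <class T>
    bool operator()(const T& x, const T& y) const {
        bool result = comp(x, y);
        if (result) {
            bool reversed = comp(y, x);   // the extra, second invocation
            assert(!reversed && "comparator is not a strict weak ordering");
            (void)reversed;
        }
        return result;
    }
};

// Returns how many times the underlying comparator ran for one comparison.
std::size_t calls_for(int x, int y) {
    std::size_t calls = 0;
    CheckedLess<CountingLess> less{CountingLess{&calls}};
    less(x, y);
    return calls;
}
```

With this wrapper, `calls_for(1, 2)` reports two invocations and `calls_for(2, 1)` reports one, so an algorithm's exact comparison bound can no longer be asserted.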

Differential Revision:
The file was modified libcxx/test/std/algorithms/alg.sorting/alg.merge/inplace_merge_comp.pass.cpp
The file was modified libcxx/docs/DesignDocs/DebugMode.rst
Commit 9571b8f238f97bce01bcf3c84a4f87cfb1c00dbf by arthur.j.odwyer
[libc++] [LIBCXX-DEBUG-FIXME] std::advance shouldn't use ADL `>=` on the _Distance type.

Convert to a primitive type first; then use primitive `>=` on that value.
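
A sketch of the idea, under stated assumptions: `Dist` and `advance_sketch` are hypothetical names invented for this example and do not reproduce libc++'s code. `Dist` converts to `int`, but a deleted `operator>=` found via ADL would be chosen if the distance were compared directly. Converting to the iterator's difference type first keeps the comparison on a primitive value.

```cpp
#include <cassert>
#include <iterator>

// Hypothetical distance type. It converts to int, but writing `n >= 0`
// with n of type Dist would pick the deleted overload below via ADL.
struct Dist {
    int value;
    operator int() const { return value; }
};
template <class T>
bool operator>=(const Dist&, const T&) = delete;

// Sketch of an advance that converts the distance before comparing it.
template <class Iter, class Distance>
void advance_sketch(Iter& it, Distance n) {
    typedef typename std::iterator_traits<Iter>::difference_type diff_t;
    diff_t k = static_cast<diff_t>(n);   // convert to a primitive type first
    if (k >= 0) {                        // built-in >= on the primitive value
        for (; k > 0; --k) ++it;
    } else {
        for (; k < 0; ++k) --it;
    }
}

// Advances a pointer into a small array by Dist{2} and returns the element.
int third_element() {
    int a[] = {10, 20, 30, 40};
    int* p = a;
    advance_sketch(p, Dist{2});
    return *p;
}
```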

Differential Revision:
The file was modified libcxx/include/iterator
The file was modified libcxx/test/std/iterators/iterator.primitives/iterator.operations/robust_against_adl.pass.cpp
Commit 9ea2db2c513534aa63acc087b8dc744c37119d02 by arthur.j.odwyer
[libc++] [LIBCXX-DEBUG-FIXME] Stop using invalid iterators to insert into sets/maps.

This simply applies Howard's commit 4c80bfbd53caf consistently
across all the associative and unordered container tests.

"unord.set/insert_hint_const_lvalue.pass.cpp" failed with `-D_LIBCPP_DEBUG=1`
before this patch; it was the only one that incorrectly reused
invalid iterator `e`. The others already used valid iterators
(generally `c.end()`); I'm just making them all match the same pattern
of usage: "e, then r, then c.end() for the rest."
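
The "e, then r, then c.end()" pattern can be sketched like this. The function name and the inserted values are made up for illustration and are not taken from the tests. The saved `e` is used as a hint exactly once, because an insertion into an unordered container may rehash and invalidate previously obtained iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Illustrative only: inserts with hints that are valid at each call.
std::size_t insert_with_valid_hints() {
    std::unordered_set<int> c;
    std::unordered_set<int>::const_iterator e = c.end();

    // 1. `e` was obtained before any insertion, so it is still valid here.
    std::unordered_set<int>::iterator r = c.insert(e, 3);
    assert(*r == 3);

    // 2. Reuse `r`, the iterator returned by the previous insert. The
    //    value is already present, so the set does not grow.
    r = c.insert(r, 3);
    assert(*r == 3);

    // 3. For the rest, ask for a fresh end() instead of reusing `e`.
    r = c.insert(c.end(), 4);
    r = c.insert(c.end(), 5);
    (void)r;

    return c.size();
}
```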

Differential Revision:
The file was modified libcxx/test/std/containers/unord/
The file was modified libcxx/test/std/containers/unord/unord.set/insert_hint_rvalue.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.multiset/insert_hint_const_lvalue.pass.cpp
The file was modified libcxx/test/std/containers/unord/unord.set/insert_hint_const_lvalue.pass.cpp
The file was modified libcxx/test/std/containers/unord/
The file was modified libcxx/test/std/containers/unord/unord.multimap/unord.multimap.modifiers/insert_hint_const_lvalue.pass.cpp
Commit 1d767b13bfad806bf584e0b054eb7d00a494591d by Vitaly Buka
[scudo] Align objects with alignas

Operator new must align allocations for types with large alignment.

Before C++17, the behavior was implementation-defined, and both clang and g++
before 11 ignored alignment. Misaligned objects mysteriously crashed
tests on Ubuntu 14.

The alternatives are to compile with -std=c++17 or -faligned-new, but they were
discarded as less portable.
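
One portable way to get correct alignment without relying on an aligned `operator new` is to reserve storage with `alignas` and construct the object in place. The sketch below shows that general technique only; `Wide` and `AlignedSlot` are hypothetical names, and this is not necessarily how the scudo tests were changed.

```cpp
#include <cassert>
#include <cstdint>
#include <new>

// Hypothetical over-aligned type standing in for a test object.
struct alignas(64) Wide {
    char bytes[64];
};

// Storage whose alignment is fixed by alignas, so it does not depend on
// what a plain `new T` does with over-aligned types before C++17.
template <class T>
struct AlignedSlot {
    alignas(T) unsigned char storage[sizeof(T)];
    T* construct() { return new (storage) T(); }   // placement new
};

// Constructs a Wide in aligned storage and checks the resulting address.
bool is_aligned() {
    static AlignedSlot<Wide> slot;
    Wide* w = slot.construct();
    bool ok = reinterpret_cast<std::uintptr_t>(w) % alignof(Wide) == 0;
    w->~Wide();
    return ok;
}
```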

Reviewed By: hctim

Differential Revision:
The file was modified compiler-rt/lib/scudo/standalone/tests/combined_test.cpp
The file was modified compiler-rt/lib/scudo/standalone/tests/primary_test.cpp
Commit 7abb56c78ba7bb9e2a91f61a65bb8feb69a92865 by rob.suderman
[mlir][tosa] Add tosa.depthwise lowering to existing linalg.depthwise_conv

Implements support for undilated depthwise convolution using the existing
depthwise convolution operation. Once convolutions migrate to YAML-defined
versions, we can rewrite this for a cleaner implementation.

Reviewed By: mravishankar

Differential Revision:
The file was modified mlir/test/Conversion/TosaToLinalg/tosa-to-linalg.mlir
The file was modified mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp
Commit 662a58fa0534508c2c37b22425bfdf16b9d985a8 by isuruf
[lld] Convert LLVM_CMAKE_PATH to a CMake path

Otherwise I get the following error on Windows:
CMake Error at D:/bld/lld_1569206597988/work/build/CMakeFiles/CMakeTmp/CMakeLists.txt:2 (set):
  Syntax error in cmake code at


  when parsing string


  Invalid character escape '\b'.

CMake Error at D:/bld/lld_1569206597988/_build_env/Library/share/cmake-3.15/Modules/CheckSymbolExists.cmake:100 (try_compile):
  Failed to configure test project build system.
Call Stack (most recent call first):
  D:/bld/lld_1569206597988/_build_env/Library/share/cmake-3.15/Modules/CheckSymbolExists.cmake:57 (__CHECK_SYMBOL_EXISTS_IMPL)
  D:/bld/lld_1569206597988/_h_env/Library/lib/cmake/llvm/HandleLLVMOptions.cmake:943 (check_symbol_exists)
  CMakeLists.txt:56 (include)

Reviewed By: sbc100

Differential Revision:
The file was modified lld/CMakeLists.txt
Commit 81fce29d6e1f0a83e8a4170c7f24cdd93869d55a by tlively
[WebAssembly] Add SIMD const_splat intrinsics

These intrinsics do not correspond to their own underlying instruction, but are
a convenience for the common case of materializing a constant vector that has
the same value in each lane.

Differential Revision:
The file was modified clang/test/Headers/wasm.c
The file was modified clang/lib/Headers/wasm_simd128.h
Commit 40147c33d17eca98d186628272a076a1bb3e6868 by lebedev.ri
[NFC][X86][Codegen] Add some tests for 64-bit shift by (32-x)
The file was added llvm/test/CodeGen/X86/64-bit-shift-by-32-minus-y.ll
Commit 1817dae1924144c19b9caec196f574c51d6d9957 by kparzysz
Preserve metadata on masked intrinsics in auto-upgrade

When auto-upgrade was replacing a call to a masked intrinsic, it would
not copy the metadata from the original call.

If an intrinsic had metadata, but did not need any updates, the metadata
would stay, but if an update was needed, the metadata would end up being removed.
A similar effect could be observed with masked_expandload and
masked_compressstore, which at the moment are not handled by auto-upgrade:
the metadata remained untouched.

Differential Revision:
The file was added llvm/test/Bitcode/upgrade-masked-keep-metadata.ll
The file was modified llvm/lib/IR/AutoUpgrade.cpp
Commit 78a7d8c4dd1076dccfde2c48fc924d8f5529f4d1 by georgakoudis1
[Utils][NFC] Rename replace-function-regex in update_cc_test_checks

This patch renames the replace-function-regex to replace-value-regex to indicate that the existing regex replacement functionality can replace any IR value besides functions.

Reviewed By: jdoerfert

Differential Revision:
The file was modified llvm/utils/
The file was modified clang/test/OpenMP/nvptx_lambda_capturing.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_parallel_codegen.cpp
The file was modified llvm/utils/
The file was modified clang/test/utils/update_cc_test_checks/Inputs/generated-funcs-regex.c.expected
The file was modified clang/test/OpenMP/nvptx_target_teams_codegen.cpp
The file was modified clang/test/utils/update_cc_test_checks/Inputs/generated-funcs-regex.c
The file was modified clang/test/OpenMP/nvptx_allocate_codegen.cpp
The file was modified clang/test/utils/update_cc_test_checks/generated-funcs-regex.test
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_parallel_num_threads_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
The file was modified clang/test/OpenMP/target_parallel_debug_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_data_sharing.cpp
The file was modified clang/test/OpenMP/nvptx_parallel_codegen.cpp
The file was modified llvm/utils/UpdateTestChecks/
The file was modified clang/test/OpenMP/target_parallel_for_debug_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
Commit a11489ae3e36063c64921439cbab89d1f3280f4a by mkitzan
[MachineCSE][NFC]: Refactor and comment on preventing CSE for isConvergent instrs

- Move the code preventing CSE of `isConvergent` instrs into
  `ProcessBlockCSE` (from `isProfitableToCSE`)
- Add comments explaining why `isConvergent` is used to prevent
  CSE of non-local instrs in MachineCSE and the new test
The file was modified llvm/lib/CodeGen/MachineCSE.cpp
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/no-cse-nonlocal-convergent-instrs.mir
Commit 0edc4bc84aa246ee1f156982e19a1b8b5fbecf4c by ezhulenev
[mlir] Add polynomial approximation for math::ExpM1

This approximation matches the one in Eigen.

name                      old cpu/op  new cpu/op  delta
BM_mlir_Expm1_f32/10      90.9ns ± 4%  52.2ns ± 4%  -42.60%    (p=0.000 n=74+87)
BM_mlir_Expm1_f32/100      837ns ± 3%   231ns ± 4%  -72.43%    (p=0.000 n=79+69)
BM_mlir_Expm1_f32/1k      8.43µs ± 3%  1.58µs ± 5%  -81.30%    (p=0.000 n=77+83)
BM_mlir_Expm1_f32/10k     83.8µs ± 3%  15.4µs ± 5%  -81.65%    (p=0.000 n=83+69)
BM_eigen_s_Expm1_f32/10   68.8ns ±17%  72.5ns ±14%   +5.40%  (p=0.000 n=118+115)
BM_eigen_s_Expm1_f32/100   694ns ±11%   717ns ± 2%   +3.34%   (p=0.000 n=120+75)
BM_eigen_s_Expm1_f32/1k   7.69µs ± 2%  7.97µs ±11%   +3.56%   (p=0.000 n=95+117)
BM_eigen_s_Expm1_f32/10k  88.0µs ± 1%  89.3µs ± 6%   +1.45%   (p=0.000 n=74+106)
BM_eigen_v_Expm1_f32/10   44.3ns ± 6%  45.0ns ± 8%   +1.45%   (p=0.018 n=81+111)
BM_eigen_v_Expm1_f32/100   351ns ± 1%   360ns ± 9%   +2.58%    (p=0.000 n=73+99)
BM_eigen_v_Expm1_f32/1k   3.31µs ± 1%  3.42µs ± 9%   +3.37%   (p=0.000 n=71+100)
BM_eigen_v_Expm1_f32/10k  33.7µs ± 8%  34.1µs ± 9%   +1.04%    (p=0.007 n=99+98)
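
For readers unfamiliar with why `expm1` needs special care, the sketch below shows one well-known way to compute it accurately from `exp` and `log`. Whether the patch or Eigen uses exactly this formulation is an assumption made here; the function name `expm1_sketch` is hypothetical, and the code uses the standard library rather than any polynomial from the patch.

```cpp
#include <cassert>
#include <cmath>

// Illustrative expm1 built on exp and log (an assumed formulation, not
// code from the patch). Computing exp(x) - 1 directly loses most of its
// significant digits when x is close to 0.
double expm1_sketch(double x) {
    const double u = std::exp(x);
    if (u == 1.0) return x;            // exp(x) rounded to 1: expm1(x) ~= x
    const double um1 = u - 1.0;
    if (um1 == -1.0) return -1.0;      // exp(x) is negligible next to 1
    if (std::isinf(u)) return u;       // exp(x) overflowed
    // Scaling by x / log(u) compensates for the rounding error in u.
    return um1 * (x / std::log(u));
}
```

A quick sanity check is to compare the result against `std::expm1` for a few inputs near zero and away from it.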

Reviewed By: ezhulenev

Differential Revision:
The file was modified mlir/lib/Dialect/Math/Transforms/PolynomialApproximation.cpp
The file was modified mlir/test/Dialect/Math/polynomial-approximation.mlir
The file was modified mlir/test/mlir-cpu-runner/math_polynomial_approx.mlir
Commit fa0b93b5a0866aad3ce517daab6cd91cc67823ad by Matthew.Arsenault
GlobalISel: Use DAG call lowering infrastructure in a more compatible way

Unfortunately the current call lowering code is built on top of the
legacy MVT/DAG based code. However, GlobalISel was not using it the
same way. In short, the DAG passes legalized types to the assignment
function, and GlobalISel was passing the original raw type if it was

I do believe the DAG lowering is conceptually broken since it requires
picking a type up front before knowing how/where the value will be
passed. This ends up being a problem for AArch64, which wants to pass
i1/i8/i16 values as a different size if passed on the stack or in

The argument type decision is split across 3 different places which is
hard to follow. SelectionDAG builder uses
getRegisterTypeForCallingConv to pick a legal type, tablegen gives the
illusion of controlling the type, and the target may have additional
hacks in the C++ part of the call lowering. AArch64 hacks around this
by not using the standard AnalyzeFormalArguments and special casing
i1/i8/i16 by looking at the underlying type of the original IR

I believe people have generally assumed the calling convention code is
processing the original types, and I've discovered a number of dead
paths in several targets.

x86 actually relies on the opposite behavior from AArch64, and relies
on x86_32 and x86_64 sharing calling convention code where the 64-bit
cases implicitly do not work on x86_32 due to using the pre-legalized

AMDGPU targets without legal i16/f16 have always used a broken ABI
that promotes to i32/f32. GlobalISel accidentally fixed this to be the
ABI we should have, but this fixes it so we're using the worse ABI
that is compatible with the DAG. Ideally we would fix the DAG to match
the old GlobalISel behavior, but I don't wish to fight that battle.

A new native GlobalISel call lowering framework should let the target
process the incoming types directly.

CCValAssigns select a "ValVT" and "LocVT" but the meanings of these
aren't entirely clear. Different targets don't use them consistently,
even within their own call lowering code. My current belief is the
intent was "ValVT" is supposed to be the legalized value type to use
in the end, and LocVT was supposed to be the ABI passed type
(which is also legalized).

With the default CCState::Analyze functions always passing the same
type for these arguments, these only differ when the TableGen part of
the lowering decides to promote the type from one legal type to
another. AArch64's i1/i8/i16 hack ends up inverting the meanings of
these values, so I had to add an additional hack to let the target
interpret how large the argument memory is.

Since targets don't consistently interpret ValVT and LocVT, this
doesn't produce quite equivalent code to the initial DAG
lowerings. I've opted to consistently interpret LocVT as the in-memory
size for stack passed values, and ValVT as the register type to assign
from that memory. We therefore produce extending loads directly out of
the IRTranslator, whereas the DAG would emit regular loads of smaller
values. This will also produce loads/stores that are wider than the
argument value if the allocated stack slot is larger (and there will
be undef padding bytes). If we had the optimizations to reduce
load/stores based on truncated values, this wouldn't produce a
different end result.

Since ValVT/LocVT are more consistently interpreted, we now will emit
more G_BITCASTs as requested by the CCAssignFn. For example, AArch64
was directly assigning types to some physical vector registers which,
according to the tablegen spec, should have been cast to a vector
with a different element type.

This also moves the responsibility for inserting
G_ASSERT_SEXT/G_ASSERT_ZEXT from the target ValueHandlers into the
generic code, which is closer to how SelectionDAGBuilder works.

I had to xfail an x86 test since I don't see a quick way to fix it
right now (I filed bug 50035 for this). It's broken independently of
this change, and only triggers since now we end up with more ands
which hit the improperly handled selection pattern.

I also observed that FP arguments that need promotion (e.g. f16 passed
as f32) are broken, and use regular G_TRUNC and G_ANYEXT.

TL;DR: the current call lowering infrastructure is bad and nobody has
ever understood how it chooses types.
The file was modified llvm/lib/Target/X86/X86CallLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/shl.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/fshl.ll
The file was modified llvm/test/CodeGen/ARM/GlobalISel/arm-unsupported.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f16.ll
The file was modified llvm/test/CodeGen/ARM/GlobalISel/arm-isel.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/fpow.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/xnor.ll
The file was modified llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/roundeven.ll
The file was modified llvm/test/CodeGen/ARM/GlobalISel/arm-param-lowering.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/shl-ext-reduce.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/arm64-callingconv.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/orn2.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-reductions.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/andn2.ll
The file was modified llvm/test/CodeGen/X86/GlobalISel/memop-scalar-x32.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/usubsat.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/ashr.ll
The file was modified llvm/lib/Target/AArch64/GISel/AArch64CallLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/ssubsat.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/fshr.ll
The file was modified llvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
The file was modified llvm/lib/Target/ARM/ARMCallLowering.cpp
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/call-translator.ll
The file was modified llvm/test/CodeGen/ARM/GlobalISel/arm-irtranslator.ll
The file was modified llvm/test/CodeGen/AArch64/GlobalISel/arm64-callingconv-ios.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/lshr.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-function-args.ll
The file was modified llvm/test/CodeGen/ARM/GlobalISel/arm-legalize-vfp4.mir
The file was modified llvm/test/CodeGen/X86/GlobalISel/ext.ll
The file was modified llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
The file was modified llvm/test/CodeGen/X86/GlobalISel/callingconv.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/fma.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/bswap.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/saddsat.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/uaddsat.ll
The file was modified llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/dummy-target.ll
The file was modified llvm/test/CodeGen/X86/GlobalISel/add-scalar.ll
Commit 23ae35e858da37c753b8efaac965046358ec3818 by Matthew.Arsenault
X86/GlobalISel: Use generic version of splitToValueTypes

The custom insert of an unmerge and the callback weirdness should be
unnecessary. Since handleAssignments should now use
getRegisterTypeForCallingConv as SelectionDAG builder would, this
should now just be able to use the generic code. X86-32 relies on the
generated CCAssignFns not seeing illegal types and sharing code with
x86_64, so i64 values would incorrectly be assigned to 64-bit
The file was modified llvm/lib/Target/X86/X86CallLowering.h
The file was modified llvm/test/CodeGen/X86/GlobalISel/irtranslator-callingconv.ll
The file was modified llvm/lib/Target/X86/X86CallLowering.cpp
Commit 8fc4eb9e732006b3b4f0b224c79ab097f3026f85 by Matthew.Arsenault
AMDGPU/GlobalISel: Remove unnecessary override

This is the same as the default implementation
The file was modified llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
Commit e723b511e6e951444d2a646a23fc2e9cf4faecd4 by Matthew.Arsenault
GlobalISel: Update documentation
The file was modified llvm/docs/GlobalISel/IRTranslator.rst
Commit e623ce6188d698422d4ead24065056d6a869e6f8 by kbobyrev
[clangd] Split CC and refs limit and increase refs limit to 1000

Related discussion:

Reviewed By: kadircet

Differential Revision:
The file was modified clang-tools-extra/clangd/ClangdLSPServer.cpp
The file was modified clang-tools-extra/clangd/ClangdLSPServer.h
The file was modified clang-tools-extra/clangd/tool/ClangdMain.cpp
Commit 909a5ccf3be7868b24320aaaf0e588b56ba6e3f3 by Stanislav.Mekhanoshin
[AMDGPU] Improve global SADDR selection

An address can be a uniform sum of two i64 values.
That regularly happens in a loop where the index is an induction
variable promoted to 64 bits by LSR. We can materialize
zero in a VGPR and still use the SADDR form of the load.

Differential Revision:
The file was modified llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
The file was modified llvm/test/CodeGen/AMDGPU/offset-split-global.ll
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-load-global-saddr.mir
The file was modified llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
The file was modified llvm/test/CodeGen/AMDGPU/global_atomics_i64.ll
The file was modified llvm/test/CodeGen/AMDGPU/global_atomics.ll
The file was modified llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
Commit 6bbfa0fd408e81055c360c2e059554dd76fd7f09 by richard
When performing template argument deduction to select a partial
specialization while substituting a partial template parameter pack,
don't try to extend the existing deduction.

This caused us to select the wrong partial specialization in some rare
cases. A recent change to libc++ caused this to happen in practice for
code using std::conjunction.
The file was modified clang/test/SemaTemplate/partial-spec-instantiate.cpp
The file was modified clang/lib/Sema/SemaTemplateDeduction.cpp
Commit 6e88539ab16de1cbe1b6b0a7f2922fd5e710cab9 by Matthew.Arsenault
ARM/GlobalISel: Don't store a MachineInstrBuilder reference

This is basically a pointer anyway
The file was modified llvm/lib/Target/ARM/ARMCallLowering.cpp
Commit ef5f0adecd02d92cbb1a713ac7316f6768269412 by Matthew.Arsenault
AMDGPU: Add a few more tail call tests

Add some cases I noticed were missing when porting to GlobalISel. The
cases that required any argument splitting did not work at first.
The file was modified llvm/test/CodeGen/AMDGPU/sibling-call.ll
Commit ceccfaae140d2a067d9023a9a3ca71efc86f9e2d by thakis
[gn build] (semi-manually) port 0b10bb7ddd3c
The file was modified llvm/utils/gn/secondary/libcxx/src/
Commit 23233ad139f4b69ea4ed1cdbd22abc72c7a4cb93 by vyng
[lld-macho] Check simulator platforms to avoid issuing false positive errors.

Currently the linker causes unnecessary errors when either the target or the config's platform is a simulator.

Differential Revision:
The file was modified lld/MachO/InputFiles.cpp
The file was modified lld/test/MachO/invalid/incompatible-arch.s
Commit c5cf4b8f11cd641560b0cd6e106765721688e74a by
[lldb] Handle missing SBStructuredData copy assignment cases

Fix cases that can crash `SBStructuredData::operator=`.

This happened in a case where `rhs` had a null `SBStructuredDataImpl`.
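
The general shape of a copy assignment that tolerates a null implementation pointer can be sketched as below. `Impl` and `Handle` are hypothetical types invented for this example; they do not mirror the real `SBStructuredData` or `SBStructuredDataImpl` classes.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical implementation object behind the handle.
struct Impl {
    std::string payload;
};

// Hypothetical handle whose implementation pointer may be null.
class Handle {
public:
    Handle() = default;   // starts with no implementation
    explicit Handle(std::string s) : impl_(new Impl{std::move(s)}) {}
    Handle(const Handle& rhs)
        : impl_(rhs.impl_ ? new Impl(*rhs.impl_) : nullptr) {}

    Handle& operator=(const Handle& rhs) {
        if (this != &rhs) {
            if (rhs.impl_)
                impl_.reset(new Impl(*rhs.impl_));   // deep copy
            else
                impl_.reset();   // rhs is empty: become empty, no dereference
        }
        return *this;
    }

    bool valid() const { return impl_ != nullptr; }
    std::string payload() const { return impl_ ? impl_->payload : std::string(); }

private:
    std::unique_ptr<Impl> impl_;
};
```

Assigning an empty `Handle` to a populated one leaves the target empty instead of dereferencing a null pointer.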

Differential Revision:
The file was modified lldb/source/API/SBStructuredData.cpp
The file was added lldb/unittests/API/SBStructuredDataTest.cpp
The file was modified lldb/unittests/API/CMakeLists.txt
Commit ea3777fe2201fac29bfd5450a35f628b2a294306 by thakis
[gn build] (semi-manually) port 0b10bb7ddd3c more
The file was modified llvm/utils/gn/secondary/libcxx/include/
Commit 7a41639c60ab1bd3712302e2588d5c7d6d8b57dc by Vang.Thao
[AMDGPU][GlobalISel] Widen 1 and 2 byte scalar loads

Widen 1 and 2 byte scalar loads to 4 bytes when sufficiently
aligned to avoid using a global load.

Reviewed By: arsenm

Differential Revision:
The file was modified llvm/test/CodeGen/AMDGPU/GlobalISel/frem.ll
The file was modified llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp
The file was added llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-widen-scalar-loads.mir
The file was added llvm/test/CodeGen/AMDGPU/GlobalISel/widen-i8-i16-scalar-loads.ll
Commit 9d3dbcd24c7d61e83fe5b3c4e7b7e4ecf9d70cd7 by phosek
[Driver] Move -print-runtime-dir and -print-resource-dir tests

Put these into separate files to match other -print-* options tests.

Differential Revision:
The file was added clang/test/Driver/print-resource-dir.c
The file was modified clang/test/Driver/immediate-options.c
The file was added clang/test/Driver/print-runtime-dir.c
Commit 7b0756a51a750f92d07ef82d02380c262e8e3803 by i
[AArch64] Fix some coding standard issues related to namespace llvm
The file was modified llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFObjectWriter.cpp
The file was modified llvm/lib/Target/AArch64/AArch64FastISel.cpp
The file was modified llvm/lib/Target/AArch64/MCTargetDesc/AArch64WinCOFFStreamer.cpp
The file was modified llvm/lib/Target/AArch64/AArch64StackTaggingPreRA.cpp
The file was modified llvm/lib/Target/AArch64/AArch64ExpandImm.cpp
The file was modified llvm/lib/Target/AArch64/AArch64MacroFusion.cpp
The file was modified llvm/lib/Target/AArch64/MCTargetDesc/AArch64TargetStreamer.cpp
The file was modified llvm/lib/Target/AArch64/MCTargetDesc/AArch64ELFStreamer.cpp
Commit b6060b76731da36e14ef96c789b79e3b23672973 by ravishankarm
[mlir][Linalg] Fix element type of results when folding reshapes.

Fix a minor bug which led to the element type of the output being
modified when folding reshapes with a generic op.

Differential Revision:
The file was modified mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp
The file was modified mlir/test/Dialect/Linalg/fusion-push-reshape.mlir
Commit b6d244e5b8ab62d04ae9b1cf7463b78ecbb2f989 by Matthew.Arsenault
AMDGPU: Fix lit test
The file was modified llvm/test/CodeGen/AMDGPU/sibling-call.ll
Commit 7ac3fcc526ceb36da9ed41f27f686709a5554af8 by rnk
Allow /STACK in #pragma comment(linker, ...)

The Halide project uses `#pragma comment(linker, "/STACK:...")` to set
the stack size high enough for our embedded compiler to run in end-user
programs on Windows.

Unfortunately, lld-link.exe breaks on this when embedded in a COFF
object, despite supporting the flag on the command line. MSVC's link.exe
supports this fine. This patch extends support for this to lld-link.exe
for better compatibility with MSVC projects.

Differential Revision:
The file was modified lld/COFF/Driver.cpp
The file was added lld/test/COFF/stack-drectve.s
Commit 6251b2f7f697f9378f4f0dbb284eea9cbe286728 by kparzysz
Attach metadata to simplified masked loads and stores
The file was modified llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
The file was added llvm/test/Transforms/InstCombine/masked_intrinsics_keep_metadata.ll
Commit 4b2d7ef3ea81d0d6746e609b46f38bfceff23838 by ravishankarm
[mlir][Linalg] Fix test to use new reshape op form.

Differential Revision:
The file was modified mlir/test/Dialect/Linalg/fusion-push-reshape.mlir
Commit 41f8b8e8075bfb80037390ff033558565f656007 by VenkataRamanaiah.Nalamothu
[MCAsmInfo] Support UsesCFIForDebug for targets with no exception handling

This change enables emitting CFI unwind information for debugging purposes
for targets with MCAsmInfo::ExceptionsType == ExceptionHandling::None.

Currently generating CFI unwind information is entangled with supporting
the exceptions, even when AsmPrinter explicitly recognizes that the unwind
tables are being generated as debug information.

In fact, the unwind information is not generated even if we specify
--force-dwarf-frame-section, unless exceptions are enabled. The LIT test
llvm/test/CodeGen/AMDGPU/debug_frame.ll demonstrates this behavior.

Enable this option for AMDGPU to prepare for future patches which add
complete CFI support.

Reviewed By: dblaikie, MaskRay

Differential Revision:
The file was modified llvm/include/llvm/MC/MCAsmInfo.h
The file was modified llvm/test/CodeGen/AMDGPU/debug_frame.ll
The file was added llvm/test/MC/ELF/AMDGPU/lit.local.cfg
The file was modified llvm/test/CodeGen/AMDGPU/ptr-arg-dbg-value.ll
The file was added llvm/test/MC/ELF/AMDGPU/cfi.s
The file was modified llvm/test/CodeGen/AMDGPU/split-arg-dbg-value.ll
The file was modified llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
The file was modified llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCAsmInfo.cpp
The file was added llvm/test/DebugInfo/AMDGPU/cfi.ll
The file was modified llvm/include/llvm/CodeGen/AsmPrinter.h
The file was modified llvm/lib/CodeGen/AsmPrinter/DwarfCFIException.cpp
Commit d738ac6e12ac90f0254febb45f7b79d2dc5357e8 by i
[AArch64] Deleted unused AsmBackend functions
The file was modified llvm/lib/Target/AArch64/MCTargetDesc/AArch64AsmBackend.cpp