Changes

Summary

  1. Revert "[sanitizer] Don't tie builders with particular workers" (details)
Commit b899cd8edcb824c4e4f999ef254209060d1ab646 by Vitaly Buka
Revert "[sanitizer] Don't tie builders with particular workers"

This reverts commit d37259ec73a4341700e981214b9032631adfdda0.
With some changes.
The file was modified buildbot/osuosl/master/config/builders.py

Summary

  1. sanitizer_common: optimize memory drain (details)
  2. [NFC] Do not track calls to inlined intrinsics in IFI. (details)
  3. [sanitizer_common] Define internal_usleep on Solaris (details)
  4. [remangleIntrinsicFunction] Detect and resolve name clash (details)
  5. [RISCV] Pass undef VECTOR_SHUFFLE indices on to BUILD_VECTOR (details)
  6. [libc] update benchmark distributions (details)
  7. AArch64: use 4-byte slots for arm64_32 pointers in a tail call (details)
  8. [OpenCL] Add support of __opencl_c_generic_address_space feature macro (details)
  9. [AMDGPU] Mark waterfall loops as SI_WATERFALL_LOOP (details)
  10. [AMDGPU] Optimize VGPR LiveRange in waterfall loops (details)
  11. [mlir][Linalg] Add layout specification support to bufferization. (details)
  12. Support: reduce stack used in default size test. (details)
  13. [X86][SSE] Add signbit tests to show cmpss/cmpsd intrinsics not recognised as 'allbits' results. (details)
  14. [mlir][Linalg] Better support for bufferizing non-tensor results. (details)
  15. [lldb] Fix editline unicode on Linux (details)
  16. [libomptarget][devicertl] Remove branches around setting parallelLevel (details)
  17. [AMDGPU] Handle s_branch to another section. (details)
  18. [libomptarget] Update device pointer only if needed (details)
  19. [MLIR] Fix documentation of the `ExecutionEngine` in the toy tutorial example (details)
  20. [X86][SSE] X86ISD::FSETCC nodes (cmpss/cmpsd) return a 0/-1 allbits signbits result (REAPPLIED) (details)
  21. [libomp] ompd_init(): fix heap-buffer-overflow when constructing libompd.so path (details)
  22. [OpenCL] Add support of __opencl_c_read_write_images feature macro (details)
  23. [InstCombine] Pre-commit ashr(or(neg(x),x),bw-1) --> sext(icmp_ne(x,0)) tests from D105764 (details)
  24. [clang/objc] Optimize getters for non-atomic, copied properties (details)
  25. [InstCombine] Fold lshr/ashr(or(neg(x),x),bw-1) --> zext/sext(icmp_ne(x,0)) (PR50816) (details)
  26. [NFC] Add paranthesis around logical expression to silence -Wlogical-op-parentheses warning. (details)
  27. [OpenMP] Minor improvement in task allocation (details)
  28. [libc++] Generate ABI list for macOS arm64 (details)
  29. [libc++] Target x86_64 only for the backdeployment jobs (details)
  30. [libc++] Workaround non-constexpr std::exchange pre C++20 (details)
  31. Mips: Mark special case calling convention handling as custom (details)
  32. Mips/GlobalISel: Use more standard call lowering infrastructure (details)
  33. GlobalISel: Remove getIntrinsicID utility function (details)
  34. GlobalISel: Use extension instead of merge with undef in common case (details)
  35. AMDGPU: Promote signext/zeroext i16 shader returns (details)
  36. Prepare Compiler-RT for GnuInstallDirs, matching libcxx, document all (details)
  37. [mlir] Handle unused variable when assertions are disabled. (details)
  38. [OpenCL] Add verbosity when checking support of read_write images (details)
  39. [PowerPC][NFC] Power ISA features for Semachecking (details)
  40. [InstCombine] Regenerate select-gep.ll tests (details)
  41. [InstCombine] Add basic (select C, (gep Ptr, Idx), Ptr) tests from PR50183 (details)
  42. [PowerPC] Fix L[D|W]ARX Implementation (details)
  43. [mlir][memref] adjust integration tests to new lowering passes (details)
  44. [llvm] Add enum iteration to Sequence (details)
  45. [RISCV] Support machine constraint "S" (details)
  46. [mlir][Linalg] Properly specify Linalg attribute. (details)
  47. Avoid triggering assert when program calls OSAtomicCompareAndSwapLong (details)
  48. [OpaquePtr] Use AllocaInst::getAllocatedType() (details)
  49. [OpaquePtr] Use GlobalValue::getValueType() more (details)
  50. [OpaquePtr] Get load/store type without PointerType::getElementType() (details)
  51. [OpaquePtr] Use byval type more (details)
  52. Revert "[llvm] Add enum iteration to Sequence" (details)
  53. [RISCV] Prevent use of t0(aka x5) as rs1 for jalr instructions. (details)
  54. [mlir] Add support for tensor.extract to comprehensive bufferization (details)
  55. [NFC] Inline variable to prevent unused variable warning (details)
  56. [ScalarEvolution] Fix overflow when computing max trip counts (details)
  57. AST: correct name decoration for swift async functions on Windows (details)
  58. [llvm-mca] [NFC] Formatting code (details)
  59. [clang] C++98 implicit moves are back with a vengeance (details)
  60. [OpenMP][NFC] Change comment style to eliminate warnings from GCC (details)
  61. [PowerPC] Add FI alignment check if the addressing mode is DS/DQ-Form, emit X-Form if necessary. (details)
  62. [RISCV] Use DIVUW/REMUW/DIVW instructions for i8/i16/i32 udiv/urem/sdiv when LHS is constant. (details)
  63. [libc++] NFC: Add comment for running macOS CI setup script remotely (details)
  64. [OpenMP] Fix one sign-compare warning from GCC (details)
  65. Fix utils/update_cc_test_checks/check-globals.test on stand-alone builds (details)
  66. [libc++] Add a CI job for macOS on arm64 hardware 🥳 (details)
  67. [sanitizer] Fix VSNPrintf %V on Windows (details)
  68. [analyzer][solver][NFC] Introduce ConstraintAssignor (details)
  69. [analyzer][solver][NFC] Refactor how we detect (dis)equalities (details)
  70. [NFC][sanitizer] Remove trailing whitespace (details)
  71. [test] Add a SCEV backedge computation test with an explicit zero stride (details)
  72. [CUDA] Only allow NVIDIA offload-arch during CUDA compilation. (details)
  73. [AArch64][GlobalISel] Legalize store <2 x i16> (details)
  74. [AArch64][GlobalISel] Legalize load <2 x i16> (details)
  75. Revert "[PowerPC][NFC] Power ISA features for Semachecking" (details)
  76. [PowerPC][NFC] Power ISA features for Semachecking (details)
  77. [WebAssembly] Generate checks for simd-load-store-alignment.ll (details)
  78. [InstCombine] Precommit tests for D105088 (NFC) (details)
  79. [Tests] Fix test broken by: 43c7ca8e4963 [AArch64][GlobalISel] Legalize store <2 x i16> (details)
  80. [SCEV] Strengthen inference of RHS > Start in howManyLessThans (details)
  81. Fix cuda-bad-arch.cu test. (details)
  82. [tests] Precommit a test case from D105216 (details)
  83. [gn build] (manually) port 303ddb60a2d2 (details)
  84. [LoopReroll] Add an extra defensive check to avoid SCEV assertion. (details)
  85. [NFC] Use CHECK-LABEL in trip-count-unknown-stride.ll (details)
  86. [mlir][Vector] Remove Vector TupleOp as it is unused (details)
  87. [lld][AMDGPU] Handle R_AMDGPU_REL16 relocation. (details)
  88. [libc++] [test] Add a missing `()` in TestEachIntegralType. (details)
  89. Revert "sanitizer_common: optimize memory drain" (details)
  90. [NFC][sanitizer] clang-format part of D105778 (details)
  91. [docs/llvm-cov] Document -compilation-dir (details)
  92. [libc] Add on float properties for precision floating point numbers in FloatProperties.h (details)
  93. [NFC][sanitizer] Move MemoryMapper out of SizeClassAllocator64 (details)
  94. [libcxx] [docs] Acknowledge that the library is known to work in some configs outside of what's tested in CI (details)
  95. [SCEV] Handle zero stride correctly in howManyLessThans (details)
  96. [sanitizer] Few more NFC changes from D105778 (details)
  97. [libc] Don't pass -fpie/-ffreestanding on Windows (details)
  98. [libc] Capture floating point encoding and arrange it sequentially in memory (details)
  99. [LLD] Adding support for RELA for CG Profile. (details)
  100. [WebAssembly] Run varargs codegen test with non-emscripten triple (details)
  101. Add more types to the LLVM dialect C API (details)
  102. [mlir][sparse] add support for std unary operations (details)
  103. [mlir][Tensor] Implement `reifyReturnTypeShapesPerResultDim` for `tensor.insert_slice`. (details)
  104. [PowerPC] Add PowerPC compare and multiply related builtins and instrinsics for XL compatibility (details)
  105. [NFC][MLIR][std] Clean up ArithmeticCastOps (details)
  106. [NFC][sanitizer] Rename some MemoryMapper members (details)
  107. [NFC][sanitizer] Exctract DrainHalfMax (details)
  108. [ScalarEvolution] Make isKnownNonZero handle more cases. (details)
  109. RegAlloc: Allow targets to split register allocation (details)
  110. [NFC][sanitizer] Don't store region_base_ in MemoryMapper (details)
  111. [NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch (details)
  112. sanitizer_common: optimize memory drain (details)
  113. AMDGPU: Try to fix test failure with EXPENSIVE_CHECKS (details)
  114. [NFC][sanitizer] Move MemoryMapper template parameter (details)
  115. [NFC][sanitizer] Simplify MapPackedCounterArrayBuffer (details)
  116. [AArch64][GlobalISel] Mark v2s64 -> v2p0 G_INTTOPTR as legal (details)
  117. Revert "[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer" (details)
  118. Revert "[SCEV] Handle zero stride correctly in howManyLessThans" (details)
  119. [AArch64] Fix AArch64::dsub's size (details)
  120. [AArch64] rm unused subreg's (details)
  121. [NFC][sanitizer] Simplify MapPackedCounterArrayBuffer (details)
  122. [CSSPGO][llvm-profgen] Allow multiple executable load segments. (details)
  123. [CSSPGO] Do not import pseudo probe desc in thinLTO (details)
  124. [NFC][CSSPGO] Rename the name of an enum value. (details)
  125. Fix test trying to write a spurious output file into the source (details)
  126. [SCEV] Handle zero stride correctly in howManyLessThans (details)
  127. [AbstractAttributor] Fold function calls to `__kmpc_is_spmd_exec_mode` if possible (details)
  128. [CSSPGO][llvm-profgen] Fix a missing initalization (details)
  129. Revert "[CSSPGO][llvm-profgen] Fix a missing initalization" (details)
  130. [CSSPGO][llvm-profgen] Fix a missing initalization (details)
  131. [AIX] Update testcase to use aix triple (details)
  132. [Polly] Fix typo. NFC. (details)
  133. [sanitizer] Convert script to python 3 (details)
  134. Revert "[clang] Refactor AST printing tests to share more infrastructure" (details)
  135. [sanitizer] Upgrade android scripts to python 3 (details)
  136. [sanitizer] Fix type error in python 3 (details)
  137. [RISCV] Support overloading for RVV miscellaneous functions. (details)
  138. Reformulate OrcJIT tutorial doc to make it more clear. (details)
  139. sanitizer_common: add simpler ThreadRegistry ctor (details)
  140. [libcxx] [test] Clarify weak_ptr_ret on Windows, remove a LIBCXX-WINDOWS-FIXME (details)
  141. [RISCV] Pass -u to linker correctly. (details)
Commit d89d3dfae17d7795dc1ef013db66272020de1959 by dvyukov
sanitizer_common: optimize memory drain

Currently we allocate MemoryMapper per size class.
MemoryMapper mmap's and munmap's internal buffer.
This results in 50 mmap/munmap calls under the global
allocator mutex. Reuse MemoryMapper and the buffer
for all size classes. This radically reduces the number of
mmap/munmap calls. Smaller size classes tend to have
more objects allocated, so it's highly likely that
the buffer allocated for the first size class will
be enough for all subsequent size classes.
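
The reuse idea can be illustrated with a minimal sketch. This is not the sanitizer_common code: `ScratchMapper` and `drainAllClasses` are hypothetical names, and a `std::vector` stands in for the mmap'ed internal buffer so that the number of simulated mappings can be counted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MemoryMapper: owns one scratch buffer and counts
// how often it had to be (re)mapped. std::vector replaces mmap/munmap here.
class ScratchMapper {
public:
  // Returns a zeroed buffer of at least `size` bytes, growing only if needed.
  unsigned char *map(std::size_t size) {
    assert(size > 0);
    if (size > buffer_.size()) {
      buffer_.assign(size, 0); // models a fresh mmap
      ++mapCalls_;
    } else {
      std::fill(buffer_.begin(), buffer_.begin() + size, 0); // reuse in place
    }
    return buffer_.data();
  }
  std::size_t mapCalls() const { return mapCalls_; }

private:
  std::vector<unsigned char> buffer_;
  std::size_t mapCalls_ = 0;
};

// Drains every size class with one shared mapper and returns how many
// simulated mappings were needed in total.
std::size_t drainAllClasses(const std::vector<std::size_t> &bufferSizePerClass) {
  ScratchMapper mapper;
  for (std::size_t size : bufferSizePerClass)
    mapper.map(size);
  return mapper.mapCalls();
}
```

When the first size class needs the largest buffer, as the message expects, every later class reuses it and only one mapping is made.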

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105778
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
The file was modified compiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h
Commit 1d8030053d46b89e3677986d059065c6a2e7a2e1 by jeroen.dobbelaere
[NFC] Do not track calls to inlined intrinsics in IFI.

Just like intrinsics are not tracked for IFI.InlinedCalls, they should not be tracked for IFI.InlinedCallSites.

In the current top-of-tree this change is an NFC, but the full restrict patches (D68484) can potentially trigger a read-after-free
if intrinsics are also added to InlinedCallSites, because a late optimization may remove some of the inlined intrinsics.

Also see https://lists.llvm.org/pipermail/llvm-dev/2021-July/151722.html for a discussion about the problem.

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D105805
The file was modified llvm/lib/Transforms/Utils/InlineFunction.cpp
The file was modified llvm/lib/Transforms/IPO/Inliner.cpp
Commit 45430983ef8235f2a018e7daa10a0ad71ef7b85c by ro
[sanitizer_common] Define internal_usleep on Solaris

The Solaris/amd64 buildbot
<https://lab.llvm.org/staging/#/builders/101/builds/2845> has recently been
broken several times; at least one of those breakages remains unfixed:

  [63/446] Generating Sanitizer-x86_64-Test
  [...]
  Undefined first referenced
   symbol      in file
  _ZN11__sanitizer15internal_usleepEy /opt/llvm-buildbot/home/solaris11-amd64/clang-solaris11-amd64/stage1/projects/compiler-rt/lib/sanitizer_common/tests/libRTSanitizerCommon.test.x86_64.a(sanitizer_common.cpp.o)
  ld: fatal: symbol referencing errors

This patch fixes it by defining the missing `internal_usleep`.

Tested on `amd64-pc-solaris2.11`.

Differential Revision: https://reviews.llvm.org/D105878
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_solaris.cpp
Commit 90a6bb30fafa4e68d4af1fef62987fe187fa70ab by jeroen.dobbelaere
[remangleIntrinsicFunction] Detect and resolve name clash

It is possible that the remangled name for an intrinsic already exists with a different (and wrong) prototype within the module.
As the bitcode reader keeps both versions of all remangled intrinsics around for a longer time, this can result in a
crash, as can be seen in https://bugs.llvm.org/show_bug.cgi?id=50923

This patch makes 'remangleIntrinsicFunction' aware of this situation. When it is detected, it moves the version with the wrong prototype to a different name. That version will be removed anyway once the module is completely loaded.
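
The resolution strategy can be sketched on a toy model. Everything here is an assumption made for illustration: `ToyModule` is a plain name-to-prototype map, `remangle` is a hypothetical helper, and the `.renamed` suffix is arbitrary; none of it is the LLVM API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy module: maps a function name to its prototype, written as a string.
using ToyModule = std::map<std::string, std::string>;

// Renames `oldName` to `newName`. If `newName` is already taken by a function
// with a different prototype, that function is first parked under a fresh,
// unused name so that the correctly typed function can take `newName`.
void remangle(ToyModule &m, const std::string &oldName,
              const std::string &newName) {
  auto it = m.find(oldName);
  assert(it != m.end() && "function to remangle must exist");
  std::string proto = it->second;
  m.erase(it);

  auto clash = m.find(newName);
  if (clash != m.end() && clash->second != proto) {
    std::string wrongProto = clash->second;
    m.erase(clash);
    unsigned n = 0;
    std::string parked;
    do {
      parked = newName + ".renamed" + std::to_string(n++);
    } while (m.count(parked));
    m[parked] = wrongProto; // kept around until the module is fully loaded
  }
  m[newName] = proto;
}
```

In this model the parked entry simply stays in the map; in the commit's description it is dropped once the module is completely loaded.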

With thanks to @asbirlea for reporting this issue when trying out an lto build with the full restrict patches, and @efriedma for suggesting a sane resolution mechanism.

Reviewed By: apilipenko

Differential Revision: https://reviews.llvm.org/D105118
The file was added llvm/test/tools/llvm-link/remangle.ll
The file was added llvm/test/tools/llvm-link/Inputs/remangle2.ll
The file was modified llvm/include/llvm/IR/Intrinsics.h
The file was modified llvm/lib/IR/Function.cpp
The file was added llvm/test/Assembler/remangle.ll
The file was added llvm/test/tools/llvm-link/Inputs/remangle1.ll
Commit d991b7212b4c852c29b03d6d9aec40a6e819be95 by fraser
[RISCV] Pass undef VECTOR_SHUFFLE indices on to BUILD_VECTOR

Often when lowering vector shuffles, we split the shuffle into two
LHS/RHS shuffles which are then blended together. To do so we split the
original indices into two, indexed into each respective vector. These
two index vectors are then separately lowered as BUILD_VECTORs.

This patch forwards on any undef indices to the BUILD_VECTOR, rather
than having the VECTOR_SHUFFLE lowering decide on an optimal concrete
index. The motivation for this change is so that we don't duplicate
optimization logic between the two lowering methods and let BUILD_VECTOR
do what it does best.

Propagating undef in this way allows us, for example, to generate
`vid.v` to produce the LHS indices of commonly-used interleave-type
shuffles. I have designs on further optimizing interleave-type and other
common shuffle patterns in the near future.
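
The index split described above can be modelled in a few lines. This is a simplified sketch, not the code in RISCVISelLowering.cpp (which builds SelectionDAG nodes): `splitShuffleMask` and `kUndef` are hypothetical names, and the mask is a plain vector of integers with -1 standing for an undef index.

```cpp
#include <cassert>
#include <utility>
#include <vector>

constexpr int kUndef = -1; // models an undef shuffle index in this sketch

// Splits a two-source shuffle mask (entries in [0, 2*numElts), or kUndef)
// into one index vector per source. Lanes that do not read from a given
// source stay undef instead of being assigned a concrete index.
std::pair<std::vector<int>, std::vector<int>>
splitShuffleMask(const std::vector<int> &mask, int numElts) {
  assert(numElts > 0);
  std::vector<int> lhs, rhs;
  for (int m : mask) {
    assert(m >= kUndef && m < 2 * numElts);
    lhs.push_back(m >= 0 && m < numElts ? m : kUndef);
    rhs.push_back(m >= numElts ? m - numElts : kUndef);
  }
  return {lhs, rhs};
}
```

For an interleave-style mask such as `{0, 4, 1, 5}` over 4-element vectors, the LHS indices come out as `{0, undef, 1, undef}`, which leaves the consumer of the index vector free to pick whatever values are cheapest for the undef lanes.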

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D104789
The file was modified llvm/test/CodeGen/RISCV/rvv/fixed-vectors-int-shuffles.ll
The file was modified llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-buildvec.ll
The file was modified llvm/lib/Target/RISCV/RISCVISelLowering.cpp
The file was modified llvm/test/CodeGen/RISCV/rvv/interleave-crash.ll
The file was added llvm/test/CodeGen/RISCV/rvv/common-shuffle-patterns.ll
The file was modified llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-shuffles.ll
Commit 8724a7ec1131d2d550cabab36784a30c6a97c852 by gchatelet
[libc] update benchmark distributions

All distributions (except D) have been updated using 7 days' worth of data.
Distributions are smoother.
This patch also moves the data from a header file into individual CSV files,
which is easier on the editor and simplifies exporting and plotting the data.

Differential Revision: https://reviews.llvm.org/D105766
The file was added libc/benchmarks/distributions/MemcmpGoogleL.csv
The file was added libc/benchmarks/distributions/MemcmpGoogleA.csv
The file was added libc/benchmarks/distributions/MemcpyGoogleA.csv
The file was added libc/benchmarks/distributions/Uniform384To4096.csv
The file was added libc/benchmarks/distributions/MemcpyGoogleD.csv
The file was added libc/benchmarks/distributions/MemcmpGoogleQ.csv
The file was added libc/benchmarks/distributions/MemsetGoogleB.csv
The file was added libc/benchmarks/distributions/MemsetGoogleU.csv
The file was added libc/benchmarks/distributions/MemcpyGoogleM.csv
The file was added libc/benchmarks/distributions/MemsetGoogleW.csv
The file was added libc/benchmarks/distributions/MemcpyGoogleL.csv
The file was added libc/benchmarks/distributions/MemcmpGoogleW.csv
The file was added libc/benchmarks/distributions/MemsetGoogleM.csv
The file was added libc/benchmarks/distributions/MemcpyGoogleB.csv
The file was added libc/benchmarks/distributions/MemsetGoogleS.csv
The file was added libc/benchmarks/distributions/MemsetGoogleA.csv
The file was added libc/benchmarks/distributions/MemcmpGoogleD.csv
The file was added libc/benchmarks/distributions/MemsetGoogleQ.csv
The file was added libc/benchmarks/distributions/MemcmpGoogleB.csv
The file was added libc/benchmarks/distributions/README.md
The file was added libc/benchmarks/distributions/MemcpyGoogleQ.csv
The file was added libc/benchmarks/distributions/MemcmpGoogleU.csv
The file was added libc/benchmarks/distributions/MemcpyGoogleW.csv
The file was added libc/benchmarks/distributions/MemcmpGoogleM.csv
The file was added libc/benchmarks/distributions/MemcmpGoogleS.csv
The file was added libc/benchmarks/distributions/MemsetGoogleL.csv
The file was modified libc/benchmarks/MemorySizeDistributions.cpp
The file was added libc/benchmarks/distributions/MemsetGoogleD.csv
The file was added libc/benchmarks/distributions/MemcpyGoogleS.csv
The file was added libc/benchmarks/distributions/MemcpyGoogleU.csv
Commit 7802f62b3f2c18aa689e315460539736e1c81974 by Tim Northover
AArch64: use 4-byte slots for arm64_32 pointers in a tail call
The file was added llvm/test/CodeGen/AArch64/swifttail-arm64_32.ll
The file was modified llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
Commit 78463ebde2f8a1b8ce984c1ae7c6da0c2d323005 by anton.zabaznov
[OpenCL] Add support of __opencl_c_generic_address_space feature macro

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D103401
The file was modified clang/test/CodeGenOpenCL/address-spaces-mangling.cl
The file was modified clang/test/SemaOpenCL/address-spaces-conversions-cl2.0.cl
The file was modified clang/test/SemaOpenCL/address-spaces.cl
The file was modified clang/lib/Basic/TargetInfo.cpp
The file was modified clang/test/CodeGenOpenCL/address-spaces.cl
The file was modified clang/test/CodeGenOpenCL/amdgpu-sizeof-alignof.cl
The file was modified clang/test/CodeGenOpenCL/address-spaces-conversions.cl
The file was modified clang/lib/Parse/ParseDecl.cpp
The file was modified clang/test/CodeGenOpenCL/overload.cl
Commit 9d72c0ad43e720ef2394a23a2f4c58f79d753f03 by sebastian.neubauer
[AMDGPU] Mark waterfall loops as SI_WATERFALL_LOOP

This way, they can be detected later, e.g. by the
SIOptimizeVGPRLiveRange pass.

Differential Revision: https://reviews.llvm.org/D105467
The file was modified llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
The file was modified llvm/lib/Target/AMDGPU/SIInstructions.td
The file was modified llvm/lib/Target/AMDGPU/SILowerControlFlow.cpp
The file was modified llvm/test/CodeGen/AMDGPU/mubuf-legalize-operands.mir
Commit ad2c66ec5d4bb0425625155bba966732ef85e6e5 by sebastian.neubauer
[AMDGPU] Optimize VGPR LiveRange in waterfall loops

The loops are run exactly once per lane, so VGPRs do not need to be
saved. Use the SIOptimizeVGPRLiveRange pass to add phi nodes that take
undef when coming from the loop.

There is still a shortcoming:
Return values from a function call in the loop are copied because their
live range conflicts with the live range of arguments, even if arguments
are only IMPLICIT_DEF after the phi insertion.

Differential Revision: https://reviews.llvm.org/D105192
The file was modified llvm/test/CodeGen/AMDGPU/vgpr-descriptor-waterfall-loop-idom-update.ll
The file was modified llvm/test/CodeGen/AMDGPU/indirect-call.ll
The file was modified llvm/test/CodeGen/AMDGPU/image-sample-waterfall.ll
The file was modified llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.buffer.load.format.v3f16.ll
The file was modified llvm/lib/Target/AMDGPU/SIOptimizeVGPRLiveRange.cpp
Commit e312fc49ae1ec86999676edc9c02a4ac0bc39cec by nicolas.vasilache
[mlir][Linalg] Add layout specification support to bufferization.

Previously, linalg bufferization always had to be conservative at function boundaries and assume the most dynamic strided memref layout.
This revision introduces a mechanism to specify a linalg.buffer_layout function argument attribute that carries an affine map used to set a less pessimistic layout.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D105859
The file was modified mlir/include/mlir/Dialect/Linalg/IR/LinalgBase.td
The file was modified mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
The file was modified mlir/test/Dialect/Linalg/comprehensive-module-bufferize.mlir
Commit 85cb4f9904e9b080302e0c0874e3b441fe7062a7 by Tim Northover
Support: reduce stack used in default size test.

When the sanitizers aren't enabled they can use more than 1KB of stack, causing
an overflow where there shouldn't be.

Should fix Green Dragon test.
The file was modified llvm/unittests/Support/Threading.cpp
Commit afdae7c5d797f952bdfbaeb2cfe41a7dcca7a7b9 by llvm-dev
[X86][SSE] Add signbit tests to show cmpss/cmpsd intrinsics not recognised as 'allbits' results.

This adds test coverage for the crash reported on rGe4aa6ad13216
The file was modified llvm/test/CodeGen/X86/known-signbits-vector.ll
Commit af55335924ea852e1208d35e2462435f4a3d639c by nicolas.vasilache
[mlir][Linalg] Better support for bufferizing non-tensor results.

Clean up corner cases related to elemental tensor / buffer type return values that would previously fail.

Reviewed By: ThomasRaoux

Differential Revision: https://reviews.llvm.org/D105857
The file was modified mlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
The file was modified mlir/test/Dialect/Linalg/comprehensive-module-bufferize.mlir
Commit 72748488addd651beb7b60da462c721f3e175357 by jan.kratochvil
[lldb] Fix editline unicode on Linux

Based on:
  [lldb-dev] proposed change to remove conditional WCHAR support in libedit wrapper
  https://lists.llvm.org/pipermail/lldb-dev/2021-July/016961.html

There is already setlocale in lldb/source/Core/IOHandlerCursesGUI.cpp
but that does not apply for Editline GUI editing.

I am not aware of a way to write an automated test for this; it requires a pty.

Reviewed By: teemperor

Differential Revision: https://reviews.llvm.org/D105779
The file was modified lldb/tools/driver/Driver.cpp
The file was modified lldb/source/Core/IOHandlerCursesGUI.cpp
Commit b6b53ffef4414ed62701a63ad28e70cfd9d26191 by jonathanchesterfield
[libomptarget][devicertl] Remove branches around setting parallelLevel

Simplifies control flow to allow store/load forwarding

This change folds two basic blocks into one, leaving a single store to parallelLevel.
This is a step towards spmd kernels with sufficiently aggressive inlining folding
the loads from parallelLevel and thus discarding the nested parallel handling
when it is unused.

Transform:
```
int threadId = GetThreadIdInBlock();
if (threadId == 0) {
  parallelLevel[0] = expr;
} else if (GetLaneId() == 0) {
  parallelLevel[GetWarpId()] = expr;
}
// =>
if (GetLaneId() == 0) {
  parallelLevel[GetWarpId()] = expr;
}
// because
unsigned GetLaneId() { return GetThreadIdInBlock() & (WARPSIZE - 1);}
// so whenever threadId == 0, GetLaneId() is also 0.
```

That replaces a store in two distinct basic blocks with a single store.

A more aggressive follow up is possible if the threads in the warp/wave
race to write the same value to the same address. This is not done as
part of this change.

```
if (GetLaneId() == 0) {
  parallelLevel[GetWarpId()] = expr;
}
// =>
parallelLevel[GetWarpId()] = expr;
// because
unsigned GetWarpId() { return GetThreadIdInBlock() / WARPSIZE; }
// so GetWarpId will index the same element for every thread in the warp
// and, because expr is lane-invariant in this case, every lane stores the
// same value to this unique address
```
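
Both rewrites can be checked with a small host-side simulation. This is only a sketch: it runs the threads of one block sequentially, so it says nothing about real concurrent execution, `kWarpSize = 32` is an assumed warp size, and `Variant` and `simulate` are hypothetical names introduced for the check.

```cpp
#include <cassert>
#include <vector>

constexpr unsigned kWarpSize = 32; // assumed warp size for this sketch

unsigned laneId(unsigned threadId) { return threadId & (kWarpSize - 1); }
unsigned warpId(unsigned threadId) { return threadId / kWarpSize; }

enum class Variant { Original, LaneZeroOnly, Unconditional };

// Executes one variant of the store for every thread of a block, one thread
// at a time, and returns the resulting parallelLevel array. `expr` is the
// same value for all threads (lane-invariant).
std::vector<int> simulate(Variant v, unsigned numThreads, int expr) {
  assert(numThreads > 0);
  std::vector<int> parallelLevel((numThreads + kWarpSize - 1) / kWarpSize, 0);
  for (unsigned t = 0; t < numThreads; ++t) {
    switch (v) {
    case Variant::Original: // two branches, two stores
      if (t == 0)
        parallelLevel[0] = expr;
      else if (laneId(t) == 0)
        parallelLevel[warpId(t)] = expr;
      break;
    case Variant::LaneZeroOnly: // the form introduced by this change
      if (laneId(t) == 0)
        parallelLevel[warpId(t)] = expr;
      break;
    case Variant::Unconditional: // the possible follow-up
      parallelLevel[warpId(t)] = expr;
      break;
    }
  }
  return parallelLevel;
}
```

With a lane-invariant `expr`, all three variants leave the same contents in `parallelLevel`; the third differs only in how many threads perform the store.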

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D105699
The file was modified openmp/libomptarget/deviceRTLs/common/src/omptarget.cu
Commit b205f2bb8938447638e9ddc4ee1f6b82caeb1ad3 by abidh
[AMDGPU] Handle s_branch to another section.

Currently, if the target of an s_branch instruction is in another section, it fails with an undefined label error, even though the label is not undefined but merely present in another section. This patch handles that case: while handling the fixup_si_sopp_br fixup in getRelocType, if the target label is undefined we issue an error as before; if it is defined, a new relocation type, R_AMDGPU_REL16, is returned.

This issue has been reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100181 and https://bugs.llvm.org/show_bug.cgi?id=45887. Before https://reviews.llvm.org/D79943, we used to get a crash for this scenario. The crash is fixed now, but we still get an undefined label error. Jumps to other sections can arise with hot/cold splitting.

A patch to handle the relocation in lld will follow shortly.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D105760
The file was modified llvm/test/MC/AMDGPU/reloc.s
The file was modified llvm/include/llvm/BinaryFormat/ELFRelocs/AMDGPU.def
The file was modified llvm/docs/AMDGPUUsage.rst
The file was modified llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUELFObjectWriter.cpp
The file was modified llvm/test/tools/llvm-readobj/ELF/reloc-types-elf-amdgpu.test
Commit bb0166dc72791e2cefdb0c8dc9e495ea0555357b by georgios.rokos
[libomptarget] Update device pointer only if needed

Currently, libomptarget will always perform a host-to-device memory transfer in
order to update the device pointer of a PTR_AND_OBJ entry. This is not always
necessary because the device pointer may have been set to the correct pointee
address already, so we can eliminate the redundant memory transfer.
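
The idea of skipping the redundant transfer can be sketched as follows. This is an illustrative model only, not the libomptarget implementation: `DevicePointerSlot` and `updateIfNeeded` are hypothetical names, and the "transfer" is just a counter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one PTR_AND_OBJ entry: the value the device-side
// pointer is known to hold, plus a count of simulated host-to-device copies.
struct DevicePointerSlot {
  std::uintptr_t deviceValue = 0; // what the device pointer currently holds
  unsigned transfers = 0;         // number of simulated transfers issued

  // Writes `expected` only when the slot holds something else.
  // Returns true if a transfer was issued.
  bool updateIfNeeded(std::uintptr_t expected) {
    assert(expected != 0 && "pointee address must be valid");
    if (deviceValue == expected)
      return false; // already points at the right pointee: skip the copy
    deviceValue = expected;
    ++transfers;
    return true;
  }
};
```

Repeated mappings of the same pointer/pointee pair then cost one transfer instead of one per mapping.
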
The file was modified openmp/libomptarget/src/omptarget.cpp
The file was added openmp/libomptarget/test/mapping/device_ptr_update.c
Commit 9c90725eaee5a00e5dd450e51c4070afd7081472 by frgossen
[MLIR] Fix documentation of the `ExecutionEngine` in the toy tutorial example

Differential Revision: https://reviews.llvm.org/D105813
The file was modified mlir/docs/Tutorials/Toy/Ch-6.md
Commit 3cee36c5acdb292c331818c553bfb8e5abbdb95e by llvm-dev
[X86][SSE] X86ISD::FSETCC nodes (cmpss/cmpsd) return a 0/-1 allbits signbits result (REAPPLIED)

Annoyingly, i686 cmpsd handling still fails to remove the unnecessary neg(and(x,1))

Reapplied rGe4aa6ad13216 with fix for intrinsic variants of the opcode which uses a vector return type
The file was modified llvm/lib/Target/X86/X86ISelLowering.cpp
The file was modified llvm/test/CodeGen/X86/known-signbits-vector.ll
Commit 4709d9d5be79835a5a8751dba83e9150dbce9e6e by lebedev.ri
[libomp] ompd_init(): fix heap-buffer-overflow when constructing libompd.so path

There is no guarantee that the space allocated in `libname`
is enough to accommodate the whole `dl_info.dli_fname`,
because it could, for example, have a suffix such as `.5`.
That highlights another problem: what should it do about suffixes,
and should it do anything to resolve symlinks before changing the filename?

```
$ LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/lib"  ./src/utilities/rstest/rstest -c /tmp/f49137920.NEF
dl_info.dli_fname "/usr/local/lib/libomp.so.5"
strlen(dl_info.dli_fname) 26
lib_path_length 14
lib_path_length + 12 26
=================================================================
==30949==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60300000002a at pc 0x000000548648 bp 0x7ffdfa0aa780 sp 0x7ffdfa0a9f40
WRITE of size 27 at 0x60300000002a thread T0
    #0 0x548647 in strcpy (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x548647)
    #1 0x7fb9e3e3d234 in ompd_init() /repositories/llvm-project/openmp/runtime/src/ompd-specific.cpp:102:5
    #2 0x7fb9e3dcb446 in __kmp_do_serial_initialize() /repositories/llvm-project/openmp/runtime/src/kmp_runtime.cpp:6742:3
    #3 0x7fb9e3dcb40b in __kmp_get_global_thread_id_reg /repositories/llvm-project/openmp/runtime/src/kmp_runtime.cpp:251:7
    #4 0x59e035 in main /home/lebedevri/rawspeed/build-Clang-SANITIZE/../src/utilities/rstest/rstest.cpp:491
    #5 0x7fb9e3762d09 in __libc_start_main csu/../csu/libc-start.c:308:16
    #6 0x4df449 in _start (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x4df449)

0x60300000002a is located 0 bytes to the right of 26-byte region [0x603000000010,0x60300000002a)
allocated by thread T0 here:
    #0 0x55cc5d in malloc (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x55cc5d)
    #1 0x7fb9e3e3d224 in ompd_init() /repositories/llvm-project/openmp/runtime/src/ompd-specific.cpp:101:17
    #2 0x7fb9e3762d09 in __libc_start_main csu/../csu/libc-start.c:308:16

SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/lebedevri/rawspeed/build-Clang-SANITIZE/src/utilities/rstest/rstest+0x548647) in strcpy
Shadow bytes around the buggy address:
  0x0c067fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c067fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c067fff8000: fa fa 00 00 00[02]fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c067fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==30949==ABORTING
Aborted
```
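
In the log above, the directory part is 14 characters long, the allocation is `lib_path_length + 12 = 26` bytes, and `strcpy` then writes the 26-character `dli_fname` plus its terminator, 27 bytes in total. One way to avoid this class of bug is to let a string type own the size. The sketch below is not the fix that was committed; `ompdPathNextTo` is a hypothetical helper, and it deliberately leaves the suffix and symlink questions open.

```cpp
#include <cassert>
#include <string>

// Builds the path of libompd.so in the same directory as `libompPath`
// (for example, the dli_fname reported by dladdr). std::string sizes itself,
// so no manual length arithmetic is needed. Version suffixes and symlinks
// are not handled here.
std::string ompdPathNextTo(const std::string &libompPath) {
  assert(!libompPath.empty());
  std::string::size_type slash = libompPath.rfind('/');
  std::string dir = (slash == std::string::npos)
                        ? std::string()
                        : libompPath.substr(0, slash + 1);
  return dir + "libompd.so";
}
```

For the path in the log, `/usr/local/lib/libomp.so.5`, this yields `/usr/local/lib/libompd.so` regardless of how long the original file name is.
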
The file was modified openmp/runtime/src/ompd-specific.cpp
Commit ab76101f40f80bbec82073fc5bfddd7203e63a52 by anton.zabaznov
[OpenCL] Add support of __opencl_c_read_write_images feature macro

This feature requires support of __opencl_c_images, so a diagnostic for that is provided as well.

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D104915
The file was modified clang/test/Misc/opencl-c-3.0.incorrect_options.cl
The file was modified clang/include/clang/Basic/DiagnosticSemaKinds.td
The file was modified clang/lib/Basic/Targets.cpp
The file was modified clang/lib/Sema/SemaDeclAttr.cpp
The file was modified clang/include/clang/Basic/OpenCLOptions.h
The file was modified clang/test/SemaOpenCL/access-qualifier.cl
The file was modified clang/test/SemaOpenCL/unsupported-image.cl
The file was modified clang/include/clang/Basic/DiagnosticCommonKinds.td
The file was modified clang/lib/Basic/OpenCLOptions.cpp
Commit c99e17fef5f34ac536192fa7b915641f1962c7b9 by llvm-dev
[InstCombine] Pre-commit ashr(or(neg(x),x),bw-1) --> sext(icmp_ne(x,0)) tests from D105764

Added 'thwart complexity-based canonicalization' hacks and the lshr(or(neg(x),x),bw-1) --> zext(icmp_ne(x,0)) variants suggested by Sanjay.
The file was modifiedllvm/test/Transforms/InstCombine/sub-ashr-or-to-icmp-select.ll
The file was addedllvm/test/Transforms/InstCombine/sub-lshr-or-to-icmp-select.ll
Commit 45ffe6341d9642487785b0d0028166e6fbdbe5d7 by thakis
[clang/objc] Optimize getters for non-atomic, copied properties

Properties that were declared `@property(copy, nonatomic) id foo` make an
unnecessary call to objc_get_property().  This call can be replaced with a
direct access to the backing variable identical to how a `@property(nonatomic)
id foo` would do it.

This reduces codegen by 4 bytes (x86_64/arm64) and removes a cross linkage unit
function call per property declared as copy/nonatomic.

Differential Revision: https://reviews.llvm.org/D105311
The file was modifiedclang/test/CodeGenObjC/arc-blocks.m
The file was modifiedclang/lib/CodeGen/CGObjC.cpp
Commit b2f6cf14798ac738bc2c9b35bd83171e0771b7a3 by llvm-dev
[InstCombine] Fold lshr/ashr(or(neg(x),x),bw-1) --> zext/sext(icmp_ne(x,0)) (PR50816)

Handle the missing fold reported in PR50816, which is a variant of the existing ashr(sub_nsw(X,Y),bw-1) --> sext(icmp_sgt(X,Y)) fold.

We also handle the lshr(or(neg(x),x),bw-1) --> zext(icmp_ne(x,0)) equivalent - https://alive2.llvm.org/ce/z/SnZmSj

We still allow multi uses of the neg(x) - as this is likely to let us further simplify other uses of the neg - but not multi uses of the or() which would increase instruction count.
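
The two equivalences can be spot-checked outside of LLVM. The following is a minimal sketch for a fixed bit width of 32, not the InstCombine implementation; the function names are made up for illustration, and the arithmetic shift is emulated on unsigned values to stay within well-defined C++.

```cpp
#include <cassert>
#include <cstdint>

// lshr(or(neg(x), x), bw-1) for bw = 32: the sign bit of (-x | x).
constexpr uint32_t lshr_form(uint32_t x) { return ((0u - x) | x) >> 31; }

// zext(icmp_ne(x, 0))
constexpr uint32_t zext_form(uint32_t x) { return x != 0 ? 1u : 0u; }

// ashr(or(neg(x), x), bw-1): an arithmetic shift by bw-1 replicates the
// sign bit, emulated here on unsigned values as 0 - sign_bit.
constexpr uint32_t ashr_form(uint32_t x) { return 0u - lshr_form(x); }

// sext(icmp_ne(x, 0))
constexpr uint32_t sext_form(uint32_t x) { return x != 0 ? 0xFFFFFFFFu : 0u; }

// Spot-check both equivalences on a few edge values (illustrative sample,
// not a proof; the alive2 link above covers the general case).
inline void check_fold_samples() {
  const uint32_t samples[] = {0u, 1u, 2u, 0x7FFFFFFFu, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t x : samples) {
    assert(lshr_form(x) == zext_form(x));
    assert(ashr_form(x) == sext_form(x));
  }
}
```

The intuition is that `-x | x` has its top bit set for every non-zero `x` and is zero only for `x == 0`, so shifting by `bw-1` leaves exactly the "is non-zero" bit.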

Differential Revision: https://reviews.llvm.org/D105764
The file was modifiedllvm/test/Transforms/InstCombine/sub-lshr-or-to-icmp-select.ll
The file was modifiedllvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
The file was modifiedllvm/test/Transforms/InstCombine/sub-ashr-or-to-icmp-select.ll
Commit e9533b84920798cf9b35d26586a61bad0a1f9825 by alexfh
[NFC] Add parentheses around logical expression to silence -Wlogical-op-parentheses warning.

Reviewed By: alexfh

Differential Revision: https://reviews.llvm.org/D105890
The file was modifiedclang/lib/Sema/SemaDeclAttr.cpp
Commit db635a28e65fa168536a100542d250f0b13c7039 by hansang.bae
[OpenMP] Minor improvement in task allocation

This patch includes a few changes to improve task allocation
performance slightly. These changes are enough to recover the performance
drop observed after the hidden helper was introduced.

Differential Revision: https://reviews.llvm.org/D105715
The file was modifiedopenmp/runtime/src/kmp_tasking.cpp
Commit 2a9366c0e53593b2be2b91b4a37019ca8cae4557 by Louis Dionne
[libc++] Generate ABI list for macOS arm64
The file was addedlibcxx/lib/abi/arm64-apple-darwin.libcxxabi.v1.stable.exceptions.no_new_in_libcxx.abilist
Commit c5ad8bb8d41018ba58490873e95cc841d9276702 by Louis Dionne
[libc++] Target x86_64 only for the backdeployment jobs

Differential Revision: https://reviews.llvm.org/D105846
The file was modifiedlibcxx/utils/ci/buildkite-pipeline.yml
Commit 0da95a5cf2691c8e01ae02f108487c397ab7e0ce by Louis Dionne
[libc++] Workaround non-constexpr std::exchange pre C++20

std::exchange is only constexpr in C++20 and later. We were using it
in a constructor marked unconditionally constexpr, which caused issues
when building with -std=c++17.
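
As an illustration of the constraint, here is a minimal sketch of a constructor that stays usable in constant expressions under C++14/17 by avoiding std::exchange. `constexpr_exchange`, `Holder`, and `moved_sum` are hypothetical names for this example and are not taken from the patched test.

```cpp
#include <utility>

// Hypothetical stand-in for std::exchange that is constexpr in C++14/17.
template <class T, class U = T>
constexpr T constexpr_exchange(T& obj, U&& new_value) {
  T old_value = std::move(obj);
  obj = std::forward<U>(new_value);
  return old_value;
}

struct Holder {
  int value;
  constexpr explicit Holder(int v) : value(v) {}
  // Take the value and leave the source zeroed.
  constexpr Holder(Holder&& other)
      : value(constexpr_exchange(other.value, 0)) {}
};

// Evaluated at compile time: the source ends up as 0, the target as 7.
constexpr int moved_sum() {
  Holder source(7);
  Holder target(std::move(source));
  return source.value * 100 + target.value;
}
static_assert(moved_sum() == 7, "usable in a constant expression before C++20");
```

Swapping `constexpr_exchange` for `std::exchange` in the constructor would be expected to fail the `static_assert` with -std=c++17 on standard libraries that only mark it constexpr from C++20 on.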

The weird part is that the issue only showed up when building on the
arm64 macs, but that must be caused by the specific version of Clang
used on those. Since the code is clearly wrong and the fix is obvious,
I'm not going to investigate this further.
The file was modifiedlibcxx/test/std/utilities/optional/optional.object/optional.object.ctor/explicit_optional_U.pass.cpp
Commit 6a3904f16e8e2095082f71e862a33266e10fa871 by Matthew.Arsenault
Mips: Mark special case calling convention handling as custom

The number of registers used for passing f64 in some cases is context
dependent, and thus getNumRegistersForCallingConv is sometimes
inaccurate. For f64, it reports 1 but is sometimes split into 2 32-bit
registers.

For GlobalISel, the generic argument assignment code expects
getNumRegistersForCallingConv to return an accurate answer. Switch to
marking these arguments as custom so we can handle this case as a
custom assignment instead.

This temporarily breaks a few globalisel tests which are fixed by a
future change to use more of the generic infrastructure.
The file was modifiedllvm/test/CodeGen/Mips/GlobalISel/irtranslator/float_args.ll
The file was modifiedllvm/test/CodeGen/Mips/GlobalISel/llvm-ir/float_args.ll
The file was modifiedllvm/lib/Target/Mips/MipsISelLowering.cpp
Commit 121541fdcd5c9760ff242451d2b682c45a2a54df by Matthew.Arsenault
Mips/GlobalISel: Use more standard call lowering infrastructure

This also fixes some missing implicit uses on call instructions, adds
missing G_ASSERT_SEXT/ZEXT annotations, and some missing outgoing
sext/zexts. This also fixes not respecting tablegen requested type
promotions.

This starts treating f64 passed in i32 GPRs as a type of custom
assignment, which restores some previously XFAILed tests. This is
because getNumRegistersForCallingConv returns a static value, but in
this case it is context dependent on other arguments.

Most of the ugliness is reproducing a hack CC_MipsO32 uses in
SelectionDAG. CC_MipsO32 depends on a bunch of vectors populated from
the original IR argument types in MipsCCState. The way this ends up
working in GlobalISel is it only ends up inspecting the most recently
added vector element. I'm pretty sure there are cleaner ways to do
this, but this seemed easier than fixing up the current DAG
handling. This is another case where it would be easier if the
CCAssignFns were passed the original type instead of only the
pre-legalized ones.

There's still a lot of junk here that shouldn't be necessary. This
also likely breaks big endian handling, but it wasn't complete/tested
anyway since the IRTranslator gives up on big endian targets.
The file was modifiedllvm/lib/Target/Mips/MipsCCState.cpp
The file was modifiedllvm/test/CodeGen/Mips/GlobalISel/irtranslator/float_args.ll
The file was modifiedllvm/lib/Target/Mips/MipsCCState.h
The file was modifiedllvm/include/llvm/CodeGen/GlobalISel/CallLowering.h
The file was modifiedllvm/test/CodeGen/Mips/GlobalISel/irtranslator/extend_args.ll
The file was modifiedllvm/lib/Target/ARM/ARMCallLowering.cpp
The file was modifiedllvm/lib/Target/Mips/MipsCallLowering.cpp
The file was modifiedllvm/lib/Target/Mips/MipsCallLowering.h
The file was modifiedllvm/test/CodeGen/Mips/GlobalISel/llvm-ir/float_args.ll
Commit 77a608d9de472766fcab51412100764e534ceaf9 by Matthew.Arsenault
GlobalISel: Remove getIntrinsicID utility function

This is redundant with a method directly on MachineInstr
The file was modifiedllvm/lib/CodeGen/GlobalISel/Utils.cpp
The file was modifiedllvm/include/llvm/CodeGen/GlobalISel/Utils.h
The file was modifiedllvm/lib/Target/AArch64/GISel/AArch64RegisterBankInfo.cpp
Commit 222fde1eec341a47f571a8afdf90e83c3a830c5b by Matthew.Arsenault
GlobalISel: Use extension instead of merge with undef in common case

This fixes not respecting signext/zeroext in these cases. In the
anyext case, this avoids a larger merge with undef and should be a
better canonical form.

This should also handle this if a merge is needed, but I'm not aware
of a case where that can happen. In a future change this will also
allow AMDGPU to drop some custom code without introducing regressions.
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-call.ll
The file was modifiedllvm/lib/CodeGen/GlobalISel/CallLowering.cpp
Commit fb44c3223e0c36e969762dd182b4992061b455d3 by Matthew.Arsenault
AMDGPU: Promote signext/zeroext i16 shader returns

This makes them consistent with all the other return convention
handling. If we don't do this, we lose the sext/zext flag if treated
as a full assignment, which complicates a future GlobalISel patch.
The file was modifiedllvm/lib/Target/AMDGPU/AMDGPUCallingConv.td
Commit 1e03c37b97b6176a60404d84665c40321f4e33a4 by John.Ericson
Prepare Compiler-RT for GnuInstallDirs, matching libcxx, document all

This is a second attempt at D101497, which landed as
9a9bc76c0eb72f0f2732c729a460abbd5239c2e3 but had to be reverted in
8cf7ddbdd4e5af966a369e170c73250f2e3920e7.

The issue was that when `COMPILER_RT_INSTALL_PATH` is empty,
expressions like "${COMPILER_RT_INSTALL_PATH}/bin" evaluated to
"/bin", not "bin" as intended and as was originally the case.
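
The failure mode is ordinary string concatenation with an empty prefix. A minimal sketch of the difference follows; both helper names are hypothetical and the code only models the string behaviour, not the CMake function that the patch defines.

```cpp
#include <cassert>
#include <string>

// Naive concatenation, mirroring "${COMPILER_RT_INSTALL_PATH}/bin".
inline std::string naive_join(const std::string& base, const std::string& sub) {
  return base + "/" + sub;
}

// Hypothetical helper: an empty base leaves the subdirectory relative.
inline std::string join_install_path(const std::string& base,
                                     const std::string& sub) {
  return base.empty() ? sub : base + "/" + sub;
}

inline void check_install_paths() {
  assert(naive_join("", "bin") == "/bin");        // accidentally absolute
  assert(join_install_path("", "bin") == "bin");  // stays relative
  assert(join_install_path("lib/clang", "bin") == "lib/clang/bin");
}
```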

One solution is to make `COMPILER_RT_INSTALL_PATH` always non-empty,
defaulting it to `CMAKE_INSTALL_PREFIX`. D99636 adopted that approach.
But, I think it is more ergonomic to allow those project-specific paths
to be relative to the global ones. Also, making install paths absolute by
default inhibits the proper behavior of functions like
`GNUInstallDirs_get_absolute_install_dir` which make relative install
paths absolute in a more complicated way.

Given all this, I will define a function like the one asked for in
https://gitlab.kitware.com/cmake/cmake/-/issues/19568 (and needed for a
similar use-case).

---

Original message:

Instead of using `COMPILER_RT_INSTALL_PATH` throughout the CMake for
compiler-rt, just use it to define variables for the subdirs which
themselves are used.

This preserves compatibility, but later on we might consider getting rid
of `COMPILER_RT_INSTALL_PATH` and just changing the defaults for the
subdir variables directly.

---

There was a seeming bug where the (non-Apple) per-target libdir was
`${target}` not `lib/${target}`. I suspect that has to do with the docs
on `COMPILER_RT_INSTALL_PATH` saying it was the library dir when that's no
longer true, so I just went ahead and fixed it, allowing me to define
fewer and more sensible variables.

That last part should be the only behavior change; everything else
should be a pure refactoring.

---

I added some documentation of these variables too. In particular, I
wanted to highlight the gotcha where `-DSomeCachePath=...` without the
`:PATH` will lead CMake to make the path absolute. See [1] for
discussion of the problem, and [2] for the brief official documentation
they added as a result.

[1]: https://cmake.org/pipermail/cmake/2015-March/060204.html

[2]: https://cmake.org/cmake/help/latest/manual/cmake.1.html#options

In 38b2dec37ee735d5409148e71ecba278caf0f969 the problem was somewhat
misidentified and so `:STRING` was used, but `:PATH` is better as it
sets the correct type from the get-go.

---

D99484 is the main thrust of the `GnuInstallDirs` work. Once this lands,
it should be feasible to follow both of these up with a simple patch for
compiler-rt analogous to the one for libcxx.

Reviewed By: phosek, #libc_abi, #libunwind

Differential Revision: https://reviews.llvm.org/D105765
The file was modifiedcompiler-rt/cmake/Modules/CompilerRTUtils.cmake
The file was modifiedlibcxx/CMakeLists.txt
The file was modifiedcompiler-rt/cmake/Modules/CompilerRTDarwinUtils.cmake
The file was modifiedcompiler-rt/lib/dfsan/CMakeLists.txt
The file was modifiedcompiler-rt/cmake/Modules/AddCompilerRT.cmake
The file was addedcompiler-rt/docs/BuildingCompilerRT.rst
The file was modifiedcompiler-rt/cmake/base-config-ix.cmake
The file was modifiedlibcxxabi/CMakeLists.txt
The file was modifiedlibcxx/docs/BuildingLibcxx.rst
The file was modifiedcompiler-rt/include/CMakeLists.txt
The file was modifiedclang/runtime/CMakeLists.txt
The file was modifiedlibunwind/docs/BuildingLibunwind.rst
The file was modifiedlibunwind/CMakeLists.txt
Commit 32627f4ab4b717dc1932141db99605b723037bf8 by tpopp
[mlir] Handle unused variable when assertions are disabled.
The file was modifiedmlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
Commit 03d8fed34951bc6e92b36615ec3afe6f36d10de6 by anton.zabaznov
[OpenCL] Add verbosity when checking support of read_write images

Parentheses were fixed incorrectly by D105890

Reviewed By: Anastasia

Differential Revision: https://reviews.llvm.org/D105892
The file was modifiedclang/lib/Sema/SemaDeclAttr.cpp
Commit 10e0cdfc6526578c8892d895c0448e77cb9ba876 by wei.huang
[PowerPC][NFC] Power ISA features for Semachecking

[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.

Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>

Differential revision: https://reviews.llvm.org/D105501
The file was modifiedllvm/lib/Target/PowerPC/PPCInstrInfo.td
The file was modifiedclang/lib/Basic/Targets/PPC.cpp
The file was modifiedclang/lib/Sema/SemaChecking.cpp
The file was modifiedllvm/lib/Target/PowerPC/PPCSubtarget.cpp
The file was modifiedclang/include/clang/Basic/DiagnosticSemaKinds.td
The file was modifiedllvm/lib/Target/PowerPC/PPCSubtarget.h
The file was modifiedclang/lib/Basic/Targets/PPC.h
The file was modifiedllvm/lib/Target/PowerPC/PPC.td
Commit 1bfec34ac3e71ae3e65d5132fb475b6f8cc0bafe by llvm-dev
[InstCombine] Regenerate select-gep.ll tests
The file was modifiedllvm/test/Transforms/InstCombine/select-gep.ll
Commit 4975837f1480621d9428a4be468831d07b2201de by llvm-dev
[InstCombine] Add basic (select C, (gep Ptr, Idx), Ptr) tests from PR50183
The file was modifiedllvm/test/Transforms/InstCombine/select-gep.ll
Commit f1aca5ac96ebd0beadfa68a474c5947d3bc8c109 by albionapc
[PowerPC] Fix L[D|W]ARX Implementation

LDARX and LWARX sometimes get optimized out by the compiler
when they are critical to the correctness of the code. This inline asm generation
ensures that they are preserved.

Differential Revision: https://reviews.llvm.org/D105754
The file was modifiedclang/lib/CodeGen/CGBuiltin.cpp
The file was modifiedllvm/lib/Target/PowerPC/PPCInstrInfo.td
The file was modifiedllvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-LoadReserve-StoreCond.ll
The file was modifiedllvm/include/llvm/IR/IntrinsicsPowerPC.td
The file was modifiedllvm/lib/Target/PowerPC/PPCInstr64Bit.td
The file was modifiedclang/test/CodeGen/builtins-ppc-xlcompat-LoadReseve-StoreCond-64bit-only.c
The file was modifiedclang/test/CodeGen/builtins-ppc-xlcompat-LoadReseve-StoreCond.c
The file was modifiedllvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-LoadReserve-StoreCond-64bit-only.ll
Commit 7039dfc6dd157a26de2f5a6fd15662510a1dd119 by ajcbik
[mlir][memref] adjust integration tests to new lowering passes

these tests run under the emulator and thus were overlooked

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D105855
The file was modifiedmlir/test/Integration/Dialect/Vector/CPU/AMX/test-mulf.mlir
The file was modifiedmlir/test/Integration/Dialect/Vector/CPU/AMX/test-tilezero-block.mlir
The file was modifiedmlir/test/Integration/Dialect/Vector/CPU/AMX/test-tilezero.mlir
The file was modifiedmlir/test/Integration/Dialect/Vector/CPU/AMX/test-muli.mlir
The file was modifiedmlir/test/Integration/Dialect/Vector/CPU/AMX/test-muli-ext.mlir
The file was modifiedmlir/test/Integration/Dialect/Vector/CPU/X86Vector/test-sparse-dot-product.mlir
Commit a006af5d6ec6280034ae4249f6d2266d726ccef4 by gchatelet
[llvm] Add enum iteration to Sequence

This patch allows iterating over typed enums via the ADT/Sequence utility.
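
The general idea of enum iteration can be sketched without the ADT/Sequence API. The helper below is hypothetical and is not the interface added by this patch; it simply steps the underlying integer value over a half-open range, assuming the enumerators are contiguous.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Illustrative enum; End is a sentinel marking one past the last value.
enum class Channel : unsigned char { Red, Green, Blue, End };

// Hypothetical helper: visit [first, last) by stepping the underlying
// integer value. Assumes the enumerators in the range are contiguous.
template <class Enum, class Fn>
void for_each_enum(Enum first, Enum last, Fn fn) {
  using Underlying = typename std::underlying_type<Enum>::type;
  for (Underlying v = static_cast<Underlying>(first);
       v != static_cast<Underlying>(last); ++v)
    fn(static_cast<Enum>(v));
}

inline std::vector<Channel> all_channels() {
  std::vector<Channel> out;
  for_each_enum(Channel::Red, Channel::End,
                [&out](Channel c) { out.push_back(c); });
  return out;
}

inline void check_all_channels() {
  const std::vector<Channel> channels = all_channels();
  assert(channels.size() == 3);
  assert(channels.front() == Channel::Red);
  assert(channels.back() == Channel::Blue);
}
```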

Differential Revision: https://reviews.llvm.org/D103900
The file was modifiedllvm/unittests/CodeGen/ScalableVectorMVTsTest.cpp
The file was modifiedllvm/unittests/ADT/SequenceTest.cpp
The file was modifiedllvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
The file was modifiedllvm/unittests/IR/ConstantRangeTest.cpp
The file was modifiedllvm/include/llvm/Support/MachineValueType.h
The file was modifiedllvm/include/llvm/ADT/Sequence.h
The file was modifiedllvm/tools/llvm-exegesis/lib/X86/Target.cpp
The file was modifiedmlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
The file was modifiedllvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
Commit 3d89fb4d13bc3af1c3643a310b90fce51a649119 by i
[RISCV] Support machine constraint "S"

Similar to D46745, "S" represents an absolute symbolic operand, which
can be used to specify the access models, e.g.

  extern int var;
  void *addr_via_asm() {
    void *ret;
    asm("lui %0, %%hi(%1)\naddi %0,%0,%%lo(%1)" : "=r"(ret) : "S"(&var));
    return ret;
  }

'S' is documented in trunk GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101275

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D105254
The file was modifiedclang/lib/Basic/Targets/RISCV.cpp
The file was modifiedclang/test/CodeGen/RISCV/riscv-inline-asm.c
The file was modifiedllvm/lib/Target/RISCV/RISCVISelLowering.cpp
The file was addedllvm/test/CodeGen/RISCV/inline-asm-S-constraint.ll
Commit 68ae8bacfce3b9bd73fefb0d28efd461e1588586 by nicolas.vasilache
[mlir][Linalg] Properly specify Linalg attribute.

This fixes undefined reference introduced by https://reviews.llvm.org/D105859

Differential Revision: https://reviews.llvm.org/D105897
The file was modifiedmlir/lib/Dialect/Linalg/IR/LinalgTypes.cpp
The file was modifiedmlir/include/mlir/Dialect/Linalg/IR/LinalgBase.td
Commit 1893b630fec06947b4f59e43c00db4d787f39262 by julian.lettner
Avoid triggering assert when program calls OSAtomicCompareAndSwapLong

A previous change brought the new, relaxed implementation of "on failure
memory ordering" for synchronization primitives in LLVM over to TSan
land [1].  It included the following assert:
```
// 31.7.2.18: "The failure argument shall not be memory_order_release
// nor memory_order_acq_rel". LLVM (2021-05) fallbacks to Monotonic
// (mo_relaxed) when those are used.
CHECK(IsLoadOrder(fmo));

static bool IsLoadOrder(morder mo) {
  return mo == mo_relaxed || mo == mo_consume
      || mo == mo_acquire || mo == mo_seq_cst;
}
```

A previous workaround for a false positive when using an old Darwin
synchronization API assumed this failure mode to be unused and passed a
dummy value [2].  We update this value to `mo_relaxed` which is also the
value used by the actual implementation to avoid triggering the assert.
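
For readers less familiar with failure orderings, the same rule shows up in standard C++ atomics. The sketch below is an analogy only: `compare_and_swap_long` is a hypothetical wrapper, not the TSan interceptor, and it passes a relaxed failure order where a release-style order would not be permitted.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper with the shape of a compare-and-swap on a long.
// The failure order must be a load order, so relaxed is used here.
inline bool compare_and_swap_long(long expected, long desired,
                                  std::atomic<long>& target) {
  return target.compare_exchange_strong(expected, desired,
                                        std::memory_order_seq_cst,
                                        std::memory_order_relaxed);
}

inline void check_compare_and_swap() {
  std::atomic<long> value(1);
  assert(compare_and_swap_long(1, 2, value));   // matches, so 2 is stored
  assert(!compare_and_swap_long(1, 3, value));  // stale expected value fails
  assert(value.load() == 2);
}
```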

[1] https://reviews.llvm.org/D99434
[2] https://reviews.llvm.org/D21733

rdar://78122243

Differential Revision: https://reviews.llvm.org/D105844
The file was modifiedcompiler-rt/lib/tsan/rtl/tsan_interceptors_mac.cpp
Commit b25aca503d296eeeb2a174d8fb97637de74b8653 by aeubanks
[OpaquePtr] Use AllocaInst::getAllocatedType()
The file was modifiedllvm/lib/Target/NVPTX/NVPTXLowerAlloca.cpp
Commit 693bc04bf615b63b0070c7d1ad15257a7ce31a20 by aeubanks
[OpaquePtr] Use GlobalValue::getValueType() more
The file was modifiedllvm/lib/Target/PowerPC/PPCAsmPrinter.cpp
The file was modifiedllvm/lib/Transforms/IPO/MergeFunctions.cpp
The file was modifiedllvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp
The file was modifiedllvm/lib/Transforms/Coroutines/Coroutines.cpp
Commit 113a80797731b1d7cb20d8b42238908efc9e4f48 by aeubanks
[OpaquePtr] Get load/store type without PointerType::getElementType()
The file was modifiedllvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
Commit ab5693aa4ac45fed0fa4c9106f0eef6d409b6c3e by aeubanks
[OpaquePtr] Use byval type more
The file was modifiedllvm/lib/Transforms/IPO/ArgumentPromotion.cpp
The file was modifiedllvm/lib/Transforms/Coroutines/CoroFrame.cpp
The file was modifiedllvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
Commit 2c47b8847ec75c25187e9819abd85cc9e908d742 by gchatelet
Revert "[llvm] Add enum iteration to Sequence"

This reverts commit a006af5d6ec6280034ae4249f6d2266d726ccef4.
The file was modifiedllvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
The file was modifiedllvm/tools/llvm-reduce/deltas/ReduceAttributes.cpp
The file was modifiedllvm/include/llvm/Support/MachineValueType.h
The file was modifiedllvm/tools/llvm-exegesis/lib/X86/Target.cpp
The file was modifiedmlir/lib/Dialect/Linalg/IR/LinalgOps.cpp
The file was modifiedllvm/unittests/CodeGen/ScalableVectorMVTsTest.cpp
The file was modifiedllvm/include/llvm/ADT/Sequence.h
The file was modifiedllvm/unittests/ADT/SequenceTest.cpp
The file was modifiedllvm/unittests/IR/ConstantRangeTest.cpp
Commit 46e89708170c40e8cf0305b6de048ca879f43aab by craig.topper
[RISCV] Prevent use of t0(aka x5) as rs1 for jalr instructions.

Some microarchitectures treat rs1=x1/x5 on jalr as a hint to pop
the return-address stack. We should avoid using x5 on jalr
instructions since we aren't using x5 as an alternate link register.

Differential Revision: https://reviews.llvm.org/D105875
The file was modifiedllvm/lib/Target/RISCV/RISCVInstrInfo.td
The file was modifiedllvm/test/CodeGen/RISCV/calls.ll
The file was modifiedllvm/lib/Target/RISCV/RISCVRegisterInfo.td
The file was modifiedllvm/test/CodeGen/RISCV/tail-calls.ll
Commit ae4cea38f18e32d4a106871d751af380032e16fe by thomasraoux
[mlir] Add support for tensor.extract to comprehensive bufferization

Differential Revision: https://reviews.llvm.org/D105870
The file was modifiedmlir/test/Dialect/Linalg/comprehensive-module-bufferize.mlir
The file was modifiedmlir/lib/Dialect/Linalg/Transforms/ComprehensiveBufferize.cpp
Commit 489742991f7dc4c621264d223e8973ff876e9080 by aeubanks
[NFC] Inline variable to prevent unused variable warning
The file was modifiedllvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
Commit e4b43973fbd41aee3b8197cf250e9fb9ac40f986 by listmail
[ScalarEvolution] Fix overflow when computing max trip counts

This is split from D105216 to reduce patch complexity.  Original code by Eli with very minor modification by me.

The primary point of this patch is to add the getUDivCeilSCEV routine.  I included the two callers with constant arguments as we know those must constant fold even without any of the fancy inference logic.
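
To see why a dedicated ceiling-division routine helps, consider the textbook form `(n + d - 1) / d`, which wraps when `n` is near the maximum value. The sketch below contrasts it with one overflow-free formulation on plain 32-bit integers; it illustrates the hazard only and is not claimed to be the expression getUDivCeilSCEV builds.

```cpp
#include <cassert>
#include <cstdint>

// Textbook form: n + d - 1 can wrap around for large n.
constexpr uint32_t udiv_ceil_naive(uint32_t n, uint32_t d) {
  return (n + d - 1u) / d;
}

// Overflow-free form: floor division plus one if there is a remainder.
constexpr uint32_t udiv_ceil_safe(uint32_t n, uint32_t d) {
  return n / d + (n % d != 0 ? 1u : 0u);
}

inline void check_udiv_ceil() {
  assert(udiv_ceil_safe(10u, 5u) == 2u);
  assert(udiv_ceil_safe(11u, 5u) == 3u);
  assert(udiv_ceil_safe(0u, 3u) == 0u);
  // With n at the maximum, the naive form wraps to 0 before dividing.
  assert(udiv_ceil_naive(0xFFFFFFFFu, 2u) == 0u);
  assert(udiv_ceil_safe(0xFFFFFFFFu, 2u) == 0x80000000u);
}
```
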
The file was modifiedllvm/include/llvm/Analysis/ScalarEvolution.h
The file was modifiedllvm/lib/Analysis/ScalarEvolution.cpp
Commit 7a20670d168af31ef77209f43ca0622800ce513a by Saleem Abdulrasool
AST: correct name decoration for swift async functions on Windows

The name decoration scheme on Windows does not have a vendor namespace,
and the decoration scheme is not shared ownership - it is controlled by
Microsoft.  `T` is a reserved identifier for an unknown calling
convention.  The `W` identifier has been discussed with Microsoft
offline and is reserved as `Swift_3` as the identifier for the swift
async calling convention.  Adjust the name decoration accordingly.
The file was modifiedclang/lib/AST/MicrosoftMangle.cpp
Commit 14f77576c9c4f502267a92992abe3bdcbeb96b2c by marcos.horro
[llvm-mca] [NFC] Formatting code

Applied clang-format to all files. Discarded the BottleneckAnalysis.h
80-column width violation since it contains an example report.
Caught some typos and minor style details.

Reviewed By: andreadb

Differential Revision: https://reviews.llvm.org/D105900
The file was modifiedllvm/tools/llvm-mca/Views/TimelineView.h
The file was modifiedllvm/tools/llvm-mca/Views/BottleneckAnalysis.cpp
The file was modifiedllvm/tools/llvm-mca/PipelinePrinter.cpp
The file was modifiedllvm/tools/llvm-mca/llvm-mca.cpp
The file was modifiedllvm/tools/llvm-mca/Views/InstructionView.h
The file was modifiedllvm/tools/llvm-mca/Views/RegisterFileStatistics.cpp
The file was modifiedllvm/tools/llvm-mca/Views/InstructionView.cpp
The file was modifiedllvm/tools/llvm-mca/Views/RetireControlUnitStatistics.cpp
The file was modifiedllvm/tools/llvm-mca/Views/View.h
The file was modifiedllvm/tools/llvm-mca/Views/BottleneckAnalysis.h
The file was modifiedllvm/tools/llvm-mca/Views/SummaryView.cpp
The file was modifiedllvm/tools/llvm-mca/Views/DispatchStatistics.cpp
The file was modifiedllvm/tools/llvm-mca/Views/SummaryView.h
Commit 03282f2fe14e9dd61aaeeda3785f56c7ccb4f3c9 by mizvekov
[clang] C++98 implicit moves are back with a vengeance

After taking C++98 implicit moves out in D104500,
we put it back in, but now in a new form which preserves
compatibility with pure C++98 programs, while at the same time
giving almost all the goodies from P1825.

* We use the exact same rules as C++20 with regard to which
  id-expressions are move eligible. The previous
  incarnation would only benefit from the proper subset which is
  copy elidable. This means we can additionally implicitly move:
  * Parameters.
  * RValue references.
  * Exception variables.
  * Variables with higher-than-natural required alignment.
  * Objects with different type from the function return type.
* We preserve the two-overload resolution, with one small tweak to the
  first one: If we either pick a (possibly converting) constructor which
  does not take an rvalue reference, or a user conversion operator which
  is not ref-qualified, we abort into the second overload resolution.
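
The parameter case from the list above can be observed directly. The sketch below is written in C++11 syntax for brevity rather than in the C++98 mode this patch targets, and `Tracker` and `pass_through` are hypothetical names; it only shows that returning a by-value parameter selects the move constructor.

```cpp
#include <cassert>

struct Tracker {
  bool move_constructed = false;
  Tracker() = default;
  Tracker(const Tracker&) {}
  Tracker(Tracker&&) : move_constructed(true) {}
};

// Returning a by-value parameter: copy elision does not apply to
// parameters, so the return value is built with the move constructor.
inline Tracker pass_through(Tracker t) { return t; }

inline void check_parameter_move() {
  Tracker result = pass_through(Tracker());
  assert(result.move_constructed);
}
```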

This gives C++98 almost all the implicit move patterns which we had created test
cases for, while at the same time preserving the meaning of these
three patterns, which are found in pure C++98 programs:
* Classes with both const and non-const copy constructors, but no move
  constructors, continue to have their non-const copy constructor
  selected.
* We continue to reject as ambiguous the following pattern:
```
struct A { A(B &); };
struct B { operator A(); };
A foo(B x) { return x; }
```
* We continue to pick the copy constructor in the following pattern:
```
class AutoPtrRef { };
struct AutoPtr {
  AutoPtr(AutoPtr &);
  AutoPtr();

  AutoPtr(AutoPtrRef);
  operator AutoPtrRef();
};
AutoPtr test_auto_ptr() {
  AutoPtr p;
  return p;
}
```

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Reviewed By: Quuxplusone

Differential Revision: https://reviews.llvm.org/D105756
The file was modifiedclang/test/SemaCXX/conversion-function.cpp
The file was modifiedclang/lib/Sema/SemaStmt.cpp
The file was modifiedclang/test/CXX/class/class.init/class.copy.elision/p3.cpp
The file was modifiedclang/test/SemaObjCXX/block-capture.mm
Commit 405eefe46497bde580c28ce2d2b79f0e96f2a1d0 by jonathan.l.peyton
[OpenMP][NFC] Change comment style to eliminate warnings from GCC

Standalone build for OpenMP runtime using GCC is giving -Wcomment
warnings where a backslash newline is encountered in the // style
comment. This switches the // style for /* style to silence the
warnings.
The file was modifiedopenmp/runtime/src/kmp_os.h
Commit b5f4ac4c11b041ab9dfed42a7133d1eca6536aaa by amy.kwan1
[PowerPC] Add FI alignment check if the addressing mode is DS/DQ-Form, emit X-Form if necessary.

This patch adds a function that checks whether or not the frame index
is aligned when the computed addressing mode is an aligned D-Form (DS or DQ-Form).
If the frame index appears to be unaligned within these two modes, reset
the mode to X-Form in order to fall back to selecting X-Form loads.
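
The decision described above can be sketched as a small pure function. This is a simplified model under stated assumptions, not the PPCISelLowering code: the enum and function names are hypothetical, and it assumes DS-Form needs a multiple of 4 bytes and DQ-Form a multiple of 16 bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the addressing modes involved.
enum class AddrMode { DSForm, DQForm, XForm };

// Assumed alignment requirement in bytes for each mode.
constexpr uint64_t required_alignment(AddrMode mode) {
  return mode == AddrMode::DSForm ? 4u : mode == AddrMode::DQForm ? 16u : 1u;
}

// Keep the mode if the frame object is sufficiently aligned, otherwise
// fall back to X-Form, which has no displacement alignment requirement.
constexpr AddrMode adjust_for_frame_alignment(AddrMode mode,
                                              uint64_t frame_alignment) {
  return frame_alignment % required_alignment(mode) == 0 ? mode
                                                         : AddrMode::XForm;
}

inline void check_addr_mode_fallback() {
  assert(adjust_for_frame_alignment(AddrMode::DQForm, 16) == AddrMode::DQForm);
  assert(adjust_for_frame_alignment(AddrMode::DQForm, 8) == AddrMode::XForm);
  assert(adjust_for_frame_alignment(AddrMode::DSForm, 8) == AddrMode::DSForm);
  assert(adjust_for_frame_alignment(AddrMode::DSForm, 2) == AddrMode::XForm);
}
```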

A test case is added to ensure that the test emits X-Form loads and not DQ-Form
loads since the frame index is not aligned within the test case.

Differential Revision: https://reviews.llvm.org/D105661
The file was addedllvm/test/CodeGen/PowerPC/unaligned-dqform-ld.ll
The file was modifiedllvm/lib/Target/PowerPC/PPCISelLowering.cpp
Commit 1e670dc7d78427156c252317b3571576d465043f by craig.topper
[RISCV] Use DIVUW/REMUW/DIVW instructions for i8/i16/i32 udiv/urem/sdiv when LHS is constant.

We don't really have optimizations for division with a constant
LHS. If we don't use a W instruction we end up needing to sign
or zero extend the RHS to use the 64-bit instruction.

I had to sign_extend i32 constants on the LHS instead of using
any_extend which becomes zero_extend. If we don't do this, constants
that were originally negative become harder to materialize. I think
this problem exists for more of our W instruction cases, for example
(i32 (shl -1, X)), but we don't have lit tests. I'll work on that
as a follow up.
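
The materialization cost difference can be illustrated with plain integers. The sketch below is an assumption-laden model rather than the lowering code: the helper names are hypothetical, and "cheap to materialize" is approximated as fitting a 12-bit signed immediate.

```cpp
#include <cassert>
#include <cstdint>

// Widen an i32 constant to 64 bits in the two ways discussed above.
constexpr int64_t sign_extend_i32(int32_t v) { return static_cast<int64_t>(v); }
constexpr int64_t zero_extend_i32(int32_t v) {
  return static_cast<int64_t>(static_cast<uint32_t>(v));
}

// Rough proxy for "one instruction": fits a 12-bit signed immediate.
constexpr bool fits_simm12(int64_t v) { return v >= -2048 && v <= 2047; }

inline void check_extension_cost() {
  // Sign extension keeps -1 as -1, which fits the immediate field.
  assert(sign_extend_i32(-1) == -1);
  assert(fits_simm12(sign_extend_i32(-1)));
  // Zero extension turns -1 into 0xFFFFFFFF, which does not fit.
  assert(zero_extend_i32(-1) == 4294967295LL);
  assert(!fits_simm12(zero_extend_i32(-1)));
}
```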

I also left a FIXME for enabling W instruction for RHS constants
under -Oz.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D105769
The file was modifiedllvm/test/CodeGen/RISCV/div.ll
The file was modifiedllvm/lib/Target/RISCV/RISCVISelLowering.cpp
The file was modifiedllvm/test/CodeGen/RISCV/rem.ll
Commit 04942a7ffc716d2f782402089201cadfbbbb2a04 by Louis Dionne
[libc++] NFC: Add comment for running macOS CI setup script remotely
The file was modifiedlibcxx/utils/ci/macos-ci-setup
Commit 424f14f0d2e98b83a41bdc7408f15d28aaa4cbd0 by jonathan.l.peyton
[OpenMP] Fix one sign-compare warning from GCC
The file was modifiedopenmp/runtime/src/kmp_runtime.cpp
Commit 303ddb60a2d28fb7603266d8977f69ac77b194dd by tstellar
Fix utils/update_cc_test_checks/check-globals.test on stand-alone builds

We want to use LLVM_EXTERNAL_LIT if defined for the %lit substitution.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D105873
The file was modifiedclang/test/lit.site.cfg.py.in
The file was modifiedclang/test/utils/update_cc_test_checks/lit.local.cfg
The file was modifiedclang/test/CMakeLists.txt
Commit 2a399e60b6ea74aca47881b48414a5198a868cc3 by Louis Dionne
[libc++] Add a CI job for macOS on arm64 hardware 🥳

Differential Revision: https://reviews.llvm.org/D105848
The file was modifiedlibcxxabi/test/thread_local_destruction_order.pass.cpp
The file was modifiedlibcxx/utils/ci/buildkite-pipeline.yml
Commit 2bc07083a258fdbbafc9c0381e936f441f93af70 by Vitaly Buka
[sanitizer] Fix VSNPrintf %V on Windows
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_common.h
The file was modifiedcompiler-rt/lib/sanitizer_common/tests/sanitizer_printf_test.cpp
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_printf.cpp
Commit f26deb4e6ba7e00c57b4be888c4d20c95a881154 by vsavchenko
[analyzer][solver][NFC] Introduce ConstraintAssignor

The new component is a symmetric response to SymbolicRangeInferrer.
While the latter is the unified component that answers all
questions about what the solver knows about a particular symbolic
expression, the assignor associates new constraints (aka "assumes")
with symbolic expressions and can imply additional knowledge that
the solver can extract and use later on.

- Why do we need it and why is SymbolicRangeInferrer not enough?

As noted above, the inferrer only helps us to get the most
precise range information based on the existing knowledge and on the
mathematical foundations of different operations that symbolic
expressions actually represent.  It doesn't introduce new constraints.

The assignor, on the other hand, can impose constraints on other
symbols using the same domain knowledge.

- But for some expressions, SymbolicRangeInferrer looks into constraints
  for similar expressions, why can't we do that for all the cases?

That's correct!  But in order to do something like this, we should
have a finite number of possible "similar expressions".

Let's say we are asked about `$a - $b` and we know something about
`$b - $a`.  The inferrer can invert this expression and check
constraints for `$b - $a`.  This is simple!
But let's say we are asked about `$a` and we know that `$a * $b != 0`.
In this situation, we can imply that `$a != 0`, but the inferrer shouldn't
try every possible symbolic expression `X` to check if `$a * X` or
`X * $a` is constrained to non-zero.

With the assignor mechanism, we can catch this implication right at
the moment we associate `$a * $b` with non-zero range, and set similar
constraints for `$a` and `$b` as well.
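
A toy model makes the asymmetry concrete. The sketch below is not the RangeConstraintManager code; `TinyConstraints` and its members are hypothetical, and symbols are plain strings. It records the implied facts at assumption time so that a later query for `$a` is a simple lookup.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical toy store of "known non-zero" facts keyed by expression text.
struct TinyConstraints {
  std::set<std::string> non_zero;

  // Assignor-style step: when lhs * rhs is assumed non-zero, record the
  // same fact for both operands at that moment.
  void assume_product_non_zero(const std::string& lhs, const std::string& rhs) {
    non_zero.insert(lhs + " * " + rhs);
    non_zero.insert(lhs);
    non_zero.insert(rhs);
  }

  // Inferrer-style query: a lookup, with no search over all products.
  bool known_non_zero(const std::string& expr) const {
    return non_zero.count(expr) != 0;
  }
};

inline void check_tiny_constraints() {
  TinyConstraints constraints;
  assert(!constraints.known_non_zero("$a"));
  constraints.assume_product_non_zero("$a", "$b");
  assert(constraints.known_non_zero("$a"));
  assert(constraints.known_non_zero("$b"));
  assert(constraints.known_non_zero("$a * $b"));
}
```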

Differential Revision: https://reviews.llvm.org/D105692
The file was modifiedclang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
Commit 60bd8cbc0c84a41146b1ad6c832fa75f48cd2568 by vsavchenko
[analyzer][solver][NFC] Refactor how we detect (dis)equalities

This patch simplifies the way we deal with (dis)equalities.
Due to the symmetry between constraint handler and range inferrer,
we can have very similar implementations of logic handling
questions about (dis)equality and assumptions involving (dis)equality.

It also helps us to remove one more visitor, and removes the uncertainty
about whether we found all the right places to put `trackNE` and `trackEQ`.

Differential Revision: https://reviews.llvm.org/D105693
The file was modifiedclang/test/Analysis/equality_tracking.c
The file was modifiedclang/lib/StaticAnalyzer/Core/RangeConstraintManager.cpp
Commit ce25eb0b71bfcd104afd300c2eb2fb5982f827e8 by Vitaly Buka
[NFC][sanitizer] Remove trailing whitespace
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_common.h
Commit 6245252d4c8c7c9b1be5b9e6a876be9776c000e4 by listmail
[test] Add a SCEV backedge computation test with an explicit zero stride
The file was modifiedllvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
Commit 01d3a3dcabaf862581b1d1aee604fcee6a18b240 by tra
[CUDA] Only allow NVIDIA offload-arch during CUDA compilation.

Otherwise, if someone specifies a valid AMD arch, we may end up triggering an
assertion on unexpected arch later on.

Differential Revision: https://reviews.llvm.org/D105295
The file was modifiedclang/lib/Driver/Driver.cpp
The file was modifiedclang/test/Driver/cuda-bad-arch.cu
Commit 43c7ca8e4963beb2e5a57639f20b8f43608296d7 by Jon Roelofs
[AArch64][GlobalISel] Legalize store <2 x i16>

Differential revision: https://reviews.llvm.org/D105912
The file was modifiedllvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
The file was modifiedllvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store.mir
Commit eba638dbbb77ca2a446fd76b4f52ad85640da4f9 by Jon Roelofs
[AArch64][GlobalISel] Legalize load <2 x i16>

Differential revision: https://reviews.llvm.org/D105913
The file was modifiedllvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
The file was modifiedllvm/test/CodeGen/AArch64/GlobalISel/legalize-load-store.mir
Commit e4585d3f4e1f076ff12db65259924492f5912b19 by wei.huang
Revert "[PowerPC][NFC] Power ISA features for Semachecking"

This reverts commit 10e0cdfc6526578c8892d895c0448e77cb9ba876.
The file was modifiedclang/lib/Basic/Targets/PPC.cpp
The file was modifiedllvm/lib/Target/PowerPC/PPC.td
The file was modifiedllvm/lib/Target/PowerPC/PPCSubtarget.cpp
The file was modifiedclang/include/clang/Basic/DiagnosticSemaKinds.td
The file was modifiedclang/lib/Sema/SemaChecking.cpp
The file was modifiedclang/lib/Basic/Targets/PPC.h
The file was modifiedllvm/lib/Target/PowerPC/PPCInstrInfo.td
The file was modifiedllvm/lib/Target/PowerPC/PPCSubtarget.h
Commit 781929b4236bc34681fb0783cf7b6021109fe28b by wei.huang
[PowerPC][NFC] Power ISA features for Semachecking

[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.

Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>

Differential revision: https://reviews.llvm.org/D105501
The file was modifiedclang/include/clang/Basic/DiagnosticSemaKinds.td
The file was modifiedllvm/lib/Target/PowerPC/PPCInstrInfo.td
The file was modifiedclang/lib/Basic/Targets/PPC.h
The file was addedclang/test/Driver/ppc-isa-features.cpp
The file was modifiedclang/lib/Basic/Targets/PPC.cpp
The file was modifiedllvm/lib/Target/PowerPC/PPCSubtarget.cpp
The file was modifiedclang/lib/Sema/SemaChecking.cpp
The file was modifiedllvm/lib/Target/PowerPC/PPCSubtarget.h
The file was modifiedllvm/lib/Target/PowerPC/PPC.td
Commit 308d38128333af65455e8343a620b40a099e896a by tlively
[WebAssembly] Generate checks for simd-load-store-alignment.ll

This will make it easier to update these tests as we add support for generating
more SIMD loads and stores with custom alignments.

Differential Revision: https://reviews.llvm.org/D105862
The file was modifiedllvm/test/CodeGen/WebAssembly/simd-load-store-alignment.ll
Commit e56b2e57067652710418973e11bb9b118f37b177 by nikita.ppv
[InstCombine] Precommit tests for D105088 (NFC)

Add tests for D105088, as well as an option to disable the
(generally) unsound inttoptr of ptrtoint optimization.

Differential Revision: https://reviews.llvm.org/D105771
The file was modifiedllvm/lib/IR/Instructions.cpp
The file was addedllvm/test/Transforms/InstCombine/ptr-int-ptr-icmp.ll
Commit 3e5cff19fdae2515f87c08dc8b0e483751165153 by Jon Roelofs
[Tests] Fix test broken by: 43c7ca8e4963 [AArch64][GlobalISel] Legalize store <2 x i16>
The file was modifiedllvm/test/CodeGen/AArch64/arm64-rev.ll
Commit 087310c71e5c1c70818ac62acd781860d59a6ce7 by listmail
[SCEV] Strengthen inference of RHS > Start in howManyLessThans

Split off from D105216 to simplify review.  Rewritten with a lambda to be easier to follow.  Comments clarified.

Sorry for no test case, this is tricky to exercise with the current structure of the code.  It's about to be hit more frequently in a follow up patch, and the change itself is simple.
The file was modifiedllvm/lib/Analysis/ScalarEvolution.cpp
Commit 25629bb45f0a4b8c8e99dbde4f4a7e3d980b9fd7 by tra
Fix cuda-bad-arch.cu test.

Tests for correctness of HIP architecture need `-x hip`
The file was modifiedclang/test/Driver/cuda-arch-translation.cu
The file was modifiedclang/test/Driver/cuda-bad-arch.cu
The file was modifiedclang/test/Driver/cuda-flush-denormals-to-zero.cu
Commit 5ca9cf0e6b15647a4f6959c1fc1c23b9f6cb0cba by listmail
[tests] Precommit a test case from D105216
The file was modifiedllvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
Commit 3ea8860afb302f628703a57226e5466091b2c418 by thakis
[gn build] (manually) port 303ddb60a2d2
The file was modifiedllvm/utils/gn/secondary/clang/test/BUILD.gn
Commit 5d1ba534043707a7b41542e9d1e514483f88503a by efriedma
[LoopReroll] Add an extra defensive check to avoid SCEV assertion.

Make sure getMinusSCEV() didn't return a pointer.  The following check
would never succeed if it was a pointer, anyway, but calling
getMulExpr() on a pointer SCEV now asserts.
The file was modifiedllvm/lib/Transforms/Scalar/LoopRerollPass.cpp
The file was modifiedllvm/test/Transforms/LoopReroll/basic.ll
Commit b28c465e4902f579799bc94512197c04a5ad4a29 by efriedma
[NFC] Use CHECK-LABEL in trip-count-unknown-stride.ll
The file was modifiedllvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
Commit 6296e109728d58805004739530b8f265c6a130b9 by thomasraoux
[mlir][Vector] Remove Vector TupleOp as it is unused

TupleOp is not used anymore after recent refactoring.

Differential Revision: https://reviews.llvm.org/D105924
The file was modifiedmlir/lib/Dialect/Vector/VectorOps.cpp
The file was modifiedmlir/include/mlir/Dialect/Vector/VectorOps.td
Commit fb9c5c3dce27b352534641dbb6e3cb8c05da7bc9 by abidh
[lld][AMDGPU] Handle R_AMDGPU_REL16 relocation.

This patch is a followup patch to https://reviews.llvm.org/D105760 which adds this relocation. This handles the relocation in lld.

The s_branch family of instructions does the following:
PC = PC + signext(simm * 4) + 4

so we do the opposite on the target address before writing it in the instruction stream.
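
As a rough illustration of "doing the opposite", the formula can be inverted to obtain the immediate from a target address. This is a sketch, not the lld code: the function names are hypothetical, and the alignment and range checks are assumptions added for the example.

```python
def encode_s_branch_simm16(target, pc):
    """Return the 16-bit immediate for a branch at `pc` that jumps to `target`.

    Inverts PC = PC + signext(simm * 4) + 4.
    """
    delta = target - pc - 4
    if delta % 4 != 0:
        # Assumption for this sketch: reject offsets that are not word-aligned.
        raise ValueError("branch offset is not a multiple of 4")
    simm = delta // 4
    if not -(1 << 15) <= simm < (1 << 15):
        raise ValueError("branch target out of range for a 16-bit immediate")
    return simm & 0xFFFF  # two's-complement encoding of the signed value


def decode_s_branch_target(simm16, pc):
    """Apply the formula from the message to recover the branch target."""
    simm = simm16 - (1 << 16) if simm16 & 0x8000 else simm16
    return pc + simm * 4 + 4
```

For example, a branch at `0x100` to `0x110` encodes as `3`, and decoding `3` at `0x100` gives `0x110` again.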

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D105761
The file was modifiedlld/ELF/Arch/AMDGPU.cpp
The file was addedlld/test/ELF/amdgpu-relocs2.s
Commit 7efe3887858fe77da5c6687e3ac9ed9b00f9ed4e by arthur.j.odwyer
[libc++] [test] Add a missing `()` in TestEachIntegralType.
The file was modifiedlibcxx/test/support/atomic_helpers.h
Commit ba8dcaef0d79ae0174cdcea6d6f62015266c1d40 by Vitaly Buka
Revert "sanitizer_common: optimize memory drain"

Breaks https://lab.llvm.org/buildbot/#/builders/anitizer-windows

This reverts commit d89d3dfae17d7795dc1ef013db66272020de1959.
The file was modifiedcompiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h
Commit d558bfaf8e1e8e7814053abc406cdaaed00cf784 by Vitaly Buka
[NFC][sanitizer] clang-format part of D105778
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
The file was modifiedcompiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
Commit 5105a77035d080a5f14668b136c8def52b182ce2 by Vedant Kumar
[docs/llvm-cov] Document -compilation-dir

Document the `-compilation-dir` option added in D100232.

Differential Revision: https://reviews.llvm.org/D105826
The file was modifiedllvm/docs/CommandGuide/llvm-cov.rst
Commit d12a7f142e2430f4983c668d910897db8cc2afc7 by hedingarcia
[libc] Add on float properties for precision floating point numbers in FloatProperties.h

Defined constants that express the number of bits for the exponent in single and double precision. Added bit mask values and other properties for quad precision floating point numbers that specifically target architectures defined in PlatformDefs.h. The exponentWidth values were added to be used in LongDoubleBitsX86.h, where the implementation that sets the exponent component uses this and the bitWidth value. The need occurred because of the 80-bit quad precision implementation.
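
For orientation, the sketch below shows how an exponent width and a mantissa width give an exponent mask for the IEEE 754 single and double formats. The dictionary layout and function names are hypothetical and do not mirror FloatProperties.h.

```python
import struct

# IEEE 754 binary32/binary64: 1 sign bit, then the exponent, then the mantissa.
FLOAT_PROPERTIES = {
    "float": {"bit_width": 32, "exponent_width": 8, "mantissa_width": 23},
    "double": {"bit_width": 64, "exponent_width": 11, "mantissa_width": 52},
}


def exponent_mask(props):
    """Mask that selects the exponent field inside the raw bits."""
    return ((1 << props["exponent_width"]) - 1) << props["mantissa_width"]


def biased_exponent(value, kind):
    """Extract the biased exponent of `value` stored as `kind`."""
    props = FLOAT_PROPERTIES[kind]
    fmt_float, fmt_int = ("<f", "<I") if kind == "float" else ("<d", "<Q")
    bits = struct.unpack(fmt_int, struct.pack(fmt_float, value))[0]
    return (bits & exponent_mask(props)) >> props["mantissa_width"]
```

With these definitions, `biased_exponent(1.0, "float")` is 127 and `biased_exponent(1.0, "double")` is 1023, the exponent biases of the two formats.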

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D105153
The file was modifiedlibc/utils/FPUtil/FloatProperties.h
Commit 9f1f666b30c03376d3816f7b2d18c93073517330 by Vitaly Buka
[NFC][sanitizer] Move MemoryMapper out of SizeClassAllocator64

Part of D105778
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit 1c69005c2e11414669ac8ba094a9b059920936db by martin
[libcxx] [docs] Acknowledge that the library is known to work in some configs outside of what's tested in CI

Differential Revision: https://reviews.llvm.org/D105888
The file was modifiedlibcxx/docs/index.rst
Commit 4df591b5c960affd1612e330d0c9cd3076c18053 by listmail
[SCEV] Handle zero stride correctly in howManyLessThans

This is split from D105216, but the code is hoisted much earlier into the path where we can actually get a zero stride flowing through. Some fairly simple proofs handle the cases which show up in practice. The only test changes are the cases where we really do need a non-zero divider to produce the right result.

Differential Revision: https://reviews.llvm.org/D105921
The file was modifiedllvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
The file was modifiedllvm/lib/Analysis/ScalarEvolution.cpp
Commit f990da59c5df840526baeb70bc5b5594fb5599ed by Vitaly Buka
[sanitizer] Few more NFC changes from D105778
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit a16071e409a55cfc83e59eb738fd6144207dd5d1 by caitlyncano
[libc] Don't pass -fpie/-ffreestanding on Windows

The current compile options function hardcodes the -fpie and
-ffreestanding flags, which don't exist on Windows. This patch sets the
compilation flags conditionally based on the OS specifics.

Reviewed By: sivachandra, aeubanks

Differential Revision: https://reviews.llvm.org/D105643
The file was modifiedlibc/cmake/modules/LLVMLibCObjectRules.cmake
Commit a5a337e55ed2e265358ac0a2ce6db1af2dd69e07 by hedingarcia
[libc] Capture floating point encoding and arrange it sequentially in memory

Redefined FPBits.h and LongDoubleBitsX86 so that their implementation works on both
Windows and Linux while keeping a packed memory layout for the floating point numbers,
so that the size in memory is the same as that of the floating point data type.
This change was necessary because the previous attribute((packed)) specification in the struct was not working
on Windows as it did on Linux, and consequently static_asserts in the FPBits.h file were failing.

Reviewed By: aeubanks, sivachandra

Differential Revision: https://reviews.llvm.org/D105561
The file was modifiedlibc/test/src/math/NextAfterTest.h
The file was modifiedlibc/utils/FPUtil/NormalFloat.h
The file was modifiedlibc/utils/FPUtil/NextAfterLongDoubleX86.h
The file was modifiedlibc/utils/FPUtil/BasicOperations.h
The file was modifiedlibc/test/src/math/LdExpTest.h
The file was modifiedlibc/utils/FPUtil/Hypot.h
The file was modifiedlibc/utils/FPUtil/SqrtLongDoubleX86.h
The file was modifiedlibc/test/src/math/SqrtTest.h
The file was modifiedlibc/utils/FPUtil/LongDoubleBitsX86.h
The file was modifiedlibc/utils/FPUtil/NearestIntegerOperations.h
The file was modifiedlibc/utils/FPUtil/DivisionAndRemainderOperations.h
The file was modifiedlibc/utils/FPUtil/Sqrt.h
The file was modifiedlibc/utils/FPUtil/FPBits.h
The file was modifiedlibc/utils/FPUtil/TestHelpers.cpp
The file was modifiedlibc/utils/FPUtil/ManipulationFunctions.h
The file was modifiedlibc/test/src/math/RoundToIntegerTest.h
The file was modifiedlibc/utils/FPUtil/generic/FMA.h
Commit 24129fbc9aa006badc2e6e8432980cb94aba090c by ayermolo
[LLD] Adding support for RELA for CG Profile.

This is a follow up to https://reviews.llvm.org/D104080, and https://github.com/llvm/llvm-project/commit/ca3bdb57fa1ac98b711a735de048c12b5fdd8086#diff-e64a48fabe31db213a631fdc5f2acb51bdddf3f16a8fb2928784f4c579229585. The implementation of the call graph profile was changed from a black box section to a relocation approach. This was done to be compatible with post-processing tools like strip/objcopy and their llvm equivalents. When they are invoked on an object file before the final linking step, with this new approach the correctness of the symbol indices is preserved.

Unlike the llvm tools, the GNU binutils tools change the REL section to a RELA section, for example when strip -S is run on the ELF object files as an intermediate step before linking. To preserve compatibility, this patch extends the implementation in LLD and ELFDumper to support both REL and RELA sections for the call graph profile.

Reviewed By: MaskRay, jhenderson

Differential Revision: https://reviews.llvm.org/D105217
The file was modifiedllvm/tools/llvm-readobj/ELFDumper.cpp
The file was addedlld/test/ELF/cgprofile-rela.test
The file was modifiedlld/ELF/Driver.cpp
The file was modifiedlld/ELF/InputFiles.h
The file was modifiedlld/ELF/InputFiles.cpp
The file was modifiedllvm/test/tools/llvm-readobj/ELF/call-graph-profile.test
Commit d4e2693a679927a62dd738dd3bba24863dcd290a by dschuff
[WebAssembly] Run varargs codegen test with non-emscripten triple

This is a followup from D105749 to cover both triples in the case
where they differ.
The file was modifiedllvm/test/CodeGen/WebAssembly/varargs.ll
Commit 8a2720d81e159fc71550b10b4c34f1de912d5880 by jpienaar
Add more types to the LLVM dialect C API

This includes:
- void type
- array types
- function types
- literal (unnamed) struct types

Reviewed By: jpienaar, ftynse

Differential Revision: https://reviews.llvm.org/D105908
The file was modifiedmlir/include/mlir-c/Dialect/LLVM.h
The file was modifiedmlir/lib/CAPI/Dialect/LLVM.cpp
The file was modifiedmlir/test/CAPI/llvm.c
Commit 123e8dfcf86a74eb7ba08f33681df581d1be9dbd by ajcbik
[mlir][sparse] add support for std unary operations

Adds zero-preserving unary operators from std. Also adds xor.
Performs minor refactoring to remove the "zero" node, and pushes
the irregular logic for negi (not supported in std) into one place.

Reviewed By: gussmith23

Differential Revision: https://reviews.llvm.org/D105928
The file was modifiedmlir/test/Dialect/SparseTensor/sparse_fp_ops.mlir
The file was modifiedmlir/test/Dialect/SparseTensor/sparse_int_ops.mlir
The file was modifiedmlir/unittests/Dialect/SparseTensor/MergerTest.cpp
The file was modifiedmlir/lib/Dialect/SparseTensor/Transforms/Sparsification.cpp
The file was modifiedmlir/lib/Dialect/SparseTensor/Utils/Merger.cpp
The file was modifiedmlir/include/mlir/Dialect/SparseTensor/Utils/Merger.h
Commit f2b5e438aa3620cd60d115cad8dcb39cc417c8a8 by ravishankarm
[mlir][Tensor] Implement `reifyReturnTypeShapesPerResultDim` for `tensor.insert_slice`.

Differential Revision: https://reviews.llvm.org/D105852
The file was modifiedmlir/lib/Dialect/Tensor/IR/CMakeLists.txt
The file was modifiedutils/bazel/llvm-project-overlay/mlir/BUILD.bazel
The file was modifiedmlir/include/mlir/Dialect/Tensor/IR/TensorOps.td
The file was modifiedmlir/lib/Dialect/Tensor/IR/TensorOps.cpp
The file was addedmlir/test/Dialect/Tensor/resolve-shaped-type-result-dims.mlir
The file was modifiedmlir/include/mlir/Dialect/Tensor/IR/Tensor.h
Commit 18c19414eb70578d4c487d6f4b0f438aead71d6a by wei.huang
[PowerPC] Add PowerPC compare and multiply related builtins and intrinsics for XL compatibility

This patch is in a series of patches to provide builtins for compatibility
with the XL compiler. This patch adds the builtins and intrinsics for compare
and multiply related operations.

Reviewed By: nemanjai, #powerpc

Differential revision: https://reviews.llvm.org/D102875
The file was addedclang/test/CodeGen/builtins-ppc-xlcompat-multiply-64bit-only.c
The file was modifiedllvm/lib/Target/PowerPC/PPCInstr64Bit.td
The file was addedclang/test/CodeGen/builtins-ppc-xlcompat-pwr9-error.c
The file was addedllvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-compare.ll
The file was modifiedclang/lib/Basic/Targets/PPC.cpp
The file was addedclang/test/CodeGen/builtins-ppc-xlcompat-multiply.c
The file was modifiedclang/include/clang/Basic/BuiltinsPPC.def
The file was modifiedllvm/include/llvm/IR/IntrinsicsPowerPC.td
The file was modifiedclang/lib/Sema/SemaChecking.cpp
The file was addedclang/test/CodeGen/builtins-ppc-xlcompat-pwr9-64bit.c
The file was addedllvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-multiply-64bit-only.ll
The file was addedllvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-multiply.ll
The file was addedllvm/test/CodeGen/PowerPC/builtins-ppc-xlcompat-compare-64bit-only.ll
The file was modifiedllvm/lib/Target/PowerPC/PPCInstrInfo.td
The file was addedclang/test/CodeGen/builtins-ppc-xlcompat-pwr9.c
Commit 9955c652eafdcb5f1d16ee3db857f03ee7e5cfbc by gcmn
[NFC][MLIR][std] Clean up ArithmeticCastOps

The documentation on these was out of sync with the implementation. Also
the declaration of inputs was repeated when it is already part of the
ArithmeticCastOp definition.

Reviewed By: rriddle

Differential Revision: https://reviews.llvm.org/D105934
The file was modifiedmlir/include/mlir/Dialect/StandardOps/IR/Ops.td
Commit 5df99954392e3a4448e4ff43d4cf644bc06bfa92 by Vitaly Buka
[NFC][sanitizer] Rename some MemoryMapper members

Part of D105778
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit afa3fedcda98db4d47694ed596270a5396074224 by Vitaly Buka
[NFC][sanitizer] Extract DrainHalfMax

Part of D105778
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h
Commit bb8c7a980fe487eb322d38641db9145a6b6cb1d4 by efriedma
[ScalarEvolution] Make isKnownNonZero handle more cases.

Using an unsigned range instead of signed ranges is a bit more precise.
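
One way to see why an unsigned bound can be more precise is a simplified 8-bit model. This is an illustrative sketch, not the ScalarEvolution code: both function names are hypothetical, and the "signed" check is assumed to mean "known negative or known positive".

```python
BITS = 8


def to_signed(value):
    """Reinterpret an unsigned BITS-wide value as two's-complement signed."""
    return value - (1 << BITS) if value >= 1 << (BITS - 1) else value


def nonzero_by_signed_bounds(values):
    """Non-zero only if every value is negative or every value is positive."""
    signed = [to_signed(v) for v in values]
    return max(signed) < 0 or min(signed) > 0


def nonzero_by_unsigned_min(values):
    """Non-zero if the unsigned minimum is not zero."""
    return min(values) != 0


# Values 1..200 never hit zero, but as signed bytes they span -128..127.
values = range(1, 201)
print(nonzero_by_unsigned_min(values), nonzero_by_signed_bounds(values))  # True False
```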

Differential Revision: https://reviews.llvm.org/D105941
The file was modifiedllvm/test/Analysis/ScalarEvolution/trip-count9.ll
The file was modifiedllvm/lib/Analysis/ScalarEvolution.cpp
Commit eebe841a47cbbd55bdcc32da943c92d18f88a5b8 by Matthew.Arsenault
RegAlloc: Allow targets to split register allocation

AMDGPU normally spills SGPRs to VGPRs. Previously, since all register
classes are handled at the same time, this was problematic. We don't
know ahead of time how many registers will be needed to be reserved to
handle the spilling. If no VGPRs were left for spilling, we would have
to try to spill to memory. If the spilled SGPRs were required for exec
mask manipulation, it is highly problematic because the lanes active
at the point of spill are not necessarily the same as at the restore
point.

Avoid this problem by fully allocating SGPRs in a separate regalloc
run from VGPRs. This way we know the exact number of VGPRs needed, and
can reserve them for a second run.  This fixes the most serious
issues, but it is still possible using inline asm to make all VGPRs
unavailable. Start erroring in the case where we ever would require
memory for an SGPR spill.

This is implemented by giving each regalloc pass a callback which
reports if a register class should be handled or not. A few passes
need some small changes to deal with leftover virtual registers.
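
The callback idea can be sketched as an allocator that only touches the register classes accepted by a filter and returns the rest as leftovers. This is a toy model with hypothetical names (`allocate`, `should_allocate_class`) and no spilling; it does not reflect the actual regalloc interfaces.

```python
def allocate(virtual_regs, free_physical, should_allocate_class):
    """Assign registers only for classes accepted by the filter callback.

    Returns (assignment, leftover), where leftover holds the virtual
    registers that this run skipped.
    """
    assignment, leftover = {}, []
    for name, reg_class in virtual_regs:
        if not should_allocate_class(reg_class):
            leftover.append((name, reg_class))
            continue
        # Toy policy: take the next free physical register of that class.
        assignment[name] = free_physical[reg_class].pop(0)
    return assignment, leftover


vregs = [("%0", "SGPR"), ("%1", "VGPR"), ("%2", "SGPR")]
free = {"SGPR": ["s0", "s1"], "VGPR": ["v0", "v1"]}

# First run handles SGPRs only; the second run handles the remaining VGPRs.
sgpr_run, rest = allocate(vregs, free, lambda rc: rc == "SGPR")
vgpr_run, rest = allocate(rest, free, lambda rc: rc == "VGPR")
```

After the first run the allocator knows exactly which SGPRs were assigned, which is the point at which VGPRs could be set aside for spills before the second run starts.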

In the AMDGPU implementation, a new pass is introduced to take the
place of PrologEpilogInserter for SGPR spills emitted during the first
run.

One disadvantage of this is currently StackSlotColoring is no longer
used for SGPR spills. It would need to be run again, which will
require more work.

Error if the standard -regalloc option is used. Introduce new separate
-sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be
controlled individually. PBQP is not currently supported, so this also
prevents using the unhandled allocator.
The file was modifiedllvm/test/CodeGen/AMDGPU/spill_more_than_wavesize_csr_sgprs.ll
The file was modifiedllvm/test/CodeGen/AMDGPU/virtregrewrite-undef-identity-copy.mir
The file was modifiedllvm/test/CodeGen/AMDGPU/stack-slot-color-sgpr-vgpr-spills.mir
The file was modifiedllvm/test/CodeGen/AMDGPU/GlobalISel/extractelement-stack-lower.ll
The file was modifiedllvm/lib/Target/AMDGPU/SIRegisterInfo.h
The file was addedllvm/test/CodeGen/AMDGPU/sgpr-regalloc-flags.ll
The file was modifiedllvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
The file was modifiedllvm/test/CodeGen/AMDGPU/spill-empty-live-interval.mir
The file was modifiedllvm/lib/CodeGen/TargetPassConfig.cpp
The file was modifiedllvm/test/CodeGen/AMDGPU/pei-build-spill.mir
The file was modifiedllvm/include/llvm/CodeGen/RegAllocRegistry.h
The file was modifiedllvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
The file was modifiedllvm/test/CodeGen/AMDGPU/llc-pipeline.ll
The file was addedllvm/include/llvm/CodeGen/RegAllocCommon.h
The file was modifiedllvm/lib/CodeGen/RegAllocBase.cpp
The file was modifiedllvm/lib/Target/AMDGPU/SILowerSGPRSpills.cpp
The file was modifiedllvm/test/CodeGen/AMDGPU/sibling-call.ll
The file was modifiedllvm/test/CodeGen/AMDGPU/gfx-callable-preserved-registers.ll
The file was modifiedllvm/test/CodeGen/AMDGPU/alloc-aligned-tuples-gfx908.mir
The file was modifiedllvm/test/CodeGen/AMDGPU/alloc-aligned-tuples-gfx90a.mir
The file was modifiedllvm/test/CodeGen/AMDGPU/remat-vop.mir
The file was modifiedllvm/test/CodeGen/AMDGPU/unstructured-cfg-def-use-issue.ll
The file was modifiedllvm/test/CodeGen/AMDGPU/agpr-csr.ll
The file was modifiedllvm/test/CodeGen/AMDGPU/mul24-pass-ordering.ll
The file was modifiedllvm/test/CodeGen/AMDGPU/indirect-call.ll
The file was modifiedllvm/lib/CodeGen/RegAllocFast.cpp
The file was modifiedllvm/test/CodeGen/AMDGPU/callee-frame-setup.ll
The file was modifiedllvm/include/llvm/CodeGen/Passes.h
The file was addedllvm/test/CodeGen/AMDGPU/sgpr-spill-no-vgprs.ll
The file was modifiedllvm/lib/CodeGen/LiveIntervals.cpp
The file was modifiedllvm/lib/CodeGen/RegAllocGreedy.cpp
The file was modifiedllvm/test/CodeGen/AMDGPU/vgpr-tuple-allocation.ll
The file was modifiedllvm/lib/CodeGen/RegAllocBase.h
The file was modifiedllvm/test/CodeGen/AMDGPU/sgpr-spill-wrong-stack-id.mir
The file was modifiedllvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size-vgpr-limit.ll
The file was modifiedllvm/lib/CodeGen/RegAllocBasic.cpp
The file was modifiedllvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
The file was modifiedllvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
The file was modifiedllvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
The file was modifiedllvm/lib/Target/AMDGPU/SIFrameLowering.cpp
Commit 99aebb62fb4f2a39c7f03579facf3a1e176b245d by Vitaly Buka
[NFC][sanitizer] Don't store region_base_ in MemoryMapper

Part of D105778
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
The file was modifiedcompiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
Commit 0024ec59a0f3deb206a21567ac2ebe0fc097ea9d by aeubanks
[NewPM][SimpleLoopUnswitch] Add option to not trivially unswitch

To help with debugging non-trivial unswitching issues.

Don't care about the legacy pass, nobody is using it.

If a pass's string params are empty (e.g. "simple-loop-unswitch"), don't
default to the empty constructor for the pass params. We should still
let the parser take care of it in case the parser has its own defaults.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D105933
The file was modifiedllvm/test/Other/print-passes.ll
The file was addedllvm/test/Transforms/SimpleLoopUnswitch/options.ll
The file was modifiedllvm/lib/Passes/PassBuilder.cpp
The file was modifiedllvm/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h
The file was modifiedllvm/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
The file was modifiedllvm/lib/Passes/PassRegistry.def
Commit 832ba20710ee09b00161ea72cf80c9af800fda63 by Vitaly Buka
sanitizer_common: optimize memory drain

Currently we allocate MemoryMapper per size class.
MemoryMapper mmap's and munmap's internal buffer.
This results in 50 mmap/munmap calls under the global
allocator mutex. Reuse MemoryMapper and the buffer
for all size classes. This radically reduces number of
mmap/munmap calls. Smaller size classes tend to have
more objects allocated, so it's highly likely that
the buffer allocated for the first size class will
be enough for all subsequent size classes.
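
The effect of reusing one mapper can be sketched by counting how often a buffer has to be mapped. This is a toy model: `CountingMapper` and the buffer sizes are hypothetical, and a real mapper would call mmap/munmap where the counter is incremented.

```python
class CountingMapper:
    """Stand-in for a mapper that owns one scratch buffer."""

    def __init__(self):
        self.map_calls = 0
        self.capacity = 0

    def get_buffer(self, size):
        if size > self.capacity:
            # Grow only when the request does not fit the current buffer.
            self.map_calls += 1
            self.capacity = size
        return self.capacity


def drain_per_class(buffer_sizes):
    """Old scheme: a fresh mapper, and so a fresh buffer, per size class."""
    calls = 0
    for size in buffer_sizes:
        mapper = CountingMapper()
        mapper.get_buffer(size)
        calls += mapper.map_calls
    return calls


def drain_shared(buffer_sizes):
    """New scheme: one mapper and one buffer reused for all size classes."""
    mapper = CountingMapper()
    for size in buffer_sizes:
        mapper.get_buffer(size)
    return mapper.map_calls


# Smaller size classes come first and need the largest buffer.
sizes = [4096, 2048, 1024, 512]
print(drain_per_class(sizes), drain_shared(sizes))  # 4 1
```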

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D105778
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_local_cache.h
Commit 3191ac27e396dbd141243b8ca6cf5660c10ddf5c by Matthew.Arsenault
AMDGPU: Try to fix test failure with EXPENSIVE_CHECKS

The machine verifier is enabled by default for EXPENSIVE_CHECKS, so
the pass runs of it would pollute the output here.
The file was modifiedllvm/test/CodeGen/AMDGPU/sgpr-regalloc-flags.ll
Commit 7140382b17df7c33145cc6e9a2df7e84a2259444 by Vitaly Buka
[NFC][sanitizer] Move MemoryMapper template parameter
The file was modifiedcompiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit 8725b382b0a5ea375252d966bafbace62a21e93b by Vitaly Buka
[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer
The file was modifiedcompiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit 5bd7cc4f42488129adb135539c64bb3933d5da4c by Jessica Paquette
[AArch64][GlobalISel] Mark v2s64 -> v2p0 G_INTTOPTR as legal

Allow

```
%x:_<2 x p0> = G_INTTOPTR %y:_<2 x s64>
```

This shows up when building clang for AArch64 with GlobalISel.

Also show that we can select it.

This should match SDAG's behaviour: https://godbolt.org/z/33oqYoaYv

Differential Revision: https://reviews.llvm.org/D105944
The file was modifiedllvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
The file was modifiedllvm/test/CodeGen/AArch64/GlobalISel/select-int-ptr-casts.mir
The file was addedllvm/test/CodeGen/AArch64/GlobalISel/legalize-inttoptr.mir
Commit ed430023e864c3b3ff7f47d5740e5380828c26f6 by Vitaly Buka
Revert "[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer"

Does not compile.

This reverts commit 8725b382b0a5ea375252d966bafbace62a21e93b.
The file was modifiedcompiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
Commit 5738819679fd3bb08c4848129b27c63690d937a5 by aeubanks
Revert "[SCEV] Handle zero stride correctly in howManyLessThans"

This reverts commit 4df591b5c960affd1612e330d0c9cd3076c18053.

Causes crashes, see comments on D105921.
The file was modifiedllvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
The file was modifiedllvm/lib/Analysis/ScalarEvolution.cpp
Commit 6377388c32ffc1f5c054a813d0bc81ac118108af by Jon Roelofs
[AArch64] Fix AArch64::dsub's size
The file was modifiedllvm/lib/Target/AArch64/AArch64RegisterInfo.td
Commit 87c6bf92a9c7722b18643ea73f76623f2463c5bb by Jon Roelofs
[AArch64] rm unused subreg's
The file was modifiedllvm/lib/Target/AArch64/AArch64RegisterInfo.td
Commit 35ce66330a2686878ea0a1da93e0a94961933006 by Vitaly Buka
[NFC][sanitizer] Simplify MapPackedCounterArrayBuffer
The file was modifiedcompiler-rt/lib/sanitizer_common/sanitizer_allocator_primary64.h
The file was modifiedcompiler-rt/lib/sanitizer_common/tests/sanitizer_allocator_test.cpp
Commit 071203845887a2ff0347747bd5864f8738d17eef by hoy
[CSSPGO][llvm-profgen] Allow multiple executable load segments.

The linker or post-link optimizer can create an ELF image with multiple executable segments, each of which will be loaded separately at run time. This breaks the assumption of llvm-profgen, which currently only supports one base load address. As a result, the subsequent mmap events are treated as an overwrite of the first mmap event, which in turn screws up address mapping. Since it is non-trivial to support multiple separate load addresses, and given that on x64 those segments will always be loaded at consecutive addresses (though via separate mmap sys calls), I'm adding error checking logic to bail out if that's violated, and keep using a single load address, which is the address of the first executable segment.

Also changing the disassembly output from printing section offset to printing the virtual address instead, which matches the behavior of objdump.

Differential Revision: https://reviews.llvm.org/D103178
The file was addedllvm/test/tools/llvm-profgen/disassemble.test
The file was modifiedllvm/tools/llvm-profgen/ProfiledBinary.h
The file was removedllvm/test/tools/llvm-profgen/disassemble.s
The file was addedllvm/test/tools/llvm-profgen/symbolize.test
The file was addedllvm/test/tools/llvm-profgen/Inputs/symbolize.ll
The file was addedllvm/test/tools/llvm-profgen/Inputs/multi-load-segs.perfbin
The file was addedllvm/test/tools/llvm-profgen/Inputs/symbolize.perfbin
The file was addedllvm/test/tools/llvm-profgen/multi-load-segs.test
The file was removedllvm/test/tools/llvm-profgen/symbolize.ll
The file was modifiedllvm/test/tools/llvm-profgen/mmapEvent.test
The file was modifiedllvm/tools/llvm-profgen/ProfiledBinary.cpp
The file was addedllvm/test/tools/llvm-profgen/Inputs/multi-load-segs.perfscript
The file was modifiedllvm/tools/llvm-profgen/PerfReader.h
The file was modifiedllvm/tools/llvm-profgen/PerfReader.cpp
Commit 74b99b5c2eacbdef15b99b3e0a8073598f985bb4 by hoy
[CSSPGO] Do not import pseudo probe desc in thinLTO

Previously we relied on pseudo probe descriptors to look up precomputed GUID during probe emission for inlined probes. Since we are moving to always using unique linkage names, GUID for functions can be computed in place from dwarf names. This eliminates the need of importing pseudo probe descs in thinlto, since those descs should be emitted by the original modules.

This significantly reduces thinlto memory footprint in some extreme case where the number of imported modules for a single module is massive.

Test Plan:

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D105248
The file was modifiedllvm/lib/CodeGen/AsmPrinter/PseudoProbePrinter.cpp
The file was addedllvm/test/ThinLTO/X86/Inputs/pseudo-probe-desc-import.ll
The file was modifiedllvm/lib/CodeGen/AsmPrinter/PseudoProbePrinter.h
The file was modifiedllvm/lib/Linker/IRMover.cpp
The file was modifiedllvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
The file was addedllvm/test/ThinLTO/X86/pseudo-probe-desc-import.ll
Commit cda2394d9768f97cbacbbf8a5c6288c1015b981a by hoy
[NFC][CSSPGO] Rename the name of an enum value.
The file was modifiedllvm/tools/llvm-profgen/PerfReader.cpp
Commit 8a0f1163d02c77c6e764929b66c26ba196cfc549 by richard
Fix test trying to write a spurious output file into the source
directory.

This causes test failures if the source directory is read-only.
The file was modified clang/test/CodeGen/builtins-ppc-xlcompat-pwr9-64bit.c
The file was modified clang/test/CodeGen/builtins-ppc-xlcompat-pwr9.c
Commit 205ed009a44c2b04a15aea039d8947e74856f158 by efriedma
[SCEV] Handle zero stride correctly in howManyLessThans

This is split from D105216, but the code is hoisted much earlier into
the path where we can actually get a zero stride flowing through. Some
fairly simple proofs handle the cases which show up in practice. The
only test changes are the cases where we really do need a non-zero
divider to produce the right result.

Recommitting with isLoopInvariant() check.
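
To see why a zero stride needs separate handling, here is a simplified integer model of counting the iterations of `for (i = start; i < end; i += stride)`. It is only an illustration under stated assumptions (unbounded integers, non-negative stride, no overflow) and is not the ScalarEvolution implementation; the function name `iterations_less_than` is hypothetical.

```python
from typing import Optional


def iterations_less_than(start: int, end: int, stride: int) -> Optional[int]:
    """Count body executions of `for (i = start; i < end; i += stride)`.

    Returns None when the count cannot be computed (zero stride with an
    entered loop never terminates in this model).
    """
    if stride < 0:
        raise ValueError("this sketch only models non-negative strides")
    if start >= end:
        # The loop condition fails immediately, whatever the stride is.
        return 0
    if stride == 0:
        # Dividing by the stride would be meaningless here.
        return None
    # Ceiling division: ceil((end - start) / stride).
    return (end - start + stride - 1) // stride
```

The ceiling division is only valid once the stride is known to be non-zero, which is the kind of case the commit message refers to as needing a non-zero divider.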

Differential Revision: https://reviews.llvm.org/D105921
The file was modified llvm/test/Analysis/ScalarEvolution/trip-count-unknown-stride.ll
The file was modified llvm/lib/Analysis/ScalarEvolution.cpp
Commit 1100e4aafea233bc8bbc307c5758a7d287ad3bae by tianshilei1992
[AbstractAttributor] Fold function calls to `__kmpc_is_spmd_exec_mode` if possible

In the device runtime there are many function calls to `__kmpc_is_spmd_exec_mode`
to query the execution mode of the current kernel. In many cases, user programs
only contain target regions executing in one mode. As a consequence, those runtime
function calls will only ever return one value. If we can get rid of these function
calls during compilation, it can potentially improve performance.

In this patch, we use `AAKernelInfo` to analyze kernel execution. Basically, for
each kernel (device) function `F`, we collect all kernel entries `K` that can
reach `F`. A new AA, `AAFoldRuntimeCall`, is created for each call site. In each
iteration, it will check all reaching kernel entries, and update the folded value
accordingly.

In the future we will support more functions.
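
The reachability-based folding described above can be sketched as follows. This is a standalone model, not the Attributor code: the call graph is a plain dictionary, and `reaching_kernels` and `fold_is_spmd` are hypothetical helper names. A call is folded only when every kernel entry that reaches the enclosing function agrees on the mode.

```python
from typing import Dict, Iterable, List, Optional, Set


def reaching_kernels(call_graph: Dict[str, List[str]],
                     kernels: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each function to the set of kernel entries that can reach it."""
    reach: Dict[str, Set[str]] = {}
    for kernel in kernels:
        stack = [kernel]
        seen: Set[str] = set()
        while stack:
            func = stack.pop()
            if func in seen:
                continue
            seen.add(func)
            reach.setdefault(func, set()).add(kernel)
            stack.extend(call_graph.get(func, ()))
    return reach


def fold_is_spmd(reaching: Set[str],
                 kernel_is_spmd: Dict[str, bool]) -> Optional[bool]:
    """Return the constant to fold to, or None if the call must stay."""
    modes = {kernel_is_spmd[k] for k in reaching}
    if len(modes) == 1:
        return next(iter(modes))
    # Mixed modes (or no known reaching kernel): keep the runtime call.
    return None
```

For example, a helper called from both an SPMD kernel and a generic kernel is left alone, while a function reachable only from generic kernels folds to `False`.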

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D105787
The file was added llvm/test/Transforms/OpenMP/is_spmd_exec_mode_fold.ll
The file was modified llvm/test/Transforms/OpenMP/custom_state_machines.ll
The file was modified llvm/lib/Transforms/IPO/OpenMPOpt.cpp
Commit fef5f4456abcb1ea052206db6c232468d70b07f2 by hoy
[CSSPGO][llvm-profgen] Fix a missing initalization

Fixing a missing initialization that was accidentally caused by https://reviews.llvm.org/D103178.
The file was modified llvm/tools/llvm-profgen/ProfiledBinary.h
Commit 597e9c61cee39071141f3c8f31f47561d2844196 by hoy
Revert "[CSSPGO][llvm-profgen] Fix a missing initalization"

This reverts commit fef5f4456abcb1ea052206db6c232468d70b07f2.
The file was modified llvm/tools/llvm-profgen/ProfiledBinary.h
Commit 6b04ecaab355f0dfce8a980cb67a39662759734c by hoy
[CSSPGO][llvm-profgen] Fix a missing initalization

Fixing a missing initialization that was accidentally caused by https://reviews.llvm.org/D103178.
The file was modified llvm/tools/llvm-profgen/ProfiledBinary.h
Commit 64785ac12ef8b94fe7281e2cbe2db68d64d55e4c by Jinsong Ji
[AIX] Update testcase to use aix triple

We have now implemented the basic MCAsmParser, so we can use the triple
directly.
The file was modified llvm/test/MC/PowerPC/modern-aix-as.s
Commit d5c0b0102a25c27f41137588422d368eb42d971e by llvm-project
[Polly] Fix typo. NFC.

Thanks to Mugerwa Martin for reporting.
The file was modified polly/docs/Architecture.rst
Commit ba127a45701b5fa870a1df6b1fb09a351ad14051 by Vitaly Buka
[sanitizer] Convert script to python 3
The file was modified compiler-rt/test/sanitizer_common/android_commands/android_compile.py
Commit 40ce58d0ca10a1195da82895749b67f30f000243 by david.green
Revert "[clang] Refactor AST printing tests to share more infrastructure"

This reverts commit 20176bc7dd3f431db4c3d59b51a9f53d52190c82 as some
versions of GCC do not seem to handle the new code very well. They
complain about:

/tmp/ccqUQZyw.s: Assembler messages:
/tmp/ccqUQZyw.s:1151: Error: symbol `_ZNSt14_Function_base13_Base_managerIN5clangUlPKNS1_4StmtEE2_EE10_M_managerERSt9_Any_dataRKS7_St18_Manager_operation' is already defined
/tmp/ccqUQZyw.s:11963: Error: symbol `_ZNSt17_Function_handlerIFbPKN5clang4StmtEENS0_UlS3_E2_EE9_M_invokeERKSt9_Any_dataOS3_' is already defined

This seems like it is some GCC issue, but multiple buildbots (and my
local machine) are all failing because of it.
The file was modified clang/unittests/AST/StmtPrinterTest.cpp
The file was modified clang/unittests/AST/NamedDeclPrinterTest.cpp
The file was modified clang/unittests/AST/DeclPrinterTest.cpp
The file was modified clang/unittests/AST/ASTPrint.h
Commit 94210b12d1d6454c6de8ca4c83a82a1148b5cd1a by Vitaly Buka
[sanitizer] Upgrade android scripts to python 3
The file was modified compiler-rt/test/sanitizer_common/android_commands/android_run.py
The file was modified compiler-rt/test/sanitizer_common/android_commands/android_common.py
Commit 16f8207de377a055b7b75a3003d82059ca63992d by Vitaly Buka
[sanitizer] Fix type error in python 3
The file was modified compiler-rt/test/sanitizer_common/android_commands/android_run.py
Commit 08cf69c31f849310ec45945d18f0feef4ea8f2e6 by zakk.chen
[RISCV] Support overloading for RVV miscellaneous functions.

Based on this update to the intrinsic doc
https://github.com/riscv/rvv-intrinsic-doc/pull/103

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D105611
The file was added clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vset.c
The file was modified clang/utils/TableGen/RISCVVEmitter.cpp
The file was modified clang/include/clang/Basic/riscv_vector.td
The file was added clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vget.c
The file was added clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vreinterpret.c
The file was added clang/test/CodeGen/RISCV/rvv-intrinsics-overloaded/vlmul.c
Commit 8ae31b08d9da5f42dd149eb48ef3e3baae2d1b07 by joker.eph
Reformulate OrcJIT tutorial doc to make it more clear.

Fixed a minor writing error. The text was hard to understand.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D105899
The file was modified llvm/docs/tutorial/BuildingAJIT2.rst
Commit dfd9808b6cea59ff075498ee7e6e57f2b5b3a798 by Vitaly Buka
sanitizer_common: add simpler ThreadRegistry ctor

Currently ThreadRegistry is overcomplicated because of tsan,
which needs tid quarantine and reuse counters. Other sanitizers
don't need that. It also seems that no other sanitizer now
needs a max number of threads. Asan used to need a 2^24 limit,
but it does not seem to be needed now. Other sanitizers blindly
copy-pasted that without a reason. Lsan also uses quarantine,
but I don't see why that might be needed.

Add a ThreadRegistry ctor that does not require any sizes
and use it in all sanitizers except for tsan.
This is in preparation for the new tsan runtime, which won't
need any of these parameters either.

Reviewed By: vitalybuka

Differential Revision: https://reviews.llvm.org/D105713
The file was modified compiler-rt/lib/lsan/lsan_thread.cpp
The file was modified compiler-rt/lib/memprof/memprof_thread.cpp
The file was modified compiler-rt/lib/asan/asan_thread.cpp
The file was modified compiler-rt/lib/memprof/memprof_thread.h
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.h
The file was modified compiler-rt/lib/asan/asan_thread.h
The file was modified compiler-rt/lib/sanitizer_common/tests/sanitizer_thread_registry_test.cpp
The file was modified compiler-rt/lib/sanitizer_common/sanitizer_thread_registry.cpp
Commit 2c425c17e678c522d8f4961e9ad94ad718a7cba0 by martin
[libcxx] [test] Clarify weak_ptr_ret on Windows, remove a LIBCXX-WINDOWS-FIXME

On Windows, structs with a destructor are always returned indirectly;
add this to the list of known exceptions in the test where the class
isn't returned in registers as expected.

Differential Revision: https://reviews.llvm.org/D105906
The file was modified libcxx/test/libcxx/memory/trivial_abi/weak_ptr_ret.pass.cpp
Commit 5635d2a56dab6dc64d3a3f185d68f676b81dc736 by kito.cheng
[RISCV] Pass -u to linker correctly.

`-u` is a linker option used to pretend a symbol is undefined;
it is commonly used to force archive member extraction.

This option should be passed to `ld`, and many other toolchains in Clang,
such as `tools::gnutools`, already pass it too.
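
The forwarding behaviour can be pictured with a small standalone model. It is not the Clang driver code: `forward_u_options` is a hypothetical helper, and it deliberately handles only the two spellings `-u sym` and `-usym`, so other options that merely begin with `-u` are out of scope for this sketch.

```python
from typing import List


def forward_u_options(driver_args: List[str]) -> List[str]:
    """Collect `-u <sym>` / `-u<sym>` from driver args for the linker line."""
    linker_args: List[str] = []
    it = iter(driver_args)
    for arg in it:
        if arg == "-u":
            # Separate form: the symbol is the next argument.
            sym = next(it, None)
            if sym is None:
                raise ValueError("-u requires a symbol name")
            linker_args += ["-u", sym]
        elif arg.startswith("-u") and len(arg) > 2:
            # Joined form: the symbol follows the flag directly.
            linker_args += ["-u", arg[2:]]
    return linker_args
```

With the symbol forced undefined on the linker command line, an archive member that defines it gets pulled into the link even if nothing else references it.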

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D105091
The file was modified clang/test/Driver/riscv-args.c
The file was modified clang/lib/Driver/ToolChains/RISCVToolchain.cpp
