SuccessChanges

Summary

  1. [mlir][Linalg] Generalize the logic to compute reassociation maps (details)
  2. [InstCombine] Add bswap(trunc(bswap(x))) -> trunc(lshr(x, c)) vector tests (details)
  3. [InstCombine] Fix bswap(trunc(bswap(x))) -> trunc(lshr(x, c)) vector support (details)
  4. [PowerPC] Avoid unused variable warning in Release builds (details)
  5. [PPC] Do not emit extswsli in 32BIT mode when using -mcpu=pwr9 (details)
  6. [InstCombine] Add tests for 'partial' bswap patterns (details)
  7. [NFC][regalloc] Make VirtRegAuxInfo part of allocator state (details)
  8. [DA][SDA] SyncDependenceAnalysis re-write (details)
  9. [VE] Support TargetBlockAddress (details)
  10. [ObjCARCAA][NewPM] Add already ported objc-arc-aa to PassRegistry.def (details)
  11. [mlir][openacc] Remove -allow-unregistred-dialect from ops and invalid tests (details)
  12. [llvm-exegesis] Add option to check the hardware support for a given feature before benchmarking. (details)
  13. scudo: Make it thread-safe to set some runtime configuration flags. (details)
  14. [test][SampleProfile][NewPM] Fix some tests under NPM (details)
  15. [asan][test] Several Posix/unpoison-alternate-stack.cpp fixes (details)
  16. [AArch64] Avoid pairing loads when the base reg is modified (details)
  17. [CodeGen] add test for NAN creation; NFC (details)
  18. [Sema] Support Comma operator for fp16 vectors. (details)
  19. Fix interaction of `constinit` and `weak`. (details)
  20. [OpenMP] Add Error Handling for Conflicting Pointer Sizes for Target Offload (details)
  21. [OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def (details)
  22. [AIX][Clang][Driver] Link libm in c++ mode (details)
Commit 892fdc923f06adbef507ebe594fa7b48224d93f0 by ravishankarm
[mlir][Linalg] Generalize the logic to compute reassociation maps
while folding tensor_reshape op.

While folding reshapes that introduce unit extent dims, the logic to
compute the reassociation maps can be generalized to handle some
corner cases, for example, when the folded shape still has unit-extent
dims but corresponds to folded unit extent dims of the expanded shape.

Differential Revision: https://reviews.llvm.org/D88521
The file was modifiedmlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp
The file was modifiedmlir/test/Dialect/Linalg/drop-unit-extent-dims.mlir
Commit b85de2c69cf3d6fbc2ad3439a6224667a58f704c by llvm-dev
[InstCombine] Add bswap(trunc(bswap(x))) -> trunc(lshr(x, c)) vector tests

Add tests showing failure to correctly fold vector bswap(trunc(bswap(x))) intrinsic patterns
The file was modifiedllvm/test/Transforms/InstCombine/bswap-fold.ll
Commit 323d08e50a7bb80786dc00a8ade6ae49e1358393 by llvm-dev
[InstCombine] Fix bswap(trunc(bswap(x))) -> trunc(lshr(x, c)) vector support

Use getScalarSizeInBits not getPrimitiveSizeInBits to determine the shift value at the element level.
The file was modifiedllvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
The file was modifiedllvm/test/Transforms/InstCombine/bswap-fold.ll
Commit 2c394bd4071d32000e2eed0f7d90fe7c576d7050 by benny.kra
[PowerPC] Avoid unused variable warning in Release builds

PPCFrameLowering.cpp:632:8: warning: unused variable 'isAIXABI' [-Wunused-variable]
The file was modifiedllvm/lib/Target/PowerPC/PPCFrameLowering.cpp
Commit 052c5bf40a9fc9ffe1bb2669763d8a0d2dea2b2e by zarko
[PPC] Do not emit extswsli in 32BIT mode when using -mcpu=pwr9

It looks like in some circumstances when compiling with `-mcpu=pwr9` we create an EXTSWSLI node when which causes llc to fail. No such error occurs in pwr8 or lower.

This occurs in 32BIT AIX and BE Linux. the cause seems to be that the default return in combineSHL is to create an EXTSWSLI node.  Adding a check for whether we are in PPC64 before that fixes the issue.

Reviewed By: #powerpc, nemanjai

Differential Revision: https://reviews.llvm.org/D87046
The file was modifiedllvm/lib/Target/PowerPC/PPCISelLowering.cpp
The file was addedllvm/test/CodeGen/PowerPC/ppc-32bit-shift.ll
Commit f425418fc4ebd989c6c3d59d20e7fe37cb29259c by llvm-dev
[InstCombine] Add tests for 'partial' bswap patterns

As mentioned on PR47191, if we're bswap'ing some bytes and the zero'ing the remainder we can perform this as a bswap+mask which helps us match 'partial' bswaps as a first step towards folding into a more complex bswap pattern.
The file was modifiedllvm/test/Transforms/InstCombine/bswap.ll
Commit d6de40f8865e2c016731f9b63d8a0a218ce1b74f by mtrofin
[NFC][regalloc] Make VirtRegAuxInfo part of allocator state

All the state of VRAI is allocator-wide, so we can avoid creating it
every time we need it. In addition, the normalization function is
allocator-specific. In a next change, we can simplify that design in
favor of just having it as a virtual member.

Differential Revision: https://reviews.llvm.org/D88499
The file was modifiedllvm/lib/CodeGen/CalcSpillWeights.cpp
The file was modifiedllvm/lib/CodeGen/RegAllocBasic.cpp
The file was modifiedllvm/lib/CodeGen/RegAllocPBQP.cpp
The file was modifiedllvm/include/llvm/CodeGen/CalcSpillWeights.h
The file was modifiedllvm/lib/CodeGen/RegAllocGreedy.cpp
Commit 05ae04c396519cca9ef50d3b9cafb0cd9c87d1d7 by simon.moll
[DA][SDA] SyncDependenceAnalysis re-write

This patch achieves two things:
1. It breaks up the `join_blocks` interface between the SDA to the DA to
   return two separate sets for divergent loops exits and divergent,
disjoint path joins.
2. It updates the SDA algorithm to run in O(n) time and improves the
   precision on divergent loop exits.

This fixes `https://bugs.llvm.org/show_bug.cgi?id=46372` (by virtue of
the improved `join_blocks` interface) and revealed an imprecise expected
result in the `Analysis/DivergenceAnalysis/AMDGPU/hidden_loopdiverge.ll`
test.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D84413
The file was modifiedllvm/test/Analysis/DivergenceAnalysis/AMDGPU/trivial-join-at-loop-exit.ll
The file was modifiedllvm/include/llvm/Analysis/SyncDependenceAnalysis.h
The file was modifiedllvm/include/llvm/Analysis/DivergenceAnalysis.h
The file was modifiedllvm/test/Analysis/DivergenceAnalysis/AMDGPU/hidden_loopdiverge.ll
The file was modifiedllvm/lib/Analysis/DivergenceAnalysis.cpp
The file was modifiedllvm/lib/Analysis/SyncDependenceAnalysis.cpp
Commit 1034262e0a38f0bd755e68aa41b6bb856ebd2eb8 by jam
[VE] Support TargetBlockAddress

Change to handle TargetBlockAddress and add a regression test for it.

Reviewed By: simoll

Differential Revision: https://reviews.llvm.org/D88576
The file was modifiedllvm/lib/Target/VE/VEMCInstLower.cpp
The file was addedllvm/test/CodeGen/VE/blockaddress.ll
The file was modifiedllvm/lib/Target/VE/VEInstrInfo.td
Commit 4fbd83c716dbc1d68e0aac5d71d201b664762489 by aeubanks
[ObjCARCAA][NewPM] Add already ported objc-arc-aa to PassRegistry.def

Also add missing AnalysisKey definition.
The file was modifiedllvm/test/Transforms/ObjCARC/gvn.ll
The file was modifiedllvm/lib/Passes/PassRegistry.def
The file was modifiedllvm/lib/Analysis/ObjCARCAliasAnalysis.cpp
The file was modifiedllvm/lib/Passes/PassBuilder.cpp
Commit dd4fb7c8cfe394a3290bd19a1eac03435472ccfa by clementval
[mlir][openacc] Remove -allow-unregistred-dialect from ops and invalid tests

Switch to a dummy op in the test dialect so we can remove the -allow-unregistred-dialect
on ops.mlir and invalid.mlir. Change after comment on D88272.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D88587
The file was modifiedmlir/test/Dialect/OpenACC/invalid.mlir
The file was modifiedmlir/test/Dialect/OpenACC/ops.mlir
Commit 4fcd1a8e6528ca42fe656f2745e15d2b7f5de495 by vyng
[llvm-exegesis] Add option to check the hardware support for a given feature before benchmarking.

This is mostly for the benefit of the LBR latency mode.
Right now, it performs no checking. If this is run on non-supported hardware, it will produce all zeroes for latency.

Differential Revision: https://reviews.llvm.org/D85254
The file was modifiedllvm/tools/llvm-exegesis/llvm-exegesis.cpp
The file was modifiedllvm/tools/llvm-exegesis/lib/Target.h
The file was modifiedllvm/tools/llvm-exegesis/lib/X86/X86Counter.h
The file was modifiedllvm/test/tools/llvm-exegesis/X86/lbr/lit.local.cfg
The file was modifiedllvm/tools/llvm-exegesis/lib/X86/Target.cpp
The file was modifiedllvm/tools/llvm-exegesis/lib/X86/X86Counter.cpp
Commit 719ab7309eb7b7b5d802273b0f1871d6cdb965b1 by peter
scudo: Make it thread-safe to set some runtime configuration flags.

Move some of the flags previously in Options, as well as the
UseMemoryTagging flag previously in the primary allocator, into an
atomic variable so that it can be updated while other threads are
running. Relaxed accesses are used because we only have the requirement
that the other threads see the new value eventually.

The code is set up so that the variable is generally loaded once per
allocation function call with the exception of some rarely used code
such as error handlers. The flag bits can generally stay in a register
during the execution of the allocation function which means that they
can be branched on with minimal overhead (e.g. TBZ on aarch64).

Differential Revision: https://reviews.llvm.org/D88523
The file was addedcompiler-rt/lib/scudo/standalone/options.h
The file was modifiedcompiler-rt/lib/scudo/standalone/combined.h
The file was modifiedcompiler-rt/lib/scudo/standalone/primary64.h
The file was modifiedcompiler-rt/lib/scudo/standalone/wrappers_c.inc
The file was modifiedcompiler-rt/lib/scudo/standalone/atomic_helpers.h
The file was modifiedcompiler-rt/lib/scudo/standalone/primary32.h
Commit 2ab87702231e193ca170aa8ad4caa9f98bc7ced1 by aeubanks
[test][SampleProfile][NewPM] Fix some tests under NPM
The file was modifiedllvm/test/Transforms/SampleProfile/fnptr.ll
The file was modifiedllvm/test/Transforms/SampleProfile/discriminator.ll
The file was modifiedllvm/test/Transforms/SampleProfile/offset.ll
The file was modifiedllvm/test/Transforms/SampleProfile/branch.ll
The file was modifiedllvm/test/Transforms/SampleProfile/propagate.ll
The file was modifiedllvm/test/Transforms/SampleProfile/calls.ll
The file was modifiedllvm/test/Transforms/SampleProfile/remap.ll
Commit 73fb9698c0573778787e77a8ffa57e7fa3caebd4 by ro
[asan][test] Several Posix/unpoison-alternate-stack.cpp fixes

`Posix/unpoison-alternate-stack.cpp` currently `FAIL`s on Solaris/i386.
Some of the problems are generic:

- `clang` warns compiling the testcase:

  compiler-rt/test/asan/TestCases/Posix/unpoison-alternate-stack.cpp:83:7: warning: nested designators are a C99 extension [-Wc99-designator]
        .sa_sigaction = signalHandler,
        ^~~~~~~~~~~~~
  compiler-rt/test/asan/TestCases/Posix/unpoison-alternate-stack.cpp:84:7: warning: ISO C++ requires field designators to be specified in declaration order; field '_funcptr' will be initialized after field 'sa_flags' [-Wreorder-init-list]
        .sa_flags = SA_SIGINFO | SA_NODEFER | SA_ONSTACK,
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  and some more instances.  This can all easily be avoided by initializing
  each field separately.

- The test `SEGV`s in `__asan_memcpy`.  The default Solaris/i386 stack size
  is only 4 kB, while `__asan_memcpy` tries to allocate either 5436
  (32-bit) or 10688 bytes (64-bit) on the stack.  This patch avoids this by
  requiring at least 16 kB stack size.

- Even without `-fsanitize=address` I get an assertion failure:

  Assertion failed: !isOnSignalStack(), file compiler-rt/test/asan/TestCases/Posix/unpoison-alternate-stack.cpp, line 117

  The fundamental problem with this testcase is that `longjmp` from a
  signal handler is highly unportable; XPG7 strongly warns against it and
  it is thus unspecified which stack is used when `longjmp`ing from a
  signal handler running on an alternative stack.

  So I'm `XFAIL`ing this testcase on Solaris.

Tested on `amd64-pc-solaris2.11` and `x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D88501
The file was modifiedcompiler-rt/test/asan/TestCases/Posix/unpoison-alternate-stack.cpp
Commit 8d8cb1ad80b7074ac60d070fae89261894d34a0d by dancgr
[AArch64] Avoid pairing loads when the base reg is modified

When pairing loads, we should check if in between the two loads the
base register has been modified. If that is the case then avoid pairing
them because the second load actually loads from a different address.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D86956
The file was modifiedllvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
The file was addedllvm/test/CodeGen/AArch64/aarch64-ldst-modified-baseReg.mir
Commit 187686bea3878c0bf2b150d784e7eab223434e25 by spatel
[CodeGen] add test for NAN creation; NFC

This goes with the APFloat change proposed in
D88238.
This is copied from the MIPS-specific test in
builtin-nan-legacy.c to verify that the normal
behavior is correct on other targets without the
complication of an inverted quiet bit.
The file was addedclang/test/CodeGen/builtin-nan-exception.c
Commit 700e63293eea4a23440f300b1e9125ca2e80c6e9 by flo
[Sema] Support Comma operator for fp16 vectors.

The current half vector was enforcing an assert expecting
"(LHS is half vector) == (RHS is half vector)"
for comma.

Reviewed By: ahatanak, fhahn

Differential Revision: https://reviews.llvm.org/D88265
The file was modifiedclang/test/Sema/fp16vec-sema.c
The file was modifiedclang/lib/Sema/SemaExpr.cpp
Commit 892df30a7f344b6cb9995710efbc94bb25cfb95b by richard
Fix interaction of `constinit` and `weak`.

We previously took a shortcut and said that weak variables never have
constant initializers (because those initializers are never correct to
use outside the variable). We now say that weak variables can have
constant initializers, but are never usable in constant expressions.
The file was addedclang/test/SemaCXX/cxx20-constinit.cpp
The file was modifiedclang/lib/AST/Decl.cpp
The file was modifiedclang/lib/AST/ExprConstant.cpp
The file was modifiedclang/lib/Sema/SemaDeclCXX.cpp
Commit 9d2378b59150f6f1cb5c9cf42ea06b0bb57029a1 by huberjn
[OpenMP] Add Error Handling for Conflicting Pointer Sizes for Target Offload

Summary:
This patch adds an error to Clang that detects if OpenMP offloading is used
between two architectures with incompatible pointer sizes. This ensures that
the data mapping can be done correctly and solves an issue in code generation
generating the wrong size pointer.

Reviewer: jdoerfert

Subscribers:

Tags: #OpenMP #Clang

Differential Revision:
The file was modifiedclang/include/clang/Basic/DiagnosticDriverKinds.td
The file was modifiedclang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
The file was addedclang/test/OpenMP/target_incompatible_architecture_messages.cpp
The file was modifiedclang/lib/Frontend/CompilerInvocation.cpp
Commit 90eaedda9b8ef46e2c0c1b8bce33e98a3adbb68c by jhuber6
[OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def

Summary:
Replace the OpenMP Runtime Library functions used in CGOpenMPRuntimeGPU
for OpenMP device code generation with ones in OMPKinds.def and use
OMPIRBuilder for generating runtime calls. This allows us to consolidate
more OpenMP code generation into the OMPIRBuilder. This patch also
invalidates specifying target architectures with conflicting pointer
sizes.

Reviewers: jdoerfert

Subscribers: aaron.ballman cfe-commits guansong llvm-commits sstefan1 yaxunl

Tags: #OpenMP #Clang #LLVM

Differential Revision: https://reviews.llvm.org/D88430
The file was modifiedllvm/test/Transforms/OpenMP/add_attributes.ll
The file was modifiedclang/test/OpenMP/nvptx_parallel_codegen.cpp
The file was modifiedclang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
The file was modifiedllvm/include/llvm/Frontend/OpenMP/OMPKinds.def
The file was modifiedclang/lib/CodeGen/CodeGenModule.h
The file was modifiedclang/lib/CodeGen/CGOpenMPRuntime.h
Commit afc277b0ed0dcd9fbbde6015bbdf289349fb2104 by daltenty
[AIX][Clang][Driver] Link libm in c++ mode

since that is the normal behaviour of other compilers on the platform.

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D88500
The file was modifiedclang/test/Driver/aix-ld.c
The file was modifiedclang/lib/Driver/ToolChains/AIX.cpp