Changes

Summary

  1. [mlir] Do not use pass labels in unrolled ProgressiveVectorToSCF (details)
  2. [AIX] XFAIL CodeGen/Generic/externally_available.ll (details)
  3. Add entry about Hexagon V68 support to the release notes (details)
  4. Revert "[CMake][ELF] Add -fno-semantic-interposition and -Bsymbolic-functions" (details)
  5. Fix section title underlining in the release notes (details)
  6. [mlir] Migrate vector-to-loops.mlir to ProgressiveVectorToSCF (details)
  7. Reapply [ConstantFold] Fold more operations to poison (details)
  8. [TableGen] Make the NUL character invalid in .td files (details)
  9. [mlir][linalg] Remove IndexedGenericOp support from DropUnitDims... (details)
  10. [mlir] Replace vector-to-scf with progressive-vector-to-scf (details)
  11. [mlir][linalg] Remove IndexedGenericOp support from FusionOnTensors... (details)
  12. [AArch64][SVE] Fix missed immediate selection due to mishandling of signedness (details)
  13. Parse vector bool when stdbool.h and altivec.h are included (details)
  14. [HIP] Add __builtin_amdgcn_groupstaticsize (details)
  15. [AMDGPU] Only allow global fp atomics with unsafe option (details)
  16. [OpenMP] Test unified shared memory tests only on systems that support it. (details)
  17. [InstSimplify] Remove redundant {insert,extract}_vector intrinsic chains (details)
  18. [mlir] Add python test for shape dialect (details)
  19. [libomptarget][amdgpu] Convert an assert to print and offload_fail (details)
  20. [libomptarget][amdgpu] Fix truncation error for partial wavefront (details)
Commit bf068e1077a44fcb52fdf2aeb8f03f80517b64ab by springerm
[mlir] Do not use pass labels in unrolled ProgressiveVectorToSCF

Do not rely on pass labels to detect if the pattern was already applied in the past (which allows for some extra optimizations to avoid extra InsertOps and ExtractOps). Instead, check if these optimizations can be applied on-the-fly.

This also fixes a bug where vector.insert and vector.extract ops sometimes disappeared in the middle of the pass because they were folded away, but the next application of the pattern expected them to be there.

Differential Revision: https://reviews.llvm.org/D102206
The file was modified mlir/lib/Conversion/VectorToSCF/ProgressiveVectorToSCF.cpp
The file was modified mlir/test/Integration/Dialect/Vector/CPU/test-transfer-read-3d.mlir
Commit b1509d067e426bb8451bba789e34c4e65a168aba by Jinsong Ji
[AIX] XFAIL CodeGen/Generic/externally_available.ll

    Globals with “available_externally” linkage should never be emitted into the
    object file corresponding to the LLVM module.

    However, the AIX system assembler by default prints an error for undefined
    references, so AIX chose to emit the available_externally symbols into .s
    so that users won't run into errors in situations like:

    clang -target powerpc-ibm-aix -xc -<<<$'extern inline
    __attribute__((__gnu_inline__)) void foo() {}\nvoid bar() { foo(); }' -O
    -Xclang -disable-llvm-passes

Reviewed By: hubert.reinterpretcast

Differential Revision: https://reviews.llvm.org/D102377
The file was modified llvm/test/CodeGen/Generic/externally_available.ll
Commit 4dea3487315ed2870134c303c550152965a0580b by kparzysz
Add entry about Hexagon V68 support to the release notes
The file was modified llvm/docs/ReleaseNotes.rst
Commit 92260d7a186425510e96b7036b467a6889d08d97 by oliver.stannard
Revert "[CMake][ELF] Add -fno-semantic-interposition and -Bsymbolic-functions"

This reverts commit 3bf1acab5b454ad7fb2074b34663108b53620695.

This is causing the test `gcov-shared-flush.c' to fail on the 2-stage
aarch64 buildbots (https://lab.llvm.org/buildbot/#/builders/7/builds/2720).
The file was modified llvm/tools/llvm-shlib/CMakeLists.txt
The file was modified clang/tools/clang-shlib/CMakeLists.txt
The file was modified llvm/cmake/modules/HandleLLVMOptions.cmake
Commit 2b20dee59bc8829182235964b02777d54f53bb62 by kparzysz
Fix section title underlining in the release notes
The file was modified llvm/docs/ReleaseNotes.rst
Commit d020dd2b21be85b60f935980ab8e93caee7a661a by springerm
[mlir] Migrate vector-to-loops.mlir to ProgressiveVectorToSCF

Create a copy of vector-to-loops.mlir and adapt the test for
ProgressiveVectorToSCF. Fix a small bug in getExtractOp() triggered by
this test.

Differential Revision: https://reviews.llvm.org/D102388
The file was modified mlir/lib/Conversion/VectorToSCF/ProgressiveVectorToSCF.cpp
The file was added mlir/test/Conversion/VectorToSCF/progressive-vector-to-loops.mlir
Commit 395607af3cb80f25ee05420ea5ae0ad0be948533 by nikita.ppv
Reapply [ConstantFold] Fold more operations to poison

This was reverted to mitigate miscompiles caused by
the logical and/or to bitwise and/or fold. Reapply it now that
the underlying issue has been fixed by D101191.

-----

This patch folds more operations to poison.

Alive2 proof: https://alive2.llvm.org/ce/z/mxcb9G (it does not contain tests about div/rem because they fold to poison when raising UB)
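
As a rough illustration of what folding to poison means, the sketch below models a toy constant folder for `shl` in Python. It is not the logic of ConstantFold.cpp: the `POISON` sentinel and the `fold_shl` helper are hypothetical names, and the two rules shown (a poison operand gives poison; a shift amount of at least the bit width gives poison) are assumed here as representative cases only.

```python
# Hypothetical sentinel standing in for an LLVM poison constant.
POISON = object()


def fold_shl(lhs, rhs, bits=32):
    """Fold `shl lhs, rhs` on constants of width `bits` (toy model)."""
    # Any poison operand makes the result poison.
    if lhs is POISON or rhs is POISON:
        return POISON
    # Shifting by the bit width or more yields poison.
    if rhs >= bits:
        return POISON
    # Otherwise compute the shift and wrap to the operand width.
    return (lhs << rhs) & ((1 << bits) - 1)
```

An in-range shift folds to an ordinary integer, while `fold_shl(1, 32)` and `fold_shl(POISON, 1)` both return the `POISON` sentinel.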

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92270
The file was modified llvm/test/Transforms/VectorCombine/X86/insert-binop-with-constant.ll
The file was modified llvm/test/Transforms/InstSimplify/rem.ll
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/vector-undef-elts.ll
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/vscale.ll
The file was modified llvm/test/Transforms/InstCombine/apint-shift.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-b.ll
The file was modified llvm/test/Transforms/InstCombine/select-of-bittest.ll
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/poison.ll
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/vscale-inseltpoison.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-c.ll
The file was modified llvm/test/Transforms/InstSimplify/undef.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll
The file was modified llvm/test/Transforms/InstCombine/canonicalize-shl-lshr-to-masking.ll
The file was modified llvm/test/Transforms/SROA/phi-gep.ll
The file was modified llvm/test/Transforms/InstCombine/icmp.ll
The file was modified llvm/test/Transforms/InstCombine/shift-add.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-d.ll
The file was modified llvm/lib/IR/ConstantFold.cpp
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/cast.ll
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/shift.ll
The file was modified llvm/test/Transforms/InstCombine/canonicalize-ashr-shl-to-masking.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-c.ll
The file was modified llvm/unittests/IR/ConstantsTest.cpp
The file was modified llvm/test/Transforms/VectorCombine/X86/insert-binop.ll
The file was modified llvm/test/Transforms/SROA/select-gep.ll
The file was modified llvm/test/Transforms/VectorCombine/X86/insert-binop-inseltpoison.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-a.ll
The file was modified llvm/test/Transforms/InstCombine/shift-add-inseltpoison.ll
The file was modified llvm/test/Transforms/InstSimplify/div.ll
The file was modified llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-fold-binop-select.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-a.ll
The file was modified llvm/test/Transforms/InstCombine/canonicalize-lshr-shl-to-masking.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-after-truncation-variant-e.ll
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/vector-undef-elts-inseltpoison.ll
The file was modified clang/test/Frontend/fixed_point_unary.c
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/InsertElement-inseltpoison.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-b.ll
The file was modified llvm/test/Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-d.ll
The file was modified llvm/test/Transforms/InstSimplify/ConstProp/InsertElement.ll
Commit fe9101c3d8db4d054aade400efae45857f0840da by Paul C. Anagnostopoulos
[TableGen] Make the NUL character invalid in .td files

Now uses tr instead of sed.

Differential Revision: https://reviews.llvm.org/D102254
The file was added llvm/test/TableGen/nul-char.td
The file was modified llvm/lib/TableGen/TGLexer.cpp
Commit f358c372094599bf2a9246a0d2145cd949b4c62d by gysit
[mlir][linalg] Remove IndexedGenericOp support from DropUnitDims...

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102235
The file was modified mlir/test/Dialect/Linalg/drop-unit-extent-dims.mlir
The file was modified mlir/lib/Dialect/Linalg/Transforms/DropUnitDims.cpp
Commit 0f24163870e1a633c1d79377fdd188fe03769dd8 by springerm
[mlir] Replace vector-to-scf with progressive-vector-to-scf

Depends On D102388

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D102101
The file was modified mlir/lib/Conversion/VectorToSCF/CMakeLists.txt
The file was modified mlir/test/Integration/Dialect/Vector/CPU/test-transfer-read-1d.mlir
The file was modified mlir/test/lib/Transforms/TestVectorTransforms.cpp
The file was removed mlir/test/Conversion/VectorToSCF/progressive-vector-to-loops.mlir
The file was modified mlir/test/Integration/Dialect/Vector/CPU/test-transfer-to-loops.mlir
The file was modified mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
The file was modified mlir/test/Integration/Dialect/Vector/CPU/test-transfer-read.mlir
The file was modified mlir/test/Integration/Dialect/Vector/CPU/test-transfer-read-3d.mlir
The file was modified mlir/lib/Conversion/VectorToSCF/VectorToSCF.cpp
The file was modified mlir/include/mlir/Conversion/VectorToSCF/VectorToSCF.h
The file was removed mlir/include/mlir/Conversion/VectorToSCF/ProgressiveVectorToSCF.h
The file was modified mlir/test/Conversion/VectorToSCF/vector-to-loops.mlir
The file was removed mlir/lib/Conversion/VectorToSCF/ProgressiveVectorToSCF.cpp
The file was modified mlir/test/Integration/Dialect/Vector/CPU/test-transfer-read-2d.mlir
Commit cf194da1bbf79d392688dba0c74875829e9873f2 by gysit
[mlir][linalg] Remove IndexedGenericOp support from FusionOnTensors...

after introducing the IndexedGenericOp to GenericOp canonicalization (https://reviews.llvm.org/D101612).

Differential Revision: https://reviews.llvm.org/D102163
The file was modified mlir/lib/Dialect/Linalg/Transforms/FusionOnTensors.cpp
The file was modified mlir/test/Dialect/Linalg/reshape_linearization_fusion.mlir
The file was modified mlir/test/Dialect/Linalg/fusion-tensor.mlir
The file was modified mlir/test/Dialect/Linalg/reshape_fusion.mlir
The file was modified mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
Commit b1a074951ff78bf06a2d944c01ca0a0fcd63dd33 by bradley.smith
[AArch64][SVE] Fix missed immediate selection due to mishandling of signedness

The complex selection pattern for add/sub shifted immediates is
incorrect in its handling of incoming constant values, in that it
does not properly anticipate the values being sign-extended to
32 bits.
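
To make the signedness issue concrete, here is a minimal Python sketch of how a small immediate can look out of range once it has been sign-extended to 32 bits. It does not reproduce the code in AArch64ISelDAGToDAG.cpp: the helper names and the 0..255 range check are assumptions chosen for illustration.

```python
def sign_extend(value, from_bits, to_bits=32):
    """Sign-extend the low `from_bits` of `value` to a `to_bits`-wide bit pattern."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def normalize_imm(imm32, elem_bits):
    """Recover the element-width value from a 32-bit constant (hypothetical helper)."""
    return imm32 & ((1 << elem_bits) - 1)


# An 8-bit immediate of 0xFF arrives as a 32-bit pattern with all bits set.
imm32 = sign_extend(0xFF, 8)

# A check that assumes zero-extension rejects it ...
naive_ok = 0 <= imm32 <= 0xFF

# ... while truncating to the element width first accepts it.
fixed_ok = 0 <= normalize_imm(imm32, 8) <= 0xFF
```

Here `naive_ok` is false and `fixed_ok` is true, which is the kind of missed selection the commit title describes.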

Co-authored-by: Graham Hunter <graham.hunter@arm.com>

Differential Revision: https://reviews.llvm.org/D101833
The file was modified llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
The file was modified llvm/test/CodeGen/AArch64/sve-int-imm.ll
Commit 8fa168fc50ba4f63b79773c947ef5b3e43d5c02f by zarko
Parse vector bool when stdbool.h and altivec.h are included

Currently, when both stdbool.h and altivec.h are included, a declaration of `vector bool` leads to
errors because `bool` is expanded to `_Bool`. This patch allows the parser
to recognize `_Bool`.

Reviewed By: hubert.reinterpretcast, Everybody0523

Differential Revision: https://reviews.llvm.org/D102064
The file was modified clang/include/clang/Parse/Parser.h
The file was modified clang/lib/Parse/Parser.cpp
The file was modified clang/lib/Parse/ParseDecl.cpp
The file was added clang/test/Parser/altivec-zvector-bool.c
Commit 6a67e05a26eb5f58665bd6d063b9f389e7dd28a7 by enye.shi
[HIP] Add __builtin_amdgcn_groupstaticsize

Differential Revision: https://reviews.llvm.org/D102403
The file was modified clang/include/clang/Basic/BuiltinsAMDGPU.def
The file was modified clang/test/CodeGenOpenCL/builtins-amdgcn-vi.cl
The file was modified clang/test/CodeGenOpenCL/builtins-amdgcn-gfx9.cl
The file was modified clang/test/CodeGenOpenCL/builtins-amdgcn-gfx10.cl
The file was modified clang/test/CodeGenOpenCL/builtins-amdgcn-ci.cl
Commit 8f98356bb53dca07a86bf098556d446e0d5af6fe by Stanislav.Mekhanoshin
[AMDGPU] Only allow global fp atomics with unsafe option

Previously we allowed FP atomics without the
-amdgpu-unsafe-fp-atomics option if the scope was less than
system. This is likewise unsafe if we have UC memory.

This change only allows global and flat FP atomics with
the unsafe option. Consequently, that makes the check for
denorm mode redundant, since we skip it with the unsafe
option and do not have a way to produce these instructions
without it anyway.

Differential Revision: https://reviews.llvm.org/D102347
The file was modified llvm/lib/Target/AMDGPU/SIISelLowering.cpp
The file was modified llvm/test/CodeGen/AMDGPU/global-atomics-fp.ll
The file was modified llvm/test/CodeGen/AMDGPU/fp64-atomics-gfx90a.ll
Commit 34ed3e63378e34f93ada56a19cebc68cf1498092 by llvm-project
[OpenMP] Test unified shared memory tests only on systems that support it.

Add a `REQUIRES: unified_shared_memory` option to tests that use `#pragma omp requires unified_shared_memory`.

For CUDA, the feature tag is derived from LIBOMPTARGET_DEP_CUDA_ARCH which itself is derived using [[ https://cmake.org/cmake/help/latest/module/FindCUDA.html#commands | cuda_select_nvcc_arch_flags ]]. The latter determines which compute capability the GPU in the system supports. To ensure that this is the CUDA arch being used, we could also set the `-Xopenmp-target -march=` flag.
In the absence of an NVIDIA GPU, LIBOMPTARGET_DEP_CUDA_ARCH will be 35. That is, in that case we are assuming unified_shared_memory is not available. CUDA plugin testing could be disabled entirely in this case, but this currently depends on `LIBOMPTARGET_CAN_LINK_LIBCUDA OR LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA`, not on whether the hardware is actually available.

For all other targets, nothing changes and we are assuming unified shared memory is available. This might need refinement if not the case.
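
The CUDA rule described above could be sketched in Python roughly as follows. This is not the content of lit.cfg: the function name is hypothetical, and the minimum arch of 70 is an assumption that is merely consistent with the message (arch 35 and sm_61 are both treated as lacking support).

```python
def unified_shared_memory_features(cuda_arch, min_arch=70):
    """Return the feature tags to enable for a given CUDA arch number.

    `min_arch` is an assumed threshold, not a value taken from the commit.
    """
    features = set()
    if cuda_arch >= min_arch:
        features.add("unified_shared_memory")
    return features
```

In a real lit configuration the resulting tags would typically be added to `config.available_features`, so that tests marked `REQUIRES: unified_shared_memory` are skipped on systems where the tag is absent.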

This tries to fix the [[ http://meinersbur.de:8011/#/builders/143 | OpenMP Offloading Buildbot ]] that, although brand-new, only has a Pascal-generation (sm_61) GPU installed. Hence, tests that require unified shared memory are currently failing. I wish I had known in advance.

Reviewed By: protze.joachim, tianshilei1992

Differential Revision: https://reviews.llvm.org/D101498
The file was modified openmp/libomptarget/test/lit.site.cfg.in
The file was modified openmp/libomptarget/test/lit.cfg
The file was modified openmp/libomptarget/test/unified_shared_memory/close_manual.c
The file was modified openmp/libomptarget/test/mapping/present/unified_shared_memory.c
The file was modified openmp/libomptarget/test/unified_shared_memory/close_enter_exit.c
The file was modified openmp/libomptarget/test/unified_shared_memory/close_modifier.c
The file was modified openmp/libomptarget/test/unified_shared_memory/shared_update.c
Commit 2ed7db0d206b6af2fffa4cb2704264b76ca61266 by joe.ellis
[InstSimplify] Remove redundant {insert,extract}_vector intrinsic chains

This commit removes some redundant {insert,extract}_vector intrinsic
chains by implementing the following patterns as instsimplifies:

   (insert_vector _, (extract_vector X, 0), 0) -> X
   (extract_vector (insert_vector _, X, 0), 0) -> X
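
The two patterns can be modelled on a toy expression tree, as in the Python sketch below. The tuple encoding and the `simplify` function are hypothetical and only mirror the shape of the rewrite; the sketch ignores the type checks that a real simplification in InstructionSimplify.cpp would presumably need.

```python
# Assumed encoding:
#   ("insert_vector", vec, subvec, idx)
#   ("extract_vector", vec, idx)
# Anything that is not a tuple is treated as an opaque value.

def simplify(expr):
    """Apply the two insert/extract folds to a tuple-encoded expression."""
    if not isinstance(expr, tuple):
        return expr
    op = expr[0]
    # (insert_vector _, (extract_vector X, 0), 0) -> X
    if op == "insert_vector":
        _, _vec, sub, idx = expr
        if (idx == 0 and isinstance(sub, tuple)
                and sub[0] == "extract_vector" and sub[2] == 0):
            return sub[1]
    # (extract_vector (insert_vector _, X, 0), 0) -> X
    if op == "extract_vector":
        _, vec, idx = expr
        if (idx == 0 and isinstance(vec, tuple)
                and vec[0] == "insert_vector" and vec[3] == 0):
            return vec[2]
    return expr
```

Expressions whose indices are not both zero are returned unchanged, matching the restriction to index 0 in the patterns.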

Reviewed By: peterwaller-arm

Differential Revision: https://reviews.llvm.org/D101986
The file was modified llvm/lib/Analysis/InstructionSimplify.cpp
The file was added llvm/test/Transforms/InstSimplify/extract-vector.ll
The file was added llvm/test/Transforms/InstSimplify/insert-vector.ll
The file was modified clang/test/CodeGen/attr-arm-sve-vector-bits-cast.c
The file was modified clang/test/CodeGen/attr-arm-sve-vector-bits-call.c
Commit 3f2891db6dd5684ee743055c0e86d0d3dd66c90b by jpienaar
[mlir] Add python test for shape dialect

Add a basic test for the shape.const_shape op as a start.

Differential Revision: https://reviews.llvm.org/D102341
The file was added mlir/test/python/dialects/shape.py
Commit b049870d3b47a93b0c53f3ad69b11c4731b39d7f by jonathanchesterfield
[libomptarget][amdgpu] Convert an assert to print and offload_fail

The kernel launched is supposed to be present in the binary, but a not yet
diagnosed bug means it is missing for some of the qmcpack test cases. Changing
from assert to print and offload_fail should help diagnose that and similar bugs.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102378
The file was modified openmp/libomptarget/plugins/amdgpu/src/rtl.cpp
Commit 10de21720989166a6b51cbf48b21efacbb913f23 by jonathanchesterfield
[libomptarget][amdgpu] Fix truncation error for partial wavefront

The partial barrier implementation involves one wavefront resetting and N-1
waiting. This change future-proofs against launching with a number of threads
that is not a multiple of the wavefront size.
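
The kind of truncation involved can be shown with a little integer arithmetic. The Python sketch below is not the code from target_impl.hip; the wavefront size of 64 and both helper names are assumptions for illustration.

```python
WAVEFRONT_SIZE = 64  # assumed wavefront size


def waves_truncating(num_threads, wave=WAVEFRONT_SIZE):
    """Integer division drops a trailing partial wavefront."""
    return num_threads // wave


def waves_rounded_up(num_threads, wave=WAVEFRONT_SIZE):
    """Ceiling division counts a trailing partial wavefront as one more."""
    return (num_threads + wave - 1) // wave
```

With 96 threads the truncating form counts one wavefront and the rounded-up form counts two, so a barrier expecting N-1 waiters would be off by one under the first form. For exact multiples of the wavefront size the two agree.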

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D102407
The file was modified openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip