Changes

Summary

  1. [ARM] Extra widening and narrowing combinations tests. NFC
  2. [Attributor][FIX] Do not replace a value with a non-dominating instruction
  3. [Attributor] Reorganize AAHeapToStack
  4. [OpenMP][FIX] Update remark in test file after rewording
  5. [Attributor] Look through selects in genericValueTraversal
  6. [Attributor][FIX] Sanitize queries to LVI and ScalarEvolution
  7. [OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
  8. [OpenMP] Create custom state machines for generic target regions
  9. [OpenMP][NFCI] Re-enable two remarks tests after D101977 landed
  10. [OpenMP] Remove checkXXXX device runtime functions
  11. [OpenMP][FIX] Add missing `)` to remark
  12. [OpenMP] Detect SPMD compatible kernels and execute them as such
  13. [Attributor][FIX] Destroy bump allocator objects to avoid leaks
Commit a6470408cf3601391c6c85f8b3a743f2b5fbaad2 by david.green
[ARM] Extra widening and narrowing combinations tests. NFC
The file was modified llvm/test/CodeGen/Thumb2/mve-widen-narrow.ll
The file was modified llvm/test/CodeGen/Thumb2/block-placement.mir
Commit dbb3a65f5b30ff78e0a7165b377180a00e580f8c by johannes
[Attributor][FIX] Do not replace a value with a non-dominating instruction

We have to be careful when we replace values to not use a non-dominating
instruction. It makes sense that simplification offers those as
"simplified values" but we can't manifest them in the IR without PHI
nodes. In the future we should consider potentially adding those PHI
nodes.
The file was modified llvm/lib/Transforms/IPO/Attributor.cpp
The file was modified llvm/test/Transforms/Attributor/memory_locations.ll
The file was modified llvm/lib/Transforms/IPO/AttributorAttributes.cpp
The file was modified llvm/test/Transforms/Attributor/IPConstantProp/return-argument.ll
The file was modified llvm/test/Transforms/Attributor/IPConstantProp/PR26044.ll
The file was modified llvm/test/Transforms/Attributor/nonnull.ll
The file was modified llvm/test/Transforms/Attributor/IPConstantProp/PR16052.ll
The file was modified llvm/include/llvm/Transforms/IPO/Attributor.h
The file was modified llvm/test/Transforms/Attributor/noalias.ll
The file was modified llvm/test/Transforms/Attributor/value-simplify.ll
The file was modified llvm/test/Transforms/Attributor/heap_to_stack_gpu.ll
The file was modified llvm/test/Transforms/Attributor/heap_to_stack.ll
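
The dominance requirement described in this commit message can be illustrated with a toy model. The sketch below is not Attributor code: `ToyDomTree`, `dominates`, and `canReplaceUse` are hypothetical names, blocks are plain integers, and ordering of instructions within a single block is ignored. It only shows why a value defined on one side of a diamond cannot be used at the join block without a PHI node.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy dominator tree (hypothetical, for illustration only).
// IDom[B] is the immediate dominator of block B; the entry block is
// its own immediate dominator.
struct ToyDomTree {
  std::vector<int> IDom;

  // Returns true if block A dominates block B. Every block dominates itself.
  bool dominates(int A, int B) const {
    assert(A >= 0 && B >= 0 && static_cast<std::size_t>(B) < IDom.size());
    while (true) {
      if (A == B)
        return true;
      if (IDom[B] == B) // Reached the entry block without meeting A.
        return false;
      B = IDom[B];
    }
  }
};

// A simplified value defined in DefBlock may replace a use in UseBlock
// only if the defining block dominates the using block.
bool canReplaceUse(const ToyDomTree &DT, int DefBlock, int UseBlock) {
  return DT.dominates(DefBlock, UseBlock);
}
```

For a diamond with entry block 0, branches 1 and 2, and join block 3, every block has block 0 as its immediate dominator, so a value defined in block 1 is rejected as a replacement for a use in block 3.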
Commit c1c1fe93852e88b544c46087363400751b3a3ceb by johannes
[Attributor] Reorganize AAHeapToStack

In order to simplify future extensions, e.g., the merge of
AAHeapToShared into AAHeapToStack, we reorganize AAHeapToStack and the
state we keep for each malloc-like call. The result is also less
confusing as we only track malloc-like calls, not all calls. Further, we
only perform the updates necessary for a malloc-like call to argue it can
go to the stack, e.g., we won't check all uses if we moved on to the
"must-be-freed" argument.

This patch also uses Attributor helpers to simplify the allocated size,
alignment, and the potentially freed objects.

Overall, this is mostly a reorganization and only the use of the
optimistic helpers should change (=improve) the capabilities a bit.

Differential Revision: https://reviews.llvm.org/D104993
The file was modified llvm/lib/Transforms/IPO/Attributor.cpp
The file was modified llvm/lib/Transforms/IPO/AttributorAttributes.cpp
The file was modified llvm/test/Transforms/OpenMP/remove_globalization.ll
The file was modified llvm/lib/Transforms/IPO/OpenMPOpt.cpp
The file was modified llvm/test/Transforms/Attributor/depgraph.ll
The file was modified llvm/include/llvm/Transforms/IPO/Attributor.h
The file was modified llvm/test/Transforms/Attributor/heap_to_stack.ll
Commit 5b05a5f6cee2ed3bda299d317907f8c89f4d089d by johannes
[OpenMP][FIX] Update remark in test file after rewording
The file was modified llvm/test/Transforms/OpenMP/globalization_remarks.ll
Commit c1d53a316d6c7f7d80908ec8b5a65172f82e9721 by johannes
[Attributor] Look through selects in genericValueTraversal

If we can simplify the select condition we can avoid one value in the
traversal.

Differential Revision: https://reviews.llvm.org/D103861
The file was modified llvm/test/Transforms/Attributor/lvi-for-ashr.ll
The file was modified llvm/test/Transforms/Attributor/value-simplify.ll
The file was modified llvm/lib/Transforms/IPO/AttributorAttributes.cpp
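
As a rough illustration of looking through selects during a value traversal, the sketch below uses a toy value type rather than LLVM IR. `ToyValue`, `CondState`, and `collectUnderlyingValues` are hypothetical names introduced here, and this is not the actual `genericValueTraversal` implementation. When the select condition is known, only one operand is visited; otherwise both are.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Result of trying to simplify a select condition (hypothetical model).
enum class CondState { Unknown, True, False };

// Toy value: either a leaf or a select(Cond, TrueVal, FalseVal).
struct ToyValue {
  std::string Name;
  bool IsSelect = false;
  CondState Cond = CondState::Unknown;
  const ToyValue *TrueVal = nullptr;
  const ToyValue *FalseVal = nullptr;
};

// Collects the names of all leaf values V may evaluate to.
void collectUnderlyingValues(const ToyValue &V, std::vector<std::string> &Out) {
  if (!V.IsSelect) {
    Out.push_back(V.Name);
    return;
  }
  assert(V.TrueVal && V.FalseVal && "select needs both operands");
  // A known condition lets the traversal skip one operand entirely.
  if (V.Cond == CondState::True) {
    collectUnderlyingValues(*V.TrueVal, Out);
    return;
  }
  if (V.Cond == CondState::False) {
    collectUnderlyingValues(*V.FalseVal, Out);
    return;
  }
  // Unknown condition: both operands are possible results.
  collectUnderlyingValues(*V.TrueVal, Out);
  collectUnderlyingValues(*V.FalseVal, Out);
}
```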
Commit 4761d29633ac7889329dc6ef966eab01c7b7903d by johannes
[Attributor][FIX] Sanitize queries to LVI and ScalarEvolution

When we talk to outside analyses, e.g., LVI and ScalarEvolution, we need
to be careful with the query. The particular error occurred because we
folded a PHI node before the LVI query but the context location was then
no longer dominated by the value. This is not supported by LVI so we
have to filter these situations before we query the outside analyses.
The file was modified llvm/test/Transforms/Attributor/value-simplify.ll
The file was modified llvm/lib/Transforms/IPO/AttributorAttributes.cpp
Commit e2cfbfcc0c1f3a89ab79c5615f0789b6a9966dc5 by johannes
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL

In the spirit of TRegions [0], this patch provides a simpler and uniform
interface for a kernel to set up the device runtime. The OMPIRBuilder is
used for reuse in Flang. A custom state machine will be generated in the
follow-up patch.

The "surplus" threads of the "master warp" will not exit early anymore
so we need to use non-aligned barriers. The new runtime will not have an
extra warp but will also require these non-aligned barriers.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

This was in parts extracted from D59319.

Reviewed By: ABataev, JonChesterfield

Differential Revision: https://reviews.llvm.org/D101976
The file was modified clang/test/OpenMP/nvptx_parallel_codegen.cpp
The file was modified llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
The file was modified clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
The file was modified clang/test/OpenMP/nvptx_target_parallel_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp
The file was modified clang/test/OpenMP/target_parallel_for_debug_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_printf_codegen.c
The file was modified clang/test/OpenMP/target_parallel_debug_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
The file was modified openmp/libomptarget/deviceRTLs/common/src/omptarget.cu
The file was modified clang/test/OpenMP/amdgcn_target_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_simd_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp
The file was modified llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
The file was modified openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
The file was modified clang/test/OpenMP/nvptx_target_parallel_num_threads_codegen.cpp
The file was modified openmp/libomptarget/deviceRTLs/interface.h
The file was modified llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
The file was modified openmp/libomptarget/deviceRTLs/common/src/parallel.cu
The file was modified clang/test/OpenMP/nvptx_SPMD_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_firstprivate_codegen.cpp
The file was modified clang/test/OpenMP/remarks_parallel_in_target_state_machine.c
The file was modified llvm/test/Transforms/OpenMP/single_threaded_execution.ll
The file was modified clang/test/OpenMP/nvptx_data_sharing.cpp
The file was modified clang/test/OpenMP/nvptx_force_full_runtime_SPMD_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
The file was modified llvm/test/Transforms/OpenMP/replace_globalization.ll
The file was modified clang/lib/CodeGen/CGOpenMPRuntimeGPU.h
The file was modified clang/test/OpenMP/nvptx_parallel_for_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_codegen.cpp
The file was modified clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c
The file was added openmp/libomptarget/deviceRTLs/common/include/target.h
The file was modified clang/test/OpenMP/nvptx_lambda_capturing.cpp
The file was modified clang/test/OpenMP/assumes_include_nvptx.cpp
The file was modified clang/test/OpenMP/nvptx_target_teams_codegen.cpp
The file was modified clang/test/OpenMP/nvptx_target_simd_codegen.cpp
The file was modified clang/test/OpenMP/declare_target_codegen_globalization.cpp
The file was modified clang/test/OpenMP/nvptx_teams_codegen.cpp
The file was modified llvm/lib/Transforms/IPO/OpenMPOpt.cpp
The file was modified clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp
Commit d9659bf6a036545125a39648b4abe838080299ec by johannes
[OpenMP] Create custom state machines for generic target regions

In the spirit of TRegions [0], this patch creates a custom state
machine for a generic target region based on the potentially called
parallel regions.

The code analysis is done interprocedurally via an abstract attribute
(AAKernelInfo). All outermost parallel regions are collected and we
check if there might be unknown outermost parallel regions for which
we need an indirect call. Other AAKernelInfo extensions are expected.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D101977
The file was added llvm/test/Transforms/OpenMP/custom_state_machines.ll
The file was modified llvm/test/Transforms/OpenMP/globalization_remarks.ll
The file was modified llvm/test/Transforms/PhaseOrdering/openmp-opt-module.ll
The file was added llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll
The file was modified llvm/test/Transforms/OpenMP/remove_globalization.ll
The file was modified llvm/test/Transforms/OpenMP/single_threaded_execution.ll
The file was modified llvm/lib/Transforms/IPO/Attributor.cpp
The file was modified llvm/lib/Transforms/IPO/OpenMPOpt.cpp
The file was modified llvm/test/Transforms/OpenMP/replace_globalization.ll
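
The idea of a custom state machine can be sketched on the host as follows. This is a simplified illustration, not the code OpenMPOpt emits and not the device runtime: `regionA`, `regionB`, `genericDispatch`, and `customDispatch` are hypothetical names. The point is that known outermost parallel regions are reached through direct calls, while the indirect call is kept only as a fallback when unknown regions may exist.

```cpp
#include <cassert>

// Signature of an outlined parallel region in this toy model.
using ParallelRegionFn = void (*)(int *);

// Two hypothetical parallel regions that the analysis is assumed to know.
void regionA(int *Counter) { *Counter += 1; }
void regionB(int *Counter) { *Counter += 10; }

// Generic state machine step: the work function is always called indirectly.
void genericDispatch(ParallelRegionFn Work, int *Counter) {
  assert(Work && "worker woke up without a work function");
  Work(Counter);
}

// Custom state machine step: compare against the known regions and call
// them directly. The indirect call remains only if unknown regions may
// reach this kernel. Returns true if some region was executed.
bool customDispatch(ParallelRegionFn Work, int *Counter,
                    bool MayHaveUnknownRegions) {
  if (Work == &regionA) {
    regionA(Counter);
    return true;
  }
  if (Work == &regionB) {
    regionB(Counter);
    return true;
  }
  if (MayHaveUnknownRegions) {
    Work(Counter); // Fallback: indirect call.
    return true;
  }
  return false;
}
```

Direct calls give the optimizer a chance to inline the regions, which an indirect call through a function pointer generally prevents.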
Commit a706b94ea5560a7733e403006a9066cc41e82b5d by johannes
[OpenMP][NFCI] Re-enable two remarks tests after D101977 landed
The file was modified clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c
The file was modified clang/test/OpenMP/remarks_parallel_in_target_state_machine.c
Commit 0a223827de8d923f357bf6d3d222fd26e2fbca4a by johannes
[OpenMP] Remove checkXXXX device runtime functions

We had multiple functions to determine the execution mode (SPMD/Generic)
and runtime status (initialized/uninitialized) but that just increased
complexity without a real benefit. Especially with D102307 in mind, it
is helpful to reduce the dependence on the `ident_t` flags.

Differential Revision: https://reviews.llvm.org/D105586
The file was modified openmp/libomptarget/deviceRTLs/common/src/task.cu
The file was modified openmp/libomptarget/deviceRTLs/common/src/sync.cu
The file was modified openmp/libomptarget/deviceRTLs/common/support.h
The file was modified openmp/libomptarget/deviceRTLs/common/src/support.cu
The file was modified openmp/libomptarget/deviceRTLs/common/src/loop.cu
The file was modified openmp/libomptarget/deviceRTLs/common/src/parallel.cu
The file was modified openmp/libomptarget/deviceRTLs/common/src/reduction.cu
Commit 8cb7d71355f9ca884efde1dfa03dc349fb890721 by johannes
[OpenMP][FIX] Add missing `)` to remark
The file was modified llvm/lib/Transforms/IPO/OpenMPOpt.cpp
The file was modified llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll
Commit 514c033db1e0c237eccd56b9fc11fe05a6baff39 by johannes
[OpenMP] Detect SPMD compatible kernels and execute them as such

In the spirit of TRegions [0], this patch analyzes a kernel and tracks
if it can be executed in SPMD-mode. If so, we flip the arguments of
the __kmpc_target_init and __kmpc_target_deinit calls to enable the
mode. We also update the `<kernel>_exec_mode` flag to indicate to the
runtime that we changed the mode to SPMD.

The code analysis is done interprocedurally by extending the
AAKernelInfo abstract attribute to track SPMD compatibility as well.

[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11

Differential Revision: https://reviews.llvm.org/D102307
The file was added llvm/test/Transforms/OpenMP/spmdization.ll
The file was added llvm/test/Transforms/OpenMP/spmdization_remarks.ll
The file was modified clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c
The file was modified llvm/lib/IR/Assumptions.cpp
The file was modified llvm/include/llvm/Frontend/OpenMP/OMPConstants.h
The file was modified clang/test/OpenMP/remarks_parallel_in_target_state_machine.c
The file was modified llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll
The file was modified llvm/lib/Transforms/IPO/OpenMPOpt.cpp
The file was modified llvm/test/Transforms/OpenMP/custom_state_machines.ll
Commit 2e7e2994a94efad7fde5547d4e493e28b3b660a3 by johannes
[Attributor][FIX] Destroy bump allocator objects to avoid leaks

AllocationInfo and DeallocationInfo objects themselves are allocated
with the Attributor bump allocator and do not need to be deallocated.
That said, the sets in AllocationInfo and DeallocationInfo need to be
destroyed to avoid memory leaks.
The file was modified llvm/lib/Transforms/IPO/AttributorAttributes.cpp
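
The leak described in this commit message follows a general pattern that can be shown with a toy arena. The sketch below does not use LLVM's allocator: `ToyBumpAllocator`, `ToyAllocationInfo`, `createInfo`, and `destroyInfo` are hypothetical names. The arena reclaims the object's own storage in bulk, but a member such as `std::set` owns separate heap memory that is only released when the destructor runs.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <set>
#include <vector>

// Toy bump allocator: hands out memory from one buffer and never runs
// destructors. The buffer is released as a whole when the allocator dies.
class ToyBumpAllocator {
  std::vector<unsigned char> Buffer;
  std::size_t Offset = 0;

public:
  explicit ToyBumpAllocator(std::size_t Bytes) : Buffer(Bytes) {}

  void *allocate(std::size_t Size, std::size_t Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0 && "power of two");
    void *Ptr = Buffer.data() + Offset;
    std::size_t Space = Buffer.size() - Offset;
    if (!std::align(Align, Size, Ptr, Space))
      return nullptr; // Out of space.
    Offset = static_cast<std::size_t>(static_cast<unsigned char *>(Ptr) -
                                      Buffer.data()) + Size;
    return Ptr;
  }
};

// Object stored in the arena. The set allocates its nodes on the regular
// heap, so skipping the destructor would leak them.
struct ToyAllocationInfo {
  static int LiveCount; // Tracks constructed-but-not-destroyed objects.
  std::set<int> PotentialFreeCalls;
  ToyAllocationInfo() { ++LiveCount; }
  ~ToyAllocationInfo() { --LiveCount; }
};
int ToyAllocationInfo::LiveCount = 0;

// Constructs the object in arena memory via placement new.
ToyAllocationInfo *createInfo(ToyBumpAllocator &A) {
  void *Mem = A.allocate(sizeof(ToyAllocationInfo), alignof(ToyAllocationInfo));
  return Mem ? new (Mem) ToyAllocationInfo() : nullptr;
}

// Runs the destructor explicitly; the storage itself stays with the arena.
void destroyInfo(ToyAllocationInfo *Info) {
  if (Info)
    Info->~ToyAllocationInfo();
}
```

Calling `destroyInfo` before the arena goes away releases the set's nodes without ever calling `delete` on the arena-allocated object.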

Summary

  1. SPEC2006: Pronounce endianness flags both ways
  2. [fpcmp] Fix memory leak. NFC.
Commit ad57b9ae24df5838eddb7234f7300f3e7aee22fe by llvm-test-suite
SPEC2006: Pronounce endianness flags both ways

473.astar chose to pronounce the CPU endianness C preprocessor symbol differently from the rest of the benchmark suite.

This was caught in the make-based build system, but missed in the cmake-based system.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D96432
The file was modified External/SPEC/SpecCPU2006.cmake
Commit d382dfd3c56acce63ba6ef27b4b68d81fc9a9eed by llvm-test-suite
[fpcmp] Fix memory leak. NFC.
The file was modified tools/fpcmp.c