Commit
ebb1092a2875739d3e9bb6b1fb230c0e0d88ebff
by tianshilei1992
[Clang][OpenMP] Added support for nowait target in CodeGen via regular task
Previously, for a nowait target, CodeGen emitted a function call to `__tgt_target_nowait`, etc. However, in the OpenMP RTL, these functions directly call the non-nowait versions, which means nowait does not work as expected.
The OpenMP specification says a target is actually a target task, which is an untied and detachable task. It is natural to generate a task for a nowait target. However, an OpenMP task must be within a parallel region; otherwise the task is executed immediately. As a result, if we directly wrap the region in a regular task, a `target nowait` outside of a parallel region is still synchronous.
In D77609, I added support for unshackled tasks in the OpenMP RTL. Basically, an unshackled task is a task that is not bound to any parallel region, so every nowait target will be transformed into an unshackled task. To distinguish it from a regular task, a new flag bit is set for an unshackled task. The RTL uses this flag for later processing.
All target tasks are allocated via `__kmpc_omp_target_task_alloc`, and in the current `libomptarget`, `__kmpc_omp_target_task_alloc` just calls `__kmpc_omp_task_alloc`. Therefore, we can modify the flag in `__kmpc_omp_target_task_alloc` so that we don't need to modify the FE too much. Users who want to opt out of the feature just need to use an RTL without support for unshackled threads.
As a result, in this patch, the `target nowait` region is simply wrapped into a regular task. Later, once we have RTL support for unshackled tasks, the wrapped tasks can be executed by unshackled threads without changes in the FE.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D78075
|
 | clang/test/OpenMP/target_codegen.cpp |
 | clang/test/OpenMP/target_simd_codegen.cpp |
 | clang/test/OpenMP/target_parallel_codegen.cpp |
 | clang/test/OpenMP/target_parallel_for_codegen.cpp |
 | clang/test/OpenMP/target_teams_distribute_simd_codegen.cpp |
 | clang/test/OpenMP/declare_mapper_codegen.cpp |
 | clang/test/OpenMP/target_parallel_for_simd_codegen.cpp |
 | clang/test/OpenMP/target_teams_codegen.cpp |
 | clang/test/OpenMP/target_teams_distribute_codegen.cpp |
 | clang/lib/CodeGen/CGOpenMPRuntime.cpp |
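For orientation, the user-level construct that the commit above affects can be sketched as follows. This is only an illustration, not code from the patch: the function name `offloadAnswer` is hypothetical, and the snippet assumes a compiler that either honours the OpenMP pragmas (with host fallback if no device is available) or ignores them entirely.

```cpp
// Hypothetical example: a `target nowait` region followed by an explicit wait.
// With nowait, the target region becomes a deferred target task; the
// taskwait below is what guarantees Result is ready before it is read.
int offloadAnswer() {
  int Result = 0;
#pragma omp target map(tofrom : Result) nowait
  { Result = 42; }
#pragma omp taskwait
  return Result;
}
```

Whether the region really runs asynchronously outside a parallel region depends on the runtime support described in the commit message; the source semantics stay the same either way.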
Commit
76419525fba62c93d5c337acdb0b80d6e42b00c9
by joker.eph
Common code preparation for tblgen-types patch
Clean up and add methods that https://reviews.llvm.org/D86904 requires. Split out to reduce review load.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D88267
|
 | mlir/include/mlir/TableGen/CodeGenHelpers.h |
 | mlir/include/mlir/TableGen/Operator.h |
 | mlir/tools/mlir-tblgen/DialectGen.cpp |
 | mlir/tools/mlir-tblgen/OpDefinitionsGen.cpp |
 | mlir/lib/TableGen/Operator.cpp |
 | llvm/include/llvm/TableGen/Record.h |
 | llvm/lib/TableGen/Record.cpp |
Commit
63c58c2b934525c9863e624cf39ec542dd84ca78
by i
[bindings/go] Fix TestAttributes after D88241
|
 | llvm/bindings/go/llvm/ir_test.go |
Commit
96318f64a7864747ebbb4e33cb75b0dea465abfc
by dmantipov
[Driver] Perform Linux distribution detection only once
Differential Revision: https://reviews.llvm.org/D87187
|
 | clang/lib/Driver/Distro.cpp |
 | clang/include/clang/Driver/Distro.h |
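The commit title above does not say how the one-time detection is implemented. One common way to run an expensive detection only once is a function-local static, sketched below. Every name here (`detectDistroSlow`, `getCachedDistro`, `DetectionRuns`, the `"toy-distro"` value) is hypothetical and is not taken from `Distro.cpp`.

```cpp
#include <string>

// Hypothetical counter so the "only once" behaviour can be observed.
static int DetectionRuns = 0;

// Stand-in for an expensive probe (e.g. reading files under /etc).
static std::string detectDistroSlow() {
  ++DetectionRuns;
  return "toy-distro";
}

// The function-local static is initialized on the first call only
// (thread-safe since C++11); later calls return the cached value.
const std::string &getCachedDistro() {
  static const std::string Cached = detectDistroSlow();
  return Cached;
}
```

Calling `getCachedDistro()` repeatedly leaves `DetectionRuns` at 1, which is the property the commit title describes.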
Commit
2ca0ea15e5c910ff93874679f0a03c923fe85e5b
by dmantipov
[Driver] Fix formatting as suggested by clang-format (NFC)
|
 | clang/include/clang/Driver/Distro.h |
Commit
c0f8e4c06c85db256806cfce90a2b49e4cdd58d4
by qiucofan
[SelectionDAG] Add guard to automatically insert flags
This is like FastMathFlagGuard in IR. Since values are created through the SelectionDAG instance, the guard belongs to SelectionDAG. When a FlagInserter is created in the current scope, all values created by getNode get its flags if no Flags argument is provided.
In this patch, I applied it to the floating-point folding part of the DAG combiner and removed the Flags arguments passed to getNode to show its effect. Other places in the DAG combiner, and other helper methods similar to getNode, also need this. They can be handled in follow-up patches.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D87361
|
 | llvm/test/CodeGen/X86/sqrt-fastmath-mir.ll |
 | llvm/include/llvm/CodeGen/SelectionDAG.h |
 | llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp |
 | llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp |
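The scoped-guard idea from the commit above can be sketched with a toy class. This is not LLVM's actual `SelectionDAG` interface: `ToyDAG`, `Flags`, and `Node` are hypothetical stand-ins, and only the behaviour described in the message (flags applied by `getNode` when none are passed, for the lifetime of the guard) is modelled.

```cpp
// Toy flag set; the real flags type carries many more bits.
struct Flags {
  bool NoNaNs = false;
};

class ToyDAG {
public:
  // RAII guard: registers itself on construction and restores the
  // previously active guard (if any) on destruction.
  class FlagInserter {
    ToyDAG &DAG;
    FlagInserter *Prev;
    Flags F;

  public:
    FlagInserter(ToyDAG &D, Flags Fl) : DAG(D), Prev(D.Inserter), F(Fl) {
      D.Inserter = this;
    }
    ~FlagInserter() { DAG.Inserter = Prev; }
    FlagInserter(const FlagInserter &) = delete;
    FlagInserter &operator=(const FlagInserter &) = delete;
    Flags getFlags() const { return F; }
  };

  struct Node {
    Flags F;
  };

  // No explicit flags: pick up the active guard's flags, if a guard exists.
  Node getNode() {
    Node N;
    if (Inserter)
      N.F = Inserter->getFlags();
    return N;
  }

  // Explicit flags always win over the guard.
  Node getNode(Flags Explicit) { return Node{Explicit}; }

private:
  FlagInserter *Inserter = nullptr;
};
```

Inside a scope that holds a `ToyDAG::FlagInserter`, plain `getNode()` calls inherit the guard's flags; once the guard goes out of scope, nodes are created without them again.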
Commit
c6c5629f2fb4ddabd376fbe7c218733283e91d09
by simon
[CodeGen] Do not call `emitGlobalConstantLargeInt` for constants that require 8 bytes to store
This is a fix for PR47630. The regression was caused by D78011. After that change, the code started to call `emitGlobalConstantLargeInt` even for constants that require eight bytes to store.
Differential revision: https://reviews.llvm.org/D88261
|
 | llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp |
 | llvm/test/CodeGen/Mips/emit-big-cst.ll |
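The boundary condition this fix concerns can be illustrated with a small predicate. The commit message does not show the actual check, so this sketch assumes it is a plain comparison on the store size in bytes; the helper name `needsLargeIntPath` is hypothetical and does not appear in `AsmPrinter.cpp`.

```cpp
#include <cstdint>

// Hypothetical helper: returns true only when a constant is too wide to be
// emitted as a single integer of at most eight bytes. An 8-byte constant
// must stay on the regular path, which is the case the commit describes.
constexpr bool needsLargeIntPath(std::uint64_t StoreSizeInBytes) {
  return StoreSizeInBytes > 8;
}
```

Under this assumption, an off-by-one in the comparison (for example `>= 8`) would send 8-byte constants down the large-integer path, matching the regression described above.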