Commit
1f40870dda4604c90a0b60de69b1080870c646f5
by efriedma
[NFC][ScalarEvolution] Precommit tests for D104075.
|
 | llvm/test/Analysis/ScalarEvolution/lt-overflow.ll |
 | llvm/test/Analysis/ScalarEvolution/trip-count13.ll |
 | llvm/test/Analysis/ScalarEvolution/max-trip-count.ll |
Commit
97c426394a7148ee73bda8b3cb525d6bb3d8b8df
by Amara Emerson
[AArch64][GlobalISel] Implement moreElements legalization for G_SHUFFLE_VECTOR.
Differential Revision: https://reviews.llvm.org/D103301
|
 | llvm/include/llvm/CodeGen/GlobalISel/LegalizerHelper.h |
 | llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp |
 | llvm/unittests/CodeGen/GlobalISel/LegalizerHelperTest.cpp |
 | llvm/test/CodeGen/AArch64/GlobalISel/legalize-shuffle-vector.mir |
 | llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp |
Commit
8cf7ddbdd4e5af966a369e170c73250f2e3920e7
by martin
Revert "Prepare Compiler-RT for GnuInstallDirs, matching libcxx"
This reverts commit 9a9bc76c0eb72f0f2732c729a460abbd5239c2e3.
That commit broke "ninja install" when building compiler-rt for mingw targets, building standalone (pointing cmake at the compiler-rt directory) with cmake 3.16.3 (the one shipped in ubuntu 20.04), with errors like this:
    -- Install configuration: "Release"
    CMake Error at cmake_install.cmake:44 (file):
      file cannot create directory: /include/sanitizer. Maybe need administrative privileges.
    Call Stack (most recent call first):
      /home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/cmake_install.cmake:37 (include)

    FAILED: include/CMakeFiles/install-compiler-rt-headers
    cd /home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/include && /usr/bin/cmake -DCMAKE_INSTALL_COMPONENT="compiler-rt-headers" -P /home/martin/code/llvm-mingw/src/llvm-project/compiler-rt/build-i686-sanitizers/cmake_install.cmake
    ninja: build stopped: subcommand failed.
|
 | compiler-rt/cmake/Modules/CompilerRTDarwinUtils.cmake |
 | compiler-rt/include/CMakeLists.txt |
 | compiler-rt/cmake/base-config-ix.cmake |
 | compiler-rt/cmake/Modules/AddCompilerRT.cmake |
 | compiler-rt/lib/dfsan/CMakeLists.txt |
 | compiler-rt/cmake/Modules/CompilerRTUtils.cmake |
Commit
41b6057641720e6ba7d4b6c7c2905f2870a885d3
by sander.desmalen
[InstructionCost] Add saturation support.
This patch makes the operations on InstructionCost saturate, so that when costs are accumulated they saturate to <max value>.
One compelling reason for saturation support is that arbitrary values are used in various places to represent a 'high' cost, but overflow is not taken into account when accumulating the cost of some set of operations or a loop, which may lead to unexpected results. By defining the operations to saturate, we can express the cost of something 'very expensive' as InstructionCost::getMax().
Reviewed By: kparzysz, dmgreen
Differential Revision: https://reviews.llvm.org/D105108
|
 | llvm/unittests/Support/InstructionCostTest.cpp |
 | llvm/include/llvm/Support/InstructionCost.h |
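The saturating accumulation described in the InstructionCost commit above can be illustrated with a minimal sketch. This is not the code from the patch: `CostTy`, `MaxCost`, `MinCost`, and `saturatingAdd` are hypothetical names, and a 64-bit signed cost type is assumed for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-ins for illustration; not the real llvm::InstructionCost.
using CostTy = std::int64_t;
constexpr CostTy MaxCost = std::numeric_limits<CostTy>::max();
constexpr CostTy MinCost = std::numeric_limits<CostTy>::min();

// Add two costs, clamping to MaxCost/MinCost instead of overflowing.
// The range checks run before the addition, so no signed overflow occurs.
inline CostTy saturatingAdd(CostTy A, CostTy B) {
  if (B > 0 && A > MaxCost - B)
    return MaxCost; // would overflow upwards
  if (B < 0 && A < MinCost - B)
    return MinCost; // would overflow downwards
  return A + B;
}
```

With this shape, a 'very expensive' operation can be modelled as `MaxCost`, and adding further costs to it leaves the total at `MaxCost` rather than wrapping around.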
Commit
239fcda268dc554daf4c2bf20888c72c55518bda
by sander.desmalen
[LV] NFCI: Do cost comparison on InstructionCost directly.
Instead of performing the isMoreProfitable() operation on InstructionCost::CostTy the operation is performed on InstructionCost directly, so that it can handle the case where one of the costs is Invalid.
This patch also changes the CostTy to be int64_t, so that the type is wide enough to deal with multiplications with e.g. `unsigned MaxTripCount`.
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D105113
|
 | llvm/lib/Transforms/Vectorize/LoopVectorize.cpp |
 | llvm/include/llvm/Support/InstructionCost.h |
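To show why comparing on the cost object rather than on its raw value helps when a cost is Invalid, here is a small sketch. Everything in it is an assumption made for illustration: `SketchCost`, `scale`, and this `isMoreProfitable` are hypothetical, the ordering "any valid cost is less than an invalid one" is assumed, and the cross-multiplied per-lane comparison is just one plausible form of a profitability check, not the code in LoopVectorize.cpp.

```cpp
#include <cstdint>
#include <tuple>

// Hypothetical cost type: a value plus a validity flag.
struct SketchCost {
  bool Invalid;
  std::int64_t Value;
};

// Assumed ordering: compare validity first, so every valid cost orders
// before every invalid cost; only then compare the numeric values.
inline bool operator<(const SketchCost &L, const SketchCost &R) {
  return std::tie(L.Invalid, L.Value) < std::tie(R.Invalid, R.Value);
}

// Scale a cost by a factor, carrying the validity flag along unchanged.
inline SketchCost scale(SketchCost C, std::int64_t Factor) {
  return {C.Invalid, C.Value * Factor};
}

// Compare cost per lane without dividing: CostA/WidthA < CostB/WidthB
// is checked as CostA*WidthB < CostB*WidthA.
inline bool isMoreProfitable(SketchCost CostA, unsigned WidthA,
                             SketchCost CostB, unsigned WidthB) {
  return scale(CostA, WidthB) < scale(CostB, WidthA);
}
```

Because the flag takes part in the comparison, an invalid candidate never wins against a valid one, whatever numeric value it happens to carry. The 64-bit value also leaves headroom for the multiplication by a width or trip count.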
Commit
d919bca87556548555af0a7aa1239ea64ba4f3e8
by andrea.dibiagio
[llvm-mca][JSON] Further refactoring of the JSON printing logic.
This patch renames object "Resources" to "TargetInfo".
Moved the getJSONTargetInfo method from class InstructionView to the PipelinePrinter.
Removed uses of std::stringstream. Removed unused method View::printViewJSON().
|
 | llvm/tools/llvm-mca/Views/View.h |
 | llvm/tools/llvm-mca/PipelinePrinter.cpp |
 | llvm/test/tools/llvm-mca/JSON/X86/views-multiple-region.s |
 | llvm/tools/llvm-mca/Views/InstructionView.cpp |
 | llvm/tools/llvm-mca/Views/View.cpp |
 | llvm/tools/llvm-mca/PipelinePrinter.h |
 | llvm/test/tools/llvm-mca/JSON/X86/instruction-tables-multiple-regions.s |
 | llvm/tools/llvm-mca/Views/InstructionView.h |
 | llvm/test/tools/llvm-mca/JSON/X86/views.s |
Commit
4fe0fcd1c032091f60cabc70ee72a3b0f529a875
by andrea.dibiagio
[llvm-mca][JSON] Teach the PipelinePrinter how to deal with anonymous code regions (PR51008)
This patch addresses the last remaining problems reported in PR51008.
Previous fixes for PR51008 worked under the wrong assumption that code regions are always named (except maybe for the default region, which was automatically named "main").
In reality, it is quite common for users to declare multiple anonymous regions. So we cannot really use the region name as the key string of a JSON object. In practice, code region names are completely optional.
Using "main" for the default region was also problematic because there can be another region with that same name.
This patch fixes these issues by introducing a json::array of regions. Each region has a "Name" field, which would default to the empty string for anonymous regions.
Added a few more tests to verify that the JSON file format is still valid, and that multiple anonymous regions all appear in the final output.
|
 | llvm/test/tools/llvm-mca/JSON/X86/views-multiple-anonymous-regions.s |
 | llvm/test/tools/llvm-mca/JSON/X86/instruction-tables-multiple-regions.s |
 | llvm/test/tools/llvm-mca/JSON/X86/instruction-tables-multiple-anonymous-regions.s |
 | llvm/test/tools/llvm-mca/JSON/X86/views-multiple-region.s |
 | llvm/test/tools/llvm-mca/JSON/X86/views.s |
 | llvm/tools/llvm-mca/PipelinePrinter.cpp |
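The reason an array works where a name-keyed object does not can be seen in a short sketch. It is not the PipelinePrinter code: `regionsToJSON` and the top-level key `"Regions"` are hypothetical, plain string concatenation is used instead of LLVM's JSON support, and region names are assumed to need no escaping.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Emit one array element per region. Anonymous regions get an empty "Name",
// and duplicate names are harmless because the name is a field, not a key.
// Assumption: names contain no characters that require JSON escaping.
inline std::string regionsToJSON(const std::vector<std::string> &RegionNames) {
  std::string Out = "{\"Regions\":[";
  for (std::size_t I = 0; I < RegionNames.size(); ++I) {
    if (I != 0)
      Out += ",";
    Out += "{\"Name\":\"" + RegionNames[I] + "\"}";
  }
  Out += "]}";
  return Out;
}
```

Two anonymous regions followed by a region named "main" all survive in the output, whereas keying an object by name would collapse the two empty names into a single entry.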
Commit
a328ee6577980d7b3a575bebf5279b4a38ec14ed
by llvm-dev
[X86] Add tests from D93707 for fsub_strict(x,fneg(y)) -> fadd_strict(x,y) folds.
Also, add matching i686 coverage to strict-fadd-combines.ll and regenerate checks.
|
 | llvm/test/CodeGen/X86/strict-fadd-combines.ll |
 | llvm/test/CodeGen/X86/strict-fsub-combines.ll |
Commit
8f4e5474de74169c6c2f7dddbb84c93d3e3ccb07
by kazu
[AFDO] Require x86_64-linux in a testcase
This patch fixes a testcase failure by requiring x86_64-linux in a testcase.
|
 | llvm/test/Transforms/SampleProfile/merge-function-attributes.ll |
Commit
1d0456361a4216855e8e7646dc28a372aff07efb
by jdenny.ornl
[OpenMP] Avoid checking parent reference count in targetDataEnd
The patch has the following benefits:
* Eliminates a lock/unlock of the data mapping table.
* Clarifies the logic that determines whether a struct member's device-to-host transfer occurs. The old logic, which checks the parent struct's reference count, is a leftover from back when we had a different map interface (as pointed out at <https://reviews.llvm.org/D104924#2846972>).
Reviewed By: grokos
Differential Revision: https://reviews.llvm.org/D104924
|
 | openmp/libomptarget/src/omptarget.cpp |
Commit
d99f65de2ab1765c588688876641f5018bfe3b53
by jdenny.ornl
[OpenMP] Avoid checking parent reference count in targetDataBegin
This patch is an attempt to do for `targetDataBegin` what D104924 does for `targetDataEnd`:
* Eliminates a lock/unlock of the data mapping table.
* Clarifies the logic that determines whether a struct member's host-to-device transfer occurs. The old logic, which checks the parent struct's reference count, is a leftover from back when we had a different map interface (as pointed out at <https://reviews.llvm.org/D104924#2846972>).
Additionally, it eliminates the `DeviceTy::getMapEntryRefCnt`, which is no longer used after this patch.
While D104924 does not change the computation of `IsLast`, I found I needed to change the computation of `IsNew` for this patch. As far as I can tell, the change is correct, and this patch does not cause any additional `openmp` tests to fail. However, I'm not sure I've thought of all use cases. Please advise.
Reviewed By: jdoerfert, jhuber6, protze.joachim, tianshilei1992, grokos, RaviNarayanaswamy
Differential Revision: https://reviews.llvm.org/D105121
|
 | openmp/libomptarget/src/device.cpp |
 | openmp/libomptarget/src/device.h |
 | openmp/libomptarget/src/omptarget.cpp |
Commit
f4f11ee4a705761e3dfc6c678412831f770f71d6
by joker.eph
[mlir][NFC] Switched `interfaces` to a private member of SSANameState.
`interfaces` is passed through to the `numberValuesIn*` functions with exactly the same value as when SSANameState is constructed. This just seems cleaner.
Also, a dependent PR adds `printerFlags` which follows similar code paths.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D105299
|
 | mlir/lib/IR/AsmPrinter.cpp |
Commit
2c0f17982f39b14c7ed13069d6ed959ef43d02d9
by joker.eph
[mlir] Added OpPrintingFlags to AsmState and SSANameState.
This enables checking the printing flags when formatting names in SSANameState.
Depends On D105299
Reviewed By: mehdi_amini, bondhugula
Differential Revision: https://reviews.llvm.org/D105300
|
 | mlir/include/mlir/IR/AsmState.h |
 | mlir/include/mlir/IR/Operation.h |
 | mlir/lib/Transforms/LocationSnapshot.cpp |
 | mlir/lib/IR/AsmPrinter.cpp |
Commit
ebbe149a6f08535ede848a531a601ae6591cfbc5
by joker.eph
[mlir] Gated calls to getAsm{Result,BlockArgument}Names on whether printing ops in generic form.
Depends On D105300
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D105301
|
 | mlir/lib/IR/AsmPrinter.cpp |
 | mlir/test/mlir-lsp-server/hover.test |
 | mlir/test/IR/print-op-generic.mlir |
Commit
be5d46e9bbc92ffbff26fa56181f5e21f9e30761
by johannes
[Attributor][FIX] Traverse uses even if a value is assumed constant
Not all attributes are able to handle the interprocedural step and follow the uses into a call site. Let them be able to combine call site uses instead. This might result in some unused values/arguments being leftover but it removes problems where we misused "is dead" even though it was actually "is simplified/replaced".
We explicitly check for dead values due to constant propagation in `AAIsDeadValueImpl::areAllUsesAssumedDead` instead.
Differential Revision: https://reviews.llvm.org/D103858
|
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/alignment.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/openmp_parallel_for.ll |
 | llvm/test/Transforms/Attributor/internal-noalias.ll |
 | llvm/test/Transforms/Attributor/range.ll |
 | llvm/test/Transforms/Attributor/internalize.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/fp80.ll |
 | llvm/test/Transforms/Attributor/liveness.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/control-flow2.ll |
 | llvm/test/Transforms/Attributor/nonnull.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/2009-09-24-byval-ptr.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/musttail-call.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/dangling-block-address.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/attrs.ll |
 | llvm/test/Transforms/Attributor/align.ll |
 | llvm/test/Transforms/Attributor/norecurse.ll |
 | llvm/test/Transforms/Attributor/nodelete.ll |
 | llvm/test/Transforms/Attributor/potential.ll |
 | llvm/test/Transforms/Attributor/depgraph.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/reserve-tbaa.ll |
 | llvm/test/Transforms/Attributor/memory_locations.ll |
 | llvm/test/Transforms/Attributor/noundef.ll |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
Commit
93a279a67dc05c4ce2e3476ee249c1f64634d6e4
by johannes
[Attributor] Introduce an optimistic getUnderlyingObjects helper
As the `llvm::getUnderlyingObjects` helper, the optimistic version collects objects that might be the base of a given pointer. In contrast to the llvm variant, the optimistic one will use assumed information, e.g., about select conditions or dead blocks, to provide a more precise result.
Differential Revision: https://reviews.llvm.org/D103859
|
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/test/Transforms/Attributor/nocapture-1.ll |
Commit
374e573cfc2b85ee2bc661bfb5fdaeb026fb1cc9
by johannes
[Attributor] Use AAValueSimplify to simplify returned values
We should use AAValueSimplify for all value simplification, however there was some leftover logic that predates AAValueSimplify in AAReturnedValues. This removes the AAReturnedValues part and provides a replacement by making AAValueSimplifyReturned strong enough to handle all previously covered cases. Further, this improves AAValueSimplifyCallSiteReturned to handle returned arguments.
AAReturnedValues is now much easier and the collected returned values/instructions are now from the associated function only, making it much more sane. We also do not have the brittle logic anymore that looks for unresolved calls. Instead, we use AAValueSimplify to handle recursion.
Useful code has been split into helper functions, e.g., an Attributor interface to get a simplified value.
Differential Revision: https://reviews.llvm.org/D103860
|
 | llvm/test/Transforms/Attributor/ArgumentPromotion/profile.ll |
 | llvm/test/Transforms/Attributor/nonnull.ll |
 | llvm/include/llvm/Transforms/IPO/Attributor.h |
 | llvm/test/Transforms/Attributor/undefined_behavior.ll |
 | llvm/test/Transforms/Attributor/dereferenceable-2.ll |
 | llvm/test/Transforms/Attributor/cb_liveness_enabled.ll |
 | llvm/test/Transforms/Attributor/cb_range_enabled.ll |
 | llvm/test/Transforms/Attributor/internalize.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack_gpu.ll |
 | llvm/test/Transforms/Attributor/dereferenceable-2-inseltpoison.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/arg-count-mismatch.ll |
 | llvm/test/Transforms/Attributor/noalias.ll |
 | llvm/test/Transforms/Attributor/align.ll |
 | llvm/test/Transforms/Attributor/range.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack.ll |
 | llvm/test/Transforms/Attributor/nocapture-1.ll |
 | llvm/test/Transforms/Attributor/returned.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/openmp_parallel_for.ll |
 | llvm/test/Transforms/Attributor/cb_liveness_disabled.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/return-argument.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR26044.ll |
 | llvm/test/Transforms/Attributor/readattrs.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/sret.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/test/Transforms/OpenMP/replace_globalization.ll |
 | llvm/test/Transforms/Attributor/read_write_returned_arguments_scc.ll |
 | llvm/test/Transforms/Attributor/memory_locations.ll |
 | llvm/test/Transforms/Attributor/potential.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/pthreads.ll |
 | llvm/test/Transforms/Attributor/depgraph.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/inalloca.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/musttail-call.ll |
 | llvm/test/Transforms/Attributor/nocapture-2.ll |
 | llvm/test/Transforms/Attributor/cgscc_bugs.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR16052.ll |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/multiple_callbacks.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
Commit
1eb31d6de36bc274fa8fb692615515223148ac5e
by johannes
[Attributor] Reorganize AAHeapToStack
In order to simplify future extensions, e.g., the merge of AAHeapToShared into AAHeapToStack, we reorganize AAHeapToStack and the state we keep for each malloc-like call. The result is also less confusing as we only track malloc-like calls, not all calls. Further, we only perform the updates necessary for a malloc-like call to argue it can go to the stack, e.g., we won't check all uses if we moved on to the "must-be-freed" argument.
This patch also uses Attributor helpers to simplify the allocated size, alignment, and the potentially freed objects.
Overall, this is mostly a reorganization and only the use of the optimistic helpers should change (=improve) the capabilities a bit.
Differential Revision: https://reviews.llvm.org/D104993
|
 | llvm/test/Transforms/Attributor/depgraph.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack.ll |
 | llvm/test/Transforms/OpenMP/remove_globalization.ll |
 | llvm/include/llvm/Transforms/IPO/Attributor.h |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
Commit
5003ba2542c14ab203ff6875af496d816144a5f4
by johannes
[Attributor] Look through selects in genericValueTraversal
If we can simplify the select condition we can avoid one value in the traversal.
Differential Revision: https://reviews.llvm.org/D103861
|
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/test/Transforms/Attributor/lvi-for-ashr.ll |
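The idea of looking through selects in the commit above can be sketched on a toy value graph. None of this is Attributor code: `Node`, `Cond`, and `collectLeaves` are hypothetical, and the simplified condition is simply stored on the node rather than queried from an analysis.

```cpp
#include <vector>

// Hypothetical tri-state result of trying to simplify a select condition.
enum class Cond { Unknown, True, False };

// Toy value: a leaf when TrueVal is null, otherwise a select with two arms.
struct Node {
  const Node *TrueVal;
  const Node *FalseVal;
  Cond SimplifiedCond;
  int Id;
};

// Collect the leaf values a node may evaluate to. When the select condition
// is known, only the chosen arm is visited, so one value is skipped.
inline void collectLeaves(const Node &N, std::vector<int> &Out) {
  if (!N.TrueVal) {
    Out.push_back(N.Id);
    return;
  }
  if (N.SimplifiedCond != Cond::False)
    collectLeaves(*N.TrueVal, Out);
  if (N.SimplifiedCond != Cond::True)
    collectLeaves(*N.FalseVal, Out);
}
```

With an unknown condition both arms are reported. Once the condition simplifies to true or false, the traversal yields a single leaf, which gives later reasoning a more precise set of values to work with.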
Commit
1d5711c3eeb62098b46d4d383f2e849b9756105d
by johannes
[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch.
The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was in parts extracted from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
|
 | llvm/include/llvm/Frontend/OpenMP/OMPKinds.def |
 | llvm/test/Transforms/OpenMP/replace_globalization.ll |
 | llvm/test/Transforms/OpenMP/single_threaded_execution.ll |
 | llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h |
 | openmp/libomptarget/deviceRTLs/common/include/target.h |
 | openmp/libomptarget/deviceRTLs/interface.h |
 | openmp/libomptarget/deviceRTLs/common/src/parallel.cu |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu |
 | clang/lib/CodeGen/CGOpenMPRuntimeGPU.h |
 | openmp/libomptarget/deviceRTLs/common/src/omptarget.cu |
 | clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp |
 | llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp |
Commit
f0628c6ff7ba2f3ceeb99791e5e34028de0c82c4
by johannes
[OpenMP] Create custom state machines for generic target regions
In the spirit of TRegions [0], this patch creates a custom state machine for a generic target region based on the potentially called parallel regions.
The code analysis is done interprocedurally via an abstract attribute (AAKernelInfo). All outermost parallel regions are collected and we check if there might be unknown outermost parallel regions for which we need an indirect call. Other AAKernelInfo extensions are expected.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D101977
|
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/test/Transforms/OpenMP/globalization_remarks.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/test/Transforms/OpenMP/single_threaded_execution.ll |
 | llvm/test/Transforms/OpenMP/remove_globalization.ll |
 | llvm/test/Transforms/OpenMP/replace_globalization.ll |
 | llvm/test/Transforms/OpenMP/custom_state_machines.ll |
 | llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll |
Commit
ae08df87dfbae62542d2a37ecebbbc5fa04b82f4
by johannes
[Attributor][FIX] Do not replace a value with a non-dominating instruction
We have to be careful when we replace values to not use a non-dominating instruction. It makes sense that simplification offers those as "simplified values" but we can't manifest them in the IR without PHI nodes. In the future we should consider potentially adding those PHI nodes.
|
 | llvm/test/Transforms/Attributor/heap_to_stack_gpu.ll |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/return-argument.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR16052.ll |
 | llvm/test/Transforms/Attributor/noalias.ll |
 | llvm/test/Transforms/Attributor/nonnull.ll |
 | llvm/include/llvm/Transforms/IPO/Attributor.h |
 | llvm/test/Transforms/Attributor/memory_locations.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR26044.ll |
Commit
966342790e8d4a64295845b50f49117d4ec9e7cf
by johannes
[Attributor][FIX] Sanitize queries to LVI and ScalarEvolution
When we talk to outside analyses, e.g., LVI and ScalarEvolution, we need to be careful with the query. The particular error occurred because we folded a PHI node before the LVI query but the context location was now not dominated by the value anymore. This is not supported by LVI so we have to filter these situations before we query the outside analyses.
|
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
Commit
e603ca0306d7f41d407b013c8cb27ca49e571de2
by johannes
[OpenMP] Remove checkXXXX device runtime functions
We had multiple functions to determine the execution mode (SPMD/Generic) and runtime status (initialized/uninitialized) but that just increased complexity without a real benefit. Especially with D102307 in mind it is helpful to reduce the dependence on the `ident_t` flags.
Differential Revision: https://reviews.llvm.org/D105586
|
 | openmp/libomptarget/deviceRTLs/common/src/sync.cu |
 | openmp/libomptarget/deviceRTLs/common/src/parallel.cu |
 | openmp/libomptarget/deviceRTLs/common/src/support.cu |
 | openmp/libomptarget/deviceRTLs/common/src/reduction.cu |
 | openmp/libomptarget/deviceRTLs/common/support.h |
 | openmp/libomptarget/deviceRTLs/common/src/loop.cu |
 | openmp/libomptarget/deviceRTLs/common/src/task.cu |
Commit
d39179d7fa17a20e214cffeaad61a12e9b0d337a
by johannes
[OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks if it can be executed in SPMD-mode. If so, we flip the arguments of the __kmpc_target_init and deinit call to enable the mode. We also update the `<kernel>_exec_mode` flag to indicate to the runtime we changed the mode to SPMD.
The code analysis is done interprocedurally by extending the AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
|
 | llvm/test/Transforms/OpenMP/custom_state_machines.ll |
 | llvm/include/llvm/Frontend/OpenMP/OMPConstants.h |
 | llvm/test/Transforms/OpenMP/spmdization_remarks.ll |
 | llvm/test/Transforms/OpenMP/spmdization.ll |
 | llvm/lib/IR/Assumptions.cpp |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll |
Commit
269416d41908bb670f67af689155d5ab8eea689a
by johannes
[Attributor][NFCI] Add UsedAssumedInformation to more interfaces
As with other Attributor interfaces we often want to know if assumed information was used to answer a query. This is important if only known information is allowed or if known information can lead to an early fixpoint. The users have been adjusted but none of them utilizes the new information yet.
|
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/include/llvm/Transforms/IPO/Attributor.h |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
Commit
768510632c5ddbf9438693d9c7db1903e39295ad
by thakis
Revert "llvm-symbolizer: Fix "start file" to work with Split DWARF"
This reverts commit 04c203e310bd3fb58e16c936c0200d680100526e. Test fails on Windows.
|
 | llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp |
 | llvm/test/DebugInfo/X86/symbolize_function_start.s |
 | llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h |
 | llvm/lib/DebugInfo/DWARF/DWARFDie.cpp |
Commit
f01d45c378cd0271e279d971c79d6e4900f045e0
by v.g.vassilev
Reland "[clang-repl] Allow passing in code as positional arguments."
This reverts commit 3ec88ca60b24 which reverted e386871e1d21 due to an ASan build failure.
This patch removes the new lines in the test case which seem to introduce the failure.
Differential revision: https://reviews.llvm.org/D104898
|
 | clang/tools/clang-repl/ClangRepl.cpp |
 | clang/test/Interpreter/execute.cpp |
Commit
d3e749133319aaea6a143b1404154345a3cc2541
by thakis
Revert Attributor patch series
Broke check-clang, see https://reviews.llvm.org/D102307#2869065
Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`
|
 | llvm/test/Transforms/Attributor/norecurse.ll |
 | llvm/test/Transforms/OpenMP/custom_state_machines.ll |
 | llvm/include/llvm/Frontend/OpenMP/OMPKinds.def |
 | llvm/test/Transforms/Attributor/IPConstantProp/pthreads.ll |
 | llvm/test/Transforms/Attributor/internalize.ll |
 | openmp/libomptarget/deviceRTLs/common/src/reduction.cu |
 | clang/lib/CodeGen/CGOpenMPRuntimeGPU.h |
 | llvm/test/Transforms/Attributor/range.ll |
 | llvm/test/Transforms/Attributor/dereferenceable-2-inseltpoison.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/return-argument.ll |
 | openmp/libomptarget/deviceRTLs/interface.h |
 | llvm/lib/IR/Assumptions.cpp |
 | llvm/test/Transforms/Attributor/nocapture-2.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/attrs.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR16052.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/multiple_callbacks.ll |
 | llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h |
 | llvm/test/Transforms/Attributor/depgraph.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/inalloca.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/profile.ll |
 | llvm/test/Transforms/Attributor/cb_range_enabled.ll |
 | llvm/test/Transforms/OpenMP/replace_globalization.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/alignment.ll |
 | openmp/libomptarget/deviceRTLs/common/src/omptarget.cu |
 | llvm/test/Transforms/Attributor/noundef.ll |
 | llvm/test/Transforms/Attributor/potential.ll |
 | llvm/test/Transforms/Attributor/nocapture-1.ll |
 | llvm/test/Transforms/Attributor/cgscc_bugs.ll |
 | llvm/test/Transforms/Attributor/read_write_returned_arguments_scc.ll |
 | llvm/test/Transforms/Attributor/noalias.ll |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | openmp/libomptarget/deviceRTLs/common/src/task.cu |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/sret.ll |
 | llvm/include/llvm/Frontend/OpenMP/OMPConstants.h |
 | llvm/test/Transforms/OpenMP/remove_globalization.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/control-flow2.ll |
 | llvm/test/Transforms/Attributor/memory_locations.ll |
 | llvm/test/Transforms/Attributor/cb_liveness_enabled.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/arg-count-mismatch.ll |
 | llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp |
 | llvm/test/Transforms/Attributor/returned.ll |
 | llvm/test/Transforms/OpenMP/globalization_remarks.ll |
 | llvm/test/Transforms/Attributor/nonnull.ll |
 | llvm/test/Transforms/OpenMP/spmdization.ll |
 | openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/test/Transforms/Attributor/IPConstantProp/musttail-call.ll |
 | llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/2009-09-24-byval-ptr.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/fp80.ll |
 | llvm/test/Transforms/Attributor/undefined_behavior.ll |
 | llvm/test/Transforms/Attributor/nodelete.ll |
 | openmp/libomptarget/deviceRTLs/common/support.h |
 | llvm/test/Transforms/Attributor/align.ll |
 | llvm/test/Transforms/Attributor/dereferenceable-2.ll |
 | openmp/libomptarget/deviceRTLs/common/src/parallel.cu |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/reserve-tbaa.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack.ll |
 | llvm/test/Transforms/OpenMP/spmdization_remarks.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR26044.ll |
 | llvm/test/Transforms/Attributor/internal-noalias.ll |
 | openmp/libomptarget/deviceRTLs/common/src/loop.cu |
 | openmp/libomptarget/deviceRTLs/common/include/target.h |
 | llvm/test/Transforms/Attributor/readattrs.ll |
 | llvm/test/Transforms/Attributor/liveness.ll |
 | llvm/test/Transforms/Attributor/cb_liveness_disabled.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/test/Transforms/Attributor/heap_to_stack_gpu.ll |
 | llvm/test/Transforms/Attributor/lvi-for-ashr.ll |
 | clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp |
 | openmp/libomptarget/deviceRTLs/common/src/support.cu |
 | llvm/test/Transforms/OpenMP/single_threaded_execution.ll |
 | openmp/libomptarget/deviceRTLs/common/src/sync.cu |
 | llvm/include/llvm/Transforms/IPO/Attributor.h |
 | llvm/test/Transforms/Attributor/IPConstantProp/dangling-block-address.ll |
Commit
5b12cf3e659bb7e1a975b3b866b933c0c2acff10
by johannes
[Attributor][FIX] Traverse uses even if a value is assumed constant
Not all attributes are able to handle the interprocedural step and follow the uses into a call site. Let them be able to combine call site uses instead. This might result in some unused values/arguments being leftover but it removes problems where we misused "is dead" even though it was actually "is simplified/replaced".
We explicitly check for dead values due to constant propagation in `AAIsDeadValueImpl::areAllUsesAssumedDead` instead.
Differential Revision: https://reviews.llvm.org/D103858
|
 | llvm/test/Transforms/Attributor/norecurse.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/control-flow2.ll |
 | llvm/test/Transforms/Attributor/range.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/test/Transforms/Attributor/nonnull.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/dangling-block-address.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/alignment.ll |
 | llvm/test/Transforms/Attributor/align.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/test/Transforms/Attributor/internal-noalias.ll |
 | llvm/test/Transforms/Attributor/potential.ll |
 | llvm/test/Transforms/Attributor/nodelete.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/2009-09-24-byval-ptr.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/attrs.ll |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/test/Transforms/Attributor/liveness.ll |
 | llvm/test/Transforms/Attributor/memory_locations.ll |
 | llvm/test/Transforms/Attributor/depgraph.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/fp80.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/musttail-call.ll |
 | llvm/test/Transforms/Attributor/internalize.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/reserve-tbaa.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/openmp_parallel_for.ll |
 | llvm/test/Transforms/Attributor/noundef.ll |
Commit
0aab13aaf942a1e4fbf21338fa2223dc292bbc46
by johannes
[Attributor] Introduce an optimistic getUnderlyingObjects helper
As the `llvm::getUnderlyingObjects` helper, the optimistic version collects objects that might be the base of a given pointer. In contrast to the llvm variant, the optimistic one will use assumed information, e.g., about select conditions or dead blocks, to provide a more precise result.
Differential Revision: https://reviews.llvm.org/D103859
|
 | llvm/test/Transforms/Attributor/nocapture-1.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
Commit
5ef18e2421835251eb5176bf2e711516b1f4c670
by johannes
[Attributor] Use AAValueSimplify to simplify returned values
We should use AAValueSimplify for all value simplification, however there was some leftover logic that predates AAValueSimplify in AAReturnedValues. This removes the AAReturnedValues part and provides a replacement by making AAValueSimplifyReturned strong enough to handle all previously covered cases. Further, this improves AAValueSimplifyCallSiteReturned to handle returned arguments.
AAReturnedValues is now much easier and the collected returned values/instructions are now from the associated function only, making it much more sane. We also do not have the brittle logic anymore that looks for unresolved calls. Instead, we use AAValueSimplify to handle recursion.
Useful code has been split into helper functions, e.g., an Attributor interface to get a simplified value.
Differential Revision: https://reviews.llvm.org/D103860
|
 | llvm/test/Transforms/Attributor/IPConstantProp/multiple_callbacks.ll |
 | llvm/test/Transforms/Attributor/internalize.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR26044.ll |
 | llvm/test/Transforms/Attributor/memory_locations.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR16052.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/pthreads.ll |
 | llvm/test/Transforms/Attributor/range.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack.ll |
 | llvm/test/Transforms/OpenMP/replace_globalization.ll |
 | llvm/test/Transforms/Attributor/cb_liveness_disabled.ll |
 | llvm/test/Transforms/Attributor/cb_range_enabled.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/sret.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/arg-count-mismatch.ll |
 | llvm/include/llvm/Transforms/IPO/Attributor.h |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/inalloca.ll |
 | llvm/test/Transforms/Attributor/read_write_returned_arguments_scc.ll |
 | llvm/test/Transforms/Attributor/nonnull.ll |
 | llvm/test/Transforms/Attributor/dereferenceable-2.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/test/Transforms/Attributor/depgraph.ll |
 | llvm/test/Transforms/Attributor/undefined_behavior.ll |
 | llvm/test/Transforms/Attributor/dereferenceable-2-inseltpoison.ll |
 | llvm/test/Transforms/Attributor/ArgumentPromotion/profile.ll |
 | llvm/test/Transforms/Attributor/returned.ll |
 | llvm/test/Transforms/Attributor/nocapture-1.ll |
 | llvm/test/Transforms/Attributor/cgscc_bugs.ll |
 | llvm/test/Transforms/Attributor/noalias.ll |
 | llvm/test/Transforms/Attributor/cb_liveness_enabled.ll |
 | llvm/test/Transforms/Attributor/align.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack_gpu.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/musttail-call.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/return-argument.ll |
 | llvm/test/Transforms/Attributor/nocapture-2.ll |
 | llvm/test/Transforms/Attributor/readattrs.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/test/Transforms/Attributor/IPConstantProp/openmp_parallel_for.ll |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/test/Transforms/Attributor/potential.ll |
Commit
a6470408cf3601391c6c85f8b3a743f2b5fbaad2
by david.green[ARM] Extra widening and narrowing combinations tests. NFC
|
 | llvm/test/CodeGen/Thumb2/block-placement.mir |
 | llvm/test/CodeGen/Thumb2/mve-widen-narrow.ll |
Commit
dbb3a65f5b30ff78e0a7165b377180a00e580f8c
by johannes[Attributor][FIX] Do not replace a value with a non-dominating instruction
We have to be careful when we replace values to not use a non-dominating instruction. It makes sense that simplification offers those as "simplified values" but we can't manifest them in the IR without PHI nodes. In the future we should consider potentially adding those PHI nodes.
|
 | llvm/test/Transforms/Attributor/noalias.ll |
 | llvm/test/Transforms/Attributor/memory_locations.ll |
 | llvm/include/llvm/Transforms/IPO/Attributor.h |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR16052.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack.ll |
 | llvm/test/Transforms/Attributor/nonnull.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/test/Transforms/Attributor/IPConstantProp/PR26044.ll |
 | llvm/test/Transforms/Attributor/IPConstantProp/return-argument.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack_gpu.ll |
Commit
c1c1fe93852e88b544c46087363400751b3a3ceb
by johannes[Attributor] Reorganize AAHeapToStack
In order to simplify future extensions, e.g., the merge of AAHeapToShared into AAHeapToStack, we reorganize AAHeapToStack and the state we keep for each malloc-like call. The result is also less confusing as we only track malloc-like calls, not all calls. Further, we only perform the updates necessary for a malloc-like call to argue it can go to the stack, e.g., we won't check all uses if we moved on to the "must-be-freed" argument.
This patch also uses Attributor helpers to simplify the allocated size, alignment, and the potentially freed objects.
Overall, this is mostly a reorganization and only the use of the optimistic helpers should change (=improve) the capabilities a bit.
Differential Revision: https://reviews.llvm.org/D104993
|
 | llvm/include/llvm/Transforms/IPO/Attributor.h |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/test/Transforms/Attributor/depgraph.ll |
 | llvm/test/Transforms/Attributor/heap_to_stack.ll |
 | llvm/test/Transforms/OpenMP/remove_globalization.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
Commit
5b05a5f6cee2ed3bda299d317907f8c89f4d089d
by johannes[OpenMP][FIX] Update remark in test file after rewording
|
 | llvm/test/Transforms/OpenMP/globalization_remarks.ll |
Commit
c1d53a316d6c7f7d80908ec8b5a65172f82e9721
by johannes[Attributor] Look through selects in genericValueTraversal
If we can simplify the select condition we can avoid one value in the traversal.
Differential Revision: https://reviews.llvm.org/D103861
|
 | llvm/test/Transforms/Attributor/lvi-for-ashr.ll |
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
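The select look-through described in c1d53a316d6c can be illustrated with a minimal sketch. This is a toy model, not the Attributor's code: the `Value` struct, its `assumedCond` field, and `collectLeaves` are hypothetical names chosen for illustration. It only shows the idea that a select with a simplified condition contributes one operand to the traversal instead of two.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Toy value model; these types are hypothetical and are not LLVM's classes.
struct Value {
  enum Kind { Leaf, Select } kind = Leaf;
  int id = 0;                       // identifier, meaningful for leaf values
  std::optional<bool> assumedCond;  // simplified select condition, if known
  const Value *trueVal = nullptr;   // operand chosen when the condition is true
  const Value *falseVal = nullptr;  // operand chosen when the condition is false
};

// Collect the ids of the leaf values that V may evaluate to. A select whose
// condition is known contributes only the chosen operand; otherwise both
// operands are visited.
void collectLeaves(const Value &V, std::vector<int> &out) {
  if (V.kind == Value::Leaf) {
    out.push_back(V.id);
    return;
  }
  if (V.assumedCond.has_value()) {
    collectLeaves(*V.assumedCond ? *V.trueVal : *V.falseVal, out);
    return;
  }
  collectLeaves(*V.trueVal, out);
  collectLeaves(*V.falseVal, out);
}
```

With an unknown condition the traversal reports both operands; once the condition is assumed to be a constant, only the selected operand remains.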
Commit
4761d29633ac7889329dc6ef966eab01c7b7903d
by johannes[Attributor][FIX] Sanitize queries to LVI and ScalarEvolution
When we talk to outside analyses, e.g., LVI and ScalarEvolution, we need to be careful with the query. The particular error occurred because we folded a PHI node before the LVI query, so the context location was no longer dominated by the value. This is not supported by LVI, so we have to filter these situations before we query the outside analyses.
|
 | llvm/test/Transforms/Attributor/value-simplify.ll |
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
Commit
e2cfbfcc0c1f3a89ab79c5615f0789b6a9966dc5
by johannes[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL
In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch.
The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
This was in parts extracted from D59319.
Reviewed By: ABataev, JonChesterfield
Differential Revision: https://reviews.llvm.org/D101976
|
 | clang/test/OpenMP/nvptx_data_sharing.cpp |
 | clang/test/OpenMP/target_parallel_debug_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp |
 | clang/lib/CodeGen/CGOpenMPRuntimeGPU.h |
 | clang/test/OpenMP/nvptx_nested_parallel_codegen.cpp |
 | openmp/libomptarget/deviceRTLs/interface.h |
 | clang/test/OpenMP/nvptx_target_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_firstprivate_codegen.cpp |
 | openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu |
 | clang/test/OpenMP/nvptx_distribute_parallel_generic_mode_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_parallel_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_printf_codegen.c |
 | llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h |
 | clang/test/OpenMP/nvptx_target_teams_distribute_codegen.cpp |
 | clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp |
 | clang/test/OpenMP/nvptx_target_teams_distribute_simd_codegen.cpp |
 | clang/test/OpenMP/nvptx_force_full_runtime_SPMD_codegen.cpp |
 | openmp/libomptarget/deviceRTLs/common/src/omptarget.cu |
 | clang/test/OpenMP/nvptx_target_simd_codegen.cpp |
 | clang/test/OpenMP/remarks_parallel_in_target_state_machine.c |
 | clang/test/OpenMP/nvptx_SPMD_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp |
 | llvm/include/llvm/Frontend/OpenMP/OMPKinds.def |
 | clang/test/OpenMP/nvptx_lambda_capturing.cpp |
 | clang/test/OpenMP/declare_target_codegen_globalization.cpp |
 | openmp/libomptarget/deviceRTLs/common/src/parallel.cu |
 | openmp/libomptarget/deviceRTLs/common/include/target.h |
 | clang/test/OpenMP/nvptx_parallel_for_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_parallel_num_threads_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_parallel_reduction_codegen_tbaa_PR46146.cpp |
 | clang/test/OpenMP/nvptx_target_teams_codegen.cpp |
 | clang/test/OpenMP/nvptx_multi_target_parallel_codegen.cpp |
 | clang/test/OpenMP/amdgcn_target_codegen.cpp |
 | llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp |
 | clang/test/OpenMP/assumes_include_nvptx.cpp |
 | llvm/test/Transforms/OpenMP/single_threaded_execution.ll |
 | clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp |
 | clang/test/OpenMP/nvptx_parallel_codegen.cpp |
 | clang/test/OpenMP/nvptx_teams_codegen.cpp |
 | clang/test/OpenMP/target_parallel_for_debug_codegen.cpp |
 | clang/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c |
 | llvm/test/Transforms/OpenMP/replace_globalization.ll |
Commit
d9659bf6a036545125a39648b4abe838080299ec
by johannes[OpenMP] Create custom state machines for generic target regions
In the spirit of TRegions [0], this patch creates a custom state machine for a generic target region based on the potentially called parallel regions.
The code analysis is done interprocedurally via an abstract attribute (AAKernelInfo). All outermost parallel regions are collected and we check if there might be unknown outermost parallel regions for which we need an indirect call. Other AAKernelInfo extensions are expected.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D101977
|
 | llvm/test/Transforms/OpenMP/single_threaded_execution.ll |
 | llvm/lib/Transforms/IPO/Attributor.cpp |
 | llvm/test/Transforms/OpenMP/replace_globalization.ll |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/test/Transforms/OpenMP/remove_globalization.ll |
 | llvm/test/Transforms/OpenMP/globalization_remarks.ll |
 | llvm/test/Transforms/OpenMP/custom_state_machines.ll |
 | llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll |
 | llvm/test/Transforms/PhaseOrdering/openmp-opt-module.ll |
Commit
a706b94ea5560a7733e403006a9066cc41e82b5d
by johannes[OpenMP][NFCI] Re-enable two remarks tests after D101977 landed
|
 | clang/test/OpenMP/remarks_parallel_in_target_state_machine.c |
 | clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c |
Commit
0a223827de8d923f357bf6d3d222fd26e2fbca4a
by johannes[OpenMP] Remove checkXXXX device runtime functions
We had multiple functions to determine the execution mode (SPMD/Generic) and runtime status (initialized/uninitialized) but that just increased complexity without a real benefit. Especially with D102307 in mind it is helpful to reduce the dependence on the `ident_t` flags.
Differential Revision: https://reviews.llvm.org/D105586
|
 | openmp/libomptarget/deviceRTLs/common/src/reduction.cu |
 | openmp/libomptarget/deviceRTLs/common/src/loop.cu |
 | openmp/libomptarget/deviceRTLs/common/src/sync.cu |
 | openmp/libomptarget/deviceRTLs/common/src/support.cu |
 | openmp/libomptarget/deviceRTLs/common/src/task.cu |
 | openmp/libomptarget/deviceRTLs/common/support.h |
 | openmp/libomptarget/deviceRTLs/common/src/parallel.cu |
Commit
8cb7d71355f9ca884efde1dfa03dc349fb890721
by johannes[OpenMP][FIX] Add missing `)` to remark
|
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll |
Commit
514c033db1e0c237eccd56b9fc11fe05a6baff39
by johannes[OpenMP] Detect SPMD compatible kernels and execute them as such
In the spirit of TRegions [0], this patch analyzes a kernel and tracks if it can be executed in SPMD-mode. If so, we flip the arguments of the __kmpc_target_init and deinit call to enable the mode. We also update the `<kernel>_exec_mode` flag to indicate to the runtime we changed the mode to SPMD.
The code analysis is done interprocedurally by extending the AAKernelInfo abstract attribute to track SPMD compatibility as well.
[0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11
Differential Revision: https://reviews.llvm.org/D102307
|
 | llvm/test/Transforms/OpenMP/custom_state_machines.ll |
 | clang/test/OpenMP/remarks_parallel_in_multiple_target_state_machines.c |
 | llvm/test/Transforms/OpenMP/spmdization.ll |
 | clang/test/OpenMP/remarks_parallel_in_target_state_machine.c |
 | llvm/lib/Transforms/IPO/OpenMPOpt.cpp |
 | llvm/lib/IR/Assumptions.cpp |
 | llvm/test/Transforms/OpenMP/spmdization_remarks.ll |
 | llvm/include/llvm/Frontend/OpenMP/OMPConstants.h |
 | llvm/test/Transforms/OpenMP/custom_state_machines_remarks.ll |
Commit
2e7e2994a94efad7fde5547d4e493e28b3b660a3
by johannes[Attributor][FIX] Destroy bump allocator objects to avoid leaks
AllocationInfo and DeallocationInfo objects themselves are allocated with the Attributor bump allocator and do not need to be deallocated. That said, the sets in AllocationInfo and DeallocationInfo need to be destroyed to avoid memory leaks.
|
 | llvm/lib/Transforms/IPO/AttributorAttributes.cpp |
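The leak pattern fixed in 2e7e2994a94e can be sketched in isolation. The code below is an assumption-laden illustration, not the Attributor's implementation: `BumpArena`, `AllocationRecord`, `LiveCounter`, `create`, and `destroy` are hypothetical names. It shows why an object placed in a bump allocator still needs its destructor run when it owns members (here a `std::set`) that allocate outside the arena.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <set>
#include <vector>

// Minimal bump allocator (hypothetical): hands out memory from one buffer and
// never frees individual objects.
class BumpArena {
  std::vector<unsigned char> buffer;
  std::size_t offset = 0;

public:
  explicit BumpArena(std::size_t bytes) : buffer(bytes) {}

  void *allocate(std::size_t size, std::size_t align) {
    void *p = buffer.data() + offset;
    std::size_t space = buffer.size() - offset;
    if (!std::align(align, size, p, space))
      return nullptr; // arena exhausted
    offset = static_cast<std::size_t>(
                 static_cast<unsigned char *>(p) - buffer.data()) + size;
    return p;
  }
};

// Counts live instances so the effect of running the destructor is observable.
struct LiveCounter {
  inline static int live = 0;
  LiveCounter() { ++live; }
  ~LiveCounter() { --live; }
};

// Stand-in for an object like AllocationInfo: the object itself lives in the
// arena, but the set's nodes live on the regular heap.
struct AllocationRecord {
  std::set<int> potentialFrees;
  LiveCounter counter;
};

// Construct a T inside the arena with placement new.
template <typename T> T *create(BumpArena &arena) {
  void *mem = arena.allocate(sizeof(T), alignof(T));
  return mem ? new (mem) T() : nullptr;
}

// Run the destructor without releasing arena memory; this frees the
// heap-backed members, which is the step the fix adds.
template <typename T> void destroy(T *obj) {
  if (obj)
    obj->~T();
}
```

Dropping the arena without calling `destroy` would reclaim the record's own storage but leave the set's heap nodes unreachable.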
Commit
86109fa9e84cd6630f5f14414779b890144b3fc3
by craig.topper[RISCV] Add test cases for div/rem with constant left hand side. NFC
Some of these would produce better code if we used W instructions, but constant LHS currently prevents that.
|
 | llvm/test/CodeGen/RISCV/div.ll |
 | llvm/test/CodeGen/RISCV/rem.ll |
Commit
4f94121cce24af28b64a9b67e2f5355bcca43574
by kazu[Analysis] Remove changeCondBranchToUnconditionalTo (NFC)
The last use was removed on Jan 21, 2021 in commit 0895b836d74ed333468ddece2102140494eb33b6.
|
 | llvm/include/llvm/Analysis/MemorySSAUpdater.h |
 | llvm/lib/Analysis/MemorySSAUpdater.cpp |
Commit
99b8c4682865f54498f47389f673df5d6a3558cd
by craig.topper[RISCV] Restore non-constant srem test I accidentally deleted. NFC
|
 | llvm/test/CodeGen/RISCV/rem.ll |
Commit
cbba7299f3085985e2c8e4c0a6643ce8a7d2b2db
by craig.topper[DivRemPairs] Add test cases for D87555. NFC
|
 | llvm/test/Transforms/DivRemPairs/X86/div-expanded-rem-pair.ll |
Commit
b447b9dce0d105e7f0b22db719fe8624108e99dc
by dblaikieReapply "llvm-symbolizer: Fix "start file" to work with Split DWARF"
Originally committed as 04c203e310bd3fb58e16c936c0200d680100526e. Reverted in 768510632c5ddbf9438693d9c7db1903e39295ad because the test failed when encountering Windows directory separators.
Fix the path separator platform issue with a FileCheck pattern {{[/\\]}}
Original commit message:
A followup to the feature added in 69da27c7496ea373567ce5121e6fe8613846e7a5 that added the optional "start file name" to match "start line" - but this didn't work with Split DWARF because of the need for the decl file number resolution code to refer back to the skeleton unit to find its .debug_line contribution. So this patch adds the necessary infrastructure to track the skeleton unit corresponding to a split full unit for the purpose of this lookup.
|
 | llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp |
 | llvm/lib/DebugInfo/DWARF/DWARFDie.cpp |
 | llvm/test/DebugInfo/X86/symbolize_function_start.s |
 | llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h |
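The `{{[/\\]}}` FileCheck pattern mentioned in b447b9dce0d1 wraps the regular expression `[/\\]`, a character class that accepts either a forward slash or a backslash. As a rough illustration outside FileCheck, the same class can be tried with `std::regex`; the function name and the `dir`/`file.c` path components below are hypothetical and are not taken from the test.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if `path` is "dir", one directory separator of either style,
// then "file.c". The character class [/\\] matches '/' or '\'.
bool matchesEitherSeparator(const std::string &path) {
  static const std::regex pattern(R"(dir[/\\]file\.c)");
  return std::regex_match(path, pattern);
}
```

Both `dir/file.c` and `dir\file.c` match, which is what lets one expected-output line serve POSIX and Windows hosts.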
Commit
09cdcf09b54d328fc0a247b3a0f351d2610e928f
by dblaikieFix windows directory separator some more for test from b447b9dce0d105e7f0b22db719fe8624108e99dc
|
 | llvm/test/DebugInfo/X86/symbolize_function_start.s |
Commit
1a5f4cbe1bd62e6624cbb77dad0d363addd1b324
by aqjune[InstCombine] Add optimization to prevent poison from being propagated.
In D104569, a freeze was inserted just before the br to solve the `branching on undef` miscompilation problem, but the added freeze disturbed value analysis.
```
v = load ptr
cond = freeze(icmp (and v, const), const')
br cond, ...
```
The case above is the one in which value analysis is disturbed. By moving the freeze so that it is applied immediately after the load, value analysis will be successful again.
```
v = load ptr
freeze(icmp (and v, const), const')
=>
v = load ptr
v' = freeze v
icmp (and v', const), const'
```
In this patch, I propose the above optimization. With this patch, the poison will not spread because the freeze is performed early.
Reviewed By: nikic, lebedev.ri
Differential Revision: https://reviews.llvm.org/D105392
|
 | llvm/test/Transforms/InstCombine/freeze.ll |
 | llvm/lib/Transforms/InstCombine/InstCombineInternal.h |
 | llvm/lib/Transforms/InstCombine/InstructionCombining.cpp |
Commit
d5c0b9c84886aea65d7148f403b08799bec9186e
by jezng[lld-macho][nfc] Expand the compact unwind symbol reloc test
Add a bit more detail to the comments, and check that the final binary does indeed have a `__unwind_info` section (D105557 previously regressed this).
Also rename the test to emphasize that we are testing compact unwind relocations, not relocations in general.
|
 | lld/test/MachO/compact-unwind-sym-relocs.s |
 | lld/test/MachO/relocs-syms-not-in-got.s |