Commit
c579c658cd42034449d4fa19f28b43f2082c0991
by leonardchan
[compiler-rt][profile] Make corrupted-profile.c more robust
This test specifically checks that profiles are not mergeable if there's a change in the CounterPtr in the profile header. The test manually changes CounterPtr by explicitly calling memset on some offset into the profile file. This test would fail if binary IDs were emitted because the offset calculation does not take into account the binary ID sizes.
This patch updates the test to use types provided in profile/InstrProfData.inc to make it more resistant to profile layout changes.
Differential Revision: https://reviews.llvm.org/D110277
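The fragility described above comes from hard-coding a byte offset into the profile header. A minimal sketch of the alternative, assuming a hypothetical `ExampleHeader` struct (the real field list is generated from profile/InstrProfData.inc and may differ) and a hypothetical helper `clobberCounterPtr`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the raw profile header. The real layout comes
// from profile/InstrProfData.inc; these fields are assumptions.
struct ExampleHeader {
  uint64_t Magic;
  uint64_t Version;
  uint64_t BinaryIdsSize; // a newly added field like this shifts later ones
  uint64_t DataSize;
  uint64_t CounterPtr;    // the field the test wants to corrupt
};

// Zero one header field in an in-memory copy of a profile. The field is
// located with offsetof, so adding or reordering earlier fields does not
// require touching this code.
inline void clobberCounterPtr(std::vector<unsigned char> &Profile) {
  assert(Profile.size() >= sizeof(ExampleHeader) && "buffer lacks a header");
  std::memset(Profile.data() + offsetof(ExampleHeader, CounterPtr), 0,
              sizeof(uint64_t));
}
```

With a hand-computed constant, inserting `BinaryIdsSize` would silently move the write onto the wrong field. Deriving the offset from the struct definition keeps the write on the intended field.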
|
 | compiler-rt/test/profile/Linux/corrupted-profile.c |
Commit
2c1defeee40cf643ea6f0fa5e01164c9a4c48c30
by tejohnson
[ThinLTO] Don't emit original GUID for locals to distributed indexes
In ThinLTO for locals we normally compute the GUID from the name after prepending the source path to get a unique global id. SamplePGO indirect call profiles contain the target GUID without this uniquification, however (unless compiling with -funique-internal-linkage-names). Therefore, the index contains the original GUID of the local symbols (without module path prepended to uniquify), in order to correctly handle the call edges added for these indirect call profile targets with SamplePGO.
We were emitting these to the combined index when writing it out as bitcode, which is unnecessary and causes overhead when writing out the indexes for distributed backends. The only use of the original GUID name is in the thin link. Suppress it in that case. This reduced the thin link time for a large distributed build by about 7%, and the aggregate size of the serialized indexes by over 2%.
Continue to print it when writing out the full index, since that is just used for debugging and testing.
Update a distributed thinlto index test to contain a local and ensure that we don't get a COMBINED_ORIGINAL_NAME record.
Differential Revision: https://reviews.llvm.org/D110296
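The difference between the uniquified GUID of a local and its original GUID can be sketched as follows. This is only an illustration: `exampleHash` is a stand-in (FNV-1a), not the hash LLVM uses, and the `:` separator and the function names are assumptions.

```cpp
#include <cstdint>
#include <string>

// Stand-in 64-bit hash (FNV-1a). LLVM derives real GUIDs with a different
// hash; this only illustrates the idea.
inline uint64_t exampleHash(const std::string &S) {
  uint64_t H = 14695981039346656037ull; // FNV-1a offset basis
  for (unsigned char C : S) {
    H ^= C;
    H *= 1099511628211ull; // FNV-1a prime
  }
  return H;
}

// "Original" GUID: derived from the bare symbol name, as a SamplePGO
// indirect call profile would record it.
inline uint64_t originalGuid(const std::string &Name) {
  return exampleHash(Name);
}

// Uniquified GUID for a local symbol: the source path is prepended first,
// so same-named locals in different files get different ids.
// The ':' separator is an assumption made for this sketch.
inline uint64_t localGuid(const std::string &SourcePath,
                          const std::string &Name) {
  return exampleHash(SourcePath + ":" + Name);
}
```

Two static functions named `helper` in `a.c` and `b.c` share one original GUID but get distinct uniquified GUIDs. Mapping the original GUID back to the right summary is what the thin link needs; the distributed backends do not.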
|
 | llvm/test/ThinLTO/X86/distributed_indexes.ll |
 | llvm/lib/Bitcode/Writer/BitcodeWriter.cpp |
Commit
7da4ee2df088d39c7ca6531d80172af7d973bb67
by tejohnson
[ThinLTO] Fix bot failures
Fix bot failures after 2c1defeee40cf643ea6f0fa5e01164c9a4c48c30. The new GUID I added isn't matching because it is a local with the source path prepended. There isn't much use in matching the GUIDs exactly anyway, so remove those from the patterns.
|
 | llvm/test/ThinLTO/X86/distributed_indexes.ll |
Commit
8dc16ba8d2b429261dd95e88496b2a866dc18ae5
by springerm
[mlir][linalg] Merge all tiling passes into a single one.
Passes such as `linalg-tile-to-tiled-loop` are merged into `linalg-tile`.
Differential Revision: https://reviews.llvm.org/D110214
|
 | mlir/test/Dialect/Linalg/tile-conv.mlir |
 | mlir/include/mlir/Dialect/Linalg/Passes.h |
 | mlir/test/Integration/Dialect/Linalg/CPU/test-conv-1d-nwc-wcf-call.mlir |
 | mlir/test/Integration/Dialect/Linalg/CPU/test-conv-2d-nhwc-hwcf-call.mlir |
 | mlir/test/Integration/Dialect/Linalg/CPU/test-conv-3d-ndhwc-dhwcf-call.mlir |
 | mlir/include/mlir/Dialect/Linalg/Passes.td |
 | mlir/test/Integration/Dialect/Linalg/CPU/test-tensor-matmul.mlir |
 | mlir/test/Dialect/Linalg/tile-indexed.mlir |
 | mlir/include/mlir/Dialect/Linalg/Utils/Utils.h |
 | mlir/test/Integration/Dialect/Linalg/CPU/test-conv-2d-call.mlir |
 | mlir/test/Integration/Dialect/Linalg/CPU/test-conv-1d-call.mlir |
 | mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h |
 | mlir/test/Dialect/Linalg/tile-simple-conv.mlir |
 | mlir/test/Dialect/Linalg/tile.mlir |
 | mlir/test/Dialect/Linalg/tile-conv-padding.mlir |
 | mlir/lib/Dialect/Linalg/Transforms/Tiling.cpp |
 | mlir/test/Dialect/Linalg/tile-pad-tensor-op.mlir |
 | mlir/test/Dialect/Linalg/tile-parallel-reduce.mlir |
 | mlir/test/Dialect/Linalg/tile-tensors.mlir |
 | mlir/test/Dialect/Linalg/tile-parallel.mlir |
 | mlir/test/Integration/Dialect/Linalg/CPU/test-conv-3d-call.mlir |
Commit
2190f8a8b1e01b7bc7429eb490f3001a23f27df1
by springerm
[mlir][linalg] Support tile+peel with TiledLoopOp
Only scf.for was supported until now.
Differential Revision: https://reviews.llvm.org/D110220
|
 | mlir/test/Dialect/Linalg/tile-and-peel-tensors.mlir |
 | mlir/test/lib/Dialect/Linalg/TestLinalgTransforms.cpp |
 | mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp |
Commit
83f3c615dde3fce5c0560c19316b08c1e6aa8c27
by joker.eph
Add missing storageType to AttrDef to ODS
I think this is only noticeable when using an attribute across dialects. Previously the namespace would be omitted, but that did not matter as long as the generated code stayed within a single namespace.
Differential Revision: https://reviews.llvm.org/D110367
|
 | mlir/test/mlir-tblgen/op-attribute.td |
 | mlir/include/mlir/IR/OpBase.td |
Commit
e470f9268a448fedea25289ec343f82ff52ccc36
by llvm-project
[Polly] Implement user-directed loop distribution/fission.
This is a simple version without the possibility to define distribute points or followup-transformations. However, it is the first transformation that has to check whether the transformation is correct.
It interprets the same metadata as the LoopDistribute pass.
Re-apply after revert in c7bcd72a38bcf99e03e4651ed5204d1a1f2bf695 with fix: Take isBand out of #ifndef NDEBUG since it now is used unconditionally.
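For readers unfamiliar with the transformation, loop distribution (fission) splits one loop into several loops over the same iteration space. A minimal source-level sketch, with hypothetical function names `fused` and `distributed`, assuming `A` and `B` hold at least `C.size()` elements:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Before: one loop with two statements that do not depend on each other.
// With Clang, `#pragma clang loop distribute(enable)` on this loop is the
// usual way to request distribution from source code.
inline void fused(std::vector<int> &A, std::vector<int> &B,
                  const std::vector<int> &C) {
  assert(A.size() >= C.size() && B.size() >= C.size());
  for (std::size_t I = 0; I < C.size(); ++I) {
    A[I] = C[I] + 1;
    B[I] = C[I] * 2;
  }
}

// After: each statement gets its own loop. The result is the same here
// because neither statement reads what the other one writes.
inline void distributed(std::vector<int> &A, std::vector<int> &B,
                        const std::vector<int> &C) {
  assert(A.size() >= C.size() && B.size() >= C.size());
  for (std::size_t I = 0; I < C.size(); ++I)
    A[I] = C[I] + 1;
  for (std::size_t I = 0; I < C.size(); ++I)
    B[I] = C[I] * 2;
}
```

The split is only valid when no dependence cycle links the statements across iterations. This is why the commit says the transformation has to be checked for correctness.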
|
 | polly/lib/Analysis/DependenceInfo.cpp |
 | polly/test/ScheduleOptimizer/ManualOptimization/distribute_illegal_pragmaloc.ll |
 | polly/lib/Transform/ScheduleTreeTransform.cpp |
 | polly/include/polly/ManualOptimizer.h |
 | polly/test/ScheduleOptimizer/ManualOptimization/distribute_illegal_looploc.ll |
 | polly/include/polly/DependenceInfo.h |
 | polly/test/ScheduleOptimizer/ManualOptimization/distribute_heuristic.ll |
 | polly/lib/Transform/ScheduleOptimizer.cpp |
 | polly/include/polly/ScheduleTreeTransform.h |
 | polly/lib/Transform/ManualOptimizer.cpp |
Commit
afab3c488f0c86af87e262cc7454e04de18e3e6a
by i
[Driver] Default Generic_GCC x86 to -fasynchronous-unwind-tables
to match GCC and Clang's own x86-64.
|
 | clang/lib/Driver/ToolChains/Gnu.cpp |
 | clang/test/Driver/clang-translation.c |
Commit
7a62a5b56d670c4e152159740cd7fc4030a9470f
by Christudasan.Devadasan
[AMDGPU] Legalize initialized LDS variables
We don't allow an initializer for LDS variables and there is an early abort during instruction selection. This patch legalizes them by ignoring the init values. During assembly emission, proper error reporting already exists for such instances.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D109901
|
 | llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp |
 | llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp |
 | llvm/test/CodeGen/AMDGPU/lds-zero-initializer.ll |
 | llvm/test/CodeGen/AMDGPU/GlobalISel/lds-zero-initializer.ll |
Commit
25ac0d3c73d68c017546eb622ba7632c6b581bfb
by dblaikie
DebugInfo: Implement the -gsimple-template-names functionality
This excludes certain names that can't be rebuilt from the available DWARF:
* Atomic types - no DWARF differentiating int from atomic int.
* Vector types - enough DWARF (an attribute on the array type) to do this, but I haven't written the extra code to add the attributes required for this.
* Lambdas - ambiguous with any other unnamed class.
* Unnamed classes/enums - would need column info for the type in addition to file/line number.
* noexcept function types - not encoded in DWARF.
|
 | clang/lib/CodeGen/CGDebugInfo.cpp |
 | clang/test/CodeGenCXX/debug-info-simple-template-names.cpp |
Commit
a2c1cf09dfaaa6d2161fee00f8317005bf955d64
by Lang Hames
[ORC] Introduce EPCGenericDylibManager / SimpleExecutorDylibManager.
EPCGenericDylibManager provides an interface for loading dylibs and looking up symbols in the executor, implemented using EPC-calls to functions in the executor.
SimpleExecutorDylibManager is an executor-side service that provides the functions used by EPCGenericDylibManager.
SimpleRemoteEPC is updated to use an EPCGenericDylibManager instance to implement the ExecutorProcessControl loadDylib and lookup methods. In a future commit these methods will be removed, and clients updated to use EPCGenericDylibManagers directly.
|
 | llvm/include/llvm/ExecutionEngine/Orc/TargetProcess/SimpleExecutorDylibManager.h |
 | llvm/lib/ExecutionEngine/Orc/Shared/OrcRTBridge.cpp |
 | llvm/lib/ExecutionEngine/Orc/TargetProcess/SimpleRemoteEPCServer.cpp |
 | llvm/lib/ExecutionEngine/Orc/EPCGenericDylibManager.cpp |
 | llvm/lib/ExecutionEngine/Orc/TargetProcess/CMakeLists.txt |
 | llvm/lib/ExecutionEngine/Orc/SimpleRemoteEPC.cpp |
 | llvm/lib/ExecutionEngine/Orc/TargetProcess/SimpleExecutorDylibManager.cpp |
 | llvm/include/llvm/ExecutionEngine/Orc/EPCGenericDylibManager.h |
 | llvm/include/llvm/ExecutionEngine/Orc/SimpleRemoteEPC.h |
 | llvm/include/llvm/ExecutionEngine/Orc/TargetProcess/SimpleRemoteEPCServer.h |
 | llvm/lib/ExecutionEngine/Orc/CMakeLists.txt |
 | llvm/include/llvm/ExecutionEngine/Orc/Shared/OrcRTBridge.h |
Commit
58d9ed2c935d6665da388cd72273360349792281
by llvmgnsyncbot
[gn build] Port a2c1cf09dfaa
|
 | llvm/utils/gn/secondary/llvm/lib/ExecutionEngine/Orc/BUILD.gn |
 | llvm/utils/gn/secondary/llvm/lib/ExecutionEngine/Orc/TargetProcess/BUILD.gn |
Commit
40ddde5d1fa7e5eadb76f6c3cc37dae2f80a8ca2
by Christudasan.Devadasan
[TableGen] Allow targets to entirely ignore Psets for registers
TableGen currently expects targets to have at least one pressure set for every broader register category. AMDGPU's VGPR or AGPR, for instance, seemed to work correctly without any pset, though we have forced one for each type to avoid the assertion in computeRegUnitSets. However, psets cannot be entirely empty: at least one set is mandatory for every target. This patch bypasses the assertion for the classes when GeneratePressureSet is zero while ensuring the RegUnitSets are not empty.
Reviewed By: arsenm, rampitec
Differential Revision: https://reviews.llvm.org/D110305
|
 | llvm/test/TableGen/bare-minimum-psets.td |
 | llvm/utils/TableGen/CodeGenRegisters.cpp |
 | llvm/test/TableGen/empty-psets.td |